Conversation
Pull request overview
This PR addresses a Windows-native stack overflow when dynamically loading deep CLIP/T5 graphs by introducing a non-recursive guard for ModelPatcherDynamic._load_list, wiring it into the CLIP wrapper entrypoints, and bumping the package version.
Changes:
- Added `clip_dynamic_load_list_guard.py` to patch `comfy.model_patcher.ModelPatcherDynamic._load_list` with a depth-aware, non-recursive traversal for `for_dynamic=True`.
- Registered the guard during CLIP loader overrides in `wrappers.py`.
- Bumped the `pyproject.toml` version from `2.6.1` to `2.6.2`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `clip_dynamic_load_list_guard.py` | Implements and registers the non-recursive `_load_list` guard for deep dynamic-loading graphs on Windows. |
| `wrappers.py` | Calls guard registration during CLIP loader overrides so the patched behavior is active during CLIP dynamic loading. |
| `pyproject.toml` | Version bump to ship the fix. |
```python
sort_criteria = (module_offload_mem >= 64 * 1024, -module_offload_mem)
loading.append(sort_criteria + (module_mem, n, m, params))
```
_safe_dynamic_load_list() appends sort_criteria + (module_mem, n, m, params), producing a 6-tuple whose first element is a boolean. Elsewhere in this repo, _load_list() results are assumed to be a 5-tuple where the first element is numeric module_offload_mem (see distorch_2.unpack_load_item). If ModelPatcherDynamic._load_list follows the same contract (as ComfyUI 0.6.0+ does), this will break callers that expect bytes in position 0 and/or len(item)==5. Please align the return item shape/order to the original ComfyUI _load_list tuple contract for your target ComfyUI version.
Suggested change:

```diff
-sort_criteria = (module_offload_mem >= 64 * 1024, -module_offload_mem)
-loading.append(sort_criteria + (module_mem, n, m, params))
+# Append a 5-tuple matching the original ModelPatcherDynamic._load_list
+# contract: (module_offload_mem, module_mem, name, module, params).
+loading.append((module_offload_mem, module_mem, n, m, params))
+# Preserve the original sort behavior without changing the tuple shape.
+# First, prefer modules that require offloading (module_offload_mem >= 64KB),
+# then sort by descending module_offload_mem.
+loading.sort(key=lambda item: (item[0] >= 64 * 1024, -item[0]))
```
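A quick sketch of why this suggestion is behavior-preserving (the byte counts, names, and stubbed `None`/`[]` fields below are made up for illustration): sorting plain 5-tuples with an explicit key yields the same ordering the boolean-prefixed `sort_criteria` tuples did, while keeping position 0 a numeric byte count and `len(item) == 5` for callers like `distorch_2.unpack_load_item`.

```python
# Hypothetical items in the suggested 5-tuple contract:
# (module_offload_mem, module_mem, name, module, params).
KB64 = 64 * 1024
items = [
    (10, 1, "small", None, []),
    (KB64, 2, "at_threshold", None, []),
    (5 * KB64, 3, "large", None, []),
    (0, 4, "zero", None, []),
]

# Old shape: the sort_criteria prefix makes 7-tuples, sorted lexicographically.
old_order = sorted((m[0] >= KB64, -m[0]) + m for m in items)
# Suggested shape: same criteria expressed as an explicit sort key.
new_order = sorted(items, key=lambda item: (item[0] >= KB64, -item[0]))

old_names = [t[4] for t in old_order]  # name sits after the 2-element prefix
new_names = [t[2] for t in new_order]
print(old_names == new_names)          # True: identical ordering
print(len(new_order[0]), new_order[0][0])  # 5 10: 5-tuple, numeric slot 0
```

Modules below the 64 KB threshold sort first (ascending over the `False` group), then offload-eligible modules in descending size, exactly as before.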
```python
_PATCH_MARKER = "_mgpu_issue21_clip_dynamic_load_list_guard"
_MODULE_THRESHOLD = 200
_DEPTH_THRESHOLD = 200
```
The patch marker and log messages reference "Issue21" (e.g. _PATCH_MARKER = "_mgpu_issue21_..." and log prefixes), but this PR is specifically for #178. This makes operational logs and future debugging misleading; consider renaming the marker and log text to match #178 (or use a generic, issue-agnostic label).
```python
def guarded_load_list(self, for_dynamic=False, default_device=None):
    if not for_dynamic:
        return original(self, for_dynamic=for_dynamic, default_device=default_device)

    if _graph_requires_guard(self.model):
        logger.info("[MultiGPU Issue21] Using non-recursive ModelPatcherDynamic._load_list guard")
        return _safe_dynamic_load_list(self, default_device=default_device)

    return original(self, for_dynamic=for_dynamic, default_device=default_device)
```
guarded_load_list() always runs _graph_requires_guard(self.model) for for_dynamic=True, which traverses the entire module graph, and then _safe_dynamic_load_list() traverses it again when the guard is needed. If _load_list(for_dynamic=True) is called frequently, this adds measurable overhead. Consider caching the guard decision per model (e.g., storing a flag on self.model or self) so the graph walk happens at most once per model instance.
This fixes the Windows-native stack overflow in the deep CLIP/T5 dynamic loading path tracked in #178.
What changed
- Added `clip_dynamic_load_list_guard.py` to override `ModelPatcherDynamic._load_list` with a non-recursive traversal when the graph is deep enough to trip the Windows C stack
- Registered the guard from `override_class_clip` and `override_class_clip_no_device` in `wrappers.py`
- Bumped `pyproject.toml` from `2.6.1` to `2.6.2`

Validation
- Reproduced the failure on the `_load_list(for_dynamic=True)` path without the guard
- `ruff check .` passes
- `pylint --rcfile pyproject.toml` passes at `10.00/10`

Files
- `clip_dynamic_load_list_guard.py`
- `wrappers.py`
- `pyproject.toml`

This branch contains only the #178 fix relative to `main`.

Cheers!