Release v0.55.0 — VRAM diet + lockup mitigation · Deaththegrim/ComfyUI-PromptLibrary

What's new

Detailer and sampler hardening targeting an AMD/HIP hard-freeze fingerprint observed under multi-pass detailer runs (gfxhub page faults → VRAM exhaustion → wedge).

Smart Detailer

Per-bbox soft_empty_cache removed: flushing the allocator on every bbox iteration defeated pooling and forced HIP alloc/free round trips, which is what drove the wedge. The per-pass flush is retained.
fp32→fp16 cast hoisted to detail() so subsequent passes reuse the owned buffer instead of cloning per-pass (saves ~100-200 MB at hires resolution per pass).
sam_mask_cache pruned between passes — drops bboxes no future pass will reuse. A 6-pass disjoint-detection run no longer holds full-resolution mask tensors for the whole call.
SAM predictor offloads to CPU at end of detail(), reclaiming ~600 MB-1.5 GB VRAM between runs without invalidating the model cache.
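
The sam_mask_cache pruning described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name prune_mask_cache, the use of bbox tuples as cache keys, and the assumption that the bboxes of all later passes are known before pruning are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical cache key: a bbox as (x1, y1, x2, y2).
BBox = Tuple[int, int, int, int]


def prune_mask_cache(cache: Dict[BBox, object],
                     future_pass_bboxes: Iterable[Iterable[BBox]]) -> int:
    """Drop cached masks whose bbox no later pass will request.

    Returns the number of entries removed.
    """
    still_needed = set()
    for bboxes in future_pass_bboxes:
        still_needed.update(bboxes)
    # Collect keys first so the dict is not mutated while iterating over it.
    stale = [key for key in cache if key not in still_needed]
    for key in stale:
        del cache[key]
    return len(stale)


# Usage: three passes, assuming detections for every pass are known up front.
passes = [
    [(0, 0, 64, 64), (64, 0, 128, 64)],
    [(64, 0, 128, 64)],
    [(200, 200, 260, 260)],
]
cache: Dict[BBox, object] = {}
for i, bboxes in enumerate(passes):
    for bbox in bboxes:
        # Stand-in for computing a full-resolution SAM mask.
        cache.setdefault(bbox, f"mask for {bbox}")
    prune_mask_cache(cache, passes[i + 1:])
```

With disjoint detections per pass, the cache never holds more than the current pass's masks plus any that a later pass will reuse, which is the effect the release note describes.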

Samplers (SDXL + Anima)

Tile-step-down OOM retry: pixel-upscale-with-model and VAE encode/decode now retry with smaller tiles (256 → 128 → 64) on OOM instead of crashing.
sampler_sdxl frees decoded and upscaled between hires iterations (~200 MB of dead weight per iteration).
sampler_anima defragments allocator between sample and decode under fragmentation pressure.
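
The tile-step-down retry mentioned above follows a common pattern that can be sketched like this. It is an illustration under stated assumptions: run_with_tile_step_down and its parameters are hypothetical names, and the built-in MemoryError stands in for the GPU out-of-memory exception (with PyTorch one would typically pass a predicate that checks for torch.cuda.OutOfMemoryError).

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_with_tile_step_down(
    op: Callable[[int], T],
    tile_sizes: Sequence[int] = (256, 128, 64),
    is_oom: Callable[[BaseException], bool] = lambda exc: isinstance(exc, MemoryError),
) -> T:
    """Call op(tile) with progressively smaller tiles until one fits.

    Only errors that is_oom() recognizes trigger a retry; anything else
    propagates immediately. If every tile size fails, the last
    out-of-memory error is re-raised.
    """
    if not tile_sizes:
        raise ValueError("tile_sizes must not be empty")
    last_error = None
    for tile in tile_sizes:
        try:
            return op(tile)
        except Exception as exc:
            if not is_oom(exc):
                raise
            last_error = exc  # remember it and try the next smaller tile
    raise last_error


# Usage with a stand-in operation that only fits at tile size 64.
def fake_vae_decode(tile: int) -> str:
    if tile > 64:
        raise MemoryError(f"simulated OOM at tile={tile}")
    return f"decoded with tile={tile}"


result = run_with_tile_step_down(fake_vae_decode)
```

A real implementation would likely also release cached allocator memory between attempts; that step is omitted here to keep the sketch free of GPU dependencies.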

Tests

+9 unit tests covering the new offload helper and the cache-pruning algorithm. 333 pass, 5 skip, 1 deselected (pre-existing unrelated comic_page test).
