0.1.4

Saganaki22 released this 13 Jun 20:02

38d7d66

Zonos2 TTS ComfyUI v0.1.4

Highlights

Added adaptive ComfyUI/AIMDO memory management.
Added genuine AIMDO VBAR paging for low-VRAM GPUs.
Improved model loading, unloading, visualization, and lifecycle handling.

AIMDO and VRAM Management

BF16 main model size is estimated at approximately 14.324 GiB.
Adding a 3 GiB runtime reserve to that estimate gives an automatic VBAR cutoff of approximately 17.324 GiB of total VRAM (14.324 GiB + 3 GiB).
8 GiB, 12 GiB, and 16 GiB GPUs use the real AIMDO CoreModelPatcher and VBAR path when DynamicVRAM is enabled.
Larger GPUs use the faster static CUDA path when the model plus the runtime reserve fits in total VRAM.
Dynamic loading pages only selected MoE experts into VRAM.
VBAR allocation, residency, page faults, and eviction are genuine AIMDO operations.
DAC and speaker encoder remain managed as smaller static modules.
Automatic path selection is based on total VRAM rather than currently free VRAM.
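The selection rule described above can be sketched roughly as follows. This is an illustrative sketch only, not the node's actual code: the function name `select_load_path` and the `"static"`/`"dynamic"` labels are hypothetical, it assumes DynamicVRAM is enabled, and it assumes a GPU exactly at the cutoff takes the static path.

```python
GIB = 1024 ** 3

MODEL_ESTIMATE_GIB = 14.324  # BF16 main model estimate quoted in these notes
RUNTIME_RESERVE_GIB = 3.0    # runtime reserve quoted in these notes
CUTOFF_GIB = MODEL_ESTIMATE_GIB + RUNTIME_RESERVE_GIB  # about 17.324 GiB


def select_load_path(total_vram_bytes: int) -> str:
    """Pick a loading path from TOTAL VRAM, not currently free VRAM.

    Hypothetical helper; assumes DynamicVRAM is enabled.
    """
    total_gib = total_vram_bytes / GIB
    if total_gib >= CUTOFF_GIB:
        return "static"   # model plus reserve fits: direct CUDA path
    return "dynamic"      # below the cutoff: AIMDO VBAR paging path


for size_gib in (8, 12, 16, 24):
    print(size_gib, "GiB ->", select_load_path(size_gib * GIB))
```

Keying the decision on total rather than free VRAM keeps the chosen path stable from run to run on the same GPU, regardless of what else happens to be resident at load time.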

Model Lifecycle

Integrated the main model, DAC, and speaker encoder with ComfyUI model management.
Reuses an existing bundle when model, dtype, and attention settings are unchanged.
Changing model, dtype, or attention now performs a complete hard unload.
Improved cleanup of tensors, AIMDO registrations, references, and accelerator caches.
Restored accurate tensor visibility in ComfyUI Memory Visualization.
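The reuse-or-hard-unload behavior can be pictured with a small cache keyed on the three settings. This is a minimal sketch under stated assumptions: `BundleKey`, `BundleCache`, and the `load`/`unload` callbacks are hypothetical names for illustration and do not reflect the project's real classes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class BundleKey:
    """Settings that identify a loaded bundle (hypothetical structure)."""
    model: str
    dtype: str
    attention: str


class BundleCache:
    """Keep one bundle; reuse it only when the key is unchanged."""

    def __init__(self, load: Callable[[BundleKey], Any],
                 unload: Callable[[Any], None]) -> None:
        self._load = load
        self._unload = unload
        self._key: Optional[BundleKey] = None
        self._bundle: Any = None

    def get(self, key: BundleKey) -> Any:
        if self._bundle is not None and key == self._key:
            return self._bundle          # same settings: reuse
        if self._bundle is not None:
            self._unload(self._bundle)   # any change: full unload first
            self._bundle = None
        self._bundle = self._load(key)
        self._key = key
        return self._bundle
```

In this sketch the old bundle is released before the new one is loaded, so the two never need to fit in memory at the same time.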

Performance

Optimized single-token MoE decoding to execute only routed experts.
Avoided scanning all experts during each autoregressive generation step.
Preserved direct CUDA execution on GPUs with sufficient VRAM.
Retained file-backed expert weights for dynamic low-VRAM execution.
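To illustrate the idea of executing only routed experts for a single decoded token, here is a generic top-k mixture-of-experts step in plain Python. It is a sketch of the general technique, not the Zonos2 implementation: the softmax-over-top-k gating, the default `k=2`, and the function names are all assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def top_k_routing(logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Return (expert_index, weight) for the k highest router logits.

    Weights are a softmax over the selected logits only (an assumption).
    """
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in idx)
    exps = [math.exp(logits[i] - peak) for i in idx]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(idx, exps)]


def moe_decode_step(x: Vector, router_logits: Sequence[float],
                    experts: Sequence[Callable[[Vector], Vector]],
                    k: int = 2) -> Vector:
    """Run one token through only the routed experts and mix their outputs."""
    out = [0.0] * len(x)
    for i, weight in top_k_routing(router_logits, k):
        y = experts[i](x)  # experts that were not routed are never called
        for j, value in enumerate(y):
            out[j] += weight * value
    return out
```

Because the loop iterates over the routed indices rather than over every expert, the per-step cost scales with `k` instead of with the total number of experts.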

Fixes

Progress bars now use the checkpoint's real tensor count.
Fixed resampler caching so it does not retain the speaker encoder indefinitely.
Cached resamplers now follow speaker encoder device changes.
Improved speaker encoder registration with ComfyUI model management.
Improved model unloading and reloading after configuration changes.
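One way to get both resampler properties described above (no indefinite retention of the speaker encoder, and cached resamplers that follow its device) is a weak-keyed cache that rebuilds on device change. This is a hypothetical sketch, not the project's actual fix: `ResamplerCache`, the `build` factory, and the `encoder.device` attribute are assumptions for illustration.

```python
import weakref
from typing import Any, Callable, Dict, Tuple


class ResamplerCache:
    """Cache resamplers per encoder without keeping the encoder alive."""

    def __init__(self, build: Callable[[int, int, Any], Any]) -> None:
        # build(orig_sr, target_sr, device) is a hypothetical factory; the
        # object it returns must not hold a reference to the encoder.
        self._build = build
        # Weak keys: an entry disappears once its encoder is collected.
        self._entries: "weakref.WeakKeyDictionary[Any, Dict[Tuple[int, int], Tuple[Any, Any]]]" = (
            weakref.WeakKeyDictionary()
        )

    def get(self, encoder: Any, orig_sr: int, target_sr: int) -> Any:
        device = encoder.device
        per_encoder = self._entries.setdefault(encoder, {})
        cached = per_encoder.get((orig_sr, target_sr))
        if cached is None or cached[0] != device:
            # First use, or the encoder moved: rebuild on its current device.
            cached = (device, self._build(orig_sr, target_sr, device))
            per_encoder[(orig_sr, target_sr)] = cached
        return cached[1]

    def __len__(self) -> int:
        return len(self._entries)
```

The sketch requires the encoder object to be hashable and weak-referenceable, which holds for ordinary Python class instances.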

Documentation

Expanded English and Chinese AIMDO memory-management documentation.
Documented static and dynamic loading behavior.
Documented the 14.324 GiB model estimate, 3 GiB reserve, and 17.324 GiB cutoff.
Added the complete supported-language tier table.
Added the official Zyphra ZONOS2 blog badge.
Expanded VRAM and out-of-memory troubleshooting guidance.

Testing

Added runtime-management, AIMDO selection, lifecycle, progress, resampler, and MoE module tests.
All 24 tests pass.
