0.1.4

Saganaki22 released this 13 Jun 20:02 · 4 commits to main since this release

Zonos2 TTS ComfyUI v0.1.4

Highlights

  • Added adaptive ComfyUI/AIMDO memory management.
  • Added genuine AIMDO VBAR paging for low-VRAM GPUs.
  • Improved model loading, unloading, visualization, and lifecycle handling.

AIMDO and VRAM Management

  • BF16 main model size is estimated at approximately 14.324 GiB.
  • Adding a 3 GiB runtime reserve to that estimate gives an automatic VBAR cutoff of approximately 17.324 GiB of total VRAM (14.324 GiB + 3 GiB).
  • 8 GiB, 12 GiB, and 16 GiB GPUs use the real AIMDO CoreModelPatcher and VBAR path when DynamicVRAM is enabled.
  • Larger GPUs, where the model plus the runtime reserve fits in total VRAM, use the faster static CUDA path.
  • Dynamic loading pages only selected MoE experts into VRAM.
  • VBAR allocation, residency, page faults, and eviction are genuine AIMDO operations.
  • DAC and speaker encoder remain managed as smaller static modules.
  • Automatic path selection is based on total VRAM rather than currently free VRAM.
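
The cutoff arithmetic and path selection described above can be sketched roughly as follows. This is an illustration only, not the node's actual code: the function names are hypothetical, DynamicVRAM is assumed to be enabled, and treating a GPU exactly at the cutoff as "static" is an assumption the notes do not confirm.

```python
# Figures taken from the release notes.
MODEL_ESTIMATE_GIB = 14.324   # estimated BF16 main model size
RUNTIME_RESERVE_GIB = 3.0     # runtime reserve


def vbar_cutoff_gib(model_gib=MODEL_ESTIMATE_GIB, reserve_gib=RUNTIME_RESERVE_GIB):
    """Total VRAM below which the dynamic (VBAR) path is chosen."""
    return model_gib + reserve_gib


def select_load_path(total_vram_gib):
    """Pick a load path from TOTAL VRAM, not currently free VRAM.

    Hypothetical helper: assumes DynamicVRAM is enabled and that the
    boundary is inclusive on the static side.
    """
    if total_vram_gib >= vbar_cutoff_gib():
        return "static"    # model plus reserve fits: direct CUDA path
    return "dynamic"       # AIMDO VBAR paging path


for vram in (8, 12, 16, 24):
    print(f"{vram} GiB -> {select_load_path(vram)}")
```

Because the decision uses total VRAM, the selected path for a given GPU stays the same regardless of what else is occupying memory when the model loads.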

Model Lifecycle

  • Integrated the main model, DAC, and speaker encoder with ComfyUI model management.
  • Reuses an existing bundle when model, dtype, and attention settings are unchanged.
  • Changing model, dtype, or attention now performs a complete hard unload.
  • Improved cleanup of tensors, AIMDO registrations, references, and accelerator caches.
  • Restored accurate tensor visibility in ComfyUI Memory Visualization.
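
The reuse and hard-unload rule above can be pictured as a cache keyed on the three settings. The sketch below is a minimal illustration under assumptions: `BundleKey`, `BundleCache`, and the loader/unloader callables are hypothetical names and do not come from the project or from ComfyUI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleKey:
    """The settings that decide whether an existing bundle can be reused."""
    model: str
    dtype: str
    attention: str


class BundleCache:
    """Keeps at most one loaded bundle (hypothetical helper)."""

    def __init__(self, loader, unloader):
        self._loader = loader        # called with a BundleKey, returns a bundle
        self._unloader = unloader    # called with the old bundle before replacement
        self._key = None
        self._bundle = None

    def get(self, model, dtype, attention):
        key = BundleKey(model, dtype, attention)
        if self._bundle is not None and key == self._key:
            return self._bundle              # unchanged settings: reuse
        if self._bundle is not None:
            self._unloader(self._bundle)     # changed settings: hard unload first
            self._bundle = None
            self._key = None
        self._bundle = self._loader(key)
        self._key = key
        return self._bundle
```

In this sketch the old bundle is released before the new one is loaded, so both are never resident at the same time.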

Performance

  • Optimized single-token MoE decoding to execute only routed experts.
  • Avoided scanning all experts during each autoregressive generation step.
  • Preserved direct CUDA execution on GPUs with sufficient VRAM.
  • Retained file-backed expert weights for dynamic low-VRAM execution.
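
To illustrate what "execute only routed experts" means for a single decoded token, here is a small pure-Python sketch. It assumes top-k routing with a softmax over the selected router logits, which the notes do not specify; the function names and the list-based tensors are hypothetical stand-ins for the real modules.

```python
import math


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest router logits."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]  # stable softmax
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def decode_token_moe(hidden, router_logits, experts, k=2):
    """Mix the outputs of the routed experts only; other experts are never called."""
    output = [0.0] * len(hidden)
    for idx, weight in route_top_k(router_logits, k):
        expert_out = experts[idx](hidden)   # the only expert invocations per step
        for j, value in enumerate(expert_out):
            output[j] += weight * value
    return output
```

In a sketch like this the per-step cost scales with k rather than with the total number of experts. On a dynamic path, only the selected experts' weights would need to be resident.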

Fixes

  • Progress bars now use the checkpoint's real tensor count.
  • Fixed resampler caching so it does not retain the speaker encoder indefinitely.
  • Cached resamplers now follow speaker encoder device changes.
  • Improved speaker encoder registration with ComfyUI model management.
  • Improved model unloading and reloading after configuration changes.
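
One way to picture the two resampler fixes is a cache that holds no reference to the speaker encoder and rebuilds an entry when the requested device differs. This is a sketch under assumptions: `ResamplerCache` and its `factory` callable are hypothetical, and the project may move resamplers between devices rather than rebuild them.

```python
class ResamplerCache:
    """Caches resamplers by sample-rate pair and tracks each entry's device."""

    def __init__(self, factory):
        # Hypothetical factory: builds a resampler for (src_rate, dst_rate, device).
        self._factory = factory
        self._entries = {}   # (src_rate, dst_rate) -> (device, resampler)

    def get(self, src_rate, dst_rate, device):
        key = (src_rate, dst_rate)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == device:
            return cached[1]                       # same device: reuse
        resampler = self._factory(src_rate, dst_rate, device)
        self._entries[key] = (device, resampler)   # replaces any stale-device entry
        return resampler

    def clear(self):
        """Drop every cached resampler, for example during a hard unload."""
        self._entries.clear()
```

Since the cache stores only rates, a device label, and the resampler itself, it does not keep the speaker encoder alive.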

Documentation

  • Expanded English and Chinese AIMDO memory-management documentation.
  • Documented static and dynamic loading behavior.
  • Documented the 14.324 GiB model estimate, 3 GiB reserve, and 17.324 GiB cutoff.
  • Added the complete supported-language tier table.
  • Added the official Zyphra ZONOS2 blog badge.
  • Expanded VRAM and out-of-memory troubleshooting guidance.

Testing

  • Added runtime-management, AIMDO selection, lifecycle, progress, resampler, and MoE module tests.
  • All 24 tests pass.