ComfyUI-LongCat-Avatar v0.2.0
Initial public release of ComfyUI LongCat Avatar, a ComfyUI custom node package for LongCat Video Avatar 1.5 audio-driven human video generation.
Highlights
- ComfyUI-native workflow for LongCat Video Avatar 1.5
- Supports single-person audio-driven generation in `ai2v` and `at2v` modes
- Supports two-person / dual-audio generation in `ai2v` mode
- Uses Whisper-large-v3 audio conditioning for Avatar 1.5
- Supports required Avatar 1.5 DMD/distill LoRA inference with the official 8-step default
- Supports 480p and 720p generation
- Includes a README demo video and ready-to-use example workflow
Model Loading
This release supports three DiT weight modes:
- `single_file_safetensors`
- `official_sharded`
- `official_int8_sharded`
Official sharded and INT8 sharded checkpoints are validated before inference. The node can also perform bounded automatic downloads for known official Avatar 1.5 sharded DiT assets and shared LongCat text encoder assets.
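To illustrate what validating a sharded checkpoint before inference can involve, the sketch below checks that every shard named in an index file exists on disk. It assumes a Hugging Face-style index of the form `{"weight_map": {tensor_name: shard_file}}`; the default index filename and the `missing_shards` helper are hypothetical, and the node's actual validation may check more (or different) things.

```python
import json
from pathlib import Path


def missing_shards(checkpoint_dir, index_name="diffusion_pytorch_model.safetensors.index.json"):
    """Return the sorted shard filenames listed in the index but absent on disk.

    Assumes a Hugging Face-style index: {"weight_map": {tensor_name: shard_file}}.
    The default index filename is an assumption, not taken from this package.
    """
    root = Path(checkpoint_dir)
    index = json.loads((root / index_name).read_text(encoding="utf-8"))
    # Many tensors map to the same shard, so deduplicate the filenames first.
    shard_files = set(index["weight_map"].values())
    return sorted(name for name in shard_files if not (root / name).is_file())
```

An empty result means all listed shards are present; it does not prove the shard contents are intact.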
VRAM And Runtime Controls
This release includes practical controls for lower-VRAM setups:
- `official_int8_sharded` mode for lower VRAM usage
- `block_num` layer-streaming control
- CPU offload options for VAE and native text encoder paths
- Optional attention backends: `sdpa`, `flash_attn_2`, `flash_attn_3`, `xformers`, and `sageattn`
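Before selecting an optional attention backend, it can help to see which ones are importable in the ComfyUI Python environment. The sketch below is one way to check; the mapping from backend name to Python module is an assumption (in particular `flash_attn_interface` for FlashAttention 3 and `sageattention` for SageAttention), and `installed_backends` is a hypothetical helper, not part of this package.

```python
import importlib.util

# Assumed mapping from backend name to the Python module that provides it.
BACKEND_MODULES = {
    "sdpa": "torch",
    "flash_attn_2": "flash_attn",
    "flash_attn_3": "flash_attn_interface",
    "xformers": "xformers",
    "sageattn": "sageattention",
}


def installed_backends(modules=BACKEND_MODULES):
    """Return backend names whose providing module can be found, in dict order."""
    found = []
    for backend, module in modules.items():
        try:
            if importlib.util.find_spec(module) is not None:
                found.append(backend)
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially installed packages.
            pass
    return found
```

A module being found only shows that the package is installed; it does not guarantee the backend works on a given GPU or PyTorch build.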
For 12GB-class GPUs, start with:
- `official_int8_sharded`
- `480p`
- `block_num = 1`
- sampler `offload_device = cpu`
- text encode `offload_device = cpu`
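The same starting point, written out as a plain settings dictionary for reference. Only the values come from the list above; the key names (`dit_weight_mode`, `resolution`, `sampler_offload_device`, `text_encode_offload_device`) and the `with_overrides` helper are illustrative assumptions and may not match the nodes' actual widget names.

```python
# Key names below are illustrative labels, not the nodes' actual widget names.
LOW_VRAM_START = {
    "dit_weight_mode": "official_int8_sharded",
    "resolution": "480p",
    "block_num": 1,
    "sampler_offload_device": "cpu",
    "text_encode_offload_device": "cpu",
}


def with_overrides(base, **overrides):
    """Return a copy of base with overrides applied; unknown keys are rejected."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**base, **overrides}
```

For example, `with_overrides(LOW_VRAM_START, resolution="720p")` gives a variant to try once the 480p baseline runs without running out of memory.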
Nodes Included
- (auto)Load LongCat Avatar Model
- LongCat Avatar Whisper
- LongCat Avatar Text Encode
- LongCat Avatar Audio Crop
- LongCat Avatar Audio Encode
- LongCat Avatar Audio Window
- LongCat Avatar Sampler
- LongCat Avatar Vocal Model
- LongCat Avatar Vocal Extract
Optional Features
- Optional vocal extraction through `requirements-vocal.txt`
- Optional acceleration packages documented separately in `requirements-acceleration.txt`
- Audio crop preview support
- Optional muxed MP4 output through `mux_audio_path`
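Muxing here means combining the generated video stream with an audio track in a single MP4. For readers who want to do the same step by hand, the sketch below builds an equivalent `ffmpeg` command. Whether the node uses `ffmpeg` internally is an assumption, and `build_mux_command` is a hypothetical helper, not part of this package.

```python
def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg argument list that adds an audio track to a video.

    The video stream is copied without re-encoding, the audio is encoded
    to AAC, and the output stops at the end of the shorter input.
    """
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(video_path),   # input 0: generated video
        "-i", str(audio_path),   # input 1: driving audio
        "-map", "0:v:0",         # take the first video stream from input 0
        "-map", "1:a:0",         # take the first audio stream from input 1
        "-c:v", "copy",
        "-c:a", "aac",
        "-shortest",
        str(output_path),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a system where `ffmpeg` is on the PATH.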
Not Supported Yet
- GGUF DiT loading
- CPU inference
- macOS / MPS inference
- Avatar 1.0 / Wav2Vec2 runtime path
- FP16 or FP8 runtime precision switches
- Generic third-party wrapper scheduler modes
Notes
This release is CUDA-oriented and expects a working ComfyUI Python environment with a compatible NVIDIA GPU and CUDA PyTorch installation already available. The default requirements.txt intentionally avoids installing or replacing PyTorch and optional acceleration packages to reduce the chance of breaking an existing ComfyUI setup.
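A quick way to confirm that the ComfyUI Python environment already has a CUDA-enabled PyTorch is sketched below. It uses only standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.get_device_name`); the `cuda_pytorch_status` helper name is hypothetical, and the check says nothing about whether a particular GPU has enough VRAM.

```python
def cuda_pytorch_status():
    """Return a short human-readable summary of the PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch is installed but CUDA is not available"
    return f"CUDA OK: torch {torch.__version__}, device {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(cuda_pytorch_status())
```

Run it with the same Python interpreter that launches ComfyUI, since a system-wide Python may report a different PyTorch installation.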