nano-xDiT v0.1.0

@Antlera Antlera released this 14 Jun 00:35

Minimal single-GPU inference for the Wan video DiT, with TeaCache and First-Block-Cache (FBCache) step-skipping. Extracted from xDiT, with all distributed and sequence-parallel machinery removed.

Highlights:

  • apply_cache_on_transformer hooks a diffusers WanTransformer3DModel through a block-stack wrapper. TeaCache uses the e0/e signal; FBCache uses the first-block residual.
  • NanoWanPipeline: an explicit, instrumentable denoising loop that drives a separate cache for each CFG branch.
  • Verified on Wan2.1-T2V-1.3B (480x832, 33 frames, 30 steps): 1.57x speedup at thr=0.1 (37% skip), 2.26x speedup at thr=0.2 (57% skip). With compute forced on every step, the output is bit-exact against the no-cache run.
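The first-block-residual idea from the highlights can be illustrated with a toy sketch. This is not the nano-xDiT implementation: the class `FirstBlockCacheSketch`, the helper `relative_l1_change`, the relative-L1 comparison, and the choice to refresh the reference residual only on computed steps are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
def relative_l1_change(current, previous):
    """L1 distance between two residuals, relative to the L1 norm of `previous`."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous)
    return num / den if den > 0 else float("inf")


class FirstBlockCacheSketch:
    """Hypothetical per-branch cache; assumed here to be one instance per CFG branch."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.prev_first_residual = None   # first-block residual at the last computed step
        self.cached_rest_residual = None  # residual contributed by the remaining blocks
        self.skipped = 0
        self.total = 0

    def step(self, hidden, first_block, remaining_blocks):
        self.total += 1
        # The first block always runs; its residual is the cheap change signal.
        after_first = first_block(hidden)
        first_residual = [a - h for a, h in zip(after_first, hidden)]

        can_skip = (
            self.prev_first_residual is not None
            and self.cached_rest_residual is not None
            and relative_l1_change(first_residual, self.prev_first_residual)
            < self.threshold
        )
        if can_skip:
            # Reuse the cached contribution of the remaining blocks.
            self.skipped += 1
            return [a + r for a, r in zip(after_first, self.cached_rest_residual)]

        # Full compute: run the remaining blocks and refresh the cache.
        out = remaining_blocks(after_first)
        self.cached_rest_residual = [o - a for o, a in zip(out, after_first)]
        self.prev_first_residual = first_residual
        return out
```

In this sketch the comparison is strict, so a threshold of 0 never skips and every step takes the full-compute path. That is the toy analogue of the forced-compute check mentioned above; whether nano-xDiT forces compute this way is not stated in these notes.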

import nanoxdit