Description
Thank you for the great work! I recently tested MaxDiffusion and it works really well on my TPU setup (v4).
However, I would need Wan 2.1/2.2 LoRA support for the related research, as described below.
Summary
Wan 2.1 and Wan 2.2 are currently the most capable open-source video diffusion models available, yet MaxDiffusion still lacks full training and LoRA support for them. This significantly limits the applicability of MaxDiffusion for the video generation community.
Current State
- Wan 2.1: Full DiT finetuning is supported (added Oct 2025), but LoRA training is not supported.
- Wan 2.2: Neither full training nor LoRA training is supported. Wan 2.2 introduces a MoE architecture with high-noise and low-noise experts (A14B), along with a high-compression VAE and a 5B dense model (TI2V-5B).
Request
- Wan 2.2 full training + LoRA training: Wan 2.2 has been available since July 2025. Given its MoE design (dual-expert DiT), supporting both full finetuning and LoRA would require handling the high-noise/low-noise expert routing and the updated VAE.
- Wan 2.1 LoRA training: The existing Wan 2.1 full finetuning implementation should make adding LoRA relatively straightforward.
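For the dual-expert routing mentioned above, one plausible shape is a timestep-threshold dispatch between the two experts. A minimal JAX sketch follows; the function names and the boundary value are my assumptions for illustration, not Wan 2.2's actual configuration:

```python
import jax
import jax.numpy as jnp

def dual_expert_step(x, t, high_noise_expert, low_noise_expert,
                     boundary=0.875):
    # Route to the high-noise expert early in the reverse process
    # (large t) and to the low-noise expert late (small t).
    # `boundary` is an illustrative threshold, not Wan 2.2's value.
    return jax.lax.cond(
        t >= boundary,
        high_noise_expert,
        low_noise_expert,
        x,
    )
```

Using `jax.lax.cond` keeps the dispatch traceable under `jit`, so only one expert's compute runs per step; under `vmap`/`scan` both branches may be evaluated, which is a sharding/performance consideration the maintainers would know better than I do.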
Context
Both models have been publicly available for 6+ months. Other training frameworks already support LoRA training for Wan 2.1 and Wan 2.2, but none of them provide JAX/XLA-native implementations for TPU, which is MaxDiffusion's core advantage.
Alternatively, do you know of any other Wan 2.1/2.2 LoRA implementation that can run on TPUs? Thanks!
Willingness to Contribute
I'm happy to submit a pull request if the maintainers can provide guidance on the preferred implementation approach (e.g., LoRA injection points, config structure, checkpoint compatibility with diffusers).
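As a concrete starting point for that discussion, here is a minimal sketch of the kind of LoRA injection I have in mind, in plain JAX; all names, shapes, and hyperparameters are illustrative and not MaxDiffusion's API:

```python
import jax
import jax.numpy as jnp

def init_lora(key, in_dim, out_dim, rank=8):
    # A is small random, B is zeros: the adapter starts as a no-op,
    # so training begins exactly from the pretrained model's behavior.
    a = 0.02 * jax.random.normal(key, (in_dim, rank))
    b = jnp.zeros((rank, out_dim))
    return {"lora_a": a, "lora_b": b}

def lora_dense(x, w_frozen, adapter, alpha=16.0):
    # y = x W + (alpha / r) * x A B, where W stays frozen and only
    # the low-rank factors A and B receive gradients.
    rank = adapter["lora_a"].shape[-1]
    delta = x @ adapter["lora_a"] @ adapter["lora_b"]
    return x @ w_frozen + (alpha / rank) * delta
```

In practice the adapters would wrap the attention Q/K/V/output projections of the Wan DiT blocks, and exporting `lora_a`/`lora_b` in the diffusers key layout would cover the checkpoint-compatibility point above.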