LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
Nan Chen1, Mengqi Huang1, Yihao Meng2, Zhendong Mao†,1
1USTC 2HKUST †corresponding author
Existing studies are limited to short-term colorization: they fuse overlapping features to achieve smooth transitions between segments, but fail to maintain long-term color consistency. In this study, we propose a dynamic global-local paradigm that achieves ideal long-term color consistency by dynamically extracting global color-consistent features relevant to the current generation.
🎉🎉 Our paper, “LongAnimation: Long Animation Generation with Dynamic Global-Local Memory”, has been accepted by ICCV 2025! We strongly recommend visiting our demo page.
- Release the paper and demo page. Visit https://cn-makers.github.io/long_animation_web/
- Release the code.
Training is conducted on 6 A100 GPUs (80 GB VRAM each); inference is tested on a single A100 GPU.
git clone https://github.com/CN-makers/LongAnimation
cd LongAnimation
All tests were conducted on Linux, and we recommend running our code on Linux. To set up the environment, run:
conda create -n LongAnimation python=3.10 -y
conda activate LongAnimation
bash install.sh
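After installation, a quick sanity check can save a failed run later. The snippet below is our own optional sketch (it is not part of the repo); it assumes `install.sh` installs PyTorch with CUDA support and only reports what is available, changing nothing:

```python
# Optional environment sanity check (assumption: install.sh installs a
# CUDA-enabled PyTorch). Reports status without modifying anything.
import importlib.util

def env_report():
    """Return a short status string for the LongAnimation environment."""
    if importlib.util.find_spec("torch") is None:
        return "torch: not installed"
    import torch  # imported lazily so the check works even without torch
    cuda = "available" if torch.cuda.is_available() else "unavailable"
    return f"torch {torch.__version__}, CUDA {cuda}"

print(env_report())
```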
- Please download the pre-trained CogVideoX-1.5 I2V checkpoints from here and put the whole folder under `pretrained_weights`; it should look like `./pretrained_weights/CogVideoX1.5-5B-I2V`.
- Please download the pre-trained long-video understanding model Video-XL checkpoints from here and put the whole folder under `pretrained_weights`; it should look like `./pretrained_weights/videoxl`.
- Please download the checkpoints for our SketchDiT and DGLM models from here and put the whole folder at `./pretrained_weights/longanimation`.
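The expected layout above can be verified before running inference. This is a hypothetical helper of ours, not a repo utility; the three folder names are taken directly from this README:

```python
# Hypothetical helper: checks that the checkpoint folders described in this
# README exist under ./pretrained_weights before launching inference.
from pathlib import Path

EXPECTED = ["CogVideoX1.5-5B-I2V", "videoxl", "longanimation"]

def missing_checkpoints(root="./pretrained_weights"):
    """Return the names of expected checkpoint folders absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]
```

If the returned list is non-empty, download the corresponding checkpoints before proceeding.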
To colorize the target lineart sequence with a specific character design, you can run the following command:
bash long_animation_inference.sh
We provide some test cases in the `test_json` folder. You can also try our model on your own data by changing the lineart sequence and the corresponding character design in the script `long_animation_inference.sh`.
For official training and testing, we used `--height 576` and `--width 1024`. The model is also compatible with a resolution of 768 in height and 1360 in width.
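Since only those two resolutions are mentioned, a small helper can snap arbitrary input sizes to the nearer supported one. This is a sketch under our own assumption (nearest by pixel area); the repo's scripts may handle resizing differently:

```python
# Supported (height, width) pairs stated in this README.
SUPPORTED = [(576, 1024), (768, 1360)]

def closest_supported(height, width):
    """Return the supported (height, width) pair closest in pixel area."""
    return min(SUPPORTED, key=lambda hw: abs(hw[0] * hw[1] - height * width))
```

For example, a 600x1000 input maps to the 576x1024 setting.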
Don't forget to cite this source if it proves useful in your research!
@misc{chen2025longanimationlonganimationgeneration,
title={LongAnimation: Long Animation Generation with Dynamic Global-Local Memory},
author={Nan Chen and Mengqi Huang and Yihao Meng and Zhendong Mao},
year={2025},
eprint={2507.01945},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.01945},
}