An approach based on AnimateDiff, using Kandinsky-2 diffusion models and FILM interpolation models.
Some examples (more examples below):
WARNING!
The current model version was trained for 1 epoch on ~3% (~350k videos) of the WebVid dataset, so it may be difficult to get good results.
GPU memory requirements (RTX 3090 or 4090 at minimum):
- 512x512 generation ~ 20.5 GB
- 768x768 generation ~ 23.5 GB
P.S. Best generation results are obtained with 4 < guidance_scale < 8 and image_size = 768.
git clone https://github.com/TheDenk/Kandimate.git
cd Kandimate
Install requirements with pip:
pip install -r requirements.txt
Or with conda:
conda env create -f environment.yaml
conda activate kandimate
git lfs install
git clone https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder ./models/kandinsky-2-2-decoder
git clone https://huggingface.co/kandinsky-community/kandinsky-2-2-prior ./models/kandinsky-2-2-prior
bash download_bashscripts/download-motion-module.sh
bash download_bashscripts/download-interpolation-models.sh
You may also download the motion module and interpolation model checkpoints directly from Google Drive, then put them in the models/motion-modules/ and models/interpolation-models/ folders, respectively.
The interpolation models can also be found here.
All generation parameters, such as prompt, negative_prompt, seed, etc., are stored in the config file configs/inference/inference.yaml.
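As a rough illustration, such a config might look like the sketch below. Only prompt, negative_prompt, and seed are named in this README; every other field name is an assumption, so consult the shipped inference.yaml for the exact schema.
# Hypothetical sketch of configs/inference/inference.yaml.
# Field names other than prompt/negative_prompt/seed are assumptions.
prompt: "a red panda eating bamboo, cinematic, high detail"
negative_prompt: "low quality, blurry, deformed"
seed: 42
guidance_scale: 6.0   # recommended range: 4 < guidance_scale < 8
image_size: 768       # recommended resolution
num_frames: 16        # recommended number of frames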
After downloading all the models, run the following command to generate animations. The results are automatically saved to the samples/ folder.
python -m scripts.animate --config ./configs/inference/inference.yaml
It is recommended to generate animations with 16 frames at 768 resolution. Note that a different resolution or frame count may affect the quality.
(Comparison gifs: Original vs. Interpolated.)
You can also apply interpolation between frames to make the gif smoother.
Set the path to the gif and the interpolation parameters in ./configs/interpolate/interpolate.yaml.
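As a rough illustration, that config might contain fields along these lines; all names below are assumptions rather than the actual schema, so consult the shipped interpolate.yaml.
# Hypothetical sketch of configs/interpolate/interpolate.yaml.
# All field names are assumptions.
gif_path: "./samples/example.gif"   # gif to interpolate
num_recursions: 2                   # each FILM pass roughly doubles the frame count
fps: 24                             # frame rate of the output gif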
python -m scripts.interpolate --config ./configs/interpolate/interpolate.yaml
P.S. It is not recommended to use interpolation when generating nature videos.
Before training, download the video files and the .csv annotations of WebVid10M to the local machine.
Note that the training script requires all the videos to be saved in a single folder. You may change this by modifying kandimate/data/dataset.py.
After dataset preparation, update the data paths below in the .yaml config files in the configs/training/ folder:
train_data:
  csv_path: [Replace with .csv Annotation File Path]
  video_folder: [Replace with Video Folder Path]
  sample_size: 256
Other training parameters (lr, epochs, validation settings, etc.) are also included in the config files.
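As a sketch of what those fields might look like (the names below are assumptions based on the parameters just listed; the shipped configs/training/training.yaml is authoritative):
# Hypothetical sketch -- field names are assumptions.
learning_rate: 1.0e-4    # lr
max_train_epoch: 1       # number of training epochs
validation_steps: 1000   # how often to run validation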
To train motion modules:
torchrun --nnodes=1 --nproc_per_node=1 train.py --config configs/training/training.yaml
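If the machine has several GPUs, training can be scaled across them by raising torchrun's --nproc_per_node flag, assuming train.py handles the distributed setup (its torchrun entry point suggests it does), for example:
torchrun --nnodes=1 --nproc_per_node=4 train.py --config configs/training/training.yaml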
- Add train and inference scripts (py and jupyter).
- Add interpolation inference scripts (py and jupyter).
- Add Gradio Demo (probably).
- Add controlnet (probably).
Here are several of the best results.
Codebase: AnimateDiff and Tune-a-Video.
Diffusion models: Kandinsky-2.
Interpolation models: FILM.
Issues should be raised directly in the repository. For professional support and recommendations, please contact welcomedenk@gmail.com.