Freehand-Genshin-Diffusion

Transferring Genshin PVs into a freehand style with a diffusion model.

Plans

  • Inference code for the image-model
  • Pretrained weights for 480x320 resolution
  • Inference code for the video-model incorporating the temporal module
  • Training scripts

Examples

  • Here are some results generated with the pretrained image-model at a resolution of 480x320.

  • Here are the results generated by our pretrained video-model.

  • The model also generalizes to real-world videos:

Limitations

We observe the following shortcomings in the current version:

  1. The primary issue is the temporal inconsistency in the generated frames, which causes flickering and jittering in the video.
  2. Training and inference for this model are inefficient, requiring substantial computational resources.

Installation

Build Environment

We recommend Python >= 3.10 and CUDA 11.7. Build the environment as follows:

conda create -n Genshin python=3.10
conda activate Genshin
# Install requirements with pip:
pip install -r requirements.txt
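
After installation, a quick sanity check can confirm the CUDA build and GPU visibility. This is a minimal sketch, assuming PyTorch is pulled in by requirements.txt:

# sanity_check.py -- verify the PyTorch/CUDA setup
import torch
print(torch.__version__, torch.version.cuda)  # expect a CUDA 11.7 (cu117) build
print(torch.cuda.is_available())              # should print True on a GPU machine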

Download weights

You can download the weights manually in the following steps:

  1. Download our trained weights from BaiduDisk, which include two parts: denoising_unet.pth and reference_unet.pth.

  2. (Optional) Download our newly trained weights from BaiduDisk, which include three parts: denoising_unet-54400.pth, reference_unet-54400.pth, and motion_module-146.pth.

  3. Download the pretrained weights of the base models and other components (see the download sketch after the weight layout below):

  4. Download the pretrained motion module weights of AnimateDiff: mm_sd_v15_v2.

Finally, these weights should be organized as follows:

./pretrained_weights/
|-- denoising_unet.pth
|-- reference_unet.pth
|-- denoising_unet-54400.pth
|-- reference_unet-54400.pth
|-- motion_module-146.pth
|-- mm_sd_v15_v2.ckpt
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
`-- stable-diffusion-v1-5
    |-- feature_extractor
    |   `-- preprocessor_config.json
    |-- model_index.json
    |-- unet
    |   |-- config.json
    |   `-- diffusion_pytorch_model.bin
    `-- v1-inference.yaml
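
The base components in the layout above are standard public releases. Below is a minimal sketch of fetching them with huggingface_hub; the repo IDs are assumptions inferred from the directory names (refer to step 3 for the authoritative sources), and the stable-diffusion-v1-5 repository may require a mirror if the original is unavailable.

# fetch_base_weights.py -- minimal sketch, repo IDs are assumptions
from huggingface_hub import snapshot_download

# VAE (sd-vae-ft-mse)
snapshot_download(repo_id="stabilityai/sd-vae-ft-mse",
                  local_dir="./pretrained_weights/sd-vae-ft-mse")

# Stable Diffusion v1.5 (only the parts listed in the layout above)
snapshot_download(repo_id="runwayml/stable-diffusion-v1-5",
                  local_dir="./pretrained_weights/stable-diffusion-v1-5",
                  allow_patterns=["feature_extractor/*", "unet/*",
                                  "model_index.json", "v1-inference.yaml"])

# CLIP image encoder -- commonly taken from sd-image-variations-diffusers
snapshot_download(repo_id="lambdalabs/sd-image-variations-diffusers",
                  local_dir="./pretrained_weights",
                  allow_patterns=["image_encoder/*"])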

Inference

Here are the CLI commands for running the inference scripts:

  • image-model inference:
python -m scripts.genshin_paint_image --config ./configs/prompts/genshin_paint_image.yaml -W 480 -H 320
  • video-model inference:
python -m scripts.genshin_paint_video --config ./configs/prompts/genshin_paint_video.yaml -W 480 -H 320
  • You can refer to the format of genshin_paint_image(video).yaml and modify input_video_path to transfer other Genshin PVs in MP4 format; see the sketch below this list.
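
A minimal sketch of retargeting the video config to another PV, assuming the config is plain YAML and input_video_path is a top-level key (check the shipped genshin_paint_video.yaml for the exact structure); the input path shown is hypothetical:

# retarget_config.py -- point the prompt config at a different Genshin PV (MP4)
import yaml

cfg_path = "./configs/prompts/genshin_paint_video.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["input_video_path"] = "./inputs/my_pv.mp4"  # hypothetical input path
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)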

Training

The training process involves two steps:

  • Step 1, train the image-model:
accelerate launch genshin_train_stage_1.py --config ./configs/train/genshin_stage1.yaml
  • Step 2, train the temporal module of the video-model:
accelerate launch genshin_train_stage_2.py --config ./configs/train/genshin_stage2.yaml

I am sorry that I can't open source the training data.

Disclaimer

This project is intended for academic research, and we explicitly disclaim any responsibility for user-generated content. Users are solely liable for their actions while using the generative model. The project contributors have no legal affiliation with, nor accountability for, users' behaviors. It is imperative to use the generative model responsibly, adhering to both ethical and legal standards.

Acknowledgements

This repository is built on Moore-AnimateAnyone. We thank them for their excellent work and for releasing high-quality code.
