
Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting

Jin Liu, Huaibo Huang, Chao Jin, Ran He.

PyTorch implementation of Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting.

Paper: https://arxiv.org/abs/2312.02212 | HuggingFace demo

  • The HuggingFace demo runs on CPU and may be slow. For faster results, please clone it and run it on your own GPU.

Introduction

This paper proposes a training-free face stylization framework named Portrait Diffusion. The framework leverages off-the-shelf text-to-image diffusion models, eliminating the need for fine-tuning on specific examples. Specifically, the content and style images are first inverted into latent codes. Then, during image reconstruction from the corresponding latent codes, the content and style features in the attention space are delicately blended through a modified self-attention operation called Style Attention Control. Additionally, a Chain-of-Painting method is proposed for gradually redrawing unsatisfactory areas, from rough adjustments to fine-tuning. Extensive experiments validate the effectiveness of Portrait Diffusion and demonstrate the superiority of Chain-of-Painting in achieving precise face stylization.
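
To make the attention-blending idea concrete, here is a minimal sketch (not the authors' exact implementation): content queries attend over content tokens to preserve structure and over style tokens to inject appearance, and the two outputs are blended with a factor mirroring the --style_guidance CLI flag.

import torch.nn.functional as F

# A minimal sketch of the attention-blending idea (not the exact code in
# this repo). All tensors are (batch, heads, tokens, dim); style_guidance
# mirrors the CLI flag of the same name.
def style_attention(q_content, k_content, v_content, k_style, v_style,
                    style_guidance=1.2):
    scale = q_content.shape[-1] ** -0.5
    # Content queries over content tokens: preserves facial structure.
    attn_c = F.softmax(q_content @ k_content.transpose(-2, -1) * scale, dim=-1)
    out_c = attn_c @ v_content
    # Content queries over style tokens: injects the style's appearance.
    attn_s = F.softmax(q_content @ k_style.transpose(-2, -1) * scale, dim=-1)
    out_s = attn_s @ v_style
    # Blend, extrapolating toward the style output when guidance > 1.
    return out_c + style_guidance * (out_s - out_c)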

To-do

  • Release the code.
  • Chain-of-Painting
  • Implementation for CFG>1
  • SDXL Support

Usage

Requirements

We implement our method on top of the diffusers codebase, with a code structure similar to MasaCtrl. The code was tested with Python 3.9.17 and PyTorch 2.0.1.

pip install -r requirements.txt

Checkpoints

Stable Diffusion: We mainly conduct experiments on Stable Diffusion v1-5. It is downloaded automatically by the diffusers pipeline under the model ID runwayml/stable-diffusion-v1-5.
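
For reference, a minimal diffusers snippet that triggers this automatic download (fp16 on GPU is an assumption here, not a repo requirement):

import torch
from diffusers import StableDiffusionPipeline

# The first call fetches the weights from the Hugging Face Hub and caches them.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")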

Personalized Models: Our method also works with various personalized models, including fully fine-tuned checkpoint models and PEFT models such as LoRA. You can download personalized models from CIVITAI or train one yourself.
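
A rough sketch of loading such models with diffusers (the file names below are placeholders for models you download or train):

from diffusers import StableDiffusionPipeline

# Fully fine-tuned checkpoint (a single .safetensors/.ckpt file):
pipe = StableDiffusionPipeline.from_single_file(
    "models/Stable-diffusion/my_model.safetensors"  # placeholder path
)

# Or a LoRA adapter applied on top of a base pipeline:
pipe.load_lora_weights("models/Lora", weight_name="my_lora.safetensors")  # placeholder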

Inference

A simple inference command is provided below. You can supply masks for the content and style images for better results, and set --only_mask_region to stylize only the masked region. A good result may require multiple Chain-of-Painting steps with different mask prompts. You can also reduce num_inference_steps, but set SAC_step to roughly 50%-70% of the total number of steps to balance content preservation and stylization.

export CUDA_VISIBLE_DEVICES=0
python main.py --style_guidance 1.2 \
    --SAC_step 35 \
    --num_inference_steps 50 \
    --content 'images/content/1.jpg' \
    --content_mask '' \
    --style 'images/style/1.jpg' \
    --style_mask '' \
    --output './results'

Gradio Demo

The Gradio demo provides more controllable settings. We integrate the Segment Anything Model (SAM) to obtain masks directly; a standalone usage sketch follows the command below.

python app.py
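
If you want to obtain masks outside the demo, here is a rough sketch using the segment-anything package (the checkpoint file and click coordinates are placeholders):

import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# The checkpoint file is a placeholder; download the official SAM weights first.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("images/content/1.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)
# One foreground click roughly on the face; coordinates are illustrative.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
)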

For personalized models, place full checkpoint models in models/Stable-diffusion and LoRA models in models/Lora, then select them in the Gradio demo. We also provide latent-consistency/lcm-lora-sdv1-5 as an additional option, which lets you generate an image in very few steps (fewer than 10; we recommend at least 4).
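
For reference, a sketch of the standard LCM-LoRA setup in diffusers (assumed here; the demo wires this up for you):

import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Swap in the LCM scheduler and load the LCM-LoRA adapter; 4-8 inference
# steps are then usually enough.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")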

Acknowledgements

  • We thank MasaCtrl for their outstanding work!

Citation

@misc{liu2023portrait,
      title={Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting}, 
      author={Jin Liu and Huaibo Huang and Chao Jin and Ran He},
      year={2023},
      eprint={2312.02212},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

If you have any comments or questions, please open a new issue or feel free to contact the authors.
