This is an unofficial implementation of the paper "DiffEdit: Diffusion-based semantic image editing with mask guidance", based on Stable Diffusion.
- All the weights and APIs are taken from Hugging Face Diffusers.
- Weights of the Stable Diffusion img2img pipeline are from runwayml/stable-diffusion-v1-5; you can get them with these commands:
cd ckpt
bash runwayml_sd_v1_5.sh
- Weights of the Stable Diffusion inpainting pipeline are from stabilityai/stable-diffusion-2-inpainting; you can get them with these commands:
cd ckpt
bash inpaint_ckpt.sh
- The scheduler is DDIMScheduler.
- Example images are from Google's TEDBench.
- Hugging Face Diffusers also provides an API for this paper; see this.
python == 3.9.12
torch == 1.13.1
pillow == 9.4.0
scikit-image == 0.19.2
diffusers == 0.15.0
xformers == 0.0.16
accelerate == 0.17.1
With the following calls, you can run the code on a GPU with 6 GB of memory and edit images of size 512x512:
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_attention_slicing()
pipe.vae.enable_tiling()
pipe.enable_model_cpu_offload()
The mask is computed from the images generated by the img2img pipeline, and the edit is performed by the inpainting pipeline based on the mask image, as shown in the figure below. You can run the v1 method with:
bash diffedit_v1.sh
If you want to use other images or hyperparameters, edit diffedit_v1.sh.
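The v1 mask step can be sketched as follows. This is a NumPy-only illustration, not the repo's exact code: the function name, the averaging over samples and channels, and the 0.5 threshold are all assumptions.

```python
import numpy as np

def compute_mask_v1(ref_images, query_images, threshold=0.5):
    """Binarize the averaged difference between two sets of img2img outputs.

    ref_images, query_images: arrays of shape (n, H, W, 3), floats in [0, 1],
    generated with the reference prompt and the query (edit) prompt.
    Returns a binary mask of shape (H, W).
    """
    # average the per-pixel absolute difference over samples and channels
    diff = np.abs(ref_images - query_images).mean(axis=(0, 3))  # (H, W)
    # normalize to [0, 1] so a fixed threshold is meaningful
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)
    return (diff > threshold).astype(np.float32)
```

The resulting mask is what the inpainting pipeline then uses to restrict the edit to the differing region.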
The mask is computed from the noise residual (you can also use the noise latents; just change '--not_residual_guide' in the .sh files) in latent space by the img2img pipeline. The mask is then resized to the image size, and the edit is performed by the inpainting pipeline based on the resized mask, as shown in the figure below. You can run the v2 method with:
bash diffedit_v2.sh
If you want to use other images or hyperparameters, edit diffedit_v2.sh.
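A minimal sketch of the v2 mask step, again NumPy-only and not the repo's actual implementation: it assumes 4-channel latents, a VAE downsampling factor of 8, and nearest-neighbour upsampling of the latent mask to image resolution.

```python
import numpy as np

def compute_mask_v2(noise_ref, noise_query, threshold=0.5, vae_scale=8):
    """Mask from the difference of noise estimates in latent space.

    noise_ref, noise_query: arrays of shape (n, 4, h, w) — UNet noise
    predictions for the reference and query prompts. Returns a binary
    mask of shape (h * vae_scale, w * vae_scale).
    """
    # average the difference over samples and latent channels
    diff = np.abs(noise_ref - noise_query).mean(axis=(0, 1))  # (h, w)
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)
    latent_mask = (diff > threshold).astype(np.float32)
    # upsample e.g. 64x64 -> 512x512 by repeating each pixel in an 8x8 block
    return np.kron(latent_mask, np.ones((vae_scale, vae_scale), dtype=np.float32))
```

Swapping the noise residual for the noise latents (the '--not_residual_guide' switch) only changes which tensors are passed in; the thresholding logic stays the same.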
The mask is computed from the noise residual (you can also use the noise latents; just change '--not_residual_guide' in the .sh files) in latent space by the img2img pipeline, and the edit is performed by the img2img pipeline based on the mask, as shown in the figure below. You can run the v3 method with:
bash diffedit_v3.sh
If you want to use other images or hyperparameters, edit diffedit_v3.sh.
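In v3 the img2img pipeline itself applies the mask during denoising, as in the DiffEdit paper: inside the mask the edited latents evolve freely, while outside it they are replaced by the source latents (re-noised to the current timestep). A sketch of that blending step, with hypothetical names:

```python
import numpy as np

def blend_latents(edited_latents, source_latents, latent_mask):
    """One DiffEdit-style blending step in latent space.

    edited_latents, source_latents: arrays of shape (4, h, w) — the current
    denoised latents and the source latents noised to the same timestep.
    latent_mask: (h, w) binary mask; broadcasts over the channel axis.
    """
    # keep edits inside the mask, preserve the source image outside it
    return latent_mask * edited_latents + (1.0 - latent_mask) * source_latents
```

Calling this after every denoising step is what keeps the background pixel-faithful to the input image without a separate inpainting model.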
The following repositories also provide code implementations of this paper:
- Xiang-cd/DiffEdit-stable-diffusion
- aayushmnit/diffusion_playground
- johnrobinsn/diffusion_experiments
- daspartho/DiffEdit
If you have any questions, feel free to contact me at wangruilin.will@foxmail.com.