# DiffEdit

Image editing typically requires providing a mask of the area to be edited. DiffEdit generates the mask automatically from a text query, so you don't need image editing software to create one. The DiffEdit algorithm works in three steps:

- the diffusion model denoises an image conditioned on some query text and on reference text, which produces different noise estimates for different areas of the image; the difference between these estimates is used to infer a mask identifying which area of the image needs to change to match the query text;
- the input image is encoded into latent space with DDIM;
- the latents are decoded with the diffusion model conditioned on the text query, using the mask as a guide so that pixels outside the mask remain the same as in the input image.
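
To make the mask and blending idea concrete, here is a minimal toy sketch on a one-dimensional "image". It is not how diffusers implements DiffEdit: the function names `infer_mask` and `blend`, the sample values, and the 0.5 threshold are all assumptions chosen for illustration.

```python
# Toy illustration of the mask-and-blend idea on a 1-D "image".
# Names, values, and the 0.5 threshold are illustrative assumptions,
# not diffusers internals.

def infer_mask(noise_source, noise_target, threshold=0.5):
    """Mark positions where the two noise estimates disagree strongly."""
    diff = [abs(t - s) for s, t in zip(noise_source, noise_target)]
    peak = max(diff) or 1.0  # avoid dividing by zero when the estimates agree everywhere
    return [1 if d / peak >= threshold else 0 for d in diff]

def blend(original, edited, mask):
    """Keep original values outside the mask and edited values inside it."""
    return [e if m else o for o, e, m in zip(original, edited, mask)]

# Noise estimates under the reference text and under the query text
noise_source = [0.10, 0.12, 0.90, 0.85, 0.11]
noise_target = [0.11, 0.10, 0.20, 0.15, 0.12]
mask = infer_mask(noise_source, noise_target)

original = [5, 5, 5, 5, 5]
edited = [9, 9, 9, 9, 9]
result = blend(original, edited, mask)
print(mask, result)
```

Only the two middle positions, where the estimates differ substantially, end up inside the mask, so only those positions take the edited values.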


In [1]:
!pip install -qqq torch==2.0.0+cu118 torchvision==0.15.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
!pip install -qqq git+https://github.com/huggingface/diffusers.git
!pip install -qqq transformers accelerate

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.

In [2]:
import torch
from diffusers.utils import load_image
from diffusers import DDIMScheduler, DDIMInverseScheduler, StableDiffusionDiffEditPipeline

sd_model_ckpt = "stabilityai/stable-diffusion-2-1"
pipeline = StableDiffusionDiffEditPipeline.from_pretrained(
    sd_model_ckpt,
    torch_dtype=torch.float16,
    safety_checker=None,
)
# DiffEdit uses a DDIM scheduler for sampling and its inverse for inversion
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
pipeline.inverse_scheduler = DDIMInverseScheduler.from_config(pipeline.scheduler.config)
# Reduce GPU memory use
pipeline.enable_model_cpu_offload()
pipeline.enable_vae_slicing()
generator = torch.manual_seed(1000)

# load_image expects an http(s) URL or an existing local file
img_url = "/content/pic.png"
raw_image = load_image(img_url).convert("RGB").resize((768, 768))

source_prompt = "a lotus in the pond"
target_prompt = "a statue in the pond"
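# Infer the edit mask from the difference between source and target prompts
mask_image = pipeline.generate_mask(
    image=raw_image,
    source_prompt=source_prompt,
    target_prompt=target_prompt,
    generator=generator,
)

# Invert the input image into latents, conditioned on the source prompt
inv_latents = pipeline.invert(
    prompt=source_prompt,
    image=raw_image,
    generator=generator,
).latents

# Denoise the latents toward the target prompt, editing only inside the mask
image = pipeline(
    prompt=target_prompt,
    mask_image=mask_image,
    image_latents=inv_latents,
    generator=generator,
    negative_prompt=source_prompt,
).images[0]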


The config attributes {'skip_prk_steps': True} were passed to DDIMInverseScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.


ValueError: ignored
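
The traceback above is truncated, so the output alone does not show which call failed. One cause worth ruling out, purely as an assumption, is the input image: `load_image` raises a `ValueError` when its argument is neither an `http://`/`https://` URL nor an existing local file. A small pre-flight check along these lines can confirm or eliminate that possibility; `describe_image_source` is a hypothetical helper, not part of diffusers.

```python
import os

def describe_image_source(source):
    """Classify a string as 'url', 'file', or 'missing' before loading it."""
    if source.startswith(("http://", "https://")):
        return "url"
    if os.path.isfile(source):
        return "file"
    return "missing"

img_url = "/content/pic.png"  # path used in the cell above
kind = describe_image_source(img_url)
if kind == "missing":
    print(f"{img_url} does not exist; upload the image or set img_url to a URL")
```

If the check reports `missing`, upload the image to the runtime or point `img_url` at a reachable URL before rerunning the cell. If it reports `file`, the error comes from elsewhere and the full traceback is needed.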