# DreamBooth inpainting fine-tuning

Fine-tune a Stable Diffusion inpainting model on custom images, then use it to replace a masked region of any image with the fine-tuned object.

1. **Fine-tune a Stable Diffusion inpainting model on custom images:** Stable Diffusion is a latent text-to-image diffusion model. Its inpainting variant fills in a masked region of an image so that the result is consistent with the surrounding pixels and the text prompt. This notebook fine-tunes that inpainting model on a small set of custom images using DreamBooth.

2. **Replace a masked region with the fine-tuned object:** After fine-tuning, the model is used to fill the region of an image marked by a mask with the object it learned during training.
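As a toy illustration of the mask convention only (the real pipeline denoises in latent space rather than copying pixels), the sketch below blends two tiny grayscale "images" stored as nested lists. It assumes the convention used by the diffusers inpainting pipeline, where white (here `1`) marks pixels to repaint and black (`0`) marks pixels to keep. The function name `composite` is hypothetical and is not part of any library.

```python
def composite(original, generated, mask):
    """Keep `original` where the mask is 0 and take `generated` where it is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# 2x3 toy images: 10 stands for an original pixel, 99 for a generated one
original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 1], [0, 0, 1]]  # 1 = region to replace

result = composite(original, generated, mask)
print(result)
```

Only the positions where the mask is `1` change, which is the behavior the fine-tuned model is expected to respect at the image level.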

In [None]:
!pip install -U diffusers transformers ftfy gradio accelerate

In [None]:
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

In [8]:
!wget https://raw.githubusercontent.com/huggingface/diffusers/main/examples/research_projects/dreambooth_inpaint/train_dreambooth_inpaint.py

--2024-10-12 02:17:27--  https://raw.githubusercontent.com/huggingface/diffusers/main/examples/research_projects/dreambooth_inpaint/train_dreambooth_inpaint.py
Loaded CA certificate '/usr/ssl/certs/ca-bundle.crt'
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33292 (33K) [text/plain]
Saving to: 'train_dreambooth_inpaint.py'

     0K .......... .......... .......... ..                   100% 1.22M=0.03s

2024-10-12 02:17:29 (1.22 MB/s) - 'train_dreambooth_inpaint.py' saved [33292/33292]



In [None]:
!git clone https://github.com/davide97l/stable_diffusion_dreambooth_inpainting.git

In [1]:
import inspect
from typing import List, Optional, Union
import numpy as np
import torch
import os
import PIL.Image  # a bare `import PIL` does not guarantee that the Image submodule is loaded

from diffusers import StableDiffusionInpaintPipeline
import requests
from io import BytesIO

In [2]:
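def image_grid(imgs, rows, cols, resize=256):
    """Paste `imgs` into a rows x cols grid, optionally resizing each image to a square."""
    assert len(imgs) <= rows * cols, "grid is too small for the given images"
    if resize is not None:
        imgs = [img.resize((resize, resize)) for img in imgs]
    w, h = imgs[0].size
    grid = PIL.Image.new("RGB", size=(cols * w, rows * h))

    # Fill the grid left to right, top to bottom
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid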
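To make the placement arithmetic in `image_grid` easier to follow, here is a small sketch that computes only the paste offsets, without touching any images. The helper name `tile_boxes` is hypothetical; the expression inside it is the same one `image_grid` passes as `box`.

```python
def tile_boxes(n_images, cols, w, h):
    """Return the (x, y) paste offset of each tile in row-major order."""
    return [(i % cols * w, i // cols * h) for i in range(n_images)]


# Six 256x256 tiles laid out in three columns (two rows)
boxes = tile_boxes(n_images=6, cols=3, w=256, h=256)
print(boxes)
```

The column index is `i % cols` and the row index is `i // cols`, so the fourth image (`i = 3`) wraps to the start of the second row.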

In [4]:
base_dir = os.getcwd()
model_path = os.path.join(base_dir, './Stable_Diffusion_Inpaint_2')

print("Loading model from:", model_path)

# Use the GPU with half precision when available; otherwise fall back to CPU with float32
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
)

# Attention slicing lowers peak memory at a small cost in speed
pipe.enable_attention_slicing()
pipe = pipe.to(device)

Loading model from: C:\Users\hassa\Desktop\Uni\Finalized Models\models\inpainting\./Stable_Diffusion_Inpaint_2


Loading pipeline components...:   0%|          | 0/6 [00:00<?, ?it/s]
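Once a pipeline is loaded, a single inpainting call might be wrapped as in the sketch below. This is a minimal sketch under assumptions: the wrapper name `inpaint_region` and the default values are hypothetical, and `fake_pipe` is a stand-in that only mimics the shape of the pipeline's return value so the snippet runs without model weights. The keyword arguments `prompt`, `image`, `mask_image`, `num_inference_steps`, and `guidance_scale` are the ones accepted by `StableDiffusionInpaintPipeline`.

```python
from types import SimpleNamespace


def inpaint_region(pipe, prompt, image, mask_image, steps=50, guidance_scale=7.5):
    """Run one inpainting call and return the first generated image."""
    output = pipe(
        prompt=prompt,
        image=image,
        mask_image=mask_image,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return output.images[0]


# Stand-in for a real pipeline (assumption: used here only so the sketch is runnable)
def fake_pipe(**kwargs):
    return SimpleNamespace(images=[f"inpainted: {kwargs['prompt']}"])


demo = inpaint_region(fake_pipe, "old painting", image=None, mask_image=None)
print(demo)
```

With the real `pipe` from this notebook, `image` and `mask_image` would be PIL images of the same size, with white mask pixels marking the area to repaint. To see the effect of fine-tuning, the pipeline would presumably need to be loaded from the training output directory rather than from the base model path.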

In [5]:
from huggingface_hub import notebook_login
notebook_login()

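The training command in the next cell combines a per-device batch size of 1 with 2 gradient accumulation steps. As a rough sanity check, the sketch below works out the effective batch size and the approximate number of training samples processed. It assumes that `--max_train_steps` counts optimizer updates, as is usual in the diffusers example scripts, and that a single process is used, which matches the `accelerate launch` defaults reported in the output.

```python
# Values taken from the training command in this notebook
train_batch_size = 1
gradient_accumulation_steps = 2
max_train_steps = 500
num_processes = 1  # default reported by `accelerate launch`

# Gradients from several small batches are summed before each optimizer update
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes

# Assumption: one "step" is one optimizer update
samples_seen = effective_batch_size * max_train_steps

print(effective_batch_size, samples_seen)
```

Raising `--gradient_accumulation_steps` is a common way to emulate a larger batch when GPU memory only allows `--train_batch_size=1`.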

In [10]:
!accelerate launch train_dreambooth_inpaint.py \
    --pretrained_model_name_or_path="./Stable_Diffusion_Inpaint_2"  \
    --instance_data_dir="images/Images_jpg" \
    --output_dir="stable-diffusion-inpainting-painting" \
    --instance_prompt="old painting" \
    --resolution=256 \
    --mixed_precision="no" \
    --train_batch_size=1 \
    --learning_rate=5e-6 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps=500 \
    --gradient_accumulation_steps=2 \
    --gradient_checkpointing \
    --train_text_encoder \
    --seed="0" \
    --push_to_hub

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
2024-10-12 02:23:15.151001: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-12 02:23:16.831609: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

  0%|          | 0/500 [00:00<?, ?it/s]
  hidden_states = F.scaled_dot_product_attention(
  with device_autocast_ctx, torch.cpu.amp.