# Fine-tuning with DreamBooth

DreamBooth is a method for personalizing text-to-image models such as Stable Diffusion from just a few (3 to 5) images of a subject. The `train_dreambooth.py` script implements the training procedure and adapts it for Stable Diffusion.

> More information about this DreamBooth implementation:
> https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Installing the Libraries and Requirements

In [12]:
!pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 torchdata==0.5.1 torchtext==0.14.1

Collecting torch==1.13.1
  Using cached torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl (887.5 MB)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 2.1.2
    Uninstalling torch-2.1.2:
      Successfully uninstalled torch-2.1.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xformers 0.0.23.post1 requires torch==2.1.2, but you have torch 1.13.1 which is incompatible.
Successfully installed torch-1.13.1


In [13]:
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py
%pip install -qq git+https://github.com/ShivamShrirao/diffusers
%pip install -q -U --pre triton
%pip install -q accelerate transformers ftfy bitsandbytes==0.35.0 gradio natsort safetensors xformers

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 0.13.1 requires torch==1.13.1, but you have torch 2.1.2 which is incompatible.
torchdata 0.5.1 requires torch==1.13.1, but you have torch 2.1.2 which is incompatible.
torchtext 0.14.1 requires torch==1.13.1, but you have torch 2.1.2 which is incompatible.
torchvision 0.14.1 requires torch==1.13.1, but you have torch 2.1.2 which is incompatible.

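The pip output above reports conflicting `torch` pins, so it can help to check which versions actually ended up installed before training. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and the package list is just an example.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the packages named in the conflict messages above.
for name, version in installed_versions(["torch", "torchvision", "xformers"]).items():
    print(f"{name}: {version or 'not installed'}")
```
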
## Loading the model


In [4]:
model_sd = "stabilityai/stable-diffusion-2-1"

In [5]:
output_dir = "/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyzt"
!mkdir -p $output_dir

## Training

Three components are needed:
1. unique identifier
2. class name
3. images

Instance prompt:
> a photo of [unique identifier] [class name]

Class prompt:
> a photo of [class name]

For example, with the unique identifier `zwx` and a person as the subject, the instance prompt is:
> a photo of zwx person

Because the subject is a person, the class prompt is:
> a photo of a person

The cells below use `xyzt` as the unique identifier.

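The two prompt templates above can be sketched as a small helper. This is an illustration only: `build_prompts` is a hypothetical name, and the "a"/"an" choice is a simple vowel heuristic, not something the training script does.

```python
def build_prompts(identifier, class_name):
    """Build (instance_prompt, class_prompt) from the templates above."""
    instance_prompt = f"a photo of {identifier} {class_name}"
    # Simplified article choice (assumption): "an" before a vowel, otherwise "a".
    article = "an" if class_name[0].lower() in "aeiou" else "a"
    class_prompt = f"a photo of {article} {class_name}"
    return instance_prompt, class_prompt


instance_prompt, class_prompt = build_prompts("zwx", "person")
print(instance_prompt)
print(class_prompt)
```
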
In [6]:
concepts_list = [
    {
        "instance_prompt": "xyzt",
        "class_prompt": "photo of a person",
        "instance_data_dir": "/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/data/xyzt",
        "class_data_dir": "/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/data/persont"

    }
]

In [7]:
import json
import os
import random

for c in concepts_list:
  os.makedirs(c["instance_data_dir"], exist_ok=True)

In [8]:
with open("concepts_list.json", "w") as f:
  json.dump(concepts_list, f, indent=4)
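
Before launching a long training run, it may be worth checking that every concept carries the four keys used in the list above. A minimal sketch, assuming those four keys are the ones required (the helper name `missing_keys` is hypothetical):

```python
# Assumption: these are the keys every concept needs, taken from the cell above.
REQUIRED_KEYS = {"instance_prompt", "class_prompt", "instance_data_dir", "class_data_dir"}


def missing_keys(concepts):
    """Return {concept index: sorted list of missing keys} for incomplete concepts."""
    problems = {}
    for i, concept in enumerate(concepts):
        missing = sorted(REQUIRED_KEYS - concept.keys())
        if missing:
            problems[i] = missing
    return problems


example = [{"instance_prompt": "xyzt", "class_prompt": "photo of a person"}]
print(missing_keys(example))
```

An empty dictionary means every concept is complete.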

### Parameters

In [14]:
num_imgs = 13
num_class_images = num_imgs * 12
max_num_steps = num_imgs * 80
learning_rate = 1e-6 # 0.000001
lr_warmup_steps = int(max_num_steps / 10)
print(num_imgs, num_class_images, max_num_steps, learning_rate, lr_warmup_steps)

13 156 1040 1e-06 104
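
The arithmetic in the cell above can be wrapped in a function so the schedule is easy to recompute for a different number of images. This is a sketch; the function name and keyword defaults are hypothetical and simply mirror the multipliers used above (12 class images and 80 steps per instance image, warmup of one tenth of the steps).

```python
def training_schedule(num_imgs, class_per_img=12, steps_per_img=80, warmup_divisor=10):
    """Derive the DreamBooth step counts from the number of instance images."""
    max_train_steps = num_imgs * steps_per_img
    return {
        "num_class_images": num_imgs * class_per_img,
        "max_train_steps": max_train_steps,
        "lr_warmup_steps": max_train_steps // warmup_divisor,
    }


print(training_schedule(13))
# {'num_class_images': 156, 'max_train_steps': 1040, 'lr_warmup_steps': 104}
```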


In [15]:
!python3 train_dreambooth.py \
  --pretrained_model_name_or_path=$model_sd \
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
  --output_dir=$output_dir \
  --revision="fp16" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=777 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=$learning_rate \
  --lr_scheduler="constant" \
  --lr_warmup_steps=$lr_warmup_steps \
  --num_class_images=$num_class_images \
  --sample_batch_size=4 \
  --max_train_steps=$max_num_steps \
  --save_interval=10000 \
  --save_sample_prompt="xyzt" \
  --concepts_list="concepts_list.json"

2023-12-21 10:19:06.048928: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-21 10:19:06.049015: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-21 10:19:06.051086: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
  warn(f"Failed to load image Python extension: {e}")

Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
  w

In [None]:
from natsort import natsorted
from glob import glob
import os

weights_dir = natsorted(glob(output_dir + os.sep + '*'))[-1]
print('Weights directory: ', weights_dir)

Weights directory:  /content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyz/1280
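
The cell above takes the last entry of a natural sort, which works as long as the output directory only contains step folders. As an alternative sketch using only the standard library, the helper below (hypothetical name `latest_step_dir`) ignores anything that is not a numeric folder and picks the highest step. It assumes checkpoints are saved in folders named after the step count, as in the path printed above.

```python
import os
import tempfile


def latest_step_dir(output_dir):
    """Return the path of the sub-folder with the highest numeric name."""
    steps = [
        d for d in os.listdir(output_dir)
        if d.isdigit() and os.path.isdir(os.path.join(output_dir, d))
    ]
    if not steps:
        raise FileNotFoundError(f"no numeric step folders in {output_dir}")
    return os.path.join(output_dir, max(steps, key=int))


# Demo on a throwaway directory with made-up folder names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0", "500", "1040", "logs"):
        os.makedirs(os.path.join(tmp, name))
    latest = os.path.basename(latest_step_dir(tmp))
print(latest)
```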


In [None]:
import os
import matplotlib.pyplot as plt
from PIL import Image

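def grid_img(imgs, rows, cols, scale):
  """Paste rows * cols images into one grid image, scaling each by `scale`."""
  assert len(imgs) == rows * cols

  w, h = imgs[0].size
  w, h = int(w*scale), int(h*scale)

  grid = Image.new('RGB', size=(cols*w, rows*h))

  for i, img in enumerate(imgs):
      # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
      img = img.resize((w,h), Image.LANCZOS)
      # Fill the grid row by row: column = i % cols, row = i // cols
      grid.paste(img, box=(i%cols*w, i//cols*h))
  return grid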
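
The paste offsets used by `grid_img` can be checked without any images. The sketch below reproduces only the index arithmetic; `grid_boxes` is a hypothetical helper and is not part of the notebook.

```python
def grid_boxes(n, cols, w, h):
    """Top-left paste offsets for n tiles of size w x h, laid out row by row."""
    return [(i % cols * w, i // cols * h) for i in range(n)]


# Four 10x20 tiles in two columns.
print(grid_boxes(4, 2, 10, 20))
# [(0, 0), (10, 0), (0, 20), (10, 20)]
```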

In [None]:
weights_folder = output_dir
# Keep only numeric step folders, skipping the initial "0" folder
folders = sorted([f for f in os.listdir(weights_folder) if f != "0" and f.isdigit()], key = lambda x: int(x))

rows = 1
cols = 4

imgs_test = []

for folder in folders:
  folder_path = os.path.join(weights_folder, folder)
  image_folder = os.path.join(folder_path, "samples")
  images = [f for f in os.listdir(image_folder)]

  for i in images:
    img_path = os.path.join(image_folder, i)
    r = Image.open(img_path)
    imgs_test.append(r)

if len(imgs_test) != rows * cols:
  raise Exception("The length of imgs is not equal to rows * cols.")

grid_img(imgs_test, rows=rows, cols=cols, scale=1)

Output hidden; open in https://colab.research.google.com to view.

## Converting the weights to a checkpoint (ckpt)

In [None]:
ckpt_path = weights_dir + "/model.ckpt"

half_arg = "--half" # fp16

!python convert_diffusers_to_original_stable_diffusion.py --model_path $weights_dir  --checkpoint_path $ckpt_path $half_arg
print(f"Converted to ckpt and saved in {ckpt_path}")

Reshaping encoder.mid.attn_1.q.weight for SD format
Reshaping encoder.mid.attn_1.k.weight for SD format
Reshaping encoder.mid.attn_1.v.weight for SD format
Reshaping encoder.mid.attn_1.proj_out.weight for SD format
Reshaping decoder.mid.attn_1.q.weight for SD format
Reshaping decoder.mid.attn_1.k.weight for SD format
Reshaping decoder.mid.attn_1.v.weight for SD format
Reshaping decoder.mid.attn_1.proj_out.weight for SD format
Converted to ckpt and saved in /content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyz/1280/model.ckpt


## Inference (tests)

In [None]:
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline, DDIMScheduler
from IPython.display import display

In [None]:
model_path = weights_dir
print(model_path)

/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyz/1280


In [None]:
pipe = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16).to('cuda')

You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .


In [None]:
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.enable_xformers_memory_efficient_attention()
seed = 777

In [None]:
#@title Saving the Model to HuggingFace
#@markdown If you want your model to be available to everyone, add it to the public library. If you prefer to use your model privately, add it to your own profile.
import sys
import torch
import os
from diffusers import StableDiffusionPipeline
save_concept = True #@param {type:"boolean"}
#@markdown Once you save it you can use your concept by loading the model on any `from_pretrained` function
name_of_your_concept = "xyz" #@param {type:"string"}
where_to_save_concept = "public_library" #@param ["public_library", "privately_to_my_profile"]

#@markdown `hf_token_write`: leave blank if you logged in with a token with `write access` in the [Initial Setup](#scrollTo=KbzZ9xe6dWwf). If not, [go to your tokens settings and create a write access token](https://huggingface.co/settings/tokens)
hf_token_write = "" #@param {type:"string"}

if hf_token_write =="":
  print('Your Hugging Face write access token: ')
  hf_token_write=input()

hf_token = hf_token_write
if(save_concept):
  from slugify import slugify
  from huggingface_hub import HfApi, HfFolder, CommitOperationAdd
  from huggingface_hub import create_repo
  from IPython.display import display_markdown

  api = HfApi()
  your_username = api.whoami(token=hf_token)["name"]


  if(where_to_save_concept == "public_library"):
    repo_id = f"eliftansusunar/{slugify(name_of_your_concept)}"
    #Join the Concepts Library organization if you aren't part of it already
    !curl -X POST -H 'Authorization: Bearer '$hf_token -H 'Content-Type: application/json' https://huggingface.co/organizations/sd-dreambooth-library/share/SSeOwppVCscfTEzFGQaqpfcjukVeNrKNHX
  else:
    repo_id = f"{your_username}/{slugify(name_of_your_concept)}"

  if(not hf_token_write):
    with open(HfFolder.path_token, 'r') as fin: hf_token = fin.read();
  else:
    hf_token = hf_token_write

  images_upload = os.listdir("/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/data/xyz")
  image_string = ""
  #repo_id = f"eliftansusunar/{slugify(name_of_your_concept)}"
  for i, image in enumerate(images_upload):
      image_string = f'''{image_string}![image {i}](https://huggingface.co/{repo_id}/resolve/main/concept_images/{image})
'''
  readme_text = f'''---
license: creativeml-openrail-m
tags:
- text-to-image
---
### {name_of_your_concept} on Stable Diffusion via Dreambooth
#### model by {api.whoami(token=hf_token)["name"]}
This is the Stable Diffusion model fine-tuned on the {name_of_your_concept} concept, taught to Stable Diffusion with Dreambooth.


You can also train your own concepts and upload them to the library by using [this notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_training.ipynb).
And you can run your new concept via `diffusers`: [Colab Notebook for Inference](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_inference.ipynb), [Spaces with the Public Concepts loaded](https://huggingface.co/spaces/sd-dreambooth-library/stable-diffusion-dreambooth-concepts)

Here are the images used for training this concept:
{image_string}
'''
  # Save the README to a file
  with open("README.md", "w") as readme_file:
    readme_file.write(readme_text)
  # Create token_identifier.txt (left empty here; no identifier is written to it)
  open("token_identifier.txt", "w").close()
  operations = [
    CommitOperationAdd(path_in_repo="token_identifier.txt", path_or_fileobj="token_identifier.txt"),
    CommitOperationAdd(path_in_repo="README.md", path_or_fileobj="README.md"),
    CommitOperationAdd(path_in_repo="model.ckpt", path_or_fileobj="/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyz/1280/model.ckpt"),
  ]
  create_repo(repo_id,private=True, token=hf_token)

  api.create_commit(
    repo_id=repo_id,
    operations=operations,
    commit_message=f"Upload the concept {name_of_your_concept} embeds and token",
    token=hf_token
  )
  api.upload_folder(
    folder_path="/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/stable_diffusion_weights/xyz/1280",
    path_in_repo="",
    repo_id=repo_id,
    token=hf_token
  )
  api.upload_folder(
    folder_path='/content/drive/MyDrive/Fast-Dreambooth/Sessions/DreamBooth/data/xyz',
    path_in_repo="concept_images",
    repo_id=repo_id,
    token=hf_token
  )
  display_markdown(f'''## Your concept was saved successfully. [Click here to access it](https://huggingface.co/{repo_id})
''', raw=True)
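
Rather than typing a write token into a cell, it can be read from the environment, with an interactive prompt as a fallback. The following is a minimal sketch: `get_hf_token` is a hypothetical helper, and the variable name `HF_TOKEN` is an assumption about how the environment is set up.

```python
import os


def get_hf_token(env_var="HF_TOKEN", ask=None):
    """Read a token from the environment, optionally falling back to a prompt.

    `ask` is any callable that takes a prompt string and returns the token,
    for example `getpass.getpass`.
    """
    token = os.environ.get(env_var, "").strip()
    if not token and ask is not None:
        token = ask("Hugging Face write access token: ").strip()
    if not token:
        raise ValueError(f"no token found in ${env_var} and none was entered")
    return token
```

In the notebook this could be called as `hf_token_write = get_hf_token(ask=getpass.getpass)` after `import getpass`, so the token never appears in the saved cell.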

## Generating images

In [None]:
prompt = "face portrait of xyz in the snow, realistic, hd, vivid, sunset"
negative_prompt = "bad anatomy, ugly, deformed, disfigured, distorted face, poorly drawn hands, poorly drawn face, poorly drawn feet, blurry, low quality, low definition, lowres, out of frame, out of image, cropped, cut off, signature, watermark"
num_samples = 5
guidance_scale = 10
num_inference_steps = 30
height = 512
width = 512

seed = random.randint(0, 2147483647)
print("Seed: {}".format(str(seed)))
generator = torch.Generator(device='cuda').manual_seed(seed)

with autocast("cuda"), torch.inference_mode():
    imgs = pipe(
        prompt,
        negative_prompt=negative_prompt,
        height=height, width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator
    ).images

for img in imgs:
    display(img)

Output hidden; open in https://colab.research.google.com to view.
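
The cell above draws a fresh seed on every run and prints it. To reproduce a particular result, the printed seed can be fed back in. A small sketch of that pattern (the helper name `pick_seed` is hypothetical):

```python
import random

MAX_SEED = 2147483647  # same upper bound as the cell above (2**31 - 1)


def pick_seed(seed=None):
    """Return `seed` unchanged if given, otherwise draw a random one."""
    if seed is None:
        seed = random.randint(0, MAX_SEED)
    return seed


seed = pick_seed()        # random seed for exploration
print("Seed: {}".format(seed))
seed = pick_seed(777)     # fixed seed to repeat an earlier run
```

The result is then passed to `torch.Generator(device='cuda').manual_seed(seed)` exactly as in the cell above.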

### Testing multiple prompts

In [None]:
prompt = ["photo of xyz, frontal face, Hagia Sophia in the background, natural lighting",
          "photo of xyz in the desert, closeup, pyramids in the background, natural lighting, frontal face",
          "photo of xyz in the forest, natural lighting, frontal face",
          "photo of xyz as an engineer, natural lighting, frontal face, closeup, starry sky in the background",
          "face portrait of xyz in the snow, realistic, hd, vivid, sunset"]

negative_prompt = ["bad anatomy, ugly, deformed, disfigured, distorted face, poorly drawn hands, poorly drawn face, poorly drawn feet, blurry, low quality, low definition, lowres, out of frame, out of image, cropped, cut off, signature, watermark" ] * len(prompt)
num_samples = 1
guidance_scale = 10
num_inference_steps = 30
height = 512
width = 512

seed = random.randint(0, 2147483647) # generates a random value
print("Seed: {}".format(str(seed)))
generator = torch.Generator(device='cuda').manual_seed(seed)

with autocast("cuda"), torch.inference_mode():
    imgs = pipe(
        prompt,
        negative_prompt=negative_prompt,
        height=height, width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator
    ).images

for img in imgs:
    display(img)

Output hidden; open in https://colab.research.google.com to view.

## Saving the results

In [None]:
!mkdir -p results

In [None]:
for i, img in enumerate(imgs):
  img.save('results/result_{}.png'.format(i+1))
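
Because the loop above always writes `result_1.png`, `result_2.png`, and so on, a second run overwrites the first. One possible way around this, sketched below, is to include the seed in the file name. The helper `result_path` and its naming scheme are hypothetical.

```python
from pathlib import Path


def result_path(out_dir, seed, index):
    """Build a file name that records the seed and a 1-based image index."""
    return Path(out_dir) / f"result_seed{seed}_{index:02d}.png"


print(result_path("results", 777, 1).as_posix())
# results/result_seed777_01.png
```

In the saving loop this would replace the format string, for example `img.save(result_path('results', seed, i+1))`.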