<a href="https://colab.research.google.com/github/JosselinTD/Collabs/blob/main/DreamBooth_Stable_Diffusion.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#@title #Pre-requisite
#@markdown Check type of GPU and VRAM available. It should be at least a Tesla T4.
!nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader
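
The line printed by `nvidia-smi` can also be checked from Python. The following is a minimal sketch, assuming the `name, total, free` CSV layout requested by the query above; the helper names `parse_gpu_line` and `has_enough_vram` are hypothetical, and the sample line uses illustrative values rather than captured output. The 9.92 threshold is the lowest VRAM figure in the table of the Start Training section, compared here against MiB / 1024 as an approximation.

```python
def parse_gpu_line(line):
    """Parse one CSV line from
    `nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader`."""
    name, total, free = [field.strip() for field in line.split(",")]

    def to_mib(field):
        # Memory fields look like "15109 MiB"; keep only the number.
        return int(field.split()[0])

    return {"name": name, "total_mib": to_mib(total), "free_mib": to_mib(free)}


def has_enough_vram(line, required_gb):
    """Return True if the free GPU memory is at least `required_gb`."""
    info = parse_gpu_line(line)
    return info["free_mib"] / 1024 >= required_gb


# Illustrative sample line (assumed values, not real output)
sample = "Tesla T4, 15109 MiB, 15109 MiB"
print(parse_gpu_line(sample))
print(has_enough_vram(sample, required_gb=9.92))
```

In a notebook, the sample string could be replaced by the text captured from the `nvidia-smi` command.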




https://github.com/JosselinTD/Collabs/blob/main/DreamBooth_Stable_Diffusion.ipynb

In [None]:
#@title #Parameters

#@markdown The new token you will insert into the model. For example, if you want to add your face, you can use something like firstnameLastname. For the style of an artist, you can use artist_style. What matters is that the TOKEN does not already exist in the dictionary, to avoid conflicts with already-trained data.
TOKEN = 'sks' #@param {type:"string"}

#@markdown The class your TOKEN belongs to. The most common are `person`, `man`, `woman` or `style`. For example, if you want to generate images of your dog, set `dog`.
CLASS = 'person' #@param {type:"string"}

#@markdown This notebook only generates a CKPT file. It is saved in the file explorer on the left, and you have to download it from there. That download is slow, and you will lose the file if your Colab instance is reset. By uploading it to Google Drive, you can download it faster and keep your data once the Colab session is closed.
GDRIVE = True #@param {type: "boolean"}

#@markdown Once the model is trained, it can be saved either complete or light. Complete is 4 GB, light is 2 GB. Complete is the default Stable Diffusion plus your training. Light is a lighter version of Stable Diffusion that works with your new TOKEN but gives poor results without it.
LIGHTWEIGHT_MODEL = False #@param {type: "boolean"}

#@markdown When DreamBooth trains the model with your images, it uses regularization images to improve the result. Regularization images are images generated from the class. For example, if you set the class to `person`, Stable Diffusion will generate images with the prompt `person`. The problem is that SD doesn't always generate good images, so some of these images will not be useful for training. By using handpicked images instead of generated ones, it's possible to improve the training.
#@markdown
#@markdown There are two ways to do that. Either you upload these regularization images yourself to a folder called "regularization_images" in the file explorer on the left, or you use an existing dataset made by someone else. This notebook uses the datasets made by djbielejeski (https://github.com/djbielejeski) for the classes "person", "man" and "woman".
#@markdown
#@markdown If your class is "man", "woman" or "person" and the checkbox is checked, the notebook will fetch images from GitHub and put them in the "regularization_images" folder.
#@markdown
#@markdown When training starts, if the "regularization_images" folder is empty, SD will generate the images. Otherwise it will use the existing ones.
BETTER_TRAINING = True #@param {type: "boolean"}



# Technical variables

## The model to download from HuggingFace
MODEL_NAME = "CompVis/stable-diffusion-v1-4"
DATA_SOURCE = "/content/source"
DATA_REG = "/content/regularization_images"
DATA_TARGET = "/content/target"
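
The rules described for `BETTER_TRAINING` can be summarized as a small decision function. This is only a sketch of the behaviour stated in the parameter descriptions above, not code from the notebook: the names `regularization_plan`, `SUPPORTED_CLASSES`, and `user_uploaded` are hypothetical.

```python
# Classes for which the notebook says a handpicked dataset is available
SUPPORTED_CLASSES = ("person", "man", "woman")


def regularization_plan(class_name, better_training, user_uploaded):
    """Describe where the regularization images come from.

    download: the handpicked dataset is fetched from GitHub.
    generate: the folder is still empty when training starts,
              so Stable Diffusion generates the images itself.
    """
    download = better_training and class_name in SUPPORTED_CLASSES
    generate = not (download or user_uploaded)
    return {"download": download, "generate": generate}


print(regularization_plan("person", True, False))  # dataset is downloaded
print(regularization_plan("dog", True, False))     # no dataset: images are generated
print(regularization_plan("dog", True, True))      # uploaded images are used as-is
```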

# Install Requirements

In [None]:
#@title Installation

!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
%pip install -qq git+https://github.com/ShivamShrirao/diffusers
%pip install -q -U --pre triton
%pip install -q accelerate==0.12.0 transformers ftfy bitsandbytes gradio
%pip install -q https://github.com/metrolobo/xformers_wheels/releases/download/1d31a3ac_various_6/xformers-0.0.14.dev0-cp37-cp37m-linux_x86_64.whl

In [None]:
#@title Login to HuggingFace 🤗

#@markdown You need to accept the model license before downloading or using the Stable Diffusion weights. Please visit the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree. You have to be a registered user of the 🤗 Hugging Face Hub, and you'll also need an access token for the code to work.
from huggingface_hub import notebook_login
!git config --global credential.helper store
notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token


In [None]:
#@title Login to Google Drive

if GDRIVE:
    from google.colab import drive
    drive.mount('/content/drive')

In [None]:
#@title Folders

!mkdir -p $DATA_SOURCE
!mkdir -p $DATA_REG
!mkdir -p $DATA_TARGET

In [None]:
#@title Images

#@markdown ###Your images
#@markdown You can use the file manager in the left panel to upload (drag and drop) your images to the `source` folder.

#@markdown ###Regularization images
#@markdown Run this cell if your class is "person", "man" or "woman" and you have checked the BETTER_TRAINING checkbox. If you have your own regularization images, upload them now. Otherwise do nothing and the script will automatically generate some.
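part = ''
if CLASS == "person":
  part = "person_ddim"
elif CLASS == "man":
  part = "man_1_ddim_step"
elif CLASS == "woman":
  part = "woman_ddim"

# Download only when BETTER_TRAINING is checked and a dataset exists for this class;
# otherwise `part` is empty and the clone URL would be invalid.
if BETTER_TRAINING and part:
  !git clone https://github.com/djbielejeski/Stable-Diffusion-Regularization-Images-{part}.git
  !mv -v Stable-Diffusion-Regularization-Images-{part}/{part}/*.* regularization_images/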
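
Before training, it can help to confirm that both folders actually contain images. The sketch below counts image files in a folder; `count_images` and the `IMAGE_SUFFIXES` set are assumptions made for illustration, and the demo runs on a temporary directory rather than on `/content/source`.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to the files you upload
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def count_images(folder):
    """Count files with an image extension directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo on a temporary directory with two images and one text file
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.PNG", "notes.txt"):
        (Path(tmp) / name).touch()
    print(count_images(tmp))  # 2
```

In the notebook, the same function could be called on `DATA_SOURCE` and `DATA_REG`.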



# Start Training

Use the table below to choose the best flags based on your memory and speed requirements. Tested on Tesla T4 GPU.


| `fp16` | `train_batch_size` | `gradient_accumulation_steps` | `gradient_checkpointing` | `use_8bit_adam` | GB VRAM usage | Speed (it/s) |
| ---- | ------------------ | ----------------------------- | ----------------------- | --------------- | ---------- | ------------ |
| fp16 | 1                  | 1                             | TRUE                    | TRUE            | 9.92       | 0.93         |
| no   | 1                  | 1                             | TRUE                    | TRUE            | 10.08      | 0.42         |
| fp16 | 2                  | 1                             | TRUE                    | TRUE            | 10.4       | 0.66         |
| fp16 | 1                  | 1                             | FALSE                   | TRUE            | 11.17      | 1.14         |
| no   | 1                  | 1                             | FALSE                   | TRUE            | 11.17      | 0.49         |
| fp16 | 1                  | 2                             | TRUE                    | TRUE            | 11.56      | 1            |
| fp16 | 2                  | 1                             | FALSE                   | TRUE            | 13.67      | 0.82         |
| fp16 | 1                  | 2                             | FALSE                   | TRUE            | 13.7       | 0.83          |
| fp16 | 1                  | 1                             | TRUE                    | FALSE           | 15.79      | 0.77         |


Add `--gradient_checkpointing` flag for around 9.92 GB VRAM usage.

Remove the `--use_8bit_adam` flag for full precision. This requires 15.79 GB with `--gradient_checkpointing`, or 17.8 GB without it.
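
Choosing a row from the table can be sketched in code. The values below are copied from the table; the `CONFIGS` list and the `fastest_config` helper are hypothetical names, and picking by iterations per second is only one possible criterion.

```python
# (mixed_precision, train_batch_size, gradient_accumulation_steps,
#  gradient_checkpointing, use_8bit_adam, vram_gb, it_per_s)
CONFIGS = [
    ("fp16", 1, 1, True, True, 9.92, 0.93),
    ("no", 1, 1, True, True, 10.08, 0.42),
    ("fp16", 2, 1, True, True, 10.4, 0.66),
    ("fp16", 1, 1, False, True, 11.17, 1.14),
    ("no", 1, 1, False, True, 11.17, 0.49),
    ("fp16", 1, 2, True, True, 11.56, 1.0),
    ("fp16", 2, 1, False, True, 13.67, 0.82),
    ("fp16", 1, 2, False, True, 13.7, 0.83),
    ("fp16", 1, 1, True, False, 15.79, 0.77),
]


def fastest_config(vram_gb):
    """Return the table row with the highest it/s that fits in `vram_gb`, or None."""
    fitting = [row for row in CONFIGS if row[5] <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda row: row[6])


print(fastest_config(10.0))  # only the 9.92 GB row fits
print(fastest_config(12.0))  # the 11.17 GB fp16 row is the fastest that fits
print(fastest_config(9.0))   # None: no row fits
```

Note that it/s counts iterations, not images: a row with `train_batch_size` 2 processes two images per iteration, so comparing rows by it/s alone understates its throughput.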

In [None]:
!accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$DATA_SOURCE \
  --class_data_dir=$DATA_REG \
  --output_dir=$DATA_TARGET \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt=$TOKEN \
  --class_prompt=$CLASS \
  --seed=1337 \
  --resolution=512 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=20 \
  --sample_batch_size=4 \
  --max_train_steps=900
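
A rough duration for this run can be estimated from the table above. The command uses fp16, a batch size of 1, 8-bit Adam, and no gradient checkpointing, which corresponds to the 1.14 it/s row. The sketch below is plain arithmetic under that assumption; `estimated_minutes` is a hypothetical helper, and the result ignores model download, class image generation, and checkpoint saving.

```python
def estimated_minutes(max_train_steps, it_per_s):
    """Estimate the training time in minutes from a step count and a speed in it/s."""
    return max_train_steps / it_per_s / 60


# 900 steps at 1.14 it/s (fp16, batch size 1, no gradient checkpointing)
print(f"{estimated_minutes(900, 1.14):.1f} min")  # 13.2 min

# Same run with --gradient_checkpointing (0.93 it/s in the table)
print(f"{estimated_minutes(900, 0.93):.1f} min")  # 16.1 min
```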


# Convert weights to ckpt to use in web UIs like AUTOMATIC1111.

In [None]:
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py

ckpt_path = DATA_TARGET + "/" + TOKEN + ".ckpt"
if GDRIVE:
  ckpt_path = "/content/drive/MyDrive/" + TOKEN + ".ckpt"

half_arg = ""
if LIGHTWEIGHT_MODEL:
    half_arg = "--half"
!python convert_diffusers_to_original_stable_diffusion.py --model_path $DATA_TARGET  --checkpoint_path $ckpt_path $half_arg
print(f"[*] Converted ckpt saved at {ckpt_path}")

Reshaping encoder.mid.attn_1.q.weight for SD format
Reshaping encoder.mid.attn_1.k.weight for SD format
Reshaping encoder.mid.attn_1.v.weight for SD format
Reshaping encoder.mid.attn_1.proj_out.weight for SD format
Reshaping decoder.mid.attn_1.q.weight for SD format
Reshaping decoder.mid.attn_1.k.weight for SD format
Reshaping decoder.mid.attn_1.v.weight for SD format
Reshaping decoder.mid.attn_1.proj_out.weight for SD format
[*] Converted ckpt saved at /content/drive/MyDrive/stable_diffusion_weights/sks/model.ckpt
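
The conversion call can also be assembled as an argument list, which avoids passing an empty string when `--half` is not wanted. This is a sketch only: `build_convert_command` is a hypothetical helper, the paths in the demo are example values, and the flags are the ones used in the cell above.

```python
import shlex


def build_convert_command(model_path, checkpoint_path, half=False):
    """Build the argument list for the conversion script."""
    cmd = [
        "python",
        "convert_diffusers_to_original_stable_diffusion.py",
        "--model_path", model_path,
        "--checkpoint_path", checkpoint_path,
    ]
    if half:
        # Save the weights in half precision (the "light" model)
        cmd.append("--half")
    return cmd


# Example values; in the notebook these would be DATA_TARGET and ckpt_path
cmd = build_convert_command("/content/target", "/content/drive/MyDrive/sks.ckpt", half=True)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` instead of using a `!` shell line.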
