<a href="https://colab.research.google.com/github/Zinston/colab_notebooks/blob/main/DreamBooth_Stable_Diffusion_(advanced_settings).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DreamBooth Stable Diffusion (advanced settings fork)

Fork by [Antoine Guenet](https://www.antoineguenet.com).

In [None]:
#@markdown Check type of GPU and VRAM available.
!nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader
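
The query above prints one CSV line per GPU, with memory values in MiB. As a minimal sketch, such a line could be parsed into numbers so that free VRAM can be compared against the training settings further down. The helper name `parse_gpu_line` and the sample line are illustrative assumptions, not part of the original notebook.

```python
def parse_gpu_line(line: str) -> dict:
    """Parse one line of `nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader`."""
    name, total, free = [field.strip() for field in line.split(",")]

    def to_gib(text: str) -> float:
        # "15109 MiB" -> 14.75... (GiB)
        return int(text.split()[0]) / 1024

    return {"name": name, "total_gib": to_gib(total), "free_gib": to_gib(free)}


sample = "Tesla T4, 15109 MiB, 15109 MiB"  # illustrative line, not measured here
info = parse_gpu_line(sample)
print(f"{info['name']}: {info['free_gib']:.2f} GiB free of {info['total_gib']:.2f} GiB")
```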

Based on the DreamBooth training example at https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

# Install Requirements

In [None]:
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
%pip install -qq git+https://github.com/ShivamShrirao/diffusers
%pip install -q -U --pre triton
%pip install -q accelerate==0.12.0 transformers ftfy bitsandbytes gradio

In [None]:
#@title Login to HuggingFace 🤗

#@markdown You need to accept the model license before downloading or using the Stable Diffusion weights. Please visit the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license, and tick the checkbox if you agree. You must be a registered user on the 🤗 Hugging Face Hub, and you will also need an access token for the code to work.
from huggingface_hub import notebook_login
!git config --global credential.helper store
notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token


### Install xformers from precompiled wheel.

In [None]:
%pip install -q https://github.com/metrolobo/xformers_wheels/releases/download/1d31a3ac_various_6/xformers-0.0.14.dev0-cp37-cp37m-linux_x86_64.whl
# These were compiled on Tesla T4, should also work on P100, thanks to https://github.com/metrolobo

# If precompiled wheels don't work, install it with the following command. It will take around 40 minutes to compile.
# %pip install git+https://github.com/facebookresearch/xformers@1d31a3a#egg=xformers


# Settings

## Base settings

In [None]:
#@markdown Name/Path of the initial model (defines `pretrained_model_name_or_path` parameter).
MODEL_NAME = "CompVis/stable-diffusion-v1-4" #@param {type:"string"}
# #@markdown Path to pretrained model or model identifier from [huggingface.co/models](https://www.huggingface.co/models) (required, default `None`):
# pretrained_model_name_or_path="" #@param{type: 'string'}
pretrained_model_name_or_path=MODEL_NAME

#@markdown Whether to pull your images from gdrive.
instance_in_gdrive = True #@param {type:"boolean"}
if instance_in_gdrive:
    from google.colab import drive
    drive.mount('/content/drive')

#@markdown Path for images of the concept for training (defines `instance_data_dir` parameter).
INSTANCE_DIR = "/content/drive/MyDrive/AI/model_images/me" #@param {type:"string"}
if not instance_in_gdrive:
    !mkdir -p $INSTANCE_DIR
# #@markdown A folder containing the training data of instance images (required, default `None`):
# instance_data_dir="" #@param{type: 'string'}
instance_data_dir=INSTANCE_DIR

#@markdown A general name for the class, such as `dog` for dog images (defines the `class_data_dir` parameter as `"/content/data/{CLASS_NAME}"`).
CLASS_NAME = "man" #@param {type:"string"}
# #@markdown The prompt to specify images in the same class as provided instance images (default `None`):
# class_prompt="" #@param{type: 'string'}
class_prompt=CLASS_NAME
CLASS_DIR = f"/content/data/{CLASS_NAME}"
# #@markdown A folder containing the training data of class images (default `None`):
# class_data_dir="" #@param{type: 'string'}
class_data_dir=CLASS_DIR

#@markdown The prompt with identifier specifying the instance (default `None`):
instance_prompt="JohnFDoe" #@param{type: 'string'}

#@markdown Whether to save the model weights directly to Google Drive (takes around 4-5 GB).
save_to_gdrive = True #@param {type:"boolean"}
if save_to_gdrive and not instance_in_gdrive:
    from google.colab import drive
    drive.mount('/content/drive')

#@markdown Enter the directory name to save model at (defines `output_dir` parameter as `"/content/drive/MyDrive/{OUTPUT_DIR}"` if saving to Google Drive else `"/content/{OUTPUT_DIR}"`).

OUTPUT_DIR = "stable_diffusion_weights/sks" #@param {type:"string"}
if save_to_gdrive:
    OUTPUT_DIR = "/content/drive/MyDrive/" + OUTPUT_DIR
else:
    OUTPUT_DIR = "/content/" + OUTPUT_DIR
# #@markdown The output directory where the model predictions and checkpoints will be written (default `"text-inversion-model"`):
# output_dir="text-inversion-model" #@param{type: 'string'}
output_dir=OUTPUT_DIR

print(f"[*] Weights will be saved at {OUTPUT_DIR}")

!mkdir -p $OUTPUT_DIR

#@markdown `sks` is a rare identifier; feel free to replace it.

## Images

In [None]:
#@markdown Upload your images by running this cell.

#@markdown OR

#@markdown You can use the file manager in the left panel to upload (drag and drop) to `INSTANCE_DIR`, which is faster.

import os
from google.colab import files
import shutil

uploaded = files.upload()
for filename in uploaded.keys():
    dst_path = os.path.join(INSTANCE_DIR, filename)
    shutil.move(filename, dst_path)
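
Before training, it can help to confirm which files actually ended up in the instance folder. The following is a minimal sketch of such a check. The helper name `list_instance_images` and the set of accepted suffixes are assumptions made for illustration and are not taken from the training script.

```python
from pathlib import Path

# Assumed set of image formats; adjust to whatever your images use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def list_instance_images(folder):
    """Return the image files found directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

In this notebook it would be called as `list_instance_images(INSTANCE_DIR)`; an empty result suggests the upload went somewhere else.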

## Main training settings


Use the table below to choose the best flags based on your memory and speed requirements. Tested on Tesla T4 GPU.


| `mixed_precision` | `train_batch_size` | `gradient_accumulation_steps` | `gradient_checkpointing` | `use_8bit_adam` | VRAM usage (GB) | Speed (it/s) |
| ---- | ------------------ | ----------------------------- | ----------------------- | --------------- | ---------- | ------------ |
| fp16 | 1                  | 1                             | TRUE                    | TRUE            | 9.92       | 0.93         |
| no   | 1                  | 1                             | TRUE                    | TRUE            | 10.08      | 0.42         |
| fp16 | 2                  | 1                             | TRUE                    | TRUE            | 10.4       | 0.66         |
| fp16 | 1                  | 1                             | FALSE                   | TRUE            | 11.17      | 1.14         |
| no   | 1                  | 1                             | FALSE                   | TRUE            | 11.17      | 0.49         |
| fp16 | 1                  | 2                             | TRUE                    | TRUE            | 11.56      | 1            |
| fp16 | 2                  | 1                             | FALSE                   | TRUE            | 13.67      | 0.82         |
| fp16 | 1                  | 2                             | FALSE                   | TRUE            | 13.7       | 0.83          |
| fp16 | 1                  | 1                             | TRUE                    | FALSE           | 15.79      | 0.77         |

Add `--gradient_checkpointing` flag for around 9.92 GB VRAM usage.

Remove the `--use_8bit_adam` flag to use the full-precision optimizer. This requires 15.79 GB of VRAM with `--gradient_checkpointing`, or 17.8 GB without it.
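
The table can also be read programmatically. The sketch below copies its rows into a list and picks the fastest row whose measured usage fits a given amount of free VRAM. The function name `fastest_config` and the 0.5 GB headroom are illustrative assumptions, and the numbers are only the Tesla T4 measurements quoted above.

```python
# Rows copied from the table:
# (mixed_precision, train_batch_size, gradient_accumulation_steps,
#  gradient_checkpointing, use_8bit_adam, vram_gb, it_per_s)
CONFIGS = [
    ("fp16", 1, 1, True, True, 9.92, 0.93),
    ("no", 1, 1, True, True, 10.08, 0.42),
    ("fp16", 2, 1, True, True, 10.4, 0.66),
    ("fp16", 1, 1, False, True, 11.17, 1.14),
    ("no", 1, 1, False, True, 11.17, 0.49),
    ("fp16", 1, 2, True, True, 11.56, 1.0),
    ("fp16", 2, 1, False, True, 13.67, 0.82),
    ("fp16", 1, 2, False, True, 13.7, 0.83),
    ("fp16", 1, 1, True, False, 15.79, 0.77),
]


def fastest_config(free_vram_gb, headroom_gb=0.5):
    """Return the fastest row that fits in free_vram_gb, or None if none fits."""
    fitting = [row for row in CONFIGS if row[5] + headroom_gb <= free_vram_gb]
    return max(fitting, key=lambda row: row[6]) if fitting else None


print(fastest_config(12.0))
```

Note that it/s counts iterations, not images, so rows with a larger batch size process more images per iteration than this simple comparison suggests.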

In [None]:
#@markdown Whether to use mixed precision. Choose between fp16 and bf16 (bfloat16). Bf16 requires PyTorch >= 1.10 and an Nvidia Ampere GPU (default `"fp16"`):
mixed_precision="fp16" #@param ["no", "fp16", "bf16"]
#@markdown Batch size (per device) for the training dataloader (default `2`):
train_batch_size=2 #@param{type: 'number'}
#@markdown Number of update steps to accumulate before performing a backward/update pass (default `1`):
gradient_accumulation_steps=1 #@param{type: 'number'}
#@markdown Whether or not to use gradient checkpointing to save memory at the expense of a slower backward pass (default `False`):
gradient_checkpointing=False #@param{type: 'boolean'}
#@markdown Whether or not to use 8-bit Adam from [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) (default `True`):
use_8bit_adam=True #@param{type: 'boolean'}
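
Batch size and gradient accumulation combine multiplicatively. As a rough sketch, assuming a single GPU and that each training step corresponds to one optimizer update, the effective batch size and the number of samples seen over a run could be estimated like this (the values mirror the defaults used in this notebook):

```python
train_batch_size = 2
gradient_accumulation_steps = 1
max_train_steps = 1000  # value set in the advanced settings of this notebook

# Samples that contribute to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps
# Total samples processed over the whole run.
samples_seen = effective_batch_size * max_train_steps

print(effective_batch_size, samples_seen)
```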

## Advanced training settings

### Defaults

In [None]:
#@markdown Flag to add prior preservation loss (default `True`):
with_prior_preservation=True #@param{type: 'boolean'}
#@markdown The weight of prior preservation loss (default `1.0`):
prior_loss_weight=1.0 #@param{type: 'number'}
#@markdown A seed for reproducible training (default `1337`, will be `None` if left at `0`):
seed=1337 #@param{type: 'number'}
#@markdown The resolution for input images, all the images in the train/validation dataset will be resized to this resolution (default `512`):
resolution=512 #@param{type: 'number'}
#@markdown Initial learning rate (after the potential warmup period) to use (default `5e-6`):
learning_rate=5e-6 #@param{type: 'number'}
#@markdown The scheduler type to use (default `"constant"`):
lr_scheduler="constant" #@param ["linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup"]
#@markdown Number of steps for the warmup in the lr scheduler (default `0`):
lr_warmup_steps=0 #@param{type: 'number'}
#@markdown Minimum number of class images for prior preservation loss. If there are not enough images, additional images will be sampled with `class_prompt` (default `50`):
num_class_images=50 #@param{type: 'number'}
#@markdown Batch size (per device) for sampling images (default `4`):
sample_batch_size=4 #@param{type: 'number'}
#@markdown Total number of training steps to perform.  If provided, overrides `num_train_epochs` (default `1000`, will be `None` if left at `0`):
max_train_steps=1000 #@param{type: 'number'}
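
The interaction between `num_class_images` and `sample_batch_size` can be worked out by hand. The sketch below follows the description in the settings above (generate only the missing images, in batches); it is not copied from the training script, and the helper name `class_images_to_generate` is hypothetical.

```python
import math


def class_images_to_generate(existing, num_class_images=50, sample_batch_size=4):
    """Return (missing_images, sampling_batches) needed to reach num_class_images."""
    missing = max(0, num_class_images - existing)
    return missing, math.ceil(missing / sample_batch_size)


print(class_images_to_generate(existing=0))   # (50, 13)
print(class_images_to_generate(existing=60))  # (0, 0)
```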

### 8-bit Adam optimization

In [None]:
#@markdown The `beta1` parameter for the Adam optimizer (default `0.9`):
adam_beta1=0.9 #@param{type: 'number'}
#@markdown The `beta2` parameter for the Adam optimizer (default `0.999`):
adam_beta2=0.999 #@param{type: 'number'}
#@markdown Weight decay to use (default `1e-2`):
adam_weight_decay=1e-2 #@param{type: 'number'}
#@markdown Epsilon value for the Adam optimizer (default `1e-08`):
adam_epsilon=1e-08 #@param{type: 'number'}
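
To make the roles of these four parameters concrete, here is a plain-Python sketch of a single Adam update with decoupled weight decay (AdamW) on one scalar parameter. It assumes the training script uses the AdamW form and ignores the 8-bit quantization of optimizer state entirely, so treat it as an illustration of the formula and not as the script's implementation.

```python
import math


def adamw_step(param, grad, m, v, step, lr=5e-6, beta1=0.9, beta2=0.999,
               weight_decay=1e-2, eps=1e-08):
    """One AdamW update for a single scalar parameter (step counts from 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** step)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** step)
    param = param * (1 - lr * weight_decay)     # decoupled weight decay
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


p, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(p)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's magnitude.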

### Hub


In [None]:
#@markdown Whether or not to push the model to the Hub (default `False`):
push_to_hub=False #@param{type: 'boolean'}
#@markdown The token to use to push to the Model Hub (default `None`, will be `None` if left blank):
hub_token="" #@param{type: 'string'}
#@markdown The name of the repository to keep in sync with the local `output_dir` (default `None`, will be `None` if left blank):
hub_model_id="" #@param{type: 'string'}

### Logging

In [None]:
#@markdown [TensorBoard](https://www.tensorflow.org/tensorboard) log directory. Will default to *output_dir/runs/**CURRENT_DATETIME_HOSTNAME*** (default `"logs"`):
logging_dir="logs" #@param{type: 'string'}
#@markdown Log every N steps (default `10`):
log_interval=10 #@param{type: 'number'}

### Others

In [None]:
#@markdown Pretrained tokenizer name or path, if not the same as the model name (optional, default `None`; leave blank to use the model's own tokenizer):
tokenizer_name="" #@param{type: 'string'}
#@markdown Whether to center crop images before resizing to resolution (default `False`):
center_crop=False #@param{type: 'boolean'}
#@markdown Number of training epochs; overridden by `max_train_steps` when that is set (default `1`):
num_train_epochs=1 #@param{type: 'number'}
#@markdown Scale the learning rate by the number of GPUs, gradient accumulation steps, and batch size (default `False`):
scale_lr=False #@param{type: 'boolean'}
#@markdown Max gradient norm (default `1.0`):
max_grad_norm=1.0 #@param{type: 'number'}
#@markdown Do not precompute and cache latents from VAE (default `False`):
not_cache_latents=False #@param{type: 'boolean'}
#@markdown For distributed training: `local_rank` (default `-1`):
local_rank=-1 #@param{type: 'number'}

# Train the model

In [None]:
#@markdown This will run the training with your parameters.

cmd = [
    'accelerate launch train_dreambooth.py',
    f'--pretrained_model_name_or_path="{pretrained_model_name_or_path}"',
    f'--instance_data_dir="{instance_data_dir}"',
    f'--class_data_dir="{class_data_dir}"',
    f'--output_dir="{output_dir}"',
    f'--instance_prompt="{instance_prompt}"',
    f'--class_prompt="{class_prompt}"',
    # Previously defined in the settings but never passed to the script.
    f'--mixed_precision="{mixed_precision}"',
    f'--resolution={resolution}',
    f'--train_batch_size={train_batch_size}',
    f'--gradient_accumulation_steps={gradient_accumulation_steps}',
    f'--learning_rate={learning_rate}',
    f'--lr_scheduler="{lr_scheduler}"',
    f'--lr_warmup_steps={lr_warmup_steps}',
    f'--prior_loss_weight={prior_loss_weight}',
    f'--num_class_images={num_class_images}',
    f'--sample_batch_size={sample_batch_size}',
    f'--adam_beta1={adam_beta1}',
    f'--adam_beta2={adam_beta2}',
    f'--adam_weight_decay={adam_weight_decay}',
    f'--adam_epsilon={adam_epsilon}',
    f'--logging_dir="{logging_dir}"',
    f'--log_interval={log_interval}',
    f'--num_train_epochs={num_train_epochs}',
    f'--max_grad_norm={max_grad_norm}',
    f'--local_rank={local_rank}',
]

# Boolean settings become bare flags, added only when enabled.
if with_prior_preservation:
    cmd.append('--with_prior_preservation')
if use_8bit_adam:
    cmd.append('--use_8bit_adam')
if gradient_checkpointing:
    cmd.append('--gradient_checkpointing')
if push_to_hub:
    cmd.append('--push_to_hub')
if center_crop:
    cmd.append('--center_crop')
if scale_lr:
    cmd.append('--scale_lr')
if not_cache_latents:
    cmd.append('--not_cache_latents')

# Settings documented as "None if left at 0 or blank" are passed only when set.
if seed != 0:
    cmd.append(f'--seed={seed}')
if max_train_steps != 0:
    cmd.append(f'--max_train_steps={max_train_steps}')
if tokenizer_name:
    cmd.append(f'--tokenizer_name="{tokenizer_name}"')
if hub_token:
    cmd.append(f'--hub_token="{hub_token}"')
if hub_model_id:
    cmd.append(f'--hub_model_id="{hub_model_id}"')

cmd_string = ' '.join(cmd)
!$cmd_string
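
Joining hand-quoted fragments with spaces works as long as no value contains a double quote or other shell-special character. A more defensive variant, sketched below, quotes each argument with `shlex.quote` from the standard library. The helper name `build_command` and the example values are illustrative assumptions, not part of the original notebook.

```python
import shlex


def build_command(script, options, flags):
    """Build a shell-safe command string from valued options and boolean flags."""
    args = ["accelerate", "launch", script]
    for name, value in options.items():
        if value not in ("", None):  # skip options left blank
            args.append(f"--{name}={value}")
    args += [f"--{name}" for name, enabled in flags.items() if enabled]
    # Quote every argument so spaces and quotes in prompts survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_command(
    "train_dreambooth.py",
    {"instance_prompt": "photo of sks guy", "hub_token": "", "seed": 1337},
    {"use_8bit_adam": True, "center_crop": False},
))
```

The resulting string could be run with `!$cmd_string` in the same way as the cell above.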

## Convert weights to ckpt to use in web UIs like AUTOMATIC1111.

In [None]:
#@markdown Download script
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py

In [None]:
#@markdown Run conversion.
ckpt_path = OUTPUT_DIR + "/model.ckpt"

half_arg = ""
#@markdown Whether to convert to fp16. This takes half the space (2 GB) but might lose some quality.
fp16 = False #@param {type: "boolean"}
if fp16:
    half_arg = "--half"
!python convert_diffusers_to_original_stable_diffusion.py --model_path $OUTPUT_DIR  --checkpoint_path $ckpt_path $half_arg
print(f"[*] Converted ckpt saved at {ckpt_path}")

## Inference

In [None]:
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from IPython.display import display

model_path = OUTPUT_DIR  # To use a previously trained model saved in Google Drive, replace this with the full path to that model.

pipe = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16).to("cuda")
g_cuda = None

In [None]:
#@markdown Can set random seed here for reproducibility.
g_cuda = torch.Generator(device='cuda')
seed = 52362 #@param {type:"number"}
g_cuda.manual_seed(seed)

# Use the model

In [None]:
#@title Run for generating images.

prompt = "JohnFDoe twerking at a barbecue" #@param {type:"string"}
negative_prompt = "" #@param {type:"string"}
num_samples = 4 #@param {type:"number"}
guidance_scale = 7.5 #@param {type:"number"}
num_inference_steps = 50 #@param {type:"number"}
height = 512 #@param {type:"number"}
width = 512 #@param {type:"number"}

with autocast("cuda"), torch.inference_mode():
    images = pipe(
        prompt,
        height=height,
        width=width,
        negative_prompt=negative_prompt,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=g_cuda
    ).images

for img in images:
    display(img)

In [None]:
#@markdown Run Gradio UI for generating images.
import gradio as gr

def inference(prompt, negative_prompt, num_samples, height=512, width=512, num_inference_steps=50, guidance_scale=7.5):
    with torch.autocast("cuda"), torch.inference_mode():
        return pipe(
                prompt, height=int(height), width=int(width),
                negative_prompt=negative_prompt,
                num_images_per_prompt=int(num_samples),
                num_inference_steps=int(num_inference_steps), guidance_scale=guidance_scale,
                generator=g_cuda
            ).images

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            prompt = gr.Textbox(label="Prompt", value="photo of sks guy, digital painting")
            negative_prompt = gr.Textbox(label="Negative Prompt", value="")
            run = gr.Button(value="Generate")
            with gr.Row():
                num_samples = gr.Number(label="Number of Samples", value=4)
                guidance_scale = gr.Number(label="Guidance Scale", value=7.5)
            with gr.Row():
                height = gr.Number(label="Height", value=512)
                width = gr.Number(label="Width", value=512)
            num_inference_steps = gr.Slider(label="Steps", value=50)
        with gr.Column():
            gallery = gr.Gallery()

    run.click(inference, inputs=[prompt, negative_prompt, num_samples, height, width, num_inference_steps, guidance_scale], outputs=gallery)

demo.launch(debug=True)

# Extras

In [None]:
#@title (Optional) Delete diffuser weights and only keep the ckpt to free up drive space (4GB).

#@markdown [ ! ] Caution: only execute this if you are sure you want to delete the diffusers-format weights and keep only the ckpt.
import shutil
from glob import glob
for f in glob(OUTPUT_DIR+"/*"):
    if not f.endswith(".ckpt"):
        try:
            shutil.rmtree(f)
        except NotADirectoryError:
            continue
        print("Deleted", f)
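
Because the deletion above cannot be undone, it may be worth listing what would be affected first. The following dry-run sketch only inspects the folder and deletes nothing. The helper name `plan_cleanup` is hypothetical, and unlike the cell above it also reports plain files, which `shutil.rmtree` skips.

```python
from pathlib import Path


def plan_cleanup(output_dir):
    """List top-level entries that are not .ckpt files, without deleting anything."""
    return sorted(p for p in Path(output_dir).iterdir() if p.suffix != ".ckpt")
```

In this notebook it would be called as `plan_cleanup(OUTPUT_DIR)` and the result reviewed before running the deletion cell.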

In [None]:
#@title (Optional) Disconnect the session
#@markdown Run to disconnect the session (useful when programmed to run after other cells).

from google.colab import runtime

runtime.unassign()