<a href="https://colab.research.google.com/github/plundh/pl-dreambooth/blob/main/pl_ShivDreamBooth.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is forked from [this repository](https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth) by Shivam Shrirao.

*On your Google Drive:*
1. Put your training images in ***dreambooth/training_images/[TRAINING_FOLDER_NAME]***
2. Put your class images in ***dreambooth/class_images/[CLASS_FOLDER_NAME]***

~ *NOTE: Needs a GPU with 15 GB VRAM or more.* ~
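
The Drive folder layout listed above can be created ahead of time with a short script. This is a minimal sketch and not part of the original notebook: the helper name `ensure_dreambooth_folders` is hypothetical, and `drive_root` is assumed to be the mounted `MyDrive` path (the Settings cell mounts Drive at `/content/google_drive`).

```python
import os

def ensure_dreambooth_folders(drive_root, training_folder, class_folder=None):
    """Create the folder layout this notebook expects and return the paths.

    Hypothetical helper; `drive_root` is assumed to be the mounted MyDrive path.
    """
    base = os.path.join(drive_root, "dreambooth")
    paths = {"training": os.path.join(base, "training_images", training_folder)}
    if class_folder:  # class images are only needed for prior preservation
        paths["class"] = os.path.join(base, "class_images", class_folder)
    for path in paths.values():
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return paths
```

For example, `ensure_dreambooth_folders("/content/google_drive/MyDrive", "monet-images", "artstyle")` would create both folders used by the default settings.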

In [None]:
#@title Check Runtime GPU
!nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader

In [None]:
#@title #1. Settings

import os
from google.colab import drive

# Mount Google Drive if not already mounted
GDRIVE_PATH = "/content/google_drive"
if os.path.isdir(GDRIVE_PATH):
  print(f"Google Drive already mounted at '{GDRIVE_PATH}'")
else:
  drive.mount(GDRIVE_PATH)

#@markdown ###**Input Model**
#@markdown This is the base model that will be modified for this training. Typically, this should be one of the standard Stable Diffusion models. \\
#@markdown To download it automatically, you have to: \\
#@markdown * Be a registered user on the Hugging Face Hub and input your [access token](https://huggingface.co/settings/tokens) here.
#@markdown * Accept the model license before downloading or using the Stable Diffusion weights. Visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox to agree.  
HUGGINGFACE_TOKEN = "" #@param {type:"string"}
INPUT_MODEL_NAME = "runwayml/stable-diffusion-v1-5" #@param ["runwayml/stable-diffusion-v1-5", "compvis/stable-diffusion-v1-4"]

#@markdown \
#@markdown ###**Training Images**
#@markdown _/content/google_drive/MyDrive/dreambooth/training_images/[TRAINING_FOLDER_NAME]_ \\
#@markdown These will be images of the concept you wish to train, i.e., images of an object, an art style, or someone's likeness.
#@markdown * Images: For objects and faces **10-30** training images are sufficient. For art styles, more are often better.\
#@markdown * You may train multiple concepts by adding additional dictionaries to **concepts_input** in the code. \\
TRAINING_FOLDER_NAME = "monet-images" #@param {type:"string"}
CONCEPT_NAME = "p-picasso" #@param {type:"string"}

#@markdown \
#@markdown ###**Class Images**
#@markdown _/content/google_drive/MyDrive/dreambooth/class_images/[CLASS_FOLDER_NAME]_ \\
#@markdown '*Prior Preservation*' uses generated *Class Images* from the *Concept* class to prevent your trained concept from distorting other instances of that class. \
#@markdown * If no folder is provided, class images will be generated prior to training.
#@markdown * The images should be generated by the Input Model.
#@markdown * **Example:** If your training images are of somebody's likeness, train using the class 'person'. The class images should then be images generated by the Input Model using the prompt 'person'.
ENABLE_PRIOR_PRESERVATION = False #@param {type:"boolean"}
CLASS_FOLDER_NAME = "artstyle" #@param {type:"string"}

if ENABLE_PRIOR_PRESERVATION:
  CLASS_NAME = "painting" #@param {type:"string"}
  class_name_space = " " + CLASS_NAME
else:
  CLASS_NAME = ""
  class_name_space = CLASS_NAME

TRAINING_IMAGES_ROOT = GDRIVE_PATH + "/MyDrive/dreambooth/training_images/"
CLASS_IMAGES_ROOT = GDRIVE_PATH + "/MyDrive/dreambooth/class_images/"

#@markdown \
#@markdown ###**Training Settings**
#@markdown *'Steps'* is the unit describing how long and thorough the model will train. \
#@markdown * With too few steps, the model will fail to reproduce the desired characteristics of the *Training Images*. With too many, it will overfit, meaning it generates images whose characteristics resemble the training data more closely than the prompt.
#@markdown * The appropriate number of steps scales more or less linearly with the number of *Training Images*, so *Steps Per Image* is used as a convenient metric. \
#@markdown * **60-100** *Steps Per Image* is a good starting point, but there is no magic formula. \
#@markdown \
#@markdown Checkpoints will be saved as '**[date]** \_( **[token]** @ **[training_image_folder]** \_ **[image_count]** i)_ **[total images]** _ **{steps per total images}** .ckpt' \\
#@markdown To facilitate finding the optimal number of *Steps*, checkpoints can be saved at *Steps Per Image* intervals.

STEPS_PER_IMAGE = 100 #@param {type:"integer"}
STEPS_PER_IMAGE_SAVE_INTERVAL = 20 #@param {type:"integer"}

### CONCEPTS GO HERE #################################################################
# Add a new dictionary to 'concepts_input' for each extra concept you wish to train.

concepts_input = [
    {
        "CONCEPT_NAME":                   CONCEPT_NAME,
        "class_name":                     CLASS_NAME,
        "TRAINING_FOLDER_NAME":           TRAINING_FOLDER_NAME,
        "CLASS_FOLDER_NAME":              CLASS_FOLDER_NAME
#    },
#    {
#        "CONCEPT_NAME":                  "c-monet",
#        "class_name":                    "artstyle",
#        "TRAINING_FOLDER_NAME":          "monet-images",
#        "CLASS_FOLDER_NAME":             "artstyle_ddim"
    }
]
######################################################################################
EXPORT_ONLY_LAST_CHECKPOINT = False #@param {type:"boolean"}

#@markdown \
#@markdown ###**Sample Generation**
#@markdown Image grids of different *'prompts'*, *'steps'* and *'CFG'* settings can be generated for each checkpoint in order to stress test the model.  \
#@markdown * **Note:** This step adds considerable time.
ENABLE_SAMPLE_GENERATION = True #@param {type:"boolean"}

SAMPLE_PROMPT_1 = "portrait film still of jaime lannister, patterned cape, woolen shirt, cinematic" #@param {type:"string"}
SAMPLE_PROMPT_2 = "alhambra exterior courtyard, garden" #@param {type:"string"}
SAMPLE_PROMPT_3 = "close-up of lion on savannah, three-quarter view, photo" #@param {type:"string"}
SAMPLE_PROMPT_4 = "spaceship cockpit, interior, sci-fi, photo" #@param {type:"string"}

SAMPLE_PROMPT_LIST = [
    f"{CONCEPT_NAME}{class_name_space}, {SAMPLE_PROMPT_1}",
    f"{CONCEPT_NAME}{class_name_space}, {SAMPLE_PROMPT_2}",
    f"{CONCEPT_NAME}{class_name_space}, {SAMPLE_PROMPT_3}",
    f"{CONCEPT_NAME}{class_name_space}, {SAMPLE_PROMPT_4}",
    f"{CONCEPT_NAME}{class_name_space}"
    ]

#@markdown \
DISCONNECT_ON_COMPLETION = True #@param {type:"boolean"}
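
The *Steps Per Image* arithmetic described in the training settings can be worked through by hand. The sketch below assumes 20 training images with the default values of 100 steps per image and a save interval of 20; the function name `checkpoint_schedule` is hypothetical, and the countdown from the final step mirrors what the verification cell computes.

```python
def checkpoint_schedule(num_images, steps_per_image, save_interval_per_image):
    """Return (total_steps, checkpoint_steps), counting down from the final step."""
    if num_images <= 0 or save_interval_per_image <= 0:
        raise ValueError("num_images and save_interval_per_image must be positive")
    total_steps = num_images * steps_per_image           # overall training length
    interval = num_images * save_interval_per_image      # steps between checkpoints
    checkpoints = []
    step = total_steps
    while step > 0:
        checkpoints.append(step)
        step -= interval
    return total_steps, checkpoints

total, checkpoints = checkpoint_schedule(num_images=20, steps_per_image=100,
                                         save_interval_per_image=20)
print(total)        # 2000
print(checkpoints)  # [2000, 1600, 1200, 800, 400]
```

With these numbers the run produces five checkpoints, corresponding to 100, 80, 60, 40 and 20 steps per image.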

In [None]:
#@title #2. Verify Settings and Training Data

import math
import fnmatch
import json
from collections import Counter
from prettytable import PrettyTable
from prettytable import SINGLE_BORDER
from IPython.display import Markdown as md

concepts_list = []
checkpoint_list = []

MODELS_ROOT = f"{GDRIVE_PATH}/MyDrive/dreambooth/models"

for concept in concepts_input:
  instance_values = {
        "instance_prompt":      concept["CONCEPT_NAME"],
        "instance_data_dir":    f"{TRAINING_IMAGES_ROOT}{concept['TRAINING_FOLDER_NAME']}/",
        "inst_file_count":      len(os.listdir(f"{TRAINING_IMAGES_ROOT}{concept['TRAINING_FOLDER_NAME']}")),
     }
  
  if ENABLE_PRIOR_PRESERVATION:
    class_values ={
      "class_prompt":         concept["class_name"],
      "class_data_dir":       f"{CLASS_IMAGES_ROOT}{concept['CLASS_FOLDER_NAME']}/",
      "class_file_count":     len(os.listdir(f"{CLASS_IMAGES_ROOT}{concept['CLASS_FOLDER_NAME']}"))
    }
    combined_values = {**instance_values, **class_values}
    concepts_list.append(combined_values)
  else:
    concepts_list.append(instance_values)

if not ENABLE_PRIOR_PRESERVATION: print("☑️ Skipping Prior Preservation")

#print(json.dumps(concepts_list, sort_keys=True, indent=2)) # For debugging
#print("\n")

if len(concepts_list) == 1:
    print("Training 1 concept")
else:
    print(f"Training {len(concepts_list)} concepts")

# Build the combined tokens once, before the loop, using each concept's own class prompt.
combined_token = ",".join(c['instance_prompt'] for c in concepts_list)
combined_token_class_folder = ",".join(
    "(" + str(c['instance_prompt']) + (f"_{c['class_prompt']}" if ENABLE_PRIOR_PRESERVATION else "")
    + "@" + str(os.path.basename(os.path.normpath(c['instance_data_dir'])))
    + "_" + str(c['inst_file_count']) + "i" + ")"
    for c in concepts_list
)

for concept in concepts_list:
  print("\n")
  if concept['inst_file_count'] == 0:
    print(f"❌ No training images found in '{concept['instance_data_dir']}'")
  else:
    print(f"✅ {concept['inst_file_count']} training images found in '{concept['instance_data_dir']}'")

  if ENABLE_PRIOR_PRESERVATION:
    if concept['class_file_count'] == 0:
      print(f"❌ No class images found in '{concept['class_data_dir']}'")
    else:
      print(f"✅ {concept['class_file_count']} class images found in '{concept['class_data_dir']}'")

  concept_table = PrettyTable(header=False)
  concept_table.align = "l"
  concept_table.set_style(SINGLE_BORDER)

  for key, value in concept.items():
    concept_table.add_row([key, value])
  print(concept_table)

with open("concepts_list.json", "w") as f:
    json.dump(concepts_list, f, indent=4)

# Class image counts only exist when prior preservation is enabled.
total_class_images = sum(concept.get('class_file_count', 0) for concept in concepts_list)

## Output paths
total_training_images = sum([concept["inst_file_count"] for concept in concepts_list])
date_string = !date +"%Y-%m-%d_%H-%M"
training_steps = total_training_images * STEPS_PER_IMAGE
save_interval = total_training_images * STEPS_PER_IMAGE_SAVE_INTERVAL

temp_folder = f"{date_string[-1]}_{combined_token_class_folder}"
temp_folder = f"stable_diffusion_weights/{temp_folder}"

os.makedirs(temp_folder, exist_ok=True)
os.makedirs(MODELS_ROOT, exist_ok=True)
os.makedirs(f"{GDRIVE_PATH}/MyDrive/dreambooth/training_images", exist_ok=True)
os.makedirs(f"{GDRIVE_PATH}/MyDrive/dreambooth/class_images", exist_ok=True)

checkpoint_steps = training_steps
checkpoints_table = PrettyTable(['Step', 'Steps per image'])
checkpoints_table.set_style(SINGLE_BORDER)
checkpoints_table.sortby = "Step"


def output_file_name(steps_per_img):
  return f"{date_string[-1]}_{combined_token_class_folder}_[{total_training_images}]_{{{steps_per_img}}}"

model_output_dir = f"{MODELS_ROOT}/{output_file_name(STEPS_PER_IMAGE)}"
os.makedirs(model_output_dir, exist_ok=True)

def output_file_path(steps_per_img):
  return f"{model_output_dir}/{output_file_name(steps_per_img)}.ckpt"

while checkpoint_steps > 0:
  steps_per_img = int(checkpoint_steps / total_training_images)
  checkpoints_table.add_row([checkpoint_steps, steps_per_img])
  checkpoint_list.append({"steps": checkpoint_steps, "steps_per_img": steps_per_img, "output_file_name": output_file_name(steps_per_img), "output_file_path": output_file_path(steps_per_img), "temp_folder_path": f"{temp_folder}/{checkpoint_steps}"})
  checkpoint_steps -= STEPS_PER_IMAGE_SAVE_INTERVAL * total_training_images

print("\n")
print(f"Saving checkpoints every {STEPS_PER_IMAGE_SAVE_INTERVAL} steps per image:")
print(checkpoints_table)
print("\n")
print("Ready for training ...")
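
The checkpoint naming pattern described in the settings is easier to read with a concrete value. The sketch below is an illustration only: the function name `checkpoint_file_name` and the date string are made up for the example, and the class-name suffix that is added when prior preservation is enabled is left out for brevity.

```python
def checkpoint_file_name(date, concepts, steps_per_image):
    """Build a checkpoint file name following the notebook's naming pattern.

    `concepts` is a list of (token, training_folder, image_count) tuples.
    Hypothetical helper; the class-name suffix is omitted for brevity.
    """
    token_part = ",".join(f"({token}@{folder}_{count}i)" for token, folder, count in concepts)
    total_images = sum(count for _, _, count in concepts)
    # Triple braces emit literal '{' and '}' around the steps-per-image value.
    return f"{date}_{token_part}_[{total_images}]_{{{steps_per_image}}}.ckpt"

name = checkpoint_file_name("2022-11-20_10-30", [("p-picasso", "monet-images", 20)], 100)
print(name)  # 2022-11-20_10-30_(p-picasso@monet-images_20i)_[20]_{100}.ckpt
```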

In [None]:
#@title #3. Install Requirements

!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
!wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py
%pip install -qq git+https://github.com/ShivamShrirao/diffusers
%pip install -q -U --pre triton
%pip install -q accelerate==0.12.0 transformers ftfy bitsandbytes gradio natsort
%pip install -q https://github.com/metrolobo/xformers_wheels/releases/download/1d31a3ac_various_6/xformers-0.0.14.dev0-cp37-cp37m-linux_x86_64.whl
# These were compiled on Tesla T4, should also work on P100, thanks to https://github.com/metrolobo

# If precompiled wheels don't work, install it with the following command. It will take around 40 minutes to compile.
# %pip install git+https://github.com/facebookresearch/xformers@1d31a3a#egg=xformers

!mkdir -p ~/.huggingface
!echo -n "{HUGGINGFACE_TOKEN}" > ~/.huggingface/token

print("Done!")

In [None]:
#@title #4. Training

if ENABLE_PRIOR_PRESERVATION:
  !accelerate launch train_dreambooth.py \
    --pretrained_model_name_or_path="{INPUT_MODEL_NAME}" \
    --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
    --output_dir="{temp_folder}" \
    --with_prior_preservation \
    --prior_loss_weight=1.0 \
    --num_class_images=1000  \
    --seed=1337 \
    --resolution=512 \
    --train_batch_size=1 \
    --train_text_encoder \
    --revision="fp16" \
    --use_8bit_adam \
    --gradient_accumulation_steps=1 \
    --learning_rate=1e-6 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps="{training_steps}" \
    --save_interval="{save_interval}" \
    --pad_tokens \
    --concepts_list="concepts_list.json"

else:
  !accelerate launch train_dreambooth.py \
    --pretrained_model_name_or_path="{INPUT_MODEL_NAME}" \
    --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
    --output_dir="{temp_folder}" \
    --seed=1337 \
    --resolution=512 \
    --train_batch_size=1 \
    --train_text_encoder \
    --revision="fp16" \
    --use_8bit_adam \
    --gradient_accumulation_steps=1 \
    --learning_rate=5e-6 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps="{training_steps}" \
    --save_interval="{save_interval}" \
    --pad_tokens \
    --concepts_list="concepts_list.json"


print("Done!")
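
The two branches of the training cell differ only in the prior-preservation flags and the learning rate. One way to keep them in sync is to assemble the argument list in Python first. This is a sketch under that assumption, not how the notebook runs training: `build_training_args` is a hypothetical helper, and it only uses flags that already appear in the training cell.

```python
def build_training_args(model_name, output_dir, max_train_steps, save_interval,
                        prior_preservation=False):
    """Assemble the train_dreambooth.py arguments used in the training cell."""
    # The training cell uses a lower learning rate when prior preservation is on.
    learning_rate = "1e-6" if prior_preservation else "5e-6"
    args = [
        f"--pretrained_model_name_or_path={model_name}",
        "--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse",
        f"--output_dir={output_dir}",
        "--seed=1337",
        "--resolution=512",
        "--train_batch_size=1",
        "--train_text_encoder",
        "--revision=fp16",
        "--use_8bit_adam",
        "--gradient_accumulation_steps=1",
        f"--learning_rate={learning_rate}",
        "--lr_scheduler=constant",
        "--lr_warmup_steps=0",
        f"--max_train_steps={max_train_steps}",
        f"--save_interval={save_interval}",
        "--pad_tokens",
        "--concepts_list=concepts_list.json",
    ]
    if prior_preservation:
        args += ["--with_prior_preservation", "--prior_loss_weight=1.0",
                 "--num_class_images=1000"]
    return args

command = ["accelerate", "launch", "train_dreambooth.py",
           *build_training_args("runwayml/stable-diffusion-v1-5", "out", 2000, 400)]
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` instead of duplicating the shell command in both branches.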

In [None]:
#@title #5. Export Checkpoint files to Google Drive
from natsort import natsorted
from glob import glob
import os

half_arg = ""
fp16 = True
if fp16:
    half_arg = "--half"

def convert_diffusers(checkpoint_folder_name, steps_per_img):
  # Shell commands run with '!' do not raise Python exceptions on failure,
  # so success is confirmed by checking that the output file exists.
  target_path = output_file_path(steps_per_img)
  !python convert_diffusers_to_original_stable_diffusion.py --model_path "{checkpoint_folder_name}" --checkpoint_path "{target_path}" {half_arg}
  if os.path.isfile(target_path):
    print(f"✅ Saved model '{target_path}'")
  else:
    print(f"❌ Could not save model '{target_path}'")

if EXPORT_ONLY_LAST_CHECKPOINT:
  convert_diffusers(checkpoint_list[0]["temp_folder_path"], checkpoint_list[0]["steps_per_img"])
else:
  for checkpoint in checkpoint_list:
    convert_diffusers(checkpoint["temp_folder_path"], checkpoint["steps_per_img"])

if ENABLE_PRIOR_PRESERVATION:
  # Mirror the sample prompts: concept token followed by the class name.
  invocation = ", ".join(f"{c['instance_prompt']} {c['class_prompt']}" for c in concepts_list)
else:
  invocation = combined_token
print(f"To invoke your trained subject, use '{invocation}' in the prompt.")

In [None]:
#@title #6. Generate Sample Images
import torch
from contextlib import closing
from torch import autocast
from PIL import Image
from PIL import ImageDraw
from PIL import ImageFont
from diffusers import StableDiffusionPipeline, DDIMScheduler
from IPython.display import display
from google.colab import runtime
from tqdm import tqdm
import numpy
!wget https://github.com/MaxGhenis/random/raw/master/Roboto-Regular.ttf

numpy.random.seed(10)

def image_grid(imgs, rows, cols):
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    grid_w, grid_h = grid.size

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

if ENABLE_SAMPLE_GENERATION:
  os.makedirs(f'{model_output_dir}/samples', exist_ok=True)
  sample_steps = [10, 30, 60, 100, 150]
  sample_CFG = [4, 6, 9, 13, 18]
  height = 512
  width = 512

  print(f"Generating {len(checkpoint_list)} sample grids with {len(sample_steps) * len(sample_CFG)} images of size {width}px * {height}px ...")
  print("\n")

  for checkpoint in tqdm(checkpoint_list):
    checkpoint["sample_images"] = []
    device = "cuda"
    scheduler = DDIMScheduler(
        beta_start=0.00085,
        beta_end=0.012,
        beta_schedule="scaled_linear",
        clip_sample=False,
        set_alpha_to_one=False
        )
    
    pipe = StableDiffusionPipeline.from_pretrained(
        checkpoint['temp_folder_path'],
        scheduler=scheduler,
        revision="fp16",
        torch_dtype=torch.float16
        ).to(device)

    generator = torch.Generator(device=device)

    latents = None
    seeds = []
    seed = generator.seed()
    seeds.append(seed)
    generator = generator.manual_seed(10)
    
    image_latents = torch.randn(
        (1, pipe.unet.in_channels, height // 8, width // 8),
        generator=generator,
        device=device
    )

    latents = image_latents    
    # latents should have shape (1, 4, 64, 64) in this case
    # print(latents.shape)

    for prompt in SAMPLE_PROMPT_LIST:
      images = []
      print("\n")
      print(f"Generating prompt '{prompt}' ...")
      
      for CFG in sample_CFG:
        for steps in sample_steps:
          with autocast("cuda"):
            image = pipe(
                prompt,
                num_inference_steps=steps,
                guidance_scale=CFG,
                latents=latents,
                ).images

            im = ImageDraw.Draw(image[-1])
            myFont = ImageFont.truetype('Roboto-Regular.ttf', 20)
            im.multiline_text((6, 3), f"Steps: {steps}\nCFG: {CFG}", font=myFont, fill=(255, 255, 255))
            images = images + image

      grid = image_grid(images, rows=len(sample_CFG), cols=len(sample_steps))
      im = ImageDraw.Draw(grid)
      myFont = ImageFont.truetype('Roboto-Regular.ttf', 30)
      im.multiline_text((6, 50), f"Prompt: '{prompt}'\nSteps Per Image: {checkpoint['steps_per_img']}", font=myFont, fill=(255, 255, 255))
      
      file_path = f'{model_output_dir}/samples/<{prompt}>{checkpoint["output_file_name"]}.jpg'
      checkpoint["sample_images"].append({"img": grid, "img_path": file_path})
  
  # Debug
  #import pprint
  #pp = pprint.PrettyPrinter(indent=4)
  #pp.pprint(checkpoint_list)

  for checkpoint in checkpoint_list:
    for image in checkpoint["sample_images"]:
      image["img"].save(image["img_path"])
      print(f'Saved {image["img_path"]}')

print("\n")
print("✅ Done!")

if DISCONNECT_ON_COMPLETION:
  print("Disconnecting Runtime ...")
  runtime.unassign()
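
The sample grids are filled row by row: image `i` is pasted at column `i % cols` and row `i // cols`. Because the CFG loop is the outer loop and the steps loop the inner one, each row of a grid corresponds to one CFG value and each column to one step count. The sketch below reproduces only the offset arithmetic from `image_grid`, without Pillow; `grid_offsets` is a hypothetical name.

```python
def grid_offsets(count, cols, tile_w, tile_h):
    """Return the (x, y) paste offset of each tile, filled row by row."""
    return [((i % cols) * tile_w, (i // cols) * tile_h) for i in range(count)]

offsets = grid_offsets(6, 3, 512, 512)
print(offsets)
# [(0, 0), (512, 0), (1024, 0), (0, 512), (512, 512), (1024, 512)]
```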