 ![bse_logo_textminingcourse](https://bse.eu/sites/default/files/bse_logo_small.png)

# Fine Tuning Model: DreamBooth (Google Colab)

This notebook walks through the training and inference code for a DreamBooth Stable Diffusion model on Google Colab.

The notebooks used in this project are in the folder `4. Fine Tuning Models` of our Git repository:
- "DreamBooth.ipynb"
- "DreamBooth-Inference.ipynb" 


### Setup Environment 

We changed the Google Colab runtime settings to speed up training and inference:

- Change the runtime type to T4 GPU
    - Ensure that the GPU memory size is at least 12 GB

In [None]:
# Check the GPU type and memory size on Google Colab (uncomment to run)
#!nvidia-smi
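
The memory check can also be done programmatically. The sketch below assumes `nvidia-smi` is on the PATH (as it normally is on a Colab GPU runtime) and reads its CSV query output. The names `parse_gpu_memory`, `query_gpus`, and `MIN_MEMORY_MIB` are our own helpers, and the sample line is illustrative rather than captured output.

```python
import subprocess

MIN_MEMORY_MIB = 12 * 1024  # 12 GB threshold from the setup notes (assumed to mean GiB)

def parse_gpu_memory(csv_text):
    """Parse 'name, memory' lines from nvidia-smi CSV output into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory.strip())))
    return gpus

def query_gpus():
    """Run nvidia-smi and return the parsed GPU list (raises if nvidia-smi is missing)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)

# Example with a hard-coded sample line instead of a live query
sample = "Tesla T4, 15360\n"
for name, mib in parse_gpu_memory(sample):
    print(name, mib, "OK" if mib >= MIN_MEMORY_MIB else "too small")
```

On a GPU runtime, `query_gpus()` can replace the hard-coded sample.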

### Import Packages and Install Diffusion Libraries

In [None]:
import os
import shutil

# Image Display
from PIL import Image
from IPython.display import display  # import the function itself; display(image) is called during inference
import matplotlib.pyplot as plt

In [None]:
# Diffuser libraries 

!pip install -qq "ipywidgets>=7,<8"
!git clone https://github.com/huggingface/diffusers
%cd /content/diffusers
!pip install .

In [None]:
# DreamBooth requirements & xFormers Library 

%cd /content/diffusers/examples/dreambooth
!pip install -r requirements.txt
!pip install bitsandbytes
!pip install transformers gradio ftfy accelerate
!pip install xformers

In [None]:
# Model Training 
from diffusers import DiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel
import torch

In [None]:
!pip install huggingface_hub

In [None]:
# Hugging Face 
from huggingface_hub import login

### Data preparation 

In [None]:
%cd /content

if os.path.exists("/content/custom_dataset"):
    print("Removing existing custom_dataset folder")
    !rm -rf /content/custom_dataset

print("Creating new custom_dataset folder")
!mkdir /content/custom_dataset
!mkdir /content/custom_dataset/class_images
!mkdir /content/custom_dataset/instance_images

print('Custom Dataset folder is created: /content/custom_dataset')
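
The same folder reset can be written in pure Python with the `os` and `shutil` modules imported earlier, which avoids shell commands. This is a minimal sketch: `reset_dir` is a hypothetical helper, and the example runs in a temporary directory rather than `/content`.

```python
import os
import shutil
import tempfile

def reset_dir(path, subfolders=()):
    """Delete path if it exists, then recreate it with the given subfolders."""
    if os.path.exists(path):
        shutil.rmtree(path)
    os.makedirs(path)
    for name in subfolders:
        os.makedirs(os.path.join(path, name))

# Demonstrated in a temporary directory; on Colab the root would be /content/custom_dataset
root = os.path.join(tempfile.mkdtemp(), "custom_dataset")
reset_dir(root, subfolders=("class_images", "instance_images"))
print(sorted(os.listdir(root)))
```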

In [None]:
# Preprocessing data size function 

def resize_and_crop_images(folder_path, target_size=512, output_folder=None):
    """
    Resize each image so its shorter edge equals target_size, center-crop it to
    target_size x target_size, and save the result.

    Parameters:
    - folder_path (str): Path to the folder containing the images.
    - target_size (int): Side length of the square output (default is 512).
    - output_folder (str): Where to save the results. Defaults to folder_path,
      which overwrites the originals so training reads the processed images.
    """
    # The original Kaggle path does not exist on Colab; save in place by default
    if output_folder is None:
        output_folder = folder_path
    
    # Create the output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)
    
    # Iterate through all files in the folder
    for filename in os.listdir(folder_path):
        file_path = os.path.join(folder_path, filename)

        # Check if the file is an image
        if os.path.isfile(file_path) and filename.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp')):
            # Open the image
            image = Image.open(file_path)

            # Get the original width and height
            width, height = image.size

            # Calculate the new size while maintaining the aspect ratio
            if width <= height:
                new_width = target_size
                new_height = int(height * (target_size / width))
            else:
                new_width = int(width * (target_size / height))
                new_height = target_size

            # Resize the image
            resized_image = image.resize((new_width, new_height))

            left = (new_width - target_size) // 2
            top = (new_height - target_size) // 2
            right = (new_width + target_size) // 2
            bottom = (new_height + target_size) // 2

            # Perform the center crop
            cropped_image = resized_image.crop((left, top, right, bottom))
            
            # Save the cropped image to the output folder
            cropped_image.save(os.path.join(output_folder, filename))
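
To make the resize-and-crop arithmetic concrete, the sketch below reproduces only the size calculations from the function above, without touching any image files. The helper name `resize_and_crop_geometry` is ours, introduced for illustration.

```python
def resize_and_crop_geometry(width, height, target_size=512):
    """Return the resized (width, height) and the center-crop box (left, top, right, bottom)."""
    if width <= height:
        new_width = target_size
        new_height = int(height * (target_size / width))
    else:
        new_width = int(width * (target_size / height))
        new_height = target_size
    left = (new_width - target_size) // 2
    top = (new_height - target_size) // 2
    right = (new_width + target_size) // 2
    bottom = (new_height + target_size) // 2
    return (new_width, new_height), (left, top, right, bottom)

# A 1024x768 landscape image is scaled to 682x512, then cropped to the middle 512 columns
size, box = resize_and_crop_geometry(1024, 768)
print(size, box)  # (682, 512) (85, 0, 597, 512)
```

The crop box is always `target_size` wide and tall, so every output image is a 512x512 square regardless of the input aspect ratio.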

In [None]:
# Plotting images function 

def show_images_in_one_row(folder_path, target_size=256):
    images = []

    for filename in os.listdir(folder_path):
        file_path = os.path.join(folder_path, filename)
        if os.path.isfile(file_path) and filename.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp')):
            img = Image.open(file_path)
            img = img.resize((target_size, int(target_size * img.size[1] / img.size[0])))
            images.append(img)

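    # Nothing to plot if the folder has no images
    if not images:
        print(f"No images found in {folder_path}")
        return

    # Display images in one row; squeeze=False keeps axes 2-D even for a single image
    fig, axes = plt.subplots(1, len(images), figsize=(len(images) * 3, 3), squeeze=False)
    for ax, img in zip(axes[0], images):
        ax.imshow(img)
        ax.axis('off')
    plt.show()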

### Preprocessing the data 

In [None]:
# Class Images
folder_path = '/content/custom_dataset/class_images'
if len(os.listdir(folder_path)):
  resize_and_crop_images(folder_path)
  show_images_in_one_row(folder_path)

# Instance Images
folder_path = '/content/custom_dataset/instance_images'
resize_and_crop_images(folder_path)
show_images_in_one_row(folder_path)

In [None]:
if os.path.exists("/content/outputs"):
    print("Removing existing outputs folder")
    !rm -rf /content/outputs

print("Creating new outputs folder")
!mkdir /content/outputs

print('Output folder is created: /content/outputs')

### Log in to Hugging Face account

Replace the placeholder "TOKEN_FROM_HF" with your own personal Hugging Face token. The token is needed to save a private model and dataset.

Instructions on using Hugging Face can be found here: https://github.com/maelysjb/Comics-GenAI/blob/main/README.md

In [None]:
login(token="TOKEN_FROM_HF") 
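
To avoid committing a token to the notebook, one option is to read it from an environment variable. The sketch below assumes the token is stored under `HF_TOKEN`; `read_hf_token` is a hypothetical helper, and the value printed at the end is a placeholder, not a real token.

```python
import os

def read_hf_token(env=None, var_name="HF_TOKEN"):
    """Return the token stored in the given environment mapping, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable before logging in")
    return token

# Usage in the notebook (requires huggingface_hub):
# from huggingface_hub import login
# login(token=read_hf_token())

print(read_hf_token({"HF_TOKEN": "hf_example"}))  # placeholder value, not a real token
```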

### Training DreamBooth Diffusion Model 

Replace the Hugging Face model id "DreamBooth2000" (the `--hub_model_id` argument) with the name of your new Hugging Face model. 

In [None]:
!python /content/diffusers/examples/dreambooth/train_dreambooth.py --pretrained_model_name_or_path 'runwayml/stable-diffusion-v1-5' \
                            --revision "fp16" \
                            --instance_data_dir '/content/custom_dataset/instance_images' \
                            --class_data_dir '/content/custom_dataset/class_images' \
                            --instance_prompt 'An image of UnicornGirl in unicorn onesie.' \
                            --class_prompt 'An image of UnicornGirl in unicorn onesie.' \
                            --with_prior_preservation \
                            --prior_loss_weight 1.0 \
                            --num_class_images 100 \
                            --output_dir '/content/outputs' \
                            --resolution 512 \
                            --train_text_encoder \
                            --train_batch_size 2 \
                            --sample_batch_size 2 \
                            --max_train_steps 2000 \
                            --checkpointing_steps 1900 \
                            --gradient_accumulation_steps 1 \
                            --gradient_checkpointing \
                            --learning_rate 1e-6 \
                            --lr_scheduler 'constant' \
                            --lr_warmup_steps=0 \
                            --use_8bit_adam \
                            --validation_prompt 'An image of UnicornGirl in a unicorn onesie.' \
                            --num_validation_images 4 \
                            --mixed_precision "fp16" \
                            --enable_xformers_memory_efficient_attention \
                            --set_grads_to_none \
                            --push_to_hub \
                            --hub_model_id DreamBooth2000 
                            #--report_to 'wandb'
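
The `--with_prior_preservation` and `--prior_loss_weight` flags add a second term to the training loss, computed on the class images, so the model keeps its general notion of the class while learning the new subject. The sketch below illustrates that combination on plain Python lists. It is a simplified illustration under our assumptions, not the code from `train_dreambooth.py`, and the function names are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def dreambooth_loss(instance_pred, instance_target, class_pred, class_target,
                    prior_loss_weight=1.0):
    """Instance loss plus the weighted prior-preservation loss on class images."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss

# Toy numbers standing in for predicted vs. true noise values
loss = dreambooth_loss([0.0, 1.0], [0.0, 0.0], [2.0, 2.0], [1.0, 1.0])
print(loss)  # 0.5 + 1.0 * 1.0 = 1.5
```

With `prior_loss_weight` set to 1.0, as in the command above, both terms contribute equally.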

In [None]:
trained_model_path = '/content/outputs'

# Load the fine-tuned components in half precision so they match the rest of the pipeline
unet = UNet2DConditionModel.from_pretrained(trained_model_path + '/unet', torch_dtype=torch.float16)
text_encoder = CLIPTextModel.from_pretrained(trained_model_path + '/text_encoder', torch_dtype=torch.float16)

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet,
    text_encoder=text_encoder, torch_dtype=torch.float16,
).to("cuda")

### Inference function

In [None]:
def inference(prompt, num_samples, negative_prompt, guidance_scale,
              num_inference_steps, height, width):
    images = pipeline(
        prompt,
        height=height,
        width=width,
        negative_prompt=negative_prompt,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale
    ).images
    for i, image in enumerate(images):
        image.save(f"generated_image_{i}.png") 
        print(f"Generated image {i}:")
        display(image)  

### Display outputs from DreamBooth model

To generate different images of the character, change the prompt while keeping the same phrasing that was used during training. 

Some additional actions and emotions that were tested during inference: 
- walking 
- crying 
- eating 
- with hands on face 
- playing tennis 
- doing yoga 
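
The prompts for these variations can be built by appending each action to the phrase used in training. A minimal sketch, where `BASE_PROMPT`, `ACTIONS`, and `build_prompts` are names we introduce for illustration:

```python
BASE_PROMPT = "An image of UnicornGirl in unicorn onesie"  # phrasing used in training

ACTIONS = ["walking", "crying", "eating", "with hands on face",
           "playing tennis", "doing yoga"]

def build_prompts(base_prompt, actions):
    """Append each action to the base prompt, leaving the trained phrase untouched."""
    return [f"{base_prompt} {action}" for action in actions]

for prompt in build_prompts(BASE_PROMPT, ACTIONS):
    print(prompt)
```

Each resulting string can be passed as `prompt` to the `inference` function defined earlier.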

In [None]:
prompt = "An image of UnicornGirl in unicorn onesie running"
num_samples = 5
negative_prompt = ""
guidance_scale = 7.5
num_inference_steps = 50
height = 512
width = 512

inference(prompt, num_samples, negative_prompt, guidance_scale, num_inference_steps, height, width)