# **EveryDream 2 - Jupyter Notebook**

# Fine-tuning Stable Diffusion base model with a photographer image dataset

### Bella Kotak

**Bella Kotak** is an award-winning UK-based photographer with a strong, distinctive style. For this notebook tutorial, we will fine-tune a Stable Diffusion base model using her recent artwork.

Go and check her Instagram account now at [https://www.instagram.com/bellakotak](https://www.instagram.com/bellakotak) and be amazed!

### Before fine-tuning the base model
If you have not looked at her portfolio yet, do so now: you need a sense of how much the base model has to learn before it can create decent-looking synthetic images that reflect her unique artistic vision.

If you prompt the base Stable Diffusion 1.5 model with **"a black and white photo of a woman wearing a floral crown and holding a bouquet of flowers in the style of Bella Kotak"**, the base model will struggle to generate a picture that truly represents her style, or even follow the prompt. 
  
![Bella before](https://drive.google.com/uc?export=view&id=1iUX_aMLQCulbcLMEMbta9GRsPk4VVG-i)

### After fine-tuning the base model
Fine-tuning the base Stable Diffusion model on hundreds of captioned images by Bella Kotak noticeably improves its ability to generate photographs in her style. The fine-tuned model also follows the prompt more closely.

The following image was generated using the same prompt, seed, 614x512 resolution, and CFG value as the image above.

![Bella after](https://drive.google.com/uc?export=view&id=1GgOyCNIFAkjsvkVcYc7U3SlgppLXMJPX)


It's not a perfect image, yet the differences between the image from the base model and the one from the fine-tuned model are very noticeable. That's the power of fine-tuning a Stable Diffusion base model with a custom image dataset.

# Step 1. Upload training images

### Download and extract the dataset 


We are going to download into our GPU instance an image dataset that has already been prepared. That is, images are organised in folders and each one is captioned in the following format:

_"[description of the image] in the style of Bella Kotak.jpg"_

Running the cell below will:
1) Download a public ZIP file from Google Drive. 

2) Extract the zip file into the **input/dataset** subfolder.

3) Delete the zip file so we are only left with image files. 


In [None]:
import os
import zipfile

# Install gdown (to be able to download files from Google Drive)
!pip install gdown

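# Download dataset
google_id = "1Ifk07HeqxHfCCOCvb5oDF-cdxfkfsuq-"
path_to_dataset = "input/dataset.zip"

# Make sure the target folder exists before downloading
os.makedirs(os.path.dirname(path_to_dataset), exist_ok=True)

if not os.path.exists(path_to_dataset):
    # Curly braces make IPython substitute the Python variables into the shell command
    !gdown {google_id} -O {path_to_dataset}
else:
    print(f"Already downloaded `{path_to_dataset}`")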

# Unzip dataset into 'input' folder
with zipfile.ZipFile(path_to_dataset, 'r') as zip_ref:
    zip_ref.extractall('input/dataset')

# Remove zip file
os.remove(path_to_dataset)

print('Done')
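
As a quick sanity check after extraction, the captions can be previewed by stripping the file extension from each image name. The sketch below assumes the captions live entirely in the filenames, as described above. The helper names `caption_from_filename` and `preview_captions` and the list of image extensions are illustrative choices, not part of EveryDream 2.

```python
import os

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}

def caption_from_filename(filename):
    """Return the caption encoded in an image filename (hypothetical helper)."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    return stem.strip()

def preview_captions(root="input/dataset", limit=5):
    """Walk `root` and return up to `limit` (path, caption) pairs."""
    pairs = []
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                pairs.append((os.path.join(folder, name), caption_from_filename(name)))
                if len(pairs) >= limit:
                    return pairs
    return pairs

# Print a handful of captions (prints nothing if the folder does not exist yet).
for path, caption in preview_captions():
    print(caption)
```

Each printed line should end with "in the style of Bella Kotak"; if it does not, the file is probably not captioned in the expected format.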

# Step 2. Start Training

Once our training images are inside the **input** folder, we are ready to train the model.


### Training configuration
We will train with the following configuration settings:

* **project name**: "sd1_kotak" <= Name of the project. It is convenient to name it in a way that identifies it from other training sessions.
* **data_root**: "input" <= Folder location of the training images
* **max epochs**: 100 <= An epoch is one complete pass of all training images through the trainer. We are doing 100 complete passes.
* **batch size**: 6 <= Number of images that are processed together in each training step
* **sample steps**: 80 <= Determines how frequently sample images are generated. In this case, samples are generated every 80 training steps.
* **save every n epochs**: 20 <= Checkpoints will be saved every 20 epochs (since we are doing 100 epochs, we will end up with 5 checkpoints)
* **learning rate** (lr): 1.2e-6 <= The learning rate determines the pace at which our model learns. This is the default value and generally, it works for most cases. 
* **learning rate scheduler** (lr scheduler): constant <= The scheduler controls the way the learning rate is adjusted over time 
* **save ckpt dir**: "output" <= Folder location of the saved checkpoints

For more information about how to configure the trainer see the related chapter or check EveryDream 2 trainer's [official documentation](https://github.com/victorchall/EveryDream2trainer/blob/main/doc/TRAINING.md). 
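
To make these numbers concrete, the sketch below works out how many training steps, sample rounds, and checkpoints the settings above imply. The dataset size (`num_images = 300`) is a placeholder assumption, since the exact image count is not stated here, and the calculation ignores any trainer-specific batching details that may change the real step count slightly.

```python
import math

# Settings taken from the configuration list above.
max_epochs = 100
batch_size = 6
sample_steps = 80
save_every_n_epochs = 20

# Placeholder assumption: the real dataset size is not stated in this notebook.
num_images = 300

# One step processes one batch, so an epoch needs ceil(images / batch size) steps.
steps_per_epoch = math.ceil(num_images / batch_size)
total_steps = steps_per_epoch * max_epochs

# Samples are generated every `sample_steps` training steps.
num_sample_rounds = total_steps // sample_steps

# A checkpoint is saved every `save_every_n_epochs` epochs.
num_checkpoints = max_epochs // save_every_n_epochs

print(f"steps per epoch:  {steps_per_epoch}")
print(f"total steps:      {total_steps}")
print(f"sample rounds:    {num_sample_rounds}")
print(f"checkpoints:      {num_checkpoints}")
```

Replace `num_images` with your own image count to estimate how long a run will be before starting it.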


### Running the training session

To start the training, run the cell below. This could take a while depending on your number of images, batch size and max_epochs.

In [None]:
# Start the training

%run train.py --resume_ckpt "learn2train/stable-diffusion-v1-5" \
--project_name "sd1_kotak" \
--data_root "input" \
--max_epochs 100 \
--sample_steps 80 \
--batch_size 6 \
--save_every_n_epochs 20 \
--lr 1.2e-6 \
--lr_scheduler constant \
--save_ckpt_dir "output"
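
If you expect to run several sessions with small variations, it can help to keep the settings in one dictionary and build the flag list from it. This is only a sketch: `build_args` is a hypothetical helper, and the flags are exactly the ones used in the cell above.

```python
import shlex

# Same settings as the training cell above, kept in one place.
config = {
    "resume_ckpt": "learn2train/stable-diffusion-v1-5",
    "project_name": "sd1_kotak",
    "data_root": "input",
    "max_epochs": 100,
    "sample_steps": 80,
    "batch_size": 6,
    "save_every_n_epochs": 20,
    "lr": 1.2e-6,
    "lr_scheduler": "constant",
    "save_ckpt_dir": "output",
}

def build_args(settings):
    """Turn a settings dict into a flat ['--key', 'value', ...] list (hypothetical helper)."""
    args = []
    for key, value in settings.items():
        args.extend([f"--{key}", str(value)])
    return args

# shlex.join quotes any value that needs it, so the string is safe to paste into a shell.
command = "train.py " + shlex.join(build_args(config))
print(command)
```

To try a different learning rate, copy the dictionary, change `lr`, and rebuild the command; everything else stays identical between runs.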

# Step 3. Hugging Face upload 

Instead of manually downloading your checkpoints to your computer, you will upload them to your own Hugging Face repository. This has some advantages. First of all, it's free storage (up to 100GB last time we checked). You can also easily share your models or showcase your work to others.


### Enable uploading 

To enable uploading to Hugging Face go to your [Hugging Face's Access Tokens page](https://huggingface.co/settings/tokens) and click the "New Token" button below the "User Access Tokens" box. 

When the "Create a new access token" window pops up, fill in the following:

* Name: **Everydream2** (or whatever you want)
* Role: **Write**

And click "Generate a token" to create your access token. 


### Create repository

Create your own Hugging Face repository to host your checkpoints. Go here: [Create New HF Model](https://huggingface.co/new)

* Model name: **kotak**
* Licence: **leave it blank**
* Private or Public: **your choice**

Make sure you write down the repo name you make for future use.  You can reuse it later.


### Log in to your Hugging Face account

Run the cell below and paste your Hugging Face token into the prompt to log into your account to be able to upload data into your repo.

In [None]:
from huggingface_hub import notebook_login, hf_hub_download
import os
notebook_login()
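
Outside an interactive notebook (for example in a script), pasting the token into a prompt is not practical. A possible alternative, sketched below, reads the token from an environment variable and passes it to `huggingface_hub.login`. The helper `login_from_env` is hypothetical, and storing the token in an environment variable named `HF_TOKEN` is an assumption about your setup.

```python
import os

def login_from_env(var_name="HF_TOKEN"):
    """Log in with a token stored in an environment variable (hypothetical helper).

    Returns True if a token was found and passed to `login`, False otherwise.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        print(f"No token found in ${var_name}; use notebook_login() instead.")
        return False
    # Imported here so the helper can be defined before the package is installed.
    from huggingface_hub import login
    login(token=token)
    return True
```

Calling `login_from_env()` once before the upload step would then take the place of the interactive prompt.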

### Upload checkpoints to your Hugging Face account

Make sure you are **logged in** using the above login cell first.

Use the cell below to upload one or more checkpoints to your personal Hugging Face repository. You should already be authorized with your Hugging Face token if you ran the login cell above.

When you run the cell below, a box will show up and you need to **CLICK** the .ckpt files you want to upload. If you don't click any of the ckpts, nothing will be uploaded.

You will also be required to fill in your username and your repository name:
* Hugging Face username: Look for your username in [HuggingFace account page](https://huggingface.co/settings/account).
* Hugging Face repository name: **kotak**




In [None]:
# Run this cell after reading the instructions of the cell above. 

import glob
import os
from huggingface_hub import HfApi
from ipywidgets import *

all_ckpts = [f for f in glob.glob("*.ckpt")]
  
ckpt_picker = SelectMultiple(options=all_ckpts, layout=Layout(width="600px")) 
hfuser = Text(placeholder='Hugging Face username')
hfrepo = Text(placeholder='Hugging Face repository name')

api = HfApi()
upload_btn = Button(description='Upload')
out = Output()

def upload_ckpts(_):
    repo_id=f"{hfuser.value or hfuser.placeholder}/{hfrepo.value or hfrepo.placeholder}"
    with out:
        if ckpt_picker is None or len(ckpt_picker.value) < 1:
            print("Nothing selected for upload, make sure to click one of the ckpt files in the list, or, you have no ckpt files in the current directory.")
            return
        for ckpt in ckpt_picker.value:
            print(f"Uploading to HF: huggingface.co/{repo_id}/{ckpt}")
            response = api.upload_file(
                path_or_fileobj=ckpt,
                path_in_repo=ckpt,
                repo_id=repo_id,
                repo_type=None,
                create_pr=True,  # upload through a pull request instead of committing directly
            )
            display(response)
        print("DONE")

upload_btn.on_click(upload_ckpts)
box = VBox([ckpt_picker, HBox([hfuser, hfrepo]), upload_btn, out])

display(box)
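
The upload cell falls back to the widget placeholder text when a field is left empty, which would produce an invalid repository ID. One way to guard against that is to validate both fields before calling `upload_file`. The helper `build_repo_id` below is a hypothetical addition, not part of the original cell.

```python
def build_repo_id(username, repo_name):
    """Combine a username and repository name into `username/repo_name`.

    Raises ValueError if either field is empty or contains a slash or whitespace.
    """
    parts = []
    for label, value in (("username", username), ("repository name", repo_name)):
        cleaned = (value or "").strip()
        if not cleaned:
            raise ValueError(f"Hugging Face {label} is empty")
        if "/" in cleaned or any(ch.isspace() for ch in cleaned):
            raise ValueError(f"Hugging Face {label} must not contain '/' or spaces")
        parts.append(cleaned)
    return "/".join(parts)

# "your-username" is a placeholder; "kotak" is the repository name used in this tutorial.
print(build_repo_id("your-username", "kotak"))
```

Inside `upload_ckpts`, the line that builds `repo_id` could call this helper with `hfuser.value` and `hfrepo.value`, and print the error message instead of attempting the upload.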

### Accept the PRs the cell above created

Go back to your Hugging Face repository and accept (merge) the PRs that the cell above created. Once merged, your files appear in the repository, which confirms the uploads.

# Step 4. Test inference on your checkpoints

In [8]:
from ipywidgets import *
from IPython.display import display, clear_output
import os
import gc
import random
import torch
import inspect

from torch import autocast
from diffusers import StableDiffusionPipeline, AutoencoderKL, UNet2DConditionModel, DDIMScheduler, DDPMScheduler, PNDMScheduler, EulerAncestralDiscreteScheduler
from transformers import CLIPTextModel, CLIPTokenizer


checkpoints_ts = []
# Collect every folder that contains a diffusers model_index.json, with its modification time
for root, dirs, files in os.walk("."):
    for file in files:
        if os.path.basename(file) == "model_index.json":
            ts = os.path.getmtime(os.path.join(root, file))
            checkpoints_ts.append((ts, root))

checkpoints = [ckpt for (_, ckpt) in sorted(checkpoints_ts, reverse=True)]
full_width = Layout(width='600px')
half_width = Layout(width='300px')

checkpoint = Dropdown(options=checkpoints, description='Checkpoint:', layout=full_width)
prompt = Textarea(value='a photo of ', description='Prompt:', layout=full_width)
height = IntSlider(value=512, min=256, max=768, step=32, description='Height:', layout=half_width)
width = IntSlider(value=512, min=256, max=768, step=32, description='Width:', layout=half_width)
cfg = FloatSlider(value=7.0, min=0.0, max=14.0, step=0.2, description='CFG Scale:', layout=half_width)
steps = IntSlider(value=30, min=10, max=100, description='Steps:', layout=half_width)
seed = IntText(value=-1, description='Seed:', layout=half_width)
generate_btn = Button(description='Generate', layout=full_width)
out = Output()

def generate(_):
    with out:
        clear_output()
        display(f"Loading model {checkpoint.value}")
        actual_seed = seed.value if seed.value != -1 else random.randint(0, 2**30)

        text_encoder = CLIPTextModel.from_pretrained(checkpoint.value, subfolder="text_encoder")
        vae = AutoencoderKL.from_pretrained(checkpoint.value, subfolder="vae")
        unet = UNet2DConditionModel.from_pretrained(checkpoint.value, subfolder="unet")
        tokenizer = CLIPTokenizer.from_pretrained(checkpoint.value, subfolder="tokenizer", use_fast=False)
        scheduler = DDIMScheduler.from_pretrained(checkpoint.value, subfolder="scheduler")
        text_encoder.eval()
        vae.eval()
        unet.eval()

        text_encoder.to("cuda")
        vae.to("cuda")
        unet.to("cuda")

        pipe = StableDiffusionPipeline(
            vae=vae,
            text_encoder=text_encoder,
            tokenizer=tokenizer,
            unet=unet,
            scheduler=scheduler,
            safety_checker=None, # save vram
            requires_safety_checker=False, # avoid nag
            feature_extractor=None, # must be None if there is no safety checker
        )

        pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
        
        print(inspect.cleandoc(f"""
              Prompt: {prompt.value}
              Resolution: {width.value}x{height.value}
              CFG: {cfg.value}
              Steps: {steps.value}
              Seed: {actual_seed}
              """))
        with autocast("cuda"):
            image = pipe(prompt.value, 
                generator=torch.Generator("cuda").manual_seed(actual_seed),
                num_inference_steps=steps.value, 
                guidance_scale=cfg.value,
                width=width.value,
                height=height.value
            ).images[0]
        del pipe
        gc.collect()
        with torch.cuda.device("cuda"):
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        display(image)
            
generate_btn.on_click(generate)
box = VBox(
    children=[
        checkpoint, prompt, 
        HBox([VBox([width, height]), VBox([steps, cfg])]), 
        seed, 
        generate_btn, 
        out]
)


display(box)
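
The checkpoint dropdown above is filled by scanning the working directory for folders that contain a `model_index.json` file, newest first. The same logic can be written as a small standalone function, which is easier to reuse and test. The function name `find_diffusers_checkpoints` is an assumption made for this sketch.

```python
import os

def find_diffusers_checkpoints(root="."):
    """Return folders under `root` that contain model_index.json, newest first."""
    found = []
    for folder, _dirs, files in os.walk(root):
        if "model_index.json" in files:
            modified = os.path.getmtime(os.path.join(folder, "model_index.json"))
            found.append((modified, folder))
    # Sort by modification time, most recent first, and keep only the folder paths.
    return [folder for _modified, folder in sorted(found, reverse=True)]

print(find_diffusers_checkpoints("."))
```

An empty list here means no diffusers-format checkpoint has been saved under the current directory yet, which would also leave the dropdown empty.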
