# Fine-Tuning Stable Diffusion XL with DreamBooth

Over the past few years, generative AI models have appeared everywhere, from answering complex questions in natural language to generating images and music. In this notebook we use the [Stable Diffusion XL (SDXL)](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model, hosted on Hugging Face, to create images from text prompts. You'll see how to load the SDXL model and use it to generate an image.

From there, you'll see how you can fine-tune the model using [DreamBooth](https://huggingface.co/docs/diffusers/training/dreambooth), a method for easily fine-tuning a text-to-image model. We'll use a small number of photos of **my dog** in this notebook to fine-tune SDXL. This will allow us to generate new images that include **my dog**! 

**IMPORTANT:** This project will utilize additional third-party open source software. Review the license terms of these open source projects before use. Third party components used as part of this project are subject to their separate legal notices or terms that accompany the components. You are responsible for confirming compliance with third-party component license terms and requirements.

## Stable Diffusion XL Model

First, we install the required packages and import the classes and libraries we need to run the notebook.

In [None]:
!pip install --upgrade pip
!pip install -q -r ../requirements.txt

In [None]:
import torch
from diffusers import StableDiffusionXLPipeline, DiffusionPipeline

Next, from the Hugging Face `diffusers` library, we create a `StableDiffusionXLPipeline` object from the SDXL base model. 

In [None]:
model_id = "stabilityai/stable-diffusion-xl-base-1.0"

print(f"\nUsing [{model_id}] as the pre-trained model for this demo\n")

# Load the SDXL base weights in half precision to reduce GPU memory use
pipe = StableDiffusionXLPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
pipe.to("cuda")
# Alternative for GPUs with limited memory: use this instead of pipe.to("cuda")
# pipe.enable_model_cpu_offload()
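
With the pipeline loaded, generating an image from the base model should be a single call. The following is a minimal sketch rather than part of the original notebook: the helper name `generate_image`, the example prompt, and the default of 30 steps are illustrative assumptions. It relies only on the pipeline being callable with a prompt and `num_inference_steps` and returning an object with an `images` list, which is how the final cell of this notebook uses it.

```python
def generate_image(pipeline, prompt, num_inference_steps=30):
    """Run a text-to-image pipeline once and return the first image.

    `generate_image` is a hypothetical helper for this notebook; `pipeline`
    is expected to behave like the `pipe` object created above.
    """
    result = pipeline(prompt, num_inference_steps=num_inference_steps)
    return result.images[0]


# Example use in the notebook (requires the GPU pipeline from the previous cell):
# image = generate_image(pipe, "A picture of a dog in space")
# image
```

Because the base model has not seen the training photos yet, an image produced at this point can serve as a baseline to compare against the fine-tuned result.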

## Fine-Tuning the model with DreamBooth

Fine-tuning continues the training of an existing machine learning model on new data. In our case, we want to teach the SDXL model about **my dog**. This will allow us to create the perfect image of **my dog** in space!

[DreamBooth](https://arxiv.org/abs/2208.12242) provides a way to fine-tune a text-to-image model using only a few images. Let's use this to tune our SDXL model so that it knows about **my dog**!

We have 12 photos of **my dog** in our dataset - let's take a look at one of them.

In [None]:
from IPython.display import Image, display

# Show one of the training photos inline
display(Image(filename='../data/my-data/image01.png'))
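
Before training, it can help to confirm that the data folder contains the files you expect. The sketch below is an illustration, not part of the original notebook: the helper name `list_instance_images` and the set of accepted extensions are assumptions, and it uses only the Python standard library.

```python
from pathlib import Path

# Extensions treated as images here; this set is an assumption for the sketch.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def list_instance_images(data_dir):
    """Return the sorted image files found directly inside `data_dir`."""
    folder = Path(data_dir)
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


# Example use in the notebook:
# images = list_instance_images("../data/my-data")
# print(f"Found {len(images)} training images")
```

For the dataset described above, the reported count should be 12.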

In [None]:
# Clone the `diffusers` repo, check out a pinned release, and install pinned versions of peft and huggingface_hub

!rm -rf diffusers
!git clone https://github.com/huggingface/diffusers
!cd diffusers && git checkout v0.21.4
!pip install -q peft==0.9.0 "huggingface_hub[cli,torch]==0.21.4"

Now we can use the Hugging Face DreamBooth training script to fine-tune this model. To do this we create a default Accelerate config, then pass flags such as an instance prompt, a resolution, and the number of training steps to the training script.

In [None]:
from accelerate.utils import write_basic_config
write_basic_config()

In [None]:
import os
import torch

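# Cap the block size the CUDA caching allocator may split, to reduce fragmentation.
# Commands launched with `!` inherit this environment variable.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:1024"

# Print total GPU memory (in bytes) and the other device properties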
print(torch.cuda.get_device_properties(0).total_memory)
print(torch.cuda.get_device_properties(0))

In [None]:

!echo ""
!echo "Using [{model_id}] as the pre-trained model for this demo"
!echo ""

torch.cuda.empty_cache()

!accelerate launch ./diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path={model_id}  \
  --instance_data_dir=../data/my-data \
  --output_dir=../models/tuned-my-data \
  --mixed_precision="bf16" \
  --instance_prompt="a photo of my dog" \
  --resolution=768 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=100 \
  --seed="0" \
  --resume_from_checkpoint=latest
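
To get a feel for what the batch and step flags above imply, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions: a single training process, and `--max_train_steps` counting optimizer updates (each update covering `train_batch_size * gradient_accumulation_steps` images). The helper name `training_schedule` is hypothetical.

```python
def training_schedule(num_images, train_batch_size, gradient_accumulation_steps, max_train_steps):
    """Estimate how much data a run sees, assuming one process (one GPU)."""
    # Images contributing to each optimizer update
    effective_batch_size = train_batch_size * gradient_accumulation_steps
    # Total images processed over the whole run (with repetition)
    samples_seen = effective_batch_size * max_train_steps
    return {
        "effective_batch_size": effective_batch_size,
        "samples_seen": samples_seen,
        "passes_over_dataset": samples_seen / num_images,
    }


# Values taken from the training command above and the 12-photo dataset
schedule = training_schedule(
    num_images=12, train_batch_size=1, gradient_accumulation_steps=4, max_train_steps=100
)
print(schedule)
```

Under these assumptions the run uses an effective batch size of 4 and processes 400 images in total, which is roughly 33 passes over the 12 photos.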

Now that the model is fine-tuned, let's tell our notebook where to find it.

In [None]:
# Reload the base model, then apply the LoRA weights written to --output_dir during training
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.load_lora_weights("../models/tuned-my-data")
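
`load_lora_weights` needs the directory that the training command wrote to, which is the value passed as `--output_dir`. If loading fails, a quick check of that folder can help. The sketch below is an illustration only: the helper name `find_lora_weights` is hypothetical, and the file-name pattern `pytorch_lora_weights.*` is an assumption about what the training script saves, so compare it against the actual contents of your output folder.

```python
from pathlib import Path


def find_lora_weights(output_dir):
    """Return files in `output_dir` that look like saved LoRA weights.

    The file-name pattern is an assumption; check what the training run wrote.
    """
    folder = Path(output_dir)
    if not folder.is_dir():
        return []
    return sorted(
        path for path in folder.glob("pytorch_lora_weights.*")
        if path.suffix in {".safetensors", ".bin"}
    )


# Example use in the notebook, with the --output_dir from the training command:
# print(find_lora_weights("../models/tuned-my-data"))
```

An empty list suggests that training did not finish or that the path does not match the `--output_dir` used earlier.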

Finally, we can use our fine-tuned model to create an image with **my dog** in it. Let's give it a go! 

In [None]:
image = pipe("A picture of my dog in space", num_inference_steps=75).images[0]

image