# Description
This notebook lets you finetune the [SDXL](https://huggingface.co/papers/2307.01952) model on a small set of images related to one topic or theme. It uses the DreamBooth and LoRA training methods. You can get good results with fewer than 20 images.

DreamBooth is a training technique that updates the entire diffusion model by training on just a few images of a common subject or style. It works by associating a special prompt, the _instance_prompt_, with the images. The keyword in this prompt is what triggers the tuned weights at inference time.

LoRA is a training technique that significantly reduces the number of trainable parameters. It works by inserting a small number of new weights into the model and training only those, so the weights saved after finetuning are much smaller than a full model checkpoint.
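
To make the parameter saving concrete, here is a minimal pure-Python sketch of the low-rank idea behind LoRA: a frozen weight matrix `W` is left untouched and a trainable update `B @ A` of rank `r` is added on top. The layer size, the rank, and the helper names (`lora_param_count`, `lora_forward`) are illustrative assumptions, not values or functions taken from this notebook or from the diffusers training script.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights in a LoRA update: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Compute W x + (alpha / rank) * B (A x); only A and B would be trained."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


# Hypothetical 1024 x 1024 layer with a rank-4 update
full = full_param_count(1024, 1024)
lora = lora_param_count(1024, 1024, 4)
print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4%}")

# With B initialised to zeros, the adapted layer starts out identical to the base layer
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
B = [[0.0], [0.0]]
print(lora_forward(W, A, B, [2.0, 3.0], alpha=1.0, rank=1))
```

Only `A` and `B` need to be stored after training, which is why a LoRA checkpoint is small compared with the base model.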

#### Experiment
The model was finetuned on 15 images of Moroccan cities of size 512x512, with instance prompt = "a moroccan city".

The finetuning takes ~2 hours on a Tesla P100 GPU with a batch size of 1 and a learning rate of 1e-4.

Some results can be seen below:

![image.png](docs/result_1.jpg)
![image.png](docs/result_2.jpg)



# Setup

In [3]:
!git clone https://github.com/huggingface/diffusers

Cloning into 'diffusers'...

In [None]:
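# These paths assume the repository was cloned to /diffusers (i.e. the notebook runs from /)
! pip install -q /diffusers/.
! pip install -q -r /diffusers/examples/dreambooth/requirements.txt
# Quote the version specifiers so the shell does not treat ">" as an output redirection
! pip install -q "bitsandbytes>=0.40.0"
! pip install -q "xformers>=0.0.20"
! pip install -q "numpy>=1.22.4"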

In [None]:
import os
import torch
from diffusers import DiffusionPipeline, AutoencoderKL
from accelerate.utils import write_basic_config
from huggingface_hub import whoami, upload_folder, create_repo, snapshot_download

write_basic_config()

## Parameters to set

In [None]:
# specify your HF token if you want to push the model to a HF repo
HF_TOKEN = ""
save_to_hf = False

# the directory where the images are stored or will be stored if you choose to download them from HF hub (see below)
img_data_dir = "data" 

# the directory where the lora weights will be stored
model_dir = "model" 

# the prompt that contains the common theme keyword of your images
instance_prompt = "a photo of a moroccan city" 



In [None]:
os.makedirs(model_dir, exist_ok=True)
os.makedirs('output/pretrained/', exist_ok=True)
os.makedirs('output/finetuned/', exist_ok=True)

# Prompt the pre-trained model
Run this section only if you want to test the pretrained SDXL model.

If you're running this notebook with less than 14GB of VRAM, you cannot run both this section and the finetuning script in the same session. In that case, restart your session before finetuning.
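
A quick way to see which case applies is to read the GPU's total memory before loading anything. The sketch below assumes the 14GB figure above can be read as 14 GiB and uses a hypothetical helper, `can_run_both_sections`; the `torch.cuda` part only runs when PyTorch and a CUDA device are available.

```python
GIB = 1024 ** 3


def can_run_both_sections(total_vram_bytes: int, threshold_gib: float = 14.0) -> bool:
    """Return True when the GPU has at least `threshold_gib` GiB of memory.

    The 14 GiB default mirrors the rule of thumb stated in this notebook;
    it is an assumption, not a measured requirement.
    """
    return total_vram_bytes / GIB >= threshold_gib


try:
    import torch
except ImportError:  # PyTorch missing: skip the live check
    torch = None

if torch is not None and torch.cuda.is_available():
    total = torch.cuda.get_device_properties(0).total_memory
    print(f"GPU memory: {total / GIB:.1f} GiB")
    if not can_run_both_sections(total):
        print("Restart the session after this section, before finetuning.")
```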

In [None]:
prompt = "moroccan city"

In [None]:
pipeline = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")

num_inference_steps = 50 
guidance_scale = 10
image = pipeline(prompt,
                 num_inference_steps=num_inference_steps,
                 guidance_scale=guidance_scale).images[0]
image

# Finetuning

## Download dataset from HF (optional)

In [None]:
# os.makedirs(img_data_dir, exist_ok=True)
# snapshot_download(
#     "imomayiz/morocco-img",
#     local_dir=img_data_dir,
#     repo_type="dataset",
#     ignore_patterns=".gitattributes",
# )

## Finetune
Upload your images to _img_data_dir_ or download an image dataset from HF by executing the cell above.
 
Set the training parameters depending on the available VRAM.

The weights will be saved under _model_dir_.
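
When adjusting the batch size and gradient accumulation to fit your VRAM, it helps to check how much data the run actually sees. The sketch below uses the values from the launch command in this section and the 15 images from the experiment described at the top. It assumes a single GPU and that `max_train_steps` counts optimizer updates rather than individual forward passes; `training_summary` is a hypothetical helper, not part of diffusers.

```python
def training_summary(
    num_images: int,
    train_batch_size: int,
    gradient_accumulation_steps: int,
    max_train_steps: int,
):
    """Return (effective batch size, samples seen, passes over the dataset)."""
    # Gradients from several small batches are accumulated before each update
    effective_batch = train_batch_size * gradient_accumulation_steps
    samples_seen = effective_batch * max_train_steps
    passes = samples_seen / num_images
    return effective_batch, samples_seen, passes


effective_batch, samples_seen, passes = training_summary(
    num_images=15,
    train_batch_size=1,
    gradient_accumulation_steps=3,
    max_train_steps=500,
)
print(f"effective batch size: {effective_batch}")
print(f"samples seen: {samples_seen}")
print(f"passes over the dataset: {passes:.0f}")
```

Lowering `train_batch_size` and raising `gradient_accumulation_steps` by the same factor keeps the effective batch size constant while reducing peak memory.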

In [None]:
# IPython replaces {name} with the value of the Python variable before running the shell command
! accelerate launch /diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --instance_data_dir="{img_data_dir}" \
  --output_dir="{model_dir}" \
  --instance_prompt="{instance_prompt}" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=3 \
  --gradient_checkpointing \
  --learning_rate=1e-4 \
  --snr_gamma=5.0 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --max_train_steps=500 \
  --checkpointing_steps=717 \
  --seed="0"

## Test the finetuned model

In [None]:
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True
)
pipe.load_lora_weights(model_dir)
_ = pipe.to("cuda")

In [None]:
prompt = "a photo of a modern moroccan city" # @param

image = pipe(prompt=prompt, num_inference_steps=20).images[0]
image

## Save model to HF (optional)

In [None]:
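def save_model_card(
    repo_id: str,
    images=None,
    base_model: str = "",
    train_text_encoder=False,
    instance_prompt: str = "",
    validation_prompt=None,
    repo_folder=None,
    vae_path=None,
):
    """Write a README.md model card (YAML front matter + description) into repo_folder."""
    # Treat a missing image list as empty so the loop below is safe
    images = images or []
    img_str = "widget:\n" if images else ""
    for i, image in enumerate(images):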
        image.save(os.path.join(repo_folder, f"image_{i}.png"))
        img_str += f"""
        - text: '{validation_prompt if validation_prompt else ' ' }'
          output:
            url:
                "image_{i}.png"
        """

    yaml = f"""
---
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- lora
- template:sd-lora
{img_str}
base_model: {base_model}
instance_prompt: {instance_prompt}
license: openrail++
---
    """

    model_card = f"""
# SDXL LoRA DreamBooth - {repo_id}

<Gallery />

## Model description

These are {repo_id} LoRA adaption weights for {base_model}.

The weights were trained  using [DreamBooth](https://dreambooth.github.io/).

LoRA for the text encoder was enabled: {train_text_encoder}.

Special VAE used for training: {vae_path}.

## Trigger words

You should use {instance_prompt} to trigger the image generation.

## Download model

Weights for this model are available in Safetensors format.

[Download]({repo_id}/tree/main) them in the Files & versions tab.

"""
    with open(os.path.join(repo_folder, "README.md"), "w") as f:
        f.write(yaml + model_card)

In [None]:
if save_to_hf:
    username = whoami(token=HF_TOKEN)["name"]
    repo_id = f"{username}/sdxl_lora"
    repo_id = create_repo(repo_id, exist_ok=True, token=HF_TOKEN).repo_id
    
    save_model_card(
        repo_id = repo_id,
        images=[],
        base_model="stabilityai/stable-diffusion-xl-base-1.0",
        train_text_encoder=False,
        instance_prompt=instance_prompt,
        validation_prompt=None,
        repo_folder=model_dir,  # write README.md into the folder that is uploaded below
        vae_path="madebyollin/sdxl-vae-fp16-fix",
    )

    upload_folder(
        token=HF_TOKEN,
        repo_id=repo_id,
        folder_path=model_dir,
        commit_message="End of training",
        ignore_patterns=["step_*", "epoch_*"],
    )