---

📌 **This notebook has been updated in the [jhj0517/finetuning-notebooks](https://github.com/jhj0517/finetuning-notebooks) repository!**

## Version : 1.0.0
---

In [None]:
#@title #(Optional) Check GPU

#@markdown At least 12GB of VRAM is recommended to train an SDXL LoRA.
#@markdown <br> You also need at least 16GB of CPU RAM, which is unfortunately not available on the Colab free tier.
#@markdown <br>You can check your GPU setup before you start.
!nvidia-smi
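
The VRAM recommendation above can also be checked programmatically. The following is a minimal sketch that is not part of the original notebook: it assumes `nvidia-smi` is on the PATH and relies on its `--query-gpu=memory.total --format=csv,noheader,nounits` query, which reports total memory in MiB. The helper names (`parse_total_vram_mib`, `has_enough_vram`, `query_total_vram`) are hypothetical.

```python
import shutil
import subprocess

RECOMMENDED_VRAM_GIB = 12  # recommendation stated in this notebook


def parse_total_vram_mib(smi_output):
    """Parse one integer (MiB) per GPU from nvidia-smi's CSV output."""
    values = []
    for line in smi_output.splitlines():
        line = line.strip()
        if line.isdigit():  # skip blank lines and values such as "[N/A]"
            values.append(int(line))
    return values


def has_enough_vram(smi_output, minimum_gib=RECOMMENDED_VRAM_GIB):
    """Return True if at least one GPU reports minimum_gib or more."""
    totals = parse_total_vram_mib(smi_output)
    return bool(totals) and max(totals) >= minimum_gib * 1024


def query_total_vram():
    """Run nvidia-smi and return its raw output, or "" if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else ""


if has_enough_vram(query_total_vram()):
    print("GPU meets the recommended VRAM.")
else:
    print("No GPU found, or less than the recommended VRAM.")
```

Keeping the parsing separate from the `nvidia-smi` call makes the threshold logic easy to check without a GPU attached.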

In [4]:
#@title #1. Install Dependencies
#@markdown This notebook is powered by https://github.com/huggingface/diffusers
!git clone https://github.com/huggingface/diffusers
%cd diffusers
!pip install -e .

# Cherry picked dependencies from https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/requirements_sdxl.txt to use in Colab.
!pip install ftfy
!pip install datasets
!pip install bitsandbytes
# Only install this if you want to use optimization with xformers.
# !pip install xformers


# Comment out the requirements above and uncomment the lines below if you're not using Colab.
# !pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
# !pip install deepspeed
# Version specifiers are quoted so the shell does not treat ">" as a redirect.
# !pip install "accelerate>=0.22.0"
# !pip install "transformers>=4.25.1"
# !pip install ftfy
# !pip install tensorboard
# !pip install Jinja2
# !pip install datasets
# !pip install peft==0.7.0
# !pip install xformers

Cloning into 'diffusers'...
remote: Enumerating objects: 100693, done.
remote: Counting objects: 100% (534/534), done.
remote: Compressing objects: 100% (297/297), done.
remote: Total 100693 (delta 405), reused 253 (delta 227), pack-reused 100159 (from 3)
Receiving objects: 100% (100693/100693), 75.07 MiB | 19.06 MiB/s, done.
Resolving deltas: 100% (74292/74292), done.
/content/diffusers
Obtaining file:///content/diffusers
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: diffusers
  Building editable for diffusers (pyproject.toml) ... done
  Created wheel for diffusers: filename=diffusers-0.35.0.dev0-0.editable-py3-none-any.whl size=11370 sha256=09a2ceb5b9be19a429252d79657533e66e7cdbbc3a115ed833b7cf087c289dd6
  Stored i
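
After the install cell finishes, it can be worth confirming that the packages are actually visible to the running kernel. The sketch below is an illustration rather than part of the original notebook; it uses the standard library's `importlib.metadata`, and the helper name `missing_packages` is hypothetical. The package list mirrors what the install cell above installs.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Packages installed by the dependency cell above.
required = ["diffusers", "ftfy", "datasets", "bitsandbytes"]
not_found = missing_packages(required)
print("Missing:", not_found if not_found else "none")
```

If anything is reported as missing, re-running the install cell (or restarting the runtime) is usually the first thing to try.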

In [3]:
#@title # 2. (Optional) Mount Google Drive

#@markdown Mounting Google Drive is not mandatory, but it is recommended so that you can use a Google Drive path for your training image dataset.

#@markdown This notebook uses the DreamBooth LoRA training script from diffusers, so the dataset only needs image files (no caption files).

#@markdown The dataset should have the following structure:

#@markdown ### Example File Structure (Image Files Only):
#@markdown ```
#@markdown your-dataset/
#@markdown ├── a (1).png         # Image file
#@markdown ├── a (2).png         # Another image file
#@markdown ├── a (3).png         # Another image file
#@markdown ```

from google.colab import drive
import os
drive.mount('/content/drive')

Mounted at /content/drive
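
Because the dataset folder is expected to contain image files only, a quick scan before training can catch stray files. This is a minimal sketch, not part of the original notebook: the helper name `split_dataset_entries` is hypothetical, and the extension list is the same one used by the (commented-out) metadata helper in the training cell. The demo runs against a temporary folder; in practice you would pass your own dataset path.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def split_dataset_entries(dataset_dir):
    """Return (image_names, other_names) for the entries directly in dataset_dir."""
    images, others = [], []
    for name in sorted(os.listdir(dataset_dir)):
        if name.lower().endswith(IMAGE_EXTENSIONS):
            images.append(name)
        else:
            others.append(name)
    return images, others


# Demo on a throwaway folder that mimics the example structure above.
demo_dir = tempfile.mkdtemp()
for demo_name in ["a (1).png", "a (2).png", "notes.txt"]:
    open(os.path.join(demo_dir, demo_name), "w").close()

images, others = split_dataset_entries(demo_dir)
print(f"{len(images)} image(s) found")
if others:
    print("Non-image entries to move out of the dataset:", others)
```

To check a real dataset, call `split_dataset_entries("/content/drive/MyDrive/myface")` (the default path used later in this notebook) after mounting Drive.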


In [5]:
#@title # 3. (Optional) Register Huggingface Token To Download Base Model

#@markdown If you don't have the complete base model files ([stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main)) in your Drive, you need to sign in to Hugging Face to download the model.

#@markdown Get your token from https://huggingface.co/settings/tokens and register it in Colab's secrets as **`HF_TOKEN`**; it can then be used in any notebook. ('Read' permission is enough.)

#@markdown To register secrets in Colab, click the key-shaped icon in the left panel and enter your **`HF_TOKEN`** like this:

#@markdown ![image](https://media.githubusercontent.com/media/jhj0517/finetuning-notebooks/master/docs/screenshots/colab_secrets.png)

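import os
from google.colab import userdata

# Read the token from Colab secrets and expose it through the environment.
hf_token = userdata.get('HF_TOKEN')
if not hf_token:
    raise ValueError("HF_TOKEN is empty. Register it in Colab secrets first.")
os.environ['HF_TOKEN'] = hf_token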

print("HF_TOKEN environment variable has been set.")

HF_TOKEN environment variable has been set.
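
When running outside Colab, `google.colab.userdata` is not available, so the token has to come from somewhere else. The sketch below is one possible fallback and is not part of the original notebook: it reads `HF_TOKEN` from the environment and otherwise prompts with `getpass`. The helper name `resolve_hf_token` and its parameters are hypothetical.

```python
import getpass
import os


def resolve_hf_token(environ=None, prompt=getpass.getpass):
    """Return HF_TOKEN from the environment, or ask for it interactively."""
    environ = os.environ if environ is None else environ
    token = (environ.get("HF_TOKEN") or "").strip()
    if not token:
        # Nothing set yet: ask without echoing the token to the screen.
        token = prompt("Hugging Face token: ").strip()
    if not token:
        raise ValueError("No Hugging Face token provided.")
    environ["HF_TOKEN"] = token
    return token


# Usage (prompts only if HF_TOKEN is not already set):
# resolve_hf_token()
```

Passing the environment and the prompt function as parameters keeps the helper easy to exercise without real credentials.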


In [None]:
#@title # 4. Train with Parameters
import os
import json
import re

#@markdown ## Paths Configuration
DATASET_DIR = "/content/drive/MyDrive/myface" # @param {type:"string"}
OUTPUT_DIR = "/content/drive/MyDrive/sdxl/outputs" # @param {type:"string"}
OUTPUT_NAME = "My-SDXL-LoRA-V1" # @param {type:"string"}

OUTPUT_DIR = os.path.join(OUTPUT_DIR, OUTPUT_NAME)
os.makedirs(OUTPUT_DIR, exist_ok=True)

#@markdown ## Base Model Configuration
BASE_MODEL_PATH_OR_ID = "stabilityai/stable-diffusion-xl-base-1.0" # @param {type:"string"}
BASE_VAE_PATH_OR_ID = "madebyollin/sdxl-vae-fp16-fix" # @param {type:"string"}

#@markdown ## Dataset Configuration
# CAPTION_EXTENSION = ".txt" # @param {type:"string"}
RESOLUTION = 1024 # @param {type:"integer"}
# CAPTION_COLUMN = "text"

#@markdown ## Training Settings
MIXED_PRECISION = "bf16" # @param ["no", "fp16", "bf16"]
INSTANCE_PROMPT = "a mia girl" # @param {type:"string"}
RANDOM_FLIP = True # @param {type:"boolean"}
TRAIN_BATCH_SIZE = 1 # @param {type:"integer"}
MAX_TRAIN_STEPS = 1000 # @param {type:"integer"}
CHECKPOINTING_STEPS = 1 # @param {type:"integer"}
LEARNING_RATE = 1e-4 # @param {type:"number"}
LR_SCHEDULER = "constant" # @param ["linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup"]
LR_WARMUP_STEPS = 0 # @param {type:"integer"}
GRADIENT_ACCUMULATION_STEPS = 4 # @param {type:"integer"}
SEED = 77 # @param {type:"integer"}
GRADIENT_CHECKPOINTING = True # @param {type:"boolean"}
USE_8_BIT_ADAM = True # @param {type:"boolean"}
# ENABLE_XFORMERS_MEMORY_EFFICIENT_ATTENTION = False # @param {type:"boolean"}


#@markdown ## Network Settings
RANK = 4 # @param {type:"integer"}


#@markdown ## Validation Configuration
#@markdown WandB is a third-party service; to use it, you need an API key from https://wandb.ai/authorize.
ENABLE_WANDB = False # @param {type:"boolean"}
VALIDATION_PROMPT = "mia is cute"  # @param {type:"string"}
# NUM_VALIDATION_IMAGES = 4 # @param {type:"integer"}
VALIDATION_EPOCHS = 25 # @param {type:"integer"}

# Write Command
command_parts = [
    "accelerate", "launch",
    "\"/content/diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py\"",
]

command_parts.extend([
    f"--pretrained_model_name_or_path=\"{BASE_MODEL_PATH_OR_ID}\"",
    f"--pretrained_vae_model_name_or_path=\"{BASE_VAE_PATH_OR_ID}\"",
    f"--instance_data_dir=\"{DATASET_DIR}\"",
    f"--instance_prompt=\"{INSTANCE_PROMPT}\"",
#    f"--caption_column={CAPTION_COLUMN}",
    f"--mixed_precision={MIXED_PRECISION}",
    f"--resolution={RESOLUTION}",
    f"--max_train_steps={MAX_TRAIN_STEPS}",
    f"--train_batch_size={TRAIN_BATCH_SIZE}",
    f"--checkpointing_steps={CHECKPOINTING_STEPS}",
    f"--learning_rate={LEARNING_RATE}",
    f"--lr_scheduler={LR_SCHEDULER}",
    f"--lr_warmup_steps={LR_WARMUP_STEPS}",
    f"--seed={SEED}",
    f"--output_dir={OUTPUT_DIR}",
    f"--validation_prompt=\"{VALIDATION_PROMPT}\"",
#    f"--num_validation_images={NUM_VALIDATION_IMAGES}",
    f"--validation_epochs={VALIDATION_EPOCHS}",
    f"--gradient_accumulation_steps={GRADIENT_ACCUMULATION_STEPS}",
    f"--rank={RANK}",

])

if RANDOM_FLIP:
    command_parts.append("--random_flip")

if ENABLE_WANDB:
    command_parts.append("--report_to=\"wandb\"")

if GRADIENT_CHECKPOINTING:
    command_parts.append("--gradient_checkpointing")

if USE_8_BIT_ADAM:
    command_parts.append("--use_8bit_adam")

# if ENABLE_XFORMERS_MEMORY_EFFICIENT_ATTENTION:
#     command_parts.append("--enable_xformers_memory_efficient_attention")

# Write metadata.jsonl for the dataset
def create_metadata_jsonl(dataset_dir, caption_extension=".txt"):
    metadata = []
    image_files = [f for f in os.listdir(dataset_dir) if f.lower().endswith(('.png', '.jpg', '.jpeg'))]

    for image_file in image_files:
        base_name = os.path.splitext(image_file)[0]
        caption_file = f"{base_name}{caption_extension}"

        if os.path.exists(os.path.join(dataset_dir, caption_file)):
            try:
                with open(os.path.join(dataset_dir, caption_file), "r", encoding="utf-8") as f:
                    caption = f.read().strip()

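                # Reuse the number from a "name (N)" file name; otherwise number sequentially.
                match = re.search(r"\((\d+)\)", base_name)
                file_number = int(match.group(1)) if match else len(metadata) + 1
                # Keep the original image extension instead of always renaming to ".png".
                image_ext = os.path.splitext(image_file)[1].lower()
                new_file_name = f"{file_number:04d}{image_ext}"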

                metadata.append({"file_name": new_file_name, "text": caption})

                os.rename(os.path.join(dataset_dir, image_file), os.path.join(dataset_dir, new_file_name))
                os.rename(os.path.join(dataset_dir, caption_file), os.path.join(dataset_dir, f"{file_number:04d}{caption_extension}"))

            except Exception as e:
                print(f"Error processing {image_file}: {e}")
        else:
            print(f"Warning: Caption file {caption_file} not found for {image_file}")

    metadata_path = os.path.join(dataset_dir, "metadata.jsonl")
    with open(metadata_path, "w", encoding="utf-8") as outfile:
        for item in metadata:
            json.dump(item, outfile, ensure_ascii=False)
            outfile.write("\n")

# The diffusers DreamBooth script uses a single instance prompt rather than per-image captions,
# so metadata.jsonl is not needed here.
# create_metadata_jsonl(DATASET_DIR, CAPTION_EXTENSION)
# print(f"{os.path.join(DATASET_DIR, 'metadata.jsonl')} has been written.")

# Train
!accelerate config default
command = " ".join(command_parts)
print(command)
!{command}

Configuration already exists at /root/.cache/huggingface/accelerate/default_config.yaml, will not override. Run `accelerate config` manually or pass a different `save_location`.
accelerate launch "/content/diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py" --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" --instance_data_dir="/content/drive/MyDrive/myface" --instance_prompt="a mia girl" --mixed_precision=bf16 --resolution=1024 --max_train_steps=1000 --train_batch_size=1 --checkpointing_steps=1 --learning_rate=0.0001 --lr_scheduler=constant --lr_warmup_steps=0 --seed=77 --output_dir=/content/drive/MyDrive/sdxl/outputs/My-SDXL-LoRA-V1 --validation_prompt="mia is cute" --validation_epochs=25 --gradient_accumulation_steps=4 --rank=4 --random_flip --gradient_checkpointing --use_8bit_adam
ipex flag is deprecated, will be removed in Accelerate v1.10. From 2.7.0, PyTorch has all needed optimi
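
The interaction between the batch, accumulation, and checkpoint settings is easy to misjudge, so a small calculation may help. This is a rough sketch that is not part of the original notebook. It assumes that `--max_train_steps` counts optimizer updates and that a checkpoint is written every `--checkpointing_steps` updates, which is how the diffusers example scripts appear to behave; the helper name `training_budget` is hypothetical.

```python
def training_budget(max_train_steps, train_batch_size, grad_accum_steps,
                    checkpointing_steps, num_processes=1):
    """Estimate effective batch size, samples processed, and checkpoints written."""
    effective_batch = train_batch_size * grad_accum_steps * num_processes
    return {
        "effective_batch_size": effective_batch,
        "samples_seen": effective_batch * max_train_steps,
        "checkpoints_saved": max_train_steps // checkpointing_steps,
    }


# Values from the command printed above.
budget = training_budget(
    max_train_steps=1000,
    train_batch_size=1,
    grad_accum_steps=4,
    checkpointing_steps=1,
)
print(budget)
# {'effective_batch_size': 4, 'samples_seen': 4000, 'checkpoints_saved': 1000}
```

With `CHECKPOINTING_STEPS = 1`, this estimate suggests one checkpoint per step, which can fill Google Drive quickly. Raising the value is one way to reduce that; the training script's `--help` output lists the checkpoint-related options it supports.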

In [None]:
#@title # 5. (Optional) Test your LoRA

from diffusers import DiffusionPipeline
import torch

BASE_MODEL_PATH_OR_ID = "stabilityai/stable-diffusion-xl-base-1.0" # @param {type:"string"}
YOUR_LORA_PATH = "/content/drive/MyDrive/finetuning-notebooks/sdxl/outputs/something/pytorch_lora_weights.safetensors" # @param {type:"string"}
PROMPT = "A picture of a sks dog in a bucket" # @param {type:"string"}

pipe = DiffusionPipeline.from_pretrained(BASE_MODEL_PATH_OR_ID, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.load_lora_weights(YOUR_LORA_PATH)
image = pipe(PROMPT, num_inference_steps=25).images[0]
image.save("sks_dog.png")

from IPython.display import display
display(image)
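
The default `YOUR_LORA_PATH` above does not match the output location configured in step 4, so it may help to derive the path from the same settings. This is a minimal sketch, not part of the original notebook: it assumes the final weights are saved as `pytorch_lora_weights.safetensors` directly inside `OUTPUT_DIR/OUTPUT_NAME` (the file name used in the cell above), and the helper name `lora_weights_path` is hypothetical.

```python
import os

LORA_WEIGHTS_NAME = "pytorch_lora_weights.safetensors"


def lora_weights_path(output_dir, output_name):
    """Build the expected weights path for a run configured as in step 4."""
    return os.path.join(output_dir, output_name, LORA_WEIGHTS_NAME)


# Step 4 defaults from this notebook.
candidate = lora_weights_path("/content/drive/MyDrive/sdxl/outputs", "My-SDXL-LoRA-V1")
if os.path.isfile(candidate):
    print("Found LoRA weights:", candidate)
else:
    print("No weights at", candidate, "(has training finished?)")
```

If the file is found, the printed path can be pasted into `YOUR_LORA_PATH` before running the test cell.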