# LoRA Fine-Tuning of SDXL for Studio Ghibli Style

This notebook provides a complete workflow for fine-tuning a **Stable Diffusion XL (SDXL)** model to generate images in a specific artistic style using **Low-Rank Adaptation (LoRA)**. The goal is to teach the model the "Studio Ghibli" aesthetic using a custom dataset.

---

### Core Process

1.  **Setup & Installation**
    * Installs the required libraries: `diffusers` (from source), `transformers`, `peft`, `bitsandbytes`, `datasets`, `wandb`, and `accelerate`, which launches the training run.
    * Downloads a specialized script for training a Dreambooth LoRA on SDXL.

2.  **Data Preparation**
    * Loads the "Nechintosh/ghibli" image dataset from the Hugging Face Hub.
    * Prepares the training data by saving images locally and creating a `metadata.jsonl` file.
    * Each image caption in the metadata is prepended with the trigger phrase "**Studio Ghibli**" to associate the style with the prompt.

3.  **LoRA Training**
    * Uses `accelerate launch` to run the training script on the available GPU, with gradient checkpointing, fp16 mixed precision, and 8-bit Adam to reduce memory use.
    * Key training parameters are configured, including the base SDXL model, data directories, learning rate, batch size, and resolution.
    * The training progress and metrics are logged to **Weights & Biases (wandb)** for monitoring.

4.  **Inference and Image Generation**
    * Loads the base SDXL model pipeline.
    * Applies the newly trained LoRA weights to the pipeline.
    * Demonstrates how to generate a new image by providing a prompt that includes the "**Studio Ghibli**" trigger, successfully creating an image in the fine-tuned style.
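
As a rough illustration of what "applying LoRA weights" means in step 4, the toy sketch below adds a low-rank update to a frozen weight matrix. The matrices, the rank of 1, and the names `apply_lora`, `lora_down`, and `lora_up` are illustrative assumptions; the real pipeline applies the same idea inside the SDXL attention layers through `diffusers` and `peft`, not through this code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down); the base weight is not modified."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 3x3 frozen weight and a rank-1 adapter (toy values).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lora_down = [[1.0, 2.0, 3.0]]        # shape (rank, in_features) = (1, 3)
lora_up = [[0.5], [0.0], [-0.5]]     # shape (out_features, rank) = (3, 1)

W_adapted = apply_lora(W, lora_down, lora_up)

# Only the two small matrices are trained, not the full weight.
full_params = len(W) * len(W[0])
lora_params = len(lora_down) * len(lora_down[0]) + len(lora_up) * len(lora_up[0])
print(W_adapted)
print(f"full: {full_params} parameters, LoRA: {lora_params} parameters")
```

The saving is small for a 3x3 toy matrix but grows quickly with layer width, because the adapter size scales with `rank * (in_features + out_features)` rather than `in_features * out_features`.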

# Install required libraries (`peft` pinned to 0.15.1, `diffusers` installed from source)


In [None]:
!pip install -U peft==0.15.1 bitsandbytes transformers accelerate git+https://github.com/huggingface/diffusers.git datasets wandb -q

# Download the training script for Dreambooth LoRA SDXL


In [None]:
!wget https://raw.githubusercontent.com/AnKiTu03/dreambot/refs/heads/main/train_dreambooth_lora_sdxl.py

# Import necessary libraries for data handling, visualization, and system operations


In [None]:
from IPython import get_ipython
from IPython.display import display
from datasets import load_dataset
import os
import json
import locale
import wandb

# Log in to Weights & Biases (wandb) for experiment tracking
### Replace the empty string with your actual wandb API key

In [None]:
wandb.login(key="")
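
To avoid pasting the key into the notebook, one option is to read it from the environment. The sketch below is a minimal example that assumes the key has been exported as `WANDB_API_KEY` (the variable name wandb itself looks for); `resolve_wandb_key` is a hypothetical helper, not part of wandb.

```python
import os


def resolve_wandb_key(env=os.environ):
    """Return the API key from the environment, or None if it is missing or blank."""
    key = env.get("WANDB_API_KEY", "").strip()
    return key or None


key = resolve_wandb_key()
if key is None:
    print("WANDB_API_KEY is not set; wandb would need a key supplied another way.")
else:
    # In the notebook this would be followed by: wandb.login(key=key)
    print("Found a wandb API key in the environment.")
```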

# Load the specified dataset from the Hugging Face Hub


In [None]:
ds = load_dataset("Nechintosh/ghibli", streaming=True)

# Save dataset images and create a metadata file in JSON Lines format


In [None]:
local_data_dir = "./ghibli_training_data"
os.makedirs(local_data_dir, exist_ok=True)

metadata_path = os.path.join(local_data_dir, "metadata.jsonl")

with open(metadata_path, "w") as outfile:
    for i, example in enumerate(ds["train"]):
        # The dataset is streamed, so stop explicitly after 250 examples
        if i >= 250:
            break

        image = example["image"]
        text = example["caption"]

        # Save the image
        image_filename = f"image_{i:04d}.png"
        image_path = os.path.join(local_data_dir, image_filename)
        image.save(image_path)

        # Create the metadata entry
        entry = {"file_name": image_filename, "prompt": text}

        # Write the metadata entry to the JSON Lines file
        json.dump(entry, outfile)
        outfile.write('\n')

print(f"Saved dataset to {local_data_dir} with metadata.jsonl")

# Create a directory for the output of the LoRA training


In [None]:
!mkdir -p /kaggle/working/ghibli_LoRA


# Prepend the "Studio Ghibli" trigger phrase to every caption


In [None]:
# Same file that the data-preparation cell wrote
metadata_path = os.path.join(local_data_dir, "metadata.jsonl")
updated_lines = []

# Read, modify, and collect the updated entries
with open(metadata_path, "r") as infile:
    for line in infile:
        entry = json.loads(line)
        entry["prompt"] = "Studio Ghibli " + entry["prompt"]
        updated_lines.append(entry)

# Overwrite the original file with updated captions
with open(metadata_path, "w") as outfile:
    for entry in updated_lines:
        json.dump(entry, outfile)
        outfile.write('\n')

print("Updated captions in metadata.jsonl with 'Studio Ghibli' prefix.")
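
Re-running the cell above prepends the trigger phrase a second time, producing captions such as "Studio Ghibli Studio Ghibli ...". A minimal sketch of an idempotent alternative follows; `add_trigger` is a hypothetical helper introduced only for illustration.

```python
TRIGGER = "Studio Ghibli"  # trigger phrase used throughout this notebook


def add_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Prepend the trigger phrase unless the prompt already starts with it."""
    stripped = prompt.strip()
    if stripped.lower().startswith(trigger.lower()):
        return stripped
    return f"{trigger} {stripped}"


once = add_trigger("a girl walking through a forest")
twice = add_trigger(once)  # applying it again leaves the caption unchanged
print(once)
print(twice)
```

Inside the loop, `entry["prompt"] = add_trigger(entry["prompt"])` would take the place of the string concatenation.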


# Set locale encoding and configure accelerate


In [None]:
locale.getpreferredencoding = lambda: "UTF-8"
!accelerate config default

# Launch the Dreambooth LoRA training script with specified parameters


In [None]:
!accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --instance_data_dir={local_data_dir} \
  --output_dir="/kaggle/working/ghibli_LoRA" \
  --caption_column="prompt" \
  --mixed_precision="fp16" \
  --instance_prompt="Studio Ghibli" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=3 \
  --gradient_checkpointing \
  --learning_rate=1e-4 \
  --snr_gamma=5.0 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --use_8bit_adam \
  --max_train_steps=300 \
  --checkpointing_steps=717 \
  --seed="0" \
  --report_to=wandb
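
The flags above imply a training budget that is easy to work out by hand. The sketch below does the arithmetic under two assumptions that are not confirmed by this notebook: training runs on a single process, and the downloaded script counts `--max_train_steps` in optimizer updates and saves a checkpoint whenever the step count is a multiple of `--checkpointing_steps`, as the upstream `diffusers` example scripts do.

```python
# Values copied from the launch command and the data-preparation cell.
train_batch_size = 1
gradient_accumulation_steps = 3
max_train_steps = 300
checkpointing_steps = 717
num_images = 250

num_processes = 1  # assumption: one GPU, one process

# Images contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes

# Total images processed over the run, and the equivalent number of dataset passes.
samples_seen = effective_batch_size * max_train_steps
approx_epochs = samples_seen / num_images

# Intermediate checkpoints are only written at multiples of checkpointing_steps.
intermediate_checkpoints = max_train_steps // checkpointing_steps

print(f"effective batch size: {effective_batch_size}")
print(f"samples seen: {samples_seen} (about {approx_epochs:.1f} passes over {num_images} images)")
print(f"intermediate checkpoints: {intermediate_checkpoints}")
```

Under these assumptions no intermediate checkpoint is written, because 717 exceeds the 300 training steps; only the final LoRA weights would be saved to the output directory.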

# Load the trained LoRA model and the base Stable Diffusion XL pipeline


In [None]:
import torch
from diffusers import DiffusionPipeline, AutoencoderKL

vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True
)
# Load the LoRA weights that the training run wrote to its --output_dir
pipe.load_lora_weights("/kaggle/working/ghibli_LoRA/pytorch_lora_weights.safetensors")
_ = pipe.to("cuda")

# Generate an image using the trained LoRA model and a specific prompt


In [None]:
prompt = "Studio Ghibli animation still of futuristic vibrant city, lively diverse crowd, lush green spaces, animated sparkles, joyous atmosphere, vivid colors, sense of harmony, ultra-detailed concept art, by Studio Ghibli"

image = pipe(prompt=prompt, num_inference_steps=50).images[0]
image