# 🔧 LoRA Training (Stable Diffusion XL (SDXL)) — Google Colab

---



## STEP 1 — Environment Setup: GPU (A100 Recommended, 36 GB+ of VRAM) & Drive Mount

Before installing dependencies, we need to:  

- **Check GPU** — make sure Colab is running with a GPU.  
- **Mount Google Drive** — to save datasets, models, and outputs.  


In [None]:
# Check GPU
!nvidia-smi || echo "No GPU detected — set Colab runtime to GPU."
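
The `nvidia-smi` output has to be read by eye. As a rough sketch, the same check can be done from Python so that an undersized GPU is flagged before any long-running cell starts. The helper name `vram_report` is hypothetical and the 36 GB threshold is simply taken from the heading above; `torch.cuda.is_available` and `torch.cuda.get_device_properties` are standard PyTorch calls.

```python
def vram_report(name, total_bytes, min_gb=36):
    """Format a one-line verdict for a GPU (hypothetical helper, threshold is an assumption)."""
    gb = total_bytes / 1024**3
    verdict = "OK" if gb >= min_gb else f"below the suggested {min_gb} GB"
    return f"{name}: {gb:.1f} GB VRAM ({verdict})"

try:
    import torch
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(vram_report(props.name, props.total_memory))
    else:
        print("No GPU detected; set the Colab runtime to GPU.")
except ImportError:
    print("torch is not installed in this runtime.")
```

A GPU below the threshold can still train if `RESOLUTION` and `BATCH_SIZE` are lowered in Step 7.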

In [None]:
# STEP 1 — Mount Google Drive (stores datasets/models/outputs)
from google.colab import drive
drive.mount('/content/drive')

# Unmount Google Drive if needed
# from google.colab import drive
# drive.flush_and_unmount()

In [None]:
# Optional: copy an existing dataset from Google Drive into the Colab workspace
# !cp -r "/content/drive/MyDrive/data" /content/aigenmodel/10_aigenmodel

## STEP 2 — Dependency Installation

In [None]:
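# 1) Install xFormers without changing the preinstalled torch (2.8.0 + cu126 here).
#    --no-deps stops pip from replacing torch, but the newest xFormers wheel may
#    target a different torch version; pin xformers if the import fails.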
!pip install --upgrade --no-deps xformers

# 2) Install the rest of the training deps
!pip install --upgrade bitsandbytes pytorch-lightning prodigyopt==1.0 lion-pytorch==0.0.6 ftfy easygui voluptuous opencv-python

In [None]:
# Dependency check: import each library and print its version.
# Entries are import names, not pip names (cv2 for opencv-python, PIL for Pillow).

import importlib

libs_to_check = [
    # PyTorch stack
    "torch", "torchvision", "torchaudio",

    # Hugging Face core
    "transformers", "diffusers", "accelerate",

    # Memory/optimization
    "xformers", "bitsandbytes", "safetensors", "einops",

    # Training frameworks
    "pytorch_lightning", "prodigyopt", "lion_pytorch",

    # Data & preprocessing
    "cv2", "PIL", "ftfy", "sentencepiece", "datasets",

    # Utils & viz
    "tensorboard", "huggingface_hub", "altair", "easygui",
    "toml", "voluptuous", "imagesize", "rich"
]

for lib in libs_to_check:
    try:
        module = importlib.import_module(lib)

        # Not every package exposes __version__; report "unknown" in that case
        version = getattr(module, "__version__", None)
        print(f"✅ {lib} - {version if version else 'unknown'}")

    except Exception as e:
        print(f"❌ {lib} - MISSING ({e.__class__.__name__})")
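
Some modules do not expose `__version__`, so the check above can print `unknown` for a package that is installed. One possible workaround, sketched here, is the standard library's `importlib.metadata`, which reads the version from the installed package metadata. It is keyed by the pip distribution name (for example `opencv-python`), not the import name (`cv2`). The helper name `dist_version` is hypothetical.

```python
from importlib import metadata

def dist_version(dist_name):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# Distribution names as used with pip, not import names
for dist in ["opencv-python", "Pillow", "prodigyopt"]:
    print(f"{dist}: {dist_version(dist) or 'not installed'}")
```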

## STEP 3 — Choose Base Model
You have **two options**:
1. **Download SDXL from Hugging Face** (requires a free account + accepted license). Recommended for first-time users.
2. **Point to a local `.safetensors`** you already have in Drive.
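
The cells that follow implement option 1 only. For option 2, a minimal sketch could look like the following, assuming (as sd-scripts documents for `--pretrained_model_name_or_path`) that the trainer accepts either a Diffusers folder or a single checkpoint file. The helper `pick_base_model` and both paths are hypothetical; adjust them to your own Drive layout.

```python
import os

def pick_base_model(local_ckpt, download_dir):
    """Prefer an existing local .safetensors file, else fall back to the download folder."""
    if local_ckpt and local_ckpt.endswith(".safetensors") and os.path.isfile(local_ckpt):
        return local_ckpt
    return download_dir

# Hypothetical paths, for illustration only
LOCAL_CKPT = "/content/drive/MyDrive/models/sd_xl_base_1.0.safetensors"
print("Base model:", pick_base_model(LOCAL_CKPT, "/content/sdxl-base-1.0"))
```

The returned value could then be assigned to `MODEL_DIR` in place of the download path.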

In [None]:
# Clone kohya-ss sd-scripts (the requirements install below is left commented out)

!if [ ! -d "/content/sd-scripts" ]; then \
  git clone https://github.com/kohya-ss/sd-scripts.git /content/sd-scripts; \
else \
  echo "✅ sd-scripts already cloned at /content/sd-scripts"; \
fi

%cd /content/sd-scripts

# !pip install -q -U pip
# !pip install -q -r requirements.txt || echo "⚠️ requirements install had issues; we’ll fix specific packages later"

print("✅ kohya repo ready at /content/sd-scripts (check any warnings above).")


In [None]:
from huggingface_hub import snapshot_download, login

# 🔑 Login with your HuggingFace token (must have accepted SDXL license)
login()
MODEL = "sdxl-base-1.0"
MODEL_DIR = f"/content/{MODEL}" # Point this at a Drive path to keep the model between sessions

# 📥 Download SDXL 1.0 base model from StabilityAI

# # Instead of downloading again, just check if files exist
# import os
# if not os.path.exists(f"{MODEL_DIR}/sd_xl_base_1.0.safetensors"):
#     snapshot_download(
#         repo_id="stabilityai/stable-diffusion-xl-base-1.0",
#         local_dir=MODEL_DIR,
#         allow_patterns=["*.safetensors","*.json","tokenizer/*","scheduler/*","vae/*","unet/config.json","text_encoder/*"]
#     )
# else:
#     print("✅ Model already exists in Drive. Skipping download.")


snapshot_download(
    repo_id="stabilityai/stable-diffusion-xl-base-1.0",
    local_dir=MODEL_DIR,
    local_dir_use_symlinks=False,
    allow_patterns=[
        "*.safetensors",                       # weights for all components
        "*.json",                              # model_index.json and component configs
        "tokenizer/*", "tokenizer_2/*",        # SDXL has two tokenizers; include both
        "text_encoder/*", "text_encoder_2/*",  # and two text encoders
        "scheduler/*",                         # scheduler config
        "vae/*",                               # VAE
        "unet/*",                              # UNet
    ]
)

print(f"✅ SDXL 1.0 Base download complete at {MODEL_DIR}")
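
A partial download tends to surface only later as a confusing training error. As a quick sanity check, the sketch below lists which of the entries named in `allow_patterns` are absent from the model folder. The helper `missing_components` is hypothetical, and the path is the default `MODEL_DIR` from this cell.

```python
import os

# Entries the allow_patterns above are meant to fetch
EXPECTED = ["model_index.json", "unet", "vae", "scheduler",
            "tokenizer", "tokenizer_2", "text_encoder", "text_encoder_2"]

def missing_components(model_dir, expected=EXPECTED):
    """Return the expected files/folders that are absent from model_dir."""
    return [name for name in expected
            if not os.path.exists(os.path.join(model_dir, name))]

missing = missing_components("/content/sdxl-base-1.0")
print("Missing:", missing if missing else "nothing, layout looks complete")
```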

## STEP 4 — Project Config

In this step, we set up the project details and folder layout:

- **Project Name** — used to create folders for datasets, outputs, and logs.
- **Trigger Word** — a unique token (e.g., `akashawriter`) that activates your LoRA.
- **Paths** — the dataset and image folders are created under `/content`; outputs and logs are created in Google Drive.

In [None]:
# ==== USER CONFIG ====
PROJECT_NAME = input("Please Enter Your Project Name: ").strip()
TRIGGER      = input("Please Enter Your Trigger Word (Avoid Common Words): ").strip()

# Paths
DATASET_DIR  = f"/content/{PROJECT_NAME}/"
IMAGES_DIR   = f"/content/{PROJECT_NAME}/5_{TRIGGER}/"
OUTPUT_DIR   = f"/content/drive/MyDrive/LoRA_Output/{PROJECT_NAME}"
LOG_DIR      = f"/content/drive/MyDrive/LoRA_Logs/{PROJECT_NAME}"

# Create folders if not exist
import os
for p in [DATASET_DIR, OUTPUT_DIR, LOG_DIR, IMAGES_DIR]:
    os.makedirs(p, exist_ok=True)

print("✅ Folders ready:")
print("📂 Dataset Dir :", DATASET_DIR)
print("📂 Images Dir  :", IMAGES_DIR)
print("📂 Output Dir  :", OUTPUT_DIR)
print("📂 Log Dir     :", LOG_DIR)
print("\n⚡ Place your training images inside:", IMAGES_DIR)
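
The `5_` prefix in `IMAGES_DIR` follows the kohya-ss folder convention `<repeats>_<name>`, where the number is how many times each image is repeated per epoch. The sketch below only illustrates that naming rule; `parse_repeat_folder` is a hypothetical helper and the dataset size of 20 images is an assumed figure.

```python
import re

def parse_repeat_folder(folder_name):
    """Split '<repeats>_<name>' into (repeats, name); raise if the numeric prefix is missing."""
    m = re.fullmatch(r"(\d+)_(.+)", folder_name)
    if m is None:
        raise ValueError(f"expected '<repeats>_<name>', got {folder_name!r}")
    return int(m.group(1)), m.group(2)

repeats, name = parse_repeat_folder("5_akashawriter")
n_images = 20  # assumed dataset size, for illustration only
print(f"{name}: {n_images} images x {repeats} repeats = {n_images * repeats} samples per epoch")
```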


## STEP 5 — Upload Images (skip this if your dataset is already in place)
### Option A — Copy images from **Google Drive** into the dataset folder:
- Put all images into `IMAGES_DIR` from Step 4, i.e. `/content/<PROJECT_NAME>/5_<TRIGGER>` (the commented `cp` command in Step 1 shows one way).

### Option B — Upload from your local machine (quick):

In [None]:
# Upload images directly (if you didn't copy them from Drive already)
import os
from google.colab import files
print("Please Upload Your Training Data Photos: ")
uploaded = files.upload()
for fname, filedata in uploaded.items():
    with open(os.path.join(IMAGES_DIR, fname), 'wb') as f:
        f.write(filedata)
print("Uploaded:", list(uploaded.keys()))

In [None]:
# Convert every image to JPEG, center-crop it to a square, and resize it to TARGET_SIZE

import os
from PIL import Image
import cv2

# Path where your images are stored
# IMAGES_DIR = "/content/drive/MyDrive/LoRA_Datasets/Vidya_Balan/10_Vidya_Balan"

TARGET_SIZE = 1024 # Change this as required
count = 0

for fname in os.listdir(IMAGES_DIR):
    fpath = os.path.join(IMAGES_DIR, fname)

    # Skip non-image files
    if not fname.lower().endswith(('.jpg', '.jpeg', '.png', '.webp', '.avif')):
        continue

    try:
        # Load as 3-channel BGR (any alpha channel is dropped) so the RGB conversion
        # below always works; WebP/AVIF decoding depends on how OpenCV was built
        img = cv2.imread(fpath, cv2.IMREAD_COLOR)
        if img is None:
            print(f"⚠️ Skipped (not an image): {fname}")
            continue

        # Convert to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        pil_img = Image.fromarray(img)

        # --- Center crop (square) ---
        w, h = pil_img.size
        min_dim = min(w, h)
        left = (w - min_dim) // 2
        top = (h - min_dim) // 2
        right = left + min_dim
        bottom = top + min_dim
        pil_img = pil_img.crop((left, top, right, bottom))

        # --- Resize ---
        pil_img = pil_img.resize((TARGET_SIZE, TARGET_SIZE), Image.LANCZOS)

        # --- Save as JPG with clean name ---
        clean_name = f"{TRIGGER}_{count:03d}.jpg"
        save_path = os.path.join(IMAGES_DIR, clean_name)
        pil_img.save(save_path, "JPEG", quality=95)

        # --- Delete the old file (only if name is different from the new one) ---
        if fpath != save_path:
            os.remove(fpath)

        count += 1

    except Exception as e:
        print(f"❌ Error on {fname}: {e}")

print(f"✅ Processed {count} images. Only clean JPG files remain in {IMAGES_DIR}")
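
The center-crop arithmetic in the cell above can be checked in isolation. The sketch below mirrors those four lines in a small function (the name `center_crop_box` is hypothetical) and runs it on a landscape and a portrait size.

```python
def center_crop_box(w, h):
    """Return (left, top, right, bottom) of the largest centered square in a w x h image."""
    side = min(w, h)
    left = (w - side) // 2
    top = (h - side) // 2
    return (left, top, left + side, top + side)

print(center_crop_box(1920, 1080))  # (420, 0, 1500, 1080)
print(center_crop_box(800, 1200))   # (0, 200, 800, 1000)
```

Everything outside the box is discarded, so a subject near the edge of a wide photo may be cut off; cropping such images by hand before this step avoids that.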


## STEP 6 — (Optional) Auto‑Generate Simple Captions
Optional, but highly recommended.

This creates a **.txt** file next to each image using your template. You can edit them later. For higher quality, you should hand‑write short, accurate captions per image.

In [None]:
import glob, os

# Base caption template used for auto-captioning (optional). Keep it SIMPLE.
# Adjust for your Images
CAPTION_TEMPLATE = f"photo of {TRIGGER}, professional portrait, high quality, detailed face, studio lighting"

print("After creating the caption files, edit them to describe each image for more accurate results (recommended).")
image_exts = (".png", ".jpg", ".jpeg", ".webp", ".bmp")

imgs = [p for p in glob.glob(os.path.join(DATASET_DIR, "**/*"), recursive=True) if os.path.splitext(p)[1].lower() in image_exts]
print(f"Found {len(imgs)} images")

for img_path in imgs:
    base, _ = os.path.splitext(img_path)
    txt_path = base + ".txt"
    if not os.path.exists(txt_path):
        with open(txt_path, 'w', encoding='utf-8') as f:
            f.write(CAPTION_TEMPLATE)

print("Caption files created where missing.")
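
After hand-editing captions it is easy to drop the trigger word or leave an image without a caption. A possible check, sketched under the assumption that captions sit next to the images as `.txt` files (as the cell above writes them), is shown below. The helper `caption_problems` is hypothetical.

```python
import os

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".webp", ".bmp")

def caption_problems(folder, trigger):
    """List (filename, reason) for images whose caption is missing or lacks the trigger word."""
    problems = []
    for fname in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(fname)
        if ext.lower() not in IMAGE_EXTS:
            continue
        txt_path = os.path.join(folder, base + ".txt")
        if not os.path.exists(txt_path):
            problems.append((fname, "no caption file"))
            continue
        with open(txt_path, encoding="utf-8") as f:
            if trigger not in f.read():
                problems.append((fname, "trigger word missing"))
    return problems
```

In this notebook it would be called as `caption_problems(IMAGES_DIR, TRIGGER)`; an empty list means every image has a caption that mentions the trigger.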

## STEP 7 — Train LoRA (kohya-ss `sdxl_train_network.py`)
**Tips:**
- Start with the defaults. If results look samey or overfit, reduce `MAX_STEPS` or improve the dataset.
- If the LoRA underfits (it is not learning your subject), increase `MAX_STEPS` (for example to 8k) or improve the captions.
- Lower `NETWORK_DIM` to 16 or 32 for a smaller, more flexible LoRA.

**Parameters:**
- **Resolution** — SDXL is trained at `1024`; use `768` or lower only if the GPU runs out of VRAM.
- **Batch Size** — set to `1` (use `2` if the GPU has enough VRAM).
- **Max Steps** — number of training steps (default `6000`; adjust based on results).
- **Network Dim / Alpha** — control LoRA size and capacity (default `64` / `32`; a dim of 16–32 is common).
- **Learning Rate** — `1.0`, because the Prodigy optimizer adapts the step size itself.

👉 If you’re unsure, keep the defaults — they work well for most cases.  
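
To judge whether `MAX_STEPS` is high or low for a given dataset, it helps to convert steps into approximate passes over the data. The sketch below assumes that one training step is one optimizer update consuming `BATCH_SIZE × GRAD_ACC_STEPS` samples (my reading of sd-scripts, so treat it as an assumption), and uses a made-up dataset of 20 images with 5 repeats. The function name `approx_epochs` is hypothetical.

```python
def approx_epochs(max_steps, batch_size, grad_acc_steps, n_images, repeats):
    """Estimate passes over the dataset, assuming one step = one optimizer update."""
    samples_seen = max_steps * batch_size * grad_acc_steps
    samples_per_epoch = n_images * repeats
    return samples_seen / samples_per_epoch

# 6000 steps, batch 1, accumulation 2 (this notebook's values); 20 images x 5 repeats assumed
print(approx_epochs(6000, 1, 2, 20, 5))  # 120.0
```

A very high epoch count on a small dataset is one sign that overfitting is likely and that fewer steps may be enough.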

In [None]:
import os

# ----------------------
# CONFIG
# ----------------------
PRETRAINED_MODEL = MODEL_DIR

# Hyperparameters chosen for an A100/L4
RESOLUTION        = 1024          # SDXL native size; 768 or lower is safer on weaker GPUs
BATCH_SIZE        = 1             # Increase if VRAM allows
GRAD_ACC_STEPS    = 2             # Effective batch size = BATCH_SIZE * GRAD_ACC_STEPS
MAX_STEPS         = 6000
NETWORK_DIM       = 64
NETWORK_ALPHA     = 32
LEARNING_RATE     = 1.0            # Prodigy expects ~1.0, not 1e-4

# ----------------------
# TRAINING COMMAND
# ----------------------

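# To continue from an earlier LoRA checkpoint, add this flag to the command below:
# --network_weights /content/drive/MyDrive/LoRA_Output/aigenmodel/500-*.safetensors
#
# Note: going by the sd-scripts help text, --save_last_n_steps counts training steps,
# not files, so a value of 3 likely keeps only the newest step checkpoint.
# Raise it (e.g. to 1500) if you want to keep several.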

train_cmd = f'''
accelerate launch --mixed_precision=bf16 sdxl_train_network.py \
  --pretrained_model_name_or_path="{PRETRAINED_MODEL}" \
  --train_data_dir="{DATASET_DIR}" \
  --output_dir="{OUTPUT_DIR}" \
  --logging_dir="{LOG_DIR}" \
  --resolution={RESOLUTION} \
  --network_module=networks.lora \
  --network_dim={NETWORK_DIM} \
  --network_alpha={NETWORK_ALPHA} \
  --learning_rate={LEARNING_RATE} \
  --train_batch_size={BATCH_SIZE} \
  --gradient_accumulation_steps={GRAD_ACC_STEPS} \
  --max_train_steps={MAX_STEPS} \
  --save_every_n_steps=500 \
  --save_last_n_steps=3 \
  --save_last_n_epochs=3 \
  --save_state \
  --save_precision=bf16 \
  --optimizer_type=Prodigy \
  --mem_eff_attn \
  --shuffle_caption \
  --caption_extension=.txt \
  --max_data_loader_n_workers=2 \
  --log_prefix="{PROJECT_NAME}" \
  --enable_bucket \
  --bucket_reso_steps=64 \
  --random_crop \
  --log_with tensorboard \
  2>&1 | tee /content/train.log
'''

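# ----------------------
# RUN TRAINING
# ----------------------
import subprocess

print("🚀 Starting LoRA training with Colab Pro A100/L4...\n")
# Run under bash with pipefail so a training failure is not hidden by tee's exit code
result = subprocess.run(["bash", "-o", "pipefail", "-c", train_cmd])
if result.returncode == 0:
    print("\n✅ Training finished successfully.")
else:
    print("\n❌ Training failed with exit code:", result.returncode)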

## STEP 8 — Test the LoRA (Diffusers)
This loads SDXL and your trained LoRA, then generates sample images from the prompts you type.

**Note:** The cell below loads the base model from Hugging Face by repo ID. To reuse the folder downloaded in Step 3, pass `MODEL_DIR` to `from_pretrained` instead.

In [None]:
# A 24 GB VRAM L4 GPU may also be enough for this step

import os, glob
from diffusers import StableDiffusionXLPipeline
import torch

# Hard-coded example path; set this to your own project's output folder from Step 4
OUTPUT_DIR = r"/content/drive/MyDrive/LoRA_Output/aigenmodel/"
RESOLUTION = 1024

# find latest safetensors in output
lora_files = sorted([p for p in glob.glob(os.path.join(OUTPUT_DIR, "*.safetensors"))], key=os.path.getmtime)
assert lora_files, "No LoRA files found in OUTPUT_DIR. Check training output."
LORA_PATH = lora_files[-1]
print("Using LoRA:", LORA_PATH)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
).to("cuda")

# Load LoRA
pipe.load_lora_weights(os.path.dirname(LORA_PATH), weight_name=os.path.basename(LORA_PATH))

while True:
    print("Press Ctrl + C (or interrupt the cell) to exit")
    # prompt = f"portrait of {TRIGGER}, professional lighting, ultra-detailed, 8k, sharp focus"
    prompt = input("Prompt (make sure to include your trigger word): ")
    # neg = "low quality, blurry, lowres, bad hands, worst quality, jpeg artifacts"
    neg = input("Negative prompt (leave blank if none): ")
    image = pipe(prompt, negative_prompt=neg or None, num_inference_steps=30, guidance_scale=7.5, height=RESOLUTION, width=RESOLUTION).images[0]

    # Each iteration overwrites the previous sample
    image.save("/content/sample_lora_output.png")
    print("Saved /content/sample_lora_output.png")
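
Because the loop saves to one fixed path, every new image replaces the last one. If you would rather keep each sample for comparison, one option is a numbered filename, sketched below. The helper `next_sample_path` and the folder `/content/samples` are hypothetical.

```python
import os

def next_sample_path(out_dir, prefix="sample_lora_output"):
    """Return the first '<prefix>_NNN.png' path in out_dir that does not exist yet."""
    os.makedirs(out_dir, exist_ok=True)
    i = 0
    while True:
        path = os.path.join(out_dir, f"{prefix}_{i:03d}.png")
        if not os.path.exists(path):
            return path
        i += 1
```

Inside the loop, `image.save(next_sample_path("/content/samples"))` would then write `sample_lora_output_000.png`, `sample_lora_output_001.png`, and so on.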

## STEP 9 — Download / Deliverables
- Your LoRA `.safetensors` is saved in: `LoRA_Output/<PROJECT_NAME>` (on Drive).
- The sample output image is at: `/content/sample_lora_output.png`.
- You can zip the output folder for delivery.

In [None]:
# Zip the trained LoRA output folder for download

!zip -r /content/lora_output.zip "$OUTPUT_DIR" || echo "Zip failed (likely no files)."
print("If succeeded, download: /content/lora_output.zip from Colab sidebar → Files.")
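
The same archive can be produced from Python with the standard library, which avoids shell quoting issues when the project name contains spaces. This is a sketch only; the helper `zip_folder` is hypothetical, while `shutil.make_archive` is the standard-library call doing the work.

```python
import os
import shutil

def zip_folder(src_dir, dest_base):
    """Zip src_dir into '<dest_base>.zip' and return the archive path, or None if src_dir is missing."""
    if not os.path.isdir(src_dir):
        return None
    return shutil.make_archive(dest_base, "zip", root_dir=src_dir)
```

In this notebook the call would be `zip_folder(OUTPUT_DIR, "/content/lora_output")`; in Colab, the resulting file can then be passed to `google.colab.files.download`.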