# 🔧 Female LoRA Training (Stable Diffusion 1.5) - Google Colab

### Step - 1 : Environment Setup - GPU (T4/A100 Recommended) & Drive Mount

Before installing dependencies, we need to:  

- **Check GPU**: make sure Colab is running with a GPU.  
- **Mount Google Drive**: to save datasets, models, and outputs.  
- **Set cache dirs**: reuse Hugging Face/pip downloads for faster runs.  
- **Upgrade Python**: switch Colab from 3.9 → 3.10 (required for kohya-ss).  
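
The Python version check in this list can also be scripted, so the notebook can stop early when the interpreter is wrong. The sketch below is a minimal example that uses only the standard library; the helper names `parse_python_version` and `shell_python_version` are illustrative and not part of Colab or kohya-ss. It asks the `python3` found on `PATH` for its version, because that is the interpreter the `!python3` cells use, and it may differ from the interpreter running the notebook kernel.

```python
import re
import subprocess


def parse_python_version(text):
    """Turn output such as 'Python 3.10.12' into a tuple like (3, 10, 12)."""
    match = re.search(r"Python (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def shell_python_version(executable="python3"):
    """Ask the given interpreter for its version; None if it cannot be run."""
    try:
        result = subprocess.run([executable, "--version"],
                                capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    # Very old interpreters printed the version to stderr, so read both streams.
    return parse_python_version(result.stdout + result.stderr)


version = shell_python_version()
print("python3 on PATH:", version)
print("Is 3.10:", version is not None and version[:2] == (3, 10))
```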


In [None]:
# Check GPU
!nvidia-smi || echo "No GPU detected - set Colab runtime to GPU."

In [None]:
# STEP 1 - Mount Google Drive (stores datasets/models/outputs)
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Save caches to Google Drive
%env HF_HOME=/content/drive/MyDrive/hf_cache
%env TRANSFORMERS_CACHE=/content/drive/MyDrive/hf_cache
%env HF_DATASETS_CACHE=/content/drive/MyDrive/hf_cache
%env PIP_CACHE_DIR=/content/drive/MyDrive/pip_cache

In [None]:
!python3 --version

In [None]:
# Force Colab to use Python 3.10
!sudo apt-get update -y
!sudo apt-get install -y python3.10 python3.10-dev python3.10-distutils
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
!sudo update-alternatives --set python3 /usr/bin/python3.10
!curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
!python3 --version

In [None]:
# Version Must Be 3.10
!python3 --version

### Step - 2 : Dependency Installation Steps

Before training a LoRA, we need to prepare the Colab environment.  
By default, Colab ships with older Python and mismatched CUDA/PyTorch versions, which can cause errors.  
The following steps will:  

- **Force Colab to use Python 3.10** (required for kohya-ss scripts).  
- **Install the correct PyTorch + CUDA 12.1 stack** (works well with T4 GPUs).  
- **Add xformers and bitsandbytes** for efficient memory usage and 8-bit optimizers.  
- **Install Hugging Face + diffusers libraries** (specific versions tested for stability).  
- **Include extra utilities** for training, logging, and image handling.  
- **Finally, verify all versions** to ensure the setup is correct.  
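
The final verification can also be written as a small reusable check. The sketch below is a minimal example that compares installed versions against the pins used in this step via the standard library's `importlib.metadata`. The helper name `check_versions` and the `PINS` dictionary are illustrative. It inspects the environment of whichever interpreter runs it, so run it with the same interpreter that `pip` installed into.

```python
from importlib import metadata


def check_versions(pins):
    """Compare installed package versions against expected pins.

    Returns {package: (installed_version_or_None, matches)}.
    """
    report = {}
    for package, expected in pins.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu121" when comparing.
        base = installed.split("+")[0] if installed else None
        report[package] = (installed, base == expected.split("+")[0])
    return report


# Pins copied from the install cells in this step.
PINS = {
    "torch": "2.2.2",
    "xformers": "0.0.25.post1",
    "bitsandbytes": "0.43.1",
    "accelerate": "0.27.2",
    "transformers": "4.39.3",
    "diffusers": "0.25.0",
}

for name, (installed, ok) in check_versions(PINS).items():
    print(f"{name:14s} installed={installed} match={ok}")
```

A package reported as `installed=None` is missing from that interpreter's environment, which usually means `pip` and `python3` point at different Python installations.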


In [None]:
# Torch + CUDA 12.1 stack (stable for T4, SD 1.5 training)
!pip install -q torch==2.2.2+cu121 torchvision==0.17.2+cu121 torchaudio==2.2.2+cu121 --index-url https://download.pytorch.org/whl/cu121

In [None]:
# xformers (must match torch), bitsandbytes (for 8-bit optimizer)
!pip install -q xformers==0.0.25.post1 bitsandbytes==0.43.1

In [None]:
# Hugging Face + diffusers stack (versions tested with kohya-ss)
!pip install -q accelerate==0.27.2 transformers==4.39.3 diffusers==0.25.0 safetensors==0.4.2

In [None]:
!pip install -q \
  einops==0.7.0 ftfy==6.1.1 tensorboard==2.17.0 \
  opencv-python==4.8.1.78 pillow tqdm sentencepiece datasets==2.19.0 \
  pytorch-lightning==1.9.0 prodigyopt==1.0 lion-pytorch==0.0.6 \
  altair==4.2.2 easygui==0.98.3 toml==0.10.2 voluptuous==0.13.1 \
  huggingface-hub==0.24.5 imagesize==1.4.1 rich==13.7.0

In [None]:
!pip install --upgrade --force-reinstall numpy==1.26.4

In [None]:
# Fresh clone of sd-scripts
%cd /content/
!rm -rf sd-scripts
!git clone https://github.com/kohya-ss/sd-scripts.git
%cd sd-scripts

In [None]:
!python3 -c "import torch; print('Torch:', torch.__version__, '| CUDA:', torch.version.cuda, '| GPU OK:', torch.cuda.is_available())"
!python3 -c "import xformers; print('Xformers:', xformers.__version__)"
!python3 -c "import diffusers; print('Diffusers:', diffusers.__version__)"
!python3 -c "import transformers; print('Transformers:', transformers.__version__)"
!python3 -c "import accelerate; print('Accelerate:', accelerate.__version__)"

## STEP 3 - Choose Base Model
You have **two options**:
1. **Download SD1.5 from Hugging Face** (requires a free account + accepted license). Recommended for first-time users.
2. **Point to a local `.safetensors`** you already have in Drive.

⚠️ **SD1.5** is lighter and easier to train than SDXL. Use SD1.5 unless you **know** you need SDXL.

In [None]:
# OPTION A - Download SD1.5 (requires Hugging Face token)
USE_HF = True  # set False if you want to use a local .safetensors instead
MODEL_REPO = "runwayml/stable-diffusion-v1-5"  # SD1.5 official repo
MODEL_DIR = "/content/drive/MyDrive/SD/models/sd15"

if USE_HF:
    from huggingface_hub import login, snapshot_download
    print("🔐 Login to Hugging Face (paste your access token):")
    login()
    snapshot_download(MODEL_REPO, local_dir=MODEL_DIR, local_dir_use_symlinks=False)
    BASE_MODEL_PATH = MODEL_DIR
else:
    # OPTION B - Use a local .safetensors file stored on Drive
    BASE_MODEL_PATH = "/content/drive/MyDrive/SD/models/sd15.safetensors"

print("BASE_MODEL_PATH:", BASE_MODEL_PATH)
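
Before moving on, it can help to confirm that the chosen path really points at a usable model. The sketch below is a minimal example under two assumptions: a Diffusers-format folder has a `model_index.json` at its root, and a single-file checkpoint ends in `.safetensors` or `.ckpt`. The helper name `base_model_kind` is illustrative.

```python
import os


def base_model_kind(path):
    """Classify a base-model path as 'diffusers', 'checkpoint', or 'missing'."""
    # A Diffusers-format folder has a model_index.json at its root.
    if os.path.isdir(path) and os.path.isfile(os.path.join(path, "model_index.json")):
        return "diffusers"
    # A single-file checkpoint is a regular file with a known extension.
    if os.path.isfile(path) and path.lower().endswith((".safetensors", ".ckpt")):
        return "checkpoint"
    return "missing"


# Typical use in this notebook: base_model_kind(BASE_MODEL_PATH)
print(base_model_kind("/content/drive/MyDrive/SD/models/sd15"))
```

A result of `missing` usually means the download did not finish or the Drive path is misspelled.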

## STEP 4 - Project Config

In this step, we set up the project details and training parameters:  

- **Project Name**: used to create folders for datasets, outputs, and logs.  
- **Trigger Word**: a unique token (e.g., `akashawriter`) that activates your LoRA.  
- **Paths**: dataset, images, outputs, and logs are auto-created in Google Drive.  
- **Resolution**: choose `512` (faster, lighter) or `768` (more detail, higher VRAM).  
- **Batch Size**: set to `1` (use `2` if GPU has enough VRAM).  
- **Max Steps**: training iterations (start with 3k–5k, adjust based on results).  
- **Network Dim / Alpha**: controls LoRA size & capacity (16–32 is common).  
- **Learning Rates**: set separately for the text encoder and the U-Net.  

👉 If you're unsure, keep the defaults; they work well for most cases.  
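
The way Network Dim and Alpha interact can be shown with a short calculation. In the usual LoRA formulation the learned update is multiplied by `alpha / dim`, so changing one value without the other changes how strongly the update is applied. The sketch below is a minimal example; the function name `lora_scale` is illustrative.

```python
def lora_scale(network_dim, network_alpha):
    """Multiplier applied to the LoRA update: alpha divided by rank."""
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


for dim, alpha in [(16, 16), (32, 16), (32, 32)]:
    print(f"dim={dim:2d} alpha={alpha:2d} -> scale={lora_scale(dim, alpha):.2f}")
```

With the defaults in this notebook (dim 16, alpha 16) the scale is 1.0. Raising the dim to 32 while leaving alpha at 16 halves it, so in that case the learning rates may need revisiting.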


In [None]:
# ==== USER CONFIG ====
PROJECT_NAME = input("Please Enter Your Project Name: ")
TRIGGER      = input("Please Enter Your Trigger Word (Avoid Common Words): ")
DATASET_DIR  = f"/content/drive/MyDrive/LoRA_Datasets/{PROJECT_NAME}"
IMAGES_DIR  = f"/content/drive/MyDrive/LoRA_Datasets/{PROJECT_NAME}/10_{TRIGGER}/"
OUTPUT_DIR   = f"/content/drive/MyDrive/LoRA_Output/{PROJECT_NAME}"
LOG_DIR      = f"/content/drive/MyDrive/LoRA_Logs/{PROJECT_NAME}"

RESOLUTION        = 768   # 512 or 768; 768 gives more detail if VRAM allows
BATCH_SIZE        = 1     # increase to 2 if GPU VRAM allows
MAX_STEPS         = 4000  # start with 3–5k; iterate based on results
NETWORK_DIM       = 16    # 16/32 are good starting points; higher = heavier model
NETWORK_ALPHA     = 16
LEARNING_RATE     = 0.0001
TEXT_ENCODER_LR   = 5e-5
UNET_LR           = 1e-4

import os
for p in [DATASET_DIR, OUTPUT_DIR, LOG_DIR, IMAGES_DIR]:
    os.makedirs(p, exist_ok=True)
print("Folders ready:\n", DATASET_DIR, "\n", OUTPUT_DIR, "\n", LOG_DIR, "\n", IMAGES_DIR)

## STEP 5 - Upload Images
### Option A ‚Äî Copy images into the dataset folder in **Google Drive** directly:
- Put all images into the images subfolder created in Step 4: `LoRA_Datasets/<PROJECT_NAME>/10_<TRIGGER>/`
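
kohya-ss reads the number of repeats from the image subfolder name, in the form `<repeats>_<name>`. The `10_{TRIGGER}` folder created in Step 4 follows this pattern. The sketch below is a minimal example that parses such a name, so a misnamed folder is caught before training; the helper name `parse_repeat_folder` is illustrative.

```python
import re


def parse_repeat_folder(folder_name):
    """Split '10_akashawriter' into (10, 'akashawriter').

    Returns None when the name does not follow '<repeats>_<name>'.
    """
    match = re.fullmatch(r"(\d+)_(.+)", folder_name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_repeat_folder("10_akashawriter"))  # (10, 'akashawriter')
print(parse_repeat_folder("images"))           # None
```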

### Option B ‚Äî Upload from your local machine (quick):

In [None]:
# Upload images directly (if you didn't place them into Drive already)
from google.colab import files
print("Please Upload Your Training Data Photos: ")
uploaded = files.upload()
for fname, filedata in uploaded.items():
    with open(os.path.join(IMAGES_DIR, fname), 'wb') as f:
        f.write(filedata)
print("Uploaded:", list(uploaded.keys()))

## STEP 6 - (Optional) Auto-Generate Simple Captions
This creates a **.txt** file next to each image using your template. You can edit them later. For higher quality, hand-write short, accurate captions per image.

In [None]:
import glob, os

# Base caption template used for auto-captioning (optional). Keep it SIMPLE.
# Adjust for your Images
CAPTION_TEMPLATE = f"photo of {TRIGGER}, professional portrait, studio lighting, high detail"

print("After creating caption files, edit them with accurate per-image captions for better results (recommended).")
image_exts = (".png", ".jpg", ".jpeg", ".webp", ".bmp")

imgs = [p for p in glob.glob(os.path.join(DATASET_DIR, "**/*"), recursive=True) if os.path.splitext(p)[1].lower() in image_exts]
print(f"Found {len(imgs)} images")

for img_path in imgs:
    base, _ = os.path.splitext(img_path)
    txt_path = base + ".txt"
    if not os.path.exists(txt_path):
        with open(txt_path, 'w', encoding='utf-8') as f:
            f.write(CAPTION_TEMPLATE)

print("Caption files created where missing.")
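
A quick way to confirm the caption pass is to list any images that still have no caption file. The sketch below is a minimal example that assumes the same image extensions and `.txt` caption convention as the cell above; the helper name `find_uncaptioned` is illustrative.

```python
import os

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".webp", ".bmp")


def find_uncaptioned(dataset_dir, caption_ext=".txt"):
    """Return sorted image paths under dataset_dir that lack a caption file."""
    missing = []
    for root, _dirs, files in os.walk(dataset_dir):
        for name in files:
            stem, ext = os.path.splitext(name)
            if ext.lower() not in IMAGE_EXTS:
                continue  # skip captions and any other non-image files
            if not os.path.isfile(os.path.join(root, stem + caption_ext)):
                missing.append(os.path.join(root, name))
    return sorted(missing)


# Typical use in this notebook: find_uncaptioned(DATASET_DIR)
```

An empty list means every image has a caption file next to it.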

## STEP 7 - Train LoRA (kohya-ss `train_network.py`)
**Tips:**
- Start with the defaults. If results look samey/overfit, reduce `MAX_STEPS` or improve dataset.
- If underfit (not learning your subject), **increase** `MAX_STEPS` to 6–8k or improve captions.
- Keep `NETWORK_DIM` at 16/32 for small, flexible LoRAs.
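
When adjusting `MAX_STEPS`, it helps to translate steps into passes over the dataset. The sketch below is a rough estimate, not an exact count: it assumes each image is seen `repeats` times per epoch (the number in the `10_<TRIGGER>` folder name) and ignores bucketing and gradient accumulation, which can shift the figure slightly. The function names and the 20-image dataset are hypothetical.

```python
import math


def steps_per_epoch(num_images, repeats, batch_size):
    """Approximate optimizer steps needed to see every image `repeats` times."""
    return math.ceil(num_images * repeats / batch_size)


def epochs_covered(max_steps, num_images, repeats, batch_size):
    """Approximate number of passes over the repeated dataset."""
    return max_steps / steps_per_epoch(num_images, repeats, batch_size)


# Hypothetical dataset: 20 images, 10 repeats from the folder name, batch size 1.
print(steps_per_epoch(20, 10, 1))        # 200
print(epochs_covered(4000, 20, 10, 1))   # 20.0
```

If the estimate shows dozens of passes over a small dataset and outputs look overfit, lowering `MAX_STEPS` or adding images is a reasonable first adjustment.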


In [None]:
import os

PRETRAINED_MODEL = BASE_MODEL_PATH  # base model selected in Step 3 (folder or .safetensors)

# Training command
train_cmd = f'''
accelerate launch train_network.py \
  --pretrained_model_name_or_path="{PRETRAINED_MODEL}" \
  --train_data_dir="{DATASET_DIR}" \
  --output_dir="{OUTPUT_DIR}" \
  --logging_dir="{LOG_DIR}" \
  --resolution={RESOLUTION} \
  --network_module=networks.lora \
  --network_dim={NETWORK_DIM} \
  --network_alpha={NETWORK_ALPHA} \
  --learning_rate={LEARNING_RATE} \
  --text_encoder_lr={TEXT_ENCODER_LR} \
  --unet_lr={UNET_LR} \
  --train_batch_size={BATCH_SIZE} \
  --max_train_steps={MAX_STEPS} \
  --save_every_n_steps=200 \
  --mixed_precision=fp16 \
  --save_precision=fp16 \
  --optimizer_type=AdamW8bit \
  --xformers \
  --shuffle_caption \
  --caption_extension=.txt \
  --max_data_loader_n_workers=1 \
  --clip_skip=2 \
  --log_prefix="female_writer_v1" \
  --enable_bucket \
  --bucket_reso_steps=64 \
  --random_crop \
  2>&1 | tee /content/train.log
'''

# Run training
print("Starting LoRA training...\n")
exit_code = os.system(train_cmd)
print("\nTraining finished with exit code:", exit_code)
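
As an alternative to one long shell string, the same command can be assembled as a list of arguments, which avoids quoting problems when a project name or path contains spaces. The sketch below is a minimal example that uses only flags already present in the command above. The helper names `build_train_args` and `run_training` and the `example_cfg` values are illustrative, and the working directory assumes the clone location from Step 2.

```python
import subprocess


def build_train_args(cfg):
    """Build the train_network.py command as a list of arguments."""
    args = ["accelerate", "launch", "train_network.py"]
    for key, value in cfg.items():
        if value is True:
            args.append(f"--{key}")          # boolean switch such as --xformers
        elif value is not False and value is not None:
            args.append(f"--{key}={value}")  # key=value option
    return args


def run_training(cfg, cwd="/content/sd-scripts"):
    """Run training from the sd-scripts folder and return the exit code."""
    return subprocess.run(build_train_args(cfg), cwd=cwd, check=False).returncode


example_cfg = {
    "pretrained_model_name_or_path": "runwayml/stable-diffusion-v1-5",
    "train_data_dir": "/content/drive/MyDrive/LoRA_Datasets/my project",
    "resolution": 768,
    "network_module": "networks.lora",
    "max_train_steps": 4000,
    "xformers": True,
}
print(build_train_args(example_cfg))
```

Unlike the shell version, this sketch does not write `/content/train.log`; the output appears only in the cell.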

## STEP 8 - Test the LoRA (Diffusers)
This loads SD1.5 and your LoRA, then generates a sample image.

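**Note:** The cell below loads the base pipeline by its Hugging Face repo ID (`runwayml/stable-diffusion-v1-5`), regardless of which option you chose in Step 3. The weights are fetched into the Hugging Face cache on first use and reused afterwards. If you trained on a `.safetensors` base model, you can still test with the Diffusers SD1.5 pipeline below.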

In [None]:
import os, glob
from diffusers import StableDiffusionPipeline
import torch

# find latest safetensors in output
lora_files = sorted([p for p in glob.glob(os.path.join(OUTPUT_DIR, "*.safetensors"))], key=os.path.getmtime)
assert lora_files, "No LoRA files found in OUTPUT_DIR. Check training output."
LORA_PATH = lora_files[-1]
print("Using LoRA:", LORA_PATH)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

try:
    pipe.load_lora_weights(os.path.dirname(LORA_PATH), weight_name=os.path.basename(LORA_PATH))
    print("LoRA loaded via Diffusers API.")
except Exception as e:
    print("LoRA load failed:", e)

# prompt = f"portrait of {TRIGGER}, professional lighting, ultra-detailed, 8k, sharp focus"
prompt = input("Build Anything (Make sure to add your Trigger word): ")
# neg = "low quality, blurry, lowres, bad hands, worst quality, jpeg artifacts"
neg = input("Negative Prompt (leave blank if none): ")
image = pipe(prompt, negative_prompt=neg, num_inference_steps=30, guidance_scale=7.5, height=RESOLUTION, width=RESOLUTION).images[0]

image.save("/content/sample_lora_output.png")
print("Saved /content/sample_lora_output.png")

## STEP 9 - Download / Deliverables
- Your LoRA `.safetensors` is saved in: `LoRA_Output/<PROJECT_NAME>` (on Drive).
- The sample output image is at: `/content/sample_lora_output.png`.
- You can zip the output folder for delivery.
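
The archive can also be built from Python with the standard library, which makes it easy to fail loudly when the output folder is empty. The sketch below is a minimal example; the helper name `zip_folder` is illustrative.

```python
import os
import shutil


def zip_folder(folder, archive_base):
    """Zip `folder` into `<archive_base>.zip` and return the archive path.

    Raises FileNotFoundError when the folder is missing or empty, so an
    empty archive is never handed over by mistake.
    """
    if not os.path.isdir(folder) or not os.listdir(folder):
        raise FileNotFoundError(f"Nothing to zip in {folder!r}")
    return shutil.make_archive(archive_base, "zip", root_dir=folder)


# Typical use in this notebook: zip_folder(OUTPUT_DIR, "/content/lora_output")
```

Keep the archive path outside the folder being zipped, otherwise the archive would try to include itself.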

In [None]:
!zip -r /content/lora_output.zip "$OUTPUT_DIR" || echo "Zip failed (likely no files)."
print("If succeeded, download /content/lora_output.zip from the Colab sidebar → Files.")