# 🚀 Training LoRA for Image-to-Image on Colab

This notebook trains a LoRA on Google Colab using the free T4 GPU.

**Training time:** 2-4 hours for 15-30 images

**GPU:** T4 (15GB VRAM) - FREE tier

## 📋 Step 1: Set up the environment

In [None]:
# Check GPU
!nvidia-smi

# Install dependencies
!pip install -q diffusers transformers accelerate peft torch torchvision xformers datasets

## 📦 Step 2: Upload code and dataset

**Option 1: Clone from GitHub (recommended)**

In [None]:
# Clone your repository, then enter it (adjust the folder name if yours differs)
!git clone YOUR_REPO_URL
%cd NCKH_OpenVINO

**Option 2: Upload from your local machine**

1. Zip the `training/` and `dataset/` folders
2. Upload them to Colab:
   - Click the Files icon on the left
   - Upload the zip files
3. Unzip:

In [None]:
# Unzip uploaded files
!unzip -q training.zip
!unzip -q dataset.zip

**Option 3: Copy the dataset from Google Drive**

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

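# Copy the dataset contents from Drive into ./dataset
# (the trailing "/." copies the files themselves, so they do not end up in ./dataset/your_dataset/)
!mkdir -p ./dataset
!cp -r "/content/drive/MyDrive/your_dataset/." ./dataset/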

## 📊 Step 3: Prepare the dataset

Required dataset structure:
```
dataset/
├── image1.jpg
├── image2.jpg
├── ...
└── metadata.json  # Created automatically if missing
```

In [None]:
# Check the dataset
import os
from pathlib import Path

dataset_path = Path("./dataset")
image_files = list(dataset_path.glob("*.jpg")) + list(dataset_path.glob("*.png"))

print(f"Found {len(image_files)} images")
print("\nFirst 5 images:")
for img in image_files[:5]:
    print(f"  - {img.name}")
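The check above only matches lowercase `.jpg` and `.png` file names. As a rough sketch, a case-insensitive listing that also accepts `.jpeg` could look like the following. `list_images` and `IMAGE_EXTENSIONS` are hypothetical names introduced here for illustration; they are not part of the repository's scripts.

```python
from pathlib import Path

# Assumed set of accepted extensions; extend it if your dataset uses other formats
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(dataset_dir):
    """Return image files directly inside dataset_dir, sorted by path."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Only run the listing when the folder exists, so the cell does not crash
if Path("./dataset").is_dir():
    images = list_images("./dataset")
    print(f"Found {len(images)} images")
```

Comparing `suffix.lower()` against a set means files such as `IMG_001.JPG` from a camera are counted too.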

In [None]:
# Prepare dataset with prompts
!python training/dataset_preparation.py \
    --source_dir ./dataset \
    --output_dir ./processed_dataset \
    --augment_multiplier 3 \
    --use_vietnamese_prompts

## 🎓 Step 4: Train LoRA

**Key parameters:**
- `--rank`: 4-16 (a higher rank gives the adapter more capacity but produces a larger file)
- `--learning_rate`: 5e-5 to 1e-4
- `--num_train_epochs`: 10-30
- `--train_batch_size`: 1-2 (depending on available VRAM)
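To get a feel for how these values interact, the sketch below estimates the effective batch size and the number of optimizer updates in a run. It assumes the usual convention that one update happens every `gradient_accumulation_steps` batches; `training_schedule` is a hypothetical helper, and the figure of 90 training images is an assumed example, not a number taken from this notebook.

```python
import math


def training_schedule(num_images, batch_size, grad_accum_steps, epochs):
    """Estimate optimizer-update counts for one training run."""
    batches_per_epoch = math.ceil(num_images / batch_size)
    # One optimizer update per grad_accum_steps batches (assumed convention)
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return {
        "effective_batch_size": batch_size * grad_accum_steps,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": updates_per_epoch * epochs,
    }


# Assumed example: 90 training images after augmentation (hypothetical count)
schedule = training_schedule(num_images=90, batch_size=1, grad_accum_steps=4, epochs=20)
print(schedule)
# {'effective_batch_size': 4, 'updates_per_epoch': 23, 'total_updates': 460}
```

If `training/train_lora.py` counts `--validation_steps` and `--save_steps` in optimizer updates (an assumption worth checking in the script), a run of this size would end before step 500, so an intermediate checkpoint at `--save_steps 500` would never be written.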

In [None]:
# Start training
!python training/train_lora.py \
    --data_dir ./processed_dataset/augmented \
    --output_dir ./lora_output \
    --pretrained_model_name_or_path "runwayml/stable-diffusion-v1-5" \
    --resolution 512 \
    --train_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --num_train_epochs 20 \
    --learning_rate 1e-4 \
    --rank 8 \
    --alpha 32 \
    --mixed_precision "fp16" \
    --validation_prompt "beautiful Vietnamese landscape, mountains" \
    --validation_steps 100 \
    --save_steps 500

## 🔍 Step 5: Test LoRA

In [None]:
# Test trained LoRA
from diffusers import StableDiffusionPipeline
import torch
from PIL import Image

# Load base model
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load LoRA
pipe.unet.load_attn_procs("./lora_output")

# Test generation
prompt = "beautiful Vietnamese landscape, rice terraces, mountains"
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("test_result.png")

# Display
display(image)

## 💾 Step 6: Convert to OpenVINO

In [None]:
# Install OpenVINO
!pip install -q openvino openvino-dev

# Convert LoRA-merged model to OpenVINO
!python training/convert_to_openvino.py \
    --model_path "runwayml/stable-diffusion-v1-5" \
    --lora_path ./lora_output \
    --output_path ./model_with_lora_ov \
    --fp16

## 📥 Step 7: Download the results

In [None]:
# Zip LoRA weights
!zip -r lora_weights.zip ./lora_output
!zip -r openvino_model.zip ./model_with_lora_ov

# Download to your local machine
from google.colab import files
files.download('lora_weights.zip')
files.download('openvino_model.zip')

# Or copy to Drive (requires Drive to be mounted, see Option 3 in Step 2)
!cp lora_weights.zip "/content/drive/MyDrive/"
!cp openvino_model.zip "/content/drive/MyDrive/"

## 📊 Training Tips

### **If you hit OOM (Out of Memory):**
```bash
# Reduce the batch size
--train_batch_size 1

# Increase gradient accumulation
--gradient_accumulation_steps 8

# Reduce the resolution
--resolution 384
```
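As a back-of-the-envelope illustration of why a lower `--resolution` helps, the sketch below counts the latent positions the UNet processes per image. It relies on the fact that the Stable Diffusion v1.5 VAE downsamples each side by a factor of 8; `latent_tokens` is a hypothetical helper, and the estimate ignores fixed costs such as model weights and optimizer state.

```python
VAE_DOWNSCALE = 8  # Stable Diffusion v1.5 latents are 8x smaller per side than the image


def latent_tokens(resolution):
    """Number of spatial positions in the latent for a square image."""
    side = resolution // VAE_DOWNSCALE
    return side * side


base = latent_tokens(512)     # 64 x 64 latent
reduced = latent_tokens(384)  # 48 x 48 latent
print(base, reduced, reduced / base)
# 4096 2304 0.5625
```

Activation memory grows at least in proportion to this count, so dropping from 512 to 384 should cut the per-image activation footprint by roughly 44%, at the cost of training on less detail.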

### **To train faster:**
```bash
# Enable xformers (already installed in Step 1; run this line in a notebook cell)
!pip install xformers

# Validate less often
--validation_steps 500

# Use fewer epochs if the dataset is large
--num_train_epochs 10
```

### **For better quality:**
```bash
# Increase the rank (produces a larger file)
--rank 16

# Train for more epochs
--num_train_epochs 30

# Lower the learning rate
--learning_rate 5e-5
```
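The rank trade-off can be made concrete with a small calculation. For each adapted linear layer, LoRA adds two matrices of shapes `rank x in_features` and `out_features x rank`, so the adapter size grows linearly with the rank. The sketch below also assumes the common `alpha / rank` scaling convention (used by the PEFT library); whether `training/train_lora.py` follows it is an assumption. The helper names and the 320-wide layer are hypothetical and serve only as an illustration.

```python
def lora_param_count(in_features, out_features, rank):
    """Parameters added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def lora_scale(alpha, rank):
    """Scaling applied to the LoRA update under the alpha / rank convention."""
    return alpha / rank


# Hypothetical 320 -> 320 linear projection, used only for illustration
for rank in (8, 16):
    params = lora_param_count(320, 320, rank)
    fp16_bytes = params * 2  # 2 bytes per parameter in fp16
    print(rank, params, fp16_bytes, lora_scale(32, rank))
# 8 5120 10240 4.0
# 16 10240 20480 2.0
```

Under that convention, moving from `--rank 8` to `--rank 16` while leaving `--alpha 32` unchanged halves the scale from 4.0 to 2.0, so it may be worth adjusting `--alpha` together with the rank.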