# Arabic Writer Identification - Hyperparameter Search
## Google Colab Version (Optimized for A100 GPU)

This notebook runs Bayesian hyperparameter optimization for Arabic writer identification using Optuna.

**Features:**
- ✅ Single GPU optimization (perfect for Colab A100)
- ✅ Google Drive integration (automatic save)
- ✅ Quick test mode (10-20 min to verify setup)
- ✅ Resumable (can stop and continue)
- ✅ NaN detection and safeguards

**Runtime:** Make sure GPU is enabled! 
`Runtime → Change runtime type → Hardware accelerator → GPU`

## Step 1: Mount Google Drive

Your data is read from Google Drive, and all results are saved there.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Verify GPU
import torch
print(f"\nGPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("⚠️ WARNING: No GPU detected! Enable GPU in Runtime settings.")

## Step 2: Download Script from GitHub

Download the latest version of the script directly from GitHub. The cell pulls from the branch set in `BRANCH`, so you always get that branch's current state.

In [None]:
# Download latest version from GitHub
print("📥 Downloading latest script from GitHub...")

# GitHub repository details
GITHUB_USER = "a-mahdi"
REPO_NAME = "WrIHAM-Code"
BRANCH = "claude/analyze-hyperparameter-search-QJqnk"  # Update this if you merge to main
SCRIPT_NAME = "run_hyperparameter_search_colab.py"

# Construct raw GitHub URL
github_url = f"https://raw.githubusercontent.com/{GITHUB_USER}/{REPO_NAME}/{BRANCH}/{SCRIPT_NAME}"

# Download the file
!wget -q -O run_hyperparameter_search_colab.py "{github_url}"

# Verify file exists and show info
import os
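# wget -O creates the output file even when the request fails, so check its size too
if os.path.exists('run_hyperparameter_search_colab.py') and os.path.getsize('run_hyperparameter_search_colab.py') > 0:
    print("✅ Script downloaded successfully!")
    !ls -lh run_hyperparameter_search_colab.py
    print(f"\n🔗 Source: {github_url}")
else:
    print("❌ Download failed! Check your internet connection and the branch name.")
    print(f"   URL: {github_url}")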
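
If you prefer to avoid `wget`, the download can also be done in plain Python. The sketch below is one possible approach, not part of the original script: `download_script` is a hypothetical helper name, and the 30-second timeout is an arbitrary choice.

```python
import urllib.request
from pathlib import Path


def download_script(url: str, dest: str) -> int:
    """Download url to dest and return the number of bytes written.

    urlopen raises an exception on HTTP errors such as 404, and an empty
    body is rejected, so a wrong branch name cannot leave a blank script behind.
    """
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = response.read()
    if not payload:
        raise ValueError(f"Empty response from {url}")
    Path(dest).write_bytes(payload)
    return len(payload)


# Example (uses the variables defined in the cell above):
# download_script(github_url, SCRIPT_NAME)
```

Because the file is only written after the response has been read successfully, a failed request leaves no partial file on disk.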

## Step 3: Install Dependencies

Install required packages (PyTorch should already be installed in Colab).

In [None]:
!pip install -q optuna opencv-python-headless seaborn scikit-learn timm

# Verify installations
import optuna
import cv2
import seaborn
import timm
import torch  # already imported in Step 1; repeated so this cell runs on its own

print("✅ All dependencies installed!")
print(f"Optuna version: {optuna.__version__}")
print(f"PyTorch version: {torch.__version__}")
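
To see every package version at once without importing the packages, you can query the installed distributions. This is a small sketch using only the standard library; `REQUIRED` and `installed_versions` are names made up for this example, and the list mirrors the `pip install` line above plus `torch`.

```python
from importlib import metadata

# Distribution names as passed to pip in the install cell, plus torch.
REQUIRED = ["optuna", "opencv-python-headless", "seaborn", "scikit-learn", "timm", "torch"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:28s} {version or 'MISSING'}")
```

Any line that shows `MISSING` means the install cell should be run again before continuing.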

## Step 4: Configure Paths

Set paths to your data and where you want to save results.

**IMPORTANT:** Update these paths to match your Google Drive structure!

In [None]:
# ========== UPDATE THESE PATHS ==========

# Path to your Mirath_extracted_lines folder on Google Drive
DATA_ROOT = "/content/drive/MyDrive/Mirath_extracted_lines"

# Where to save checkpoints and results (created if it doesn't exist)
CHECKPOINT_DIR = "/content/drive/MyDrive/hyperparameter_search_results"

# ========================================

# Verify data path exists
if os.path.exists(DATA_ROOT):
    print(f"✅ Data found at: {DATA_ROOT}")
    !ls -d {DATA_ROOT}/train/* | head -5
else:
    print(f"❌ Data NOT found at: {DATA_ROOT}")
    print("   Please update DATA_ROOT above!")

print(f"\n📁 Results will be saved to: {CHECKPOINT_DIR}")
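
Before launching a long run, it can help to count how many line images each writer has. The sketch below assumes the layout `<DATA_ROOT>/train/<writer>/<image files>`: the `train` folder comes from the `ls` command above, but the per-writer image files and the list of extensions are assumptions, so adjust them to your data. `count_lines_per_writer` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the files in your dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def count_lines_per_writer(data_root, split="train"):
    """Return {writer_folder_name: image_count} for one split.

    Assumes the layout <data_root>/<split>/<writer>/<image files>.
    """
    split_dir = Path(data_root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"Missing split folder: {split_dir}")
    counts = {}
    for writer_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[writer_dir.name] = sum(
            1
            for f in writer_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


# Example:
# counts = count_lines_per_writer(DATA_ROOT)
# print(len(counts), "writers;", min(counts.values()), "images in the smallest folder")
```

A writer count far from the 21 writers used by the full search, or a folder with very few images, is worth investigating before you spend GPU hours.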

## Step 5: Run Quick Test (RECOMMENDED FIRST!)

**Run this first to verify everything works!**

Quick test uses minimal resources:
- 3 writers (instead of 21)
- 50 lines/writer (instead of 300)
- Batch size 32 (instead of 128)
- 5 epochs (instead of 70)
- 2 trials (instead of 12)

**Time: ~10-20 minutes**

In [None]:
!python run_hyperparameter_search_colab.py \
  --data_root "{DATA_ROOT}" \
  --checkpoint_dir "{CHECKPOINT_DIR}/quick_test" \
  --quick_test

print("\n✅ Quick test finished! Check the output above for errors.")
print(f"Results saved to: {CHECKPOINT_DIR}/quick_test")
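
A `!python ...` line does not stop the notebook when the script fails, so the message after it prints either way. If you want the success message to depend on the result, one option is to launch the script from Python and inspect its exit code. This is a sketch: `run_search` is a hypothetical helper, and it assumes the script exits with a non-zero code on failure, which is what Python does for an uncaught exception.

```python
import subprocess
import sys


def run_search(script, args):
    """Run the script with the current interpreter, echo its output, return its exit code."""
    cmd = [sys.executable, "-u", script, *args]  # -u: unbuffered, so lines appear as they are produced
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            print(line, end="")
    return proc.returncode


# Example with the flags used in the cell above:
# code = run_search(
#     "run_hyperparameter_search_colab.py",
#     ["--data_root", DATA_ROOT,
#      "--checkpoint_dir", f"{CHECKPOINT_DIR}/quick_test",
#      "--quick_test"],
# )
# print("Quick test OK" if code == 0 else f"Quick test failed (exit code {code})")
```

The output is forwarded line by line so that progress stays visible in the cell during a long run.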

## Step 6: Run Full Hyperparameter Search

**Only run this after quick test succeeds!**

This will run the full hyperparameter search:
- All 21 writers
- 300 lines/writer
- Batch size 128
- 70 epochs per trial
- 12 trials total

**Time: ~24-48 hours**

**Note:** This is resumable! If Colab disconnects, just run this cell again and it will continue from where it stopped.
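
To get a feel for how many Colab sessions the full search may need, here is a rough back-of-the-envelope calculation. It uses the 24-48 hour estimate and the 12 trials from this step, and the roughly 12-hour free-session limit mentioned in the tips at the end of this notebook. It assumes every trial takes the same time and that a trial cut off by a disconnect restarts from its beginning; neither is confirmed by the script, so treat the numbers as a guide only.

```python
import math


def session_plan(total_hours, n_trials, session_limit_hours):
    """Return (hours per trial, whole trials per session, sessions needed).

    Assumes equal-length trials and that an interrupted trial restarts
    from scratch in the next session.
    """
    hours_per_trial = total_hours / n_trials
    trials_per_session = math.floor(session_limit_hours / hours_per_trial)
    if trials_per_session < 1:
        raise ValueError("A single trial does not fit in one session")
    sessions = math.ceil(n_trials / trials_per_session)
    return hours_per_trial, trials_per_session, sessions


for total in (24, 48):  # lower and upper runtime estimates from this step
    print(total, session_plan(total, n_trials=12, session_limit_hours=12))
# 24 (2.0, 6, 2)
# 48 (4.0, 3, 4)
```

Under these assumptions, expect to restart the search cell between one and three times before all 12 trials are done.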

In [None]:
!python run_hyperparameter_search_colab.py \
  --data_root "{DATA_ROOT}" \
  --checkpoint_dir "{CHECKPOINT_DIR}/full_search" \
  --n_trials 12 \
  --use_all_writers

print("\n✅ Full search finished! Check the output above for errors.")
print(f"Results saved to: {CHECKPOINT_DIR}/full_search")

## Alternative: Run with Writer Subset

If you want to experiment with fewer writers (faster):

In [None]:
# Run with 7 writers instead of all 21
!python run_hyperparameter_search_colab.py \
  --data_root "{DATA_ROOT}" \
  --checkpoint_dir "{CHECKPOINT_DIR}/subset_7writers" \
  --n_trials 12 \
  --num_writers_subset 7

## Step 7: View Results

After training completes, view the results:

In [None]:
import json
import pandas as pd
from pathlib import Path

# Choose which run to analyze
RESULTS_DIR = f"{CHECKPOINT_DIR}/full_search"  # or "quick_test"

# List all trials
print("Trials completed:")
!ls -d {RESULTS_DIR}/trial_* 2>/dev/null | head -20

# Load best trial summary
summary_file = Path(RESULTS_DIR) / "best_overall" / "summary.txt"
if summary_file.exists():
    print("\n" + "="*70)
    print("BEST TRIAL SUMMARY")
    print("="*70)
    with open(summary_file) as f:
        print(f.read())
else:
    print("\n⚠️ No results yet. Training still in progress or not started.")

# Show training plots
import matplotlib.pyplot as plt
from IPython.display import Image as IPImage, display

plots_dir = Path(RESULTS_DIR) / "best_overall" / "plots"
if plots_dir.exists():
    print("\n📊 Training Plots:")
    for plot in sorted(plots_dir.glob("*.png")):
        print(f"\n{plot.name}")
        display(IPImage(filename=str(plot)))
else:
    print("\n⚠️ No plots found yet.")
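
The `ls` call above sorts names alphabetically, so `trial_10` appears before `trial_2`. If you want the trial folders in numeric order from Python, something like the following may help. It assumes the folders are named `trial_<number>`; the cell above only shows the `trial_*` pattern, so the exact naming is a guess. `list_trial_dirs` is a hypothetical helper name.

```python
import re
from pathlib import Path


def list_trial_dirs(results_dir):
    """Return trial_<N> folders under results_dir, ordered by N."""
    pattern = re.compile(r"trial_(\d+)")
    found = []
    for path in Path(results_dir).glob("trial_*"):
        match = pattern.fullmatch(path.name)
        if path.is_dir() and match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


# Example:
# for trial_dir in list_trial_dirs(RESULTS_DIR):
#     print(trial_dir.name)
```

Folders that do not match the assumed pattern are skipped without an error.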

## Step 8: Download Best Model

Download the best model and results to your local machine:

In [None]:
from google.colab import files

# Create a zip of the best model
RESULTS_DIR = f"{CHECKPOINT_DIR}/full_search"
!cd {RESULTS_DIR} && zip -r best_model.zip best_overall/

# Download
files.download(f"{RESULTS_DIR}/best_model.zip")

print("✅ Best model downloaded!")
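
If the `zip` command is not available, or your paths contain spaces, the archive can also be built with Python's standard library. This is a sketch of an alternative, not what the cell above does; `zip_best_model` is a hypothetical helper name, and the folder and archive names match the ones used above.

```python
import shutil
from pathlib import Path


def zip_best_model(results_dir):
    """Zip <results_dir>/best_overall into <results_dir>/best_model.zip."""
    results = Path(results_dir)
    if not (results / "best_overall").is_dir():
        raise FileNotFoundError(f"No best_overall folder in {results}")
    archive = shutil.make_archive(
        base_name=str(results / "best_model"),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(results),    # paths inside the archive are relative to this folder
        base_dir="best_overall",  # only this subfolder is archived
    )
    return Path(archive)


# Example:
# archive_path = zip_best_model(RESULTS_DIR)
# files.download(str(archive_path))
```

Raising an error when `best_overall` is missing avoids downloading an empty archive from a run that has not finished.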

## 💡 Tips for Using Colab

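1. **Colab Timeout:** Free Colab sessions disconnect after ~12 hours, so the full search (~24-48 hours) will not finish in one session. Rely on the resume behavior below, or use Colab Pro for longer sessions.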

2. **Resumability:** If disconnected, just run the search cell again. It will resume from the Optuna database.

3. **Monitor Progress:** Check your Google Drive folder to see trial results being saved in real-time.

4. **GPU Memory:** The A100 has 40 GB of memory, which is more than enough for batch size 128.

5. **Save Often:** Results auto-save to Google Drive, so you won't lose progress.

6. **Check Status:** Look at the Optuna database file (`optuna_study.db`) to see completed trials.

## 🚨 Troubleshooting

**Problem:** "No GPU available"  
**Solution:** Go to `Runtime → Change runtime type → Hardware accelerator → GPU`

**Problem:** "Data not found"  
**Solution:** Update `DATA_ROOT` path in Step 4

**Problem:** "Out of memory"  
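**Solution:** Add the `--quick_test` flag, which lowers the batch size from 128 to 32. It also reduces the writers, lines per writer, epochs, and trials (see Step 5), so use it to confirm the setup, not to produce final results.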

**Problem:** Colab disconnected  
**Solution:** Just run the cell again - it will resume automatically