# Hyperspectral Material Classification - Google Colab Training

This notebook trains the material classification model on Google Colab's free GPU.

**Before running:**
1. Runtime → Change runtime type → GPU (T4)
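2. Upload your data to Google Drive (Section 2 expects `MyDrive/dl-plastics-data/` containing `data/` and, optionally, `inference_data_set1/` and `inference_data_set2/`)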
3. Run cells in order

## 1. Setup Environment

In [None]:
# Check GPU availability
!nvidia-smi
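
The same check can be made from Python so that later cells can branch on the result. This is a minimal sketch using only the standard library; the helper name `gpu_available` is hypothetical and not part of the repository.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools found, so no GPU runtime is attached
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if gpu_available():
    print("GPU detected")
else:
    print("No GPU detected: check the runtime type before training")
```

If this reports no GPU, training will likely still run on CPU but far more slowly, so it is worth fixing the runtime type first.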

In [None]:
# Clone repository
!git clone https://github.com/PlugNawapong/my-ml-project.git
%cd my-ml-project

In [None]:
# Install dependencies
!pip install -q torch torchvision albumentations tqdm Pillow numpy matplotlib scikit-learn
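
To confirm the install worked before moving on, the packages can be probed without importing them. A small sketch follows; `missing_packages` is a hypothetical helper, and the names listed are the import names of the packages installed above (`Pillow` imports as `PIL`, `scikit-learn` as `sklearn`).

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


required = [
    "torch", "torchvision", "albumentations", "tqdm",
    "PIL", "numpy", "matplotlib", "sklearn",
]
missing = missing_packages(required)
print("All dependencies found" if not missing else f"Missing: {missing}")
```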

## 2. Load Data from Google Drive

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Copy data from Google Drive to Colab workspace
# ADJUST THE PATH to match your Google Drive folder structure

import os

# Example path - adjust to your Google Drive structure
drive_data_path = '/content/drive/MyDrive/dl-plastics-data'

# Copy training data (paths are quoted so folder names with spaces work)
if os.path.exists(f'{drive_data_path}/data'):
    !cp -r "{drive_data_path}/data" ./
    print('✓ Training data copied')
else:
    print('⚠ Training data not found. Please upload to Google Drive first.')

# Copy inference datasets
if os.path.exists(f'{drive_data_path}/inference_data_set1'):
    !cp -r "{drive_data_path}/inference_data_set1" ./
    print('✓ Inference dataset 1 copied')
else:
    print('⚠ Inference dataset 1 not found; skipping.')

if os.path.exists(f'{drive_data_path}/inference_data_set2'):
    !cp -r "{drive_data_path}/inference_data_set2" ./
    print('✓ Inference dataset 2 copied')
else:
    print('⚠ Inference dataset 2 not found; skipping.')
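
The copy step can also be written in pure Python, which avoids shell quoting issues and lets the cell be re-run safely. This is a sketch under the same folder-layout assumption as above; `copy_if_present` is a hypothetical helper, not something the repository provides.

```python
import shutil
from pathlib import Path


def copy_if_present(src_root: str, name: str, dst_root: str = ".") -> bool:
    """Copy src_root/name to dst_root/name; return False if the source is missing."""
    src = Path(src_root) / name
    if not src.is_dir():
        return False
    # dirs_exist_ok (Python 3.8+) lets a re-run overwrite an earlier copy
    shutil.copytree(src, Path(dst_root) / name, dirs_exist_ok=True)
    return True


drive_data_path = "/content/drive/MyDrive/dl-plastics-data"  # adjust to your Drive layout
for folder in ("data", "inference_data_set1", "inference_data_set2"):
    status = "copied" if copy_if_present(drive_data_path, folder) else "not found"
    print(f"{folder}: {status}")
```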

## 3. Inspect Data (Optional)

In [None]:
# Inspect training data
!python inspect_data.py --data_dir data

# Display generated plots
from IPython.display import Image, display

print('\n=== Band Visualization ===')
display(Image('data_inspection_bands.png'))

print('\n=== Label Visualization ===')
display(Image('data_inspection_labels.png'))

print('\n=== Raw Spectral Signatures ===')
display(Image('data_inspection_spectra_raw.png'))

print('\n=== Normalized Spectral Signatures (Used in Training) ===')
display(Image('data_inspection_spectra_normalized.png'))

## 4. Train Model

Choose one of the training options below:

### Option A: Fast 1D Model (Recommended for Quick Testing)

In [None]:
!python train.py \
    --model spectral_cnn_1d \
    --epochs 50 \
    --batch_size 2048 \
    --max_samples_per_class 10000 \
    --dropout 0.5 \
    --lr 0.001

### Option B: 2D CNN with Spatial Patches

In [None]:
!python train.py \
    --model spectral_cnn_2d \
    --use_patches \
    --patch_size 3 \
    --epochs 50 \
    --batch_size 512 \
    --max_samples_per_class 5000 \
    --dropout 0.5 \
    --bin_factor 2

### Option C: Hybrid Model (Best Accuracy)

In [None]:
!python train.py \
    --model hybrid \
    --use_patches \
    --patch_size 5 \
    --epochs 100 \
    --batch_size 256 \
    --max_samples_per_class 5000 \
    --dropout 0.5 \
    --augment \
    --bin_factor 2
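
All three options pass `--max_samples_per_class`, which limits how many samples each class contributes. How `train.py` implements this is not shown in this notebook; the sketch below only illustrates the general idea, assuming a random per-class subsample. The function `cap_per_class` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def cap_per_class(labels, max_per_class, seed=0):
    """Return sorted sample indices with at most `max_per_class` per label."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = []
    for indices in by_class.values():
        if len(indices) > max_per_class:
            # Over-represented class: keep a random subset
            indices = rng.sample(indices, max_per_class)
        kept.extend(indices)
    return sorted(kept)


# Toy example: class 0 has 6 samples, class 1 has 2; cap at 3 per class
labels = [0, 0, 0, 0, 0, 0, 1, 1]
subset = cap_per_class(labels, max_per_class=3)
print(len(subset))  # 5: three from class 0, both from class 1
```

A cap like this shortens each epoch and reduces class imbalance, at the cost of discarding some data from the larger classes.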

## 5. Run Inference

In [None]:
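# Find the most recently trained model
import glob
import os

model_files = glob.glob('outputs/*/best_model.pth')
if model_files:
    # Pick by modification time; sorting paths alphabetically only works
    # if the run directories happen to sort in chronological order
    latest_model = max(model_files, key=os.path.getmtime)
    print(f'Using model: {latest_model}')
else:
    print('No trained model found! Run one of the training options in Section 4 first.')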

In [None]:
# Run inference on inference_data_set1
# --model must match the architecture you trained (spectral_cnn_1d corresponds to Option A)
!python inference.py \
    --checkpoint {latest_model} \
    --model spectral_cnn_1d \
    --data_dir inference_data_set1

In [None]:
# Run inference on inference_data_set2
# --model must match the architecture you trained (spectral_cnn_1d corresponds to Option A)
!python inference.py \
    --checkpoint {latest_model} \
    --model spectral_cnn_1d \
    --data_dir inference_data_set2
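
Because the checkpoint and `--model` have to agree, it can help to build the command in one place rather than editing each cell. The sketch below uses only the three flags that appear in the cells above. Whether `inference.py` needs further flags for the 2D or hybrid models is not covered in this notebook, and `build_inference_cmd` and the example checkpoint path are hypothetical.

```python
import shlex


def build_inference_cmd(checkpoint: str, model: str, data_dir: str) -> str:
    """Assemble an inference.py command line from the flags used in this notebook."""
    args = [
        "python", "inference.py",
        "--checkpoint", checkpoint,
        "--model", model,
        "--data_dir", data_dir,
    ]
    # shlex.join (Python 3.8+) quotes arguments containing spaces or shell metacharacters
    return shlex.join(args)


# Placeholder checkpoint path for illustration only
cmd = build_inference_cmd("outputs/run1/best_model.pth", "hybrid", "inference_data_set1")
print(cmd)
```

In a Colab cell the resulting string can be executed with `!{cmd}`.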

## 6. Visualize Results

In [None]:
# Display prediction visualizations and statistics for each inference dataset
from IPython.display import Image, display
import json

for i, dataset in enumerate(['inference_data_set1', 'inference_data_set2'], start=1):
    print(f'\n=== Inference Data Set {i} Results ===')
    display(Image(f'predictions/{dataset}/prediction_visualization.png'))

    # Show statistics
    with open(f'predictions/{dataset}/statistics.json', 'r') as f:
        stats = json.load(f)
    print(f"Mean Confidence: {stats['mean_confidence']:.4f}")
    print("\nClass Distribution:")
    for class_name, class_stats in stats['class_distribution'].items():
        if class_stats['percentage'] > 0:
            print(f"  {class_name}: {class_stats['percentage']:.2f}% (conf: {class_stats['mean_confidence']:.4f})")
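
To see how the two datasets differ, their class distributions can be placed side by side. This sketch assumes the `statistics.json` layout read above (`class_distribution` mapping each class name to a dict with a `percentage` key). The helper `compare_distributions` is hypothetical, and the example dictionaries hold placeholder values, not real results.

```python
def compare_distributions(stats_a: dict, stats_b: dict) -> list:
    """Return (class_name, pct_a, pct_b) rows for classes present in either dataset."""
    dist_a = stats_a["class_distribution"]
    dist_b = stats_b["class_distribution"]
    rows = []
    for name in sorted(set(dist_a) | set(dist_b)):
        pct_a = dist_a.get(name, {}).get("percentage", 0.0)
        pct_b = dist_b.get(name, {}).get("percentage", 0.0)
        if pct_a > 0 or pct_b > 0:
            rows.append((name, pct_a, pct_b))
    return rows


# Illustrative values only; class names are placeholders
example_a = {"class_distribution": {"class_x": {"percentage": 60.0},
                                    "class_y": {"percentage": 40.0}}}
example_b = {"class_distribution": {"class_x": {"percentage": 25.0},
                                    "class_z": {"percentage": 75.0}}}
for name, pct_a, pct_b in compare_distributions(example_a, example_b):
    print(f"{name:<10} {pct_a:6.2f}% {pct_b:6.2f}%")
```

With the real files, load each `statistics.json` using `json.load` and pass the two resulting dictionaries to the function.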

## 7. Download Results

In [None]:
# Zip all results
!zip -r results.zip outputs/ predictions/

# Download to your computer
from google.colab import files
files.download('results.zip')

print('\n✓ Results downloaded! Extract the zip file to see:')
print('  - outputs/ : Trained models and training history')
print('  - predictions/ : Inference results and visualizations')
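
If one of the two folders does not exist yet (for example, inference has not been run), the archive can be built in Python and missing folders skipped explicitly. This is a minimal sketch using the standard `zipfile` module; `zip_dirs` is a hypothetical helper.

```python
import os
import zipfile


def zip_dirs(zip_path: str, dirs) -> int:
    """Add every file under the existing directories in `dirs` to zip_path; return the file count."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for d in dirs:
            if not os.path.isdir(d):
                print(f"Skipping missing directory: {d}")
                continue
            for root, _, filenames in os.walk(d):
                for filename in filenames:
                    # Stored under the same path it has on disk
                    zf.write(os.path.join(root, filename))
                    count += 1
    return count
```

In this notebook it would be called as `zip_dirs('results.zip', ['outputs', 'predictions'])`, followed by `files.download('results.zip')` as above.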

## 8. Save to Google Drive (Optional)

In [None]:
# Copy results back to Google Drive
!mkdir -p /content/drive/MyDrive/dl-plastics-results
!cp -r outputs/ /content/drive/MyDrive/dl-plastics-results/
!cp -r predictions/ /content/drive/MyDrive/dl-plastics-results/

print('✓ Results saved to Google Drive: /MyDrive/dl-plastics-results/')