# RASC: Relationship-Aware Scene Captioning
## Google Colab Training Notebook

This notebook trains the complete RASC pipeline on Google Colab:
1. **Object Detection** (YOLOv8)
2. **Relationship Prediction** (Neural Motifs)
3. **Caption Generation** (T5)

---

## 📋 Setup & Installation

In [None]:
# Check GPU availability
!nvidia-smi

In [None]:
# Install dependencies
!pip install -q ultralytics transformers datasets PyYAML pillow tqdm scikit-learn

In [None]:
# Mount Google Drive (to save models and access data)
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Clone or upload the RASC project
# Option 1: Clone from GitHub
# !git clone https://github.com/yourusername/rasc.git /content/rasc

# Option 2: Upload the zip file and extract
!mkdir -p /content/rasc
# Upload rasc-project.zip using the file browser, then:
# !unzip -q /content/rasc-project.zip -d /content/

# Option 3: Copy from Google Drive
# !cp -r /content/drive/MyDrive/rasc-project /content/rasc

In [None]:
# Setup Python path
import sys
sys.path.insert(0, '/content/rasc/src')

# Verify setup
!ls -la /content/rasc/
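
A directory listing shows what is present but not what is missing. The sketch below checks for the folders that later cells rely on. The list of expected paths is inferred from the script paths used in the training commands (`src/models`, `src/data`), and the helper name `missing_paths` is hypothetical.

```python
from pathlib import Path

def missing_paths(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]

# Assumption: paths inferred from the training commands in later cells.
EXPECTED = ["src", "src/models", "src/data"]

missing = missing_paths("/content/rasc", EXPECTED)
if missing:
    print("Missing from /content/rasc:", ", ".join(missing))
else:
    print("Project layout looks complete")
```

If anything is reported missing, re-run whichever clone, unzip, or copy option you used above before continuing.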

## 📂 Upload Processed Data

The processed data already exists on your local machine, so upload it to Colab using one of the two options below:

In [None]:
# Option 1: Upload from local machine
from google.colab import files

print("Upload your processed data files:")
print("1. vg_5k_subset.json")
print("2. label_map.json")
print("3. relationship_pairs.json")
print("4. Train/val/test splits (zip them first)")

uploaded = files.upload()
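
`files.upload()` saves the files into the current working directory, so they still need to be moved to where the pipeline expects them. The following is a minimal sketch of that step. The target layout mirrors the directories used in the Google Drive option, and the helper name `place_uploaded` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path

# Sub-directory (relative to the data root) for each expected upload.
# Assumption: same layout as the Google Drive copy option in this notebook.
SUBDIRS = {
    "vg_5k_subset.json": "processed",
    "label_map.json": "processed",
    "relationship_pairs.json": "processed/relationships",
}

def place_uploaded(filenames, upload_dir=".", data_root="/content/rasc/data"):
    """Move recognised uploads under data_root; return the names left untouched."""
    leftover = []
    for name in filenames:
        subdir = SUBDIRS.get(name)
        if subdir is None:
            # Not one of the known JSON files (for example, the zipped splits).
            leftover.append(name)
            continue
        target_dir = Path(data_root) / subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(Path(upload_dir) / name), str(target_dir / name))
    return leftover
```

In Colab this could be called as `place_uploaded(list(uploaded))` right after the upload. Anything it returns, such as the zipped splits, still has to be unpacked into `/content/rasc/data/splits` by hand.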

In [None]:
# Option 2: Copy from Google Drive (faster for large files)
# Assuming you've uploaded data to Drive beforehand:
!mkdir -p /content/rasc/data/processed/relationships
!mkdir -p /content/rasc/data/splits

# Copy processed files
!cp /content/drive/MyDrive/rasc_data/vg_5k_subset.json /content/rasc/data/processed/
!cp /content/drive/MyDrive/rasc_data/label_map.json /content/rasc/data/processed/
!cp /content/drive/MyDrive/rasc_data/relationship_pairs.json /content/rasc/data/processed/relationships/

# Copy splits
!cp -r /content/drive/MyDrive/rasc_data/splits/* /content/rasc/data/splits/

# Verify
!ls -lh /content/rasc/data/processed/
!ls -lh /content/rasc/data/splits/
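
Beyond checking that the files exist, it can help to confirm that `label_map.json` matches the 150 classes the rest of the notebook assumes. The sketch below assumes the file maps class names to 0-based integer indices, which is what YOLO class IDs require but is not confirmed by the project itself. The helper name `label_map_problems` is hypothetical.

```python
def label_map_problems(label_map, expected_classes=150):
    """Return a list of readable problems; an empty list means the map looks consistent."""
    problems = []
    if len(label_map) != expected_classes:
        problems.append(
            f"expected {expected_classes} classes, found {len(label_map)}"
        )
    # Assumption: indices should run 0, 1, ..., n-1 with no gaps or duplicates.
    indices = sorted(label_map.values())
    if indices != list(range(len(indices))):
        problems.append("class indices are not contiguous from 0")
    return problems

# Tiny illustrative map with a gap at index 2.
demo = {"window": 0, "man": 1, "tree": 3}
print(label_map_problems(demo, expected_classes=3))
```

To check the real file, load `/content/rasc/data/processed/label_map.json` with `json.load` and pass the result to `label_map_problems`.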

## ⚙️ Configuration

Create a Colab-optimized configuration:

In [None]:
import torch
import yaml

# Detect device
device = "cuda:0" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Create Colab config
config = {
    'project': {'name': 'rasc', 'version': '1.0.0', 'seed': 42},
    'paths': {
        'processed_data': '/content/rasc/data/processed',
        'models': '/content/rasc/models',
        'experiments': '/content/rasc/experiments/runs',
        'logs': '/content/rasc/logs',
        'label_map': '/content/rasc/data/processed/label_map.json',
        'relationship_pairs': '/content/rasc/data/processed/relationships/relationship_pairs.json',
        'splits': '/content/rasc/data/splits'
    },
    'detection': {
        'model_name': 'yolov8n',
        'training': {
            'epochs': 30,
            'batch_size': 16,
            'image_size': [512, 640],
            'workers': 2,
            'device': device
        },
        'optimizer': {'lr0': 0.01}
    },
    'relationship': {
        'model_type': 'neural_motifs',
        'num_classes': 150,
        'num_relations': 10,
        'embedding_dim': 128,
        'training': {
            'epochs': 15,
            'batch_size': 64,
            'learning_rate': 0.001
        }
    },
    'captioning': {
        'model_name': 't5-small',
        'training': {
            'epochs': 5,
            'batch_size': 8,
            'learning_rate': 0.0001
        }
    }
}

# Save config
!mkdir -p /content/rasc/configs
with open('/content/rasc/configs/config.yaml', 'w') as f:
    yaml.dump(config, f)

print("✓ Configuration created")
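
A typo in a nested key only surfaces once a training script reads it, so a quick structural check on the config can save a failed run. The sketch below assumes nothing about which keys the training scripts actually read. The `REQUIRED` list is illustrative, and the helper name `missing_keys` is hypothetical.

```python
def missing_keys(config, dotted_keys):
    """Return the dotted keys that cannot be resolved in a nested dict."""
    missing = []
    for dotted in dotted_keys:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing

# Illustrative selection of keys from the config defined in this notebook.
REQUIRED = [
    "paths.label_map",
    "paths.splits",
    "detection.training.epochs",
    "relationship.training.learning_rate",
    "captioning.model_name",
]

# Small stand-in config; in the notebook, pass the real `config` dict instead.
demo_config = {"paths": {"label_map": "label_map.json"}, "captioning": {}}
print(missing_keys(demo_config, REQUIRED))
```

Calling `missing_keys(config, REQUIRED)` on the dictionary built above should return an empty list.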

## 🎯 Stage 1: Object Detection Training

Train YOLOv8 for object detection:

In [None]:
# Load class names from label_map.json, ordered by class index
import json
import yaml

with open('/content/rasc/data/processed/label_map.json') as f:
    label_map = json.load(f)

names = [name for name, idx in sorted(label_map.items(), key=lambda item: item[1])]
print(f"Loaded {len(names)} object classes")  # the config above expects 150

# Write the YOLO dataset config with the full class list,
# so that `nc` always matches the number of names
yolo_config = {
    'path': '/content/rasc/data/splits',
    'train': 'train/images',
    'val': 'val/images',
    'test': 'test/images',
    'nc': len(names),
    'names': names,
}

with open('/content/rasc/configs/yolo.yaml', 'w') as f:
    yaml.safe_dump(yolo_config, f, sort_keys=False)

In [None]:
# Train YOLO
%cd /content/rasc
!python src/models/train_yolo.py \
  --config configs/config.yaml \
  --experiment-name yolo_colab_v1

In [None]:
# Inspect training logs. Colab runs one cell at a time, so run this
# after the training cell finishes (or after interrupting it).
!tail -n 20 /content/rasc/logs/*.log

# Check experiments
!ls -lh /content/rasc/experiments/runs/

## 🔗 Stage 2: Relationship Prediction Training

In [None]:
# Train relationship model
!python src/models/train_relationship.py \
  --config configs/config.yaml \
  --experiment-name relationship_colab_v1

In [None]:
# Check training progress
import glob
import json
import os

import matplotlib.pyplot as plt

# Find the latest experiment (names sort chronologically if they end in a timestamp)
exp_dirs = sorted(glob.glob('/content/rasc/experiments/runs/relationship_*'))
metrics_file = os.path.join(exp_dirs[-1], 'metrics', 'metrics.json') if exp_dirs else None

if metrics_file and os.path.exists(metrics_file):
    with open(metrics_file) as f:
        metrics = json.load(f)

    # Plot training and validation loss curves
    if 'train_loss' in metrics:
        train_loss = [m['value'] for m in metrics['train_loss']]
        plt.plot(train_loss, label='Train Loss')

    if 'val_loss' in metrics:
        val_loss = [m['value'] for m in metrics['val_loss']]
        plt.plot(val_loss, label='Val Loss')

    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.title('Relationship Model Training')
    plt.show()
else:
    print("No relationship metrics found yet")
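
The loss curves show the trend, but the epoch with the lowest validation loss is easier to read off numerically. The sketch below assumes the same record shape the plotting cell reads (a list of `{'value': ...}` entries), and further assumes one entry per epoch, which the project does not confirm. The helper name `best_epoch` is hypothetical.

```python
def best_epoch(history):
    """Return (index, value) of the lowest entry in a metric history.

    `history` is a list of {'value': float} records, assumed to hold
    one record per epoch in training order.
    """
    if not history:
        raise ValueError("metric history is empty")
    values = [record["value"] for record in history]
    index = min(range(len(values)), key=values.__getitem__)
    return index, values[index]

# Illustrative data only; in the notebook, pass metrics['val_loss'].
demo = {"val_loss": [{"value": 0.92}, {"value": 0.71}, {"value": 0.78}]}
epoch, loss = best_epoch(demo["val_loss"])
print(f"Lowest val loss {loss:.2f} at epoch index {epoch}")
```

The returned index is 0-based, so add 1 if the training logs number epochs from 1.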

## 💬 Stage 3: Caption Generation Training

In [None]:
# Build the T5 dataset from the relationship data.
# Requires src/data/build_t5_dataset.py; create it or copy it into the project first.
!python src/data/build_t5_dataset.py --config configs/config.yaml

In [None]:
# Train T5 caption model
!python src/models/train_t5.py \
  --config configs/config.yaml \
  --experiment-name t5_colab_v1

## 🔮 Inference & Testing

In [None]:
# Upload a test image
import io

import matplotlib.pyplot as plt
from google.colab import files
from PIL import Image

print("Upload a test image:")
uploaded = files.upload()

# Get the filename
test_image = list(uploaded.keys())[0]

# Display image
img = Image.open(io.BytesIO(uploaded[test_image]))
plt.figure(figsize=(10, 8))
plt.imshow(img)
plt.axis('off')
plt.title('Test Image')
plt.show()

In [None]:
# Run inference
!python src/models/inference.py \
  --image "{test_image}" \
  --config configs/config.yaml \
  --output results.json

In [None]:
# View results
import json

with open('results.json') as f:
    results = json.load(f)

print("="*60)
print("INFERENCE RESULTS")
print("="*60)
print(f"\nObjects detected: {results['num_objects']}")
print(f"Relationships: {results['num_relationships']}")
print(f"\nGenerated Caption:")
print(f"  {results['caption']}")
print(f"\nInference time: {results['inference_time']:.2f}s")
print("="*60)

## 💾 Save Models to Google Drive

In [None]:
# Save trained models to Google Drive for persistence
!mkdir -p /content/drive/MyDrive/rasc_models

# Copy YOLO weights
!cp -r /content/rasc/experiments/runs/yolo_colab_v1*/weights \
  /content/drive/MyDrive/rasc_models/yolo_weights

# Copy relationship model
!cp /content/rasc/models/relationship_predictor/*.pt \
  /content/drive/MyDrive/rasc_models/

# Copy T5 model
!cp -r /content/rasc/models/caption_generator/t5_scene \
  /content/drive/MyDrive/rasc_models/

print("✓ Models saved to Google Drive")

## 📊 View Experiment Results

In [None]:
# List all experiments
!ls -lh /content/rasc/experiments/runs/

# View metrics for a specific experiment
exp_name = "yolo_colab_v1_20240210_143022"  # Replace with your experiment
!cat /content/rasc/experiments/runs/{exp_name}/metrics/metrics.json | python -m json.tool
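
Hardcoding the timestamped run name means editing the cell after every training run. The sketch below resolves the newest run for a given prefix instead. It assumes run directories end in a fixed-width `YYYYMMDD_HHMMSS` suffix, as in the example name above, so that lexicographic order matches chronological order. The helper name `latest_run` is hypothetical.

```python
import glob
import os

def latest_run(runs_dir, prefix):
    """Return the newest run directory whose name starts with `prefix`, or None."""
    pattern = os.path.join(runs_dir, prefix + "*")
    matches = [path for path in glob.glob(pattern) if os.path.isdir(path)]
    # Assumption: a fixed-width timestamp suffix makes string order chronological.
    return max(matches, default=None)

run = latest_run("/content/rasc/experiments/runs", "yolo_colab_v1_")
print(run if run else "No matching experiment found")
```

Including the trailing underscore in the prefix keeps `yolo_colab_v1_` from also matching a later series such as `yolo_colab_v10_`.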

## 🎨 Visualization (Optional)

In [None]:
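# Visualize detection results
import glob

import matplotlib.pyplot as plt
from ultralytics import YOLO

# YOLO() does not expand wildcards, so resolve the checkpoint path first
weights = sorted(glob.glob('/content/rasc/experiments/runs/yolo_colab_v1_*/weights/best.pt'))
model = YOLO(weights[-1])  # most recent run

# Run on the test image
results = model(test_image)

# plot() returns an annotated BGR array; reverse the channels to RGB for matplotlib
annotated = results[0].plot()
plt.imshow(annotated[..., ::-1])
plt.axis('off')
plt.show()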