# Visual Genome Adapter End-to-End (Colab)

Run the Visual Genome adapter experiments directly inside Google Colab. This notebook walks through mounting Drive, cloning the repo, preparing data, running the adapter experiment (as a quick smoke test or on the full dataset), and printing the test metrics inline.


## 1. Environment Prep

1. Mount Google Drive (`from google.colab import drive; drive.mount('/content/drive')`).
2. Clone or upload the repository into Colab (for example `!git clone https://github.com/<your-username>/comp545_final.git /content/comp545_final`).
3. Place the Visual Genome archives (or processed JSON) under a Drive directory; you will point `CONFIG["data_root"]` at it in the parameter block below.
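
The first two steps above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: `mount_drive` and `ensure_repo` are hypothetical helpers, and the clone URL keeps the `<your-username>` placeholder, which you must fill in yourself. The Drive import is guarded so the code also loads outside Colab.

```python
import subprocess
from pathlib import Path

# Placeholder URL from step 2; replace <your-username> before running.
REPO_URL = "https://github.com/<your-username>/comp545_final.git"
REPO_DIR = Path("/content/comp545_final")


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def ensure_repo(repo_url, dest):
    """Clone repo_url into dest unless dest already holds a git checkout.

    Hypothetical helper: returns True only if a clone was performed.
    """
    dest = Path(dest)
    if (dest / ".git").exists():
        return False  # already cloned; nothing to do
    subprocess.run(["git", "clone", repo_url, str(dest)], check=True)
    return True
```

In a Colab cell you would call `mount_drive()` followed by `ensure_repo(REPO_URL, REPO_DIR)`. Checking for `.git` makes the clone step safe to re-run after a runtime restart.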


In [None]:
# Parameters first: adjust requirements only if your environment lacks them
REQUIRED_PACKAGES = [
    "open-clip-torch",
    "torch",
    "torchvision",
    "pillow",
    "numpy",
    "matplotlib",
    "pandas",
]

%pip install -q {' '.join(REQUIRED_PACKAGES)}


In [None]:
# Parameter block: update these paths for your Colab session
CONFIG = {
    "repo_root": "/content/comp545_final",  # location of the cloned repo
    "data_root": "/content/drive/MyDrive/comp545_data",  # Drive folder containing Visual Genome data
    "output_root": "/content/drive/MyDrive/comp545_outputs",  # Drive folder for experiment results
    "env": "colab",
    "limit_images": 1000,  # set None for full dataset
    "output_name": "vg_colab_run",
    "distill_weights": (0.0, 0.1),
}

import os
import sys
from pathlib import Path

repo_root = Path(CONFIG["repo_root"]).expanduser().resolve()
data_root = Path(CONFIG["data_root"]).expanduser().resolve()
output_root = Path(CONFIG["output_root"]).expanduser().resolve()

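# Export the resolved paths as VG_COLAB_* environment variables
# (presumably read by the repo's path resolution for the "colab" env)
os.environ["VG_COLAB_REPO_ROOT"] = str(repo_root)
os.environ["VG_COLAB_DATA_ROOT"] = str(data_root)
os.environ["VG_COLAB_OUTPUT_ROOT"] = str(output_root)

# Create any missing directories so later cells can write to them
repo_root.mkdir(parents=True, exist_ok=True)
data_root.mkdir(parents=True, exist_ok=True)
output_root.mkdir(parents=True, exist_ok=True)

# Warn early if the clone step was skipped: later cells import from the repo's src/ package
if not (repo_root / "src").is_dir():
    print("warning: no src/ directory under repo_root; clone or upload the repository first")

# Make the repo's src package importable
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))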

print("repo_root:", repo_root)
print("data_root:", data_root)
print("output_root:", output_root)


## 2. Verify Dataset Layout
The helper functions expect the following structure under `CONFIG['data_root']`:
```
visual_genome_raw/
  region_descriptions.json  (or .zip)
  image_data.json           (or .zip)
  VG_100K.zip
  VG_100K_2.zip
visual_genome/
  images/
    VG_100K/
    VG_100K_2/
```
If you already processed the JSON on another machine, copy `visual_genome_splits.json` into `visual_genome/` and leave `RUN_PROCESS = False` in the next cell.
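
Before running any preparation, it can help to check the layout above programmatically. The following is a sketch under stated assumptions: `missing_layout_entries` is a hypothetical helper (the repository's own `verify_visual_genome` may check different things), and the zipped JSON file names are guessed as either `<name>.json.zip` or `<name>.zip`, since the listing only says "or .zip". It only reports what is absent and does not decide whether a missing entry matters for your workflow.

```python
from pathlib import Path

# Entries taken from the layout listing above.
RAW_JSON = ["region_descriptions.json", "image_data.json"]
RAW_ARCHIVES = ["VG_100K.zip", "VG_100K_2.zip"]
IMAGE_DIRS = ["VG_100K", "VG_100K_2"]


def missing_layout_entries(data_root):
    """Return the relative paths from the expected layout that are absent."""
    root = Path(data_root)
    raw = root / "visual_genome_raw"
    images = root / "visual_genome" / "images"
    missing = []
    for name in RAW_JSON:
        stem = name[: -len(".json")]
        # Assumption: a zipped JSON file is named either name.zip or stem.zip
        candidates = [raw / name, raw / f"{name}.zip", raw / f"{stem}.zip"]
        if not any(p.is_file() for p in candidates):
            missing.append(f"visual_genome_raw/{name}")
    for name in RAW_ARCHIVES:
        if not (raw / name).is_file():
            missing.append(f"visual_genome_raw/{name}")
    for name in IMAGE_DIRS:
        if not (images / name).is_dir():
            missing.append(f"visual_genome/images/{name}")
    return missing
```

Calling `missing_layout_entries(CONFIG["data_root"])` returns an empty list when every entry is present. If you only copied processed splits, the raw archives will be reported as missing, which is expected in that case.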


In [None]:
# Parameters for data preparation
RUN_PROCESS = False  # set True to (re)generate processed splits
PROCESS_OVERRIDES = {
    "max_images": 5000,
    "max_regions_per_image": 6,
    "min_region_words": 3,
    "validation_ratio": 0.1,
    "test_ratio": 0.1,
    "seed": 42,
}

from src.config.runtime import resolve_paths
from src.data.visual_genome import (
    VisualGenomeProcessConfig,
    download_visual_genome,
    process_visual_genome,
    verify_visual_genome,
)

if RUN_PROCESS:
    paths = resolve_paths(CONFIG["env"])
    cfg = VisualGenomeProcessConfig(**PROCESS_OVERRIDES)
    download_visual_genome(paths, include_images=True, force=False)
    processed_path = process_visual_genome(paths, cfg)
    print("processed splits:", processed_path)
    verify_visual_genome(paths)
else:
    print("Skipping data preparation; set RUN_PROCESS=True to enable.")


In [None]:
# Parameters for adapter training
ADAPTER_OVERRIDES = {
    "output_name": CONFIG["output_name"],
    "limit_images": CONFIG["limit_images"],
    "distill_weights": CONFIG["distill_weights"],
    "device_preference": "cuda",
    "adapter_steps": 300,
    "adapter_batch": 32,
}

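# Re-import resolve_paths so this cell also runs when the data preparation cell was skipped
from src.config.runtime import resolve_paths
from src.training.vg_adapter import AdapterExperimentConfig, run_visual_genome_adapter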

paths = resolve_paths(CONFIG["env"])
config = AdapterExperimentConfig()
for key, value in ADAPTER_OVERRIDES.items():
    setattr(config, key, value)

results = run_visual_genome_adapter(paths, config)
metrics_path = results.get("metrics_path")
print("metrics stored at:", metrics_path)

from pprint import pprint

summary = results.get("adapter_metrics")
if isinstance(summary, dict) and summary:
    # Report the last distillation weight in the summary; this is not necessarily the best-scoring one
    last_key = list(summary.keys())[-1]
    test_metrics = summary[last_key]["test"]
    # float() keeps the format call working if the keys were stored as numeric strings
    print("test split metrics (distill={:.2f}):".format(float(last_key)))
    pprint(test_metrics)
else:
    print("adapter metrics empty; check configuration.")


## 3. Next Steps
- Adjust `CONFIG` and `ADAPTER_OVERRIDES` as you move from smoke tests to full training.
- Results and plots land under `CONFIG['output_root']/CONFIG['output_name']`.
- To resume on a different machine, copy the processed `visual_genome_splits.json` and keep the same directory layout.
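
One way to manage the move from smoke tests to full training is to keep named override sets and validate them before applying. The sketch below makes several assumptions: `DemoAdapterConfig` is a hypothetical stand-in for `AdapterExperimentConfig` (whose real fields and type are not shown here), `apply_overrides` is a hypothetical helper that only works if the config is a dataclass, and the full-run values are placeholders rather than recommended settings.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional, Tuple


@dataclass
class DemoAdapterConfig:
    """Hypothetical stand-in for AdapterExperimentConfig, for illustration only."""
    output_name: str = "vg_colab_run"
    limit_images: Optional[int] = 1000
    distill_weights: Tuple[float, ...] = (0.0, 0.1)
    adapter_steps: int = 300
    adapter_batch: int = 32


# Smoke-test values mirror this notebook; the full-run values are placeholders.
SMOKE_OVERRIDES = {"limit_images": 1000, "adapter_steps": 300}
FULL_OVERRIDES = {
    "limit_images": None,  # None means the full dataset, as in CONFIG
    "adapter_steps": 3000,
    "output_name": "vg_full_run",
}


def apply_overrides(config, overrides):
    """Return a copy of config with overrides applied, rejecting unknown keys."""
    known = {f.name for f in fields(config)}
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise KeyError(f"unknown config fields: {unknown}")
    return replace(config, **overrides)
```

Unlike a bare `setattr` loop, this rejects a misspelled key such as `adpater_steps` instead of silently adding a new attribute, and it leaves the base config untouched so both presets can be derived from the same defaults.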

When you are ready to convert this notebook back to scripts, the functions live in `src/training/vg_adapter.py` and `src/data/visual_genome.py`.
