# Jersey Number Pipeline — Colab Setup & Run

**Before running:** Runtime → Change runtime type → **GPU** (e.g. T4).
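
To confirm the runtime really has a GPU attached before starting the long steps, a small check like the one below can help. This is a minimal sketch: `gpu_visible` is a hypothetical helper, not part of the pipeline, and it assumes that a GPU runtime exposes the `nvidia-smi` command on `PATH`.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA tools on PATH, so most likely a CPU runtime
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if gpu_visible():
    print("GPU detected.")
else:
    print("No GPU detected: check Runtime > Change runtime type.")
```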

Then run the cells in order. Edit the **CONFIG** cell to match your Google Drive paths.

## 1. CONFIG — Edit these to match your Google Drive

Put on Drive:
- Dataset zip (e.g. `jersey-2023.zip`) in your project folder.
- A `weights` folder with subfolders: `models/`, `reid/`, `pose/` (with the required .pth/.ckpt files).

**Persistence:** Set `PERSIST_TO_DRIVE = True` to clone the main repo and its sub-repos (SAM, Centroid-ReID, ViTPose, PARSeq) into a folder on Drive (e.g. `My Drive/jersey-number-pipeline/jersey-number-pipeline/`). The clones survive runtime disconnects, so later sessions can skip re-cloning. With `False`, everything lives under Colab's `/content/` and is lost when the runtime ends.

In [None]:
# ============== EDIT THESE ==============
DRIVE_BASE = "/content/drive/MyDrive"
DRIVE_PROJECT_FOLDER = "jersey-number-pipeline"  # folder under My Drive with your zip and weights
DATASET_ZIP = "jersey-2023.zip"
WEIGHTS_FOLDER = "weights"
MOUNT_DRIVE = True  # set False if you already mounted in a previous run
PERSIST_TO_DRIVE = True  # True = clone repo & SAM to Drive (persistent); False = use /content (lost when runtime ends)
# =======================================
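
To see where the checkout ends up for each `PERSIST_TO_DRIVE` setting, the path logic can be sketched on its own. `resolve_repo_root` is a hypothetical helper written for illustration; it mirrors the paths used by the clone cell in this notebook and does not touch the filesystem.

```python
import posixpath


def resolve_repo_root(persist_to_drive: bool, drive_base: str, project_folder: str) -> str:
    """Return the checkout location implied by the CONFIG values (illustrative only)."""
    if persist_to_drive:
        # Persistent: <Drive base>/<project folder>/jersey-number-pipeline
        return posixpath.join(drive_base, project_folder, "jersey-number-pipeline")
    # Ephemeral: lives on the Colab VM disk and disappears with the runtime
    return "/content/jersey-number-pipeline"


print(resolve_repo_root(True, "/content/drive/MyDrive", "jersey-number-pipeline"))
print(resolve_repo_root(False, "/content/drive/MyDrive", "jersey-number-pipeline"))
```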

## 2. Mount Google Drive

In [None]:
if MOUNT_DRIVE:
    from google.colab import drive
    drive.mount("/content/drive")
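
Once Drive is mounted, it may be worth checking that the dataset zip and the three weights subfolders are where the CONFIG cell expects them. The sketch below uses a hypothetical helper, `missing_drive_inputs`, and the default CONFIG values from this notebook; adjust the arguments if your layout differs.

```python
import os


def missing_drive_inputs(project_dir, dataset_zip, weights_folder):
    """Return the expected Drive paths that do not exist yet."""
    expected = [os.path.join(project_dir, dataset_zip)]
    for sub in ("models", "reid", "pose"):
        expected.append(os.path.join(project_dir, weights_folder, sub))
    return [p for p in expected if not os.path.exists(p)]


# Example with the default CONFIG values from this notebook
missing = missing_drive_inputs(
    "/content/drive/MyDrive/jersey-number-pipeline", "jersey-2023.zip", "weights"
)
for path in missing:
    print("Missing:", path)
```

An empty result means all four inputs were found; it does not verify that the weight files inside the subfolders are the right ones.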

## 3. Clone repo and all sub-repos (to Drive or /content)

Clones the main repo plus sub-repos as in `setup.py`: **SAM** (`sam2/`), **Centroid-ReID** (`reid/centroids-reid/`), **ViTPose** (`pose/ViTPose/`), **PARSeq** (`str/parseq/`). With `PERSIST_TO_DRIVE = True`, everything lives on Drive and persists across sessions.

In [None]:
import os

# Same repos as setup.py: main repo, sam2, reid/centroids-reid, pose/ViTPose, str/parseq
drive_project = os.path.join(DRIVE_BASE, DRIVE_PROJECT_FOLDER)
if PERSIST_TO_DRIVE:
    # Clone into Drive so all repos persist across sessions
    os.makedirs(drive_project, exist_ok=True)
    repo_root = os.path.join(drive_project, "jersey-number-pipeline")
    if not os.path.isdir(os.path.join(repo_root, ".git")):
        get_ipython().system(f'cd "{drive_project}" && git clone https://github.com/superbolt08/jersey-number-pipeline.git')
    get_ipython().run_line_magic("cd", repo_root)
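else:
    # Ephemeral: clone to /content (faster, lost when runtime ends)
    repo_root = "/content/jersey-number-pipeline"
    # Skip the clone on re-runs, and always clone from /content rather than the current directory
    if not os.path.isdir(os.path.join(repo_root, ".git")):
        get_ipython().system("cd /content && git clone https://github.com/superbolt08/jersey-number-pipeline.git")
    get_ipython().run_line_magic("cd", repo_root)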

# SAM (required for legibility)
if not os.path.isdir("sam2"):
    get_ipython().system("git clone --recurse-submodules https://github.com/davda54/sam.git sam2")

# Re-ID: centroids-reid (setup.py)
os.makedirs("reid", exist_ok=True)
if not os.path.isdir("reid/centroids-reid"):
    get_ipython().system("git clone --recurse-submodules https://github.com/mikwieczorek/centroids-reid.git reid/centroids-reid")
    os.makedirs("reid/centroids-reid/models", exist_ok=True)

# Pose: ViTPose (setup.py)
os.makedirs("pose", exist_ok=True)
if not os.path.isdir("pose/ViTPose"):
    get_ipython().system("git clone --recurse-submodules https://github.com/ViTAE-Transformer/ViTPose.git pose/ViTPose")

# STR: PARSeq (setup.py)
os.makedirs("str", exist_ok=True)
if not os.path.isdir("str/parseq"):
    get_ipython().system("git clone --recurse-submodules https://github.com/baudm/parseq.git str/parseq")

print("Repo root:", repo_root)
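
A failed `git clone` only prints an error, so it can be useful to confirm that every sub-repo directory exists before moving on. The sketch below is illustrative: `missing_subrepos` is a hypothetical helper, and it only checks that the directories named in this notebook are present, not that each clone is complete.

```python
import os

# Relative paths of the sub-repos cloned by this notebook
SUBREPOS = ("sam2", "reid/centroids-reid", "pose/ViTPose", "str/parseq")


def missing_subrepos(repo_root):
    """Return the sub-repo paths (relative to repo_root) that are not directories."""
    return [
        rel for rel in SUBREPOS
        if not os.path.isdir(os.path.join(repo_root, *rel.split("/")))
    ]
```

In the notebook, `missing_subrepos(repo_root)` should return an empty list after the clone cell succeeds.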

## 4. Copy dataset from Drive and unzip

In [None]:
import os

# repo_root was set in the previous cell (Clone)
drive_project = os.path.join(DRIVE_BASE, DRIVE_PROJECT_FOLDER)
drive_zip = os.path.join(drive_project, DATASET_ZIP)

print(f"Expected zip file path: {drive_zip}")  # verify this matches your Drive layout
if not os.path.isfile(drive_zip):
    raise FileNotFoundError(f"Dataset zip not found: {drive_zip}")

os.makedirs(f"{repo_root}/data/SoccerNet", exist_ok=True)
!cp "{drive_zip}" "{repo_root}/"
!unzip -o -q "{repo_root}/{DATASET_ZIP}" -d "{repo_root}/data/SoccerNet"

# If zip had train/ and test/ at top level, put them under jersey-2023
data_sn = f"{repo_root}/data/SoccerNet"
if os.path.isdir(f"{data_sn}/train") and not os.path.isdir(f"{data_sn}/jersey-2023"):
    !mkdir -p "{data_sn}/jersey-2023"
    !mv "{data_sn}/train" "{data_sn}/jersey-2023/"
    # Only move test/ if the zip actually contained it
    if os.path.isdir(f"{data_sn}/test"):
        !mv "{data_sn}/test" "{data_sn}/jersey-2023/"

!ls "{data_sn}"/
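
The layout fix-up above can also be written in plain Python, which makes it easier to reuse outside a notebook. This is a sketch under the same assumption as the cell (a zip that may have `train/` and `test/` at its top level); `nest_under_jersey_2023` is a hypothetical name.

```python
import os
import shutil


def nest_under_jersey_2023(data_sn):
    """Move top-level train/ and test/ into jersey-2023/; return the names moved."""
    target = os.path.join(data_sn, "jersey-2023")
    if os.path.isdir(target) or not os.path.isdir(os.path.join(data_sn, "train")):
        return []  # already nested, or nothing to move
    os.makedirs(target)
    moved = []
    for split in ("train", "test"):
        src = os.path.join(data_sn, split)
        if os.path.isdir(src):
            shutil.move(src, os.path.join(target, split))
            moved.append(split)
    return moved
```

Calling it a second time is a no-op, because `jersey-2023/` then already exists.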

## 5. Copy model weights from Drive

In [None]:
drive_weights = os.path.join(drive_project, WEIGHTS_FOLDER)

!mkdir -p "{repo_root}/models" "{repo_root}/reid/centroids-reid/models" "{repo_root}/pose/ViTPose/checkpoints"
!cp "{drive_weights}/models/"* "{repo_root}/models/" 2>/dev/null || true
!cp "{drive_weights}/reid/"* "{repo_root}/reid/centroids-reid/models/" 2>/dev/null || true
!cp "{drive_weights}/pose/"* "{repo_root}/pose/ViTPose/checkpoints/" 2>/dev/null || true
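# The cp commands above suppress errors, so list the targets to confirm what actually arrived
!ls "{repo_root}/models" "{repo_root}/reid/centroids-reid/models" "{repo_root}/pose/ViTPose/checkpoints"
print("Weight copy step finished; confirm the expected files are listed above.")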
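
Because the `cp` commands discard their error output, a missing or empty source folder goes unnoticed. One alternative is a Python copy that reports what it moved. The sketch below is illustrative only: `copy_weight_files` is a hypothetical helper that, like `cp` without `-r`, copies regular files and skips subdirectories.

```python
import glob
import os
import shutil


def copy_weight_files(src_dir, dst_dir):
    """Copy regular files from src_dir into dst_dir; return the sorted names copied."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*"))):
        if os.path.isfile(path):
            shutil.copy2(path, dst_dir)
            copied.append(os.path.basename(path))
    return copied
```

An empty return value for any of the three folders (`models`, `reid`, `pose`) is a sign that the Drive path or folder name in CONFIG needs checking.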

## 6. Install dependencies

In [None]:
!pip install -q torch torchvision opencv-python Pillow numpy pandas scipy tqdm pytorch-lightning yacs
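
Since `pip install -q` prints very little, a quick import check can confirm the packages are importable. Note that several pip names differ from their import names (`opencv-python` is imported as `cv2`, `Pillow` as `PIL`, `pytorch-lightning` as `pytorch_lightning`). The helper name `missing_packages` is hypothetical; this is a sketch, not part of the pipeline.

```python
import importlib.util

# pip package name -> top-level import name
PIP_TO_MODULE = {
    "torch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "Pillow": "PIL",
    "numpy": "numpy",
    "pandas": "pandas",
    "scipy": "scipy",
    "tqdm": "tqdm",
    "pytorch-lightning": "pytorch_lightning",
    "yacs": "yacs",
}


def missing_packages(pip_to_module):
    """Return pip names whose import module cannot be found in this environment."""
    return [
        pip_name for pip_name, module in pip_to_module.items()
        if importlib.util.find_spec(module) is None
    ]


print("Missing:", missing_packages(PIP_TO_MODULE) or "none")
```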

## 7. Run the pipeline

In [None]:
get_ipython().run_line_magic("cd", repo_root)
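# Put the SAM checkout on PYTHONPATH so main.py can import it; quoted in case the path contains spaces
get_ipython().system(f'PYTHONPATH="{repo_root}/sam2:$PYTHONPATH" python main.py SoccerNet test')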
#get_ipython().system("python main.py SoccerNet test")
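
The same launch can be done with `subprocess`, which gives access to the exit code and avoids shell quoting. This is a minimal sketch under the assumptions of the cell above (SAM cloned into `sam2/` and `main.py` at the repo root); `env_with_sam` and `run_pipeline` are hypothetical names.

```python
import os
import subprocess
import sys


def env_with_sam(repo_root, base_env=None):
    """Return a copy of the environment with <repo_root>/sam2 prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    sam_dir = os.path.join(repo_root, "sam2")
    existing = env.get("PYTHONPATH", "")
    # Only add a separator when there is an existing value to keep
    env["PYTHONPATH"] = sam_dir + (os.pathsep + existing if existing else "")
    return env


def run_pipeline(repo_root):
    """Run `python main.py SoccerNet test` from repo_root and return its exit code."""
    cmd = [sys.executable, "main.py", "SoccerNet", "test"]
    return subprocess.run(cmd, cwd=repo_root, env=env_with_sam(repo_root)).returncode
```

In the notebook this would be called as `run_pipeline(repo_root)`; a non-zero return value means the pipeline exited with an error.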

## 8. (Optional) Download results or copy to Drive

In [None]:
# Zip and download to your computer
# get_ipython().system(f'cd "{repo_root}" && zip -r out.zip out/')
# from google.colab import files
# files.download('out.zip')

# Or copy outputs to Drive
get_ipython().system(f'cp -r "{repo_root}/out" "{DRIVE_BASE}/{DRIVE_PROJECT_FOLDER}/"')
print("Outputs copied to Drive.")
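
If a single archive is easier to handle than a copied folder, the outputs can be zipped straight into the Drive project folder with the standard library. The sketch below assumes the pipeline writes to `out/` under the repo root, as the cell above does; `archive_outputs` is a hypothetical helper.

```python
import os
import shutil


def archive_outputs(out_dir, dest_dir, name="out"):
    """Zip out_dir into dest_dir/<name>.zip and return the archive path."""
    os.makedirs(dest_dir, exist_ok=True)
    base = os.path.join(dest_dir, name)
    parent, folder = os.path.split(os.path.abspath(out_dir))
    # base_dir keeps the top-level folder name inside the archive
    return shutil.make_archive(base, "zip", root_dir=parent, base_dir=folder)
```

For example, `archive_outputs(f"{repo_root}/out", f"{DRIVE_BASE}/{DRIVE_PROJECT_FOLDER}")` would write `out.zip` next to the dataset zip on Drive.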