# Origin Take-home: Prompted Segmentation (Structured Workflow)

This notebook is organized to preserve *git-trackable* results at each stage:
1. Dataset prep
2. CLIPSeg zero-shot baseline
3. SAM3 baseline (optional, if access/setup works)
4. CLIPSeg fine-tuning
5. Improvement experiments (threshold/epochs/image size/etc.)

Use `results/` for small artifacts only (metrics JSON, notes, selected visuals).
Do **not** commit datasets/checkpoints.
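
A quick size check before staging can catch a checkpoint or dataset file that slipped into `results/`. The sketch below is only an illustration: `find_oversized` and the 5 MiB limit are hypothetical choices, not part of the repo.

```python
from pathlib import Path

# Hypothetical limit; pick whatever "small artifact" means for your repo.
MAX_BYTES = 5 * 1024 * 1024  # 5 MiB


def find_oversized(root, max_bytes=MAX_BYTES):
    """Return (path, size_in_bytes) for every file under `root` larger than `max_bytes`."""
    root = Path(root)
    if not root.exists():
        return []
    return sorted(
        (p, p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and p.stat().st_size > max_bytes
    )


# Prints nothing when every tracked artifact is under the limit.
for path, size in find_oversized("results"):
    print(f"{size / 1e6:8.1f} MB  {path}")
```

Anything this prints is a candidate for `.gitignore` or for moving under `outputs/` or `checkpoints/`.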


In [None]:
#@title 1) Mount Drive (optional)
# Keep this False for faster training I/O. Use Drive mainly for persistence.
USE_DRIVE = False
if USE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')


In [None]:
#@title 2) Clone / update repo
from pathlib import Path

REPO_URL = "https://github.com/<your-username>/origin_finetune.git"  # @param {type:"string"}
BRANCH = "main"  # @param {type:"string"}

BASE_DIR = Path('/content')
if USE_DRIVE:
    BASE_DIR = Path('/content/drive/MyDrive/origin_takehome')
BASE_DIR.mkdir(parents=True, exist_ok=True)

REPO_DIR = BASE_DIR / 'origin_finetune'
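if '<your-username>' in REPO_URL:
    raise ValueError('Set REPO_URL to your own repository before running this cell.')

# Quote interpolated values so the shell does not split or redirect on special characters.
if REPO_DIR.exists() and (REPO_DIR / '.git').exists():
    %cd {REPO_DIR}
    !git fetch origin
    !git checkout "{BRANCH}"
    !git pull --ff-only origin "{BRANCH}"
else:
    %cd {BASE_DIR}
    !git clone -b "{BRANCH}" "{REPO_URL}" "{REPO_DIR}"
    %cd {REPO_DIR}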

print("Repo dir:", REPO_DIR)


In [None]:
#@title 3) Install dependencies
%cd {REPO_DIR}
!python -m pip install -q --upgrade pip
!python -m pip install -q -r requirements.txt roboflow requests


In [None]:
#@title 4) Create local artifact folders (tracked + untracked)
%cd {REPO_DIR}
!mkdir -p outputs/metrics outputs/eval_vis outputs/report_panels results/baselines results/finetuned results/experiments


In [None]:
#@title 5) Enter Roboflow API key (hidden input)
import os
from getpass import getpass

if not os.environ.get('ROBOFLOW_API_KEY'):
    os.environ['ROBOFLOW_API_KEY'] = getpass('Enter ROBOFLOW_API_KEY: ')
print('API key set:', bool(os.environ.get('ROBOFLOW_API_KEY')))


In [None]:
#@title 5b) (Optional) Enter Hugging Face token for SAM3 (hidden input)
import os
from getpass import getpass

if not os.environ.get('HF_TOKEN'):
    os.environ['HF_TOKEN'] = getpass('Enter HF_TOKEN (optional now, required for SAM3): ')
print('HF token set:', bool(os.environ.get('HF_TOKEN')))


In [None]:
#@title 6) Download Roboflow datasets (forked versions)
%cd {REPO_DIR}
!mkdir -p data/raw

!python -m src.data.download_roboflow     --api-key "$ROBOFLOW_API_KEY"     --workspace "manojs-workspace-mbjw9"     --project "drywall-join-detect-jdsh1"     --version 1     --format coco     --out-dir data/raw/drywall_join

!python -m src.data.download_roboflow     --api-key "$ROBOFLOW_API_KEY"     --workspace "manojs-workspace-mbjw9"     --project "cracks-3ii36-9iz5c"     --version 1     --format coco     --out-dir data/raw/cracks


In [None]:
#@title 7) Convert Roboflow COCO exports -> merged manifest -> label-aware resplit
%cd {REPO_DIR}
!mkdir -p data/processed

!python -m src.data.prepare_from_roboflow_coco_export     --export-root data/raw/drywall_join     --dataset-tag drywall-join-detect     --out-dir data/processed

!python -m src.data.prepare_from_roboflow_coco_export     --export-root data/raw/cracks     --dataset-tag cracks     --out-dir data/processed

!python -m src.data.merge_manifests     --inputs data/processed/manifest_drywall-join-detect.csv data/processed/manifest_cracks.csv     --out data/processed/manifest_all.csv

!python -m src.data.resplit_manifest     --manifest-csv data/processed/manifest_all.csv     --out data/processed/manifest_all_resplit.csv     --seed 42


## CLIPSeg Baseline (Zero-shot)
Run this first and archive metrics to `results/baselines/` before any fine-tuning.


In [None]:
#@title 8) Evaluate zero-shot CLIPSeg baseline
%cd {REPO_DIR}
!python -m src.eval_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-dir CIDAS/clipseg-rd64-refined   --split test   --save-vis-dir outputs/eval_vis_clipseg_zeroshot   --max-vis 4   --metrics-out outputs/metrics/clipseg_zeroshot_test.json


In [None]:
#@title 9) Archive zero-shot CLIPSeg baseline artifacts into tracked results/
%cd {REPO_DIR}
!python scripts/archive_experiment.py   --category baselines   --run-id clipseg_zeroshot_v1   --summary-json outputs/metrics/clipseg_zeroshot_test.json   --copy outputs/eval_vis_clipseg_zeroshot   --notes "Zero-shot CLIPSeg baseline on manifest_all_resplit test split"


## SAM3 Baseline (Optional / Stretch)
Use this section only if SAM3 access and setup work in Colab. Keep the CLIPSeg zero-shot and fine-tuned results as the primary deliverables.

Suggested policy:
- First run a **subset** (e.g. 50-100 test images)
- Save metrics in the same JSON schema that `src.eval_clipseg` writes
- Archive to `results/baselines/sam3_zeroshot_*`
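
One way to check the "same JSON schema" point is to compare the key paths of two metrics files. This is a rough sketch: `flat_keys` and `schema_diff` are hypothetical helpers, and the sketch assumes each metrics file is a JSON object whose nesting consists of dictionaries.

```python
import json
from pathlib import Path


def flat_keys(obj, prefix=""):
    """Collect dotted key paths from nested dicts (lists and scalars are leaves)."""
    keys = set()
    if isinstance(obj, dict):
        for k, v in obj.items():
            path = f"{prefix}.{k}" if prefix else str(k)
            keys.add(path)
            keys |= flat_keys(v, path)
    return keys


def schema_diff(reference_json, candidate_json):
    """Return (missing, extra) key paths of the candidate relative to the reference."""
    ref = flat_keys(json.loads(Path(reference_json).read_text()))
    cand = flat_keys(json.loads(Path(candidate_json).read_text()))
    return sorted(ref - cand), sorted(cand - ref)


# Example call, once both files exist:
# missing, extra = schema_diff(
#     "outputs/metrics/clipseg_zeroshot_test.json",
#     "outputs/metrics/sam3_zeroshot_subset100_v1.json",
# )
```

Two empty lists mean the files share the same key structure. Per-label keys may legitimately differ if a subset run happens to miss a label.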


In [None]:
#@title 10) Install/upgrade dependencies for SAM3 (run only if needed)
%cd {REPO_DIR}
!pip install -q -U transformers accelerate huggingface_hub
# If the official sam3 package requires extra deps, install them here as well.


In [None]:
#@title 11) Run SAM3 zero-shot baseline (subset first, then full)
%cd {REPO_DIR}
SAM3_RUN_ID = "sam3_zeroshot_subset100_v1"  # @param {type:"string"}
SAM3_MODEL_ID = "facebook/sam3"  # @param {type:"string"}
SAM3_MAX_SAMPLES = 100  # @param {type:"integer"}

# The command reads the values above through IPython {var} interpolation.
# If interpolation fails in your Colab, replace the {SAM3_*} placeholders with literal values.
!python -m src.eval_sam3   --manifest-csv data/processed/manifest_all_resplit.csv   --model-id "{SAM3_MODEL_ID}"   --split test   --max-samples {SAM3_MAX_SAMPLES}   --save-vis-dir outputs/eval_vis_{SAM3_RUN_ID}   --max-vis 4   --metrics-out outputs/metrics/{SAM3_RUN_ID}.json


In [None]:
#@title 12) Archive SAM3 baseline (run after SAM3 metrics exist)
%cd {REPO_DIR}
SAM3_RUN_ID = "sam3_zeroshot_subset100_v1"  # @param {type:"string"}
# The command reads SAM3_RUN_ID through IPython {var} interpolation; replace it with a literal if interpolation fails.
!python scripts/archive_experiment.py   --category baselines   --run-id {SAM3_RUN_ID}   --summary-json outputs/metrics/{SAM3_RUN_ID}.json   --copy outputs/eval_vis_{SAM3_RUN_ID}   --notes "SAM3 zero-shot baseline (subset/full; check max_samples in metrics JSON)"


## CLIPSeg Fine-tuning (Main Result)
Use unique output directories per experiment to avoid overwriting.
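
A small guard can enforce the no-overwrite rule before a run starts. The sketch below is hypothetical (`claim_run_dir` is not a repo function): it creates the directory for a run id and refuses to reuse one that already contains files.

```python
from pathlib import Path


def claim_run_dir(base, run_id):
    """Create base/run_id, or raise if it already holds files from an earlier run."""
    run_dir = Path(base) / run_id
    if run_dir.exists() and any(run_dir.iterdir()):
        raise FileExistsError(
            f"{run_dir} is not empty; bump the version suffix in the run id"
        )
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


# Example call before training:
# claim_run_dir("checkpoints", "clipseg_ft_e8_352_v1")
```

If the call raises, change the suffix (for example `_v1` to `_v2`) rather than deleting the old checkpoint, so archived metrics keep pointing at the run that produced them.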


In [None]:
#@title 13) Fine-tune CLIPSeg (main run)
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
FT_OUTPUT_DIR = f"checkpoints/{MAIN_RUN_ID}"
print("FT_OUTPUT_DIR =", FT_OUTPUT_DIR)

# The command reads FT_OUTPUT_DIR through IPython {var} interpolation; replace it with a literal if interpolation fails.
!python -m src.train_clipseg     --manifest-csv data/processed/manifest_all_resplit.csv     --output-dir {FT_OUTPUT_DIR}     --epochs 8     --batch-size 4     --image-size 352     --lr 2e-5     --grad-accum-steps 1


In [None]:
#@title 14) Evaluate fine-tuned CLIPSeg (main run)
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # must match previous cell if rerun separately
FT_OUTPUT_DIR = f"checkpoints/{MAIN_RUN_ID}"
FT_EVAL_VIS_DIR = f"outputs/eval_vis_{MAIN_RUN_ID}"
FT_METRICS_OUT = f"outputs/metrics/{MAIN_RUN_ID}_test.json"

# The command reads the FT_* paths above through IPython {var} interpolation; replace them with literals if interpolation fails.
!python -m src.eval_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-dir {FT_OUTPUT_DIR}   --split test   --save-vis-dir {FT_EVAL_VIS_DIR}   --max-vis 4   --metrics-out {FT_METRICS_OUT}


In [None]:
#@title 15) Create side-by-side report panels (orig | GT | pred) for main run
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image

repo = REPO_DIR
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
manifest = pd.read_csv(repo / 'data/processed/manifest_all_resplit.csv')
test_df = manifest[manifest['split'] == 'test'].copy()
pred_dir = repo / 'outputs' / f'eval_vis_{MAIN_RUN_ID}'
panel_dir = repo / 'outputs' / f'report_panels_{MAIN_RUN_ID}'
panel_dir.mkdir(parents=True, exist_ok=True)

saved = 0
for _, row in test_df.iterrows():
    image_id = str(row['image_id'])
    pred_path = pred_dir / f"{image_id}__pred.png"
    gt_path = pred_dir / f"{image_id}__gt.png"
    if not pred_path.exists() or not gt_path.exists():
        continue
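    # Resolve relative manifest paths against the repo root (assumed base) so this cell does not depend on the current directory.
    img_path = Path(row['image_path'])
    img = Image.open(img_path if img_path.is_absolute() else repo / img_path).convert('RGB')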
    gt = Image.open(gt_path).convert('L')
    pred = Image.open(pred_path).convert('L')

    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    axes[0].imshow(img); axes[0].set_title('Original'); axes[0].axis('off')
    axes[1].imshow(gt, cmap='gray'); axes[1].set_title('GT'); axes[1].axis('off')
    axes[2].imshow(pred, cmap='gray'); axes[2].set_title('Pred'); axes[2].axis('off')
    fig.suptitle(f"{image_id} | {row['label']}")
    out = panel_dir / f"{image_id}__panel.png"
    fig.tight_layout()
    fig.savefig(out, dpi=150, bbox_inches='tight')
    plt.close(fig)
    print("saved", out)
    saved += 1
    if saved >= 4:
        break
print("panels saved:", saved)


In [None]:
#@title 15b) Create report panels (4 per dataset) for main run [recommended]
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
PANELS_PER_DATASET = 4  # @param {type:"integer"}

!python scripts/make_report_panels.py   --manifest-csv data/processed/manifest_all_resplit.csv   --pred-dir outputs/eval_vis_{MAIN_RUN_ID}   --out-dir outputs/report_panels_{MAIN_RUN_ID}   --split test   --group-col dataset_tag   --per-group {PANELS_PER_DATASET}


In [None]:
#@title 16) Model size (runtime & footprint section) for main run
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
!du -sh checkpoints/{MAIN_RUN_ID}
!du -sh checkpoints/{MAIN_RUN_ID}/*


In [None]:
#@title 17) Archive fine-tuned CLIPSeg main run artifacts into tracked results/
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
# The command reads MAIN_RUN_ID through IPython {var} interpolation; update --notes by hand if the run settings change.
!python scripts/archive_experiment.py   --category finetuned   --run-id {MAIN_RUN_ID}   --summary-json outputs/metrics/{MAIN_RUN_ID}_test.json   --copy checkpoints/{MAIN_RUN_ID}/best_metrics.json checkpoints/{MAIN_RUN_ID}/train_history.json outputs/report_panels_{MAIN_RUN_ID}   --notes "Main fine-tuned CLIPSeg run (full fine-tuning, 8 epochs, image_size=352)"


## Improvement Experiments (run one at a time, archive each)
Recommended order:
1. Threshold sweep (cheap)
2. More epochs (12/16)
3. Larger image size (512, smaller batch)
4. Class balancing / weighting
5. Prompt augmentation expansion
6. Post-processing
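
The threshold sweep is cheap because it only changes how predicted probabilities are binarized; no retraining is involved. The toy sketch below shows the effect on synthetic data. The `iou_and_dice` helper and the synthetic probability map are assumptions for illustration and may not match how `src.eval_clipseg` computes its metrics.

```python
import numpy as np


def iou_and_dice(prob, gt, threshold, eps=1e-7):
    """Binarize `prob` at `threshold` and score it against a boolean mask `gt`."""
    pred = prob >= threshold
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / (union + eps)
    dice = 2 * inter / (pred.sum() + gt.sum() + eps)
    return float(iou), float(dice)


# Synthetic example: a square object with a soft, noisy probability map.
rng = np.random.default_rng(0)
gt = np.zeros((64, 64), dtype=bool)
gt[16:48, 16:48] = True
prob = np.clip(0.55 * gt + rng.normal(0.2, 0.1, gt.shape), 0.0, 1.0)

for thr in [0.3, 0.4, 0.5, 0.6]:
    iou, dice = iou_and_dice(prob, gt, thr)
    print(f"threshold={thr:.1f}  IoU={iou:.3f}  Dice={dice:.3f}")
```

Low thresholds admit background noise and high thresholds erode the object, so the best value depends on how well calibrated the model's probabilities are.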


In [None]:
#@title 18) Threshold sweep helper on main checkpoint (cheap improvement test)
%cd {REPO_DIR}
MAIN_RUN_ID = "clipseg_ft_e8_352_v1"  # @param {type:"string"}
for thr in [0.3, 0.4, 0.5, 0.6]:
    out_json = f"outputs/metrics/{MAIN_RUN_ID}_test_thr{str(thr).replace('.', '')}.json"
    print("\n=== threshold", thr, "===")
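    # Caution: choosing the threshold on the test split makes the reported test score optimistic;
    # if your manifest has a validation split, sweep on that and evaluate on test once.
    # If interpolation fails in your Colab, replace {thr}/{MAIN_RUN_ID} with literal values.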
    !python -m src.eval_clipseg       --manifest-csv data/processed/manifest_all_resplit.csv       --model-dir checkpoints/{MAIN_RUN_ID}       --split test       --threshold {thr}       --metrics-out {out_json}


In [None]:
#@title 19) Custom CLIPSeg experiment template (new checkpoint dir each time)
%cd {REPO_DIR}
EXP_RUN_ID = "clipseg_ft_e12_352_trial1"  # @param {type:"string"}
EXP_EPOCHS = 12  # @param {type:"integer"}
EXP_IMAGE_SIZE = 352  # @param {type:"integer"}
EXP_BATCH = 4  # @param {type:"integer"}
EXP_LR = 2e-5  # @param {type:"number"}

# If shell interpolation fails in your Colab, replace {EXP_*} placeholders with literals.
!python -m src.train_clipseg     --manifest-csv data/processed/manifest_all_resplit.csv     --output-dir checkpoints/{EXP_RUN_ID}     --epochs {EXP_EPOCHS}     --batch-size {EXP_BATCH}     --image-size {EXP_IMAGE_SIZE}     --lr {EXP_LR}

!python -m src.eval_clipseg     --manifest-csv data/processed/manifest_all_resplit.csv     --model-dir checkpoints/{EXP_RUN_ID}     --split test     --metrics-out outputs/metrics/{EXP_RUN_ID}_test.json

!python scripts/archive_experiment.py   --category experiments   --run-id {EXP_RUN_ID}   --summary-json outputs/metrics/{EXP_RUN_ID}_test.json   --copy checkpoints/{EXP_RUN_ID}/best_metrics.json checkpoints/{EXP_RUN_ID}/train_history.json   --notes "Custom experiment; record changes in run_id and report notes"


In [None]:
#@title 19b) Create balanced-train manifest helper (equal images from both datasets for train split)
%cd {REPO_DIR}
BAL_MANIFEST = "data/processed/manifest_all_resplit_balanced_train_by_dataset.csv"  # @param {type:"string"}

!python -m src.data.balance_manifest_by_group   --manifest-csv data/processed/manifest_all_resplit.csv   --out {BAL_MANIFEST}   --split train   --group-col dataset_tag   --unit-col image_id   --seed 42


In [None]:
#@title 19c) Planned Run 1: rd64-refined | 16 epochs | 512x512 | threshold 0.4
%cd {REPO_DIR}
RUN_ID = "clipseg_ft_e16_512_thr04_v1"  # @param {type:"string"}

# Notes:
# - threshold here affects validation metrics/checkpoint selection during training
# - optimized for A100 throughput; reduce batch if you hit OOM
!python -m src.train_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-name CIDAS/clipseg-rd64-refined   --output-dir checkpoints/{RUN_ID}   --epochs 16   --batch-size 8   --grad-accum-steps 1   --image-size 512   --lr 2e-5   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --tf32   --amp-dtype bf16   --no-processor-resize

!python -m src.eval_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-dir checkpoints/{RUN_ID}   --split test   --image-size 512   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --no-processor-resize   --metrics-out outputs/metrics/{RUN_ID}_test.json

!python scripts/archive_experiment.py   --category experiments   --run-id {RUN_ID}   --summary-json outputs/metrics/{RUN_ID}_test.json   --copy checkpoints/{RUN_ID}/best_metrics.json checkpoints/{RUN_ID}/train_history.json   --notes "Run1: rd64-refined, 16 epochs, image_size=512, threshold=0.4 (val/test eval threshold)."


In [None]:
#@title 19d) Planned Run 2: rd64-refined | 4 epochs | 1024x1024 | threshold 0.4 (high-risk/OOM)
%cd {REPO_DIR}
RUN_ID = "clipseg_ft_e4_1024_thr04_v1"  # @param {type:"string"}

# 1024x1024 remains memory-heavy. On A100 start at batch_size=2; if OOM, drop to 1.
!python -m src.train_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-name CIDAS/clipseg-rd64-refined   --output-dir checkpoints/{RUN_ID}   --epochs 4   --batch-size 2   --grad-accum-steps 1   --image-size 1024   --lr 2e-5   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --tf32   --amp-dtype bf16   --no-processor-resize

!python -m src.eval_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-dir checkpoints/{RUN_ID}   --split test   --batch-size 2   --image-size 1024   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --no-processor-resize   --metrics-out outputs/metrics/{RUN_ID}_test.json

!python scripts/archive_experiment.py   --category experiments   --run-id {RUN_ID}   --summary-json outputs/metrics/{RUN_ID}_test.json   --copy checkpoints/{RUN_ID}/best_metrics.json checkpoints/{RUN_ID}/train_history.json   --notes "Run2: rd64-refined, 4 epochs, image_size=1024, threshold=0.4. A100-tuned defaults (batch_size=2, grad_accum=1); reduce batch if OOM."


In [None]:
#@title 19e) Planned Run 3: rd64-refined | 8 epochs | 512x512 | threshold 0.4 | balanced train by dataset
%cd {REPO_DIR}
RUN_ID = "clipseg_ft_e8_512_thr04_baltrain_dataset_v1"  # @param {type:"string"}
BAL_MANIFEST = "data/processed/manifest_all_resplit_balanced_train_by_dataset.csv"  # @param {type:"string"}

# Make balanced manifest first (cell 19b), then run this cell.
!python -m src.train_clipseg   --manifest-csv {BAL_MANIFEST}   --model-name CIDAS/clipseg-rd64-refined   --output-dir checkpoints/{RUN_ID}   --epochs 8   --batch-size 8   --grad-accum-steps 1   --image-size 512   --lr 2e-5   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --tf32   --amp-dtype bf16   --no-processor-resize

!python -m src.eval_clipseg   --manifest-csv data/processed/manifest_all_resplit.csv   --model-dir checkpoints/{RUN_ID}   --split test   --image-size 512   --threshold 0.4   --num-workers 8   --persistent-workers   --prefetch-factor 4   --no-processor-resize   --metrics-out outputs/metrics/{RUN_ID}_test.json

!python scripts/archive_experiment.py   --category experiments   --run-id {RUN_ID}   --summary-json outputs/metrics/{RUN_ID}_test.json   --copy checkpoints/{RUN_ID}/best_metrics.json checkpoints/{RUN_ID}/train_history.json   --notes "Run3: rd64-refined, 8 epochs, image_size=512, threshold=0.4, train split balanced by dataset_tag (equal image_id count per dataset). Eval on full test manifest."


In [None]:
#@title 20) Example inference for required output mask naming (main run)
%cd {REPO_DIR}
import pandas as pd
from pathlib import Path

MAIN_RUN_ID = "clipseg_ft_e8_352_v1"
m = pd.read_csv('data/processed/manifest_all_resplit.csv')
row = m.iloc[0]
print('Example image path:', row['image_path'])
print('Example label:', row['label'])

# Uncomment one:
# !python -m src.infer_clipseg --model-dir checkpoints/{MAIN_RUN_ID} --image "{row['image_path']}" --prompt "segment crack" --out-dir outputs/pred_masks
# !python -m src.infer_clipseg --model-dir checkpoints/{MAIN_RUN_ID} --image "{row['image_path']}" --prompt "segment taping area" --out-dir outputs/pred_masks


In [None]:
#@title 21) Inspect tracked result artifacts before committing
%cd {REPO_DIR}
!find results -maxdepth 4 -type f | sort
!git status --short


## Optional: Commit/Pull/Push from Colab
If you want to push result artifacts from Colab, authenticate carefully (PAT/token). Prefer committing only notebook/code/`results/` files.

If you prefer a safer workflow:
- finish runs in Colab
- copy/pull changed repo files locally
- commit/push from local machine


## Submission Checklist
- GitHub link (codebase)
- Colab link (shareable)
- PDF report with:
  - Methodology
  - Data preparation
  - Results (metrics table + visuals + runtime/footprint)
  - Failure cases + potential solutions
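
For the metrics table in the report, one option is to flatten every metrics JSON into a single DataFrame. This is a sketch under assumptions: `metrics_table` is a hypothetical helper, and each file is assumed to be a JSON object, possibly nested. The column names depend entirely on what the eval scripts write.

```python
import json
from pathlib import Path

import pandas as pd


def metrics_table(metrics_dir):
    """Flatten every *.json under `metrics_dir` into one row per run."""
    rows = []
    for path in sorted(Path(metrics_dir).glob("*.json")):
        data = json.loads(path.read_text())
        if not isinstance(data, dict) or not data:
            continue  # skip files that are not non-empty JSON objects
        # Nested dicts become dotted column names such as "a.b.c".
        flat = pd.json_normalize(data, sep=".").iloc[0].to_dict()
        rows.append({"run": path.stem, **flat})
    return pd.DataFrame(rows)


# Example call after the evaluation cells have run:
# print(metrics_table("outputs/metrics").to_string(index=False))
```

Runs with different key sets still fit into one table; keys missing from a run show up as NaN.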

Security note: rotate/revoke your Roboflow API key after submission.
