# Q2 — Text-driven image segmentation with SAM (Colab-ready)

This notebook demonstrates a pipeline that takes a single image and a text prompt, uses a text→region model (GroundingDINO or CLIPSeg) to produce region seeds, and feeds the seeds to Segment Anything (SAM) to obtain precise masks.

Notes:
- You will likely need to download/upload model checkpoints (GroundingDINO and SAM). The notebook includes a fallback demo (full-image box) so you can verify SAM works without grounding weights.

In [None]:
# Install dependencies (run this cell in Colab)
!pip install -q git+https://github.com/facebookresearch/segment-anything.git
!pip install -q git+https://github.com/IDEA-Research/GroundingDINO.git
!pip install -q transformers timm opencv-python-headless

print('Install complete (may take a minute).')


In [None]:
import os
import cv2
import torch
import numpy as np
import matplotlib.pyplot as plt
from segment_anything import sam_model_registry, SamPredictor
from PIL import Image
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Device:', device)


## Step 1 — prepare SAM
Upload a SAM ViT-H checkpoint to Colab (left pane Files → Upload). The official ViT-H file is distributed as `sam_vit_h_4b8939.pth`; either rename it to `sam_vit_h.pth` or update the path below to match.

In [None]:
# Path to SAM checkpoint (upload to Colab /content/)
sam_checkpoint = '/content/sam_vit_h.pth'  # <-- upload this file in Colab
model_type = 'vit_h'
if os.path.exists(sam_checkpoint):
    sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
    sam.to(device=device)  # move the model to the GPU when one is available
    predictor = SamPredictor(sam)
    print('SAM loaded on', device)
else:
    print('SAM checkpoint not found. Upload sam_vit_h.pth to /content or change the path. The demo cell below needs this checkpoint to run.')


## Step 2 — Upload an image and give a text prompt
Upload an image file to Colab (left Files pane). Set `img_path` and `prompt` below.

In [None]:
# Example (upload an image to /content/example.jpg)
img_path = '/content/example.jpg'  # <-- upload image here
prompt = 'a red bicycle'  # change to your desired text prompt

if not os.path.exists(img_path):
    print('Image not found at', img_path, "\nUpload an image to /content/example.jpg via the Colab Files pane.")
else:
    print('Image found. Prompt:', prompt)


## Step 3 — (Optional) GroundingDINO or CLIPSeg inference to convert text→boxes/seeds

If you have a GroundingDINO checkpoint, you can run it to produce boxes for the prompt. If not, this notebook will fall back to a simple full-image box so you can verify SAM integration.
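
As a hedged sketch of the glue between the two models: GroundingDINO's inference helper, to the best of our knowledge, returns boxes as normalized `(cx, cy, w, h)`, whereas SAM expects pixel `(x1, y1, x2, y2)`. The helper below (a hypothetical name, not part of either library) performs that conversion with NumPy only, so it can be checked without any checkpoint. Verify the box format of your GroundingDINO version before relying on it.

```python
import numpy as np

def cxcywh_norm_to_xyxy_pixels(boxes, width, height):
    """Convert normalized (cx, cy, w, h) boxes to pixel (x1, y1, x2, y2).

    Assumption: the input follows the normalized centre/size convention;
    check this against the detector you actually use.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    cx, cy, w, h = boxes.T
    # Centre/size -> corner coordinates, still normalized to [0, 1]
    xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    # Scale to pixels
    xyxy *= np.array([width, height, width, height], dtype=np.float32)
    # Clip so the boxes stay inside the image
    xyxy[:, [0, 2]] = np.clip(xyxy[:, [0, 2]], 0, width)
    xyxy[:, [1, 3]] = np.clip(xyxy[:, [1, 3]], 0, height)
    return xyxy

# Toy example: one centred box on a 640x480 image
example = cxcywh_norm_to_xyxy_pixels([[0.5, 0.5, 0.5, 0.25]], width=640, height=480)
print(example)
```

Each row of the result can then be passed to SAM as a NumPy `box` prompt in place of the fallback box.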

In [None]:
# Fallback demo: run SAM on a single box covering almost the whole image
if os.path.exists(img_path) and 'predictor' in globals():
    image_bgr = cv2.imread(img_path)
    if image_bgr is None:
        raise ValueError(f'Could not read image at {img_path}')
    H, W = image_bgr.shape[:2]
    # Fallback seed: one box (x1, y1, x2, y2) inset 2% from each image edge
    box = np.array([W*0.02, H*0.02, W*0.98, H*0.98])
    predictor.set_image(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))  # SAM expects RGB
    # SamPredictor.predict takes a NumPy box of length 4, not a torch tensor
    masks, scores, logits = predictor.predict(box=box, multimask_output=True)

    def show_mask_on_image(img_bgr, mask, color=(0,255,0), alpha=0.4):
        overlay = img_bgr.copy()
        overlay[mask==1] = (overlay[mask==1] * (1-alpha) + np.array(color) * alpha).astype(np.uint8)
        return overlay

    for i, m in enumerate(masks):
        over = show_mask_on_image(image_bgr, m)
        plt.figure(figsize=(6,6)); plt.imshow(over[:,:,::-1]); plt.title(f'Mask {i} (score {scores[i]:.3f})'); plt.axis('off')
else:
    print('Either image or SAM predictor not available. Upload both to Colab to run end-to-end.')
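
With `multimask_output=True`, SAM returns several candidate masks together with a predicted quality score for each. The fallback cell plots all of them; if you only want one, a simple approach is to keep the highest-scoring candidate. The helper below is a minimal sketch (the function name is hypothetical) and is demonstrated on toy arrays so it runs without a checkpoint.

```python
import numpy as np

def select_best_mask(masks, scores):
    """Return (mask, score, index) of the highest-scoring candidate.

    masks:  array of shape (N, H, W)
    scores: array of shape (N,), one quality score per mask
    """
    masks = np.asarray(masks)
    scores = np.asarray(scores)
    best = int(np.argmax(scores))
    return masks[best].astype(bool), float(scores[best]), best

# Toy stand-ins for the `masks` and `scores` returned by SAM
toy_masks = np.zeros((3, 4, 4), dtype=bool)
toy_masks[1, 1:3, 1:3] = True
toy_scores = np.array([0.71, 0.93, 0.88])

best_mask, best_score, best_idx = select_best_mask(toy_masks, toy_scores)
print(best_idx, best_score, int(best_mask.sum()))
```

The highest score is not always the mask you want (for example, a whole object versus one of its parts), so it is worth inspecting all candidates at least once.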


## Notes & limitations
- For best text→region quality you should run GroundingDINO (or CLIPSeg) to convert the text prompt into high-quality boxes/seeds, then feed them to SAM. GroundingDINO requires a checkpoint (download or upload). The repo `IDEA-Research/GroundingDINO` provides helpers.
- This notebook uses the original open-source Segment Anything (SAM) package. If you have SAM 2 weights and its package installed, replace the SAM loading cell accordingly.
- The fallback (full-image box) is present so you can verify that SAM runs in Colab even without grounding checkpoints.
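
If you use CLIPSeg instead of GroundingDINO, the text prompt yields a relevance heatmap rather than boxes. One way to turn that heatmap into SAM seeds is sketched below: threshold it, take the bounding box of the surviving pixels, and use the peak as a foreground point. This is an illustrative sketch with hypothetical names; it assumes the heatmap has already been passed through a sigmoid and resized to the image's height and width, and the 0.5 threshold is an arbitrary starting value.

```python
import numpy as np

def heatmap_to_seeds(heatmap, threshold=0.5):
    """Derive a box (x1, y1, x2, y2) and a peak point (x, y) from a heatmap.

    Returns (None, None) when no pixel reaches the threshold.
    """
    heatmap = np.asarray(heatmap, dtype=np.float32)
    ys, xs = np.nonzero(heatmap >= threshold)
    if ys.size == 0:
        return None, None
    # Bounding box of all pixels above the threshold
    box = np.array([xs.min(), ys.min(), xs.max() + 1, ys.max() + 1], dtype=np.float32)
    # Location of the strongest response, in (x, y) order
    peak_y, peak_x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    point = np.array([[peak_x, peak_y]], dtype=np.float32)
    return box, point

# Toy heatmap: a warm 2x3 patch with a single hot pixel
toy = np.zeros((6, 8), dtype=np.float32)
toy[2:4, 3:6] = 0.8
toy[3, 4] = 0.95
seed_box, seed_point = heatmap_to_seeds(toy)
print(seed_box, seed_point)
```

The resulting seeds can be combined in a single SAM call, for example `predictor.predict(point_coords=seed_point, point_labels=np.array([1]), box=seed_box)`, where the label `1` marks the point as foreground.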
