# try-detection.ipynb

This notebook is a **research / demo** guide showing how to run object detection on an image
and produce cropped detection outputs suitable for storing in your app's `storage/crops/`.

It is written so you can run it either with a real YOLOv8 installation (`ultralytics`) **or**
as a simulated demo if you don't have the model weights locally.

Sections:
- Setup (packages and model weights)
- Detection wrapper that uses YOLO when available
- Demo run that creates example crops and prints detection metadata


## Setup

Recommended packages (run in your environment / virtualenv):
```
pip install ultralytics pillow numpy opencv-python
```

- If you use a GPU, install the appropriate `torch`/CUDA packages before installing `ultralytics`.
- Place your YOLOv8 weights (for example `yolov8s.pt`) in a safe folder and provide its path to the loader.

This notebook falls back to simulated detections if `ultralytics` is not installed or the
weights file is missing, so you can still see the data flow and the cropped-output generation.
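
The fallback condition can be checked up front without loading the model. The following is a minimal sketch, assuming a hypothetical helper name `can_run_real_detection` (it is not part of the notebook's code); it uses `importlib.util.find_spec`, which looks a package up without importing it. The notebook's own wrapper uses a `try`/`import` instead.

```python
import importlib.util
from pathlib import Path


def can_run_real_detection(model_path, module_name="ultralytics"):
    """Return True only if the module is importable and the weights file exists.

    Hypothetical helper for illustration; not part of the notebook's API.
    """
    # find_spec locates the package without importing it (so torch is not loaded).
    module_found = importlib.util.find_spec(module_name) is not None
    weights_found = Path(model_path).is_file()
    return module_found and weights_found


# The result depends on your environment and on where the weights are stored.
print(can_run_real_detection("models/weights/yolov8s.pt"))
```

If this prints `False`, the detection wrapper below will take the simulated branch.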

In [None]:
%pip install pillow

In [None]:
import os
from pathlib import Path
from datetime import datetime
import json
import uuid
try:
    from PIL import Image
except ImportError as e:
    # Catch only the import failure and keep the original traceback attached
    raise RuntimeError('Pillow is required to run this notebook; please pip install pillow') from e

print('Notebook helpers ready')


In [None]:
def detect_and_crop(image_path, out_dir='generated_crops', model_path='yolov8s.pt', conf_thresh=0.25):
    """
    Run object detection on `image_path` and save crops to `out_dir`.
    If ultralytics is available and `model_path` exists, it will run real detection.
    Otherwise it will simulate two detections so you can inspect the outputs.

    Returns a list of detection metadata dicts:
    [{ 'id':..., 'label':..., 'confidence':..., 'bbox':[x1,y1,x2,y2], 'crop_path':..., 'timestamp':... }, ...]
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f'Input image not found: {image_path}')

    # Try to import ultralytics and run actual detection
    try:
        from ultralytics import YOLO
        real_model = True
    except Exception:
        real_model = False

    im = Image.open(image_path).convert('RGB')
    w, h = im.size
    results = []

    if real_model and Path(model_path).exists():
        print('Running real YOLO detection using model:', model_path)
        model = YOLO(model_path)
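        # pass the confidence threshold through; it was previously accepted but never used
        preds = model(str(image_path), conf=conf_thresh)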
        # ultralytics returns a list of Results for each image; we have one image
        boxes = preds[0].boxes
        for i, box in enumerate(boxes):
            # box.xyxy, box.conf, box.cls
            xyxy = box.xyxy.tolist()[0]
            conf = float(box.conf.tolist()[0])
            cls_idx = int(box.cls.tolist()[0])
            label = model.names.get(cls_idx, str(cls_idx)) if hasattr(model, 'names') else str(cls_idx)
            x1, y1, x2, y2 = map(int, xyxy)
            crop = im.crop((x1, y1, x2, y2))
            det_id = str(uuid.uuid4())
            crop_name = f"{det_id}_{label}.jpg"
            crop_path = out_dir / crop_name
            crop.save(crop_path)
            meta = {
                'id': det_id,
                'label': label,
                'confidence': conf,
                'bbox': [x1, y1, x2, y2],
                'crop_path': str(crop_path),
                'timestamp': datetime.utcnow().isoformat() + 'Z'
            }
            results.append(meta)
    else:
        print('Ultralytics YOLO not available or model weights missing. Running simulated demo detections.')
        # Create two simulated detections scaled to image size
        sample_dets = [
            { 'label': 'person', 'confidence': 0.95, 'bbox': [int(w*0.1), int(h*0.05), int(w*0.4), int(h*0.8)] },
            { 'label': 'bench', 'confidence': 0.88, 'bbox': [int(w*0.5), int(h*0.6), int(w*0.95), int(h*0.9)] }
        ]
        for d in sample_dets:
            x1, y1, x2, y2 = d['bbox']
            crop = im.crop((x1, y1, x2, y2))
            det_id = str(uuid.uuid4())
            crop_name = f"{det_id}_{d['label']}.jpg"
            crop_path = out_dir / crop_name
            crop.save(crop_path)
            meta = {
                'id': det_id,
                'label': d['label'],
                'confidence': d['confidence'],
                'bbox': [x1, y1, x2, y2],
                'crop_path': str(crop_path),
                'timestamp': datetime.utcnow().isoformat() + 'Z'
            }
            results.append(meta)

    return results
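
The metadata list returned by `detect_and_crop` is plain Python data, so it can be narrowed after the fact. Here is a minimal sketch, assuming a hypothetical helper `filter_detections` and illustrative sample values shaped like the simulated branch's output (ids, paths and timestamps omitted for brevity).

```python
def filter_detections(detections, min_conf=0.5, labels=None):
    """Keep detections at or above min_conf, optionally restricted to some labels.

    Hypothetical helper for illustration; not part of the notebook's code.
    """
    wanted = set(labels) if labels is not None else None
    return [
        d for d in detections
        if d["confidence"] >= min_conf and (wanted is None or d["label"] in wanted)
    ]


# Illustrative metadata in the same shape the wrapper returns.
sample = [
    {"label": "person", "confidence": 0.95, "bbox": [64, 24, 256, 384]},
    {"label": "bench", "confidence": 0.88, "bbox": [320, 288, 608, 432]},
]

print(filter_detections(sample, min_conf=0.9))
print(filter_detections(sample, labels=["bench"]))
```

Filtering on the returned list leaves the saved crops on disk untouched; it only changes which records you pass on to storage.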


## Demo run

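Run `detect_and_crop` on a test image. The repository layout expects a sample at `data/sample1.jpg`.
If that file is missing, the cell below creates a gray 640x480 placeholder at that path so the demo
still runs; to use your own image, put it there or change `demo_img`.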


In [None]:
demo_img = 'data/sample1.jpg'
if not os.path.exists(demo_img):
    # Create a placeholder image so the demo runs even without a sample image
    from PIL import Image, ImageDraw
    p = Path('data')
    p.mkdir(parents=True, exist_ok=True)
    placeholder = Image.new('RGB', (640, 480), (200, 200, 200))
    draw = ImageDraw.Draw(placeholder)
    draw.text((20,20), 'Placeholder image for demo', fill=(0,0,0))
    placeholder.save(demo_img)

out = detect_and_crop(demo_img, out_dir='generated_crops_demo', model_path='models/weights/yolov8s.pt')
print('Detections:')
print(json.dumps(out, indent=2))


## Notes & Next Steps

- To run real detection, ensure `ultralytics` is installed and `model_path` points to valid YOLOv8 weights.
- The demo call above saves crops into `generated_crops_demo/`; if you omit `out_dir`, they go to the default `generated_crops/`.
- After verifying detection results locally, integrate `detect_and_crop` into `app/services/detection.py`.
- For blurriness detection: add a `blur_check.py` using variance-of-Laplacian on each crop and mark `is_blurred`.
- Store original image + crops in the database (e.g., MongoDB GridFS) and save metadata (label, bbox, timestamp) in a collection.
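
The variance-of-Laplacian idea from the notes can be tried without any dependencies. What follows is a minimal sketch, assuming a grayscale image given as a list of rows of integers and hypothetical function names. It applies the 4-neighbour Laplacian to interior pixels only, so its numbers will not match `cv2.Laplacian` exactly, because OpenCV also filters the border pixels. The default threshold of 100.0 is a starting point to tune, not a recommended value.

```python
from statistics import pvariance


def variance_of_laplacian(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a 2D grid."""
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("need at least a 3x3 image")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def is_blurred(gray, threshold=100.0):
    """Flag a patch as blurred when its focus measure falls below the threshold."""
    return variance_of_laplacian(gray) < threshold


# A flat patch has no edges at all; a checkerboard is as sharp as it gets.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(is_blurred(flat), is_blurred(checker))
```

The flat patch has a focus measure of zero and is flagged as blurred, while the checkerboard is not. Real crops fall somewhere in between, which is why the threshold needs tuning per camera and scene.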


In [None]:
%pip list

In [3]:
import os
from pathlib import Path
from datetime import datetime
import uuid
from PIL import Image
import numpy as np

# cv2 (opencv-python) is required for the blur check below
import cv2

def variance_of_laplacian_gray(np_img_gray):
    """Return variance of Laplacian (focus measure). Higher = sharper."""
    return cv2.Laplacian(np_img_gray, cv2.CV_64F).var()

def detect_and_crop_real(
    image_path,
    out_dir="generated_crops",
    model_path="yolov8s.pt",    # path or known model name (yolov8s.pt, yolov8n.pt, etc.)
    device=None,               # "cpu" or "cuda:0" or None (auto)
    conf_thresh=0.25,
    iou_thresh=0.45,
    imgsz=640,
    classes=None,              # list of class indices or names to filter (optional)
    blur_threshold=100.0       # below this variance => considered blurred (tune as needed)
):
    """
    Run YOLOv8 detection on image_path and save crops to out_dir.
    Returns list of metadata dicts:
    [
      {
        'id': str,
        'label': 'person',
        'confidence': 0.93,
        'bbox': [x1,y1,x2,y2],
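        'crop_path': 'generated_crops/<id>_person.jpg',  # str(out_dir / name); not made absolute
        'timestamp': '2025-10-30T..Z',
        'is_blurred': False,
        'focus_measure': 1118.66   # variance of Laplacian; higher = sharper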
      }, ...
    ]
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Input image not found: {image_path}")

    # Determine device
    if device is None:
        try:
            import torch
            device = "cuda" if torch.cuda.is_available() else "cpu"
        except Exception:
            device = "cpu"

    # Try to import ultralytics
    try:
        from ultralytics import YOLO
    except Exception as e:
        raise RuntimeError("ultralytics is required for real detection. pip install ultralytics") from e

    # Load model (ultralytics will attempt to download weights for known names)
    model = YOLO(str(model_path))

    # Prepare classes argument: ultralytics accepts indices; mapping handled below
    classes_arg = None
    if classes is not None:
        # If classes are names, convert to indices if model.names available
        if isinstance(classes, (list, tuple)) and len(classes)>0 and isinstance(classes[0], str):
            # build name->index map
            name_to_idx = {v: k for k, v in model.names.items()} if hasattr(model, "names") else {}
            unknown = [c for c in classes if c not in name_to_idx]
            if unknown:
                # Fail loudly: falling back to None would silently detect every class
                raise ValueError(f"Unknown class names: {unknown}")
            classes_arg = [name_to_idx[c] for c in classes]
        else:
            classes_arg = classes

    # Run inference
    results = model(
        str(image_path),
        imgsz=imgsz,
        device=device,
        conf=conf_thresh,
        iou=iou_thresh,
        classes=classes_arg,
        verbose=False
    )

    # We expect one image -> results[0]
    res = results[0]
    boxes = getattr(res, "boxes", None)
    if boxes is None or len(boxes) == 0:
        return []  # no detections

    # read image with PIL for cropping (keeps colors correct)
    im = Image.open(image_path).convert("RGB")
    w, h = im.size

    # Extract numpy arrays from ultralytics Boxes
    # boxes.xyxy -> (N, 4) tensor; boxes.conf and boxes.cls -> (N,) tensors
    xyxy_arr = boxes.xyxy.cpu().numpy()  # shape (N, 4)
    confs = boxes.conf.cpu().numpy().reshape(-1)
    cls_arr = boxes.cls.cpu().numpy().reshape(-1).astype(int)
    name_map = model.names if hasattr(model, "names") else {}

    results_meta = []
    for i in range(xyxy_arr.shape[0]):
        x1, y1, x2, y2 = xyxy_arr[i].astype(int).tolist()
        conf = float(confs[i])
        cls_idx = int(cls_arr[i])
        label = name_map.get(cls_idx, str(cls_idx))

        # clamp coordinates to image bounds
        x1c = max(0, min(x1, w - 1))
        y1c = max(0, min(y1, h - 1))
        x2c = max(0, min(x2, w))
        y2c = max(0, min(y2, h))

        # ensure non-empty box
        if x2c <= x1c or y2c <= y1c:
            continue

        crop = im.crop((x1c, y1c, x2c, y2c))
        # filename: <uuid>_<label>.jpg (the timestamp is stored in the metadata, not the name)
        det_id = str(uuid.uuid4())
        ts = datetime.utcnow().isoformat() + "Z"
        safe_label = "".join(c for c in label if c.isalnum() or c in ("-", "_")).lower() or "cls"
        crop_name = f"{det_id}_{safe_label}.jpg"
        crop_path = out_dir / crop_name

        # Save crop (JPEG)
        crop.save(crop_path, format="JPEG", quality=90)

        # Blur check: convert to grayscale numpy array and compute variance-of-Laplacian
        np_crop = np.array(crop)
        gray = cv2.cvtColor(np_crop, cv2.COLOR_RGB2GRAY)
        focus_measure = variance_of_laplacian_gray(gray)
        is_blurred = focus_measure < blur_threshold

        meta = {
            "id": det_id,
            "label": label,
            "confidence": conf,
            "bbox": [int(x1c), int(y1c), int(x2c), int(y2c)],
            "crop_path": str(crop_path),
            "timestamp": ts,
            "is_blurred": bool(is_blurred),
            "focus_measure": float(focus_measure),
        }
        results_meta.append(meta)

    return results_meta
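
The clamping step inside `detect_and_crop_real` can be pulled out and checked on its own. Below is a minimal sketch that mirrors the same arithmetic, assuming a hypothetical helper name `clamp_bbox`; it returns `None` where the function above would `continue`.

```python
def clamp_bbox(x1, y1, x2, y2, width, height):
    """Clamp an xyxy box to the image; return None if nothing is left to crop.

    Hypothetical helper for illustration; mirrors the clamping in detect_and_crop_real.
    """
    x1c = max(0, min(x1, width - 1))
    y1c = max(0, min(y1, height - 1))
    x2c = max(0, min(x2, width))
    y2c = max(0, min(y2, height))
    if x2c <= x1c or y2c <= y1c:
        return None
    return x1c, y1c, x2c, y2c


# A box that spills over the edges of a 640x480 image is trimmed to fit.
print(clamp_bbox(-10, 20, 700, 500, width=640, height=480))
# A zero-width box has nothing to crop.
print(clamp_bbox(100, 100, 100, 200, width=640, height=480))
```

Keeping this as a separate function makes the edge cases easy to unit-test before wiring the detector into the app.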


In [5]:
# Example usage
out = detect_and_crop_real(
    "data/sample2.jpg",
    out_dir="generated_crops_real",
    model_path="models/weights/yolov8s.pt",  # or "yolov8s.pt" (ultralytics may auto-download)
    conf_thresh=0.3,
    imgsz=640
)

print("Detections:")
import json
print(json.dumps(out, indent=2))


Detections:
[
  {
    "id": "e9101447-24ec-4cbd-ad00-6af8b99e8cba",
    "label": "chair",
    "confidence": 0.8306059241294861,
    "bbox": [
      1915,
      2518,
      3317,
      3639
    ],
    "crop_path": "generated_crops_real\\e9101447-24ec-4cbd-ad00-6af8b99e8cba_chair.jpg",
    "timestamp": "2025-10-30T11:19:03.469115Z",
    "is_blurred": false,
    "focus_measure": 1118.6570082655448
  },
  {
    "id": "4b4663d2-69c8-40a4-a06c-70c1443c932c",
    "label": "bed",
    "confidence": 0.6031917333602905,
    "bbox": [
      1209,
      2249,
      2232,
      2954
    ],
    "crop_path": "generated_crops_real\\4b4663d2-69c8-40a4-a06c-70c1443c932c_bed.jpg",
    "timestamp": "2025-10-30T11:19:03.537218Z",
    "is_blurred": false,
    "focus_measure": 490.3846971667312
  },
  {
    "id": "6e28e00a-d3a3-41cb-b859-51c7f61b5aa9",
    "label": "couch",
    "confidence": 0.5025451183319092,
    "bbox": [
      2242,
      2192,
      2782,
      2504
    ],
    "crop_path": "generated_cro



In [6]:
# After detection: keep a copy of the original image alongside its detection metadata
import shutil
import uuid
from datetime import datetime
from pathlib import Path

orig_dst = Path("storage/originals")
orig_dst.mkdir(parents=True, exist_ok=True)
orig_id = str(uuid.uuid4())
orig_name = orig_dst / f"{orig_id}_{Path('data/sample1.jpg').name}"
shutil.copy("data/sample1.jpg", orig_name)

# metadata object to store in DB (example)
doc = {
    "original_id": orig_id,
    "original_path": str(orig_name),
    "uploaded_at": datetime.utcnow().isoformat() + "Z",
    # `out` is the results_meta from the most recent detection call;
    # make sure that call was run on the same image that is copied above
    "detections": out
}
# Insert into Mongo/Postgres as appropriate
print("Stored original image as:", orig_name)


Stored original image as: storage\originals\45a02db8-d3df-4eeb-98c6-14bd22cba781_sample1.jpg


In [1]:
%pip show ultralytics

Name: ultralytics
Version: 8.3.222
Summary: Ultralytics YOLO 🚀 for SOTA object detection, multi-object tracking, instance segmentation, pose estimation and image classification.
Home-page: https://ultralytics.com
Author: 
Author-email: Glenn Jocher <glenn.jocher@ultralytics.com>, Jing Qiu <jing.qiu@ultralytics.com>
License: AGPL-3.0
Location: c:\Users\dhanu\.conda\envs\oenv\Lib\site-packages
Requires: matplotlib, numpy, opencv-python, pillow, polars, psutil, pyyaml, requests, scipy, torch, torchvision, ultralytics-thop
Required-by: 
Note: you may need to restart the kernel to use updated packages.
