# Ship Detection in SAR Images with YOLO-OBB (HRSID)

**Task:** Object detection (ship detection) in Synthetic Aperture Radar (SAR) images.

**Dataset:** HRSID (High-Resolution SAR Images Dataset), with COCO-style annotations. Ship outlines are stored as polygons, from which oriented boxes are derived.

**Model:** Ultralytics YOLO **OBB** (oriented bounding boxes).

---

## Notebook roadmap
1. Environment setup
2. Download / place the HRSID dataset
3. Parse COCO annotations and convert to **YOLO OBB label format** (`class x1 y1 ... x4 y4`, normalized)
4. Visual sanity checks (overlay boxes)
5. Train YOLO-OBB
6. Evaluate and run inference

> Notes
> - This notebook focuses on a clean, reproducible pipeline.
> - Training needs a GPU for reasonable speed. CPU training works but is slow.


In [None]:
# (Recommended) Run on a GPU runtime
import os, sys, platform
print('Python:', sys.version)
print('Platform:', platform.platform())

# Optional: quick CUDA check (PyTorch may not be installed yet)
try:
    import torch
    print('Torch:', torch.__version__)
    print('CUDA available:', torch.cuda.is_available())
    if torch.cuda.is_available():
        print('GPU:', torch.cuda.get_device_name(0))
except Exception as e:
    print(f'Torch not usable yet ({type(e).__name__}); Ultralytics installs it below.')


In [None]:
# Install dependencies
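# Ultralytics provides YOLO training/inference; shapely computes minimum-area rectangles;
# OpenCV and matplotlib handle image I/O and plots.
# pycocotools is optional here: the conversion below parses the COCO JSON with the standard json module.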

!pip -q install --upgrade ultralytics
!pip -q install pycocotools shapely opencv-python matplotlib tqdm pyyaml

# Restart runtime if Ultralytics asks you to.


## 1) Get the HRSID dataset

HRSID images are large; downloading may take a while.

You have two options:

### Option A — Manual download (recommended)
1. Download HRSID (JPG) from the links in the official repository.
2. Extract it into a folder like:

```
./HRSID_ROOT/
  images/...
  annotations/...
```

### Option B — Programmatic download (Google Drive)
If the dataset is hosted on Google Drive, you can often download it with `gdown` using the file id.

> If your download fails (quota, permissions), use manual download.
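
A minimal sketch of Option B, assuming `gdown` is installed (`pip install gdown`; it is not part of the install cell above). The helper name `download_from_drive`, the output path, and the `YOUR_FILE_ID` placeholder are illustrative: the real file id must come from the share link in the official repository.

```python
from pathlib import Path


def download_from_drive(file_id, output):
    """Download one Google Drive file by id with gdown and return the local path."""
    if not file_id or file_id == "YOUR_FILE_ID":
        raise ValueError("Set file_id to the id taken from the dataset's share link.")
    import gdown  # imported lazily so the check above works without gdown installed

    output = Path(output)
    output.parent.mkdir(parents=True, exist_ok=True)
    gdown.download(id=file_id, output=str(output), quiet=False)
    return output


# Placeholder id: replace it with the real one before uncommenting.
# archive = download_from_drive("YOUR_FILE_ID", "downloads/HRSID.zip")
```

After the download finishes, extract the archive so that it matches the folder layout shown in Option A.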


In [None]:
from pathlib import Path

# Point this to where you placed/extracted the dataset
HRSID_ROOT = Path('HRSID_ROOT')  # <-- change me

# Expected subfolders (adjust if your extraction differs)
ANNOT_DIR = HRSID_ROOT / 'annotations'
IMG_DIR = HRSID_ROOT / 'images'

print('HRSID_ROOT:', HRSID_ROOT.resolve())
print('ANNOT_DIR exists:', ANNOT_DIR.exists())
print('IMG_DIR exists:', IMG_DIR.exists())

# List files (helps you discover exact structure)
if ANNOT_DIR.exists():
    print('Annotation files:', sorted([p.name for p in ANNOT_DIR.glob('*.json')])[:10])
if IMG_DIR.exists():
    exts = ['*.jpg','*.jpeg','*.png']
    imgs = []
    for e in exts:
        imgs += list(IMG_DIR.rglob(e))
    print('Found images:', len(imgs))


## 2) Choose the COCO annotation JSON(s)

HRSID uses COCO-like JSON. Some releases include separate files for train/val/test; others provide a single JSON.

- If you have `train.json` and `test.json`, set them below.
- If you only have one JSON, we will create a train/val split.
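
When only one JSON is available, the split should be repeatable across runs. The sketch below shows one way to do that with a private, seeded random generator. The function name `split_ids` and the example ids are hypothetical; the 15% validation ratio mirrors the default used later in this notebook.

```python
import random


def split_ids(image_ids, val_ratio=0.15, seed=42):
    """Return (train_ids, val_ids) using a private, seeded RNG."""
    ids = sorted(image_ids)           # fixed starting order
    random.Random(seed).shuffle(ids)  # does not touch the global RNG state
    n_val = int(len(ids) * val_ratio)
    return set(ids[n_val:]), set(ids[:n_val])


train_ids, val_ids = split_ids(range(100))
print(len(train_ids), len(val_ids))
```

Because the generator is created inside the function, calling it again with the same ids and seed returns the same split, regardless of what other cells did with `random`.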


In [None]:
from pathlib import Path

# Set these based on what exists in your annotations folder.
COCO_TRAIN_JSON = None
COCO_VAL_JSON = None
COCO_TEST_JSON = None

# Auto-detect common names
# Sorted so that the fallback below picks the same file on every run
cands = sorted(ANNOT_DIR.glob('*.json')) if ANNOT_DIR.exists() else []
name2path = {p.name.lower(): p for p in cands}

for key in ['train.json','train2017.json','hrsid_train.json','instances_train.json']:
    if key in name2path:
        COCO_TRAIN_JSON = name2path[key]
        break

for key in ['val.json','valid.json','hrsid_val.json','instances_val.json','val2017.json']:
    if key in name2path:
        COCO_VAL_JSON = name2path[key]
        break

for key in ['test.json','test2017.json','hrsid_test.json','instances_test.json']:
    if key in name2path:
        COCO_TEST_JSON = name2path[key]
        break

# If no train JSON found, fall back to the first json in the folder
if COCO_TRAIN_JSON is None and cands:
    COCO_TRAIN_JSON = cands[0]

print('COCO_TRAIN_JSON:', COCO_TRAIN_JSON)
print('COCO_VAL_JSON:', COCO_VAL_JSON)
print('COCO_TEST_JSON:', COCO_TEST_JSON)


## 3) Convert COCO rotated instances to Ultralytics YOLO-OBB labels

Ultralytics YOLO OBB expects one `*.txt` label per image:

```
class_index x1 y1 x2 y2 x3 y3 x4 y4
```

All coordinates are **normalized to [0, 1]** (divide by image width/height).

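HRSID COCO annotations typically store ship outlines as polygons in `segmentation`.
For each annotation we:
- read the polygon (the part with the most points, if there are several)
- use its points directly as corners if it has exactly 4 points (8 numbers)
- otherwise compute the minimum-area rotated rectangle with shapely
- fall back to the axis-aligned `bbox` if no usable polygon is available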
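
As a worked instance of the normalization rule, the sketch below formats one label line in plain Python. The function name `obb_label_line`, the sample corners, and the 800x800 image size are illustrative and not part of the conversion code that follows.

```python
def obb_label_line(corners, img_w, img_h, class_id=0):
    """Format four (x, y) pixel corners as one normalized YOLO-OBB label line."""
    if len(corners) != 4:
        raise ValueError("expected exactly 4 corners")
    values = []
    for x, y in corners:
        # Divide by the image size and clamp to [0, 1]
        values.append(min(max(x / img_w, 0.0), 1.0))
        values.append(min(max(y / img_h, 0.0), 1.0))
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A rotated rectangle given as four pixel corners
corners_px = [(200, 100), (600, 300), (560, 380), (160, 180)]
line = obb_label_line(corners_px, img_w=800, img_h=800)
print(line)
# 0 0.250000 0.125000 0.750000 0.375000 0.700000 0.475000 0.200000 0.225000
```

Each line therefore has nine whitespace-separated fields: the class index followed by four normalized `x y` pairs.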


In [None]:
import json
import random
from pathlib import Path
from collections import defaultdict

import numpy as np
from shapely.geometry import Polygon
from tqdm import tqdm

RNG_SEED = 42
random.seed(RNG_SEED)
np.random.seed(RNG_SEED)

# Output dataset in Ultralytics format
OUT_ROOT = Path('hrsid_yolo_obb')
OUT_IMAGES = OUT_ROOT / 'images'
OUT_LABELS = OUT_ROOT / 'labels'

(OUT_IMAGES / 'train').mkdir(parents=True, exist_ok=True)
(OUT_IMAGES / 'val').mkdir(parents=True, exist_ok=True)
(OUT_LABELS / 'train').mkdir(parents=True, exist_ok=True)
(OUT_LABELS / 'val').mkdir(parents=True, exist_ok=True)

# Single-class dataset: ship -> class 0
CLASS_NAME = 'ship'
CLASS_ID = 0


def load_coco_json(path: Path):
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)


def coco_images_by_id(coco):
    return {img['id']: img for img in coco.get('images', [])}


def group_annotations_by_image(coco):
    grouped = defaultdict(list)
    for ann in coco.get('annotations', []):
        grouped[ann['image_id']].append(ann)
    return grouped


def polygon_from_segmentation(seg):
    # COCO segmentation can be list-of-lists or a flat list.
    # We only handle polygon segmentations here.
    if seg is None:
        return None
    if isinstance(seg, list) and len(seg) == 0:
        return None
    if isinstance(seg, list) and isinstance(seg[0], list):
        # take the largest polygon part
        seg = max(seg, key=lambda s: len(s))
    if not isinstance(seg, list):
        return None
    # Need at least 3 points and an even number of values (x, y pairs)
    if len(seg) < 6 or len(seg) % 2 != 0:
        return None
    pts = np.array(seg, dtype=float).reshape(-1, 2)
    return pts


def min_area_rect_corners(pts_xy):
    # Use shapely's minimum_rotated_rectangle
    poly = Polygon(pts_xy)
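    if not poly.is_valid:
        poly = poly.buffer(0)  # repairs self-intersections; may yield a MultiPolygon
    if poly.is_empty:
        return None
    mrr = poly.minimum_rotated_rectangle
    # Degenerate inputs (e.g. collinear points) give a LineString or Point, which has no exterior
    if mrr.geom_type != 'Polygon':
        return None
    coords = np.array(mrr.exterior.coords)[:-1]  # drop the repeated closing point -> 4 corners
    if coords.shape != (4, 2):
        return None
    return coords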


def corners_to_yolo_line(corners_xy, w, h, class_id=0):
    # corners_xy: (4,2) in pixel coords
    corners = corners_xy.copy().astype(float)
    corners[:, 0] = np.clip(corners[:, 0] / w, 0, 1)
    corners[:, 1] = np.clip(corners[:, 1] / h, 0, 1)
    flat = corners.reshape(-1)
    return str(class_id) + ' ' + ' '.join([f'{v:.6f}' for v in flat])


def resolve_image_path(file_name: str):
    # HRSID may store relative paths. Try a few common patterns.
    p = IMG_DIR / file_name
    if p.exists():
        return p
    # search recursively as fallback (slower)
    matches = list(IMG_DIR.rglob(Path(file_name).name))
    return matches[0] if matches else None


def convert_coco_to_yolo_obb(coco_json: Path, split_name: str, make_split=False, val_ratio=0.15):
    coco = load_coco_json(coco_json)
    imgs = coco.get('images', [])
    img_by_id = coco_images_by_id(coco)
    anns_by_img = group_annotations_by_image(coco)

    # If this JSON should be split into train/val
    if make_split:
        ids = sorted(im['id'] for im in imgs)
        # Private seeded RNG: the split stays identical when this cell is re-run
        random.Random(RNG_SEED).shuffle(ids)
        n_val = int(len(ids) * val_ratio)
        val_ids = set(ids[:n_val])
        train_ids = set(ids[n_val:])
        split_to_ids = {'train': train_ids, 'val': val_ids}
    else:
        split_to_ids = {split_name: set([im['id'] for im in imgs])}

    stats = {'images': 0, 'instances': 0, 'skipped_instances': 0, 'skipped_images': 0}

    for split, idset in split_to_ids.items():
        for img_id in tqdm(sorted(idset), desc=f'Converting {split}'):
            im = img_by_id.get(img_id)
            if im is None:
                continue

            file_name = im.get('file_name')
            width = im.get('width')
            height = im.get('height')

            # Locate the image on disk; skip the record if it cannot be found
            img_path = resolve_image_path(file_name)
            if img_path is None:
                stats['skipped_images'] += 1
                continue

            # If width/height are missing from the JSON, read them from the image
            if width is None or height is None:
                import cv2
                arr = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
                if arr is None:
                    stats['skipped_images'] += 1
                    continue
                height, width = arr.shape[:2]

            lines = []
            for ann in anns_by_img.get(img_id, []):
                seg = ann.get('segmentation')
                pts = polygon_from_segmentation(seg)
                corners = None

                # Fast path: segmentation is a rotated rectangle (4 points => 8 numbers)
                if pts is not None and len(pts) == 4:
                    corners = pts
                else:
                    # Compute minimum-area rectangle from polygon
                    if pts is not None:
                        corners = min_area_rect_corners(pts)

                # Fallback: if no segmentation, try bbox (axis-aligned) and convert to corners
                if corners is None:
                    bbox = ann.get('bbox')  # [x,y,w,h] in COCO
                    if bbox is not None and len(bbox) == 4:
                        x, y, bw, bh = bbox
                        corners = np.array([[x, y], [x+bw, y], [x+bw, y+bh], [x, y+bh]], dtype=float)

                if corners is None:
                    stats['skipped_instances'] += 1
                    continue

                lines.append(corners_to_yolo_line(corners, width, height, CLASS_ID))
                stats['instances'] += 1

            # Write label file (even if empty -> negative sample)
            label_path = OUT_LABELS / split / (Path(img_path).stem + '.txt')
            label_path.write_text('\n'.join(lines) + ('\n' if lines else ''), encoding='utf-8')

            # Copy image
            out_img_path = (OUT_IMAGES / split / Path(img_path).name)
            if not out_img_path.exists():
                out_img_path.write_bytes(img_path.read_bytes())

            stats['images'] += 1

    return stats


In [None]:
# Run conversion

# Case 1: train/val JSONs already exist
if COCO_TRAIN_JSON is not None and COCO_VAL_JSON is not None:
    train_stats = convert_coco_to_yolo_obb(Path(COCO_TRAIN_JSON), split_name='train', make_split=False)
    val_stats = convert_coco_to_yolo_obb(Path(COCO_VAL_JSON), split_name='val', make_split=False)
    print('Train stats:', train_stats)
    print('Val stats:', val_stats)

# Case 2: only a single JSON -> create split inside it
else:
    assert COCO_TRAIN_JSON is not None, 'No COCO annotation JSON found.'
    stats = convert_coco_to_yolo_obb(Path(COCO_TRAIN_JSON), split_name='train', make_split=True, val_ratio=0.15)
    print('Split stats:', stats)

print('Output root:', OUT_ROOT.resolve())


## 4) Visual sanity check: draw OBB labels on images

We will sample a few converted images and overlay the 4-corner OBB polygons.
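
Alongside the visual check, a quick textual check can catch malformed lines before training. The sketch below validates a single label line; `check_obb_label_line` is a hypothetical helper, not an Ultralytics function.

```python
def check_obb_label_line(line):
    """Return a list of problems found in one YOLO-OBB label line (empty if none)."""
    parts = line.split()
    if len(parts) != 9:
        return [f"expected 9 fields, got {len(parts)}"]
    problems = []
    if not parts[0].isdigit():
        problems.append(f"class index is not a non-negative integer: {parts[0]!r}")
    try:
        coords = [float(v) for v in parts[1:]]
    except ValueError:
        return problems + ["non-numeric coordinate"]
    if any(not (0.0 <= c <= 1.0) for c in coords):
        problems.append("coordinate outside [0, 1]")
    return problems


good = "0 0.25 0.125 0.75 0.375 0.7 0.475 0.2 0.225"
bad = "0 0.25 0.125 1.20 0.375 0.7 0.475 0.2"  # one value short
print(check_obb_label_line(good))
print(check_obb_label_line(bad))
```

Running it over every line of every file in the labels folder gives a cheap pass/fail signal before any GPU time is spent.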


In [None]:
import random
from pathlib import Path
import cv2
import matplotlib.pyplot as plt
import numpy as np


def read_yolo_obb_labels(label_path: Path):
    lines = label_path.read_text(encoding='utf-8').strip().splitlines() if label_path.exists() else []
    polys = []
    for ln in lines:
        if not ln.strip():
            continue
        parts = ln.strip().split()
        cls = int(parts[0])
        coords = np.array([float(x) for x in parts[1:]], dtype=float).reshape(4, 2)
        polys.append((cls, coords))
    return polys


def denorm(coords_norm, w, h):
    xy = coords_norm.copy()
    xy[:, 0] *= w
    xy[:, 1] *= h
    return xy


def show_samples(split='train', k=6):
    img_paths = sorted((OUT_IMAGES / split).glob('*'))
    if not img_paths:
        print('No images found in', OUT_IMAGES / split)
        return
    picks = random.sample(img_paths, min(k, len(img_paths)))

    for p in picks:
        img = cv2.imread(str(p), cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue
        h, w = img.shape[:2]
        img_rgb = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)

        label_path = (OUT_LABELS / split / (p.stem + '.txt'))
        polys = read_yolo_obb_labels(label_path)

        for cls, poly_n in polys:
            poly = denorm(poly_n, w, h).astype(np.int32)  # cv2.polylines requires 32-bit integer points
            cv2.polylines(img_rgb, [poly.reshape(-1, 1, 2)], isClosed=True, color=(0, 255, 0), thickness=2)

        plt.figure(figsize=(6, 6))
        plt.title(f'{split}: {p.name} | ships: {len(polys)}')
        plt.imshow(img_rgb)
        plt.axis('off')
        plt.show()


show_samples('train', k=4)


## 5) Create Ultralytics dataset YAML

Ultralytics uses a YAML file that points to the dataset root and image folders.
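
Ultralytics locates each label file by swapping the `images` directory in the image path for `labels` and the extension for `.txt`, so the two folder trees must mirror each other. The sketch below lists images without a matching label file; the helper name `images_missing_labels` and the throwaway folder it builds are illustrative only.

```python
import tempfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def images_missing_labels(root, split):
    """List images under root/images/<split> with no root/labels/<split>/<stem>.txt."""
    root = Path(root)
    missing = []
    for img in sorted((root / "images" / split).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        if not (root / "labels" / split / f"{img.stem}.txt").exists():
            missing.append(img.name)
    return missing


# Tiny temporary layout to show the behaviour
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images" / "train").mkdir(parents=True)
    (root / "labels" / "train").mkdir(parents=True)
    (root / "images" / "train" / "a.jpg").touch()
    (root / "images" / "train" / "b.jpg").touch()
    (root / "labels" / "train" / "a.txt").touch()
    missing = images_missing_labels(root, "train")
    print(missing)
```

On the converted dataset the call would be `images_missing_labels(OUT_ROOT, 'train')`. Since the conversion writes a label file for every image (empty for images without ships), the result should be an empty list.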


In [None]:
import yaml
from pathlib import Path

dataset_yaml = {
    # Absolute path: a relative one is resolved against Ultralytics' own datasets directory
    'path': str(OUT_ROOT.resolve()),
    'train': 'images/train',
    'val': 'images/val',
    'names': {0: CLASS_NAME},
}

yaml_path = OUT_ROOT / 'hrsid_obb.yaml'
with open(yaml_path, 'w', encoding='utf-8') as f:
    yaml.safe_dump(dataset_yaml, f, sort_keys=False)

print('Wrote:', yaml_path)
print(yaml_path.read_text())


## 6) Train a YOLO-OBB model

Choose a small model to start (fast iteration), then scale up.

Ultralytics provides pretrained OBB models (e.g. `yolov8n-obb.pt` or newer OBB models).

> Tip: If your ships are tiny, try increasing `imgsz` (e.g. 1024) and using multi-scale training.


In [None]:
from ultralytics import YOLO

# Pick an OBB pretrained checkpoint.
# If you get a "file not found" error, run `!yolo help` or check Ultralytics docs for the latest model names.
model = YOLO('yolov8n-obb.pt')

results = model.train(
    data=str(yaml_path),
    epochs=100,
    imgsz=1024,
    batch=8,
    device=0,         # set to 'cpu' if no GPU
    workers=4,
    project='runs_hrsid',
    name='yolov8n_obb_hrsid',
)


## 7) Evaluate (validation)

Ultralytics will compute OBB detection metrics during training. You can also run validation explicitly.


In [None]:
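# Validate the best checkpoint
best = Path('runs_hrsid/yolov8n_obb_hrsid/weights/best.pt')
if not best.exists():
    # Re-running training makes Ultralytics append a suffix to the run name (e.g. yolov8n_obb_hrsid2)
    print('best.pt not found at', best, '- using the pretrained checkpoint; metrics will not reflect your training.')
model = YOLO(str(best) if best.exists() else 'yolov8n-obb.pt')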

metrics = model.val(data=str(yaml_path), imgsz=1024)
print(metrics)


## 8) Inference + visualization

We run inference on a few validation images and draw predicted OBBs.


In [None]:
import random
from pathlib import Path
import cv2
import matplotlib.pyplot as plt

val_imgs = sorted((OUT_IMAGES / 'val').glob('*'))
if not val_imgs:
    print('No val images found.')
else:
    sample_imgs = random.sample(val_imgs, min(6, len(val_imgs)))

    # Use best checkpoint if present
    best = Path('runs_hrsid/yolov8n_obb_hrsid/weights/best.pt')
    model = YOLO(str(best) if best.exists() else 'yolov8n-obb.pt')

    preds = model.predict(source=[str(p) for p in sample_imgs], imgsz=1024, conf=0.25)

    for p, r in zip(sample_imgs, preds):
        img = cv2.imread(str(p), cv2.IMREAD_GRAYSCALE)
        if img is None:  # unreadable file: skip instead of crashing in cvtColor
            continue
        img_rgb = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)

        # r.obb.xyxyxyxy gives (N,4,2) in pixel coords when available
        if getattr(r, 'obb', None) is not None and r.obb.xyxyxyxy is not None:
            polys = r.obb.xyxyxyxy.cpu().numpy().astype('int32')  # cv2.polylines requires 32-bit integer points
            for poly in polys:
                cv2.polylines(img_rgb, [poly.reshape(-1,1,2)], True, (255,0,0), 2)

        plt.figure(figsize=(6,6))
        plt.title(f'Predictions: {p.name}')
        plt.imshow(img_rgb)
        plt.axis('off')
        plt.show()


## 9) What to report (project write-up checklist)

Include these in your final project report:

- **Dataset**: size, train/val split, number of instances, typical ship sizes.
- **Preprocessing**: conversion to YOLO-OBB, any filtering, handling of invalid polygons.
- **Model**: YOLO OBB variant, image size, training epochs, augmentations.
- **Metrics**: mAP@50, mAP@50-95 (OBB), PR curves if available.
- **Qualitative results**: example detections (inshore vs offshore), failure cases.
- **Ablations** (optional): imgsz 640 vs 1024; nano vs small model; conf threshold.

---

### Ideas for extensions (optional)
- Train separate models for **inshore vs offshore** or add a domain label.
- Add SAR-specific augmentations (speckle noise simulation, contrast stretching).
- Use tiling (split large images into crops) for better small-object recall.
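
For the tiling idea, one possible starting point is to compute overlapping crop windows first and handle the label remapping separately. The sketch below only produces the windows; the function names and the 640/128 tile and overlap values are assumptions chosen for illustration, not tuned settings.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` that cover [0, length)."""
    if tile >= length:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the border
    return origins


def tile_windows(width, height, tile=640, overlap=128):
    """Return (x0, y0, x1, y1) crop windows that cover the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


windows = tile_windows(1600, 1000)
print(len(windows), windows[0], windows[-1])
```

Each crop still needs its own label file: box corners have to be shifted by the window origin and re-normalized to the crop size, and ships cut by a tile border need a policy (drop, clip, or keep if mostly inside).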
