# 📦 COCO Dataset Viewer with `pycocotools`

## Overview
This interactive notebook provides visualization and analysis tools for COCO-format datasets. It uses the `pycocotools` library to load, explore, and validate COCO annotations, and draws bounding boxes and segmentation polygons over the source images.

## Features
- **Dataset Loading**: Native COCO format support using pycocotools
- **Interactive Visualization**: View annotations with bounding boxes and segmentation masks
- **Statistical Analysis**: Dataset statistics, class distributions, and annotation quality metrics
- **Batch Viewing**: Display multiple images with annotations simultaneously
- **Quality Control**: Identify potential annotation issues or dataset inconsistencies
- **Export Capabilities**: Save visualizations and analysis results

## Use Cases
- **Dataset Validation**: Verify annotation quality and correctness
- **Data Exploration**: Understand dataset characteristics and class distributions
- **Debugging**: Identify issues in converted datasets
- **Presentation**: Generate visualizations for reports and presentations
- **Quality Assurance**: Check dataset consistency before training

## Requirements
- COCO-format annotation file (`annotations.json`)
- Image directory with corresponding images
- Python packages: `pycocotools`, `matplotlib`, `opencv-python`, `numpy`
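
Before running the cells, it can help to confirm that the required packages are importable. The following is a minimal sketch, not part of the original notebook: `REQUIRED` and `missing_packages` are hypothetical names, and the mapping assumes the usual import names (in particular, `opencv-python` is imported as `cv2`).

```python
from importlib.util import find_spec

# Map pip package names to the module names they are imported under.
# (Assumed mapping; opencv-python installs the module `cv2`.)
REQUIRED = {
    "pycocotools": "pycocotools",
    "matplotlib": "matplotlib",
    "opencv-python": "cv2",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose top-level module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed with `pip install <name>`.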

## Usage Instructions
1. **Set Dataset Path**: Update the path to your COCO annotation file and image directory
2. **Run Analysis**: Execute cells to load dataset and generate statistics
3. **Visualize Samples**: View individual images or batches with annotations
4. **Export Results**: Save visualizations and analysis reports

---

### Imports

In [None]:
#!/usr/bin/env python3
import os
import cv2
import random
import matplotlib
import numpy as np

from pycocotools.coco import COCO
import matplotlib.pyplot as plt
import matplotlib.patches as patches

### Paths to COCO JSON and image directory

In [None]:
dataset_dir = "coco_new"  # make sure this matches your folder structure
coco_annotation_path = os.path.join(dataset_dir, "annotations.json")
image_dir = os.path.join(dataset_dir, "PNGImages")

### Load COCO dataset

In [None]:
coco = COCO(coco_annotation_path)
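
The class distribution mentioned under Features can be approximated with a few lines of standard-library Python. This is a sketch under stated assumptions rather than the notebook's own method: `category_counts` is a hypothetical helper that works on the raw COCO dictionary (the parsed contents of `annotations.json`), and the `example` data below is made up for illustration.

```python
from collections import Counter


def category_counts(coco_dict):
    """Count annotations per category name in a COCO-format dictionary."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco_dict.get("categories", [])}
    # Start every known category at zero so empty classes still show up.
    counts = Counter({name: 0 for name in id_to_name.values()})
    for ann in coco_dict.get("annotations", []):
        cat_id = ann["category_id"]
        # Annotations pointing at an undefined category get their own bucket.
        counts[id_to_name.get(cat_id, f"<unknown id {cat_id}>")] += 1
    return counts


# Made-up data, only to show the expected input shape.
example = {
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 1},
        {"id": 3, "image_id": 2, "category_id": 99},
    ],
}

for name, count in category_counts(example).most_common():
    print(f"{name:>20}: {count}")
```

With the dataset loaded as above, the same helper should work on `coco.dataset`, which holds the parsed annotation file.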

### Choose image IDs from the dataset

In [None]:
# Set this flag to True to select a specific image, or False to select up to 10 random images
select_specific_image = False
specific_image_id = 178  # Change this to your desired image ID

if select_specific_image:
    img_datas = coco.loadImgs([specific_image_id])
    print(f"Loaded image with ID {specific_image_id}.")
else:
    all_img_ids = coco.getImgIds()
    # random.sample raises ValueError if asked for more items than exist,
    # so cap the sample size at the number of images in the dataset.
    random_img_ids = random.sample(all_img_ids, min(10, len(all_img_ids)))
    img_datas = coco.loadImgs(random_img_ids)
    print(f"Loaded {len(img_datas)} images.")

# Display image file names and IDs
for img_data in img_datas:
    print(f"Image: {img_data['file_name']}   (ID: {img_data['id']})")

### Load and display the image with annotations

In [None]:
for img_data in img_datas:
    img_path = os.path.join(image_dir, img_data["file_name"])
    image = cv2.imread(img_path)
    if image is None:
        print(f"Image not found at {img_path}")
        continue

    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_id = img_data["id"]

    ann_ids = coco.getAnnIds(imgIds=image_id)
    annotations = coco.loadAnns(ann_ids)

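    cmap = matplotlib.colormaps["tab20"]
    # Index the 20 discrete colors directly and wrap around, so images with
    # more than 20 annotations do not collapse onto the last color.
    colors = [cmap(i % cmap.N) for i in range(len(annotations))]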

    plt.figure(figsize=(10, 6))
    plt.imshow(image)
    ax = plt.gca()

    for idx, ann in enumerate(annotations):
        x, y, w, h = ann["bbox"]
        color = colors[idx]

        rect = patches.Rectangle((x, y), w, h, linewidth=2, edgecolor=color, facecolor='none')
        ax.add_patch(rect)

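        # Polygon segmentations are lists of flat [x1, y1, x2, y2, ...] lists.
        # RLE segmentations are dicts (typical for iscrowd=1) and are skipped here.
        segmentation = ann.get("segmentation")
        if isinstance(segmentation, list):
            for seg in segmentation:
                poly = np.array(seg).reshape((-1, 2))
                polygon = patches.Polygon(poly, linewidth=1, edgecolor=color, facecolor=color, alpha=0.4)
                ax.add_patch(polygon)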

        category = coco.loadCats(ann["category_id"])[0]["name"]
        plt.text(x, y - 5, category, color=color, fontsize=12, backgroundcolor="white")

    plt.axis("off")
    plt.title(f"Annotations for {img_data['file_name']}")
    plt.tight_layout()
    plt.show()
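
Visual inspection catches many problems, but a simple programmatic pass can flag suspicious boxes across the whole dataset. The sketch below is one possible quality-control check, not the notebook's own: `find_bbox_issues` is a hypothetical helper, the choice of checks is an assumption, and `qc_example` is made-up data. It expects the raw COCO dictionary with `[x, y, width, height]` boxes.

```python
def find_bbox_issues(coco_dict):
    """Return (annotation_id, problem) tuples for suspicious annotations."""
    images = {img["id"]: img for img in coco_dict.get("images", [])}
    category_ids = {cat["id"] for cat in coco_dict.get("categories", [])}
    issues = []
    for ann in coco_dict.get("annotations", []):
        ann_id = ann.get("id")
        img = images.get(ann.get("image_id"))
        if img is None:
            issues.append((ann_id, "unknown image_id"))
            continue
        if ann.get("category_id") not in category_ids:
            issues.append((ann_id, "unknown category_id"))
        bbox = ann.get("bbox")
        if not bbox or len(bbox) != 4:
            issues.append((ann_id, "missing or malformed bbox"))
            continue
        x, y, w, h = bbox
        if w <= 0 or h <= 0:
            issues.append((ann_id, "non-positive width or height"))
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            issues.append((ann_id, "bbox extends outside image"))
    return issues


# Made-up data: one valid box followed by three problematic ones.
qc_example = {
    "images": [{"id": 1, "width": 100, "height": 80}],
    "categories": [{"id": 1, "name": "cat"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [5, 5, 0, 10]},
        {"id": 12, "image_id": 1, "category_id": 1, "bbox": [90, 70, 20, 20]},
        {"id": 13, "image_id": 2, "category_id": 1, "bbox": [0, 0, 5, 5]},
    ],
}

for ann_id, problem in find_bbox_issues(qc_example):
    print(f"Annotation {ann_id}: {problem}")
```

Flagged annotation IDs can then be passed to `coco.loadAnns` and drawn with the cell above to inspect them visually.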
