# Load COCO Dataset from Roboflow

Import the COCO object detection dataset from Roboflow Universe into Pixeltable tables.

**What's in this recipe:**
- Import a 1% sample of the COCO training set from Roboflow
- Load images with object detection annotations
- Work with bounding boxes and class labels
- Access the COCO dataset without manual downloads


## Problem

COCO (Common Objects in Context) is one of the most widely used datasets for object detection, with 123,272 labeled training images. You need a representative sample in Pixeltable so you can work with object detection models without downloading the full 25GB+ dataset.


## Solution

**What the solution does:**
- Load ~1,233 images (1% of 123,272 training set) from COCO via Roboflow
- Access object detection annotations (bounding boxes, labels)
- Use Roboflow's API for easy dataset access
- Work with standard COCO format in Pixeltable

Roboflow hosts the [COCO dataset](https://universe.roboflow.com/microsoft/coco) and provides an API for downloading subsets, so you do not need to fetch the entire dataset.


### Setup


In [1]:
!uv add pixeltable roboflow

In [3]:
import pixeltable as pxt
from roboflow import Roboflow

### Download COCO Sample from Roboflow

Note: You'll need a Roboflow API key. Get one free at https://roboflow.com
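
To keep the key out of the notebook, it can be read from an environment variable instead of being pasted into a cell. The following is a minimal sketch; the variable name `ROBOFLOW_API_KEY` and the helper `get_roboflow_api_key` are assumptions made for illustration and are not part of the Roboflow SDK.

```python
import os


def get_roboflow_api_key(env_var: str = "ROBOFLOW_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset.

    The variable name is a convention assumed for this recipe, not one
    defined by Roboflow.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Roboflow API key."
        )
    return key
```

With this helper, the initialization cell could pass `api_key=get_roboflow_api_key()` to `Roboflow(...)` rather than a literal string.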


In [None]:
# Initialize Roboflow with your API key
# Get your API key from https://app.roboflow.com/settings/api
rf = Roboflow(api_key="YOUR_API_KEY_HERE")

# Access the COCO dataset
project = rf.workspace("microsoft").project("coco")
dataset = project.version(1).download("coco")

# The dataset is downloaded to a local directory
print(f"Dataset downloaded to: {dataset.location}")


The labeled object detection dataset on Roboflow contains 123,272 images. Next, create a Pixeltable directory and a table to hold the sampled images.

In [None]:
# Create directory for COCO data
pxt.drop_dir('coco_roboflow', force=True)
pxt.create_dir('coco_roboflow')

# Create table with schema for images
t = pxt.create_table(
    'coco_roboflow.samples',
    schema={
        'image': pxt.Image,
        'image_id': pxt.String
    },
    comment='COCO object detection dataset (1% sample) from Roboflow'
)


In [None]:
import os
import glob

# List all images in the train directory; sort so the sample is deterministic
train_dir = os.path.join(dataset.location, 'train')
all_images = sorted(glob.glob(os.path.join(train_dir, '*.jpg')))

# Take a 1% sample (the first 1% of the sorted file list)
sample_size = max(1, len(all_images) // 100)
sampled_images = all_images[:sample_size]

# Prepare rows for insertion, using the file name without its extension as the ID
rows = []
for img_path in sampled_images:
    image_id = os.path.splitext(os.path.basename(img_path))[0]
    rows.append({
        'image': img_path,
        'image_id': image_id
    })

t.insert(rows)
print(f"Inserted {len(rows)} images (1% of training set)")
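
Taking a leading slice of the file listing is simple, but the result depends on file order rather than on chance. If a random sample is preferred, a seeded draw keeps it reproducible across runs. This is a sketch of one alternative, not the method used by the recipe; the helper name `sample_one_percent` and the default seed are hypothetical.

```python
import random


def sample_one_percent(paths, seed=0):
    """Return a reproducible random 1% subset of `paths` (at least one item)."""
    ordered = sorted(paths)  # fix the order first so the seed alone decides the draw
    if not ordered:
        return []
    k = max(1, len(ordered) // 100)  # same 1% rule as the slice-based cell
    return random.Random(seed).sample(ordered, k)
```

Calling `sample_one_percent(all_images)` in place of the slice would yield the same number of rows, drawn uniformly from the whole directory.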


In [None]:
# View sample data
t.select(t.image, t.image_id).head(10)


In [None]:
# Check total count
t.count()
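
The table so far holds only images and IDs. To work with bounding boxes and class labels, the annotations have to be read from the COCO JSON file that accompanies the images. The sketch below groups annotations by image file name using the standard COCO keys (`images`, `annotations`, `categories`). It assumes the Roboflow export places a file named `_annotations.coco.json` in each split folder; check your download to confirm. The helper names are hypothetical.

```python
import json
from collections import defaultdict


def group_annotations(coco):
    """Map each image file name to a list of {'label', 'bbox'} dicts.

    `coco` is a parsed COCO annotation document. Boxes stay in COCO's
    [x, y, width, height] pixel format.
    """
    category_names = {c["id"]: c["name"] for c in coco["categories"]}
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}

    by_file = defaultdict(list)
    for ann in coco["annotations"]:
        by_file[file_names[ann["image_id"]]].append(
            {"label": category_names[ann["category_id"]], "bbox": ann["bbox"]}
        )
    return dict(by_file)


def load_coco_annotations(annotation_path):
    """Read a COCO JSON file from disk and group its annotations by file name."""
    with open(annotation_path, encoding="utf-8") as f:
        return group_annotations(json.load(f))
```

For example, `load_coco_annotations(os.path.join(train_dir, '_annotations.coco.json'))` would return a dictionary that can be looked up with `os.path.basename(img_path)`. Images without annotations are absent from the dictionary, so use `.get(name, [])`. One option for keeping the result next to each image is an extra `pxt.Json` column in the table schema.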


### Publish to Pixeltable Cloud


In [None]:
# Publish the table to Pixeltable Cloud
pxt.publish(
    'coco_roboflow.samples',
    'pxt://pixeltable:roboflow/coco_sample',
    access='public'
)


## See also

- [COCO on Roboflow Universe](https://universe.roboflow.com/microsoft/coco)
- [Roboflow Documentation](https://docs.roboflow.com/)
- [COCO Dataset Website](https://cocodataset.org/)
