# Detect objects in images

Automatically identify and locate objects in images using YOLOX object detection models.


## Problem

You have images that need object detection: identifying which objects are present and where they are located. Manual labeling is slow and expensive.

| Use case | Images | Need |
|----------|--------|------|
| Inventory counting | 5K product photos | Count items per image |
| Security monitoring | 10K frames | Detect people, vehicles |
| Quality control | 20K inspection images | Find defects |


## Solution

**What's in this recipe:**
- Detect objects using YOLOX models (runs locally, no API needed)
- Get bounding boxes and class labels
- Filter detections by confidence threshold

You add a computed column that runs YOLOX on each image. Pixeltable fills it in for the rows already in the table and runs detection automatically on any images you insert later.


### Setup


In [None]:
%pip install -qU pixeltable pixeltable-yolox


In [None]:
import pixeltable as pxt
from pixeltable.functions.yolox import yolox


### Load images


In [None]:
# Create a fresh directory (this drops any existing 'detection_demo' directory and its contents)
pxt.drop_dir('detection_demo', force=True)
pxt.create_dir('detection_demo')


In [None]:
# Create table for images
images = pxt.create_table('detection_demo.images', {'image': pxt.Image})


In [None]:
# Insert sample images (COCO dataset samples with common objects)
image_urls = [
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000036.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000090.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000106.jpg',
]

images.insert([{'image': url} for url in image_urls])


In [None]:
# View images
images.collect()


### Run object detection

Add a computed column that runs YOLOX on each image:


In [None]:
# Run YOLOX object detection
# model_id options: yolox_nano, yolox_tiny, yolox_s, yolox_m, yolox_l, yolox_x
images.add_computed_column(
    detections=yolox(images.image, model_id='yolox_m', threshold=0.5)
)


In [None]:
# View detection results
images.select(images.image, images.detections).collect()


### Extract detection details

Parse the detection output to get object counts and classes:


In [None]:
# Extract number of detections
@pxt.udf
def count_objects(detections: dict) -> int:
    """Count the number of detected objects."""
    return len(detections.get('labels', []))

images.add_computed_column(object_count=count_objects(images.detections))


In [None]:
# Extract unique object classes
@pxt.udf
def get_classes(detections: dict) -> list:
    """Get a sorted list of the distinct detected object classes."""
    # sorted() gives a stable order; a bare set has no guaranteed ordering
    return sorted(set(detections.get('labels', [])))

images.add_computed_column(object_classes=get_classes(images.detections))


In [None]:
# View summary
images.select(images.image, images.object_count, images.object_classes).collect()
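
For a use case like inventory counting, a per-class tally is often more useful than a single total. The following is a minimal sketch in plain Python, assuming the `labels` list described in this recipe. `count_by_class` is a hypothetical helper, and the sample dictionary is hand-written for illustration rather than real model output. Decorating it with `@pxt.udf`, as with the functions above, should let it run as a computed column, though that is not verified here.

```python
from collections import Counter


def count_by_class(detections: dict) -> dict:
    """Return a {label: count} mapping for one image's detections."""
    labels = detections.get('labels', [])
    # Convert labels to strings so the result is JSON-friendly
    # whether the model reports class names or integer class IDs.
    return dict(Counter(str(label) for label in labels))


# Hand-written sample in the shape this recipe describes (not real model output)
sample = {
    'labels': ['person', 'car', 'person'],
    'boxes': [[10, 20, 110, 220], [150, 80, 400, 260], [300, 30, 380, 210]],
    'scores': [0.92, 0.81, 0.64],
}

per_class = count_by_class(sample)
print(per_class)
```

An image with no detections yields an empty dictionary, which keeps downstream aggregation simple.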


## Explanation

**YOLOX model sizes:**

| Model | Speed | Accuracy | Use case |
|-------|-------|----------|----------|
| `yolox_nano` | Fastest | Lower | Real-time, edge devices |
| `yolox_tiny` | Fast | Good | Mobile, quick processing |
| `yolox_s` | Medium | Better | Balanced performance |
| `yolox_m` | Slower | High | General use (recommended) |
| `yolox_l` | Slow | Higher | High accuracy needs |
| `yolox_x` | Slowest | Highest | Maximum accuracy |

**Detection output format:**

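The `detections` dictionary contains three parallel lists, with one entry per detected object:
- `labels`: Class label of each object (e.g., "person", "car", "dog")
- `boxes`: Bounding box of each object as [x1, y1, x2, y2] corner coordinates
- `scores`: Confidence score of each detection, between 0 and 1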
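
To work with individual objects, it can help to zip the three fields into one record per detection. This sketch assumes the fields line up by position (entry `i` of each list describes the same object) and that the key names match the list above; it is worth checking both against your actual output. `describe_detections` is a hypothetical helper and the sample values are made up.

```python
def describe_detections(detections: dict) -> list[dict]:
    """Turn the parallel lists into one record per detected object."""
    records = []
    for label, box, score in zip(
        detections.get('labels', []),
        detections.get('boxes', []),
        detections.get('scores', []),
    ):
        # Assumed box layout: [x1, y1, x2, y2] (two opposite corners)
        x1, y1, x2, y2 = box
        records.append({
            'label': label,
            'score': score,
            'width': x2 - x1,
            'height': y2 - y1,
        })
    return records


# Hand-written sample (not real model output)
sample = {
    'labels': ['person', 'car'],
    'boxes': [[10, 20, 110, 220], [150, 80, 400, 260]],
    'scores': [0.92, 0.81],
}

records = describe_detections(sample)
for record in records:
    print(record)
```

Width and height derived this way are handy for discarding boxes that are too small to be meaningful.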

**Adjusting threshold:**

- Higher threshold (0.7-0.9): Fewer detections, higher confidence
- Lower threshold (0.3-0.5): More detections, may include false positives
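
The `threshold` argument is applied when the computed column runs. To compare cut-offs without re-running the model, one option is to detect once with a low threshold and filter the stored result afterwards. The sketch below does this in plain Python; `filter_by_score` is a hypothetical helper, the key names follow the output format described above, and the sample is hand-written for illustration.

```python
def filter_by_score(detections: dict, min_score: float) -> dict:
    """Keep only the detections whose score is at least min_score."""
    scores = detections.get('scores', [])
    # Indices of the detections that pass the cut-off
    keep = [i for i, score in enumerate(scores) if score >= min_score]
    return {
        key: [detections.get(key, [])[i] for i in keep]
        for key in ('labels', 'boxes', 'scores')
    }


# Hand-written sample (not real model output)
sample = {
    'labels': ['person', 'car', 'dog', 'chair'],
    'boxes': [[10, 20, 110, 220], [150, 80, 400, 260],
              [300, 30, 380, 210], [5, 5, 60, 90]],
    'scores': [0.92, 0.81, 0.64, 0.35],
}

# Count how many detections survive each cut-off
for cutoff in (0.3, 0.5, 0.7, 0.9):
    kept = filter_by_score(sample, cutoff)
    print(cutoff, len(kept['labels']), kept['labels'])
```

Raising the cut-off only ever removes detections, so running the model once at the lowest threshold you care about preserves every option.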


## See also

- [Extract frames from videos](https://docs.pixeltable.com/howto/cookbooks/video/video-extract-frames) - Detect objects in video frames
- [Analyze images in batch](https://docs.pixeltable.com/howto/cookbooks/images/vision-batch-analysis) - AI vision analysis
- [Find similar images](https://docs.pixeltable.com/howto/cookbooks/search/search-similar-images) - Visual similarity search
