
ZED 2i Camera – Vision System

Jgocunha edited this page Apr 1, 2026 · 1 revision

This page describes how the ZED 2i camera is used for object detection and hand tracking in the collaborative packaging task.

The vision system uses a line-scan approach over a Region of Interest (ROI) to:

  • Count RED objects
  • Count GREEN objects
  • Estimate object positions
  • Track human hand position
  • Convert pixel positions to centimeters

The system runs CPU-only on Ubuntu 22.04 using OpenCV and V4L2.


Requirements

Install required packages:

sudo apt update
sudo apt install -y python3-opencv v4l-utils

If OpenCV was compiled without Qt, avoid GUI overlay functions.


Detecting the ZED Camera

List available video devices:

v4l2-ctl --list-devices

Example output:

ZED 2i:
  /dev/video4
  /dev/video5

Device        Description
/dev/video4   Stereo side-by-side stream
/dev/video5   Mono stream (not always usable)

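Use /dev/video4 and select the right-hand camera image, i.e. the right half of the side-by-side frame (--half right in the commands on this page).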
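A side-by-side frame holds the left and right camera images next to each other, so picking one camera amounts to keeping half of the columns. The sketch below is only an illustration of that idea, not the code in zed_online.py; half_bounds is a hypothetical helper name.

```python
def half_bounds(frame_width: int, half: str) -> tuple[int, int]:
    """Return the [x0, x1) column range of one camera in a side-by-side frame."""
    if frame_width <= 0 or frame_width % 2:
        raise ValueError("side-by-side frame width must be a positive even number")
    mid = frame_width // 2
    if half == "left":
        return 0, mid
    if half == "right":
        return mid, frame_width
    raise ValueError(f"unknown half: {half!r}")


# With a NumPy/OpenCV frame the crop would be: frame[:, x0:x1]
x0, x1 = half_bounds(2560, "right")
print(x0, x1)  # 1280 2560
```

With a 2560x720 stream, each half is 1280 pixels wide.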


Vision Scripts Overview

The vision tools are located in:

src/tests/zed-2i/

Script                 Purpose
zed_preview.py         Camera preview
zed_record.py          Record video
zed_line_count.py      Count objects in videos
zed_line_centers.py    Detect object centers
zed_hand_tracking.py   Hand tracking
zed_online.py          Live object + hand tracking

The main script used in the experiment is:

zed_online.py

ROI and Coordinate System

ROI format:

x, y, width, height
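The ROI strings passed to --roi-objects and --roi-hand follow this format. A minimal sketch of parsing one and deriving the crop bounds is shown here; Roi and parse_roi are hypothetical names, and the parsing inside zed_online.py may differ.

```python
from typing import NamedTuple


class Roi(NamedTuple):
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int   # width in pixels
    height: int  # height in pixels


def parse_roi(text: str) -> Roi:
    """Parse an 'x,y,width,height' string into an Roi."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    x, y, w, h = (int(p) for p in parts)
    if x < 0 or y < 0 or w <= 0 or h <= 0:
        raise ValueError("x and y must be >= 0; width and height must be > 0")
    return Roi(x, y, w, h)


roi = parse_roi("607,400,235,100")
# Row and column ranges that a crop such as frame[y0:y1, x0:x1] would use.
rows = (roi.y, roi.y + roi.height)
cols = (roi.x, roi.x + roi.width)
print(rows, cols)  # (400, 500) (607, 842)
```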

Coordinates:

  • x = 0 is the left edge
  • Positions are reported in cm, measured from the right edge of the ROI
  • Workspace width ≈ 60 cm

Conversion formula (W is the ROI width in pixels, x_px is the column index inside the ROI):

cm = ((W - 1 - x_px) * roi_width_cm / (W - 1))

This means:

  • Right side (x_px = W - 1) = 0 cm
  • Left side (x_px = 0) = 60 cm
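The formula can be checked with a few values. This sketch assumes W is the ROI width in pixels and x_px is a column index inside the ROI; px_to_cm is a hypothetical function name used only for illustration.

```python
def px_to_cm(x_px: float, roi_width_px: int, roi_width_cm: float) -> float:
    """Convert a column index inside the ROI to cm measured from the right edge."""
    if roi_width_px < 2:
        raise ValueError("ROI must be at least 2 pixels wide")
    last = roi_width_px - 1  # index of the right-most column
    return (last - x_px) * roi_width_cm / last


W = 235  # ROI width in pixels, as in --roi-objects 607,400,235,100
print(px_to_cm(0, W, 60.0))      # 60.0 (left edge)
print(px_to_cm(W - 1, W, 60.0))  # 0.0  (right edge)
print(px_to_cm(117, W, 60.0))    # 30.0 (middle column)
```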

Running the Vision System (Live)

Objects + Hand Tracking

python3 zed_online.py --dev /dev/video4 --size 2560x720 --fps 60 --fmt MJPG \
  --half right --mode both \
  --roi-objects 607,400,235,100 --roi-width-cm-objects 60 \
  --roi-hand    607,350,215,110  --roi-width-cm-hand 60 \
  --only-changes --change-th 6 --show

Objects Only

python3 zed_online.py --dev /dev/video4 --size 2560x720 --fps 60 --fmt MJPG \
  --half right --mode objects \
  --roi-objects 607,400,235,100 --roi-width-cm-objects 60 \
  --only-changes --change-th 5 --show

Hand Only

python3 zed_online.py --dev /dev/video4 --size 2560x720 --fps 60 --fmt MJPG \
  --half right --mode hand \
  --roi-hand 607,400,235,100 --roi-width-cm-hand 60 \
  --only-changes --change-th 6 --show

If the camera fails to open or no frames arrive:

  • Try --fmt YUYV
  • Or --size 1280x720

How the Line-Scan Detection Works

The system does not use full image segmentation. Instead, it uses a horizontal line-scan approach:

Object Detection

  1. Crop image to ROI
  2. Extract thin horizontal band
  3. Convert to HSV
  4. Create color masks:
    • Red mask
    • Green mask
  5. Compute color fraction per column
  6. Smooth signal along x
  7. Apply threshold + hysteresis
  8. Perform 1D morphological closing
  9. Detect segments
  10. Compute object centers
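Steps 6 to 10 operate on a 1D signal: one color fraction per ROI column. The following is a minimal pure-Python sketch of that stage, assuming the per-column fractions are already computed. All function names and default values are hypothetical, the closing is written as interior gap filling, and the real scripts may implement each step differently.

```python
def moving_average(signal, window):
    """Centered moving average; edges use only the samples available."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def hysteresis(signal, high, low):
    """Keep runs of values >= low that contain at least one value >= high."""
    mask = [False] * len(signal)
    i, n = 0, len(signal)
    while i < n:
        if signal[i] >= low:
            j = i
            while j < n and signal[j] >= low:
                j += 1
            if max(signal[i:j]) >= high:
                mask[i:j] = [True] * (j - i)
            i = j
        else:
            i += 1
    return mask


def close_gaps(mask, max_gap):
    """Fill False runs of length <= max_gap that have True on both sides."""
    out = list(mask)
    i, n = 0, len(out)
    while i < n:
        if out[i]:
            i += 1
            continue
        j = i
        while j < n and not out[j]:
            j += 1
        if 0 < i and j < n and (j - i) <= max_gap:
            out[i:j] = [True] * (j - i)
        i = j
    return out


def segments(mask):
    """Return (start, end) pairs, end inclusive, for each run of True."""
    runs, start = [], None
    for i, on in enumerate(list(mask) + [False]):  # sentinel closes a trailing run
        if on and start is None:
            start = i
        elif not on and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs


def centers(fractions, window=3, high=0.5, low=0.3, max_gap=1):
    """Smooth, threshold, close, segment, and return segment centers in px."""
    smooth = moving_average(fractions, window)
    mask = close_gaps(hysteresis(smooth, high, low), max_gap)
    return [(a + b) / 2 for a, b in segments(mask)]


fractions = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
print(centers(fractions))             # [4.5, 13.5]
print(centers(fractions, max_gap=2))  # [8.5]
```

In the first call, smoothing bridges the one-column hole at index 5, so two objects are reported. In the second call, the larger gap-fill window merges both objects into one, which illustrates why the closing width has to stay below the spacing between objects.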

Hand Tracking

  1. Apply skin mask (YCrCb + HSV)
  2. Compute per-column fraction
  3. Apply threshold
  4. Morphological filtering
  5. Select largest segment
  6. Compute hand center
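Steps 5 and 6 reduce a thresholded column mask to a single position. Below is a minimal sketch, assuming the mask is a list of booleans with one entry per ROI column; hand_center is a hypothetical name, and ties between equally long segments are resolved here in favor of the left-most one.

```python
def hand_center(mask):
    """Return the center column of the longest True run, or None if there is none."""
    best = None   # (length, start, end) of the longest run seen so far
    start = None  # start index of the run currently being scanned
    for i, on in enumerate(list(mask) + [False]):  # sentinel closes a trailing run
        if on and start is None:
            start = i
        elif not on and start is not None:
            length = i - start
            if best is None or length > best[0]:
                best = (length, start, i - 1)
            start = None
    if best is None:
        return None
    return (best[1] + best[2]) / 2


mask = [False, True, True, False, False, True, True, True, True, True, False]
print(hand_center(mask))  # 7.0
```

Keeping only the largest segment discards small skin-colored blobs, at the cost of tracking a single hand.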

Console output example:

OBJ  RED:2 [12.48, 38.22] cm | GREEN:1 [55.73] cm
HAND 24.15 cm (x=118px)

Important Parameters

Parameter        Description
--band           Line-scan band height
--smooth         Moving average smoothing
--frac-th        Minimum color fraction
--close-w        Morphological closing window
--gap-fill       Merge small gaps
--s-min          Minimum saturation
--v-min          Minimum brightness
--only-changes   Print only when values change
--change-th      Change threshold
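One way to read --only-changes together with --change-th is as a gate that suppresses output until a value has moved by more than the threshold. The sketch below illustrates that reading only. The unit of the threshold, the comparison against the last printed value, and the name changed are all assumptions and are not taken from zed_online.py.

```python
def changed(previous, current, threshold):
    """True if the number of values differs or any value moved by more than threshold."""
    if previous is None or len(previous) != len(current):
        return True
    return any(abs(a - b) > threshold for a, b in zip(previous, current))


last = None
printed = []
for reading in ([118], [120], [121], [130], [130, 40]):
    if changed(last, reading, threshold=6):
        printed.append(reading)
        last = reading  # compare against the last printed reading, not the last frame

print(printed)  # [[118], [130], [130, 40]]
```

Comparing against the last printed reading prevents slow drift from going unreported, since small per-frame steps still accumulate until they exceed the threshold.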

Troubleshooting

Problem                 Solution
No preview window       OpenCV was likely built without Qt; avoid GUI overlay functions
Frame drops             Use MJPG format (--fmt MJPG)
Wrong image             Use --half right
Detection flickers      Increase smoothing (--smooth)
White labels detected   Increase saturation threshold (--s-min)
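The saturation fix works because white and gray surfaces have low saturation regardless of their hue value. A minimal sketch of a per-pixel color test follows, using OpenCV's 8-bit HSV convention (H in 0..179, S and V in 0..255). The hue ranges and default thresholds are illustrative assumptions, not the values used by the scripts, and classify_hsv is a hypothetical name.

```python
def classify_hsv(h, s, v, s_min=80, v_min=60):
    """Label one OpenCV-style HSV pixel as 'red', 'green', or None."""
    if s < s_min or v < v_min:
        return None  # too pale or too dark to count as a colored object
    if h <= 10 or h >= 170:
        return "red"  # red straddles the hue wrap-around at 0/179
    if 40 <= h <= 85:
        return "green"
    return None


print(classify_hsv(0, 20, 250))    # None  (white label: bright but unsaturated)
print(classify_hsv(0, 200, 200))   # red
print(classify_hsv(60, 180, 150))  # green
```

Raising s_min rejects more washed-out pixels, but setting it too high can also drop real objects under strong lighting.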

Summary

The ZED 2i vision system:

  • Uses V4L2 (no CUDA required)
  • Uses line-scan detection
  • Tracks objects and hand position
  • Converts positions to centimeters
  • Runs in real time
  • Provides input to the DNF controller