# Data Annotation for Traffic Video Analysis

## Summary of Data Preprocessing (Notebook 01)

### What Was Done
A preprocessing workflow was set up on the HPC tower (running Debian Linux 24.04) to extract frames from traffic camera recordings.

### Recording System
- **7 traffic cameras** in Atlanta recording continuously
- **Local storage**: `/home/trauco/traffic-recordings/`
- **Video specifications**: 480x270 pixels, 15 fps, H.264, MP4 format
- **Recording structure**: 15-minute video files created every 70 seconds with 60-second content

### Camera Locations and Frame Distribution

| Camera ID | Location | Frames Extracted |
|-----------|----------|------------------|
| ATL-0610 | [10th Street and Monroe Drive](https://www.google.com/maps/search/10th+Street+and+Monroe+Drive,+Atlanta,+GA) | 1,290 |
| ATL-0907 | [Piedmont Avenue and 14th Street](https://www.google.com/maps/search/Piedmont+Avenue+and+14th+Street,+Atlanta,+GA) | 1,269 |
| ATL-0972 | [Peachtree Street and 5th Street](https://www.google.com/maps/search/Peachtree+Street+and+5th+Street,+Atlanta,+GA) | 1,290 |
| ATL-0997 | [West Peachtree Street and 5th Street](https://www.google.com/maps/search/West+Peachtree+Street+and+5th+Street,+Atlanta,+GA) | 1,290 |
| ATL-0998 | [West Peachtree Street and 17th Street](https://www.google.com/maps/search/West+Peachtree+Street+and+17th+Street,+Atlanta,+GA) | 1,139 |
| ATL-0999 | [West Peachtree Street and 14th Street](https://www.google.com/maps/search/West+Peachtree+Street+and+14th+Street,+Atlanta,+GA) | 1,273 |
| ATL-1005 | [Peachtree Street and 12th Street](https://www.google.com/maps/search/Peachtree+Street+and+12th+Street,+Atlanta,+GA) | 1,290 |
| **Total** | | **8,841** |

### Preprocessing Script (`preprocess_daytime_videos.py`)
- Processes every video recorded during the 13:00 hour (EST)
- Extracts frames from first 60 seconds of each video
- Samples at 2 fps (down from original 15 fps)
- Applies quality thresholds: brightness > 30, blur score > 100
- Saves frames as JPG files with metadata.json
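
The per-video count seen in this notebook (129 frames) is slightly higher than the 120 frames that exact 2 fps sampling over 60 seconds would give. One possible explanation, an assumption and not something confirmed from `preprocess_daytime_videos.py`, is that the script keeps every N-th frame with an integer step. The helper name `sampled_frame_indices` below is hypothetical.

```python
def sampled_frame_indices(source_fps=15, target_fps=2, seconds=60):
    """Indices of frames kept when sampling with an integer frame step.

    Assumption: the step is floor(source_fps / target_fps), i.e. 15 // 2 = 7,
    so the effective rate is about 2.14 fps, not exactly 2 fps.
    """
    step = max(1, source_fps // target_fps)
    total_frames = source_fps * seconds  # 900 frames in the first 60 s
    return list(range(0, total_frames, step))


indices = sampled_frame_indices()
print(len(indices), indices[:4])  # 129 [0, 7, 14, 21]
```

Under that assumption the count matches the 129 frames per video directory listed later in the notebook.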

### Results
- **Videos processed**: 69 (all 13:00 EST videos from June 9-12)
- **Total frames extracted**: 8,841
- **Average frames per video**: ~128
- **Output structure**: `frames/[CAMERA_ID]/[DATE]/[VIDEO_NAME]/`
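
As a quick consistency check, the per-camera counts from the table above can be summed and divided by the number of processed videos. This is only arithmetic on the figures already reported here; the dictionary name is illustrative.

```python
# Frame counts copied from the "Camera Locations and Frame Distribution" table
frames_per_camera = {
    "ATL-0610": 1290,
    "ATL-0907": 1269,
    "ATL-0972": 1290,
    "ATL-0997": 1290,
    "ATL-0998": 1139,
    "ATL-0999": 1273,
    "ATL-1005": 1290,
}
videos_processed = 69

total_frames = sum(frames_per_camera.values())
average_per_video = total_frames / videos_processed

print(total_frames, round(average_per_video, 1))  # 8841 128.1
```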

### Quality Metrics
*Note: The quality thresholds (brightness > 30, blur score > 100) were determined using a script that recorded a single video at night. Because the workflow has been refined since then, the thresholds have drifted out of calibration; they are not reliable and will be corrected in the next iteration.*
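
To make the two thresholds concrete, here is a minimal dependency-free sketch. It assumes that "brightness" is the mean grayscale value (0-255) and that "blur score" is the variance of a 4-neighbour Laplacian, which is a common sharpness measure; the source does not say how the script actually computes either metric, and the function names are hypothetical. A real pipeline would more likely use an image library than nested lists.

```python
from statistics import mean, pvariance


def brightness(gray):
    """Mean pixel value of a grayscale image given as a list of rows."""
    return mean(p for row in gray for p in row)


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Low values indicate few sharp edges, i.e. a blurry or flat frame.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def passes_quality(gray, min_brightness=30, min_blur=100):
    """Apply the thresholds quoted above (assumed to be strict inequalities)."""
    return brightness(gray) > min_brightness and laplacian_variance(gray) > min_blur


# A flat gray frame has no edges; a checkerboard has many
flat = [[128] * 8 for _ in range(8)]
checker = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
print(passes_quality(flat), passes_quality(checker))  # False True
```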

### Output Files
- Frame files: `frame_000000.jpg`, `frame_000001.jpg`, etc.
- Metadata: `metadata.json` per video folder
- Summary: `preprocessing_summary.csv` with all processing statistics

Note: Frames were intentionally extracted from only the first 60 seconds of each 15-minute video to create a manageable dataset.

# Local Annotation Setup for Multi-Object Detection

## Install LabelImg

```bash
conda activate traffic-vision
pip install labelImg
```

## Launch

```bash
labelImg /home/trauco/data-science-sad/frames/
```

## Create classes.txt
```
vehicle
pedestrian
```

## Annotation Output
- Format: YOLO 
- Each `.jpg` gets a `.txt` file
- Contains: `class_id x_center y_center width height` (normalized 0-1)
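
The normalized format can be illustrated with a small conversion between pixel boxes and YOLO label lines. This is a sketch under the assumption of 480x270 frames (the video resolution listed earlier); the helper names and the example box are made up for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w=480, img_h=270):
    """Convert a pixel-space box to a YOLO label line (values normalized to 0-1)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def from_yolo_line(line, img_w=480, img_h=270):
    """Convert a YOLO label line back to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_center, y_center, width, height = line.split()
    x_center, width = float(x_center) * img_w, float(width) * img_w
    y_center, height = float(y_center) * img_h, float(height) * img_h
    return (
        int(class_id),
        x_center - width / 2,
        y_center - height / 2,
        x_center + width / 2,
        y_center + height / 2,
    )


# Hypothetical vehicle box (class 0) spanning x 120-240 and y 90-180
line = to_yolo_line(0, 120, 90, 240, 180)
print(line)  # 0 0.375000 0.500000 0.250000 0.333333
```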

## Note
Future iterations will use CVAT with Docker (industry standard).

# Prepare Single Camera Dataset for Annotation

This cell prepares frames from camera **ATL-0610** ([10th Street and Monroe Drive](https://www.google.com/maps/search/10th+Street+and+Monroe+Drive,+Atlanta,+GA)) for annotation.

**Dataset details:**
- Camera: ATL-0610
- Date: June 9, 2025
- Time: 13:00 EST (1 PM)
- Expected frames: ~129
- Coverage: First 60 seconds of the 13:00 recording

This creates a manageable dataset from a single location and timepoint to establish our annotation workflow.

In [2]:
```python
# setup paths
import shutil
from pathlib import Path

frames_dir = Path("/home/trauco/data-science-sad/frames")
annotation_dir = Path("/home/trauco/data-science-sad/annotation_sample")
annotation_dir.mkdir(exist_ok=True)

# one camera, one day
source = frames_dir / "ATL-0610" / "2025-06-09"
frames = sorted(source.rglob("*.jpg"))

print(f"Found {len(frames)} frames from ATL-0610 on 2025-06-09")

# copy frames; prefix each file with its video directory name, because
# rglob spans several video directories that reuse the same frame file names
for frame in frames:
    shutil.copy(frame, annotation_dir / f"{frame.parent.name}_{frame.name}")

# create classes
with open(annotation_dir / "classes.txt", "w") as f:
    f.write("vehicle\npedestrian")

print(f"Ready to annotate in: {annotation_dir}")
```

```
Found 387 frames from ATL-0610 on 2025-06-09
Ready to annotate in: /home/trauco/data-science-sad/annotation_sample
```


# List All Video Directories for ATL-0610 on June 9

This cell shows all video directories containing frames for camera ATL-0610 on June 9, 2025.

In [4]:
```python
# check directories
from pathlib import Path

frames_dir = Path("/home/trauco/data-science-sad/frames")
camera_date_path = frames_dir / "ATL-0610" / "2025-06-09"

# list all video dirs
video_dirs = sorted(d for d in camera_date_path.iterdir() if d.is_dir())

print(f"Found {len(video_dirs)} video directories for ATL-0610 on 2025-06-09:\n")
for vid_dir in video_dirs:
    frame_count = len(list(vid_dir.glob("*.jpg")))
    print(f"{vid_dir.name}: {frame_count} frames")
```

```
Found 3 video directories for ATL-0610 on 2025-06-09:

ATL-0610_20250609_131130: 129 frames
ATL-0610_20250609_132751: 129 frames
ATL-0610_20250609_134412: 129 frames
```
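
The listing explains the earlier count of 387 frames: 3 videos × 129 frames. It also matters for copying. Assuming every video directory numbers its frames from `frame_000000.jpg` (as the output file naming suggests), a flat copy keyed on the bare file name keeps only 129 distinct files. The sketch below simulates the paths without touching the disk; the variable names are illustrative.

```python
# Video directories reported above; 129 frames each, numbered from 0 (assumption)
video_dirs = [
    "ATL-0610_20250609_131130",
    "ATL-0610_20250609_132751",
    "ATL-0610_20250609_134412",
]
frames_per_video = 129

paths = [
    f"{video}/frame_{i:06d}.jpg"
    for video in video_dirs
    for i in range(frames_per_video)
]

# Bare file names collide across directories; prefixed names stay unique
flat_names = {path.split("/")[-1] for path in paths}
prefixed_names = {path.replace("/", "_") for path in paths}

print(len(paths), len(flat_names), len(prefixed_names))  # 387 129 387
```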


# Prepare ATL-0610_20250609_131130 for Annotation

**Action Required:** Review and refine `preprocess_daytime_videos.py` so that it extracts only ONE video per camera per day: the video whose start timestamp is closest to 13:00:00.

**Likely Cause:** The script filters by hour (13) instead of selecting the single video with the timestamp closest to 13:00:00.
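
A minimal sketch of the intended selection logic, assuming video names follow the `CAMERA_YYYYMMDD_HHMMSS` pattern seen in the directory listing. The function names are hypothetical and this is not the code in `preprocess_daytime_videos.py`.

```python
from datetime import datetime


def parse_video_name(name):
    """Split 'ATL-0610_20250609_131130' into (camera_id, start datetime)."""
    camera_id, date_str, time_str = name.rsplit("_", 2)
    return camera_id, datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")


def closest_to_hour(names, hour=13):
    """Keep one video per (camera, day): the one starting closest to hour:00:00."""
    best = {}
    for name in names:
        camera_id, started = parse_video_name(name)
        target = started.replace(hour=hour, minute=0, second=0)
        delta = abs((started - target).total_seconds())
        key = (camera_id, started.date())
        if key not in best or delta < best[key][0]:
            best[key] = (delta, name)
    return sorted(name for _, name in best.values())


names = [
    "ATL-0610_20250609_131130",
    "ATL-0610_20250609_132751",
    "ATL-0610_20250609_134412",
]
print(closest_to_hour(names))  # ['ATL-0610_20250609_131130']
```

Filtering by hour alone would keep all three names above, which matches the three directories found for this camera and date.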

For now, a single video directory with 129 frames is used to keep the annotation workflow focused.

**Dataset:**
- Camera: ATL-0610 ([10th Street and Monroe Drive](https://www.google.com/maps/search/10th+Street+and+Monroe+Drive,+Atlanta,+GA))
- Video: ATL-0610_20250609_131130
- Time: 13:11:30 EST (June 9, 2025)
- Frames: 129

In [5]:
```python
# setup paths
import shutil
from pathlib import Path

frames_dir = Path("/home/trauco/data-science-sad/frames")
annotation_dir = Path("/home/trauco/data-science-sad/annotation_sample")

# clean start
if annotation_dir.exists():
    shutil.rmtree(annotation_dir)
annotation_dir.mkdir()

# copy frames
source = frames_dir / "ATL-0610" / "2025-06-09" / "ATL-0610_20250609_131130"
frames = list(source.glob("*.jpg"))

for frame in frames:
    shutil.copy(frame, annotation_dir / frame.name)

# create classes
with open(annotation_dir / "classes.txt", "w") as f:
    f.write("vehicle\npedestrian")

print(f"Copied {len(frames)} frames to {annotation_dir}")
print(f"\nNext: Run 'labelImg {annotation_dir}' in terminal")
```

```
Copied 129 frames to /home/trauco/data-science-sad/annotation_sample

Next: Run 'labelImg /home/trauco/data-science-sad/annotation_sample' in terminal
```


# Switching to CVAT Due to LabelImg Instability

LabelImg crashes repeatedly on our system, so we are moving to CVAT (Computer Vision Annotation Tool), which is the industry standard and has been more stable.

## Installing CVAT with Docker

CVAT requires Docker for local installation. This provides a stable, production-ready annotation environment.

### Prerequisites
- Docker and Docker Compose installed on Debian 24.04
- Port 8080 available

### Next Steps
1. Install Docker
2. Clone and run CVAT
3. Upload our 129 frames from ATL-0610_20250609_131130
4. Annotate with vehicle and pedestrian classes

# Install Docker on Debian 24.04

Install Docker to run CVAT for stable annotation.

```bash
# remove old docker
sudo apt remove docker docker-engine docker.io containerd runc

# use ubuntu repo
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu noble stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# install
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

# add user to group
sudo usermod -aG docker $USER
```

# Install and Run CVAT

Clone and start CVAT with Docker.
```bash
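# clone cvat (the repository now lives under the cvat-ai organization;
# the old opencv/cvat URL redirects there)
cd ~
git clone https://github.com/cvat-ai/cvat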
cd cvat

# start cvat
docker compose up -d

# check status
docker ps
```

# Docker Permission Issue

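`docker` commands fail with a permission error until the new `docker` group membership takes effect. Log out and back in, or start a shell with the new group.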


```bash
# logout and login
exit

# or restart docker group
newgrp docker

# verify docker works
docker run hello-world
```
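
The reason a re-login is needed can be checked from Python: group membership is recorded in the system group database right after `usermod`, but an already running session keeps its old group list. The sketch below compares the two using only the standard library (Unix only). The function name and the returned status strings are made up for illustration.

```python
import grp
import os
import pwd


def docker_group_status(group_name="docker"):
    """Report whether the current session can use the given group.

    Returns 'group missing', 'active', 'configured, re-login needed',
    or 'not a member'.
    """
    try:
        group = grp.getgrnam(group_name)
    except KeyError:
        return "group missing"

    try:
        user = pwd.getpwuid(os.getuid()).pw_name
    except KeyError:
        user = None  # uid without a passwd entry, e.g. in some containers

    # Groups the running process actually holds right now
    active = group.gr_gid in os.getgroups() or group.gr_gid == os.getegid()
    # Membership recorded in the group database (takes effect at next login)
    configured = user is not None and user in group.gr_mem

    if active:
        return "active"
    if configured:
        return "configured, re-login needed"
    return "not a member"


print(docker_group_status())
```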

**Run CVAT again**


```bash
cd ~/cvat
docker compose up -d
```


# Restart Terminal and Verify Docker

The terminal exit is expected. Open a new terminal to apply group changes.

```bash
# verify docker works
docker run hello-world

# if successful, start cvat
cd ~/cvat
docker compose up -d
```

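**Permission Denied**

If `docker` still reports a permission error after opening a new terminal, the following manual fix works as a temporary workaround.

```bash
# manual fix
sudo systemctl restart docker
# note: mode 666 lets every local user control Docker (root-equivalent access);
# prefer the docker group membership above and use this only as a stopgap
sudo chmod 666 /var/run/docker.sock
```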

# Fix CVAT Port 8080 Conflict

Port 8080 is already in use (likely by label-studio or another service). Change CVAT to use port 8090 instead.

```bash
# kill process on 8080
sudo kill -9 $(sudo lsof -t -i:8080)

# change cvat to port 8090
cd ~/cvat
sed -i 's/8080/8090/g' docker-compose.yml
docker compose up -d
```
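
Before killing a process or editing the compose file, it can help to confirm which ports are actually taken. A minimal sketch using only the standard library; the helper names are hypothetical and the candidate ports are the two mentioned in this section.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


def first_free_port(candidates=(8080, 8090)):
    """Return the first candidate port with no listener, or None."""
    for port in candidates:
        if not port_in_use(port):
            return port
    return None


print(first_free_port())
```

This only detects listeners reachable on the loopback address; `lsof` as used above remains the way to find out which process owns the port.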

After successful startup, access CVAT at: **http://localhost:8090**

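Credentials: admin / admin. CVAT does not ship with a default account, so this superuser has to be created once after the first startup, for example with `docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'`.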

# Create CVAT Annotation Project

## Access CVAT
```bash
# open browser
firefox http://localhost:8090 &
```

## Login
- Username: `admin`
- Password: `admin`

## Create Project
1. Click **Projects** → **Create a new project**
2. Enter:
   - Project name: `ATL-0610_Traffic_Annotation`
   - Labels:
     - Click **Add label**
     - Name: `vehicle` → **Continue**
     - Click **Add label**
     - Name: `pedestrian` → **Done**
3. Click **Submit**

## Create Task
1. Click on `ATL-0610_Traffic_Annotation` project
2. Click **Create a new task**
3. Enter:
   - Task name: `ATL-0610_20250609_131130`
   - Click **Select files**
   - Navigate to `/home/trauco/data-science-sad/annotation_sample/`
   - Select all 129 `.jpg` files
4. Click **Submit**

## Start Annotating
- Task will appear in list
- Click **Open** to begin annotation

# Data Annotation Session Summary

## Annotation Tool Setup
- **Tool**: CVAT (Computer Vision Annotation Tool)
- **Access**: http://localhost:8090
- **Installation**: Docker-based on Debian 24.04

## Dataset Annotated
- **Camera**: ATL-0610 ([10th Street and Monroe Drive](https://www.google.com/maps/search/10th+Street+and+Monroe+Drive,+Atlanta,+GA))
- **Video**: ATL-0610_20250609_131130
- **Time**: 13:11:30 EST, June 9, 2025
- **Total frames**: 129
- **Frames annotated**: 29

## Annotation Details
- **Classes defined**: 
  - vehicle
  - pedestrian
- **Objects found**: Vehicles only (no pedestrians were visible in the annotated frames)
- **Export format**: YOLO 1.1
- **Output location**: `/home/trauco/data-science-sad/annotations/yolo/`

## YOLO Export Structure
```
annotations/yolo/
├── obj.names          # Class names (vehicle, pedestrian)
├── obj.data           # Dataset configuration
├── train.txt          # List of training images
└── obj_train_data/    # Images and annotations
    ├── frame_*.jpg    # Original frames
    └── frame_*.txt    # YOLO format annotations
```
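
The export can be sanity-checked by counting annotated frames and objects per class. The sketch below assumes the layout shown above (`obj.names` plus label files in `obj_train_data/`) and treats a frame as annotated when its `.txt` file contains at least one row; the function name is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_yolo_export(root):
    """Return (number of frames with labels, objects per class name)."""
    root = Path(root)
    class_names = [
        line.strip()
        for line in (root / "obj.names").read_text().splitlines()
        if line.strip()
    ]
    annotated_frames = 0
    objects_per_class = Counter()
    for label_file in sorted((root / "obj_train_data").glob("*.txt")):
        rows = [
            line.split()
            for line in label_file.read_text().splitlines()
            if line.strip()
        ]
        if rows:
            annotated_frames += 1
        for row in rows:
            # First column is the class index into obj.names
            objects_per_class[class_names[int(row[0])]] += 1
    return annotated_frames, dict(objects_per_class)


export_root = Path("/home/trauco/data-science-sad/annotations/yolo")
if (export_root / "obj.names").exists():
    print(summarize_yolo_export(export_root))
```

For this session, the expectation would be 29 annotated frames and only the `vehicle` class in the counts.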

## Key Findings
- Traffic at 10th & Monroe was vehicle-only during the sampled period
- 480x270 resolution is grainy but sufficient for vehicle detection
- CVAT export to YOLO format works correctly

## Next Steps
- Train initial YOLO model with 29 annotated frames
- Evaluate whether more annotations are needed
- Consider automated pre-labeling for remaining frames