A single-stage object detection system built with TensorFlow that predicts bounding boxes and class probabilities for objects in images and videos in real time. The model architecture is inspired by YOLO (You Only Look Once) and is optimized for both speed and accuracy.
- Single-Stage Detection: Fast, real-time object detection with a single forward pass through the network
- Multi-Scale Detection: Detection at three different scales for improved accuracy across object sizes
- Anchor-Based Prediction: Uses pre-defined anchor boxes to improve detection of objects with different aspect ratios (see the shape sketch after this list)
- Real-Time Inference: Optimized for real-time detection on GPU and CPU hardware
- Comprehensive Training Pipeline: Complete with data loading, augmentation, and validation
- Visualization Tools: Real-time visualization of detection results with class labels and confidence scores
- Performance Metrics: Evaluation using standard metrics like mAP (mean Average Precision)
- Configurable System: Highly customizable through YAML configuration files
- Camera & Video Support: Process live camera feeds or pre-recorded videos
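
To make the multi-scale, anchor-based design concrete, here is a shape sketch under the default configuration (416×416 input, 80 classes, and a YOLO-style 3 anchors per scale — the anchor count is an assumption, since the config elides the anchor list):

```python
# Each detection scale produces a grid of predictions. With 3 anchors per
# cell and 80 classes, every anchor predicts 4 box offsets + 1 objectness
# score + 80 class scores = 85 values.
num_anchors_per_scale = 3  # assumed; config.yaml elides the anchor list
num_classes = 80
values_per_anchor = 4 + 1 + num_classes  # 85

for grid in (13, 26, 52):  # grid_sizes from config.yaml
    shape = (grid, grid, num_anchors_per_scale, values_per_anchor)
    print(shape)  # (13, 13, 3, 85), (26, 26, 3, 85), (52, 52, 3, 85)
```

Smaller grids (13×13) are responsible for large objects, while the finer 52×52 grid picks up small ones.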
```
machine_learning/
├── configs/                 # Configuration files
│   └── config.yaml          # Main configuration file
├── model_architecture/      # Model architecture definitions
│   └── model.py             # Single-stage detector implementation
├── utils/                   # Utility functions
│   ├── data_processing.py   # Data loading and preprocessing
│   ├── visualization.py     # Visualization utilities
│   └── metrics.py           # Evaluation metrics
├── training/                # Training components
│   └── train.py             # Training script
├── checkpoints/             # Model checkpoints (created during training)
├── data/                    # Dataset directory (not included)
│   ├── train.txt            # Training annotations
│   ├── val.txt              # Validation annotations
│   └── coco_classes.txt     # Class names
├── main.py                  # Main script for real-time detection
└── README.md                # Project documentation
```
- Python 3.10+
- CUDA-compatible GPU (recommended for training)
- Linux, Windows, or macOS
- Clone this repository:

  ```bash
  git clone https://github.com/yourusername/object-detection.git
  cd object-detection
  ```
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use venv\Scripts\activate
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- TensorFlow 2.18.0+
- OpenCV 4.11.0+
- NumPy
- Matplotlib
- PyYAML
- scikit-learn
The system is configured through `configs/config.yaml`. The main configuration sections are:
```yaml
model:
  name: "SingleStageDetector"
  input_size: [416, 416]        # Input image dimensions
  channels: 3                   # RGB channels
  backbone: "darknet"           # Feature extraction backbone
  num_classes: 80               # Number of object classes to detect
  grid_sizes: [13, 26, 52]      # Feature map sizes for multi-scale detection
  anchors: [...]                # Anchor box dimensions for each scale

training:
  batch_size: 8                 # Batch size for training
  epochs: 100                   # Total training epochs
  optimizer: "adam"             # Optimizer type
  learning_rate: 0.001          # Initial learning rate
  lr_scheduler: "cosine"        # Learning rate scheduler type
  checkpoint_dir: "checkpoints" # Directory for model checkpoints

data:
  train_annotations: "data/train.txt" # Path to training annotations
  val_annotations: "data/val.txt"     # Path to validation annotations
  augmentation:
    enabled: true               # Enable/disable data augmentation
    # ... augmentation parameters
  preprocess:
    normalize: true             # Normalize pixel values
    mean: [0.485, 0.456, 0.406] # RGB mean for normalization
    std: [0.229, 0.224, 0.225]  # RGB standard deviation

runtime:
  confidence_threshold: 0.5     # Minimum confidence score for detections
  nms_threshold: 0.45           # IoU threshold for non-maximum suppression
  max_boxes: 100                # Maximum boxes to detect per image
```
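
As a minimal sketch of how these values get consumed (assuming only PyYAML and the keys shown above; `load_config` is an illustrative helper, not necessarily the project's own function):

```python
import yaml

def load_config(path="configs/config.yaml"):
    """Load the YAML configuration into a plain dict."""
    with open(path, "r") as f:
        return yaml.safe_load(f)

config = load_config()
input_h, input_w = config["model"]["input_size"]          # 416, 416
num_classes = config["model"]["num_classes"]              # 80
conf_thresh = config["runtime"]["confidence_threshold"]   # 0.5
```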
The system expects annotation files in the following format:
```
path/to/image1.jpg x1,y1,x2,y2,class_id x1,y1,x2,y2,class_id ...
path/to/image2.jpg x1,y1,x2,y2,class_id x1,y1,x2,y2,class_id ...
```
Where:
- Each line starts with the path to an image
- Followed by one or more bounding box annotations
- Each bounding box is in the format `x1,y1,x2,y2,class_id`
- Coordinates are in absolute pixel values
- Class IDs start from 0
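
A minimal parser for this format might look like the sketch below (`parse_annotation_line` is a hypothetical helper, not necessarily what `utils/data_processing.py` uses):

```python
def parse_annotation_line(line):
    """Split one annotation line into an image path and a list of boxes.

    Each box is (x1, y1, x2, y2, class_id) in absolute pixel coordinates.
    """
    parts = line.strip().split()
    image_path = parts[0]
    boxes = []
    for token in parts[1:]:
        *coords, class_id = token.split(",")
        x1, y1, x2, y2 = map(float, coords)
        boxes.append((x1, y1, x2, y2, int(class_id)))
    return image_path, boxes

# Example:
# path, boxes = parse_annotation_line("img.jpg 10,20,110,220,0 30,40,90,100,2")
```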
Create a file `data/coco_classes.txt` with one class name per line:
```
person
bicycle
car
...
```
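
Loading the class names is then a one-liner (a sketch, not necessarily how the project's data loader does it):

```python
# Class IDs in the annotations index into this list: class_names[0] == "person".
with open("data/coco_classes.txt") as f:
    class_names = [line.strip() for line in f if line.strip()]
```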
To train the model with default settings:

```bash
python training/train.py --config configs/config.yaml
```
```bash
# Resume training from a checkpoint
python training/train.py --config configs/config.yaml --resume checkpoints/model_epoch_050_loss_0.1234.h5

# Use specific GPU(s)
python training/train.py --config configs/config.yaml --gpu 0    # Use first GPU
python training/train.py --config configs/config.yaml --gpu 0,1  # Use multiple GPUs

# Enable debug mode for more information
python training/train.py --config configs/config.yaml --debug
```
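
For reference, a cosine learning-rate schedule of the kind named in the config can be built with standard Keras APIs. This is a sketch under the assumption that `train.py` does something equivalent; `steps_per_epoch` is a hypothetical value that depends on dataset size and batch size:

```python
import tensorflow as tf

# Values taken from the training section of config.yaml.
initial_lr = 0.001
epochs = 100
steps_per_epoch = 1000  # hypothetical; dataset_size / batch_size in practice

schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=initial_lr,
    decay_steps=steps_per_epoch * epochs,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```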
Training produces:

- Trained model checkpoints in the `checkpoints/` directory
- Training logs in `training.log`
- Training metrics visualizations in `checkpoints/figures/`
```bash
# Run detection on a live camera feed
python main.py --checkpoint checkpoints/your_model.h5 --input 0  # Use camera index 0

# Run detection on a video file and save the result
python main.py --checkpoint checkpoints/your_model.h5 --input path/to/video.mp4 --output results/output.mp4 --save

# Adjust detection confidence
python main.py --checkpoint checkpoints/your_model.h5 --confidence 0.7

# Adjust NMS threshold
python main.py --checkpoint checkpoints/your_model.h5 --nms 0.5

# Disable display (for headless systems)
python main.py --checkpoint checkpoints/your_model.h5 --no-display --save
```
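
The `--confidence` and `--nms` flags map onto a standard post-processing step: filter low-scoring boxes, then apply non-maximum suppression. A minimal sketch of that step, assuming per-box scores and corner-format coordinates (illustrative, not the exact code in `main.py`):

```python
import tensorflow as tf

def postprocess(boxes, scores, conf_thresh=0.5, nms_thresh=0.45, max_boxes=100):
    """Filter low-confidence boxes, then apply non-maximum suppression.

    boxes:  float tensor of shape (N, 4) as (y1, x1, y2, x2)
    scores: float tensor of shape (N,)
    """
    keep = scores >= conf_thresh  # confidence filtering
    boxes = tf.boolean_mask(boxes, keep)
    scores = tf.boolean_mask(scores, keep)
    selected = tf.image.non_max_suppression(
        boxes, scores,
        max_output_size=max_boxes,
        iou_threshold=nms_thresh,
    )
    return tf.gather(boxes, selected), tf.gather(scores, selected)
```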
- Training: NVIDIA GPU with at least 8GB VRAM (16GB+ recommended for larger batch sizes)
- Inference: NVIDIA GPU with 4GB+ VRAM, or CPU for slower inference
The system requires:
- CUDA 11.8+ (for TensorFlow 2.18.0)
- cuDNN 8.6+
Once CUDA is installed, the scripts automatically enable GPU memory growth to avoid consuming all VRAM. You can specify which GPU(s) to use with the `--gpu` argument.
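
In TensorFlow, that memory-growth behavior is typically enabled like this (a sketch of the standard API, assumed to match what the scripts do):

```python
import tensorflow as tf

# Enable memory growth on every visible GPU so TensorFlow allocates
# VRAM incrementally instead of claiming it all up front.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```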
- Use smaller input sizes (e.g., `[320, 320]`) for faster inference
- Use larger input sizes (e.g., `[608, 608]`) for more accurate detections
- Adjust the confidence threshold to tune the precision-recall trade-off
- Export to TensorFlow Lite or ONNX for deployment on edge devices
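
As an example of the TensorFlow Lite route (a sketch using the standard converter API; the checkpoint name is a placeholder, and the model is assumed to load without custom objects):

```python
import tensorflow as tf

# Load a trained Keras checkpoint and convert it to TensorFlow Lite.
model = tf.keras.models.load_model("checkpoints/your_model.h5", compile=False)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional weight quantization

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```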
This implementation is inspired by the YOLO (You Only Look Once) family of single-stage detectors.
This project is licensed under the MIT License - see the LICENSE file for details.