Skip to content

xclusivecyberdev/Computer-Vision---Object-Detection-Tracking

Repository files navigation

Object Detection & Tracking System

A comprehensive real-time object detection and tracking system built with deep learning. This system supports multiple state-of-the-art detection models (YOLO, SSD, Faster R-CNN), object tracking algorithms, custom model training, and provides both REST API and web interface for easy integration.

Features

  • Multiple Detection Models

    • YOLOv8 (Nano, Small, Medium, Large, XL variants)
    • Faster R-CNN with ResNet-50 FPN backbone
    • SSD300 with VGG16 backbone
  • Object Tracking

    • SORT (Simple Online and Realtime Tracking)
    • DeepSORT with appearance features
    • Multi-object tracking with trajectory visualization
  • Real-time Processing

    • Live video stream processing
    • Webcam support
    • Video file processing with output
    • Frame-by-frame image detection
  • Analytics & Counting

    • Object counting and tracking
    • Class-wise statistics
    • Trajectory heatmap generation
    • FPS monitoring
  • Custom Model Training

    • YOLO model training pipeline
    • Dataset organization utilities
    • Model export (ONNX, TensorRT, TorchScript)
  • Model Evaluation

    • Precision and Recall metrics
    • mAP (mean Average Precision) calculation
    • FPS benchmarking
  • Performance Optimization

    • ONNX export for cross-platform deployment
    • TensorRT optimization for NVIDIA GPUs
    • FP16 half-precision support
    • Model quantization
  • REST API

    • Image detection endpoint
    • Video processing endpoint
    • WebSocket streaming support
    • Analytics API
  • Web Interface

    • Live webcam detection
    • Image upload and visualization
    • Video processing with download
    • Real-time analytics dashboard

Installation

Prerequisites

  • Python 3.8 or higher
  • CUDA-capable GPU (recommended for real-time performance)
  • CUDA Toolkit 11.8+ and cuDNN (for GPU acceleration)

Install from Source

# Clone the repository
git clone https://github.com/xclusivecyberdev/Computer-Vision---Object-Detection-Tracking.git
cd Computer-Vision---Object-Detection-Tracking

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .

Install with Docker (Coming Soon)

docker build -t object-detection .
docker run -p 8000:8000 object-detection

Quick Start

1. Image Detection

Detect objects in a single image:

python demo_image.py path/to/image.jpg --output result.jpg

Options:

  • --model: Model type (yolov8, faster_rcnn, ssd)
  • --variant: Model variant for YOLO (n, s, m, l, x)
  • --conf: Confidence threshold (default: 0.5)
  • --device: Device (cuda or cpu)

2. Video Processing

Process a video file:

python demo_video.py path/to/video.mp4 --output output.mp4

Use webcam (camera index 0):

python demo_video.py 0

Options:

  • --tracker: Tracker type (sort, deepsort)
  • --no-display: Disable video display
  • --model, --variant, --conf, --device: Same as image detection

3. Run REST API Server

Start the API server:

python run_api.py

The server will start at http://localhost:8000

4. Use Python API

from models import create_detector
from tracking import create_tracker
from processor import VideoProcessor
import cv2

# Create detector
detector = create_detector(
    model_type='yolov8',
    variant='n',
    confidence_threshold=0.5,
    device='cuda'
)

# Detect objects in image
image = cv2.imread('image.jpg')
detections = detector.detect(image)

print(f"Detected {len(detections['boxes'])} objects")

# Process video with tracking
processor = VideoProcessor()
stats = processor.process_video(
    source='video.mp4',
    output_path='output.mp4',
    show_display=True
)

Configuration

Edit configs/config.yaml to customize:

  • Model settings (type, confidence, IOU thresholds)
  • Tracking parameters (max age, min hits)
  • Video processing options
  • Analytics settings
  • API configuration

Example configuration:

model:
  type: "yolov8"
  variant: "n"
  confidence_threshold: 0.5
  device: "cuda"

tracking:
  type: "sort"
  max_age: 30
  min_hits: 3

video:
  output_path: "outputs/videos"
  save_output: true
  display_fps: true

Custom Model Training

Prepare Dataset

Organize your dataset in YOLO format:

dataset/
  ├── images/
  │   ├── train/
  │   ├── val/
  │   └── test/
  └── labels/
      ├── train/
      ├── val/
      └── test/

Create dataset configuration:

from training import ModelTrainer

trainer = ModelTrainer()
trainer.create_dataset_config(
    dataset_path='data/my_dataset',
    class_names=['person', 'car', 'bike'],
    output_path='data/dataset.yaml'
)

Train Model

python demo_train.py data/dataset.yaml --model yolov8n.pt --epochs 100

Options:

  • --model: Base model to start from
  • --epochs: Number of training epochs
  • --batch: Batch size
  • --imgsz: Input image size

Export Model

from training import ModelTrainer

trainer = ModelTrainer()

# Export to ONNX
trainer.export_model('runs/train/exp/weights/best.pt', format='onnx')

# Export to TensorRT
trainer.export_model('runs/train/exp/weights/best.pt', format='tensorrt')

Model Evaluation

from evaluation import ModelEvaluator
from models import create_detector

evaluator = ModelEvaluator()
detector = create_detector('yolov8')

# Prepare test data
test_data = [
    {
        'image': image,
        'annotations': [
            {'box': [x1, y1, x2, y2], 'class_id': 0},
            # ...
        ]
    }
]

# Evaluate model
results = evaluator.evaluate_model(detector, test_data)

print(f"mAP@0.5: {results['mAP']['mAP@0.5']:.3f}")
print(f"Precision: {results['precision']:.3f}")
print(f"Recall: {results['recall']:.3f}")
print(f"FPS: {results['fps']['mean_fps']:.1f}")

REST API Endpoints

Image Detection

POST /detect/image

  • Upload image and get detection results (JSON)

POST /detect/image/visualized

  • Upload image and get visualized result (image)

Video Processing

POST /detect/video

  • Upload video for processing
  • Returns statistics and download link

Analytics

GET /analytics

  • Get current analytics data

POST /reset

  • Reset analytics counters

WebSocket Streaming

WS /ws/stream

  • Real-time video streaming endpoint

Performance Optimization

Export to ONNX

from optimization import ModelOptimizer

optimizer = ModelOptimizer()
onnx_path = optimizer.export_to_onnx(
    'yolov8n.pt',
    'model.onnx',
    img_size=640
)

Export to TensorRT

trt_path = optimizer.export_to_tensorrt(
    'yolov8n.pt',
    'model.engine',
    img_size=640,
    fp16=True
)

Benchmark Performance

results = optimizer.benchmark_optimization(
    original_model='yolov8n.pt',
    optimized_model='model.engine',
    test_images=images,
    num_runs=100
)

print(f"Speedup: {results['speedup']:.2f}x")

Project Structure

Computer-Vision---Object-Detection-Tracking/
├── src/
│   ├── models/              # Detection models
│   │   ├── yolo_detector.py
│   │   ├── faster_rcnn_detector.py
│   │   └── ssd_detector.py
│   ├── tracking/            # Tracking algorithms
│   │   ├── sort_tracker.py
│   │   └── deepsort_tracker.py
│   ├── training/            # Training pipeline
│   ├── evaluation/          # Evaluation metrics
│   ├── api/                 # REST API
│   ├── utils/               # Utilities
│   ├── processor.py         # Video processor
│   └── optimization.py      # Performance optimization
├── static/                  # Web interface assets
├── templates/               # HTML templates
├── configs/                 # Configuration files
├── data/                    # Data directory
├── outputs/                 # Output directory
├── demo_image.py           # Image detection demo
├── demo_video.py           # Video processing demo
├── demo_train.py           # Training demo
├── run_api.py              # API server
└── requirements.txt        # Dependencies

Supported Models

YOLO (You Only Look Once)

  • YOLOv8n: Nano - Fastest, lowest accuracy
  • YOLOv8s: Small - Balanced
  • YOLOv8m: Medium - Good accuracy
  • YOLOv8l: Large - High accuracy
  • YOLOv8x: Extra Large - Best accuracy, slowest

Faster R-CNN

  • ResNet-50 FPN backbone
  • Pre-trained on COCO dataset

SSD (Single Shot Detector)

  • SSD300 with VGG16 backbone
  • Pre-trained on COCO dataset

Performance Benchmarks

Tested on NVIDIA RTX 3080 with CUDA 11.8:

Model Input Size FPS (TensorRT FP16) mAP@0.5
YOLOv8n 640x640 120+ 0.52
YOLOv8s 640x640 85+ 0.60
YOLOv8m 640x640 55+ 0.67
Faster R-CNN 800x800 25+ 0.58
SSD300 300x300 90+ 0.50

Troubleshooting

CUDA Out of Memory

  • Reduce batch size
  • Use smaller model variant (e.g., YOLOv8n instead of YOLOv8x)
  • Lower input image resolution

Low FPS

  • Use GPU acceleration (set device: cuda)
  • Export model to TensorRT
  • Enable FP16 precision
  • Reduce input resolution
  • Skip frames (frame_skip in config)

Detection Quality Issues

  • Adjust confidence threshold
  • Use larger model variant
  • Train custom model on your specific dataset
  • Check lighting and image quality

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

MIT License - see LICENSE file for details

Citation

If you use this project in your research, please cite:

@software{object_detection_tracking,
  title={Real-time Object Detection and Tracking System},
  author={Computer Vision Team},
  year={2024},
  url={https://github.com/xclusivecyberdev/Computer-Vision---Object-Detection-Tracking}
}

Acknowledgments

  • YOLOv8 by Ultralytics
  • PyTorch and TorchVision
  • OpenCV
  • FastAPI
  • SORT and DeepSORT algorithms

Support

For issues and questions:

  • Create an issue on GitHub
  • Check documentation and examples
  • Review configuration options

Roadmap

  • Support for additional models (YOLOv9, YOLOv10)
  • Multi-camera support
  • Real-time video streaming from RTSP
  • Docker containerization
  • Cloud deployment guides (AWS, GCP, Azure)
  • Mobile deployment (TensorFlow Lite, ONNX Runtime Mobile)
  • Advanced analytics (people counting, zone crossing)
  • Database integration for analytics storage

Built with ❤️ by the Computer Vision Team

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published