Automating Traffic Compliance in India using Attention-Centric Real-Time Object Detection.
- Abstract
- Key Features
- System Architecture
- Methodology & Logic
- Tech Stack
- Performance Evaluation
- Installation
- Usage
- References & Citations
India's rapid urbanization has led to a surge in vehicular density, rendering traditional manual traffic monitoring inefficient. Manual oversight is prone to human error and corruption, and it cannot keep up with high-volume traffic around the clock.
ITSS (Intelligent Traffic Surveillance System) is a vision-based automated solution designed to enforce traffic laws with high precision. By leveraging the YOLOv12 (You Only Look Once) architecture, which introduces attention-centric mechanisms for superior real-time performance, this system detects complex violations such as red-light jumping, speeding, and helmet-less riding.
Furthermore, the system integrates an OCR (Optical Character Recognition) pipeline to extract license plates and communicates with a backend API to verify Insurance and Pollution Control (PUC) status, automatically generating e-challans for offenders.
- Red Light Violation: Detects the traffic signal state and identifies vehicles that cross the stop line while the signal is red.
- Helmet-less Rider Detection: Classifies two-wheeler riders and specifically detects the absence of a helmet on both the rider and pillion.
- Speeding Detection: Uses perspective transformation and object tracking to estimate vehicle speed in real-time.
- Plate Extraction: High-accuracy cropping of license plates from moving vehicles.
- OCR Processing: Converts plate images to text strings using advanced OCR engines.
- API Integration: Queries a central database using the extracted plate number.
- Document Verification: Checks for valid Insurance and PUC certificates.
- E-Challan Generation: Automatically creates a violation report with time, location, violation type, and evidence image.
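As a rough illustration of the last few features, the sketch below cleans up an OCR string and assembles a violation report from it. It assumes the common Indian plate layout (two-letter state code, one or two digit district number, up to three series letters, four-digit number). The names `normalize_plate`, `Challan`, and `build_challan`, as well as the `insurance_valid` and `puc_valid` response keys, are hypothetical and chosen only for this example; the project's actual API response may look different.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed layout: state code, district number, optional series letters, 4 digits.
PLATE_PATTERN = re.compile(r"^[A-Z]{2}\d{1,2}[A-Z]{0,3}\d{4}$")


def normalize_plate(raw_text: str) -> Optional[str]:
    """Strip spaces and punctuation from OCR output; return None if it does not look like a plate."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw_text).upper()
    return cleaned if PLATE_PATTERN.match(cleaned) else None


@dataclass
class Challan:
    """Violation report with the fields listed under E-Challan Generation."""
    plate: str
    violation_type: str
    timestamp: str
    location: str
    evidence_image: str
    insurance_valid: bool
    puc_valid: bool


def build_challan(plate: str, violation_type: str, location: str,
                  evidence_image: str, api_response: dict,
                  when: datetime) -> Challan:
    """Combine detection output with the (hypothetical) document-verification response."""
    return Challan(
        plate=plate,
        violation_type=violation_type,
        timestamp=when.isoformat(timespec="seconds"),
        location=location,
        evidence_image=evidence_image,
        # Missing keys are treated as "not valid" in this sketch.
        insurance_valid=bool(api_response.get("insurance_valid", False)),
        puc_valid=bool(api_response.get("puc_valid", False)),
    )
```

Rejecting strings that do not match the pattern is one way to avoid querying the backend with obviously misread plates; a real deployment might instead retry OCR on a later frame of the same tracked vehicle.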
The system operates as a pipeline:
- Input Acquisition: CCTV feed or pre-recorded video.
- Object Detection (YOLOv12): Detects classes: Vehicle, License Plate, Traffic Light, Person, Helmet.
- Object Tracking: Assigns unique IDs to vehicles across frames using SORT/DeepSORT to handle occlusion.
- Violation Logic Module:
  - If Red Light AND Vehicle Center > Stop Line → Violation.
  - If Vehicle Speed > Speed Limit → Violation.
  - If Motorbike AND Head AND No Helmet → Violation.
- Post-Processing:
  - Crop License Plate → Pass to OCR → Get String.
  - Send String to API → Receive JSON response.
- Output: Overlay visuals on video and log data to CSV/Database.
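The three rules of the Violation Logic Module can be sketched as a single function over one tracked vehicle. This is a minimal illustration, not the project's implementation: `TrackedVehicle`, `check_violations`, and the field names are hypothetical, and the red-light rule assumes a camera orientation in which a larger pixel `y` means the vehicle center has passed the stop line.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackedVehicle:
    """Per-frame state of one tracked vehicle (hypothetical structure)."""
    track_id: int
    center_y: float                   # pixel y-coordinate of the bounding-box center
    speed_kmh: float                  # speed estimate from the tracking stage
    is_motorbike: bool = False
    rider_head_visible: bool = False
    helmet_detected: bool = False


def check_violations(vehicle: TrackedVehicle, signal_is_red: bool,
                     stop_line_y: float, speed_limit_kmh: float) -> List[str]:
    """Apply the three pipeline rules and return the names of the violated ones."""
    violations = []

    # Rule 1: red light AND vehicle center beyond the stop line.
    if signal_is_red and vehicle.center_y > stop_line_y:
        violations.append("red_light")

    # Rule 2: vehicle speed above the speed limit.
    if vehicle.speed_kmh > speed_limit_kmh:
        violations.append("speeding")

    # Rule 3: motorbike AND a rider head detected AND no helmet on it.
    if (vehicle.is_motorbike and vehicle.rider_head_visible
            and not vehicle.helmet_detected):
        violations.append("no_helmet")

    return violations
```

Because the function returns a list, a single vehicle can be flagged for several violations in the same frame, and each entry can then be passed to the post-processing stage separately.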
We use YOLOv12, run through the Ultralytics framework. Unlike previous iterations, YOLOv12 introduces an attention-centric architecture that improves feature extraction in complex urban environments (e.g., crowded Indian roads). It balances the speed of CNNs with the global context awareness of Transformers.
- Reference: Ultralytics YOLOv12 Documentation
To calculate speed from a 2D video feed:
- Perspective Transform: We map the Region of Interest (ROI) on the road to a "bird's eye view."
- Euclidean Distance: Calculate the distance moved by the vehicle centroid between frames.
- Formula: $\text{Speed (km/h)} = \frac{\text{Distance (m)}}{\text{Time (s)}} \times 3.6$
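The speed calculation can be sketched in plain Python as follows. The example assumes a 3x3 homography matrix `H` that maps image pixels to road-plane coordinates in meters is already available (with OpenCV it would typically be obtained from `cv2.getPerspectiveTransform` using four calibrated road points). The function names `to_birds_eye` and `estimate_speed_kmh` and the calibration values are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Matrix3x3 = Sequence[Sequence[float]]


def to_birds_eye(H: Matrix3x3, point: Point) -> Point:
    """Apply a 3x3 homography to an image point, returning road-plane coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)


def estimate_speed_kmh(H: Matrix3x3, prev_center: Point, curr_center: Point,
                       frames_elapsed: int, fps: float) -> float:
    """Speed = distance (m) / time (s) * 3.6, measured in the bird's-eye view."""
    px, py = to_birds_eye(H, prev_center)
    cx, cy = to_birds_eye(H, curr_center)
    distance_m = math.hypot(cx - px, cy - py)   # Euclidean distance of the centroid
    time_s = frames_elapsed / fps               # frame gap converted to seconds
    return distance_m / time_s * 3.6


# Illustrative calibration: a pure scaling of 0.05 m per pixel (assumed values).
H_EXAMPLE = [[0.05, 0.0, 0.0],
             [0.0, 0.05, 0.0],
             [0.0, 0.0, 1.0]]

# Centroid moves 100 px (5 m) over 15 frames at 30 FPS (0.5 s): about 36 km/h.
speed = estimate_speed_kmh(H_EXAMPLE, (100.0, 400.0), (100.0, 300.0), 15, 30.0)
```

Measuring over several frames rather than consecutive ones is a common way to reduce jitter from small bounding-box fluctuations, at the cost of a slightly delayed estimate.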
| Component | Tool/Technology | Description |
|---|---|---|
| Language | Python 3.9+ | Core logic and scripting. |
| Detection Model | YOLOv12 | Custom trained on Indian Traffic Dataset. |
| Framework | PyTorch / Ultralytics | Model training and inference. |
| Computer Vision | OpenCV | Image processing, perspective transforms. |
| OCR | EasyOCR / Tesseract | Text extraction from license plates. |
| Tracking | DeepSORT | Multi-object tracking for vehicle persistence. |
| Backend | Flask / FastAPI | Handling API requests for challan generation. |
The model was tested on a custom dataset comprising diverse Indian road scenarios (day, night, rain, high density).
- Overall Accuracy: 93.76%
- mAP@0.5: 0.95
- Inference Speed: ~45 FPS on NVIDIA RTX 3060
- OCR Accuracy: 89% (Dependent on plate visibility)
- Python 3.8 or higher
- CUDA-enabled GPU (Recommended for real-time performance)
- Clone the Repository

      git clone https://github.com/yourusername/traffic-surveillance-system.git
      cd traffic-surveillance-system

- Create Virtual Environment (Optional but recommended)

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install Dependencies

      # Install Ultralytics for YOLOv12
      pip install ultralytics
      # Install other requirements
      pip install -r requirements.txt

- Download Weights: Place your trained `yolov12_custom.pt` file in the `weights/` directory.
To run the system on a video file:

    python main.py --source data/input_video.mp4 --weights weights/yolov12_custom.pt --conf 0.5

Arguments:
- `--source`: Path to video file or `0` for webcam.
- `--weights`: Path to the trained YOLOv12 model.
- `--conf`: Confidence threshold for detection.
- `--save-txt`: Save violation logs to a text file.
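For readers adapting the entry point, the flags above could be parsed roughly as shown below. This is only a sketch of how a script like `main.py` might define them with `argparse`; the function name `build_parser`, the help strings, and the default of `0.5` for `--conf` (taken from the example command) are assumptions, not the repository's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the four command-line flags described in the usage section."""
    parser = argparse.ArgumentParser(
        description="Run traffic violation detection on a video source.")
    parser.add_argument("--source", required=True,
                        help="Path to video file, or 0 for webcam.")
    parser.add_argument("--weights", required=True,
                        help="Path to the trained YOLOv12 model.")
    parser.add_argument("--conf", type=float, default=0.5,
                        help="Confidence threshold for detection (assumed default).")
    parser.add_argument("--save-txt", action="store_true",
                        help="Save violation logs to a text file.")
    return parser


# Parse the example command from the usage section.
args = build_parser().parse_args([
    "--source", "data/input_video.mp4",
    "--weights", "weights/yolov12_custom.pt",
    "--conf", "0.5",
])
```

Note that `argparse` exposes `--save-txt` as `args.save_txt`, and `--source` arrives as a string, so a webcam index such as `0` would need converting to an integer before being handed to a video capture call.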
This project is built upon the cutting-edge research in object detection. If you use this repository or the YOLOv12 architecture in your research, please cite the original authors:
@article{tian2025yolo12,
title={YOLOv12: Attention-Centric Real-Time Object Detectors},
author={Tian, Yunjie and Ye, Qixiang and Doermann, David},
journal={arXiv preprint arXiv:2502.12524},
year={2025}
}
@software{yolo12,
author = {Tian, Yunjie and Ye, Qixiang and Doermann, David},
title = {YOLO12: Attention-Centric Real-Time Object Detectors},
year = {2025},
url = {https://github.com/sunsmarterjie/yolov12},
license = {AGPL-3.0}
}
- Ultralytics: For the YOLO framework implementation. Docs
- OpenCV: For image processing tools.