🚗 Automated Vehicle Re-Identification & Temporal Matching System

A production-ready computer vision pipeline for matching vehicle entry/exit events using multi-engine OCR, fuzzy temporal matching, and optimized image enhancement strategies.

📋 Table of Contents

Overview
System Architecture
Key Features
Technical Stack
Installation
Usage
Configuration
Project Structure
Performance Metrics
Testing
License
Acknowledgments

🎯 Overview

This system automatically identifies and matches vehicle entry/exit events from a dataset of 2,000 images by:

Extracting metadata and bounding boxes from cloud storage headers
Detecting and cropping license plates using computer vision
Running multi-strategy OCR with Fast-ALPR and EasyOCR fallback
Applying fuzzy string matching with temporal logic for entry/exit pairing

Challenge Solved: Given randomized vehicle images with no explicit pairing information, the system achieves a 98.15% OCR success rate and a 60.31% vehicle pairing rate by combining multiple image enhancement strategies with temporal filtering.
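As a rough illustration of the first step, the sketch below parses a `Last-Modified` timestamp and a bounding box from response headers while capping concurrency with a semaphore. The header key `x-goog-meta-bbox`, the `x,y,width,height` value format, and the injected `head` coroutine are assumptions made for this example; the keys and HTTP client code actually used by the project are not shown in this README.

```python
import asyncio
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Awaitable, Callable, List, Mapping, Optional, Tuple

# Hypothetical metadata key. GCS exposes custom metadata as "x-goog-meta-<key>",
# but the key this project uses is not stated in the README.
BBOX_HEADER = "x-goog-meta-bbox"

Parsed = Tuple[Optional[datetime], Optional[Tuple[int, ...]]]


def parse_headers(headers: Mapping[str, str]) -> Parsed:
    """Return (timestamp, bbox) from response headers keyed in lower case."""
    raw_time = headers.get("last-modified")
    timestamp = parsedate_to_datetime(raw_time) if raw_time else None
    raw_bbox = headers.get(BBOX_HEADER)
    # Assumed value format: "x,y,width,height" as comma-separated integers.
    bbox = tuple(int(v) for v in raw_bbox.split(",")) if raw_bbox else None
    return timestamp, bbox


async def fetch_all(
    urls: List[str],
    head: Callable[[str], Awaitable[Mapping[str, str]]],
    limit: int = 30,
) -> List[Parsed]:
    """Call `head` for every URL with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url: str) -> Parsed:
        async with semaphore:
            return parse_headers(await head(url))

    # gather() preserves the order of `urls` in its result list.
    return await asyncio.gather(*(fetch_one(url) for url in urls))
```

Here `head` stands in for whatever issues the HEAD request (for example, a small wrapper around an aiohttp session) and returns the response headers. Injecting it keeps the parsing and concurrency logic testable without network access.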

🏗️ System Architecture

graph LR
    A[Image URLs] --> B[Metadata Extractor]
    B --> C[SQLite Database]
    C --> D[Plate Detection YOLOv8]
    D --> E[Plate Cropping]
    E --> F[Multi-Engine OCR]
    F --> G[Image Enhancement]
    G --> H[Fast-ALPR Primary]
    G --> I[EasyOCR Fallback]
    H --> J[Text Normalization]
    I --> J
    J --> K[Fuzzy Matcher]
    K --> L[Temporal Filter]
    L --> M[Greedy Pairing]
    M --> N[Submission Output]

    style F fill:#f9f,stroke:#333,stroke-width:2px
    style K fill:#bbf,stroke:#333,stroke-width:2px
    style M fill:#bfb,stroke:#333,stroke-width:2px

Pipeline Stages

Metadata Parsing 🔍
- Async HTTP HEAD requests (30 concurrent)
- Extracts Last-Modified timestamps for temporal logic
- Parses GCS metadata headers for bbox coordinates
- 99.85% bbox extraction success rate
License Plate Detection 📸
- YOLOv8 object detection for plates
- Metadata bbox fallback (99.85% coverage)
- ~115 images/second processing speed
Multi-Engine OCR 🔤
- Fast-ALPR (Primary): Specialized ALPR with ONNX runtime
- EasyOCR (Fallback): General-purpose OCR engine
- 6-Strategy Enhancement: six preprocessing variants, including CLAHE, sharpening, bilateral filtering, adaptive thresholding, and morphological operations
- Character normalization (0/O, 1/I, 8/B, 5/S)
Fuzzy Temporal Matching 🎯
- Two-phase greedy algorithm:
  - Phase 1: Exact matches (100% similarity)
  - Phase 2: Fuzzy matches (75-99% similarity)
- Temporal filtering (<72 hours between entry/exit)
- Levenshtein distance with time proximity weighting
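
The matching stage can be sketched as follows. This is a minimal illustration, not the project's `vehicle_matcher.py`: the `Event` tuple, the length-normalized Levenshtein similarity, and the tie-breaking by smaller time gap are assumptions, and the score may differ from what FuzzyWuzzy reports for the same strings.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    image_id: str
    plate: str
    timestamp: datetime


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 100], normalized by the longer string (assumed)."""
    longest = max(len(a), len(b))
    return 100.0 if longest == 0 else 100.0 * (1 - levenshtein(a, b) / longest)


def greedy_pairs(
    events: List[Event],
    min_similarity: float = 75.0,
    max_gap: timedelta = timedelta(hours=72),
) -> List[Tuple[Event, Event, float]]:
    """Pair events greedily: exact plate matches first, then fuzzy ones."""
    candidates = []
    for i, first in enumerate(events):
        for second in events[i + 1:]:
            gap = abs(second.timestamp - first.timestamp)
            score = similarity(first.plate, second.plate)
            if gap < max_gap and score >= min_similarity:
                exact = first.plate == second.plate
                candidates.append((exact, score, gap, first, second))
    # Best candidates first: higher similarity, then smaller time gap.
    candidates.sort(key=lambda c: (-c[1], c[2]))

    used, pairs = set(), []
    for phase_is_exact in (True, False):  # phase 1, then phase 2
        for exact, score, gap, first, second in candidates:
            if exact != phase_is_exact:
                continue
            if first.image_id in used or second.image_id in used:
                continue  # each image can belong to only one pair
            used.update((first.image_id, second.image_id))
            entry, exit_ = sorted((first, second), key=lambda e: e.timestamp)
            pairs.append((entry, exit_, score))
    return pairs
```

The all-pairs candidate scan is quadratic in the number of events, which is acceptable for a few thousand images but would need blocking or indexing for much larger datasets.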

✨ Key Features

🎨 Multi-Strategy Image Enhancement

Applies 6 enhancement strategies to each plate crop and keeps the best OCR result. The strategies include:

CLAHE: Contrast-limited adaptive histogram equalization
Sharpening: Kernel-based edge enhancement
Bilateral: Noise reduction while preserving edges
Adaptive Threshold: Binarization for varying lighting
Morphological: Closing operations for character connectivity
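
The "try every strategy, keep the best result" step might look like the sketch below. It assumes that "best" means the highest OCR confidence and that each strategy is a callable taking and returning an image; in a real pipeline those callables would wrap OpenCV operations such as `cv2.createCLAHE`, `cv2.bilateralFilter`, `cv2.adaptiveThreshold`, or `cv2.morphologyEx`. The function and parameter names are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

OcrFn = Callable[[Any], Tuple[str, float]]  # image -> (text, confidence)


def best_ocr_result(
    image: Any,
    strategies: Dict[str, Callable[[Any], Any]],
    ocr: OcrFn,
) -> Tuple[str, str, float]:
    """Run OCR on every enhanced variant; return (strategy, text, confidence).

    Returns ("", "", 0.0) when no strategy yields any text.
    """
    best = ("", "", 0.0)
    for name, enhance in strategies.items():
        try:
            text, confidence = ocr(enhance(image))
        except Exception:
            # One failing strategy should not abort the remaining ones.
            continue
        if text and confidence > best[2]:
            best = (name, text, confidence)
    return best
```

Returning the winning strategy name alongside the text makes it easy to log which enhancement helped most across a dataset.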

🧠 Intelligent Character Normalization

Handles common OCR ambiguities:

# Character pairs that OCR commonly confuses; each pair is treated as interchangeable.
character_mapping = [
    ("0", "O"),  # Zero and letter O
    ("1", "I"),  # One and letter I
    ("8", "B"),  # Eight and letter B
    ("5", "S"),  # Five and letter S
]
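
One way to apply such a mapping is to fold each ambiguous pair onto a single canonical character before comparing plates. The sketch below assumes letters are folded onto digits and that non-alphanumeric characters are dropped; the direction and the name `canonical_plate` are illustrative choices, not taken from `text_normalizer.py`.

```python
import re

# Assumed direction: look-alike letters are folded onto digits so that
# both readings of an ambiguous character compare equal.
_FOLD = str.maketrans({"O": "0", "I": "1", "B": "8", "S": "5"})


def canonical_plate(text: str) -> str:
    """Upper-case, strip separators, and fold ambiguous characters."""
    cleaned = re.sub(r"[^A-Z0-9]", "", text.upper())
    return cleaned.translate(_FOLD)
```

The canonical form is meant for comparison only, so the raw OCR text should still be stored for reporting.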

⏱️ Temporal Logic

Timestamp-based filtering prevents incorrect matches
Time proximity weighting: closer timestamps = higher confidence
Configurable max time difference (default: 72 hours)
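
To make the weighting idea concrete, the sketch below blends text similarity with a linearly decaying time weight. The linear decay and the 20% share given to time proximity are assumptions chosen for illustration; the README does not state the actual formula.

```python
def time_weight(gap_hours: float, max_hours: float = 72.0) -> float:
    """Linear decay: 1.0 at zero gap, 0.0 at or beyond max_hours."""
    if gap_hours < 0:
        raise ValueError("gap_hours must be non-negative")
    return max(0.0, 1.0 - gap_hours / max_hours)


def match_confidence(
    similarity: float,
    gap_hours: float,
    max_hours: float = 72.0,
    time_share: float = 0.2,  # assumed share of the score given to time
) -> float:
    """Blend text similarity (0-100) with time proximity into one score."""
    if gap_hours >= max_hours:
        return 0.0  # outside the temporal window: never a match
    time_score = 100.0 * time_weight(gap_hours, max_hours)
    return (1 - time_share) * similarity + time_share * time_score
```

With these defaults, two candidates with the same plate similarity are ranked by how close their timestamps are, and any pair 72 hours or more apart is rejected outright.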

📊 Production-Grade Error Handling

Async retry logic for network failures
Database transaction rollbacks
Comprehensive logging with levels (DEBUG, INFO, WARNING, ERROR)
Graceful degradation (EasyOCR fallback if Fast-ALPR fails)
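
The retry behavior could be implemented along these lines. This is a generic sketch using exponential backoff; the attempt count, delays, and the set of retried exceptions are assumptions, and `with_retries` is a hypothetical helper name.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Tuple, Type, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, asyncio.TimeoutError),
) -> T:
    """Await `operation`, retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on as exc:
            if attempt == attempts:
                logger.error("giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            logger.warning("attempt %d failed (%s); retrying in %.1fs",
                           attempt, exc, delay)
            await asyncio.sleep(delay)
    raise AssertionError("unreachable")
```

Only exceptions listed in `retry_on` are retried; anything else propagates immediately, so programming errors are not masked as network problems.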

🛠️ Technical Stack

Category	Technology
Language	Python 3.8+
Computer Vision	OpenCV 4.8+, PIL
Object Detection	Ultralytics YOLOv8
Deep Learning	PyTorch 2.0+, TorchVision
OCR	Fast-ALPR (ONNX), EasyOCR
String Matching	FuzzyWuzzy, python-Levenshtein
Async I/O	aiohttp, asyncio
Database	SQLite3
Configuration	PyYAML
Testing	pytest, pytest-asyncio

📦 Installation

Prerequisites

Python 3.8 or higher
CUDA-capable GPU (optional, for faster inference)

Setup

Clone the repository

git clone https://github.com/KPandya1903/Vehicle-Matching-System.git
cd Vehicle-Matching-System

Create virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies

pip install -r requirements.txt

🚀 Usage

Quick Start

Run the complete pipeline:

python main.py --input vehicle_images_input.txt --output submission.txt

Step-by-Step Execution

Extract Metadata Only

python main.py --phase metadata --input vehicle_images_input.txt

Run OCR Extraction

python main.py --phase ocr

Perform Matching

python main.py --phase matching

Export Submission

python main.py --phase export --output submission.txt
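
For readers who want to see how the flags above fit together, here is a sketch of an `argparse` definition that would accept them. The source of `main.py` is not shown here, so this is an illustration only: the help texts and the "omit `--phase` to run everything" behavior are assumptions inferred from the commands above.

```python
import argparse
from typing import List

PHASES = ("metadata", "ocr", "matching", "export")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Vehicle entry/exit matching pipeline (illustrative CLI)"
    )
    parser.add_argument("--phase", choices=PHASES, default=None,
                        help="run a single phase; omit to run all phases")
    parser.add_argument("--input", help="text file listing image URLs")
    parser.add_argument("--output", help="path of the submission file to write")
    return parser


def phases_to_run(args: argparse.Namespace) -> List[str]:
    """Return the selected phase, or every phase in pipeline order."""
    return [args.phase] if args.phase else list(PHASES)
```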

Configuration

Edit configs/config.yaml to customize:

OCR confidence thresholds
Matching similarity thresholds
Maximum time difference
Image enhancement strategies
Logging levels

Example:

matching:
  min_similarity_threshold: 75
  max_time_difference_hours: 72

ocr:
  fast_alpr:
    confidence_threshold: 0.5

📁 Project Structure

Vehicle-Matching-System/
├── src/                          # Source code
│   ├── data/                     # Data loading & metadata extraction
│   │   ├── __init__.py
│   │   └── metadata_extractor.py
│   ├── detection/                # YOLOv8 plate detection
│   │   └── __init__.py
│   ├── ocr/                      # Multi-engine OCR
│   │   ├── __init__.py
│   │   └── alpr_engine.py
│   ├── matching/                 # Fuzzy temporal matching
│   │   ├── __init__.py
│   │   └── vehicle_matcher.py
│   └── utils/                    # Utilities
│       ├── __init__.py
│       ├── config_loader.py
│       ├── logger.py
│       └── text_normalizer.py
├── configs/                      # Configuration files
│   └── config.yaml
├── tests/                        # Unit tests
│   ├── test_text_normalizer.py
│   └── test_matching.py
├── notebooks/                    # Jupyter notebooks (EDA, visualization)
├── assets/                       # Diagrams, screenshots
├── main.py                       # Main entry point
├── requirements.txt              # Python dependencies
├── .gitignore
├── LICENSE
└── README.md

📊 Performance Metrics

Metric	Value
Total Images	2,000
Bbox Extraction	99.85% (1,997/2,000)
OCR Success Rate	98.15% (1,960/1,997)
OCR Confidence (Avg)	87.8%
Matched Pairs	591
Match Rate	60.31% of vehicles (1,182 of 1,960 images with a plate reading)
Avg Similarity	94.28%
Avg Time Diff	6.35 hours
Processing Speed	~115 images/sec (cropping)

OCR Breakdown

Fast-ALPR (Optimized): 96.73% of successful reads (1,896 plates)
EasyOCR (Fallback): 3.27% of successful reads (64 plates)

Matching Breakdown

Exact Matches: 368 pairs (100% similarity)
Fuzzy Matches: 223 pairs (75-99% similarity)
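
The headline percentages can be re-derived from the counts in the tables above. In the snippet below, the denominators are read off the metrics table; treating the match rate as "images in a matched pair divided by images with a plate reading" is an assumption, though it reproduces the reported 60.31%.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


total_images = 2000
bboxes_extracted = 1997
plates_read = 1960
exact_pairs, fuzzy_pairs = 368, 223
matched_pairs = exact_pairs + fuzzy_pairs

print(pct(bboxes_extracted, total_images))  # 99.85 (bbox extraction)
print(pct(plates_read, bboxes_extracted))   # 98.15 (OCR success rate)
print(matched_pairs)                        # 591 (exact + fuzzy)
print(pct(2 * matched_pairs, plates_read))  # 60.31 (two images per pair)
```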

🧪 Testing

Run unit tests:

pytest tests/ -v

Run with coverage:

pytest tests/ --cov=src --cov-report=html

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🏆 Acknowledgments

Fast-ALPR: Open-source ALPR library (GitHub)
Ultralytics YOLOv8: State-of-the-art object detection
EasyOCR: Robust general-purpose OCR engine

Built with ❤️ for computer vision and intelligent systems
