A performance comparison tool for GPU-accelerated versus CPU-based histogram equalization on video frames.
This project benchmarks CUDA-accelerated histogram equalization against CPU processing. It measures FPS, memory usage, and temperature to demonstrate GPU acceleration benefits for image processing tasks.
Histogram equalization improves image contrast by redistributing pixel intensities across the full 0-255 range. It transforms low-contrast images by stretching confined histograms to use the entire intensity spectrum.
How it works:
- Calculate image histogram
- Compute cumulative distribution function (CDF)
- Map old intensity values to new stretched values
- Apply transformation to image
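The four steps above can be sketched in a few lines of dependency-free Python. This is an illustration of the textbook formulation on a small list-of-rows grayscale image, not the code used in this project; the function name `equalize` and the `levels` parameter are hypothetical.

```python
from itertools import accumulate


def equalize(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    # Step 1: count how many pixels fall on each intensity level.
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1

    # Step 2: cumulative distribution function (running sum of the histogram).
    cdf = list(accumulate(hist))
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present

    # A constant image has nothing to stretch; return a copy unchanged.
    if total == cdf_min:
        return [row[:] for row in image]

    # Step 3: build a lookup table that maps old intensities to stretched ones.
    # Levels below the darkest pixel give negative values and are clamped to 0.
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]

    # Step 4: apply the lookup table to every pixel.
    return [[lut[pixel] for pixel in row] for row in image]


# Intensities confined to 100-130 are spread across the full 0-255 range.
print(equalize([[100, 110], [120, 130]]))  # [[0, 85], [170, 255]]
```

Building the mapping once as a lookup table keeps the per-pixel work to a single index operation, which is also what makes the transformation easy to parallelize.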
pip install opencv-contrib-python numpy matplotlib seaborn psutil pynvml
python main.py
Place your input video as input1.mp4 in the project directory.
- reports/benchmark_dashboard.png - Performance comparison dashboard
- reports/benchmark_report.txt - Detailed statistics
- output_gpu/gpu_output.mp4 - GPU processed output
- output_cpu/cpu_output.mp4 - CPU processed output
gpu_benchmark/
├── main.py # Main entry point
├── gpu_process.py # GPU histogram equalization pipeline
├── cpu_process.py # CPU histogram equalization pipeline
├── report_generator.py # Dashboard and report generation
├── histoexplain.py # CLAHE examples and theory
├── histo.ipynb # Jupyter notebook with visualizations
└── input1.mp4 # Your input video
GPU pipeline:
- Upload frame to GPU memory
- Apply Gaussian blur (GPU-accelerated)
- Apply histogram equalization (CUDA kernel)
- Download result to CPU
- Track VRAM and temperature
- Write to output video
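The VRAM and temperature tracking step can be approximated with pynvml, which is in the dependency list. The sketch below is an assumption about how such a reading might be taken, not an excerpt from gpu_process.py; the helper name `read_gpu_stats` and the returned keys are hypothetical. It returns None when pynvml or an NVIDIA driver is unavailable, so it is safe to call on a CPU-only machine.

```python
try:
    import pynvml
except ImportError:  # pynvml not installed: treat as "no GPU stats available"
    pynvml = None


def read_gpu_stats(device_index=0):
    """Return used VRAM (MB) and GPU temperature (C), or None if unavailable."""
    if pynvml is None:
        return None
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # No NVIDIA driver or NVML library on this machine.
        return None
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        memory = pynvml.nvmlDeviceGetMemoryInfo(handle)  # values in bytes
        temperature = pynvml.nvmlDeviceGetTemperature(
            handle, pynvml.NVML_TEMPERATURE_GPU
        )
        return {
            "vram_used_mb": memory.used / (1024 ** 2),
            "temperature_c": temperature,
        }
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()


print(read_gpu_stats())
```

For per-frame sampling it would be cheaper to call `nvmlInit()` once before the loop and `nvmlShutdown()` once after it, rather than on every reading as this self-contained helper does.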
CPU pipeline:
- Read frame from video
- Apply Gaussian blur (CPU)
- Apply histogram equalization (CPU)
- Track RAM usage
- Write to output video
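Both pipelines follow the same loop shape: take a frame, process it, and record how long the whole run took. A minimal timing harness along those lines is sketched below. The names `benchmark` and `process_frame` are hypothetical, and whether the project's compute time includes video I/O is not stated here, so this version simply times everything inside the loop.

```python
import time


def benchmark(frames, process_frame):
    """Run process_frame over every frame and report count, seconds, and FPS."""
    count = 0
    start = time.perf_counter()  # monotonic, high-resolution clock
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length measurement on very short runs.
    fps = count / elapsed if elapsed > 0 else float("inf")
    return {"frames": count, "seconds": elapsed, "fps": fps}


# Stand-in workload: any iterable of frames and any per-frame callable will do.
result = benchmark(range(1000), lambda frame: frame * 2)
print(result["frames"], "frames processed")
```

Passing the per-frame work in as a callable means the same harness can time the GPU and the CPU path, which keeps the two measurements comparable.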
The benchmark generates a 6-panel dashboard:
- GPU Compute Time - Total processing time in seconds
- FPS Comparison - GPU vs CPU frame rates
- Memory Timeline - VRAM and RAM usage per frame
- Speedup Factor - Overall GPU acceleration multiplier
- Temperature Timeline - GPU and CPU temperature during processing
- Before/After - Visual comparison of histogram equalization effect
GPU:
Frame Count: 1000 frames
Compute Time: 5.234 seconds
FPS: 191.07 frames/sec
Peak VRAM Usage: 2048 MB
CPU:
Frame Count: 1000 frames
Compute Time: 26.891 seconds
FPS: 37.19 frames/sec
Peak RAM Usage: 512 MB
Speedup:
GPU is 5.14× faster than CPU
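The figures in the sample report are related by simple ratios: FPS is frame count divided by compute time, and the speedup is CPU time divided by GPU time. The sketch below reproduces that arithmetic from the sample numbers; the function name `summarize` is hypothetical and is not taken from report_generator.py.

```python
def summarize(frame_count, gpu_seconds, cpu_seconds):
    """Derive FPS for each path and the GPU speedup from raw timings."""
    return {
        "gpu_fps": frame_count / gpu_seconds,
        "cpu_fps": frame_count / cpu_seconds,
        # Same frame count on both sides, so the time ratio is the speedup.
        "speedup": cpu_seconds / gpu_seconds,
    }


stats = summarize(1000, 5.234, 26.891)
print(f"GPU is {stats['speedup']:.2f}x faster than CPU")  # GPU is 5.14x faster than CPU
```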
CLAHE (Contrast Limited Adaptive Histogram Equalization) is an adaptive variant of histogram equalization:
- Divides the image into a grid of small tiles (for example, 8×8)
- Applies equalization locally per tile
- Limits contrast to prevent noise amplification
- Better for medical imaging and satellite data
See histoexplain.py for CLAHE implementation examples.
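The contrast-limiting step is what separates CLAHE from plain per-tile equalization: before the CDF is computed, each tile's histogram is clipped at a limit and the excess counts are spread over all bins. The sketch below shows one simple way to do that for a single tile's histogram. It is an illustration only, not the contents of histoexplain.py, and the name `clip_histogram` and its parameters are hypothetical.

```python
def clip_histogram(hist, clip_limit):
    """Clip each bin at clip_limit and redistribute the excess evenly."""
    # Total number of counts that sit above the limit.
    excess = sum(max(count - clip_limit, 0) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]

    # Spread the excess evenly; hand out any remainder one count at a time.
    share, remainder = divmod(excess, len(hist))
    clipped = [count + share for count in clipped]
    for i in range(remainder):
        clipped[i] += 1
    return clipped


# One dominant bin (a flat, noisy region) is flattened; the pixel total is kept.
print(clip_histogram([12, 0, 0, 0], clip_limit=4))  # [6, 2, 2, 2]
```

A flatter histogram gives a gentler CDF slope, which is why clipping limits how strongly noise in near-uniform regions is amplified. Note that after redistribution a bin can end up slightly above the limit again; this simple one-pass version accepts that.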
- Typical speedup: 3-15× depending on GPU model
- Use high-resolution video (1080p+) for maximum GPU advantage
- Close other GPU applications before benchmarking
- Monitor with nvidia-smi during execution
- Python 3.8+
- NVIDIA GPU with CUDA support (or CPU fallback)
- 4GB RAM minimum
- OpenCV, NumPy, Matplotlib, psutil, pynvml
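CUDA init failed: Install NVIDIA drivers and CUDA toolkit. Verify with nvidia-smi. Also check that your OpenCV build has CUDA enabled; the prebuilt opencv-contrib-python wheels on PyPI are CPU-only, so GPU processing needs an OpenCV build compiled with CUDA support.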
Cannot open input video: Ensure input1.mp4 exists. Try different formats (avi, mov).
Low FPS: Use high-resolution video or SSD storage. Close other GPU applications.
MIT License