Image Convolution OpenCL

A performance comparison study between CPU-based scalar processing and GPU-accelerated OpenCL implementation for image convolution operations using Sobel edge detection filters.

Overview

This project implements and compares two approaches for applying convolution filters to images:

CPU Implementation: Traditional scalar processing using OpenCV
GPU Implementation: Parallel processing using OpenCL kernels

The assignment demonstrates the performance benefits of parallel computing by applying a 3x3 Sobel filter to a dataset of images and measuring execution times.

Dependencies

Required Libraries

OpenCV 4.x: Image processing and I/O operations
OpenCL: GPU computing framework
C++17: Standard library features (filesystem)

System Requirements

OpenCL compatible GPU (NVIDIA, AMD, or Intel)
OpenCL drivers installed
CMake 3.10 or higher
GCC/Clang with C++17 support

Installation

Ubuntu/Debian

# Install OpenCV
sudo apt update
sudo apt install libopencv-dev

# Install OpenCL
sudo apt install opencl-headers ocl-icd-opencl-dev

# For NVIDIA GPUs
sudo apt install nvidia-opencl-dev

# For AMD GPUs
sudo apt install mesa-opencl-icd

Building the Project

git clone https://github.com/yourusername/parallel-image-convolution.git
cd parallel-image-convolution

# Create build directory
mkdir build && cd build

# Configure and build
cmake ..
make

# Or compile directly
g++ -std=c++17 -o scalar ../src/Scalar.cpp `pkg-config --cflags --libs opencv4`
g++ -std=c++17 -o opencl ../src/OpenCL.cpp `pkg-config --cflags --libs opencv4` -lOpenCL

Algorithm Details

Sobel Edge Detection Filter

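The implementation uses the following 3x3 kernel for vertical edge detection. Its rows are uniformly weighted, which strictly makes it a Prewitt kernel; the classic Sobel kernel doubles the middle row (2 0 -2):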

[ 1  0 -1 ]
[ 1  0 -1 ]
[ 1  0 -1 ]

CPU Implementation Features

Sequential pixel-by-pixel processing
OpenCV's copyMakeBorder for edge handling
Automatic normalization to 0-255 range
Timing measurement using chrono
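
As a rough illustration of the sequential path listed above, here is a minimal std-only sketch. It is not the code in Scalar.cpp: the helper names (convolve3x3, normalizeTo8Bit, kVerticalEdge) are hypothetical, the replicated border is an assumption standing in for whatever border type the project passes to copyMakeBorder, and min-max scaling is only one plausible reading of "automatic normalization".

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Kernel3x3 = std::array<float, 9>;

// The 3x3 kernel quoted in this README, stored row-major.
constexpr Kernel3x3 kVerticalEdge = {1.f, 0.f, -1.f,
                                     1.f, 0.f, -1.f,
                                     1.f, 0.f, -1.f};

// Apply the kernel to a row-major grayscale image, one pixel at a time.
// The kernel is not flipped (correlation, as cv::filter2D does).
// Assumption: border pixels are replicated by clamping coordinates.
std::vector<float> convolve3x3(const std::vector<float>& in, int width,
                               int height, const Kernel3x3& k) {
    std::vector<float> out(in.size(), 0.f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.f;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    const int sy = std::clamp(y + ky, 0, height - 1);
                    const int sx = std::clamp(x + kx, 0, width - 1);
                    sum += in[sy * width + sx] * k[(ky + 1) * 3 + (kx + 1)];
                }
            }
            out[y * width + x] = sum;
        }
    }
    return out;
}

// Min-max scale the filter response into the 0-255 range.
std::vector<std::uint8_t> normalizeTo8Bit(const std::vector<float>& v) {
    std::vector<std::uint8_t> out(v.size(), 0);
    if (v.empty()) return out;
    const auto [lo, hi] = std::minmax_element(v.begin(), v.end());
    const float range = *hi - *lo;
    if (range == 0.f) return out;  // flat response: leave everything at 0
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = static_cast<std::uint8_t>((v[i] - *lo) / range * 255.f + 0.5f);
    }
    return out;
}
```

Calling convolve3x3(image, width, height, kVerticalEdge) and then normalizeTo8Bit on the result gives an 8-bit edge map; the triple nested loop is exactly the work the GPU version spreads across work-items.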

GPU Implementation Features

Parallel kernel execution using OpenCL
2D NDRange with one work-item per pixel (global size width × height)
Vectorized operations with float4
Boundary checking within kernel
Memory-efficient buffer management
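
The boundary check in the kernel matters most when an explicit work-group size is used: before OpenCL 2.0, the global size must be a multiple of the local size, so hosts commonly round the image dimensions up and let the extra work-items exit early. The sketch below shows that rounding. The names roundUp, Range2D and globalSizeFor are hypothetical, and whether this project sets a local size at all is not stated in the README.

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `multiple` (multiple must be > 0).
constexpr std::size_t roundUp(std::size_t value, std::size_t multiple) {
    return ((value + multiple - 1) / multiple) * multiple;
}

struct Range2D {
    std::size_t x;
    std::size_t y;
};

// Global work size that covers a width x height image with whole work-groups.
constexpr Range2D globalSizeFor(std::size_t width, std::size_t height,
                                Range2D local) {
    return {roundUp(width, local.x), roundUp(height, local.y)};
}
```

For a 1920x1080 image and 16x16 work-groups this yields a 1920x1088 range, so the last eight rows of work-items fall outside the image and must be rejected by the kernel's bounds check.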

Performance Analysis

Key Metrics

Processing Time: Per-image execution time
Throughput: Images processed per second
Speedup: CPU time divided by GPU time (values above 1 mean the GPU is faster)
Memory Usage: Buffer allocation efficiency
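
The timing-based metrics above can be computed with a few small helpers. This is a sketch with hypothetical names (timeMs, speedup, throughput), not the project's measurement code; it uses std::chrono::steady_clock because that clock is monotonic.

```cpp
#include <chrono>
#include <cstddef>

// Run `fn` once and return the elapsed wall-clock time in milliseconds.
template <typename Fn>
double timeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Speedup of the GPU run relative to the CPU run; > 1 means the GPU won.
inline double speedup(double cpuMs, double gpuMs) {
    return cpuMs / gpuMs;
}

// Images processed per second for a batch that took `totalMs` in total.
inline double throughput(std::size_t imageCount, double totalMs) {
    return static_cast<double>(imageCount) / (totalMs / 1000.0);
}
```

For a fair comparison the GPU time should include host-device transfers, and the measured call should block until the device has finished (for example after clFinish), otherwise only the enqueue cost is timed.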

Expected Results

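The GPU implementation is expected to show roughly 10-50x speedup on large images, depending on hardware
The performance gain increases with image size, since more pixels are processed in parallel
For small images, host-device memory transfer overhead can cancel out the gain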

Technical Implementation

OpenCL Kernel Highlights

__kernel void convolution(__global const float* input, 
                         __global float* output, 
                         __constant float* filter, 
                         int width, int height, int filtersize)

Global memory access patterns optimized
Boundary condition handling
Vectorized computation using float4
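
To make the per-pixel logic behind that signature concrete, the sketch below emulates one work-item in plain C++ and runs the whole range sequentially. It is an illustration only: the actual body of Convolution.cl is not shown in this README, the function names are hypothetical, treating out-of-image neighbours as zero is an assumption, and the float4 vectorization is left out for clarity.

```cpp
#include <vector>

// Body of one work-item. gx and gy stand in for get_global_id(0) and
// get_global_id(1) in the real kernel.
void convolutionWorkItem(int gx, int gy, const float* input, float* output,
                         const float* filter, int width, int height,
                         int filterSize) {
    // Work-items in the padded part of the range do nothing.
    if (gx >= width || gy >= height) return;

    const int half = filterSize / 2;
    float sum = 0.f;
    for (int fy = 0; fy < filterSize; ++fy) {
        for (int fx = 0; fx < filterSize; ++fx) {
            const int sx = gx + fx - half;
            const int sy = gy + fy - half;
            // Assumption: neighbours outside the image contribute zero.
            if (sx < 0 || sx >= width || sy < 0 || sy >= height) continue;
            sum += input[sy * width + sx] * filter[fy * filterSize + fx];
        }
    }
    output[gy * width + gx] = sum;
}

// Sequential stand-in for enqueueing the kernel over a 2D global range.
void runNDRange(int globalX, int globalY, const std::vector<float>& input,
                std::vector<float>& output, const std::vector<float>& filter,
                int width, int height, int filterSize) {
    for (int gy = 0; gy < globalY; ++gy) {
        for (int gx = 0; gx < globalX; ++gx) {
            convolutionWorkItem(gx, gy, input.data(), output.data(),
                                filter.data(), width, height, filterSize);
        }
    }
}
```

Because each work-item reads only the input and writes a single output element, the iterations are independent, which is what allows the GPU to execute them in any order and in parallel.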

Memory Management

Input Buffer: Read-only image data
Output Buffer: Write-only result data
Filter Buffer: Constant kernel coefficients
Efficient host-device memory transfers
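
Since the kernel takes float pointers, the host has to convert 8-bit pixels to float and size three buffers accordingly. The sketch below only computes those sizes and performs the conversion; bufferSizesFor and toFloatBuffer are hypothetical helpers. In a real host program the sizes would be passed to clCreateBuffer, typically with CL_MEM_READ_ONLY for the input and filter and CL_MEM_WRITE_ONLY for the output, though the README does not confirm which flags this project uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct BufferSizes {
    std::size_t inputBytes;   // read-only image data
    std::size_t outputBytes;  // write-only result data
    std::size_t filterBytes;  // constant kernel coefficients
};

// Byte sizes of the three device buffers for a single-channel float image.
constexpr BufferSizes bufferSizesFor(std::size_t width, std::size_t height,
                                     std::size_t filterSize) {
    return {width * height * sizeof(float),
            width * height * sizeof(float),
            filterSize * filterSize * sizeof(float)};
}

// Convert 8-bit grayscale pixels to the float layout the kernel expects.
std::vector<float> toFloatBuffer(const std::vector<std::uint8_t>& pixels) {
    return std::vector<float>(pixels.begin(), pixels.end());
}
```

Storing pixels as float quadruples the data moved across the bus compared with 8-bit input, which is one reason transfer time can dominate for small images.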

Troubleshooting

Common Issues

OpenCL Platform Not Found

# Check available platforms
clinfo

# Install the OpenCL ICD loader (a vendor driver is still required)
sudo apt install ocl-icd-libopencl1

OpenCV Not Found

# Verify installation
pkg-config --modversion opencv4

# If not found, install the development package
sudo apt install libopencv-dev

Compilation Errors

Ensure C++17 support: -std=c++17
Link OpenCL library: -lOpenCL
Include OpenCV flags: pkg-config --cflags --libs opencv4

Extending the Project

Possible Enhancements

Additional Filters: Gaussian blur, Laplacian, custom kernels
Color Images: RGB channel processing
Memory Optimization: Local memory usage in kernels (the OpenCL counterpart of CUDA shared memory)
Batch Processing: Multiple images simultaneously
Performance Profiling: Detailed timing analysis
Multi-GPU Support: Distributed processing

Research Directions

Compare with CUDA implementation
Analyze memory bandwidth utilization
Study work-group size optimization
Investigate different OpenCL devices

Contributing

This is an academic project, but suggestions and improvements are welcome:

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request
