A high-performance hardware accelerator designed for image convolution operations in Convolutional Neural Networks (CNNs). This implementation provides an efficient FPGA-based solution for accelerating the computationally intensive convolution operations commonly found in deep learning inference.
This accelerator is designed to perform 2D convolution operations on image data using dedicated hardware resources including BRAM buffers, matrix multiplication units, and optimized line buffer architectures. The design focuses on maximizing throughput while minimizing resource utilization for deployment on FPGA platforms.
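The 2D convolution the hardware computes can be captured by a small software reference model (a sketch, not the project's actual golden model; the function name is illustrative and "valid" padding with a 3x3 kernel is assumed):

```python
def conv2d_3x3(image, kernel):
    """Reference 3x3 convolution over a 2D list of pixels ("valid"
    padding): one output per fully covered 3x3 neighbourhood, which is
    what the accelerator emits once its pipeline has filled."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        out_row = []
        for c in range(cols - 2):
            acc = sum(image[r + i][c + j] * kernel[i][j]
                      for i in range(3) for j in range(3))
            out_row.append(acc)
        out.append(out_row)
    return out

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # passes the centre pixel through
img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(conv2d_3x3(img, identity))  # -> [[6, 7]]
```

A model like this is what a testbench compares hardware outputs against, pixel by pixel.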
- High-throughput convolution processing - Optimized for real-time image processing
- BRAM-based buffering - Efficient memory management using block RAM resources
- Pipelined architecture - Maximizes clock frequency and data throughput
- Configurable parameters - Supports various kernel sizes and input dimensions
- Line buffer optimization - Minimizes memory access patterns for improved performance
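The line-buffer idea behind the last feature can be sketched in software (my reading of a typical two-line-buffer scheme, not necessarily the exact logic in linebufferBRAM): two FIFOs, each one image row deep, hold the previous two rows, so a full 3x3 window is available for every incoming pixel after roughly a two-row warm-up.

```python
from collections import deque

def stream_3x3_windows(pixels, row_length):
    """Streaming sketch of a two-line-buffer sliding window: storage is
    only ~2*row_length pixels, yet every 3x3 neighbourhood is produced
    as pixels arrive in row-major order."""
    line1, line2 = deque(), deque()     # previous row, row before that
    win = [[0] * 3 for _ in range(3)]   # 3x3 window shift registers
    row = col = 0
    windows = []
    for px in pixels:
        mid = line1.popleft() if row >= 1 else 0   # pixel one row up
        top = line2.popleft() if row >= 2 else 0   # pixel two rows up
        line1.append(px)
        if row >= 1:
            line2.append(mid)
        for w, v in zip(win, (top, mid, px)):      # shift window left,
            w.pop(0)                               # insert new column
            w.append(v)
        if row >= 2 and col >= 2:                  # window fully valid
            windows.append([w[:] for w in win])
        col += 1
        if col == row_length:
            col, row = 0, row + 1
    return windows
```

In hardware the two FIFOs map naturally onto BRAM, which is why this structure minimizes memory accesses: each pixel is read from external memory exactly once.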
The accelerator consists of several key components:
- ImageConv: Top-level convolution engine that orchestrates the entire operation
- bramBuffer: BRAM-based buffer management for input/output data storage
- MatrixMult: Optimized matrix multiplication unit for convolution computation
- linebufferBRAM: Line buffer implementation using BRAM for sliding window operations
├── src/
│ ├── ImageConv/ # Top-level convolution accelerator module
│ ├── bramBuffer/ # BRAM buffer management components
│ ├── MatrixMult/ # Matrix multiplication engine
│ ├── linebufferBRAM/ # Line buffer implementation
│ └── ImageConvTB/ # Testbench for the accelerator
├── tests/ # Test vectors and validation scripts
└── README.md # This file
- FPGA development tools (Vivado, Quartus, or similar)
- HDL simulator (ModelSim, Vivado Simulator, etc.)
- Clone this repository:

```shell
git clone <repository-url>
cd image-convolution-accelerator
```
- Open your FPGA development environment and add all source files from the src/ directory.
- Set ImageConv as the top-level module for synthesis.
- Configure synthesis and implementation settings based on your target FPGA device.
- Set ImageConvTB as the top module for simulation
- Uncomment the desired test from lines 60-130 of the testbench
- Verify that all tests pass and inspect the generated waveforms for correctness
The accelerator supports several configurable parameters:
| Parameter | Description | Default Value |
|---|---|---|
| PIXEL_WIDTH | Bits required to represent a pixel | 8 |
| ROW_LENGTH | Input image width in pixels | 512 |
Modify these parameters in the top-level module to match your specific application requirements.
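These two parameters also determine the on-chip buffering cost. As a rough sizing sketch (assuming the common two-row line-buffer scheme for a 3x3 kernel; the actual storage in this design may differ):

```python
PIXEL_WIDTH = 8    # bits per pixel (default)
ROW_LENGTH = 512   # input image width in pixels (default)

# A 3x3 sliding window needs the two previous rows buffered on-chip,
# so line-buffer storage is roughly:
line_buffer_bits = 2 * ROW_LENGTH * PIXEL_WIDTH
print(line_buffer_bits)           # 8192 bits
print(line_buffer_bits / 1024)    # 8.0 kilobits
```

At the defaults this is only a few kilobits, so the line buffers fit comfortably in a small number of BRAM blocks; doubling ROW_LENGTH doubles the requirement.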
The accelerator has been optimized for the following performance characteristics:
- Throughput: Can achieve 5.74 GOPs (device dependent)
- Latency: 2583 clock cycles for first output (pipeline latency)
- Resource Usage: Optimized for minimal LUT and BRAM utilization
- Clock Frequency: Achieves at least 274 MHz on Arria 10 GX FPGAs
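One way to put the clock figure in perspective is a back-of-the-envelope frame-rate estimate (assumptions: one pixel accepted per cycle at steady state, a square 512x512 frame, and the 274 MHz clock quoted above):

```python
F_CLK = 274e6          # reported clock on Arria 10 GX
WIDTH = HEIGHT = 512   # ROW_LENGTH default, assuming a square frame

# If the pipeline sustains one pixel per cycle, a frame takes
# WIDTH * HEIGHT cycles (ignoring the small fill latency):
cycles_per_frame = WIDTH * HEIGHT
fps = F_CLK / cycles_per_frame
print(round(fps))  # ~1045 frames per second
```

Actual throughput depends on memory bandwidth and any stalls on the valid/ready handshake, so treat this as an upper bound.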
```verilog
// Instantiate the Image Convolution Accelerator
ImageConv #(
    .PIXEL_WIDTH(8),
    .ROW_LENGTH(512)
) conv_accelerator (
    .clk(clk),
    .reset(rst),
    .i_f(filter),       // 3x3 filter of 8-bit pixels
    .i_valid(i_valid),
    .i_ready(i_ready),
    .i_x(i_x),          // 8-bit input pixel
    .o_valid(o_valid),
    .o_ready(o_ready),
    .o_y(o_y)           // 8-bit output pixel
);
```
The project includes a comprehensive testbench located at src/ImageConvTB.v. The testbench verifies:
- Functional correctness against software reference models
- Edge case handling (boundary conditions, overflow, etc.)
- Performance benchmarks
- Originally designed for my Reconfigurable FPGA Architecture class; after later optimizing the design for performance and area, I no longer had access to a Quartus II license to re-measure the design on an Arria 10 GX FPGA
- Based on research in hardware acceleration for deep learning
- Optimized for modern FPGA architectures
- Inspired by state-of-the-art CNN acceleration techniques