A two-staged CNN hardware accelerator using Verilog RTL for machine learning applications.

ajayraobg/2-Stage-CNN-Accelerator


2-Stage-CNN-Accelerator

This hardware accelerator speeds up the computation of a simplified two-stage convolutional neural network (CNN). The first layer extracts features from the input, and the second layer is a fully connected layer that identifies classes. The 12x12 input matrix is stored in SRAM (input memory), and the four 1x9 B vectors and eight 1x64 M vectors are also stored in SRAM (vector memory). The 8x1 output vector is written back to SRAM (output memory). The design aims to balance chip area against the delay to complete the computation. To avoid contention on the vector memory bus, all elements of the B vectors are fetched from SRAM and stored in internal registers before the computations start. Removing this contention enables a two-stage pipelined design in which the feature extraction (step 1) and class identification (step 2) computations run in parallel.
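The data flow described above can be sketched as a small software reference model, the kind of golden model one might compare RTL simulation output against. This is only an illustration under stated assumptions, not the repository's actual implementation: the README gives the matrix and vector sizes but not the windowing scheme, so the sketch assumes each 1x9 B vector is applied to non-overlapping 3x3 windows of the 12x12 input (stride 3), which yields 4 x 16 = 64 features and matches the 1x64 M vectors. The feature ordering, the absence of any activation function, and all function names are hypothetical.

```python
# Software reference model for the two-stage computation (illustrative only).
# ASSUMPTIONS not stated in the README: 3x3 windows with stride 3,
# filter-major feature ordering, and no activation between the stages.

INPUT_SIZE = 12   # 12x12 input matrix (from the README)
WINDOW = 3        # a 1x9 B vector treated as a flattened 3x3 filter (assumed)


def feature_extraction(image, b_vectors):
    """Step 1: dot each B vector with every non-overlapping 3x3 window."""
    features = []
    for b in b_vectors:  # four B vectors
        for row in range(0, INPUT_SIZE, WINDOW):
            for col in range(0, INPUT_SIZE, WINDOW):
                # Flatten the 3x3 window row by row to line up with the 1x9 vector.
                window = [image[row + r][col + c]
                          for r in range(WINDOW) for c in range(WINDOW)]
                features.append(sum(w * x for w, x in zip(b, window)))
    return features  # 4 filters * 16 windows = 64 values


def class_identification(features, m_vectors):
    """Step 2: fully connected layer, one dot product per 1x64 M vector."""
    return [sum(m * f for m, f in zip(m_vec, features)) for m_vec in m_vectors]


def run_model(image, b_vectors, m_vectors):
    """Run both steps back to back and return the 8-element output vector."""
    return class_identification(feature_extraction(image, b_vectors), m_vectors)


# Usage example with easy-to-check data: all-ones image and filters.
image = [[1] * INPUT_SIZE for _ in range(INPUT_SIZE)]
b_vectors = [[1] * 9 for _ in range(4)]
m_vectors = [[k] * 64 for k in range(8)]
features = feature_extraction(image, b_vectors)
output = run_model(image, b_vectors, m_vectors)
```

The model runs the two steps sequentially. In the hardware described above the steps overlap in a pipeline, which changes the timing but should not change the numerical result.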
