DDjackson272/CUDA-Parallel-Programming-CNN

CUDA Parallel Programming in CNN

  1. Implemented the forward pass of a neural-network convolution layer on both the CPU and the GPU.
  2. Accelerated the GPU implementation by 40% through multiple optimizations (atomic operations, matrix unrolling, etc.).
  3. Used the NVIDIA Visual Profiler to analyze the performance of the optimizations.
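
The repository's kernels are not reproduced here, so the following is only a minimal host-side sketch in C++ of two ideas from the list above: a direct convolution forward pass (a CPU reference) and the matrix-unrolling formulation, which rewrites the convolution as a matrix multiplication. The names `ConvShape`, `conv_forward_direct`, `unroll_input`, and `conv_forward_unrolled` are hypothetical, and the sketch assumes a single input image, stride 1, no padding, and row-major storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape descriptor (not taken from the repository).
// Input x:   [C][H][W]       (channels, height, width)
// Weights w: [M][C][K][K]    (output maps, channels, K x K filter)
// Output y:  [M][H_out][W_out], stride 1, no padding.
struct ConvShape {
    int C, H, W, M, K;
    int H_out() const { return H - K + 1; }
    int W_out() const { return W - K + 1; }
};

// Direct forward pass: every output element is a sum over a C x K x K window.
std::vector<float> conv_forward_direct(const std::vector<float>& x,
                                       const std::vector<float>& w,
                                       const ConvShape& s) {
    assert(s.K <= s.H && s.K <= s.W);
    const int Ho = s.H_out(), Wo = s.W_out();
    std::vector<float> y(static_cast<std::size_t>(s.M) * Ho * Wo, 0.0f);
    for (int m = 0; m < s.M; ++m)
        for (int h = 0; h < Ho; ++h)
            for (int v = 0; v < Wo; ++v) {
                float acc = 0.0f;
                for (int c = 0; c < s.C; ++c)
                    for (int p = 0; p < s.K; ++p)
                        for (int q = 0; q < s.K; ++q)
                            acc += x[(c * s.H + h + p) * s.W + v + q] *
                                   w[((m * s.C + c) * s.K + p) * s.K + q];
                y[(m * Ho + h) * Wo + v] = acc;
            }
    return y;
}

// Matrix unrolling: copy each C x K x K input window into one column of a
// (C*K*K) x (H_out*W_out) matrix.
std::vector<float> unroll_input(const std::vector<float>& x, const ConvShape& s) {
    assert(s.K <= s.H && s.K <= s.W);
    const int Ho = s.H_out(), Wo = s.W_out();
    const int rows = s.C * s.K * s.K, cols = Ho * Wo;
    std::vector<float> xu(static_cast<std::size_t>(rows) * cols);
    for (int c = 0; c < s.C; ++c)
        for (int p = 0; p < s.K; ++p)
            for (int q = 0; q < s.K; ++q) {
                const int row = (c * s.K + p) * s.K + q;
                for (int h = 0; h < Ho; ++h)
                    for (int v = 0; v < Wo; ++v)
                        xu[row * cols + h * Wo + v] =
                            x[(c * s.H + h + p) * s.W + v + q];
            }
    return xu;
}

// Unrolled forward pass: the weights, viewed as an M x (C*K*K) matrix,
// are multiplied by the unrolled input matrix.
std::vector<float> conv_forward_unrolled(const std::vector<float>& x,
                                         const std::vector<float>& w,
                                         const ConvShape& s) {
    const std::vector<float> xu = unroll_input(x, s);
    const int rows = s.C * s.K * s.K, cols = s.H_out() * s.W_out();
    std::vector<float> y(static_cast<std::size_t>(s.M) * cols, 0.0f);
    for (int m = 0; m < s.M; ++m)
        for (int col = 0; col < cols; ++col) {
            float acc = 0.0f;
            for (int r = 0; r < rows; ++r)
                acc += w[m * rows + r] * xu[r * cols + col];
            y[m * cols + col] = acc;
        }
    return y;
}
```

Both functions should return the same output for the same `x`, `w`, and `ConvShape`, so the direct version can serve as a correctness check for the unrolled one. In a GPU port, the matrix product in `conv_forward_unrolled` is the part that would typically map onto a matrix-multiplication kernel, at the cost of the extra memory used by the unrolled matrix.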
