jhson989/fast-conv

Fast Convolution Implementation via CUDA

1. Introduction

  • Implementation list
    0. Naive convolution (CPU)
      • include/conv_cpu.cuh
      • parallelized via OpenMP
    1. Naive convolution (GPU)
      • include/conv_gpu_naive.cuh
    2. GEMM (im2col)
      • include/conv_gpu_matmul.cuh
    3. (TODO) FFT
    4. (TODO) Strassen's method
    5. (TODO) Winograd's method
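To illustrate how implementations 0 to 2 relate, here is a minimal host-side C++ sketch of a direct convolution next to an im2col + GEMM formulation. It is not the repository's code: the `ConvShape` struct, the function names, the `[C][H][W]` / `[K][C][R][S]` layouts, and the stride-1, no-padding, no-filter-flip (cross-correlation) convention are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape descriptor (not from the repository): stride 1, no padding.
struct ConvShape {
    int C, H, W;   // input channels, height, width
    int K, R, S;   // output channels, filter height, filter width
    int OH() const { return H - R + 1; }
    int OW() const { return W - S + 1; }
};

// Direct convolution: one multiply-accumulate loop nest per output element.
// The (k, oy, ox) iterations are independent, so they could be spread over
// CPU threads (e.g. an OpenMP parallel for) or over GPU threads.
std::vector<float> conv_naive(const ConvShape& s,
                              const std::vector<float>& in,   // [C][H][W]
                              const std::vector<float>& w) {  // [K][C][R][S]
    assert(in.size() == static_cast<std::size_t>(s.C) * s.H * s.W);
    assert(w.size() == static_cast<std::size_t>(s.K) * s.C * s.R * s.S);
    const int OH = s.OH(), OW = s.OW();
    std::vector<float> out(static_cast<std::size_t>(s.K) * OH * OW, 0.0f);
    for (int k = 0; k < s.K; ++k)
      for (int oy = 0; oy < OH; ++oy)
        for (int ox = 0; ox < OW; ++ox) {
            float acc = 0.0f;
            for (int c = 0; c < s.C; ++c)
              for (int r = 0; r < s.R; ++r)
                for (int q = 0; q < s.S; ++q)
                    acc += in[(c * s.H + oy + r) * s.W + ox + q]
                         * w[((k * s.C + c) * s.R + r) * s.S + q];
            out[(k * OH + oy) * OW + ox] = acc;
        }
    return out;
}

// im2col: build a (C*R*S) x (OH*OW) matrix whose column j holds the input
// patch that lies under output pixel j.
std::vector<float> im2col(const ConvShape& s, const std::vector<float>& in) {
    const int OH = s.OH(), OW = s.OW();
    const int rows = s.C * s.R * s.S, cols = OH * OW;
    std::vector<float> col(static_cast<std::size_t>(rows) * cols);
    for (int c = 0; c < s.C; ++c)
      for (int r = 0; r < s.R; ++r)
        for (int q = 0; q < s.S; ++q) {
            const int row = (c * s.R + r) * s.S + q;
            for (int oy = 0; oy < OH; ++oy)
              for (int ox = 0; ox < OW; ++ox)
                  col[row * cols + oy * OW + ox] =
                      in[(c * s.H + oy + r) * s.W + ox + q];
        }
    return col;
}

// Convolution as GEMM: out (K x OH*OW) = w (K x C*R*S) * col (C*R*S x OH*OW).
std::vector<float> conv_im2col(const ConvShape& s,
                               const std::vector<float>& in,
                               const std::vector<float>& w) {
    const int M = s.K, N = s.OH() * s.OW(), P = s.C * s.R * s.S;
    const std::vector<float> col = im2col(s, in);
    std::vector<float> out(static_cast<std::size_t>(M) * N, 0.0f);
    for (int m = 0; m < M; ++m)
      for (int p = 0; p < P; ++p) {
          const float a = w[m * P + p];
          for (int n = 0; n < N; ++n)
              out[m * N + n] += a * col[p * N + n];
      }
    return out;
}
```

The im2col variant trades extra memory (each input element is copied up to R*S times) for a single dense matrix multiplication, which is the part that tuned GEMM kernels accelerate well.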

2. How to Run

  • build
    • make DEBUG=OFF
      • Skips the routine that checks the computation results
    • make DEBUG=ON
      • Runs the routine that checks the computation results
  • execute
    • make run
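A result-checking routine of the kind enabled by DEBUG=ON could look roughly like the sketch below, which compares one output buffer against a reference (for example the CPU result) element by element. The name `check_result` and the tolerance values are hypothetical and not taken from the repository; how the Makefile passes the DEBUG setting to the compiler is not shown here.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not taken from the repository): returns true when every
// element of `result` matches `reference` within a mixed absolute/relative
// tolerance. The default tolerances are illustrative assumptions.
bool check_result(const std::vector<float>& reference,
                  const std::vector<float>& result,
                  float abs_tol = 1e-4f, float rel_tol = 1e-4f) {
    if (reference.size() != result.size()) {
        std::fprintf(stderr, "size mismatch: %zu vs %zu\n",
                     reference.size(), result.size());
        return false;
    }
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const float diff = std::fabs(reference[i] - result[i]);
        const float limit = abs_tol + rel_tol * std::fabs(reference[i]);
        if (!(diff <= limit)) {  // negated comparison so that NaN also fails
            std::fprintf(stderr, "mismatch at %zu: expected %f, got %f\n",
                         i, reference[i], result[i]);
            return false;
        }
    }
    return true;
}
```

A tolerance is used instead of exact equality because different implementations may accumulate floating-point products in a different order.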
