CudaInference

CUDA neural network inference. Example: ResNet18 in source/main.cpp.

Functionality implemented:

Convolution (via im2col) - with/without bias, arbitrary padding, arbitrary stride. Uses cuBLAS and Thrust.
Linear - with/without bias. Uses cuBLAS and Thrust.
BatchNorm.
ReLU.
MaxPool - arbitrary padding, arbitrary stride.
AvgPool - arbitrary padding, arbitrary stride.
Tensor operations:
- common operations (+, -, *, /).
- transpose - arbitrary number of dimensions, arbitrary axes permutation.
- reshape.
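
To illustrate the im2col approach named in the list above, here is a minimal CPU reference sketch in plain C++. It is not the repository's CUDA code: the function names (`conv_out_size`, `im2col`, `conv2d_im2col`) and the row-major CHW layout are assumptions made for this example, and a plain loop stands in for the cuBLAS matrix multiply.

```cpp
#include <cstddef>
#include <vector>

// Output extent along one axis for a given input extent, kernel size,
// padding and stride (hypothetical helper, not taken from the repository).
inline int conv_out_size(int in, int kernel, int pad, int stride) {
    return (in + 2 * pad - kernel) / stride + 1;
}

// Unfold one CHW image into a (C*kh*kw) x (out_h*out_w) row-major matrix.
// Positions that fall into the padding stay zero.
std::vector<float> im2col(const std::vector<float>& img, int C, int H, int W,
                          int kh, int kw, int pad, int stride) {
    const int out_h = conv_out_size(H, kh, pad, stride);
    const int out_w = conv_out_size(W, kw, pad, stride);
    std::vector<float> cols(static_cast<std::size_t>(C) * kh * kw * out_h * out_w, 0.0f);
    for (int c = 0; c < C; ++c)
        for (int ky = 0; ky < kh; ++ky)
            for (int kx = 0; kx < kw; ++kx) {
                const int row = (c * kh + ky) * kw + kx;
                for (int oy = 0; oy < out_h; ++oy)
                    for (int ox = 0; ox < out_w; ++ox) {
                        const int iy = oy * stride - pad + ky;
                        const int ix = ox * stride - pad + kx;
                        if (iy < 0 || iy >= H || ix < 0 || ix >= W) continue;
                        cols[(row * out_h + oy) * out_w + ox] = img[(c * H + iy) * W + ix];
                    }
            }
    return cols;
}

// Convolution as a matrix product: weights (K x C*kh*kw) times the im2col
// matrix (C*kh*kw x out_h*out_w). The triple loop stands in for a GEMM call.
std::vector<float> conv2d_im2col(const std::vector<float>& img, int C, int H, int W,
                                 const std::vector<float>& weights, int K,
                                 int kh, int kw, int pad, int stride,
                                 const std::vector<float>* bias = nullptr) {
    const int out_hw = conv_out_size(H, kh, pad, stride) * conv_out_size(W, kw, pad, stride);
    const int depth = C * kh * kw;
    const std::vector<float> cols = im2col(img, C, H, W, kh, kw, pad, stride);
    std::vector<float> out(static_cast<std::size_t>(K) * out_hw, 0.0f);
    for (int k = 0; k < K; ++k)
        for (int p = 0; p < out_hw; ++p) {
            float acc = bias ? (*bias)[k] : 0.0f;  // optional per-channel bias
            for (int d = 0; d < depth; ++d)
                acc += weights[k * depth + d] * cols[d * out_hw + p];
            out[k * out_hw + p] = acc;
        }
    return out;
}
```

The point of the unfolding is that padding and stride are handled entirely while building the column matrix, so the heavy arithmetic reduces to one dense matrix product that a BLAS library can run.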

Features:

Inference works with arbitrary batch size.
NN weights are read from files on disk. The python directory contains the weights and the scripts that save pretrained weights to disk.
Any ResNet can be implemented with this functionality.
The result is fully equivalent to the PyTorch forward pass.
Input image must:
- be RGB image with 3 channels
- be in PPM format
- be exactly 224x224

Build:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release .. 
make -j

Usage:

./Release/cuda_proj --input ../images/cat.ppm --weights_dir ../python/weights/ --batch_size 16 --iters 100

The program fills every input in the batch with the image ../images/cat.ppm and performs 100 forward passes. Predicted labels and FPS are printed.
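
As a rough illustration of how an FPS figure can be derived from such a run, here is a minimal timing sketch. It assumes FPS means images per second (batch size times iterations divided by elapsed time), which the README does not state explicitly; `images_per_second` and `time_forward_passes` are hypothetical names, not functions from this repository.

```cpp
#include <chrono>

// Images processed per second, assuming each forward pass handles a full batch.
double images_per_second(int batch_size, int iters, double elapsed_seconds) {
    if (elapsed_seconds <= 0.0) return 0.0;
    return static_cast<double>(batch_size) * iters / elapsed_seconds;
}

// Run `forward` `iters` times and return the elapsed wall-clock seconds.
// For GPU work the callable must block until the device has finished
// (for example by synchronizing at the end), otherwise only the launch
// overhead is measured.
template <typename Forward>
double time_forward_passes(Forward&& forward, int iters) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) forward();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

With `--batch_size 16 --iters 100`, a run that takes 2 seconds would give 16 * 100 / 2 = 800 images per second under this definition.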

Benchmarks:

Benchmarks were done with batch_size == 16.

FPS:

Mode	FPS
CPU (Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz) (PyTorch, 4 threads)	48
CPU (Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz) (PyTorch, 16 threads)	81
GPU (GeForce GTX 1080 Ti) (PyTorch)	2050
GPU (GeForce GTX 1080 Ti) (This repo)	445

Memory:

Mode	Memory usage
PyTorch	1317 MB
This repo	2571 MB

BorisLestsov/CudaInference
