Commit
Considerable speedup (VGG model: 1.5x, AlexNet: 1.1x)
Optimizations focus on GPU-specific features, such as avoiding shared-memory bank conflicts, making fuller use of shared-memory bandwidth, and using vectorized data types.
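To illustrate why these techniques help, here is a minimal Python model of shared-memory banking. It is a sketch and not the code from this commit. It assumes 32 banks of 4-byte words, which is common on many GPUs but is not stated in the commit, and the helper names `bank_of`, `max_conflict_degree`, and `accesses_needed` are hypothetical.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, one 4-byte word per bank per cycle


def bank_of(word_index, num_banks=NUM_BANKS):
    """Bank that holds a given 4-byte word of shared memory."""
    return word_index % num_banks


def max_conflict_degree(row_stride, num_threads=32, num_banks=NUM_BANKS):
    """Worst-case number of threads that land on the same bank when
    thread t reads element [t][0] of a tile whose rows are
    `row_stride` words apart (a column access)."""
    hits = Counter(bank_of(t * row_stride, num_banks) for t in range(num_threads))
    return max(hits.values())


def accesses_needed(total_words, vector_width):
    """Load instructions needed to move `total_words` words when each
    instruction moves `vector_width` words (for example 4 for a float4)."""
    return -(-total_words // vector_width)  # ceiling division


if __name__ == "__main__":
    # Unpadded 32-word rows: every thread hits bank 0.
    print("stride 32:", max_conflict_degree(32))
    # One word of padding per row spreads the threads over all banks.
    print("stride 33:", max_conflict_degree(33))
    # Vectorized loads cut the instruction count by the vector width.
    print("scalar vs 4-wide:", accesses_needed(1024, 1), accesses_needed(1024, 4))
```

In this model, padding each row by one word turns a fully serialized column access into a conflict-free one, and 4-wide loads need a quarter as many load instructions. Actual speedups depend on the hardware and the kernel.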