Fix cuDNN convolution workspace: Use dedicated function to determine workspace size for algorithm #118
The workspace size recorded for the fastest algorithm may not be the workspace size required when that same algorithm is actually run. This can happen when the algorithm was profiled with different math types (FP32 and FP16) that require different workspace sizes. This fix makes an additional call to a cuDNN function to get the maximum required workspace size for the selected algorithm.
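
To illustrate the mismatch described above, here is a minimal sketch in plain C++ that models the profiling results without calling cuDNN. `PerfResult`, `MathType`, `fastest_algo`, and `max_workspace_for` are hypothetical names invented for this example and are not taken from the patch. In the actual fix the size comes from a cuDNN query rather than from a scan over stored results, so treat this only as a model of the reasoning.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical math-type tag; stands in for the FP32/FP16 profiling modes.
enum class MathType { FP32, FP16 };

// Hypothetical record of one profiling run (not a cuDNN type).
struct PerfResult {
    int algo;                     // algorithm identifier
    MathType math;                // math type used for this run
    float time_ms;                // measured runtime
    std::size_t workspace_bytes;  // workspace this run required
};

// Returns the algorithm id of the fastest run, or -1 if there are no results.
int fastest_algo(const std::vector<PerfResult>& results) {
    if (results.empty()) return -1;
    auto it = std::min_element(
        results.begin(), results.end(),
        [](const PerfResult& a, const PerfResult& b) {
            return a.time_ms < b.time_ms;
        });
    return it->algo;
}

// Returns the largest workspace any run of `algo` required, across all
// math types. Sizing the buffer from the single fastest run would miss
// a larger requirement reported for the same algorithm elsewhere.
std::size_t max_workspace_for(const std::vector<PerfResult>& results, int algo) {
    std::size_t max_bytes = 0;
    for (const PerfResult& r : results) {
        if (r.algo == algo) {
            max_bytes = std::max(max_bytes, r.workspace_bytes);
        }
    }
    return max_bytes;
}
```

In this model, if algorithm 1 is fastest in its FP16 run with a 1024-byte workspace but its FP32 run needs 4096 bytes, allocating from the fastest entry alone would under-allocate, while taking the maximum for the selected algorithm gives 4096 bytes.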