This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
[Feature Request] Cache the CUDNN convolution optimization result #10567
On every script run, the cuDNN convolution autotuning step runs again. This can take a few seconds, so I wonder whether we could cache the result locally, keyed on a hash of the MXNet + CUDA + cuDNN versions for each device ID (or whatever else could cause a change in algorithm selection)?
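A minimal sketch of what such a cache could look like (all names and paths here are hypothetical illustrations, not MXNet API): hash everything that can affect algorithm selection into a key, and persist a per-key table mapping each convolution configuration to the algorithm the cuDNN search selected:

```python
import hashlib
import json
from pathlib import Path

# Hypothetical on-disk cache for cuDNN autotune results, keyed on
# everything that could change the algorithm selection, as proposed above.
CACHE_DIR = Path.home() / ".mxnet" / "cudnn_autotune"

def cache_key(mxnet_version, cuda_version, cudnn_version, device_id):
    """Hash the environment so stale entries are never reused."""
    blob = json.dumps(
        [mxnet_version, cuda_version, cudnn_version, device_id],
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode()).hexdigest()

def load_cache(key):
    """Return {conv_signature: algo_id} for this environment, or {}."""
    path = CACHE_DIR / f"{key}.json"
    return json.loads(path.read_text()) if path.exists() else {}

def save_cache(key, table):
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    (CACHE_DIR / f"{key}.json").write_text(json.dumps(table))

def conv_signature(shape, kernel, stride, pad, dtype):
    """One cache entry per distinct convolution configuration."""
    return json.dumps([shape, kernel, stride, pad, dtype])

# Usage sketch: consult the cache before running the expensive
# cudnnFindConvolutionForwardAlgorithm search, and record the winner after.
key = cache_key("1.4.0", "10.0", "7.4.1", 0)  # illustrative version strings
table = load_cache(key)
sig = conv_signature([1, 3, 224, 224], [3, 3], [1, 1], [1, 1], "float32")
if sig not in table:
    table[sig] = 1  # stand-in for the algo enum returned by the cuDNN search
    save_cache(key, table)
```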
Comments
@eric-haibin-lin: Please label: CUDA, Feature
Just want to +1. I've talked to quite a few MXNet users who could really use this functionality.
+1
My team has an implementation of this in a fork. We'll try to contribute it back, but no promises on a timeline.
Any updates @KellenSunderland? That feature sounds very useful.
@KellenSunderland would love to know more; this is very relevant to us.
+1
This may be fixed as part of the cuDNN 8 integration? https://docs.nvidia.com/deeplearning/sdk/cudnn-api/index.html cc @DickJC123
This would be super useful, especially in distributed computing: the time required for optimization scales badly with the number of GPUs/compute nodes. Thank you for your efforts.