[MXNET-491] Use depthwise convolution by cuDNNv7 if available, updated version #11076
I still think this is too much duplicated code.
Also please correct the indentation.
Actually, I guess this is fine since we'll eventually remove all the logic for older versions of cuDNN.
@austingg I merged this before seeing your comment. Do you have any concerns?
@piiswrong We need more speed benchmarks on GPUs of different architectures, and a more accurate cuDNN version macro, like nvidia-caffe's.
…d version (apache#11076)
* Use group convolution by cuDNNv7 if available
* Fix coding style
* ident-- for #if statements
* more ident--
* more ident--
* prefer cudnnv7 depthwise convolution
I have tested MobileNetV2 on a V100.
@BiranLi Could you run some more benchmarks, e.g. with larger batch sizes and with the backward pass taken into consideration?
@austingg Yes, I have tested the same case with batch size 128.
@BiranLi Can you share some more details on how you're doing this benchmark? Thanks!
Did some extra benchmarks and verified the multi-precision training speed improvement on a single V100 GPU with MobileNet + the ImageNet dataset:
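The benchmarking questions above (warmup, larger batch sizes, including the backward pass) can be sketched with a minimal framework-agnostic timing helper. This is illustrative only — the names `benchmark`, `warmup`, and `iters` are assumptions, not part of the PR, and the actual measurements were done with MXNet's MobileNet training scripts:

```python
import time

def benchmark(fn, warmup=5, iters=50):
    """Hypothetical timing helper: run `warmup` untimed iterations first
    (to exclude one-time setup such as cuDNN algorithm selection), then
    report the mean wall-clock time per iteration in milliseconds."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()  # for training benchmarks, fn should include the backward pass
    return (time.perf_counter() - start) / iters * 1e3

if __name__ == "__main__":
    # Toy workload standing in for one training iteration.
    ms = benchmark(lambda: sum(range(1000)))
    print(f"{ms:.3f} ms/iter")
```

For GPU frameworks, `fn` would also need to synchronize the device before the timer stops, since kernel launches are asynchronous.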
I observed barely any improvement when using MXNet 1.2.1 + CUDA 8 + cuDNN 7.2.1 on a 1080 Ti.
This pull request is based on #10804, with the following further change:
it still uses explicit #if/#else/#endif statements instead of the new
effective_num_group variable for backward code-path compatibility,
because effective_num_group may lead readers to confuse the depthwise path
with standard group convolution.