Depthwise convolution #5649

Training depthwise convolution in Caffe is very slow. Is there a plan to reimplement depthwise convolution?

Comments
Do you mean the group parameter in the conv layer?
Yes, depthwise convolution is described in the paper "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications" (https://arxiv.org/abs/1704.04861).
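For reference, a minimal prototxt sketch of how a depthwise convolution is usually expressed in stock Caffe today, using the existing group parameter; the layer names and the 32-channel input are assumed for illustration only:

```
layer {
  name: "conv_dw"            # hypothetical layer name
  type: "Convolution"
  bottom: "data"             # assumes a 32-channel input blob
  top: "conv_dw"
  convolution_param {
    num_output: 32           # one filter per input channel
    group: 32                # group == input channels -> depthwise
    kernel_size: 3
    stride: 1
    pad: 1
  }
}
```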
I also tried it several weeks ago; you are right, it is slow and consumes a lot of memory.
I ran into this problem too. I looked at the TF kernel called "DepthwiseConv2DKernel" and didn't find any difference except that TF uses Eigen. Did you solve this problem?
You may be interested in this: #5665
I think Caffe doesn't perform im2col #group times; it does im2col once and then issues one GEMM per group:
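A paraphrased sketch of that pattern, written from memory of Caffe's BaseConvolutionLayer rather than copied from the source, with illustrative sizes: im2col runs once over the whole input, but the GEMM is issued once per group, so a depthwise layer (group == channels) turns into many tiny GEMM calls whose per-call overhead dominates:

```cpp
// Sketch only: mimics the im2col-once / GEMM-per-group structure being discussed.
#include <cstdio>
#include <vector>

// Naive GEMM: C[m x n] = A[m x k] * B[k x n]. Stands in for the BLAS call.
static void gemm(int m, int n, int k, const float* A, const float* B, float* C) {
  for (int i = 0; i < m; ++i)
    for (int j = 0; j < n; ++j) {
      float acc = 0.f;
      for (int p = 0; p < k; ++p) acc += A[i * k + p] * B[p * n + j];
      C[i * n + j] = acc;
    }
}

int main() {
  // Illustrative sizes only: 32 input channels, 3x3 kernel, 16x16 output map.
  const int channels = 32, ksize = 3, out_spatial = 16 * 16;
  const int group = channels;                    // depthwise: group == channels
  const int kernel_dim = ksize * ksize * channels / group;
  const int out_per_group = channels / group;    // one output channel per group here

  // Pretend im2col has already been done ONCE for the whole input.
  std::vector<float> col_buffer(ksize * ksize * channels * out_spatial, 1.f);
  std::vector<float> weights(channels * kernel_dim, 0.5f);
  std::vector<float> output(channels * out_spatial, 0.f);

  const int weight_offset = out_per_group * kernel_dim;
  const int col_offset = kernel_dim * out_spatial;
  const int output_offset = out_per_group * out_spatial;

  // One GEMM per group: for a depthwise layer this loop runs `channels` times
  // over very small matrices.
  for (int g = 0; g < group; ++g) {
    gemm(out_per_group, out_spatial, kernel_dim,
         &weights[weight_offset * g],
         &col_buffer[col_offset * g],
         &output[output_offset * g]);
  }
  printf("ran %d small GEMMs\n", group);
  return 0;
}
```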
@lolongcovas
@willyd
@lolongcovas, @willyd
@zjchuyp
Is it still slow using the cuDNN implementation? According to the code, the cuDNN convolution calls for all the groups are asynchronous, issued on different CUDA streams, and synchronized at the end of forward/backward, so the GPU should be utilized as much as possible.
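A minimal sketch of the launch pattern described here, not Caffe's actual cudnn_conv_layer code: the dummy kernel stands in for cudnnConvolutionForward, the group count and sizes are made up, and the point is only that each group's work is queued on its own stream with a single synchronization at the end:

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <vector>

// Placeholder for the per-group convolution call.
__global__ void fake_group_conv(float* out, const float* in, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i] * 2.0f;
}

int main() {
  const int groups = 32;          // e.g. a depthwise layer with 32 channels
  const int per_group = 1 << 16;  // elements handled by each group
  float *in, *out;
  cudaMalloc(&in, groups * per_group * sizeof(float));
  cudaMalloc(&out, groups * per_group * sizeof(float));

  std::vector<cudaStream_t> streams(groups);
  for (auto& s : streams) cudaStreamCreate(&s);

  // Queue every group's work without waiting: the launches return immediately,
  // so the small per-group kernels can overlap on the GPU.
  for (int g = 0; g < groups; ++g) {
    fake_group_conv<<<(per_group + 255) / 256, 256, 0, streams[g]>>>(
        out + g * per_group, in + g * per_group, per_group);
  }

  // Single synchronization point, as described for the end of forward/backward.
  cudaDeviceSynchronize();

  for (auto& s : streams) cudaStreamDestroy(s);
  cudaFree(in);
  cudaFree(out);
  printf("done\n");
  return 0;
}
```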
@gzygzy9211
@willyd
@birdwcp I think you should dig into it to find the reason.
hi
To get faster depthwise convolutions there is a separate GEMM call that needs to be implemented. As far as I know, no one has submitted a PR against this version of Caffe to do so.
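For illustration only, and not the specific GEMM-based change this comment asks for: dedicated depthwise implementations typically avoid the per-group dispatch altogether, for example with a direct per-channel convolution like the sketch below (all sizes are made up; stride 1 and "same" padding are assumed):

```cpp
#include <cstdio>
#include <vector>

// Direct depthwise convolution: each channel is convolved with its own k x k
// filter in one pass, with no im2col buffer and no per-group GEMM loop.
void depthwise_conv2d(const float* in, const float* weight, float* out,
                      int channels, int height, int width, int k) {
  const int pad = k / 2;  // "same" padding, stride 1, for simplicity
  for (int c = 0; c < channels; ++c)
    for (int y = 0; y < height; ++y)
      for (int x = 0; x < width; ++x) {
        float acc = 0.f;
        for (int ky = 0; ky < k; ++ky)
          for (int kx = 0; kx < k; ++kx) {
            int iy = y + ky - pad, ix = x + kx - pad;
            if (iy >= 0 && iy < height && ix >= 0 && ix < width)
              acc += in[(c * height + iy) * width + ix] *
                     weight[(c * k + ky) * k + kx];
          }
        out[(c * height + y) * width + x] = acc;
      }
}

int main() {
  const int channels = 32, h = 16, w = 16, k = 3;  // illustrative sizes
  std::vector<float> in(channels * h * w, 1.f);
  std::vector<float> weights(channels * k * k, 0.5f);
  std::vector<float> out(channels * h * w, 0.f);
  depthwise_conv2d(in.data(), weights.data(), out.data(), channels, h, w, k);
  printf("out[0] = %f\n", out[0]);
  return 0;
}
```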