1x1 Convolution of 2 stride code issue #95
Comments
Hi @sj-leo,
So even if the address of the actual src data changes (i.e. another src is taken), the workspace address remains the same. In other words, I would not expect ws to change. |
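To illustrate why a fixed ws address is compatible with a changing src address, here is a minimal sketch. It is not the MKL-DNN code: `compress_strided` and its parameters are hypothetical names, and it assumes a single channel plane, a 1x1 kernel and no padding. Each call overwrites the same workspace buffer with the strided pixels of whatever src it is given.

```cpp
#include <cassert>

// Hypothetical helper (not the library's implementation): copies every
// stride-th pixel of one ih x iw plane of src into a dense workspace.
// ws is overwritten in place, so its address is the same on every call;
// only the src pointer passed by the caller differs.
void compress_strided(const float *src, float *ws,
        int ih, int iw, int stride_h, int stride_w) {
    assert(stride_h > 0 && stride_w > 0);
    // Output size for a 1x1 kernel without padding (assumption).
    const int oh = (ih - 1) / stride_h + 1;
    const int ow = (iw - 1) / stride_w + 1;
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x)
            ws[y * ow + x] = src[(y * stride_h) * iw + (x * stride_w)];
}
```

Calling it twice with two different 4x4 sources and stride 2 fills the same 2x2 workspace twice, so printing the workspace pointer would show a single address even though different data passes through it.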
You said that, in the workflow, 1x1 convolutions with non-unit stride compress the source image into the workspace. If the address of ws does not change, it seems to me that the convolution is not performed over the whole of ws, and I am not sure this is the right implementation. I wonder why the convolution is repeatedly performed on the same area without changing the ih and iw of ws. |
The |
@rsdubtso As you say, I checked the code: at the thread level, the data copied into the workspace to remove strides comes from different parts of the input. But within a single thread, there seems to be no place where iw and ih are changed. |
Convolution kernel goes over the whole image |
As far as I know, the convolution kernel goes over only a fraction of the ih * iw image, determined by bcast_step. Could you provide further information? |
Yeah, my bad: you are right, the kernel does not always process the whole image. According to the line, we reduce src from the right place. Do you have a particular failing example, or is this question mostly out of curiosity? :) |
In my case, I am looking, out of curiosity, at how the convolution operations of ResNet-50 are performed and whether there is room for performance optimization. |
The loop order is set in the jit_avx512_common_1x1_conv_kernel::init_conf method in jit_avx512_common_1x1_conv_kernel.cpp. There are two possible loop orders for 1x1 convolutions with non-unit strides: loop_blr (bcast-load-reduce) or loop_rbl (reduce-bcast-load). The condition in line 169 is correct for these loop orders: it ensures that the reduce step is performed only once for a given block of input channels and a given fraction of the ih * iw image. |
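As a rough illustration of the "reduce only once" idea, the sketch below counts how often the strided copy would run in a bcast-load-reduce loop nest. Everything here is an assumption made for illustration: `run_blr`, `Counts`, the block counts and the `l == 0` guard are hypothetical, and the sketch assumes the workspace keeps one slot per reduce block so the copy can be reused by later load blocks. It is not the library's actual condition.

```cpp
#include <cassert>

// Hypothetical counters for the sketch.
struct Counts {
    int reduce_calls; // how many times the strided copy into ws runs
    int kernel_calls; // how many times the 1x1 kernel runs
};

// Hypothetical bcast-load-reduce (blr) loop nest.
// b: block of the image (bcast), l: block of output channels (load),
// r: block of input channels (reduce).
Counts run_blr(int bcast_blocks, int load_blocks, int reduce_blocks) {
    assert(bcast_blocks > 0 && load_blocks > 0 && reduce_blocks > 0);
    Counts c{0, 0};
    for (int b = 0; b < bcast_blocks; ++b)
        for (int l = 0; l < load_blocks; ++l)
            for (int r = 0; r < reduce_blocks; ++r) {
                // The copied data depends only on (b, r), not on l, so it
                // is produced for the first load block and reused after.
                if (l == 0) ++c.reduce_calls;
                ++c.kernel_calls;
            }
    return c;
}
```

With 3 bcast blocks, 4 load blocks and 2 reduce blocks, the kernel runs 24 times while the copy runs only 6 times, once per (bcast, reduce) pair.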
iw and ih change on each iteration of the loop over the bcast dimension. See, e.g., line 190. |
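A small sketch of how ih and iw can follow from the position in the bcast dimension, assuming the bcast dimension is the flattened output image and there is no padding. `input_pos`, `Pos` and the parameter names are hypothetical and only meant to show the index arithmetic.

```cpp
#include <cassert>

// Hypothetical result type: top-left input coordinates for a bcast offset.
struct Pos {
    int ih;
    int iw;
};

// Hypothetical helper: maps a flat output-spatial offset os (the position
// in the bcast dimension) to input coordinates, assuming no padding.
Pos input_pos(int os, int ow, int stride_h, int stride_w) {
    assert(os >= 0 && ow > 0 && stride_h > 0 && stride_w > 0);
    const int oh_idx = os / ow; // output row
    const int ow_idx = os % ow; // output column
    return Pos{oh_idx * stride_h, ow_idx * stride_w};
}
```

A loop that advances os by bcast_step therefore starts each iteration at a different (ih, iw) in the source, even though the workspace it copies into stays where it is.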
Hi @emfomenk, I wonder whether this issue with 1x1 convolution has been resolved. Thank you for your help.
|
Hi @haidj, My suspicion was incorrect. |
Dear MKLDNN developers,
When I ran ResNet-50 with Intel Caffe and MKL-DNN, the memory address did not change in the 1x1 convolution with stride 2.
In the code, your team handles the 1x1 convolution with stride 2 through ‘reduce’ and sets the address used by ‘reduce’ to ‘jcp.ws’.
When I print the address of ‘jcp.ws’, I only ever see one address (I think it should change).
So I would like to know whether the address is actually supposed to change.
For a more detailed explanation, I list the setup and the file containing the 1x1 convolution code below.
Setup
code
src/cpu/jit_avx512_common_1x1_convolution.cpp (line: 166-175)
Thank you.