webgpu: support LRNGrad kernel #7196
gyagp
left a comment
LGTM with one nit
| } | ||
| else if (k >= depthBegin && k < depthEnd){ | ||
| var dyi = -2.0 * uniforms.alpha * uniforms.beta | ||
| * getInputImage(b ,r ,c, k) * getOutputImage(b, r, c, d) / norm; |
Nit
- * getInputImage(b ,r ,c, k) * getOutputImage(b, r, c, d) / norm;
+ * getInputImage(b, r, c, k) * getOutputImage(b, r, c, d) / norm;
You may also want to fix the same spacing in the WebGL code.
| dispatch: [number, number, number]; | ||
| variableNames = ['inputImage', 'outputImage', 'dy']; | ||
| uniforms = | ||
| 'depth : i32, depthRadius : i32, bias : f32, alpha : f32, beta : f32,'; |
It seems that `depth` is not necessary as a uniform. You can use `let MAX_DEPTH_END = uniforms.outShape[3];` in the shader.
Done
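A minimal sketch of the suggestion above, written as a CPU-side TypeScript helper rather than WGSL: the depth bound is read from the last dimension of the output shape instead of being passed as its own parameter. The name `depthWindow` and the four-element shape tuple (channels last) are assumptions for illustration, not code from this PR.

```typescript
// Hypothetical helper (not part of the PR): computes the [begin, end) channel
// window for output channel d, taking the depth from outShape[3] rather than
// from a separate `depth` argument.
function depthWindow(
    d: number, depthRadius: number,
    outShape: [number, number, number, number]): [number, number] {
  const maxDepthEnd = outShape[3];  // channels-last layout is assumed
  const depthBegin = Math.max(0, d - depthRadius);
  const depthEnd = Math.min(maxDepthEnd, d + depthRadius + 1);
  return [depthBegin, depthEnd];
}
```

For example, with a depth of 5 and a radius of 2, channel 0 gets the window `[0, 3)` and channel 4 gets `[2, 5)`.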
| let depthEnd = min(uniforms.depth, d + uniforms.depthRadius + 1); | ||
| let MIN_DEPTH_BEGIN = 0; | ||
| let MAX_DEPTH_END = uniforms.depth; |
Move L54-L55 out of the `for (var d = 0; d < uniforms.depth; d++)` loop. It seems you can also use `MAX_DEPTH_END` as the bound of the outermost `for`.
Done
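To make the loop structure under discussion concrete, here is a rough CPU reference in TypeScript for a single pixel, with the two constants declared once before the outer loop. It is a sketch under assumptions: `lrnGradPixel` and its parameters are hypothetical names, it computes every input channel at once (the shader quoted in this review accumulates one output element per invocation), and it iterates the window directly instead of using the `continue`/`break` pattern.

```typescript
// Hypothetical CPU reference (not the PR's shader): LRN gradient for one pixel.
// x: input channels, y: forward LRN output, dy: upstream gradient.
function lrnGradPixel(
    x: number[], y: number[], dy: number[], depthRadius: number,
    bias: number, alpha: number, beta: number): number[] {
  // Loop-invariant bounds, declared once outside the loop over d.
  const MIN_DEPTH_BEGIN = 0;
  const MAX_DEPTH_END = x.length;
  const dx = new Array<number>(MAX_DEPTH_END).fill(0);

  for (let d = MIN_DEPTH_BEGIN; d < MAX_DEPTH_END; d++) {
    const depthBegin = Math.max(MIN_DEPTH_BEGIN, d - depthRadius);
    const depthEnd = Math.min(MAX_DEPTH_END, d + depthRadius + 1);

    // norm = bias + alpha * sum of squares over the window around d.
    let norm = 0;
    for (let k = depthBegin; k < depthEnd; k++) {
      norm += x[k] * x[k];
    }
    norm = alpha * norm + bias;

    // Accumulate the contribution of output channel d to each input channel k.
    for (let k = depthBegin; k < depthEnd; k++) {
      let dyi = -2.0 * alpha * beta * x[k] * y[d] / norm;
      if (k === d) {
        dyi += Math.pow(norm, -beta);
      }
      dx[k] += dyi * dy[d];
    }
  }
  return dx;
}
```

Here `y` is expected to be the forward result `x[d] * norm^(-beta)` for each channel `d`, which is what the `getOutputImage` term in the quoted shader lines stands for.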
qjia7
left a comment
Two more comments. Sorry, I missed them in the last review.
| continue; | ||
| } | ||
| else if (k >= depthBegin && k < depthEnd) { | ||
| norm += getInputImage(b, r, c, k) * getInputImage(b, r, c, k); |
It seems `getInputImage(b, r, c, k)` is called twice. Would it be better to cache it, as below?
let inputValue = getInputImage(b, r, c, k);
norm += inputValue * inputValue;
Caching it may cost one more register here. I would rather let the compiler optimize this code. Is that OK?
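The trade-off in this exchange can be illustrated on the CPU side with a small TypeScript sketch. The accessor `get` and both function names are hypothetical stand-ins for `getInputImage`. Whether a given WGSL compiler merges the two identical calls on its own is not something this sketch can show; it only makes the difference in explicit accessor calls visible.

```typescript
// Hypothetical sketch: sum of squares over [begin, end) with the value cached
// in a local, so the accessor is called once per element.
function sumSquaresCached(
    get: (k: number) => number, begin: number, end: number): number {
  let norm = 0;
  for (let k = begin; k < end; k++) {
    const inputValue = get(k);
    norm += inputValue * inputValue;
  }
  return norm;
}

// Same result, but the accessor is called twice per element, mirroring the
// line under review.
function sumSquaresUncached(
    get: (k: number) => number, begin: number, end: number): number {
  let norm = 0;
  for (let k = begin; k < end; k++) {
    norm += get(k) * get(k);
  }
  return norm;
}
```

Both functions return the same value for a side-effect-free accessor; they differ only in how many times the accessor is invoked.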
| } | ||
| else if (k >= depthBegin && k < depthEnd){ | ||
| var dyi = -2.0 * uniforms.alpha * uniforms.beta | ||
| * getInputImage(b, r, c, k) * getOutputImage(b, r, c, d) / norm; |
`getOutputImage(b, r, c, d)` should be moved out of `for (var k = MIN_DEPTH_BEGIN; k < MAX_DEPTH_END; k++)`, since its arguments never change inside that loop.
Because `getOutputImage(b, r, c, d)` sits inside an if-else branch, the other branches may not need to access memory at all, so it is not necessary to move this read out of the for loop.
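As a rough way to compare the two positions, the TypeScript sketch below counts explicit reads of a stand-in accessor for one value of `d`, once with the read hoisted above the inner loop and once with it left inside the branch. `countOutputReads` and its accessor are hypothetical, and the counts say nothing about what a shader compiler or the GPU cache does with the repeated reads.

```typescript
// Hypothetical sketch: counts reads of a stand-in for getOutputImage(b, r, c, d)
// during the inner loop for a single d.
function countOutputReads(
    depth: number, depthRadius: number, d: number,
    hoist: boolean): {reads: number, acc: number} {
  let reads = 0;
  const getOutput = (): number => { reads++; return 1.0; };

  const depthBegin = Math.max(0, d - depthRadius);
  const depthEnd = Math.min(depth, d + depthRadius + 1);
  let acc = 0;

  if (hoist) {
    // One unconditional read before the loop.
    const outputValue = getOutput();
    for (let k = 0; k < depth; k++) {
      if (k < depthBegin) { continue; }
      else if (k < depthEnd) { acc += outputValue; }
      else { break; }
    }
  } else {
    // One read per iteration that lands inside the window.
    for (let k = 0; k < depth; k++) {
      if (k < depthBegin) { continue; }
      else if (k < depthEnd) { acc += getOutput(); }
      else { break; }
    }
  }
  return {reads, acc};
}
```

With the window bounds quoted earlier in this review, and assuming `depthRadius >= 0`, the window always contains `d` itself, so the in-branch version performs at least one read per `d` in this sketch.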
qjia7
left a comment
LGTM, thanks.