[BugFix][TOPI] Fix the integer overflow problem of the scatter_nd op. #8415
I've implemented an alternative fix in #8419.
@tkonolige Thank you for reviewing this PR. Just curious, is there a specific reason why you created another PR? I understand the alternative fix you proposed may be better than the existing one in some ways. But why bother creating a separate PR? We can discuss and improve the code here; I think that's exactly what code review is about, right?
@zhuwenxi Sorry, I definitely shouldn't have opened a new PR. I just got a little hasty and did the fix myself. Feel free to copy the code from that PR into this one. I'll end up closing the other one.
@tkonolige That's OK, it happens.
You hit a flaky test. I found a fix here: #8431
1. The existing scatter_nd CUDA implementation uses a very large bound, which can overflow the int32 range when the input tensor shape is large enough.
2. The overflow can cause the if statement to always evaluate to true, which leads to invalid memory accesses.
3. This commit fixes the problem by reducing the bound. The original large bound was not only unnecessary but also degraded performance. With this fix, the scatter_nd op's performance improves by 100x in some cases.
842243b to cd30a24
@mbrookhart I've fixed the UT failure.
Thanks @zhuwenxi @tkonolige
…apache#8415)
* Fix the integer overflow problem of the scatter_nd op.
* Fix scatter_nd's crash problem:
1. The existing scatter_nd CUDA implementation uses a very large bound, which can overflow the int32 range when the input tensor shape is large enough.
2. The overflow can cause the if statement to always evaluate to true, which leads to invalid memory accesses.
3. This commit fixes the problem by reducing the bound. The original large bound was not only unnecessary but also degraded performance. With this fix, the scatter_nd op's performance improves by 100x in some cases.
Co-authored-by: wenxizhu <wenxizhu@tencent.com>
Problem Statement
scatter_nd crashes on the CUDA backend when the input data shape is slightly larger than usual.
Code to reproduce
Error Message
Root Cause
The problem is easier to see in the generated CUDA code. The TIR implementation of scatter_nd causes an int32 overflow when "i" is large, so the if statement always evaluates to true and performs an invalid memory access.
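To make the failure mode concrete, here is a minimal Python sketch of how a 32-bit index computation can wrap around and defeat a bounds check. It is not the generated CUDA kernel or the TIR from this PR: the helper names (`wrap_int32`, `guard_passes`), the `i * stride` index expression, and the numbers are all hypothetical, and the sketch assumes the overflow wraps in two's complement, which is what typically happens on the GPU in practice.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def guard_passes(i: int, stride: int, bound: int) -> bool:
    """Hypothetical guard: compute an index in int32, then compare it to a bound.

    This mirrors the shape of an `if (index < bound)` check, not the real kernel.
    """
    index = wrap_int32(i * stride)  # the product is truncated to 32 bits
    return index < bound


# Small "i": the index is 400000, so the guard correctly rejects it.
print(guard_passes(10, 40000, 1000))

# Large "i": the true index is 2_800_000_000, far past the bound, but the
# int32 product wraps to a negative value and the guard accepts it.
print(wrap_int32(70000 * 40000))
print(guard_passes(70000, 40000, 1000))
```

Once the product exceeds `INT32_MAX`, the wrapped value is negative and compares as less than any positive bound, which is how a guard meant to prevent out-of-range accesses can end up letting them through. Keeping the loop bound small enough that the index expression stays within int32, as this PR does, avoids the wraparound in the first place.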