Fix regex out-of-bounds write in strided rows logic #11797
@@ -44,8 +44,10 @@ __global__ void for_each_kernel(ForEachFunction fn, reprog_device const d_prog,

auto const thread_idx = threadIdx.x + blockIdx.x * blockDim.x;
auto const stride = s_prog.thread_count();
for (auto idx = thread_idx; idx < size; idx += stride) {
fn(idx, s_prog, thread_idx);
if (thread_idx < stride) {
Good catch, sir.
This fix makes me a bit worried, as I see the same (pre-fix) pattern in many other places. I wonder if there are still similar bugs hidden elsewhere.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.10 #11797 +/- ##
===============================================
Coverage ? 87.40%
===============================================
Files ? 133
Lines ? 21833
Branches ? 0
===============================================
Hits ? 19084
Misses ? 2749
Partials ? 0
I meant to mention this earlier. I was able to repro this failure from a
The |
@gpucibot merge
I've attached the repro code to this issue. At a later point, it would be good to have a test that covers this case.
Description
Fixes an out-of-bounds write that occurs when the number of strings is large enough to require a strided loop in order to stay within an internal memory maximum. For row counts that do not require strided loops, the row index never exceeds the size of the column, which prevents any out-of-bounds access. For large row counts, the CUDA thread index may be larger than the minimal count used for building the working-memory buffer. Because the kernel is launched with a thread count based on a specific block size, extra threads past the end of the minimal count are needed to fill out the last block. These threads never contribute to the overall result, but they will attempt to access past the end of the working memory. Writing to this memory may corrupt memory used by another kernel launched in parallel from another CPU thread. This change adds logic to prevent the extra threads from doing any work.
Fixes #11768
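To make the failure mode in the description concrete, here is a minimal host-side C++ sketch that models the launch geometry rather than running a real kernel. It is not libcudf code: `simulate`, `LaunchResult`, and the parameter names are hypothetical, and the assumption that the launch rounds `stride` up to whole blocks is inferred from the description. Each simulated thread owns working-memory slot `thread_idx`, and the buffer is assumed to hold exactly `stride` slots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the launch; names are illustrative, not libcudf's API.
struct LaunchResult {
  std::size_t launched_threads;  // threads launched (always whole blocks)
  std::size_t out_of_bounds;     // accesses to a working-memory slot >= stride
  std::vector<int> visits;       // times each row was handled by an in-range thread
};

// size       : number of rows in the column
// stride     : thread count the working-memory buffer was sized for (assumed)
// block_size : threads per block
// guard      : apply the `thread_idx < stride` check added by the fix
LaunchResult simulate(std::size_t size, std::size_t stride, std::size_t block_size, bool guard)
{
  assert(stride > 0 && block_size > 0);
  // Round up so the last block is full; this is what creates the extra threads.
  std::size_t const blocks = (stride + block_size - 1) / block_size;
  LaunchResult result{blocks * block_size, 0, std::vector<int>(size, 0)};

  for (std::size_t thread_idx = 0; thread_idx < result.launched_threads; ++thread_idx) {
    if (guard && thread_idx >= stride) { continue; }  // extra threads do no work
    for (std::size_t idx = thread_idx; idx < size; idx += stride) {
      if (thread_idx >= stride) {
        ++result.out_of_bounds;  // would touch memory past the end of the buffer
      } else {
        ++result.visits[idx];
      }
    }
  }
  return result;
}
```

With 1000 rows, a stride of 100, and a block size of 64, the model launches 128 threads. Without the guard, threads 100 through 127 each enter the loop and index past the buffer; with the guard they are skipped and every row is still handled exactly once. When the row count equals the stride (no strided loop needed), the extra threads never enter the loop, which matches the description's claim that small inputs were unaffected.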
Checklist