Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix regex out-of-bounds write in strided rows logic #11797

Merged

Conversation

davidwendt
Copy link
Contributor

@davidwendt davidwendt commented Sep 27, 2022

Description

Fixes an out-of-bounds write error when a large number of strings requires a strided loop to meet an internal memory maximum. For row sizes that do not require strided loops, the row index never exceeds the size of the column preventing any out-of-bounds access. For large row counts, the CUDA thread index may be larger than the minimal count used for building the working-memory buffer. Since the kernel is launched with a thread-count with a specific block size, extra threads past the end of the minimal count are necessary to fill out the last block. These threads never contribute to the overall result but will attempt to access past the end of the working memory. Writing to this memory may corrupt memory for another kernel launched in parallel from another CPU thread. This change adds logic to prevent the extra threads from doing any work.

Fixes #11768

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added bug Something isn't working 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. strings strings issues (C++ and Python) non-breaking Non-breaking change labels Sep 27, 2022
@davidwendt davidwendt requested a review from a team as a code owner September 27, 2022 20:29
@davidwendt davidwendt self-assigned this Sep 27, 2022
@davidwendt davidwendt added this to PR-WIP in v22.10 Release via automation Sep 27, 2022
@@ -44,8 +44,10 @@ __global__ void for_each_kernel(ForEachFunction fn, reprog_device const d_prog,

auto const thread_idx = threadIdx.x + blockIdx.x * blockDim.x;
auto const stride = s_prog.thread_count();
for (auto idx = thread_idx; idx < size; idx += stride) {
fn(idx, s_prog, thread_idx);
if (thread_idx < stride) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, sir.

v22.10 Release automation moved this from PR-WIP to PR-Reviewer approved Sep 27, 2022
Copy link
Contributor

@ttnghia ttnghia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix makes me a bit worried as I see this (before fixing) pattern in many other places. Wonder if there's still similar bugs hidden around.

@codecov
Copy link

codecov bot commented Sep 27, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@5a416a0). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-22.10   #11797   +/-   ##
===============================================
  Coverage                ?   87.40%           
===============================================
  Files                   ?      133           
  Lines                   ?    21833           
  Branches                ?        0           
===============================================
  Hits                    ?    19084           
  Misses                  ?     2749           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@mythrocks
Copy link
Contributor

mythrocks commented Sep 28, 2022

I meant to mention this earlier. I was able to repro this failure from a gtest, and compute-sanitizer. This change fixes the failure.
Edit: Here is the relevant section of the compute-sanitizer trace:

=========
========= Invalid __global__ write of size 1 bytes
=========     at 0xa840 in void cudf::strings::detail::transform_kernel<cudf::strings::detail::<unnamed>::contains_fn, bool>(T1, cudf::strings::detail::reprog_device, T2 *, int)
=========     by thread (124,0,0) in block (81380,0,0)
=========     Address 0x7f9d24a92b64 is out of bounds
=========     and is 133 bytes after the nearest allocation at 0x7f9bc4000000 of size 5916666592 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame: [0x23b56c]
=========                in /lib/x86_64-linux-gnu/libcuda.so.1
=========     Host Frame: [0x141cc]
=========                in /home/mithunr/anaconda3/envs/cudf-dev-4/lib/libcudart.so.11.0
=========     Host Frame:cudaLaunchKernel_ptsz [0x4c348]
=========                in /home/mithunr/anaconda3/envs/cudf-dev-4/lib/libcudart.so.11.0
=========     Host Frame:void cudf::strings::detail::launch_transform_kernel<cudf::strings::detail::_GLOBAL__N__43ea0ff1_11_contains_cu_c9293be1::contains_fn, bool>(cudf::strings::detail::_GLOBAL__N__43ea0ff1_11_contains_cu_c9293be1::contains_fn, cudf::strings::detail::reprog_device&, bool*, int, rmm::cuda_stream_view) [0x26497ba]
=========                in /home/mithunr/workspace/dev/cudf/2/cpp/build/libcudf.so
...

The gtest to repro is attached:
strings_regex_oob_test.cu.txt

@davidwendt
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit da04725 into rapidsai:branch-22.10 Sep 28, 2022
v22.10 Release automation moved this from PR-Reviewer approved to Done Sep 28, 2022
@davidwendt davidwendt deleted the bug-regex-kernel-oob-write branch September 28, 2022 18:13
@mythrocks
Copy link
Contributor

I've attached the repro code to this issue. At a later point, it would be good to have a test that covers this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change strings strings issues (C++ and Python)
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

[BUG] Suspected memory corruption with regexp calls
4 participants