Fix regex quantifier check to include capture groups #11373

Merged
merged 17 commits into from
Aug 4, 2022

davidwendt
Contributor

Description

Adds regex compile logic to check that a quantifier can be applied to the previous item, even if that item is within a capture group.
This prevents an infinite loop from occurring when the expression is evaluated.
Additional gtests are included to check for this condition, which should throw an error.
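
The check described above can be illustrated with a minimal, self-contained sketch. This is not the code from regcomp.cpp: `check_quantifiers`, `accepts`, `demo`, and the `Item` enum are hypothetical names, and the rules are deliberately simplified (only `*`, `+`, `?` are treated as quantifiers, alternation branches are not analyzed individually, and an empty group is assumed to be non-repeatable). It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <vector>

// What the most recently parsed item can do; decides whether a quantifier may follow it.
enum class Item { None, ZeroWidth, Consuming };

// Hypothetical helper (not libcudf's API): throws std::invalid_argument when
// '*', '+' or '?' follows nothing, an anchor, or a group that holds only anchors.
void check_quantifiers(std::string_view pattern)
{
  Item prev = Item::None;
  std::vector<bool> group_consumes;  // one flag per open '(': has it seen a consuming item?

  auto mark_consuming = [&] {
    prev = Item::Consuming;
    if (!group_consumes.empty()) { group_consumes.back() = true; }
  };

  for (std::size_t i = 0; i < pattern.size(); ++i) {
    switch (pattern[i]) {
      case '\\': {  // escape: \b \B \A \Z are treated as zero-width, anything else consumes
        if (++i >= pattern.size()) { throw std::invalid_argument("trailing backslash"); }
        char const esc = pattern[i];
        if (esc == 'b' || esc == 'B' || esc == 'A' || esc == 'Z') {
          prev = Item::ZeroWidth;
        } else {
          mark_consuming();
        }
        break;
      }
      case '[': {  // skip the class body so that '*' inside "[*]" is not seen as a quantifier
        ++i;
        while (i < pattern.size() && pattern[i] != ']') {
          if (pattern[i] == '\\') { ++i; }
          ++i;
        }
        if (i >= pattern.size()) { throw std::invalid_argument("unterminated character class"); }
        mark_consuming();
        break;
      }
      case '(':
        if (pattern.substr(i + 1, 2) == "?:") { i += 2; }  // non-capturing group prefix
        group_consumes.push_back(false);
        prev = Item::None;
        break;
      case ')': {
        if (group_consumes.empty()) { throw std::invalid_argument("unmatched ')'"); }
        bool const consumes = group_consumes.back();
        group_consumes.pop_back();
        // The closed group becomes the previous item; it is repeatable only if
        // something inside it can consume a character.
        if (consumes) { mark_consuming(); } else { prev = Item::ZeroWidth; }
        break;
      }
      case '|': prev = Item::None; break;  // simplification: branches are not analyzed
      case '^':
      case '$': prev = Item::ZeroWidth; break;
      case '*':
      case '+':
      case '?':
        if (prev != Item::Consuming) { throw std::invalid_argument("nothing to repeat"); }
        break;
      default: mark_consuming(); break;
    }
  }
  if (!group_consumes.empty()) { throw std::invalid_argument("unmatched '('"); }
}

// Convenience wrapper: true if the pattern passes the check.
bool accepts(std::string_view pattern)
{
  try {
    check_quantifiers(pattern);
    return true;
  } catch (std::invalid_argument const&) {
    return false;
  }
}

// Usage: an anchor alone inside a capture group must not be repeatable.
void demo()
{
  assert(accepts("(^a)+"));  // the group consumes 'a', so repeating it is fine
  assert(!accepts("(^)+"));  // the group holds only an anchor: rejected
  assert(!accepts("($)*"));
}
```

Tracking one "consumes" flag per open group is what lets the check see through the parentheses: a quantifier after `)` is judged by what the group contains rather than by the `)` itself.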

Closes #11311

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added labels on Jul 27, 2022: bug (Something isn't working); 2 - In Progress (Currently a work in progress); libcudf (Affects libcudf C++/CUDA code); strings (strings issues, C++ and Python); non-breaking (Non-breaking change)
@davidwendt davidwendt self-assigned this Jul 27, 2022
@davidwendt davidwendt added this to PR-WIP in v22.10 Release via automation Jul 27, 2022
codecov bot commented Jul 27, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@9429099).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10   #11373   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19765           
  Misses                  ?     3091           
  Partials                ?        0           

@davidwendt davidwendt added the 3 - Ready for Review (Ready for review by team) label and removed the 2 - In Progress (Currently a work in progress) label on Jul 29, 2022
@davidwendt davidwendt moved this from PR-WIP to PR-Needs review in v22.10 Release Jul 29, 2022
@davidwendt davidwendt marked this pull request as ready for review July 29, 2022 14:54
@davidwendt davidwendt requested a review from a team as a code owner July 29, 2022 14:54
@davidwendt davidwendt requested review from upsj and elstehle July 29, 2022 14:54
cpp/src/strings/regex/regcomp.cpp (three outdated, resolved review threads)
@davidwendt davidwendt requested a review from upsj August 3, 2022 14:54
Contributor

@elstehle elstehle left a comment

just adding a minor comment. will re-parse now with the updates suggested by @upsj

cpp/src/strings/regex/regcomp.cpp (one outdated, resolved review thread)
v22.10 Release automation moved this from PR-Needs review to PR-Reviewer approved Aug 3, 2022
Contributor

@elstehle elstehle left a comment

LGTM. Thanks for getting me to learn another digestible bit from the regex universe 💡

@davidwendt
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 53a2f15 into rapidsai:branch-22.10 Aug 4, 2022
v22.10 Release automation moved this from PR-Reviewer approved to Done Aug 4, 2022
@davidwendt davidwendt deleted the bug-hang-capture-nothing branch August 4, 2022 17:39
Labels
3 - Ready for Review (Ready for review by team); bug (Something isn't working); libcudf (Affects libcudf C++/CUDA code); non-breaking (Non-breaking change); strings (strings issues, C++ and Python)
Development

Successfully merging this pull request may close these issues.

[BUG] regexp: hanging when attempting to repeat string anchor inside capture group
3 participants