Fix multibyte support in the regexp node handler #8989
Merged
Auto-correction by Style/RedundantRegexpEscape can corrupt a program by removing a character at the wrong index when it deals with a regexp pattern containing multibyte characters.
Note that the closing slash is removed instead of the redundant backslash.
The problem is that the cop mixes byte indices with character indices, so it ends up manipulating the source code with the wrong index values. Regexp::Parser stores a byte index in Regexp::Expression#ts, and that is the source of the confusion.
It can be fixed by converting index values like this, but ideally it should be done in Regexp::Parser.
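As a rough illustration of the kind of conversion involved, the sketch below maps a byte offset to a character offset in plain Ruby. The helper name `byte_to_char_index` and the sample literal are assumptions made for this example; they are not RuboCop's or Regexp::Parser's API, and the offsets are taken relative to the whole literal purely for demonstration.

```ruby
# Hypothetical helper (not part of RuboCop or Regexp::Parser):
# convert a byte offset within `source` into a character offset.
def byte_to_char_index(source, byte_index)
  # Take the bytes that precede the offset and count them as characters.
  source.byteslice(0, byte_index).length
end

# "\u3042" is a single character that occupies 3 bytes in UTF-8. It is
# written as an escape so the example does not depend on file encoding.
# The literal reads: slash, multibyte char, backslash, semicolon, slash.
source = "/\u3042\\;/"

# Byte offset of the redundant backslash (what a byte-based index reports).
byte_index = source.b.index("\\")                     # => 4

# Character offset of the same backslash.
char_index = byte_to_char_index(source, byte_index)   # => 2

# Using the byte offset as a character index hits the closing slash,
# while the converted offset hits the backslash.
wrong_char = source[byte_index]                       # => "/"
right_char = source[char_index]                       # => "\\"
```

With a pattern that contains only single-byte characters the two offsets coincide, which is why the mismatch only shows up once a multibyte character precedes the escape.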
Before submitting the PR make sure the following are checked:
- Commit message starts with [Fix #issue-number] (if the related issue exists).
- Feature branch is up-to-date with master (if not - rebase it).
- Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.
- Run bundle exec rake default. It executes all tests and RuboCop for itself, and generates the documentation.