Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug with using - inside brackets [] with using Style/RedundantRegexpEscape #8623

Open
ebeursken opened this issue Aug 31, 2020 · 3 comments
Open

Comments

@ebeursken
Copy link

When creating a regex expression, Rubocop automatically says escaping a dash is not necessary.
When you do not escape a dash (-) inside brackets ([]) it means something else and causes unexpected behavior, since it is seen as a range value instead of an actual dash inside the regex.
Thus, this rubocop rule results in regex expressions that are not the same as with escaping those characters!


Expected behavior

No backslash is needed for a -
And thus within a regex expression, this - means an actual -

Actual behavior

Not escaping this - results in unexpected behavior when using this inside a [], so that the regex expression allows something that you did not expect.
When using dash inside those type of brackets without backslash, it can mean a range of characters (e.g. [a-d] means a, b, c or d)
See : https://ruby-doc.org/core-2.5.3/Regexp.html#class-Regexp-label-Character+Classes

Steps to reproduce the problem

Code without backslash, according to the rubocop style:

NAME = /([[[:alnum:]]+-_\.])+/

This seems to be a range instead of an actual dash, since it is used inside a []

Code with a backslash

NAME = /([[[:alnum:]]+\-_\.])+/

RuboCop version

0.88.0 (using Parser 2.7.1.4, rubocop-ast 0.2.0, running on ruby 2.7.1 x86_64-linux-musl)

@owst
Copy link
Contributor

owst commented Aug 31, 2020

Hi @ebeursken - the issue of \- being reported as unnecessary was fixed in 0adf3cd and was released in v0.89.0 and later versions.

However, I wonder if there is a mistake in your regexp - you have [[[:alnum:]], which has one too many leading [s:

> /[[[:alnum:]]]/ =~ 'b'
=> nil

Edit: apologies, this example is incorrect - I had typoed on :alnum: and missed the trailing : in my pry session, but fixed when pasting here.

I don't know what you're trying to match, but maybe NAME = /([[:alnum:]+\-_\.])+/ is what you need?

@ebeursken
Copy link
Author

Thank you for your quick response! @owst
The extra bracket at the start is closed just before before the last ), but thanks for sharing your thoughts!

@owst
Copy link
Contributor

owst commented Sep 1, 2020

Hi @ebeursken apologies, I realised I typoed on my example above.

It seems Ruby allows redundant nesting of character classes, which I hadn't realised (maybe I can alter Style/RedundantRegexpCharacterClass to catch such patterns). For example:

[1] pry(main)> RUBY_VERSION
=> "2.7.1"
[2] pry(main)> /[a]/ =~ 'a'
=> 0
[3] pry(main)> /[[a]]/ =~ 'a'
=> 0
[4] pry(main)> /[[[a]]]/ =~ 'a'
=> 0

Therefore I believe your regexp:

 /([[[:alnum:]]+\-_\.])+/

is equivalent to

 /([[:alnum:]+\-_\.])+/

and then . does not need to be escaped within a character class, so we have:

 /([[:alnum:]+\-_.])+/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants