Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug in regexp.QuoteMeta() #40378

Closed
davidsmd opened this issue Jul 24, 2020 · 5 comments
Closed

Possible bug in regexp.QuoteMeta() #40378

davidsmd opened this issue Jul 24, 2020 · 5 comments

Comments

@davidsmd
Copy link

@davidsmd davidsmd commented Jul 24, 2020

What version of Go are you using (go version)?

According to https://www.hackerrank.com/environment
they are using 1.13... it also appears to happen on play.golang.org

$ go version

Does this issue reproduce with the latest release?

Don't know, don't have time to check right now, came across this while refreshing for job interviews...

What operating system and processor architecture are you using (go env)?

Don't know. Whatever hackrank uses, and whatever the playground uses. Not given on environment info page.

go env Output
$ go env

What did you do?

Writeup and code are here:

https://play.golang.org/p/t6jbBNFx44m

Summary version is that with a string of special chars as input to regexp.QuoteMeta() and passed to regexp.MustCompile(),
one ordering of the characters in the string produce an accurate count of occurrences of a minus sign/dash, and a different ordering returns an incorrect count.

Please see writeup in comments at the playground link; I found this in passing and don't have time to investigate further.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jul 24, 2020

This is working as expected. regexp.QuoteMeta only works on a complete regular expression. Your example is taking the result of regexp.QuoteMeta and putting [ and ] around it to form a new regular expression, and using that. That doesn't work. You need to pass the complete string, including the [ and ], to regexp.QuoteMeta. You can't add them later and expect sensible behavior.

@davidsmd
Copy link
Author

@davidsmd davidsmd commented Jul 24, 2020

https://play.golang.org/p/qR1Dz0M5l7B

Here's a version with the braces included in the regexp.QuoteMeta(). Passing the complete string for conversion to regex by MustCompile, with braces included, just escapes the braces and yields a regex that looks for the whole group of symbols together, rather than any of the symbols in the set.

Escaping the braces is to be expected, though, since the example for QuoteMeta in the documentation demonstrates exactly that.

@davidsmd
Copy link
Author

@davidsmd davidsmd commented Jul 24, 2020

https://play.golang.org/p/VAHU5YbWyTv

Okay, final go at this for right now. This has a bunch of cases, and shows the output from regexp.QuoteMeta() ... the output does not appear to be correct, because minus signs are not escaped, which leaves them to be treated as a regular expression range metacharacter, which is not the behaviour described in the documentation.

The only reason that one case works here, at all, is that if the "-" is at the end of the string to escape and then compile, it gets treated as escaped (even though it's not, I guess?) because there's nothing after it to suggest that it should be treated as a regex range metachar...

You can further demonstrate this by putting another character class in the string to escape, like 1a-+, which will fail to compile and panic if you add the braces after quotemeta, and again, won't do anything, if you add the braces to the string arg to quotemeta:

Panic version
https://play.golang.org/p/xEiaho_ADVO

No panic, no minus
https://play.golang.org/p/d0p_ca5BIB3

No panic, with minus
https://play.golang.org/p/vLGqPE7LTIr

So something is wrong, unless there is a reason for QuoteMeta not to escape "-" despite use as a metachar...

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jul 24, 2020

Minus signs do not have to be quoted, because they are not regexp meta-characters. The fact that they are special inside of [] doesn't matter here, because regexp.QuoteMeta doesn't return a string that behaves specially inside [].

What regexp.QuoteMeta does is take a string of characters and return a regular expression that matches exactly that string of characters. If you think there is a bug here, please show a test case where that fails.

@davidsmd
Copy link
Author

@davidsmd davidsmd commented Jul 24, 2020

Oh. Huh. I didn't realize that - isn't considered a meta-character generally, or that it errors if you try to escape it outside of []. My apologies, just didn't quite match my assumptions. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.