RegexpError "invalid pattern in look-behind" for certain Regexps since 220.127.116.11 #5086
We've been running a regular expression matching certain file extensions since JRuby 18.104.22.168, but after upgrading to 22.214.171.124, it breaks. 126.96.36.199 is the last release where it works.
We've stripped down the regular expression a lot and it still fails with
Another expression that fails is
Is there any other input you need?
added a commit
Mar 14, 2018
MRI has a special case for US-ASCII regexps and strings with a 7-bit code range: it uses the regexp's encoding for the match, whereas we create a new regexp with the string's actual encoding. So ultimately this could be a big performance penalty for two reasons:
Fixed in 36b44df
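I think the root cause also deserves some explanation. Character sequences like ss and ff are special because under Unicode case folding, for example, ss corresponds to the single character ß (https://en.wikipedia.org/wiki/%C3%9F). With case-insensitive matching, the same text can then be one character or two, so a look-behind containing it presumably no longer has a fixed length.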
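To make the ss/ß relationship concrete, here is a small sketch using plain `String` case mapping. It assumes Ruby 2.4 or later (where `String#downcase` accepts `:fold`) and says nothing about how the regexp engine itself is implemented; the variable names are only for illustration.

```ruby
# Illustration only: Unicode case folding can change the length of a string.
sharp_s  = "\u00DF" # ß (LATIN SMALL LETTER SHARP S)
ligature = "\uFB00" # ff ligature (LATIN SMALL LIGATURE FF)

folded_s  = sharp_s.downcase(:fold) # full Unicode case folding
upper_s   = sharp_s.upcase          # full Unicode upper-casing
folded_ff = ligature.downcase(:fold)

puts "sharp s:  #{sharp_s.length} char, folded to #{folded_s.length} char(s)"
puts "ligature: #{ligature.length} char, folded to #{folded_ff.length} char(s)"
puts "upper-cased sharp s: #{upper_s}"
```

Because one character on one side corresponds to two on the other, a case-insensitive look-behind over such a sequence cannot be assigned a single width once Unicode rules apply.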
So right now there are some caveats regarding MRI's "us-ascii regexps by default":
If either the string or the regexp happens to end up with a Unicode encoding, the look-behind will blow up.
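A quick way to check which side has ended up with a Unicode encoding is to inspect both objects. This is a minimal sketch of what MRI reports; the `encoding_info` helper and the sample values are hypothetical, and JRuby may report something different on affected versions.

```ruby
# Hypothetical helper: summarize the encoding state of a regexp or a string.
def encoding_info(obj)
  info = { encoding: obj.encoding }
  info[:fixed]      = obj.fixed_encoding? if obj.is_a?(Regexp)
  info[:ascii_only] = obj.ascii_only?     if obj.is_a?(String)
  info
end

ascii_re   = /ss/i  # ASCII-only literal: US-ASCII, encoding not fixed
unicode_re = /ss/iu # the u flag pins the regexp to UTF-8

ascii_str   = "miss"      # UTF-8 by default, but only 7-bit characters
unicode_str = "mi\u00DF"  # contains ß, so not 7-bit clean

[ascii_re, unicode_re, ascii_str, unicode_str].each do |obj|
  puts "#{obj.inspect}: #{encoding_info(obj)}"
end
```

Only the combination of a non-fixed US-ASCII regexp and a 7-bit string stays on the "us-ascii by default" path described above; either of the other two objects pulls the match into Unicode territory.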