Calling String#to_r causes loss of Regexp.last_match data #1563

Closed
GUI opened this Issue Mar 16, 2014 · 4 comments


@GUI
Contributor
GUI commented Mar 16, 2014

If I call to_r on a string object, the current Regexp.last_match data (along with the associated $1, $2, etc. variables) becomes nil. I tested this on JRuby 1.7.11. It does not appear to happen under MRI (tested 1.9.3p545 and 2.1.1p76).

Here's an example demonstrating the issue: https://gist.github.com/GUI/9584219
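
For readers without access to the gist, here is a minimal sketch of the kind of reproduction described. It is not the gist's contents; the sample strings and variable names are made up for illustration.

```ruby
# Hypothetical reproduction, not the code from the linked gist.
"2014-03-16" =~ /(\d+)-(\d+)-(\d+)/
year_before = Regexp.last_match(1)   # first capture group from the match above

# String#to_r may do regex work internally, but that should not leak
# into the caller's match data.
ratio = "1/3".to_r

# Under MRI this is still the MatchData from the =~ above.
# The report says JRuby 1.7.11 returned nil here.
match_after = Regexp.last_match
puts match_after.inspect
```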

At first I thought it might be unsafe to assume Regexp.last_match will persist across method calls that might internally perform other regex matching (given last_match's seemingly pseudo-global behavior). However, as noted, this doesn't happen in MRI, and as seen in the test above, JRuby does seem to properly persist the last match data within each scope when I make a call to my own method that performs regex matching. So I don't know if it's specific to String#to_r, or if there are other internal JRuby calls that might reset this match data, but String#to_r definitely makes it happen.

For a real-world example of this, see the fast_string_to_time method inside Rails. There are obviously different ways to structure this code to work around the problem, but currently this issue causes fast_string_to_time to return nil under JRuby. As a result, the slower fallback_string_to_time is always used for datetime parsing under JRuby, even if the string could have been parsed by the faster method. This degrades ActiveRecord performance under JRuby whenever datetime or timestamp columns are involved (jruby/activerecord-jdbc-adapter#540).
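
One way to restructure such code is to hold the MatchData in a local variable instead of reading $1-style variables after calling to_r. The sketch below is illustrative only: the ISO_DATETIME pattern and parse_datetime_parts are hypothetical names, not the Rails implementation.

```ruby
# Hypothetical pattern and helper, not the Rails source.
ISO_DATETIME = /\A(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)(?:\.(\d+))?\z/

def parse_datetime_parts(string)
  # Keep the MatchData in a local so later calls cannot reset it.
  match = ISO_DATETIME.match(string)
  return nil unless match

  # Fractional seconds are optional; treat a missing fraction as zero.
  fraction = match[7] || "0"
  microsec = ("0.#{fraction}".to_r * 1_000_000).to_i

  # The captures are read from the local, so they are unaffected by to_r.
  match.captures.first(6).map(&:to_i) + [microsec]
end

p parse_datetime_parts("2014-03-16 12:34:56.123456")
```

Because the captures are read from the local MatchData, the result does not depend on whether to_r touches Regexp.last_match.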

@atambo
Member
atambo commented Mar 16, 2014

Looks like String#to_r uses a gsub to collapse multiple underscores into a single underscore here:

https://github.com/jruby/jruby/blob/jruby-1_7/core/src/main/java/org/jruby/RubyString.java#L7498

which ends up resetting your last_match to nil because there are no underscores in your strings.
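
The effect of a non-matching gsub can be seen in plain Ruby when the call is made in the same scope as the earlier match. This is a small sketch with made-up strings, not the JRuby internals:

```ruby
"abc" =~ /(b)/
had_match = !Regexp.last_match.nil?   # a match is recorded for this scope

# No underscores in the string, so the pattern fails to match and the
# string comes back unchanged.
collapsed = "abc".gsub(/_+/, "_")

# The failed search replaced this scope's match data with nil.
after_gsub = Regexp.last_match
puts after_gsub.inspect
```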

@GUI GUI referenced this issue in jruby/activerecord-jdbc-adapter Mar 16, 2014
Open

improve AR timestamp column Date/Time parsing #540

@GUI
Contributor
GUI commented Mar 16, 2014

Thanks for the quick response. I updated the gist with other examples that call gsub inside other methods. Even if I add my own similar string method that performs a gsub (String#replace_test), the original scope doesn't lose the match data like it does when calling to_r.

I would expect the match data to go away if I inlined the gsub call directly into that bar method (and if I do that, it does consistently wipe the match data across all Ruby implementations). However, it appears that gsubs or matches in separate method calls shouldn't wipe the original scope's match data. This mostly seems to work properly in JRuby, with the exception of this to_r call. Is it perhaps related to that specific gsub call, or to gsub calls originating from Java? Thanks again!
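
The expected behavior for a gsub made inside a separate Ruby method can be sketched as follows. The collapse_underscores helper is hypothetical, written in the spirit of the String#replace_test method mentioned above; it is not the gist's code.

```ruby
# Hypothetical helper: performs a non-matching gsub in its own method.
class String
  def collapse_underscores
    gsub(/_+/, "_")
  end
end

"abc" =~ /(b)/

# The gsub runs in the helper's own scope, so any match data it sets or
# clears belongs to that method call, not to the caller.
result = "abc".collapse_underscores

# The caller's match data is still the one from the =~ above.
kept = Regexp.last_match(1)
puts kept.inspect
```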

@atambo atambo closed this Mar 16, 2014
@atambo atambo reopened this Mar 16, 2014
@atambo atambo added this to the JRuby 1.7.12 milestone Mar 16, 2014
@atambo
Member
atambo commented Mar 17, 2014

Fixed by #1564

@atambo atambo closed this Mar 17, 2014
@GUI
Contributor
GUI commented Mar 18, 2014

Thanks so much for the speedy fix!
