Calling String#to_r causes loss of Regexp.last_match data #1563

Closed
GUI opened this Issue Mar 16, 2014 · 4 comments

GUI commented Mar 16, 2014

If I call to_r on a string object, the current Regexp.last_match data (along with the associated $1, etc. variables) becomes nil. I've tested this on JRuby 1.7.11. It does not appear to happen under MRI (tested 1.9.3p545 and 2.1.1p76).

Here's an example demonstrating the issue: https://gist.github.com/GUI/9584219
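For readers who can't open the gist, the report can be reduced to something like the following. This is a minimal sketch, not the gist's contents; the method name `last_match_after_to_r` and the sample input are made up for illustration.

```ruby
# Hypothetical helper (not from the gist): run a match, call String#to_r,
# then return whatever Regexp.last_match holds afterwards.
def last_match_after_to_r(input)
  return nil unless input =~ /(\d+)\.(\d+)/  # sets Regexp.last_match, $1, $2
  $2.to_r                                     # the call under suspicion
  Regexp.last_match                           # expected: still the match above
end

result = last_match_after_to_r("12.34")
puts result.nil? ? "match data lost" : "match data kept: #{result[0]}"
```

Per the report above, MRI keeps the match data here, while JRuby 1.7.11 is the version reported to lose it.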

At first I thought it might be unsafe to assume Regexp.last_match will persist across method calls that might internally perform other regex matching (given last_match's seemingly pseudo-global behavior). However, as noted, this doesn't happen in MRI, and as seen in the test above, JRuby does seem to properly persist the last match data within each scope when I make a call to my own method that performs regex matching. So I don't know if it's specific to String#to_r, or if there are other internal JRuby calls that might reset this match data, but String#to_r definitely makes it happen.

For a real-world example of this, see the fast_string_to_time method inside Rails. There are obviously different ways to structure this code to work around the problem, but currently this issue causes fast_string_to_time to return nil under JRuby. As a result, JRuby always falls back to the slower fallback_string_to_time for datetime parsing, even if the string could have been parsed by the faster method. This degrades ActiveRecord performance under JRuby whenever datetime or timestamp columns are involved (jruby/activerecord-jdbc-adapter#540).
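One way to structure such code so it doesn't depend on $1-style variables surviving a to_r call is to hold on to the MatchData object in a local variable. The sketch below is an illustration only and is not the Rails implementation; `TIMESTAMP_FORMAT` and `parse_timestamp` are hypothetical names, and the regex is a simplified assumption about the input format.

```ruby
# Hypothetical, simplified timestamp format: "YYYY-MM-DD HH:MM:SS[.fraction]"
TIMESTAMP_FORMAT = /\A(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)(?:\.(\d+))?\z/

def parse_timestamp(string)
  match = TIMESTAMP_FORMAT.match(string)
  return nil unless match

  # All captures are read from the local MatchData object, so regex work done
  # inside later calls (such as to_r) cannot change what we read here.
  year, month, day, hour, min, sec = match.captures.first(6).map(&:to_i)
  fraction = match[7]
  usec = fraction ? ("0.#{fraction}".to_r * 1_000_000).to_i : 0

  Time.utc(year, month, day, hour, min, sec, usec)
end

p parse_timestamp("2014-03-16 12:34:56.5")
```

Reading from a local MatchData rather than the implicit last-match variables keeps the method independent of how any callee treats match data.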

atambo commented Mar 16, 2014

Looks like String#to_r uses a gsub to collapse multiple underscores into a single underscore here:

https://github.com/jruby/jruby/blob/jruby-1_7/core/src/main/java/org/jruby/RubyString.java#L7498

which ends up resetting your last_match to nil because there are no underscores in your strings.
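The underlying Ruby behavior can be seen in isolation: a regex-based gsub that finds nothing sets the last match to nil in the frame where it runs. The snippet below is a minimal sketch in plain Ruby, not JRuby's Java code; the method name `gsub_without_match` is made up for illustration.

```ruby
# Hypothetical demonstration method: a failed regex gsub in the same frame
# replaces the previous match data with nil.
def gsub_without_match
  "2014-03-16" =~ /(\d+)-(\d+)-(\d+)/
  before = Regexp.last_match(1)     # "2014"
  "20140316".gsub(/_+/, "_")        # no underscores, so the pattern never matches
  [before, Regexp.last_match]       # last match is now nil
end

p gsub_without_match
```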

GUI commented Mar 16, 2014

Thanks for the quick response. I updated the gist with other examples that call gsub inside other methods. Even if I add my own similar string method that performs a gsub (String#replace_test), the original scope doesn't lose the match data the way it does when calling to_r.

I would expect the match data to go away if I inlined the gsub call directly into that bar method (and if I do that, it consistently wipes the match data across all Ruby implementations). However, it appears that any gsubs or matches in separate method calls shouldn't wipe the original scope's match data. This mostly seems to work properly in JRuby, with the exception of this to_r call. Is it perhaps something related to that specific gsub call, or somehow to gsub calls originating from Java? Thanks again!
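The distinction between a gsub in a separate method and an inlined gsub can be sketched as follows. These are hypothetical methods written for illustration, not the ones in the gist, and the expectations in the comments describe the behavior reported above for Ruby-level methods.

```ruby
# Hypothetical helper: performs the regex work in its own method frame.
def collapse_underscores(text)
  text.gsub(/_+/, "_")
end

# The failed gsub runs inside the helper, so this frame's match data is untouched.
def via_helper
  "a1" =~ /(\d)/
  collapse_underscores("abc")
  Regexp.last_match
end

# The failed gsub runs in this frame, so it replaces this frame's match data.
def inlined
  "a1" =~ /(\d)/
  "abc".gsub(/_+/, "_")
  Regexp.last_match
end

puts "via helper: #{via_helper.inspect}"
puts "inlined:    #{inlined.inspect}"
```

The to_r case reported here behaves like `inlined` even though the caller never runs a regex itself, which is what makes it surprising.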

atambo commented Mar 17, 2014

Fixed by #1564

atambo closed this Mar 17, 2014

GUI commented Mar 18, 2014

Thanks so much for the speedy fix!
