Fix tidy_bytes for JRuby #13919

jcoyne · 2014-02-02T04:16:24Z

The previous implementation was broken because JRuby (1.7.10) doesn't
have a code converter for UTF-8 to UTF8-MAC.

jcoyne · 2014-02-02T04:16:35Z

@norman @burke can I get your input on this?

jcoyne · 2014-02-02T04:16:58Z

@headius I wouldn't mind your review either.

burke · 2014-02-03T01:22:54Z

I wouldn't be surprised if this turns out to be measurably slower. I chose UTF-8 Mac because it's so nearly identical that I figured it should be pretty trivial to transcode, whereas UTF-16 is a completely different encoding.

I don't have any numbers to back this hunch up, however, and even if it is slower, it may not make much of a difference in real cases.

It should, in theory, work. I can't think of any case where a character would be representable in UTF-8 but not UTF-16 (or vice versa).
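For illustration, the idea of transcoding through UTF-16 can be sketched as below. This is a minimal sketch and not the code in this pull request: `tidy_via_utf16` is a hypothetical name, and the Windows-1252 fallback for bytes that are not valid UTF-8 is an assumption about the intended behavior.

```ruby
# Hypothetical helper, not the Rails implementation.
# Transcodes UTF-8 to UTF-16LE and back; bytes that are not valid UTF-8
# are reinterpreted as Windows-1252 (an assumption made for this sketch).
def tidy_via_utf16(string)
  reader = Encoding::Converter.new(Encoding::UTF_8, Encoding::UTF_16LE)
  source = string.dup # primitive_convert consumes its source buffer
  out = String.new.force_encoding(Encoding::UTF_16LE)

  loop do
    reader.primitive_convert(source, out)
    _, _, _, error_bytes, _ = reader.primitive_errinfo
    break if error_bytes.nil? # no more invalid bytes to handle

    # Recode the offending bytes from Windows-1252 and keep going.
    out << error_bytes.encode(Encoding::UTF_16LE, Encoding::Windows_1252,
                              invalid: :replace, undef: :replace)
  end

  reader.finish
  out.encode(Encoding::UTF_8)
end
```

Valid UTF-8 input should pass through unchanged, since every character representable in UTF-8 is also representable in UTF-16.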

The comment at the top of the method needs minor rewording too.

jcoyne · 2014-02-03T02:52:37Z

I don't think it should be any slower. It will have a larger memory footprint, but it seems better to get this working on JRuby than to save a few bytes.

tenderlove · 2014-02-09T07:22:06Z

Should we merge this?

norman · 2014-02-09T11:50:40Z

Has anyone actually benchmarked it? We've all chipped away at this method
over the years to make it fast, and I'd hate to take a step backwards. If
it turns out to be slower on MRI, we could just set the transitional encoding
conditionally based on the Ruby platform. I would do it myself right now,
but I'm traveling.
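A conditional choice of transitional encoding could look roughly like this. It is only a sketch: `transitional_encoding` is a hypothetical name, and checking `RUBY_ENGINE` is one assumed way to detect JRuby.

```ruby
# Hypothetical helper: choose the intermediate encoding per Ruby engine.
# Assumes JRuby reports RUBY_ENGINE == "jruby".
def transitional_encoding(engine = RUBY_ENGINE)
  if engine == "jruby"
    Encoding::UTF_16LE # JRuby 1.7.10 has no UTF-8 to UTF8-MAC converter
  else
    Encoding::UTF8_MAC # nearly identical to UTF-8
  end
end
```

The trade-off is an extra code path that has to be tested on both platforms.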

jcoyne · 2014-02-09T20:07:09Z

Here's a benchmark run on ruby 2.1.0p0. It doesn't show any appreciable difference.

https://gist.github.com/jcoyne/8905114
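For readers who want to run a comparison of their own, a rough timing sketch follows. It is not the contents of the gist above: `time_transcode`, the sample input, and the iteration count are all assumptions made for illustration, and it only times a plain round trip through each intermediate encoding.

```ruby
# Hypothetical timing helper: round-trips input through the target encoding
# and back to UTF-8, returning the elapsed seconds.
def time_transcode(target, input, iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { input.encode(target).encode(Encoding::UTF_8) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Sample input with a few non-ASCII characters (an arbitrary choice).
input = ("h\u00E9llo w\u00F6rld " * 100).freeze

[Encoding::UTF8_MAC, Encoding::UTF_16LE].each do |target|
  puts format("%-10s %.4fs", target.name, time_transcode(target, input, 1_000))
end
```

Timings will vary by machine and Ruby version, so treat any single run as indicative only.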

norman · 2014-02-09T22:12:20Z

Awesome. Let's merge it!

arunagw · 2014-02-09T22:14:51Z

This will require a rebase with master as well.

jcoyne · 2014-02-10T14:11:29Z

@arunagw rebased.

burke · 2014-02-10T16:44:31Z

👍 from me too.

Fix tidy_bytes for JRuby

ae28e4b

The previous implementation was broken because JRuby (1.7.10) doesn't have a code converter for UTF-8 to UTF8-MAC.

rafaelfranca added a commit that referenced this pull request Feb 10, 2014

Merge pull request #13919 from jcoyne/fix_jruby_encoding

e063dbc

Fix tidy_bytes for JRuby

rafaelfranca merged commit e063dbc into rails:master Feb 10, 2014

jcoyne deleted the fix_jruby_encoding branch February 10, 2014 16:50
