> Encoding::Converter.new(Encoding::UTF_8, Encoding::UTF_8_MAC)
Encoding::ConverterNotFoundError: code converter not found (UTF-8 to UTF8-MAC)
from org/jruby/RubyConverter.java:162:in `initialize'
from org/jruby/RubyConverter.java:135:in `initialize'
The previous implementation was quite slow. This leverages some of the
transcoding abilities built into Ruby 1.9 instead. It is roughly 96%
The round trip through UTF_8_MAC here is needed because Ruby won't let you
transcode from UTF_8 to UTF_8. I chose the closest encoding I could
find as an intermediate.
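To illustrate the round-trip idea described in the commit message, here is a minimal sketch. It is not the code from the commit: the helper name `tidy_via_roundtrip` is hypothetical, and it uses UTF-16BE as the intermediate encoding rather than UTF_8_MAC, on the assumption that a converter for it is available on the target runtime.

```ruby
# Hypothetical helper (not the Rails implementation): replace invalid bytes
# by transcoding through an intermediate encoding and back to UTF-8.
# UTF-16BE is an assumption chosen here instead of UTF_8_MAC.
def tidy_via_roundtrip(str, intermediate = Encoding::UTF_16BE)
  str.dup
     .force_encoding(Encoding::UTF_8)
     .encode(intermediate, invalid: :replace, undef: :replace, replace: "?")
     .encode(Encoding::UTF_8)
end

dirty = "abc\xFFdef"              # \xFF is never valid in UTF-8
clean = tidy_via_roundtrip(dirty)
puts clean.valid_encoding?        # the result should now be valid UTF-8
```

Because the source and destination encodings differ, Ruby has to actually decode the input, which is what gives the `invalid: :replace` option a chance to act.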
In order to support UTF_8_MAC we'll need to port the whole transcoding subsystem. Currently we're using Java's Charset logic to transcode, and it does not support UTF_8_MAC.
My understanding of UTF_8_MAC is that it prefers to use combining characters rather than single codepoints, so UTF_8 to UTF_8_MAC and back is not likely to round-trip in all cases.
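The difference between a single codepoint and a base character plus a combining character can be seen with standard Unicode normalization. This sketch uses `String#unicode_normalize` with NFD as a stand-in for the decomposition behaviour; it does not exercise UTF_8_MAC itself, whose exact rules may differ.

```ruby
# Precomposed "é" is one codepoint; its decomposed form is "e" followed by
# U+0301 COMBINING ACUTE ACCENT. NFD is used as a stand-in, not UTF_8_MAC.
composed   = "\u00E9"
decomposed = composed.unicode_normalize(:nfd)

puts composed.codepoints.inspect
puts decomposed.codepoints.inspect
puts composed == decomposed   # the two strings differ byte for byte
```

Any step in a pipeline that decomposes or recomposes characters can therefore hand back a string that renders identically but compares as different.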
I would suggest that instead of this hack, Rails should use some version of the pure-Ruby String#scrub I implemented (and I think @yorickpeterse improved) from this issue: rubinius/rubinius#2912
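As a rough idea of what a pure-Ruby approach could look like, here is a minimal sketch. It is not the implementation from the linked Rubinius issue; the method name `scrub_sketch` is hypothetical, and it replaces each invalid character individually, which may not match the replacement rules of the built-in `String#scrub` (available since Ruby 2.1) for every input.

```ruby
# Hypothetical pure-Ruby scrub sketch: walk the string character by
# character and swap any invalid character for a replacement string.
def scrub_sketch(str, replacement = "\uFFFD")
  return str.dup if str.valid_encoding?

  str.each_char.map { |ch| ch.valid_encoding? ? ch : replacement }.join
end

puts scrub_sketch("abc\xFFdef", "?")
```

This avoids `Encoding::Converter` entirely, so it does not depend on any particular transcoder being present.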
Note that this version does not successfully handle all bad characters on JRuby due to incompatibilities in the Charset-based transcoding pipeline (#1459), but for strings with malformed input or no errors, it will work fine and not have the error above.
I will mark this as a bug for JRuby 9k, since by then we should have a proper port of MRI's transcoding logic.