Support for ISO-8859-16 #1214

Closed
ftomassetti opened this Issue Nov 10, 2013 · 9 comments

ftomassetti commented Nov 10, 2013

Running:

code = IO.read('text_iso_8859_16',{ :encoding => 'ISO-8859-16', :mode => 'rb'})
code = code.encode('UTF-8')

On JRuby 1.7.6 this gives me:

Encoding::ConverterNotFoundError: code converter not found for ISO-8859-16
   encode at org/jruby/RubyString.java:7597
   (root) at bugtest.rb:2

On Ruby 2.0, however, it runs flawlessly.

ftomassetti commented Nov 10, 2013

It also gives me an error with Ruby 1.9.3-p448:

bugtest.rb:2:in `encode': code converter not found (ISO-8859-16 to UTF-8) (Encoding::ConverterNotFoundError)
from bugtest.rb:2:in `<main>'

I would say the Ruby error message is clearer.

I am still confused: the file loads, but it cannot be converted to UTF-8? Does that make sense?
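
On why the file loads but the conversion fails: as far as I can tell, reading with an :encoding option only labels the bytes with that encoding, while encode has to find an actual converter for the pair of encodings. A minimal sketch in plain Ruby, assuming two sample bytes in place of the original file (the byte values are made up for illustration and are not from the report):

```ruby
# Two raw bytes standing in for the contents of an ISO-8859-16 file
# (sample values chosen for illustration only).
tagged = "\xAA\xBA".dup.force_encoding('ISO-8859-16')

# Tagging only changes the encoding label; no converter is involved
# and the underlying bytes are untouched.
puts tagged.encoding
puts tagged.bytes.inspect

# Transcoding has to look up a converter, so it can fail on a runtime
# that knows the encoding's name but has no converter for it.
begin
  utf8 = tagged.encode('UTF-8')
  puts "converted to #{utf8.encoding}"
rescue Encoding::ConverterNotFoundError => e
  puts "no converter available: #{e.message}"
end
```

Whether the rescue branch runs depends on the runtime: per this thread, Ruby 1.9.3 and JRuby 1.7.6 would be expected to hit it, while Ruby 2.0 would not.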

Member

headius commented Nov 11, 2013

First off, it does not appear that OpenJDK has support for ISO-8859-16 in its charset subsystem, which we use for character transcoding. The only way to get support for that encoding would be for us to incorporate a third-party implementation of 8859-16.

We can certainly improve the error but I'd like to actually add support somehow. That may mean finally implementing (porting) the transcoding logic from MRI, so we have identical encoding support (not going to happen until JRuby 9k), or by pulling in some third-party 8859-16 charset impl.

headius closed this in 8bd4963 Nov 11, 2013

Member

headius commented Nov 11, 2013

Please test out jruby/jruby@jruby-1_7 or jruby/jruby@master where I have implemented an ISO-8859-16 charset.

ftomassetti commented Nov 11, 2013

I cloned the repo but I got an error while running maven. I will try again checking out your commit.

ftomassetti commented Nov 11, 2013

OK, after checking out the exact commit (8bd4963) I can build JRuby. Then I ran bin/irb but got the same error. Should I somehow specify where to load the standard libraries from?

Putting the code in a script and running bin/jruby test.rb I got instead:
ISO_8859_16.java:73:in `decodeLoop': java.lang.ArrayIndexOutOfBoundsException: -4
from CharsetDecoder.java:561:in `decode'
from CharsetTranscoder.java:484:in `transcode'
from CharsetTranscoder.java:319:in `primitiveConvert'
from CharsetTranscoder.java:280:in `transcode'
from CharsetTranscoder.java:236:in `transcode'
from EncodingUtils.java:873:in `transcodeLoop'
from EncodingUtils.java:801:in `strTranscode0'
from EncodingUtils.java:736:in `strTranscode'
from EncodingUtils.java:707:in `strEncode'
from RubyString.java:7599:in `encode'
from RubyString$INVOKER$i$encode.gen:-1:in `call'
from CachingCallSite.java:326:in `cacheAndCall'
from CachingCallSite.java:170:in `call'
from test.rb:3:in `__file__'
from test.rb:-1:in `load'
from Ruby.java:811:in `runScript'
from Ruby.java:804:in `runScript'
from Ruby.java:673:in `runNormally'
from Ruby.java:522:in `runFromMain'
from Main.java:395:in `doRunFromMain'
from Main.java:290:in `internalRun'
from Main.java:217:in `run'
from Main.java:197:in `main'

Let me know if I can help with testing this.

Member

mkristian commented Nov 11, 2013

On master, commit ecd3f56 corrupted a method :(
The jruby-1_7 branch seems OK.

Member

headius commented Nov 12, 2013

I'm looking into the breakage.

Member

headius commented Nov 12, 2013

Signed bytes strike again... I have pushed an additional fix to both branches for byte values > 127 which should fix the additional issue you found.

I'll see if I can improve the tests for this encoding.
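
For readers unfamiliar with the "signed bytes" problem: a Java byte is signed, so any byte value above 127 reads as a negative number, and using it directly as an index into a decode table fails. The -4 in the exception above is what byte 0xFC looks like when read as signed. The sketch below reproduces the arithmetic in Ruby; it is an illustration only, not the actual fix in the commit, and DECODE_TABLE is a hypothetical placeholder rather than the real ISO-8859-16 mapping.

```ruby
# Hypothetical 256-entry decode table; the identity mapping is a
# placeholder for a real ISO-8859-16 to Unicode table.
DECODE_TABLE = (0...256).to_a.freeze

# One byte above 127, held as raw binary data.
raw = "\xFC".dup.force_encoding('BINARY')

# 'c' unpacks the byte as a signed 8-bit integer, which mirrors
# how Java's byte type would see it.
signed = raw.unpack('c').first

# Masking with 0xFF maps the value back into the range 0..255,
# which is safe to use as a table index.
unsigned = signed & 0xFF

puts signed                  # -4
puts unsigned                # 252
puts DECODE_TABLE[unsigned]  # 252 with the placeholder table
```

Note that Ruby arrays accept negative indices and count from the end, so the unmasked lookup would silently return a wrong entry here instead of raising the way a Java array does.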

ftomassetti commented Nov 17, 2013

Running fine for me. Thanks!
