Encoding::UndefinedConversionError: Input length = 1 #318

Open
lkfken opened this Issue · 6 comments

lkfken

SOURCE:

#coding: utf-8
string = %q{你好}

puts Encoding.default_external
puts string.encoding
File.open("test.xml", "w") do |f|
  f.write string
end

With jruby 1.7.0.RC1 (1.9.3p203) 2012-09-25 8e849de on Java HotSpot(TM) Client VM 1.6.0_35-b10 [Windows XP-x86], the output...

Windows-1252
UTF-8
Encoding::UndefinedConversionError: Input length = 1
   write at org/jruby/RubyIO.java:1401
  (root) at X:/ene/RubyScripts/1development/try_msword/app.rb:24
    open at org/jruby/RubyIO.java:1180
  (root) at X:/ene/RubyScripts/1development/try_msword/app.rb:23
    load at org/jruby/RubyKernel.java:1045
  (root) at -e:1

Process finished with exit code 1

With MRI ruby 1.9.3p125 (2012-02-16) [i386-mingw32], the output...

UTF-8
UTF-8

Process finished with exit code 0

Apparently, Encoding.default_external returns different values under the two implementations: Windows-1252 under JRuby and UTF-8 under MRI.
Also, under JRuby I cannot write the UTF-8 string to a file, whereas the same code works fine under MRI.
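
The exception is consistent with the string being transcoded to Windows-1252, which has no mapping for these characters. A minimal sketch that shows this independently of the platform default follows; `encodable?` is a hypothetical helper written for this illustration, not part of Ruby or JRuby, and whether JRuby 1.7.0.RC1 takes exactly this path internally is an assumption.

```ruby
# coding: utf-8

# Hypothetical helper (not part of Ruby or JRuby): reports whether +str+
# can be transcoded to +encoding_name+ without hitting an unmappable character.
def encodable?(str, encoding_name)
  str.encode(encoding_name)
  true
rescue Encoding::UndefinedConversionError
  false
end

puts encodable?("你好", "Windows-1252")  # Windows-1252 cannot represent these characters
puts encodable?("你好", "UTF-8")         # already UTF-8, so no conversion is needed
```

If the first call reports `false`, any write that transcodes to Windows-1252 can be expected to fail the same way.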

Charles Oliver Nutter
Owner

This seems to be a lingering issue with Windows-1252 being used as the default external encoding in JRuby when it should not be. Can you try passing -Eutf-8 when running your script, to force the external encoding to UTF-8?
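
An in-script alternative to the -Eutf-8 flag is to name the external encoding when opening the file, so the write does not depend on Encoding.default_external. This is a sketch using standard Ruby 1.9 IO options; it was not tested in this thread against the affected JRuby build.

```ruby
# coding: utf-8
string = "你好"

# "w:UTF-8" sets the file's external encoding explicitly, so the write
# does not rely on the platform's default external encoding.
File.open("test.xml", "w:UTF-8") do |f|
  f.write string
end

# Read the file back with the same explicit encoding to check the round trip.
round_trip = File.read("test.xml", encoding: "UTF-8")
puts round_trip == string
```

Setting the encoding per file keeps the change local to the one write, whereas -Eutf-8 changes the default for the whole process.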

Thomas E Enebo
Owner

Should we be UTF-8, or should we follow file.encoding? We are happy with UTF-8 on other platforms since Java's file.encoding defaults to that there, but MRI does not default to that on those platforms? I can see both sides of this.

If we matched the defaults that MRI uses, we would probably get fewer bug reports.

lkfken

@headius with -Eutf-8, JRuby is able to generate the file test.xml just fine under Windows XP. No more conversion error. Thank you.

Charles Oliver Nutter
Owner

This will likely go away with 2.0 support, where the default encoding is always UTF-8.

Charles Oliver Nutter headius self-assigned this
Charles Oliver Nutter
Owner

Oops, I started to close this before realizing it was a problem on Windows. @lkfken can you test a recent jruby master build?

lkfken

jruby 1.7.16.1 (1.9.3p392) 2014-10-28 4e93f31 on Java HotSpot(TM) Client VM 1.7.0_07-b10 +jit [Windows XP-x86]

Same source code.

Even without the flag -Eutf-8, the test file is generated fine. No conversion error.

Thank you.
