Encoding::UndefinedConversionError: Input length = 1 #318

lkfken opened this Issue · 6 comments



# coding: utf-8
string = %q{你好}

puts Encoding.default_external
puts string.encoding

File.open("test.xml", "w") do |f|
  f.write string
end

With jruby 1.7.0.RC1 (1.9.3p203) 2012-09-25 8e849de on Java HotSpot(TM) Client VM 1.6.0_35-b10 [Windows XP-x86], the output...

Encoding::UndefinedConversionError: Input length = 1
   write at org/jruby/
  (root) at X:/ene/RubyScripts/1development/try_msword/app.rb:24
    open at org/jruby/
  (root) at X:/ene/RubyScripts/1development/try_msword/app.rb:23
    load at org/jruby/
  (root) at -e:1

Process finished with exit code 1

With MRI ruby 1.9.3p125 (2012-02-16) [i386-mingw32], the output...


Process finished with exit code 0

Apparently, Encoding.default_external returns different values.
Also, I cannot write to a file if the string is UTF-8 under JRuby. However, the same code works fine under MRI Ruby.

Charles Oliver Nutter

This seems to be a lingering issue with Windows-1252 being used as the default external encoding in JRuby when it should not be. Can you try passing -Eutf-8 when running your script, to force the external encoding to UTF-8?
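
For anyone following along, here is a minimal sketch of why a Windows-1252 external encoding would produce this error. It assumes the write path transcodes the UTF-8 string to the external encoding, which is what the stack trace suggests but is not confirmed here. The helper name `convertible?` is made up for illustration.

```ruby
# coding: utf-8

# Hypothetical helper: reports whether text can be transcoded
# to the named encoding without hitting an unmappable character.
def convertible?(text, encoding_name)
  text.encode(encoding_name)
  true
rescue Encoding::UndefinedConversionError
  # Raised when a character has no equivalent in the target encoding.
  false
end

greeting = "你好"
puts convertible?(greeting, "Windows-1252")  # Chinese characters have no Windows-1252 mapping
puts convertible?(greeting, "UTF-8")         # already UTF-8, nothing to convert
```

Windows-1252 is a single-byte Western European code page, so any transcoding of these characters to it has to fail.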

Thomas E Enebo

Should we be UTF-8 or should we be file.encoding? We are happy with UTF-8 on other platforms since Java's file.encoding defaults to that, but MRI does not default to that on those platforms? I can see both sides of this.

If we matched the defaults as specified by MRI, we would probably get fewer bug reports.


@headius with -Eutf-8, JRuby is able to generate the file test.xml just fine under Windows XP. No more conversion error. Thank you.
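
As a possible alternative to the command-line flag, the external encoding can also be pinned per file in the open mode string, so the result should not depend on Encoding.default_external. This is only a sketch and was not tested on the JRuby build from this report. The helper `write_utf8` and the temporary directory are assumptions for illustration.

```ruby
# coding: utf-8
require "tmpdir"

# Hypothetical helper: writes text using an explicit UTF-8 external
# encoding ("w:utf-8") instead of relying on the process default.
def write_utf8(path, text)
  File.open(path, "w:utf-8") { |f| f.write text }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "test.xml")
  write_utf8(path, "你好")
  # Each of the two characters takes 3 bytes in UTF-8.
  puts File.binread(path).bytesize
end
```

Setting the encoding in the mode string keeps the script portable across machines whose default external encoding differs.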

Charles Oliver Nutter

This will likely go away with 2.0 support, where the default encoding is always UTF-8.

Charles Oliver Nutter (headius) self-assigned this
Charles Oliver Nutter

Oops, I started to close this before realizing it was a problem on Windows. @lkfken can you test a recent jruby master build?


jruby (1.9.3p392) 2014-10-28 4e93f31 on Java HotSpot(TM) Client VM 1.7.0_07-b10 +jit [Windows XP-x86]

Same source code.

Even without the flag -Eutf-8, the test file is generated fine. No conversion error.

Thank you.

