UTF-8 string interpolation in US-ASCII string sometimes returns wrong US-ASCII string #1242

Closed
rsim opened this Issue Nov 19, 2013 · 3 comments

@rsim
rsim commented Nov 19, 2013

Create a file quoting.rb with the following content:

def quote(value)
  "'#{value}'"
end

And a file quote_encoding.rb with the following content:

# encoding: utf-8

require "yaml"
require "quoting"

s = YAML.dump(abc: "āčē")

puts s.encoding # => UTF-8
puts quote(s).encoding  # => US-ASCII
puts quote(s.force_encoding("UTF-8")).encoding # => UTF-8

Executing quote_encoding.rb produces the following output:

UTF-8
US-ASCII
UTF-8

The second result should also be UTF-8.

In JRuby 1.7.4 the result is UTF-8 in all cases, but in JRuby 1.7.5, 1.7.6 and 1.7.8 the second result is US-ASCII.

@headius
Member
headius commented Nov 19, 2013

Confirmed on 1.7.8. Appears to work right on master.

@headius
Member
headius commented Nov 19, 2013

Also confirmed on jruby-1_7, both interpreted and compiled. Strange that master does not exhibit the problem.

This is very similar to #1204 as well.

headius added a commit that closed this issue Nov 21, 2013
Re-port StringIO and methods required for it.
This incorporates missing flags, string modification, and encoding
logic into StringIO by doing a line-by-line behavior port from
MRI.

Fixes #1204.
Fixes #1242.
61014cf
headius closed this in 61014cf Nov 21, 2013
@headius
Member
headius commented Nov 21, 2013

This is now fixed, but we need a test for it.
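
A regression check for the report above could look roughly like the following. This is only a sketch, not the test that was eventually added to JRuby. It assumes a Ruby with squiggly heredocs (2.3 or later), writes a temporary stand-in for quoting.rb, and pins that file to US-ASCII with an explicit magic comment, since a file without a magic comment only defaulted to US-ASCII in Ruby 1.9 mode. The constant QUOTING_SOURCE and the use of Tempfile are illustrative choices, not part of the original reproduction.

```ruby
require "yaml"
require "tempfile"

# Hypothetical stand-in for quoting.rb. The magic comment is an assumption:
# it forces the US-ASCII source encoding that the original file had
# implicitly under Ruby 1.9 mode.
QUOTING_SOURCE = <<~'RUBY'
  # encoding: us-ascii
  def quote(value)
    "'#{value}'"
  end
RUBY

# Write the stand-in to a temporary file and load it so that the string
# literal inside quote is parsed with US-ASCII source encoding.
Tempfile.create(["quoting", ".rb"]) do |file|
  file.write(QUOTING_SOURCE)
  file.flush
  load file.path
end

# "\u..." escapes produce a UTF-8 string regardless of this file's encoding.
s = YAML.dump("abc" => "\u0101\u010D\u0113")
quoted = quote(s)

puts s.encoding
puts quoted.encoding
```

Both lines should print UTF-8 on an implementation without the bug. Loading the helper from a separate file keeps the US-ASCII literal and the UTF-8 value in different source encodings, which is the combination the report describes.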