YAML.load_file() should return parsed content but returns false #1824

Closed
donv opened this Issue Jul 17, 2014 · 3 comments

@donv
donv commented Jul 17, 2014

Given a file en.yml with the content

count: 2

The following command

ruby -ryaml -e "puts YAML.load_file('en.yml')"

should output

{"count"=>2}

This works on MRI 1.9, 2.0, and 2.1 and on JRuby 1.7.x, but fails on master, where the output is

false
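
For anyone who wants to check this without creating en.yml by hand, here is a minimal self-contained sketch of the same reproduction. The temporary directory and the `result` variable are only there for illustration; the file content and the `YAML.load_file` call are the ones from the report.

```ruby
require 'yaml'
require 'tmpdir'

# Write the one-line YAML file from the report into a scratch directory,
# load it back, and return the parsed value from the block.
result = Dir.mktmpdir do |dir|
  path = File.join(dir, 'en.yml')
  File.write(path, "count: 2\n")
  YAML.load_file(path)
end

# A working runtime yields a Hash with the single key "count";
# the affected master build yielded false instead.
p result
raise "unexpected result: #{result.inspect}" unless result == { 'count' => 2 }
```
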
@headius
JRuby Team member
headius commented Jul 17, 2014

I ran into this while getting tests green for new IO stuff. I think we need to re-port some of the native side of Psych...something's not doing what the .rb bits want.

@mjc
mjc commented Jul 18, 2014

This is likely similar to what I'm getting in #1815.

@headius
JRuby Team member
headius commented Sep 9, 2014

I have figured out the problem and I'm fixing it now.

In psych.rb, load_file and parse_file open the File with a BOM-detecting external encoding ("r:bom|utf-8"). As a result, the IO reads the start of the file to check for a byte order mark. With the new IO implementation, that read ends up buffering the entire file. This is a good thing.
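
As a rough sketch of that code path (not the actual psych.rb source, which may differ between versions), opening the file with the same mode string and handing the IO to the parser looks like this. The Tempfile setup and variable names are assumptions for illustration only.

```ruby
require 'yaml'
require 'tempfile'

# Scratch file standing in for en.yml.
file = Tempfile.new(['en', '.yml'])
file.write("count: 2\n")
file.close

encoding = nil
# "r:bom|utf-8" makes the IO peek at the first bytes for a byte order mark
# before any caller reads from it. The parser is then given the IO itself.
parsed = File.open(file.path, 'r:bom|utf-8') do |io|
  encoding = io.external_encoding
  YAML.load(io)
end
file.unlink

p encoding
p parsed
```

The point is that the parser receives an IO whose first bytes have already been read by the BOM check, so whatever the parser reads through must account for that.
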

However, our Psych Parser impl uses RubyIO.getInputStream, which creates an InputStream wrapping the Channel that backs the IO. That Channel has already been drained by the time we get to IOInputStream, so there is no data left and nothing parses.

I'm fixing it by making the RubyIO getChannel/get*Stream methods actually return an object aware of the buffering.
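
To illustrate the difference between the two kinds of stream, here is a toy model in Ruby. It is not JRuby code: `BufferedSource`, `peek`, and the StringIO standing in for the Channel are all hypothetical names chosen for this sketch.

```ruby
require 'stringio'

# Hypothetical model of an IO that keeps its own read buffer
# in front of a raw byte source (the "channel").
class BufferedSource
  attr_reader :channel

  def initialize(channel)
    @channel = channel # raw, unbuffered byte source
    @buffer = +''      # bytes already pulled out of the channel
  end

  # Mimics the BOM check: pulls everything from the channel into the
  # buffer and returns the first n bytes without consuming them.
  def peek(n)
    @buffer << @channel.read.to_s
    @buffer[0, n]
  end

  # Buffer-aware read: hand out buffered bytes first, then whatever
  # is still left in the channel.
  def read
    data = @buffer + @channel.read.to_s
    @buffer = +''
    data
  end
end

source = BufferedSource.new(StringIO.new("count: 2\n"))
source.peek(3) # the BOM check drains the channel into the buffer

raw_view   = source.channel.read.to_s # what a stream wrapping the channel sees
aware_view = source.read              # what a buffer-aware stream sees

p raw_view
p aware_view
```

In this model the channel-wrapping view comes back empty, which matches the "nothing parses" symptom, while the view that goes through the buffering object still sees the full content.
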

@headius headius added a commit that closed this issue Sep 9, 2014
@headius headius Make RubyIO.get*Stream return buffer-aware stream impls.
Because RubyIO has its own internal buffers, which are often used
at construction time to check a stream's BOM, we need to make sure
these pseudo-streams reflect those buffers by calling through the
same IO logic as Ruby.

I am unsure if we should do the same for getChannel, since there
are a number of consumers (socket subsystem, SSL subsystem) that
depend on getting the actual, unbuffered channel from the IO. For
now, added javadoc indicating that the channel returned should not
be used at the same time as the IO it came from.

Fixes #1824.
4768e02
@headius headius closed this in 4768e02 Sep 9, 2014
@headius headius added the stdlib label Sep 9, 2014
@headius headius added this to the JRuby 9000 milestone Sep 9, 2014