New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JRuby starts executing Enumerator code too soon #4583

Closed
janko-m opened this Issue May 2, 2017 · 5 comments

Comments

Projects
None yet
2 participants
@janko-m

janko-m commented May 2, 2017

Environment

$ jruby -v
jruby 9.1.8.0 (2.3.1) 2017-03-06 90fc7ab Java HotSpot(TM) 64-Bit Server VM 25.40-b25 on 1.8.0_40-b27 +jit [darwin-x86_64]
$ uname -a
Darwin Jankos-MacBook-Pro.local 16.4.0 Darwin Kernel Version 16.4.0: Thu Dec 22 22:53:21 PST 2016; root:xnu-3789.41.3~3/RELEASE_X86_64 x86_64

Expected Behavior

For the given code

enumerator = Enumerator.new do |yielder|
  yielder << nil
  puts "EXECUTED"
end

enumerator.next

MRI 2.4.1 doesn't output anything. I expected JRuby to behave the same, to pause and not to execute any code after yield, since we requested only the first element of the Enumerator.

Actual Behavior

However, JRuby outputs

EXECUTED

In other words, JRuby started executing the code after yield, even though we only asked for the first element.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 3, 2017

Member

The problem is that the thread started to power Enumerator#next does not pause after yielding the nil result to the yielder. It doesn't pause until the next iteration, because there's nobody waiting for that value. As a result, it executes past the yield.

I believe I already fixed this for Fiber. What I didn't do is reimplement Enumerator#next in terms of Fiber, as MRI does.

Member

headius commented May 3, 2017

The problem is that the thread started to power Enumerator#next does not pause after yielding the nil result to the yielder. It doesn't pause until the next iteration, because there's nobody waiting for that value. As a result, it executes past the yield.

I believe I already fixed this for Fiber. What I didn't do is reimplement Enumerator#next in terms of Fiber, as MRI does.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 3, 2017

Member

If you are really just using this yielder pattern, you could swap in Fiber and it would work ok.

fiber = Fiber.new {
  Fiber.yield nil
  puts "EXECUTED"
}

fiber.resume

This will not output anything, because you only resume and get the first value, after which the Fiber properly transfers control back and waits.

Member

headius commented May 3, 2017

If you are really just using this yielder pattern, you could swap in Fiber and it would work ok.

fiber = Fiber.new {
  Fiber.yield nil
  puts "EXECUTED"
}

fiber.resume

This will not output anything, because you only resume and get the first value, after which the Fiber properly transfers control back and waits.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 3, 2017

Member

More on reimplementing Enumerator#next in terms of Fiber...

I have prototyped this before, and the amount of code required is very small (all implemented in Ruby). The piece that's missing is our short-circuited logic for when the data being enumerated comes from an Array.

RubyEnumerator, implemented in Java, has at least two kinds of next logic: one that uses a pseudo-fiber to actually drive the nexting like a fiber, and one that just acts as a cursor into the given collection. The latter is significantly lighter-weight, both because it doesn't have to transfer values out of a fiber and because it doesn't have to spin up a native thread.

If (when?) we move Enumerator#next to use a fiber, we will still want to keep the optimized versions for known collections.

Member

headius commented May 3, 2017

More on reimplementing Enumerator#next in terms of Fiber...

I have prototyped this before, and the amount of code required is very small (all implemented in Ruby). The piece that's missing is our short-circuited logic for when the data being enumerated comes from an Array.

RubyEnumerator, implemented in Java, has at least two kinds of next logic: one that uses a pseudo-fiber to actually drive the nexting like a fiber, and one that just acts as a cursor into the given collection. The latter is significantly lighter-weight, both because it doesn't have to transfer values out of a fiber and because it doesn't have to spin up a native thread.

If (when?) we move Enumerator#next to use a fiber, we will still want to keep the optimized versions for known collections.

@headius headius added this to the JRuby 9.2.0.0 milestone May 3, 2017

@janko-m

This comment has been minimized.

Show comment
Hide comment
@janko-m

janko-m commented May 4, 2017

Thanks you @headius!

janko-m added a commit to janko-m/http that referenced this issue May 19, 2017

Don't use a buffer as a workaround for a JRuby bug
As described in jruby/jruby#4583, JRuby starts
executing code after the first `yield` even though we requested only the
first element, resulting in the first chunk being overriden with the
second chunk before it was even returned.

We work around this by not using a buffer string, therefore each
retrieved chunk is a new string, so even if JRuby immediately retrieves
the second chunk, it won't affect the first chunk.

janko-m added a commit to janko-m/http that referenced this issue Jul 2, 2017

Remove Enumerator from Request::Body#each
Using Enumerator interface like Enumerator#next wouldn't work correctly
on JRuby with IOs, because JRuby eagerly loads the next element before
it returns the current, and because each next chunk overwrites the
previous one, it is impossible to retrieve first chunk via
Enumerable#next.

jruby/jruby#4583

In order to avoid potential bugs in the future of someone wanting to use
this Enumerator interface, we remove this entirely and mandate people to
pass in a block.

ixti added a commit to httprb/http that referenced this issue Aug 18, 2017

Remove Enumerator from Request::Body#each
Using Enumerator interface like Enumerator#next wouldn't work correctly
on JRuby with IOs, because JRuby eagerly loads the next element before
it returns the current, and because each next chunk overwrites the
previous one, it is impossible to retrieve first chunk via
Enumerable#next.

jruby/jruby#4583

In order to avoid potential bugs in the future of someone wanting to use
this Enumerator interface, we remove this entirely and mandate people to
pass in a block.
@janko-m

This comment has been minimized.

Show comment
Hide comment
@janko-m

janko-m Mar 3, 2018

I'm closing this one as a duplicate of #5007.

janko-m commented Mar 3, 2018

I'm closing this one as a duplicate of #5007.

@janko-m janko-m closed this Mar 3, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment