New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose two Enumerator-related bugs #3814

Merged
merged 3 commits into from Aug 24, 2016

Conversation

Projects
None yet
5 participants
@nightscape
Contributor

nightscape commented Apr 20, 2016

JRuby seems to have two bugs related to Enumerators yielding Arrays. They can be reproduced when comparing the following code snippets which correctly return true for MRI, but false on JRuby.

  1. << should be an alias for .yield

    Enumerator.new {|y| y.yield([1])}.to_a == [[1]] # MRI: true, JRuby: true
    Enumerator.new {|y| y << [1]}.to_a ==  [[1]] # MRI: true, JRuby: false ([[[1]]] instead)
  2. map on lazy Enumerators should not unwrap Arrays

    Enumerator.new {|y| y.yield([1])}.map {|e| e}.to_a == [[1]] # MRI: true, JRuby: true
    Enumerator.new {|y| y.yield([1])}.lazy.map {|e| e}.to_a == [[1]] # MRI: true, JRuby: false ([1] instead)

I added corresponding specs, but I'm not sure if they're in the right place and format.

@headius headius added this to the JRuby 9.1.1.0 milestone Apr 28, 2016

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Apr 28, 2016

Member

These are definitely still true and probably related to our block arg-processing logic.

We can merge this but until we fix it we'd need to tag them for JRuby.

Member

headius commented Apr 28, 2016

These are definitely still true and probably related to our block arg-processing logic.

We can merge this but until we fix it we'd need to tag them for JRuby.

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape Apr 29, 2016

Contributor

So, this is the commit that breaks << vs yield. It doesn't contain a test so it's hard to see what the erroneous behavior without it was.
Does anyone remember?

Contributor

nightscape commented Apr 29, 2016

So, this is the commit that breaks << vs yield. It doesn't contain a test so it's hard to see what the erroneous behavior without it was.
Does anyone remember?

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape May 2, 2016

Contributor

Ok, so the corresponding PR for the above commit is this one and the corresponding issue
The functionality that gets fixed by the commit is the following:

Enumerator.new{|y| y << [42]}.first == [42] # Without the commit it seems to be 42

I'll add a test for this case as well and hopefully find an implementation that passes all tests.

Contributor

nightscape commented May 2, 2016

Ok, so the corresponding PR for the above commit is this one and the corresponding issue
The functionality that gets fixed by the commit is the following:

Enumerator.new{|y| y << [42]}.first == [42] # Without the commit it seems to be 42

I'll add a test for this case as well and hopefully find an implementation that passes all tests.

@@ -5,4 +5,7 @@
describe "Enumerator::Lazy#map" do
it_behaves_like :enumerator_lazy_collect, :map

This comment has been minimized.

@eregon

eregon May 2, 2016

Member

There should be an extra blank line here.

@eregon

eregon May 2, 2016

Member

There should be an extra blank line here.

nightscape added some commits May 2, 2016

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape May 2, 2016

Contributor

I added a spec for the bug that the commit fixes.
Does anyone have an idea how to properly fix this?

Contributor

nightscape commented May 2, 2016

I added a spec for the bug that the commit fixes.
Does anyone have an idea how to properly fix this?

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 2, 2016

Member

@nightscape We've realized for a while we need a rework of how Enumera{ble,tor} passes values through. The current structure tends to lose the incoming argument layout and not match it properly with parameter binding.

9.1.1 might be a good time for us to look into fixing this.

I'm curious: do you have an app or library failing to work because of the current bugs in Enumera{ble,tor}?

Member

headius commented May 2, 2016

@nightscape We've realized for a while we need a rework of how Enumera{ble,tor} passes values through. The current structure tends to lose the incoming argument layout and not match it properly with parameter binding.

9.1.1 might be a good time for us to look into fixing this.

I'm curious: do you have an app or library failing to work because of the current bugs in Enumera{ble,tor}?

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape May 2, 2016

Contributor

Yes, we have a part of our code where we use Enumerators to efficiently stream data through a calculation pipeline to keep memory consumption low. We were running on MRI without issues but moved over to JRuby (we've ported parts over to Spark in order to speed up things) and suddenly things were failing in strange ways. We're not using lazy enumerators anymore now and convert the Enumerator into a real array in order to workaround this issue.

Contributor

nightscape commented May 2, 2016

Yes, we have a part of our code where we use Enumerators to efficiently stream data through a calculation pipeline to keep memory consumption low. We were running on MRI without issues but moved over to JRuby (we've ported parts over to Spark in order to speed up things) and suddenly things were failing in strange ways. We're not using lazy enumerators anymore now and convert the Enumerator into a real array in order to workaround this issue.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 2, 2016

Member

@nightscape Were the problems you had only in lazy Enumerator or in regular Enumerator too?

Member

headius commented May 2, 2016

@nightscape Were the problems you had only in lazy Enumerator or in regular Enumerator too?

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape May 2, 2016

Contributor

Both. We ran into the two problems that I've mentioned in the first post. They almost "cancel each other out" but then again not really ;)

Contributor

nightscape commented May 2, 2016

Both. We ran into the two problems that I've mentioned in the first post. They almost "cancel each other out" but then again not really ;)

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 2, 2016

Member

Ok, I'll try to prioritize this for the first (hopefully quick) maintenance release of 9.1. Thank you for the test cases...they will help us sort out exactly what to fix.

Member

headius commented May 2, 2016

Ok, I'll try to prioritize this for the first (hopefully quick) maintenance release of 9.1. Thank you for the test cases...they will help us sort out exactly what to fix.

@nightscape

This comment has been minimized.

Show comment
Hide comment
@nightscape

nightscape May 2, 2016

Contributor

Cool, thanks a lot! If I can help with anything where one doesn't have to be an expert in JRuby internals let me know!

Contributor

nightscape commented May 2, 2016

Cool, thanks a lot! If I can help with anything where one doesn't have to be an expert in JRuby internals let me know!

@headius headius modified the milestones: JRuby 9.1.1.0, JRuby 9.1.2.0 May 11, 2016

@enebo enebo modified the milestones: JRuby 9.1.2.0, JRuby 9.1.3.0 May 25, 2016

@codeforkjeff

This comment has been minimized.

Show comment
Hide comment
@codeforkjeff

codeforkjeff Aug 14, 2016

To chime in, I encountered this bug in a data processing script I was trying to run on JRuby.

I have an understanding of this problem being caused by splats and arg destructuring in /core/src/main/ruby/jruby/kernel/enumerator.rb (and possibly RubyYielder.java?) but haven't come up with a fix. If anyone has any suggestions for an approach to fixing this, i could try to implement it and submit a PR. But as it stands, it's a bit dizzying to puzzle through.

codeforkjeff commented Aug 14, 2016

To chime in, I encountered this bug in a data processing script I was trying to run on JRuby.

I have an understanding of this problem being caused by splats and arg destructuring in /core/src/main/ruby/jruby/kernel/enumerator.rb (and possibly RubyYielder.java?) but haven't come up with a fix. If anyone has any suggestions for an approach to fixing this, i could try to implement it and submit a PR. But as it stands, it's a bit dizzying to puzzle through.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 23, 2016

Member

Sadly still broken in 9.1.3.0, even with arg fixes. We need a rework of Enumerable/Enumerator to fix this, I suspect.

Member

headius commented Aug 23, 2016

Sadly still broken in 9.1.3.0, even with arg fixes. We need a rework of Enumerable/Enumerator to fix this, I suspect.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 23, 2016

Member

This may simply be a bit of bad behavior in the Yielder#<< implementation. We do the re-wrapping dance...MRI does not.

/* :nodoc: */
static VALUE
yielder_yield(VALUE obj, VALUE args)
{
    struct yielder *ptr = yielder_ptr(obj);

    return rb_proc_call(ptr->proc, args);
}

/* :nodoc: */
static VALUE
yielder_yield_push(VALUE obj, VALUE args)
{
    yielder_yield(obj, args);
    return obj;
}

This patch gets your examples working. I am checking tests now.

diff --git a/core/src/main/java/org/jruby/RubyYielder.java b/core/src/main/java/org/jruby/RubyYielder.java
index 6494e4d..1376db4 100644
--- a/core/src/main/java/org/jruby/RubyYielder.java
+++ b/core/src/main/java/org/jruby/RubyYielder.java
@@ -96,17 +96,13 @@ public class RubyYielder extends RubyObject {
     }

     @JRubyMethod(rest = true)
-    public IRubyObject yield(ThreadContext context, IRubyObject[]args) {
+    public IRubyObject yield(ThreadContext context, IRubyObject[] args) {
         checkInit();
         return proc.call19(context, args, Block.NULL_BLOCK);
     }

     @JRubyMethod(name = "<<", rest = true)
-    public IRubyObject op_lshift(ThreadContext context, IRubyObject[]args) {
-        if (args.length == 1 &&
-                args[0] instanceof RubyArray &&
-                ((RubyArray) args[0]).getLength() == 1)
-            args[0] = RubyArray.newArray(context.runtime, args[0]);
+    public IRubyObject op_lshift(ThreadContext context, IRubyObject[] args) {
         yield(context, args);
         return this;
     }
Member

headius commented Aug 23, 2016

This may simply be a bit of bad behavior in the Yielder#<< implementation. We do the re-wrapping dance...MRI does not.

/* :nodoc: */
static VALUE
yielder_yield(VALUE obj, VALUE args)
{
    struct yielder *ptr = yielder_ptr(obj);

    return rb_proc_call(ptr->proc, args);
}

/* :nodoc: */
static VALUE
yielder_yield_push(VALUE obj, VALUE args)
{
    yielder_yield(obj, args);
    return obj;
}

This patch gets your examples working. I am checking tests now.

diff --git a/core/src/main/java/org/jruby/RubyYielder.java b/core/src/main/java/org/jruby/RubyYielder.java
index 6494e4d..1376db4 100644
--- a/core/src/main/java/org/jruby/RubyYielder.java
+++ b/core/src/main/java/org/jruby/RubyYielder.java
@@ -96,17 +96,13 @@ public class RubyYielder extends RubyObject {
     }

     @JRubyMethod(rest = true)
-    public IRubyObject yield(ThreadContext context, IRubyObject[]args) {
+    public IRubyObject yield(ThreadContext context, IRubyObject[] args) {
         checkInit();
         return proc.call19(context, args, Block.NULL_BLOCK);
     }

     @JRubyMethod(name = "<<", rest = true)
-    public IRubyObject op_lshift(ThreadContext context, IRubyObject[]args) {
-        if (args.length == 1 &&
-                args[0] instanceof RubyArray &&
-                ((RubyArray) args[0]).getLength() == 1)
-            args[0] = RubyArray.newArray(context.runtime, args[0]);
+    public IRubyObject op_lshift(ThreadContext context, IRubyObject[] args) {
         yield(context, args);
         return this;
     }
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 23, 2016

Member

Stuffed my change into a PR at #4106. We'll see how it looks.

Member

headius commented Aug 23, 2016

Stuffed my change into a PR at #4106. We'll see how it looks.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Well I realized after reading @nightscape's comment above, I realized that #4106 basically would just revert #2458, which fixed #2376. It's obvious the fix isn't quite right, but we regress a different way if we remove it.

Member

headius commented Aug 24, 2016

Well I realized after reading @nightscape's comment above, I realized that #4106 basically would just revert #2458, which fixed #2376. It's obvious the fix isn't quite right, but we regress a different way if we remove it.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Replacing our Yielder impl (ed: with a pure Ruby version) does not appear to fix the issue either. The problem seems to lie in the proc logic.

Member

headius commented Aug 24, 2016

Replacing our Yielder impl (ed: with a pure Ruby version) does not appear to fix the issue either. The problem seems to lie in the proc logic.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

Ok, replacing both Yielder#yield and Enumerable#first makes it behave the way we expect, for all the original cases plus the one from #2376:

class Enumerator::Yielder
    def yield(*args)
      @proc.call(*args)
    end
end

module Enumerable
  def first
    val = nil
    catch(:foo) do
      each do |x|
        val = x
        throw :foo
      end
    end
    val
  end
end

p Enumerator.new {|y| y.yield([1])}.to_a
p Enumerator.new {|y| y << [1]}.to_a
p Enumerator.new {|y| y.yield([1])}.map {|e| e}.to_a
p Enumerator.new {|y| y.yield([1])}.lazy.map {|e| e}.to_a
p Enumerator.new {|y| y.yield [1]}.first

Output

$ jruby blah.rb
[[1]]
[[1]]
[[1]]
[[1]]
[1]

This is more demonstration that our Java-based Enumerable has several hard-to-fix flaws left in it. I think we're going to need to move to a Ruby-based impl sooner rather than later.

Member

headius commented Aug 24, 2016

Ok, replacing both Yielder#yield and Enumerable#first makes it behave the way we expect, for all the original cases plus the one from #2376:

class Enumerator::Yielder
    def yield(*args)
      @proc.call(*args)
    end
end

module Enumerable
  def first
    val = nil
    catch(:foo) do
      each do |x|
        val = x
        throw :foo
      end
    end
    val
  end
end

p Enumerator.new {|y| y.yield([1])}.to_a
p Enumerator.new {|y| y << [1]}.to_a
p Enumerator.new {|y| y.yield([1])}.map {|e| e}.to_a
p Enumerator.new {|y| y.yield([1])}.lazy.map {|e| e}.to_a
p Enumerator.new {|y| y.yield [1]}.first

Output

$ jruby blah.rb
[[1]]
[[1]]
[[1]]
[[1]]
[1]

This is more demonstration that our Java-based Enumerable has several hard-to-fix flaws left in it. I think we're going to need to move to a Ruby-based impl sooner rather than later.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 24, 2016

Member

This PR kinda became a place to discuss the fix, but I'd like to get these specs in. I'm going to merge, tag, and open a separate bug for actually fixing the issue.

Member

headius commented Aug 24, 2016

This PR kinda became a place to discuss the fix, but I'd like to get these specs in. I'm going to merge, tag, and open a separate bug for actually fixing the issue.

@headius headius merged commit 3ce1432 into jruby:master Aug 24, 2016

0 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build failed
Details

headius added a commit that referenced this pull request Aug 24, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment