lazy map adds extra array layer to single-element array output #5044
Comments
I found another bug that might be related. After a take, map erroneously strips arrays, and has some weird behaviour, including an incorrect ArgumentError:
[66] pry(main)> [[1],[2],[3]].lazy.take(2).map { |elem| p elem }.force
1
2
=> [1, 2]
[67] pry(main)> [[],[2],[3]].lazy.take(2).map { |elem| p elem }.force
nil
2
=> [nil, 2]
[68] pry(main)> [[1, 2],[2],[3]].lazy.take(2).map { |elem| p elem }.force
1
2
=> [1, 2]
[69] pry(main)> [[1, 3],[2],[3]].lazy.take(2).map { |elem| p elem }.force
1
2
=> [1, 2]
[70] pry(main)> [[1, 3],[2],[3]].lazy.take(2).map { |elem| elem.itself }.force
=> [1, 2]
[71] pry(main)> [[1, 3],[2],[3]].lazy.take(2).map(&:itself).force
ArgumentError: wrong number of arguments calling `itself` (1 for 0)
from uri:classloader:/jruby/kernel/enumerator.rb:72:in `block in map'
[72] pry(main)>
Hmm...well if this issue is specific to
It does look like it's just specific to lazy; I couldn't replicate any of the bugs in non-lazy enumeration. Looking at the Ruby code, it seems to me that it might be some issue with the argument splats (there is some imbalance between the splatting into the yielder and the splatting of arguments for the block between select, take, and map, for instance), but I haven't studied the code enough to be very sure.
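For reference, here is a small sketch of what I'd expect the same chains to return under MRI, where each element should pass through untouched. The last part is only my reading of the ArgumentError above: if `[1, 3]` were splatted before reaching `&:itself`, the call would amount to `1.itself(3)`. The variable names are just for illustration.

```ruby
nested = [[1, 3], [2], [3]]

# Under MRI the inner arrays should survive take + map unchanged.
with_block  = nested.lazy.take(2).map { |elem| elem }.force
with_symbol = nested.lazy.take(2).map(&:itself).force
with_empty  = [[], [2], [3]].lazy.take(2).map { |elem| elem }.force

# Assumption: an over-eager splat turns [1, 3].itself into 1.itself(3),
# which raises because #itself takes no arguments.
splat_error =
  begin
    :itself.to_proc.call(1, 3)
  rescue ArgumentError => e
    e.class
  end

p with_block, with_symbol, with_empty, splat_error
```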
I know that other Ruby-in-Ruby impls have had to introduce special non-Ruby semantics for some of these argument passing scenarios. I had hoped we had such cases covered inside our "native" block and Enumerable logic, but perhaps we're missing something more.
Could this be a bug in Yielder#<<? JRuby's RubyYielder.op_lshift is very different from RubyYielder.yield:
https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/RubyYielder.java#L105-L113
But MRI's yielder_yield_push just calls yielder_yield as is:
https://github.com/ruby/ruby/blob/trunk/enumerator.c#L1239-L1243
This change seems to fix this specific issue:
diff --git a/core/src/main/ruby/jruby/kernel/enumerator.rb b/core/src/main/ruby/jruby/kernel/enumerator.rb
index 9a0076d2e5..481ea7c828 100644
--- a/core/src/main/ruby/jruby/kernel/enumerator.rb
+++ b/core/src/main/ruby/jruby/kernel/enumerator.rb
@@ -73,7 +73,7 @@ class Enumerator
     def map
       _block_error(:map) unless block_given?
       Lazy.new(self) do |yielder, *values|
-        yielder << yield(*values)
+        yielder.yield yield(*values)
       end.__set_inspect :map
     end
     alias_method :collect, :map
This new functionality in op_lshift started here: And it looks like there are already comments there about how that commit may have caused issues; however, reverting that commit causes a regression.
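To make the `<<` versus `yield` distinction concrete, here is a minimal sketch of how the two behave on a plain Enumerator under MRI, as far as I understand it: `<<` pushes one value and returns the yielder (so it chains), while `yield` accepts several values at once. For a single array argument the two should be indistinguishable. The variable names are only illustrative.

```ruby
# << returns the yielder itself, so pushes can be chained.
chained = Enumerator.new { |y| y << 1 << 2 }.to_a

# yield can take several values; to_a packs them into one array element.
multi = Enumerator.new { |y| y.yield 1, 2 }.to_a

# With exactly one array argument, both forms should give the same result.
via_push  = Enumerator.new { |y| y << [1] }.to_a
via_yield = Enumerator.new { |y| y.yield [1] }.to_a

p chained, multi, via_push, via_yield
```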
Oh it looks like this issue already has a full and rich history: #3814
Lot of layers to this, with some other bugs as well:
[21] pry(main)> p Enumerator.new{|y| y << []}.first
nil
=> nil
[22] pry(main)> p Enumerator.new{|y| y.yield []}.first
nil
=> nil
[11] pry(main)> p Enumerator.new{|y| y << [1]}.to_a
[[[1]]]
=> [[[1]]]
[12] pry(main)> p Enumerator.new{|y| y.yield [1]}.to_a
[[1]]
=> [[1]]
[13] pry(main)> p Enumerator.new{|y| y << [1]}.first
[1]
=> [1]
[14] pry(main)> p Enumerator.new{|y| y.yield [1]}.first
1
=> 1
@glucero yeah, looks like it.
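For comparison, this is what I'd expect those same six expressions to produce under MRI: the array that was pushed or yielded comes back as-is from `first`, and `to_a` wraps it exactly once. Treat it as a sketch of the expected behaviour rather than a spec; the variable names are mine.

```ruby
push_empty  = Enumerator.new { |y| y << [] }.first
yield_empty = Enumerator.new { |y| y.yield [] }.first

push_to_a   = Enumerator.new { |y| y << [1] }.to_a
yield_to_a  = Enumerator.new { |y| y.yield [1] }.to_a

push_first  = Enumerator.new { |y| y << [1] }.first
yield_first = Enumerator.new { |y| y.yield [1] }.first

p push_empty, yield_empty, push_to_a, yield_to_a, push_first, yield_first
```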
Hmm, Yielder is native and could be the culprit indeed.
It might be worth it (and not too complicated) to make a pure-Ruby Yielder and see how it behaves. One area where we have a lot of trouble is mapping all the various block-dispatch semantics to Java calls.
Pure-ruby yielder should be relatively easy. I'm not fully versed in MRI's C API, but just looking at the sources, it seems roughly similar to:
class Yielder
  def initialize(&block)
    @block = block
  end
  def yield(*args)
    @block.call(*args)
  end
  alias_method :<<, :yield
end
Testing with a simple implementation of a subset of Enumerator, it appears to work about as expected:
class MyEnumerator
  include Enumerable
  def initialize(&block)
    @block = block
  end
  def each(&block)
    if block_given?
      @block.call(Yielder.new(&block))
    else
      self
    end
  end
end
This appears to work as expected in MRI, but JRuby still gives some erroneous results in some circumstances (I'm guessing due to faulty Enumerable methods):
[11] pry(main)> MyEnumerator.new{|y| y << [1]}.to_a
=> [[1]]
[12] pry(main)> MyEnumerator.new{|y| y.yield [1]}.to_a
=> [[1]]
[13] pry(main)> MyEnumerator.new{|y| y << [1]}.first
=> 1
[14] pry(main)> MyEnumerator.new{|y| y.yield [1]}.first
=> 1
[15] pry(main)> MyEnumerator.new{|y| y << []}.first
=> nil
[16] pry(main)> MyEnumerator.new{|y| y.yield []}.first
=> nil
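One thing the aliased version above doesn't cover is chaining: since `<<` is an alias of `yield`, it returns the block's result rather than the yielder, so `y << 1 << 2` wouldn't behave like the built-in. Here is a variant sketch that separates the two. `SketchYielder` and `SketchEnumerator` are hypothetical names chosen to avoid clashing with anything real; this is not JRuby's or MRI's implementation.

```ruby
# Hypothetical pure-Ruby yielder, for illustration only.
class SketchYielder
  def initialize(&block)
    @block = block
  end

  # Passes any number of values through and returns the block's result.
  def yield(*args)
    @block.call(*args)
  end

  # Pushes exactly one value and returns self so pushes can be chained.
  def <<(value)
    @block.call(value)
    self
  end
end

# Hypothetical minimal enumerator built on the yielder above.
class SketchEnumerator
  include Enumerable

  def initialize(&source)
    @source = source
  end

  def each(&block)
    return self unless block_given?
    @source.call(SketchYielder.new(&block))
    self
  end
end

p SketchEnumerator.new { |y| y << 1 << 2 }.to_a
```

Everything else (`to_a`, `map`, and so on) comes from Enumerable, so any remaining oddities would point at the Enumerable methods rather than the yielder.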
Might it make sense to implement Enumerator, Enumerable, and Yielder all in pure Ruby? There'd have to be a lot of interaction between them and some Fiber magic, but it seems like it'd be more manageable to avoid a large amount of the Ruby/Java shuffling here.
@Taywee It would indeed, and we've discussed this frequently because of exactly these problems. The weird vagaries of parameter passing for blocks work "perfectly" in pure Ruby because they pass through the actual Ruby compiler/JIT, while Enumerable, Enumerator, and Yielder are all trying to emulate that on a case-by-case basis. Also the Enumerable code sucks. I think it would be a good time to do this in 9.2. What do you think, @enebo? In the short term I think it would be totally fine to move Yielder into Ruby, since it's only a small amount of code.
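For anyone following along, these are the kinds of block parameter rules being referred to, shown with plain procs and lambdas. This is just a sketch of standard Ruby behaviour, not anything specific to JRuby internals; the variable names are illustrative.

```ruby
# A single parameter receives the array as-is.
one_param  = proc { |a| a }.call([1, 2])

# Two parameters make the proc auto-splat a lone array argument.
two_params = proc { |a, b| [a, b] }.call([1, 2])

# A rest-only parameter does not auto-splat; it wraps the array.
rest_param = proc { |*a| a }.call([1, 2])

# A trailing comma forces the splat and keeps only the first element.
lead_comma = proc { |a,| a }.call([1, 2])

# Extra arguments to a proc are silently dropped.
extra_args = proc { |a| a }.call(1, 2)

# Lambdas are strict about arity and do not auto-splat on call.
strict =
  begin
    lambda { |a, b| [a, b] }.call([1, 2])
  rescue ArgumentError => e
    e.class
  end

p one_param, two_params, rest_param, lead_comma, extra_args, strict
```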
@headius I've written up a quick-and-dirty Enumerator and Enumerable implementation here, which should be complete (though not necessarily bug-free yet), short of Lazy. I've done some minor testing of it and most of what I expect to be correct is correct. There are some changes still due to make it behave in line with the way JRuby does coercion and error detection, and I think there are places where it uses
I'll have a look at what you've come up with! There are a few other pure-Ruby implementations out there, like the one from Rubinius and the adapted version in TruffleRuby. Both of those use some internal functions we'd need to expose, such as for coercing various values. I did try just a pure-Ruby Yielder and it only regressed one spec. For 9.1.x it might be safest to just apply a one-off fix, as we have with other Enumerable and Enumerator methods. But I think it's time we moved to pure Ruby for the whole Enumerable stack in 9.2.
Yeah, adapting the Rubinius implementation might be a better option, being established and well-tested and all. Any route to getting to a more stable and trustable Enumera* set.
We are now finally starting to incorporate some inlining, which means the possibility of moving more Enumerator and Enumerable methods to Ruby has real legs. I'm going to bump this one more version and we'll see if we can start getting some of these methods moved over. Note this would also help issues like #4108 #4212 and others.
Started poking at this a bit, to see if only partially moving to Ruby would help. It turns out that it's a pretty mixed bag:
The next step would be making Enumerable#first be pure-Ruby, and this is where things start to break down. The only known compliant implementations of #first in Ruby originate in Rubinius here: https://github.com/rubinius/rubinius/blob/master/core/enumerable.rb#L583-L589
The logic for https://github.com/rubinius/rubinius/blob/master/core/enumerable.rb#L826-L843 And this shows that the argument processing for almost all the Enumerable methods has a bit more complexity than pure Ruby can actually handle, with this special
We ought to be able to mimic that logic, though perhaps it will require work to the block/proc yield/call logic. However, this makes the task rather more complicated than we would do in a minor release. We also are not yet in a position to optimize through the various pure-Ruby
@enebo Do we think that's a prerequisite, or not? The current implementation obviously doesn't yield and specialize any better, but it does avoid a lot of DynamicScope allocation by using native block bodies. We won't be able to eliminate that for pure-Ruby blocks without inlining.
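To give a rough idea of the argument packing involved, here is a naive sketch of a zero-argument `first` written with a rest-arg block. It is not the Rubinius code linked above and it ignores the special cases that comment alludes to; it only approximates the simple rule "no values gives nil, one value passes through, several values become an array". `sketch_first` is a hypothetical helper name.

```ruby
# Hypothetical helper: approximates Enumerable#first without arguments.
def sketch_first(enum)
  enum.each do |*args|
    # Zero or one yielded value: return it (nil when nothing was yielded).
    # Several yielded values: return them packed into one array.
    return args.size <= 1 ? args.first : args
  end
  nil # the enumeration produced no elements at all
end

p sketch_first(Enumerator.new { |y| y << [1] })
p sketch_first(Enumerator.new { |y| y.yield 1, 2 })
```

The rest-only block is what keeps a lone array from being splatted; a `|a, b|` block in the same position would take it apart.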
Also, from the comments on GH-3814 (which are now passing): `Enumerator.new { |y| y.yield([1]) }.lazy.map { |e| e }.to_a` still seems not to behave as it does in MRI.
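For reference, a short sketch of what I'd expect from MRI here: a single yielded array reaches the map block intact, while several yielded values are spread across the block's parameters. The variable names are only for illustration.

```ruby
# One array yielded: the block should receive the array itself.
single = Enumerator.new { |y| y.yield([1]) }.lazy.map { |e| e }.to_a

# Two values yielded: the block should receive them as two arguments.
pair = Enumerator.new { |y| y.yield(1, 2) }.lazy.map { |a, b| a + b }.to_a

p single, pair
```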
Environment
jruby 9.1.15.0 (2.3.3) 2017-12-07 929fde8 OpenJDK 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 +jit [linux-x86_64]
Linux nyarlathotep 4.14.11-gentoo-r2 #1 SMP Thu Jan 4 17:46:16 MST 2018 x86_64 Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz GenuineIntel GNU/Linux
Expected Behavior
The expected behavior should be as shown in standard Ruby:
Actual Behavior
Run from vanilla jruby's IRB, it looks like this: