LazyCollection chunk using generator delegation #31761

Closed
ankr opened this issue Mar 5, 2020 · 5 comments

ankr commented Mar 5, 2020

  • Laravel Version: 6.18.0
  • PHP Version: 7.3.13

Description:

When using generator delegation in combination with LazyCollection::chunk() and a chunk size greater than the number of values yielded by the initial generator, all values from the delegated generator are lost.

With a chunk size smaller than (or equal to) the number of values yielded by the initial generator, everything works as expected.

Steps To Reproduce:

Make generator

<?php
function gen($stop = false) {
    yield 1;
    yield 2;
    yield 3;

    if (! $stop) {
        yield from gen(true);
    }
}

Make collection

<?php
$collection = LazyCollection::make(function () {
    yield from gen();
});

Chunk and inspect collection

<?php
// all good
$collection->chunk(1)->toArray(); // [[1], [2], [3], [1], [2], [3]]
$collection->chunk(2)->toArray(); // [[1, 2], [3, 1], [2, 3]]
$collection->chunk(3)->toArray(); // [[1, 2, 3], [1, 2, 3]]

// starting to lose values
$collection->chunk(4)->toArray(); // [[1, 2, 3]]
$collection->chunk(5)->toArray(); // [[1, 2, 3]]

@driesvints
Member

@JosephSilber I guess this isn't possible with LazyCollections?

@JosephSilber
Contributor

This has nothing to do with lazy collections. You can see the same behavior using native generators:

function generate($times = 1) {
    yield 1;
    yield 2;
    yield 3;

    if ($times > 1) {
        yield from generate($times - 1);
    }
}

iterator_to_array(generate(2));

The above returns [1, 2, 3]: each value appears only once, even though six values were yielded.


This happens because every time you yield within a generator function without an explicit key, it actually yields a numerically indexed key/value pair. Each such yield increments the index.

However, when you yield from, the inner generator yields its own indices, starting again from 0. So in my example above, it would be equivalent to this:

function generate() {
    yield 0 => 1;
    yield 1 => 2;
    yield 2 => 3;

    yield 0 => 1;
    yield 1 => 2;
    yield 2 => 3;
}

When you later convert these to an array, the later entries overwrite the earlier ones that share the same index.
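
To make the duplicate keys visible, here is a minimal sketch in plain PHP (no Laravel required). It reuses the generate() function from above; the variable names are only illustrative. It records every key/value pair the generator emits and then compares iterator_to_array() with and without key preservation:

```php
<?php
// Same shape as generate() in the comment above.
function generate($times = 1) {
    yield 1;
    yield 2;
    yield 3;

    if ($times > 1) {
        yield from generate($times - 1);
    }
}

// Record each key/value pair exactly as the generator emits it.
$pairs = [];
foreach (generate(2) as $key => $value) {
    $pairs[] = [$key, $value];
}
// $pairs now holds the keys 0, 1, 2 followed by 0, 1, 2 again.

// Default behaviour: keys are preserved, so duplicate keys collapse.
$withKeys = iterator_to_array(generate(2));

// Passing false as the second argument discards the keys and reindexes.
$withoutKeys = iterator_to_array(generate(2), false);

var_dump($pairs, $withKeys, $withoutKeys);
```

The foreach loop sees all six values because it never stores them by key; the loss only happens once the pairs are written into an array keyed by the generator's own indices.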


The solution is to call values() on your collection, to reset all indices:

$collection->values()->chunk(4)->toArray(); // [[1, 2, 3, 1], [2, 3]]
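
For readers without a Laravel install at hand, the effect of values() can be approximated in plain PHP. This is only a sketch: reindex() and chunkByKey() are hypothetical helpers written for illustration, not Laravel's actual implementation. chunkByKey() stores items under their incoming keys, which is assumed here to be the reason duplicate keys shrink a chunk:

```php
<?php
// The generator from the original report.
function gen($stop = false) {
    yield 1;
    yield 2;
    yield 3;

    if (! $stop) {
        yield from gen(true);
    }
}

// Hypothetical helper: re-yield only the values, so PHP assigns
// fresh sequential keys (0, 1, 2, ...) to the outer generator.
function reindex(iterable $items) {
    foreach ($items as $value) {
        yield $value;
    }
}

// Hypothetical helper: naive chunking that keeps the incoming keys.
// Items with a duplicate key overwrite each other inside a chunk.
function chunkByKey(iterable $items, int $size) {
    $chunk = [];
    foreach ($items as $key => $value) {
        $chunk[$key] = $value;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }
    if ($chunk !== []) {
        yield $chunk; // emit the final, possibly shorter chunk
    }
}

// Without reindexing: keys 0, 1, 2 repeat, so the chunk never reaches size 4.
$broken = iterator_to_array(chunkByKey(gen(), 4), false);

// With reindexing: keys run from 0 to 5, so no value is overwritten.
$fixed = array_map(
    'array_values',
    iterator_to_array(chunkByKey(reindex(gen()), 4), false)
);

var_dump($broken, $fixed);
```

Under these assumptions $broken contains a single chunk and $fixed contains two, matching the chunk(4) results shown in this thread.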

@driesvints
Member

Hey @ankr, please see the answer above.

@JosephSilber thanks!

@ankr
Author

ankr commented Mar 5, 2020

@JosephSilber Thank you for the thorough explanation! It all makes sense.

@drupol

drupol commented Aug 13, 2020

Hi,

I wrote a blog post explaining this: https://not-a-number.io/2020/lazy-collection-oddities/

I only found this thread after I had started sharing the link to the post. Thanks @JosephSilber for the explanation!
