Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot traverse already closed generator - v8.35 change broke our LazyCollection #36837

Closed
jryd opened this issue Mar 31, 2021 · 13 comments
Closed

Comments

@jryd
Copy link
Contributor

jryd commented Mar 31, 2021

  • Laravel Version: 8.35.1
  • PHP Version: 7.4.14
  • Database Driver & Version: MySQL 5.6

Description:

We've been using LazyCollections in conjunction with a generator function to page from our APIs.

We've been doing this for about 6 months now.

In a recent upgrade from v8.34 to v8.35 some previously working code is now erroring out with:

Cannot traverse an already closed generator

I saw in a recent PR (#36738) that some changes were made to LazyCollections. I'm not entirely sure how the changes cause our code to break, but downgrading to v8.34 (before the change was made) resolves the issue.

Steps To Reproduce:

We have a service class that looks roughly like this:

class ExternalData
{
    private PendingRequest $client;

    public function __construct(PendingRequest $client)
    {
        $this->client = $client;
    }

    public function all()
    {
        return new LazyCollection($this->fetchData());
    }

    private function fetchData($page = 1)
    {
        do {
            $response = $this->client
                ->get(sprintf('/api/to/external/service?page=%s', $page++))
                ->json();

            foreach ($response['data'] as $result) { // $response['data'] is an array of objects/models
                yield $result;
            }

        } while ($response['next_page_url'] !== null);
    }
}

When we use this code, we've been using it like so:

$externalData = $this->externalData // $this->externalData is the injected ExternalData class
    ->all()
    ->filter(fn ($data) => true) // we perform actual filtering here but its enough to still cause the error
    ->count()

As mentioned, this code works in v8.34 but breaks in v8.35.

Let me know if you need any further information.

@driesvints
Copy link
Member

Ping @victorlap ^

@JosephSilber do you agree we should revert the PR that broke this?

@victorlap
Copy link
Contributor

victorlap commented Apr 1, 2021

@jryd I think this was working before because the lazy collection was not "truly" lazy. Upon initialization it would page through the whole api before continuing with the filter step (and are accessing the data more than once), that is exactly the reason I made the change.

The problem lies in the fact that the generator is used for a second time. If you would like this code to work, you could try and instantiate a regular collection and make it lazy:

Collection::make($this->fetchData())->lazy()

Or use the remember method:

new LazyCollection($this->fetchData())->remember()

I believe the code works in the intended way currently, as the following test demonstrates:

$iterable = function () {
    yield 'a' => 1;
    yield 'b' => 2;
    yield 'c' => 3;
};
$data = LazyCollection::make($iterable());
echo $data->filter(fn($_) => true)->count(); // 3

@dkulyk
Copy link
Contributor

dkulyk commented Apr 1, 2021

I have had same issue when trying reuse collection

LazyCollection(Model::query()->cursor())
->chunk(200)
->each(fn($collection) => EloquentClollection::make($collection)->load($with))
->flatten(1) // Reuse $collection from previous step
->each(...) 

After changing each to map works well.

@victorlap
Copy link
Contributor

Alright, it seems the previous behavior (although misleiding since it is not really lazy) is in use for a lot of people. I think reverting the PR would be the best option here.

@dkulyk
Copy link
Contributor

dkulyk commented Apr 1, 2021

Alright, it seems the previous behavior (although misleiding since it is not really lazy) is in use for a lot of people. I think reverting the PR would be the best option here.

In previous behaviour if you pass Generator instance to LazyCollection it converts it to array. It's wrong behaviour.

Code:

function generator()
{
    for ($i = 0; $i < 3; ++$i) {
        echo 'IN: ', $i, PHP_EOL;
        yield $i;
    }
}

LazyCollection::make(generator())
    ->each(function ($i) {
        echo 'OUT: ', $i, PHP_EOL;
    });

Laravel 8.34 converts generator/iterator instance to array

IN: 0
IN: 1
IN: 2
OUT: 0
OUT: 1
OUT: 2

Laravel 8.35 works well

IN: 0
OUT: 0
IN: 1
OUT: 1
IN: 2
OUT: 2

@crynobone
Copy link
Member

@dkulyk above logic only valid based on PR to 8.35. However the original behaviour suggest to pass a function to return a generator. https://laravel.com/docs/8.x/collections#creating-lazy-collections

This still seem to be a breaking change when we attempt to add support to pass in a generator directly.

@victorlap
Copy link
Contributor

Would this kind of change be acceptable for the next major version? I still think that the correct output should be the one from the latest version.

@dkulyk
Copy link
Contributor

dkulyk commented Apr 1, 2021

I mean what LazyCollection before 8.35 works with generator as like array. Yes, I must be pass closure for generator instead generator instance.

That mean what LazyCollection::make(generator()) is wrong used in LazyCollection before 8.35

@driesvints
Copy link
Member

Reverting this: #36844

@dkulyk
Copy link
Contributor

dkulyk commented Apr 1, 2021

@jryd Please test on 8.34 this

public function all()
{
    return new LazyCollection(function() {
        do {
            $response = $this->client
                ->get(sprintf('/api/to/external/service?page=%s', $page++))
                ->json();

            foreach ($response['data'] as $result) { // $response['data'] is an array of objects/models
                yield $result;
            }

        } while ($response['next_page_url'] !== null);
    });
}

@JosephSilber
Copy link
Member

The ability to pass a Generator directly to the LazyCollection constructor keeps on coming up again and again.

As I've outlined here, this is a bad idea. Not only because Generators aren't rewindable; it breaks the collection's immutability.


I should really add an explicit test that ensures passing a Generator to the constructor enumerates them all right away.

At least that way, repeat expeditions in the future will realize that this was a deliberate choice.

@driesvints
Copy link
Member

Thanks @JosephSilber. Would really appreciate that PR if you can send it in.

@JosephSilber
Copy link
Member

Would this kind of change be acceptable for the next major version? I still think that the correct output should be the one from the latest version.

This should not be changed in the next release, since it's wrong. I'll add a warning to the docs that passing in a Generator will enumerate the whole thing right away.

If anything, the next release should throw an exception when passing in a Generator directly, since that's most likely a mistake by the developer (as evident by the OP's code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants