Async file downloads (sink option) are triggering complete handler too soon #1504

Closed
sebastiaanluca opened this issue Jun 26, 2016 · 7 comments
Labels
lifecycle/stale No activity for a long time

Comments

@sebastiaanluca

Is it normal that an async request is considered completed almost instantly when using a pool in combination with the sink option to write to a file?

I've tried every variation I could think of. I expect each transfer to finish completely before the next one starts, especially since I set the number of parallel transfers, but instead it starts downloading the entire queue at once.

use GuzzleHttp\Pool;
use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;

$client = new Client();

$requests = function ($total) use ($client) {
    $uri = 'url/to/video.mp4';
    for ($i = 0; $i < $total; $i++) {
        // Write the remote file to a local target
        yield $client->getAsync($uri, ['sink' => 'targetFile']);
    }
};

$pool = new Pool($client, $requests(100), [
    'concurrency' => 2,
    'fulfilled' => function ($response, $index) {
        // Called for each successful response
    },
    'rejected' => function ($reason, $index) {
        // Called for each failed request
    },
]);

// Initiate the transfers and create a promise
$promise = $pool->promise();

// Force the pool of requests to complete.
$promise->wait();
@mtdowling
Member

Is it normal that an async request is considered completed almost instantly when using a pool in combination with the sink option to write to a file?

No? Can you provide more information? What do you see when you look at debug output? Are you using cURL or the stream wrapper?

@sebastiaanluca
Author

How can I enable debug output, and where can I find it? I'm not sure what I'm using other than the defaults. I'm simply instantiating a new Guzzle client, requesting a video, and having it write to a file.

To check whether the problem was related to the pool or any other app-specific code, I placed this in the run method of a PHP thread:

$client = new Client();

$client->get($this->source, [
    // Write output to target file
    'sink' => $this->target,
]);

var_dump('request complete!');

The result is that it executes the request and prints "request complete!" after about half a second. The thread itself does not exit until the target file has been completely downloaded. So in effect the GET request returns as completed before it should, while the transfer is still in progress.
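
On the debug output question: Guzzle has a `debug` request option that accepts either `true` or a stream resource returned by `fopen()`. The sketch below is only an illustration of combining it with `sink`. The helper name `debugDownloadOptions` and the log file path are hypothetical and not part of Guzzle.

```php
<?php
// Hypothetical helper: builds request options that write the response body
// to $target and send the handler's debug output to $logFile.
function debugDownloadOptions(string $target, string $logFile): array
{
    return [
        // Same `sink` usage as in the snippet above.
        'sink'  => $target,
        // `debug` takes true or a writable stream resource from fopen().
        'debug' => fopen($logFile, 'w'),
    ];
}

// Usage, assuming guzzlehttp/guzzle is installed and autoloaded:
// $client = new GuzzleHttp\Client();
// $client->get($source, debugDownloadOptions($target, '/tmp/guzzle-debug.log'));
```

With the cURL handler this should produce cURL's verbose transfer log, which would show whether the body is still arriving after the call returns.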

@mtdowling
Member

Are you using threads (like with pthreads)? Guzzle is not thread safe and will not work in a multithreaded application.


@sebastiaanluca
Author

Yes, pthreads. And it works, but it has the same issue as when I'm not using threads (the threaded run was just a test), so it doesn't really matter. The request is made, I think it is marked complete as soon as the headers arrive, and only then does the file download.

@rbruhn

rbruhn commented Apr 10, 2017

I'm experiencing this same issue. The code to dump the ids is executed, then the script doesn't end until all the downloads finish. I normally fire off a Beanstalk job using those ids, but if the files are still downloading it screws things up.

Is there a way to make it wait for the downloads to complete?

$requests = function () use ($api)
{
    foreach ($this->sources as $index => $source)
    {
        $filePath = config('config.classifications_download') . '/' . $index . '.csv';

        yield $index => function($poolOpts) use ($api, $source, $filePath) {
            $reqOpts = [
                'sink' => $filePath
            ];
            if (is_array($poolOpts) && count($poolOpts) > 0) {
                $reqOpts = array_merge($poolOpts, $reqOpts); // req > pool
            }

            return $api->getHttpClient()->getAsync($source, $reqOpts);
        };
    }
};

$responses = Pool::batch($api->getHttpClient(), $requests(), [
    'concurrency' => 10,
    'fulfilled'   => function ($response, $index)
    {
        return $index;
    },
    'rejected'    => function ($reason, $index)
    {
        return $index;
    }
]);

// Handle responses here.... collect ids into array

dd($ids);

// Fire new Beanstalk job using ids.
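
The thread does not establish a root cause, so the following is only a defensive workaround sketch, not a fix: before dispatching the follow-up job, compare the size of the file on disk with the `Content-Length` the server reported. The helper name `downloadLooksComplete` is hypothetical, and the check assumes the server sends `Content-Length` and that the body is stored without decoding (no gzip content encoding).

```php
<?php
// Hypothetical helper: returns true only when the file at $path exists and
// its size matches the Content-Length header value reported by the server.
function downloadLooksComplete(string $path, string $contentLength): bool
{
    // A missing or non-numeric header means completeness cannot be verified.
    if ($contentLength === '' || !ctype_digit($contentLength)) {
        return false;
    }

    // Drop PHP's cached stat data so filesize() reflects the current file.
    clearstatcache(true, $path);

    return is_file($path) && filesize($path) === (int) $contentLength;
}

// Usage inside a 'fulfilled' callback, assuming $filePaths maps index => path:
// $ok = downloadLooksComplete(
//     $filePaths[$index],
//     $response->getHeaderLine('Content-Length')
// );
```

Only ids whose files pass this check would then be handed to the Beanstalk job; the rest could be retried or logged.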

@Sadikk

Sadikk commented Nov 14, 2018

Bumping this, got the same issue today

@stale

stale bot commented Sep 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 2 weeks if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale No activity for a long time label Sep 25, 2020
@stale stale bot closed this as completed Oct 9, 2020