Parallel downloader #5293

Open
wants to merge 16 commits into
from

@hirak
Contributor

hirak commented May 6, 2016

Parallel Prefetcher

Ported from https://github.com/hirak/prestissimo

  • Fetches packages in parallel at Composer\Installer::doInstall()
  • This is NOT a replacement for Composer\Util\RemoteFilesystem.
    • repository.json is still downloaded serially
  • If !extension_loaded('curl'), it falls back to RemoteFilesystem
  • If a download fails, it falls back to RemoteFilesystem (no retry, no prompt)
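
The behaviour listed above can be sketched as follows. This is a minimal illustration that assumes PHP's curl_multi_* functions are the underlying mechanism; it is not the code from this PR. The function name fetchInParallel and its return convention (a list of URLs the caller should retry with the serial downloader) are hypothetical, and HTTP status checks and unwritable destinations are left out for brevity.

```php
<?php
/**
 * Hypothetical sketch: download several URLs at once.
 *
 * @param array $targets map of url => destination path (assumed writable)
 * @return array URLs that failed; the caller would fall back to a serial download
 */
function fetchInParallel(array $targets)
{
    // Without ext-curl nothing is fetched here; everything goes to the fallback.
    if (!extension_loaded('curl')) {
        return array_keys($targets);
    }

    $multi = curl_multi_init();
    $jobs = array();
    foreach ($targets as $url => $path) {
        $fp = fopen($path, 'wb');
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_FILE, $fp);
        curl_multi_add_handle($multi, $ch);
        $jobs[] = array('url' => $url, 'ch' => $ch, 'fp' => $fp, 'ok' => false);
    }

    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait for activity instead of spinning.
            curl_multi_select($multi, 1.0);
        }
        // Collect the result of every transfer that has finished so far.
        while ($info = curl_multi_info_read($multi)) {
            foreach ($jobs as $i => $job) {
                if ($job['ch'] === $info['handle']) {
                    $jobs[$i]['ok'] = (CURLE_OK === $info['result']);
                }
            }
        }
    } while ($running > 0 && (CURLM_OK === $status || CURLM_CALL_MULTI_PERFORM === $status));

    $failed = array();
    foreach ($jobs as $job) {
        curl_multi_remove_handle($multi, $job['ch']);
        fclose($job['fp']);
        if (!$job['ok']) {
            // No retry and no prompt: just report it to the caller.
            $failed[] = $job['url'];
        }
    }
    curl_multi_close($multi);

    return $failed;
}
```

A caller would pass every package URL in one call and hand whatever comes back to the existing serial code path.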

See also

#2696
#3930 (comment)

Question

Are the github-token and gitlab-token options still alive?

if (isset($options['github-token'])) {

if (isset($options['gitlab-token'])) {

I can't find any documentation about these options, so I haven't implemented them in this PR yet.

Parallel package downloader
porting from hirak/prestissimo
+
+ public static function generateUserAgent()
+ {
+ static $ua;

@staabm

staabm May 6, 2016

Contributor

Are all these caches really required?


@staabm

staabm May 6, 2016

Contributor

Looks like micro-optimization.


@hirak

hirak May 6, 2016

Contributor

Because the User-Agent string never changes within a Composer process. But it is off-topic for this PR (parallel downloading).
Should I remove it for simplicity?


@staabm

staabm May 6, 2016

Contributor

If it doesn't improve things by a measurable amount of time, I would drop it.
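
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of caching a computed string in a function-level static variable. The function name cachedUserAgent and the contents of the string are hypothetical; the real format arguments are not shown in this thread.

```php
<?php
// Hypothetical sketch of the "static $ua" caching pattern.
function cachedUserAgent()
{
    // A function-level static keeps its value between calls in the same process.
    static $ua;

    if (null === $ua) {
        // Built once; the values below are placeholders, not Composer's real ones.
        $ua = sprintf('User-Agent: ExampleClient/%s (%s; PHP %s)', '0.0.0', PHP_OS, PHP_VERSION);
    }

    return $ua;
}
```

The saving is one sprintf() call per request, which is why the reviewers ask whether it is measurable.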

@staabm

staabm May 6, 2016

Contributor

Are there any numbers on how much faster downloading gets with this PR?


+ /** @var resource<stream<plainfile>> */
+ protected $fp;
+
+ protected $success = false;

@stof

stof May 6, 2016

Contributor

I suggest using private visibility for all of them. It makes maintaining backward compatibility much easier (as anything private is outside the surface concerned with BC, while protected stuff is inside it due to inheritance)


@hirak

hirak May 6, 2016

Contributor

👍 OK

@hirak

hirak May 7, 2016

Contributor

@staabm It is a difficult question, because it depends on geolocation, the composer command, etc.

For example, I benchmarked create-project laravel/laravel.

San Francisco

  • composer-stable (1.0.3)
    [167.5MB/149.39s] Memory usage: 167.55MB (peak: 218.73MB), time: 149.39s
  • prestissimo-patch
    [168.4MB/97.77s] Memory usage: 168.39MB (peak: 215.27MB), time: 97.77s

Singapore

  • composer-stable (1.0.3)
    [167.6MB/338.51s] Memory usage: 167.56MB (peak: 218.76MB), time: 338.51s
  • prestissimo-patch
    [168.4MB/156.43s] Memory usage: 168.38MB (peak: 215.26MB), time: 156.43s

Details below
https://gist.github.com/hirak/fb720bb716c19e08a0ccb139f4ac8d81
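
To make the comparison easier to read, the wall-clock times above can be turned into speedup factors. The snippet below is only arithmetic on the four numbers quoted in this comment; the helper name speedup is hypothetical.

```php
<?php
// Wall-clock seconds copied from the benchmark output above.
$timings = array(
    'San Francisco' => array('stable' => 149.39, 'patch' => 97.77),
    'Singapore'     => array('stable' => 338.51, 'patch' => 156.43),
);

// Hypothetical helper: how many times faster the patched build finished.
function speedup($before, $after)
{
    return $before / $after;
}

foreach ($timings as $city => $t) {
    printf(
        "%s: %.2fx faster, %.0f%% less wall time\n",
        $city,
        speedup($t['stable'], $t['patch']),
        (1 - $t['patch'] / $t['stable']) * 100
    );
}
```

This works out to roughly 1.5x for San Francisco and 2.2x for Singapore, consistent with the point that the gain depends on location.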

@hirak

hirak May 7, 2016

Contributor

The test failure is not my fault.
I opened a pull request for it:
Drop test dependency on http://www.example.com #5294


+ private static function ifOr($str, $pre = '', $post = '')
+ {
+ if ($str) {
+ return "$pre$str$post";

@stloyd

stloyd May 7, 2016

Contributor
return $pre . $str . $post;
+ }
+ }
+
+ private static $NSS_CIPHERS = array(

@stloyd

stloyd May 7, 2016

Contributor

This should be defined before methods.

@hirak

hirak May 7, 2016

Contributor

I built a composer.phar for trial use.
Release prestissimo-patch-preview · hirak/composer


+ if ($this->user) {
+ $user = $this->user;
+ $user .= self::ifOr($this->pass, ':');
+ $url .= "$user@";

@stloyd

stloyd May 7, 2016

Contributor
$url .= $user . '@';
+ foreach (array('http', 'https') as $scheme) {
+ if ($this->scheme === $scheme) {
+ $label = $scheme . '_proxy';
+ foreach (array($label, strtoupper($label)) as $l) {

@stloyd

stloyd May 7, 2016

Contributor

A loop inside a loop looks like overkill. It's a small thing, but IMO two ifs would be clearer.
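
A sketch of what the "two ifs" shape might look like. The body of the original loop is not visible in this thread, so this assumes it simply picks the first non-empty proxy variable; the function name findProxy and the $env parameter (standing in for the process environment) are hypothetical.

```php
<?php
// Hypothetical sketch: look up a proxy for the given scheme without nested loops.
function findProxy($scheme, array $env)
{
    if ('http' !== $scheme && 'https' !== $scheme) {
        return null;
    }

    // Lowercase form first (http_proxy / https_proxy) ...
    $label = $scheme . '_proxy';
    if (!empty($env[$label])) {
        return $env[$label];
    }

    // ... then the uppercase form (HTTP_PROXY / HTTPS_PROXY).
    $upper = strtoupper($label);
    if (!empty($env[$upper])) {
        return $env[$upper];
    }

    return null;
}
```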


@hirak hirak referenced this pull request in hirak/prestissimo May 7, 2016

Closed

Merge to composer/composer #67

6 of 6 tasks complete
+ $this->permanent = $permanent;
+
+ // for PHP<5.5 @see getFinishedResults()
+ $this->blackhole = fopen('php://memory', 'wb');

@hirak

hirak May 10, 2016

Contributor

$this->blackhole = tmpfile(); or $this->blackhole = fopen('php://temp', 'wb');

I think 'php://temp' is better.
http://stackoverflow.com/questions/6841854/how-to-do-curl-put-requests-with-a-php-memory-file-handle


+ curl_setopt($ch, CURLOPT_FILE, $this->blackhole); //release file pointer
+ $index = (int)$ch;
+ $request = $this->runningRequests[$index];
+ if (CURLE_OK === $errno && !$error && (!preg_match('/^http/', $info['url']) || 200 === $info['http_code'])) {

@stloyd

stloyd May 10, 2016

Contributor

Why not use strpos() instead of preg_match()?

false === strpos($info['url'], 'http')

@hirak

hirak May 15, 2016

Contributor

I think substr() is the best choice for performance. I'll rewrite.
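
One detail worth spelling out for this thread: a prefix test with strpos() has to compare against 0. The form false === strpos(...) is true only when 'http' appears nowhere in the string, which is not the same condition as !preg_match('/^http/', ...). The helper names below are hypothetical and only illustrate the two regex-free options.

```php
<?php
// Hypothetical helpers: two ways to test "starts with http" without a regex.
function startsWithHttpStrpos($url)
{
    // Position 0 means the match is at the very beginning.
    return 0 === strpos($url, 'http');
}

function startsWithHttpSubstr($url)
{
    // Compare only the first four bytes.
    return 'http' === substr($url, 0, 4);
}

// A URL that contains "http" but does not start with it shows the difference.
$url = 'ftp://example.org/http.zip';
var_dump(startsWithHttpStrpos($url));   // prefix test
var_dump(false === strpos($url, 'http')); // "contains nothing" test, a different question
```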


+ if (!file_exists($targetdir)) {
+ if (!mkdir($targetdir, 0766, true)) {
+ throw new FetchException(
+ "The file could not be written to $fileName."

@stloyd

stloyd May 10, 2016

Contributor

Shouldn't you also mention the directory? And to match other parts of the code, this should use string concatenation.


@hirak hirak referenced this pull request in hirak/prestissimo May 15, 2016

Closed

Backport from hirak/composer patch #95

+ }
+ } while ($remains > 0);
+
+ return compact('successCnt', 'failureCnt', 'urls');

@staabm

staabm May 15, 2016

Contributor

compact() is not optimizable by PHP runtimes; better to use the longer, more explicit form.
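
For illustration, the "longer and more explicit form" is an array literal with the same keys. The surrounding function below is hypothetical (the real method body is not shown in this thread); only the three variable names come from the diff.

```php
<?php
// Hypothetical sketch: tally results and return them without compact().
function summarize(array $results)
{
    $successCnt = 0;
    $failureCnt = 0;
    $urls = array();

    foreach ($results as $url => $ok) {
        if ($ok) {
            ++$successCnt;
            $urls[] = $url;
        } else {
            ++$failureCnt;
        }
    }

    // Same result as compact('successCnt', 'failureCnt', 'urls'),
    // but the keys are visible to static analysis and refactoring tools.
    return array(
        'successCnt' => $successCnt,
        'failureCnt' => $failureCnt,
        'urls' => $urls,
    );
}
```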


+ $this->permanent = $permanent;
+
+ // for PHP<5.5 @see getFinishedResults()
+ $this->blackhole = fopen('php://temp', 'wb');

@staabm

staabm May 15, 2016

Contributor

How much data will be written into this stream? php://temp switches to a temporary file, and therefore gets slow, once it holds more than 2MB (the default limit).

We could use php://memory which is always stored in memory.

See http://php.net/manual/en/wrappers.php.php for details


@hirak

hirak May 15, 2016

Contributor

Nothing is ever written to blackhole.
Its purpose is to release a file resource temporarily.
https://github.com/composer/composer/pull/5293/files#diff-d0769a23664b99a74bc1abfef231cf87R152

CURLOPT_FILE needs a resource that can be cast to FILE*. fopen('php://memory') does not return a compatible resource, and some errors were reported.
https://github.com/php/php-src/blob/PHP-5.3.29/ext/curl/interface.c#L1904
hirak/prestissimo#93

As far as I know, fopen('php://temp') is the best choice.
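
A minimal sketch of the idea described in this comment, under the assumption that a single curl handle is reused for several downloads. The function name copyThenReleaseFile and its parameters are hypothetical; only the CURLOPT_FILE swap mirrors the line in the diff.

```php
<?php
// Hypothetical sketch: download to a file, then detach that file from a reusable handle.
function copyThenReleaseFile($ch, $url, $destination, $blackhole)
{
    $fp = fopen($destination, 'wb');

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FILE, $fp);
    $ok = curl_exec($ch);

    // Point the handle at the dummy stream so it no longer references $fp.
    // Nothing is written to $blackhole; it only takes $fp's place.
    curl_setopt($ch, CURLOPT_FILE, $blackhole);
    fclose($fp);

    return false !== $ok;
}
```

The caller would create the dummy stream once, for example with fopen('php://temp', 'wb'), and pass the same handle and stream to every call.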


@staabm

staabm May 15, 2016

Contributor

Which file resource does it release? One allocated by php-src?


+ $p = $op->getTargetPackage();
+ break;
+ default:
+ continue 2;

@staabm

staabm May 15, 2016

Contributor

Should this just be continue?

The switch can't be continued, can it?


@hirak

hirak May 15, 2016

Contributor

http://php.net/manual/en/control-structures.switch.php

Note: Note that unlike some other languages, the continue statement applies to switch and acts similar to break. If you have a switch inside a loop and wish to continue to the next iteration of the outer loop, use continue 2.
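
A small self-contained illustration of the manual's note. The operation arrays and the function name targetPackages are hypothetical stand-ins for the installer's operation objects.

```php
<?php
// Hypothetical sketch: skip unknown operations from inside a switch.
function targetPackages(array $operations)
{
    $names = array();

    foreach ($operations as $op) {
        switch ($op['type']) {
            case 'install':
                $name = $op['package'];
                break;
            case 'update':
                $name = $op['target'];
                break;
            default:
                // "continue 2" leaves the switch AND moves on to the next
                // foreach iteration; a bare "continue" would only act like break.
                continue 2;
        }

        $names[] = $name;
    }

    return $names;
}
```

With a bare continue in the default branch, the unknown operation would still reach the `$names[] = $name;` line, reusing a stale or undefined `$name`.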


@staabm

staabm May 15, 2016

Contributor

Wasn't aware, thanks.


+ try {
+ $request = new CopyRequest($url, $destination, $useRedirector, $io, $config);
+ $requests[] = $request;
+ } catch (FetchException $e) {

@staabm

staabm May 15, 2016

Contributor

Where can such an exception actually be thrown?


@hirak

hirak May 20, 2016

Contributor

@throws FetchException if cache destination is not writable.
new CopyRequest -> CopyRequest->setDestination
https://github.com/composer/composer/pull/5293/files#diff-8e8be57b8b829c1263a0198c501ae1a1R315


-
- return stream_context_create($options, $defaultParams);
+ return $ua = sprintf(
+ 'User-Agent: Composer/%s (%s; %s; %s)',

@hirak hirak referenced this pull request in hirak/prestissimo May 21, 2016

Merged

Backport #98

@avindra

avindra May 26, 2016

Regarding speed improvements... our composer install time went from 2m7.586s to 2.86s (after packages are cached).

If you think this is all due to caching.... with no packages cached, it takes about 5 seconds for us...

The speed improvement is very dramatic in my case because I'm also encountering #4332 , which forces us to serially clone every single dependency from git.... 😭

Can't wait for this to be merged

@Simperfit

Simperfit Jun 3, 2016

I'm totally +1 on this.

@staabm

staabm Jun 11, 2016

Contributor

@Seldaek what's your opinion on this one?

@mindplay-dk

mindplay-dk Jun 28, 2016

Contributor

@hirak thanks for this awesome work, I can't wait for the merge :-D

@mindplay-dk

mindplay-dk Jun 28, 2016

Contributor

@hirak is this for composer install only, or composer update as well? (updates are where it really hurts, normally - as this is where development time is lost...)

See also #1298 and #2847

@hirak

hirak Jun 28, 2016

Contributor

@mindplay-dk It affects both install and update.

Composer's tasks are:

  1. command handling for composer update / composer install (without composer.lock)
  2. download meta.json files from https://packagist.org/
  3. resolve dependencies (with additional meta.json downloads)
  4. download package.zip (many files)
  5. generate the autoloader

This feature accelerates only "4. download package.zip".
hirak/prestissimo also accelerates "2. download meta.json" (by enabling Keep-Alive).
But I think that feature is a little complex, so this PR doesn't contain it. I am planning it for the next PR.

I think "3. resolve dependencies" could become faster through CSP algorithm tuning or multi-processing, but that is difficult work for me.


@Seldaek Seldaek modified the milestone: 1.3 Jul 2, 2016

@mindplay-dk

mindplay-dk Jul 6, 2016

Contributor

@hirak I don't understand much of (3) the dependency resolver - but have wondered if topsort might help? I used the StringSort algorithm in one project to do a topological sort and it is extremely fast. (I imagine what Composer does is nowhere near as simple as just doing a topological sort though...)

@stof

stof Jul 6, 2016

Contributor

@mindplay-dk solving dependencies is not about sorting at all

@hirak

hirak Jul 7, 2016

Contributor

@mindplay-dk Thanks. It's interesting. But I think that it's difficult to support conflict, suggest, provide and replace dependencies.


@martinsik martinsik referenced this pull request in hirak/prestissimo Sep 5, 2016

Closed

Support for multi-threaded composer update? #17

@khromov

khromov Sep 6, 2016

Contributor

Great PR, hoping to see some movement on this!

@avindra

avindra Oct 12, 2016

Node.js effectively just got this with Yarn. Happy to see this is becoming the norm!

On Wed, Oct 12, 2016, 5:56 PM Lucas Mezêncio wrote:

Will this PR be merged? I can not wait for it! 😃




@Seldaek Seldaek modified the milestones: 1.3, 1.4 Nov 6, 2016

@Seldaek Seldaek modified the milestones: 1.4, 2.0 Mar 7, 2017

@mabar mabar referenced this pull request in hirak/prestissimo Sep 22, 2017

Closed

Add plugin to composer core #154

@hboomsma

hboomsma Oct 2, 2017

gitlab-token is alive, we are currently using it.

@CDRO

CDRO Nov 13, 2017

Any news on this?

@m1guelpf

@hirak Updates?

@LKDevelopment

LKDevelopment Dec 19, 2017

@Seldaek are there any plans to merge this? I install this plugin in every environment we develop in or deploy to, and it speeds up our deployment process by 50%.

@AbdelkaderBah

AbdelkaderBah Dec 25, 2017

This will help a lot.

@joshuaadickerson

joshuaadickerson Jan 26, 2018

Please merge. We're seeing major gains with parallel composer install/update.

@pedrofurtado

pedrofurtado Feb 3, 2018

Any news about this issue?

@AbdelkaderBah

AbdelkaderBah Feb 9, 2018

I hope you can take a look at this pull request; I've been waiting months for this feature.

@lsl lsl referenced this pull request in hirak/prestissimo Feb 15, 2018

Open

Merge this library to Composer Core? #164

@rakshazi

rakshazi Feb 16, 2018

Still waiting for it :(

@staabm

staabm Feb 16, 2018

Contributor

Please stop spamming this topic with useless comments. Jordi is aware that a lot of people would love this feature, but it is not a top priority.

everybody can use the plugin in the meantime.
