Parallel download of dist archives #2847

Closed
h4cc opened this issue Mar 27, 2014 · 6 comments

@h4cc
Contributor

h4cc commented Mar 27, 2014

When using dist files for downloading, it would be very cool to have a parallel download of these archives.

Is there any chance to add such a feature?

@Seldaek
Member

Seldaek commented Mar 27, 2014

There is a chance, but it would require quite a few changes in the way things are done, so right now it's not really on the roadmap.

Seldaek added this to the Later milestone Mar 27, 2014
@firstrow

firstrow commented Apr 9, 2014

Really interesting feature!

@jakoch
Contributor

jakoch commented Apr 26, 2014

I just want to leave a small note here: I'm working on this and already have a working version.
It was quite hard to find a way to add this functionality; I spent several hours on it. At first I tried to use the downloaders, but they are not just downloaders, they are complete handlers, and it's not possible to intercept or override their behaviour to parallelize them.
I dropped that approach completely and came up with a simple cache-warming strategy.

Context: Installer->doInstall().
After the dependencies are resolved, but before the installations/updates are run, the "$operations" stack is fetched. The $operations array contains the package info with the desired distUrls. The distUrls are placed into a download list, which is written to a temp file and handed over to an external download util. The files are downloaded to the cache folder. Composer then proceeds with the installations/updates, finds the cached files and re-uses them.
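
The cache-warming step described above could be sketched roughly as follows. This is an illustration in Python, not the actual patch: the `operations` structure, the `dist_url` key and the file naming are assumptions made for the example. The output uses the aria2c input-file layout (one URI line, followed by indented option lines such as `dir=` and `out=`).

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def build_download_list(operations, cache_dir):
    """Return aria2c input-file text for every operation that has a dist URL."""
    lines = []
    for op in operations:
        url = op.get("dist_url")  # hypothetical key, stands in for the distUrl
        if not url:
            continue  # nothing to pre-fetch for this operation
        # Hypothetical cache naming: reuse the file name from the URL path.
        filename = Path(urlparse(url).path).name
        lines.append(url)
        lines.append(f"  dir={cache_dir}")
        lines.append(f"  out={filename}")
    return "".join(line + "\n" for line in lines)


def write_download_list(text):
    """Write the list to a temp file and return the path of that file."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write(text)
        return fh.name
```

The resulting file could then be handed to the external tool, for example with `aria2c --input-file=<path>`. Whether the installer later re-uses the files depends on them matching the names it expects in its cache, which this sketch does not attempt to reproduce.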

The following downloaders are implemented:

  • "aria2c"
  • "parallel+wget"
  • "wget"
  • "curl multi"

In other words: Composer will use aria2c for parallelized downloading, if it is installed.
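
Picking the download tool could look roughly like the sketch below. The preference order simply follows the list above, and treating "curl multi" as an in-process fallback that needs no external binary is an assumption; neither is confirmed in this thread.

```python
import shutil

# Strategy name -> binaries that must be on PATH (order is an assumption).
STRATEGIES = [
    ("aria2c", ["aria2c"]),
    ("parallel+wget", ["parallel", "wget"]),
    ("wget", ["wget"]),
]
# Assumed to be available without any external binary.
FALLBACK = "curl multi"


def pick_strategy(which=shutil.which):
    """Return the first strategy whose binaries are all installed."""
    for name, binaries in STRATEGIES:
        if all(which(binary) is not None for binary in binaries):
            return name
    return FALLBACK
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it uses `shutil.which`.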

The effect on download speed is huge, especially when downloading with "aria2c".
That's due to server connection re-use and segmented downloads.

I have several questions.
Mostly, they are related to the level of output the external tool should provide and
when to provide it ("debug", "dry-run" and "verbosity level").

  1. The external download util produces "download progress" output.
    Should that be displayed or hidden by default?

  2. The external download util has an option to log to a file.
    Should that be enabled only in debug mode? Where should the log file be stored? In the "home" folder?

  3. Because files are downloaded to the cache, I guess I need to wrap this in a "cache on" check.
    How can I check whether caching is active/enabled, or is it on by default?

  4. aria2c supports downloading one file from multiple servers (mirrors) in parallel. A dist definition could list the mirrors like this:

{
   "type": "package",
   "package": {
       "name": "vendor/package",
       "version": "1.2.3",
       "dist": {
           "url": ["http://host.a/package-1.2.3.zip", "http://host.b/package-1.2.3.zip"],
           "type": "zip"
       }
   }
}
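
To illustrate how such a multi-mirror "url" could be consumed, here is a small Python sketch. It assumes the value may be either a single string or a list, as in the example above, and it joins the mirrors with TAB characters, which is how an aria2c input file marks several URIs as sources for the same file. The function names are hypothetical.

```python
def dist_urls(package):
    """Return the dist URLs of a package definition as a list."""
    url = package.get("dist", {}).get("url")
    if url is None:
        return []
    if isinstance(url, str):
        return [url]  # the usual single-URL form
    return list(url)  # the proposed multi-mirror form


def aria2_mirror_line(package):
    """Join all mirrors with TABs so aria2c treats them as one download."""
    return "\t".join(dist_urls(package))
```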

Related: #3015

Regards, Jens

@Koc
Contributor

Koc commented Feb 25, 2019

Does #7904 close this issue?

@stof
Contributor

stof commented Feb 26, 2019

@Koc no. That one implemented parallel downloading inside the Composer repository, which is a place managing a bunch of downloads on its own.
Supporting parallel downloading of archives requires refactoring the InstallationManager (as the current API deals with only 1 download at a time). I think this will be done only after the refactoring of the solver architecture is completed. Lower-level APIs of the downloaders have already been prepared for the parallel downloading in #7904 though.

Seldaek added a commit that referenced this issue Nov 14, 2019
… parallel then install only once all downloads succeeded, fixes #2847

This also changes the PRE/POST_PACKAGE_INSTALL/UPDATE/UNINSTALL events to have less information available on them: repositorySet, request and policy are gone.
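
The "download in parallel, then install only once all downloads succeeded" idea from the commit message can be sketched in Python as below. This is not Composer's implementation; the `download` and `install` callables and the thread pool are assumptions chosen to show the two-phase ordering.

```python
from concurrent.futures import ThreadPoolExecutor


def download_then_install(packages, download, install, max_workers=4):
    """Download every package in parallel; install only if all downloads succeed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(download, pkg) for pkg in packages]
        # result() re-raises a failed download, so the install phase
        # below is never reached if any archive could not be fetched.
        archives = [future.result() for future in futures]
    # Phase two: install sequentially, in the original order.
    for pkg, archive in zip(packages, archives):
        install(pkg, archive)
```

Keeping the install phase sequential preserves the operation order, while only the network-bound part runs concurrently.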
@Seldaek
Member

Seldaek commented Nov 22, 2019

Fixed by 006985a

Seldaek closed this as completed Nov 22, 2019