Package cache #915

Closed
wants to merge 12 commits into from

Contributor

@chEbba chEbba commented Jul 15, 2012

Revised system package cache from #62.

How it works:

  • After a package is downloaded, it is copied (actually repacked from targetDir) to the local system repository with a modified dist URL.
  • The local system repository is added to the RepositoryManager as the first repository.
  • If a package is found in the local system repository, it is installed from there (just copied).
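
The lookup order described above can be sketched roughly as follows. This is only an illustration: `SketchRepository`, `findDist()` and `resolveDist()` are hypothetical names and not Composer's RepositoryManager API, and the URLs are placeholders.

```php
<?php
// Hypothetical sketch: class and function names are illustrative only,
// they are not taken from the PR or from Composer itself.

final class SketchRepository
{
    /** @var array<string, string> map of "name@version" => dist URL */
    private $packages;

    public function __construct(array $packages)
    {
        $this->packages = $packages;
    }

    public function findDist(string $name, string $version): ?string
    {
        return $this->packages[$name . '@' . $version] ?? null;
    }
}

/**
 * Ask each repository in order and return the first dist URL found.
 * Listing the system repository first means a local copy wins.
 */
function resolveDist(array $repositories, string $name, string $version): ?string
{
    foreach ($repositories as $repository) {
        $dist = $repository->findDist($name, $version);
        if ($dist !== null) {
            return $dist;
        }
    }

    return null;
}

// The system repository holds a repacked copy with a modified (local) dist URL.
$system = new SketchRepository([
    'acme/lib@1.0.0' => 'file:///tmp/system-repo/acme-lib-1.0.0.zip',
]);
$remote = new SketchRepository([
    'acme/lib@1.0.0' => 'https://example.org/acme-lib-1.0.0.zip',
    'acme/other@2.0.0' => 'https://example.org/acme-other-2.0.0.zip',
]);
```

With `[$system, $remote]` as the order, `acme/lib` resolves to the local file while `acme/other` still falls through to the remote repository.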

Options:

  • system-repository (default false): enables the system repository in the manager.
  • package-cache (default false): stores downloaded packages in the system repository.
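
A minimal sketch of how the two options and their defaults could be resolved. Only the option names and the `false` defaults come from the description above; `resolveCacheOptions()` and the array shape are assumptions made for illustration.

```php
<?php
// Hypothetical sketch: only the two option names and their false defaults
// come from the description; the function and array shape are assumptions.

function resolveCacheOptions(array $userConfig): array
{
    $defaults = [
        'system-repository' => false, // read packages from the system repository
        'package-cache' => false,     // store downloaded packages in it
    ];

    // Keep only known keys, then let user values override the defaults.
    $known = array_intersect_key($userConfig, $defaults);

    return array_map('boolval', array_replace($defaults, $known));
}

// Caching enabled, system repository left disabled; unknown keys are ignored.
$options = resolveCacheOptions(['package-cache' => true, 'unrelated' => 'x']);
```

Keeping the two flags independent matches the idea that caching and the system repository can be switched on separately.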

Notes:

  • This PR comes with improved Archive support (for now only zip and tar), which can also be used in downloaders (a single ArchiveDownloader with Extractors).
  • Packages are compressed with zip (if enabled) or with tar (fallback).
  • Works only for dists.
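
The zip-or-tar fallback from the notes might look like this. It is a sketch, not the PR's archiver code: `chooseArchiveFormat()` is a hypothetical helper, and checking for the `ZipArchive` class is one assumed way to detect whether zip is "enabled".

```php
<?php
// Hypothetical helper, not part of the PR: pick the archive format the way
// the note describes, zip when it is available and tar otherwise.

function chooseArchiveFormat(bool $zipAvailable): string
{
    return $zipAvailable ? 'zip' : 'tar';
}

// The ZipArchive class exists only when PHP's zip extension is loaded.
$format = chooseArchiveFormat(class_exists('ZipArchive'));
```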

Planned improvements:

  • Additional configuration options.
  • CLI option to disable the cache for a single command.
  • CLI commands for system repository manipulation.

Discussion:

  • General workflow and details.
  • Configuration and factory creation.

class RepositoryStorage implements StorageInterface
{
    private static $loader;
    private static $dumper;
Contributor

Why make them static?

Contributor Author

Because they are used in the static method convertPackage(). Why is that method static? I don't know :) It seemed clearer to me to have such a utility method be static, with a cached dumper and loader.

@palex-fpt

You write to the cache in DownloadManager and read from it in RepositoryManager.
This has the following flaw: in CachedDownloadManager you assume $targetDir contains a correctly installed package after parent::install() or parent::update(). That is not true, as installers have not run at this point yet. You should either cache downloaded content (i.e. an HTTP cache) in the DownloadManager, or cache installed packages in the RepositoryManager (which is hard at this point, as there is no way to tell how a package spreads across the filesystem after installation).
It is possible to add a persistent key-value cache to installers (so an installer can cache data per package and reuse it), but I doubt it would bring more benefit than a transport cache.

@travisbot

This pull request fails (merged 8fb07f1e into f128182).

@travisbot

This pull request fails (merged fc7d1ac1 into f128182).

@travisbot

This pull request passes (merged 8ebd4b74 into f128182).

@chEbba
Contributor Author

chEbba commented Jul 16, 2012

@palex-fpt Actually it is not a simple cache in one place, where you replace some heavy operation with a simple one. It consists of two concepts: copying all packages to a dedicated repository (like a proxy), and using a fast local repository for packages. These concepts can be used separately. You can cache everything without enabling the system repository, or you can use the system repository with no auto-caching.
About $targetDir: at the downloader level there is nothing about package installation. I assume $targetDir contains the raw package (with no knowledge of installation). At this level I have the package metadata and the raw source in the directory, so I can do whatever I want with the package, e.g. store my copy of the package as zip (or tar.gz, or plain) instead of the original format.

@palex-fpt

Oh, I got the point. You repack packages into a local distribution.
Just be sure to add one level of directory structure to the zip archive, as ZipDownloader (and any ArchiveDownloader descendant) does special processing for archives with only a single directory at the top level.
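
To make the top-level directory concern concrete, here is a rough sketch of how such a check could work on an archive's entry list. The function name and logic are hypothetical and are not ZipDownloader's actual code.

```php
<?php
// Illustrative only: reports the shared top-level directory when every
// entry of an archive listing sits under one. The name is hypothetical.

function singleTopLevelDirectory(array $entries): ?string
{
    $roots = [];
    foreach ($entries as $entry) {
        $entry = ltrim($entry, '/');
        if ($entry === '') {
            continue;
        }

        $slash = strpos($entry, '/');
        if ($slash === false) {
            // A plain file at the archive root: no single wrapping directory.
            return null;
        }

        // Remember the first path segment as a top-level directory.
        $roots[substr($entry, 0, $slash)] = true;
    }

    return count($roots) === 1 ? (string) key($roots) : null;
}
```

An archive listing such as `pkg/`, `pkg/composer.json`, `pkg/src/A.php` yields `pkg`, while a listing with `composer.json` directly at the root yields `null`. A repacked archive of the second shape would not be treated as wrapped in a directory.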

@chEbba
Contributor Author

chEbba commented Jul 16, 2012

@palex-fpt For now it compresses the contents of targetDir as is (without targetDir itself in the archive).

@palex-fpt palex-fpt mentioned this pull request Jul 18, 2012
@gagarine

What happened to this issue? A package cache would be great for continuous integration, where you have to rebuild everything at each commit to run your tests...

@chEbba
Contributor Author

chEbba commented Oct 26, 2012

I will update this PR on the weekend with some improvements to the defaults and a separation of the system repository and the cache.

@chEbba
Contributor Author

chEbba commented Oct 27, 2012

Rebased and improved. The description was updated.
Configuration is split into 2 options: system-repository & config-cache.
Added a tar archiver to support systems without zip.

@Seldaek
Member

Seldaek commented Nov 1, 2012

@chEbba maybe I missed something but it seems to me like this is overly complex. We already have a Cache class that could be integrated in the downloaders with very few changes instead of adding a gazillion new lines of code to maintain. In particular, I fail to see the benefits of repackaging and of the split between repository and cache functionalities. What's the use case with only using one of those (repository & cache)? Furthermore, the fact that you cache packages with a higher priority means that for dev packages you will get stale data instead of getting the newest from packagist.

@Seldaek
Member

Seldaek commented Nov 1, 2012

See #1282 for what I mean by lighter approach. It works just as well and with a few more lines will have GC of items not used in more than X. It probably could use one more option to completely disable it, but you get the idea. I'm open for discussion but unless this adds great benefits over the other PR, I would rather have less code to maintain.
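
For illustration, garbage collection of entries not used for more than X could be sketched like this. It is not the code from #1282: the flat cache directory, the use of the file modification time as "last used", and the `collectGarbage()` name are all assumptions.

```php
<?php
// Sketch under assumptions: a flat directory of cached dist files, with the
// file modification time standing in for "last used". Not the code in #1282.

function collectGarbage(string $cacheDir, int $maxAgeSeconds, int $now): array
{
    $removed = [];
    foreach (glob($cacheDir . '/*') ?: [] as $file) {
        if (is_file($file) && $now - filemtime($file) > $maxAgeSeconds) {
            unlink($file);
            $removed[] = basename($file);
        }
    }
    sort($removed);

    return $removed;
}

// Usage: build a throwaway cache with one fresh and one stale entry.
$cacheDir = sys_get_temp_dir() . '/cache-sketch-' . uniqid();
mkdir($cacheDir);
$now = time();
touch($cacheDir . '/fresh.zip', $now - 60);          // used a minute ago
touch($cacheDir . '/stale.zip', $now - 90 * 86400);  // used 90 days ago

// Drop everything not touched within the last 30 days.
$removed = collectGarbage($cacheDir, 30 * 86400, $now);
$freshSurvived = is_file($cacheDir . '/fresh.zip');

// Clean up the throwaway directory.
unlink($cacheDir . '/fresh.zip');
rmdir($cacheDir);
```

A real cache would also need to refresh the timestamp whenever an entry is read, otherwise frequently used files would still expire.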

@Seldaek Seldaek closed this Nov 11, 2012
@chEbba
Contributor Author

chEbba commented Nov 13, 2012

Sorry for the late answer. Yes, it is not as simple as a file cache. When I started on this problem I understood that there are two ways of solving it in the current architecture:

  • A simple file cache at the downloader level, where you cache downloaded files (as in your Add local cache to dist downloads #1282)
  • A package cache at a higher level, where you save a package independently of the dist format.

Then I remembered another package manager concept: the local repository (which I called "system", as we already have a local one in Composer). Such a repository is useful for development, CI, etc. when you don't need to create any remote repository (e.g. Maven's local repository).
So I got the idea to implement local storage plus a package cache (because a package cache seems cleaner with local storage).

I will keep this branch for the future, in case the system storage concept becomes interesting for Composer.

@Seldaek
Member

Seldaek commented Nov 13, 2012

Ok I see a bit better, but with my file cache the dist format does not matter either, it just stores it so the actual download step can be skipped later.

I agree it might be useful to have a concept of local repo, but then again you can already set a "composer" repository to a local filesystem path I believe (and if not, it wouldn't be a huge change), so I would rather reuse that than adding a lot of new code.

@patcon

patcon commented Nov 13, 2012

@chEbba was the goal to provide a general means to support a feature like git clone --reference-type caches? Would your solution be the only one that supported this?

@patcon

patcon commented Nov 13, 2012

Related: http://git-scm.com/docs/git-clone

@Seldaek
Member

Seldaek commented Nov 13, 2012

@patcon I don't think this would help much, but that's interesting though, maybe we should do all clones to a central location and then reference the repo like that if it already exists. Only problem would be to do reference counting somehow to avoid using space for repos that aren't used anywhere on disk anymore. If you'd like to create a new issue for this I'd be happy to discuss it further. It would mean much faster clones which would be pretty cool.
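
As a rough sketch of the idea, a clone that borrows objects from a central mirror could be assembled like this. `--reference` is a real `git clone` option; the helper name, the paths and the URL are made up for illustration, and reference counting of mirrors is not addressed.

```php
<?php
// Hypothetical helper: builds the argument list for a clone that borrows
// objects from a local mirror through git's --reference option.

function buildReferenceClone(string $url, string $targetDir, ?string $mirrorDir): array
{
    $command = ['git', 'clone'];
    if ($mirrorDir !== null) {
        // Reuse objects already present in the central mirror.
        $command[] = '--reference';
        $command[] = $mirrorDir;
    }
    $command[] = $url;
    $command[] = $targetDir;

    return $command;
}

// Placeholder paths and URL, chosen only for the example.
$withMirror = buildReferenceClone(
    'https://example.org/acme/lib.git',
    'vendor/acme/lib',
    '/var/cache/git/acme-lib.git'
);
$plain = buildReferenceClone('https://example.org/acme/lib.git', 'vendor/acme/lib', null);
```

Returning the arguments as an array rather than a single string leaves quoting to whatever process runner executes the command.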

@chEbba
Contributor Author

chEbba commented Nov 13, 2012

@patcon I think this feature can be implemented at the CvsDownloader level. We used such a feature in one of our deploy systems, where we had local storage with git repositories (we didn't use alternates, just cloned from the local one, but I think alternates are the better solution). It really reduces build time.

If you create a new issue please link it here. I'm interested in this discussion.
