
RFC: Remote cache support #5582

Open
TomMD opened this issue Sep 19, 2018 · 6 comments
@TomMD
Contributor

TomMD commented Sep 19, 2018

Building the same package over and over is painful, and it is becoming increasingly common now that developer workflows include build bots, VMs, and containers. I propose we add a package-level remote cache to the new-{build,install} process that queries a configured remote cache server and uses any available tarballs.

Assuming the premise is accepted there are at least four points to discuss:

  1. Method of configuring cabal-install
  2. Method of indexing the tarballs
  3. Level of caching
  4. API to the cache server

Method of configuring cabal-install

I suggest three optional fields:

  • remote-cache-server The URL of a remote server providing the cache over a common protocol
  • remote-cache-user The username (optional) for basic auth against the remote server
  • remote-cache-password The password (optional) for basic auth against the remote server
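For concreteness, the three fields might look like this in a cabal config file (the field names are the ones proposed above; the server and credentials are placeholders, and none of this is implemented yet):

```
-- hypothetical ~/.cabal/config fragment (proposed, not implemented)
remote-cache-server:   https://cache.example.com
remote-cache-user:     builder
remote-cache-password: hunter2
```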

Method of indexing tarballs

It's essentially a content-addressable scheme: the hash of cabal-hash.txt (that is, hashPackageHashInputs), combined with the cache URI and web route, yields either the tarball or a 404.

This assumes we have captured sufficient information not just for a local cache but for a remote one - the hash inputs must include architecture, platform, etc. If needed, we can make this assumption true without hurting normal operation.
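The indexing scheme above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical route scheme (`/v1/package/<hash>`); the real route and the exact rendering of the hashPackageHashInputs digest are exactly the open questions this RFC raises:

```haskell
-- Sketch only: 'cacheUrl' and the "/v1/package" route are illustrative,
-- not part of cabal-install. The key is the hashPackageHashInputs digest
-- (here, an opaque hex string).
import Data.List (intercalate)

cacheUrl :: String -> String -> String
cacheUrl server pkgHash = intercalate "/" [server, "v1", "package", pkgHash]

main :: IO ()
main = putStrLn (cacheUrl "https://cache.example.com" "ab12cd34")
-- prints https://cache.example.com/v1/package/ab12cd34
```

A GET on such a URL either returns the tarball or a 404, which is the entire lookup protocol.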

Level of Caching

Package Level Package-level caching appears straightforward. At a high level, we modify ProjectBuilding.hs (around line 950) so that, either as part of the build step or as a pre-build step/alternative, we query the cache server for an existing tarball matching hashPackageHashInputs. If such a file exists, we can take it as ground truth. If not, we tar up the result of this build and populate the cache server.
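The package-level flow can be simulated in memory. This is a hedged sketch, not the ProjectBuilding.hs integration: every name here is illustrative, and the IORef-backed map stands in for an HTTP GET/PUT cache server.

```haskell
-- In-memory stand-in for the proposed flow; all names are hypothetical.
import qualified Data.Map.Strict as Map
import Data.IORef

-- hash of cabal-hash.txt -> tarball contents (a String stands in for bytes)
type Cache = IORef (Map.Map String String)

-- | Pre-build step: on a cache hit, take the tarball as ground truth;
-- on a miss, run the local build, then populate the cache (the PUT).
obtainArtifact :: Cache -> String -> IO String -> IO String
obtainArtifact cache pkgHash build = do
  store <- readIORef cache
  case Map.lookup pkgHash store of
    Just tarball -> pure tarball
    Nothing -> do
      tarball <- build
      modifyIORef' cache (Map.insert pkgHash tarball)
      pure tarball

main :: IO ()
main = do
  cache  <- newIORef Map.empty
  builds <- newIORef (0 :: Int)
  let build = modifyIORef' builds (+ 1) >> pure "tarball-bytes"
  _ <- obtainArtifact cache "deadbeef" build  -- miss: builds and uploads
  _ <- obtainArtifact cache "deadbeef" build  -- hit: no rebuild
  n <- readIORef builds
  putStrLn ("builds run: " ++ show n)         -- prints "builds run: 1"
```

The second lookup with the same hash never invokes the build action, which is the whole payoff of the cache.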

File Level I confess I haven't looked into the pipeline at this granularity. It seems less useful: most of the time is spent in the 90% of packages that never change, not in the unchanged modules of the package under active development. This is simply an argument that file granularity isn't going to give significantly larger returns.

Cache Server API

The straightforward answer, to me, is a bazel-remote style server. I suggest we simply match its API, since there's nothing fancy about it (HTTP GET/PUT on a web route, with basic auth if the server is configured to require it).
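For illustration, interacting with such a server could be as simple as the following sketch. The URL, credentials, and route are placeholders; bazel-remote keys blobs by hash under plain HTTP, but the exact route for cabal artifacts would be part of this RFC's discussion:

```sh
# fetch a cached tarball: 200 with the body on a hit, 404 on a miss
curl --fail -u user:pass -o pkg.tar.gz \
  "https://cache.example.com/cas/<hash-of-cabal-hash.txt>"

# populate the cache after a local build
curl --fail -u user:pass -X PUT --data-binary @pkg.tar.gz \
  "https://cache.example.com/cas/<hash-of-cabal-hash.txt>"
```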

@hvr
Member

hvr commented Sep 19, 2018

@TomMD That's an excellent suggestion! :-)

...and in fact I've been working on a 2nd-level remote nix-style store for matrix.hackage.haskell.org (using a simple S3-compatible PUT/GET protocol, which can easily be set up with a local HTTP server as well). @phadej has also been working on something similar to speed up haskell-ci builds on Travis et al (which has additional challenges around ensuring security). Neither is ready yet, and there are a couple of issues you need to take into account, so this involves some tricky logic to get right.

Moreover, there's a rather significant technical limitation: essentially, we either need a solution to #4097 or (what I'm currently doing with matrix.hho) we must require everyone to use the same store directory, e.g. --store-dir=/usr/share/cabal-cache

PS: As you might infer, I've been investing brain cycles in this on and off for two years already (and have discussed some of the ideas with Duncan et al), but I haven't had time to write it down yet, as I'd rather invest the time in implementing a proof of concept. And due to upcoming changes in our infrastructure hosting, I'm actually forced to get this implemented in the next couple of months... :-)

@TomMD
Contributor Author

TomMD commented Sep 19, 2018

Hurm, 4097 is pretty ugly. I don't see it as a blocker, though - just a blocker to getting the maximum benefit.

Am I to understand that you or @phadej have progress that supersedes this proposal (with the S3 protocol), or is this still an area for discussion and development? I'm feeling this pain right now - enough that a solution in Cabal HEAD would be valuable enough for me to require my team to use HEAD. If I can help make that a reality, it would be neat.

@TomMD
Contributor Author

TomMD commented Sep 19, 2018

Your edits tell me:

due to upcoming changes in our infrastructure hosting I'm actually forced to get this implemented in the next couple months

Ok, good news for me. We can close this ticket if you'd like, or hijack it to track your work on the caching system.

@hvr
Member

hvr commented Sep 19, 2018

@phadej actually has working code, tailored to the CI case. What I'm going for is more ambitious: I need to maximise utilisation of the matrix CI cluster, reduce contention between workers as much as possible by syncing as eagerly as possible, and guarantee invariants such as that any artifact is built exactly once and that only artifacts built against artifacts already in the remote store ever get uploaded into it (if you don't do that, you risk subtle corruption, because GHC is not 100% deterministic in its compilation) - all while trying to make the upload protocol as reentrant/atomic as possible when multiple workers try to publish the same artifact at the same time...

@hvr
Member

hvr commented Sep 19, 2018

@TomMD no, we should definitely keep this one open until it's addressed. If you are willing to invest time, I'd be happy to elaborate on the design and ideas I was going for; that would definitely help get this feature into cabal proper earlier, and also increase the likelihood that it actually helps with your use case :-)

@23Skidoo
Member

23Skidoo commented Oct 1, 2019

People looking at this issue might also be interested in http://hackage.haskell.org/package/cabal-cache.
