Implement a single shared package database per GHC version #878

Closed
snoyberg opened this Issue Aug 30, 2015 · 13 comments

@snoyberg
Contributor

snoyberg commented Aug 30, 2015

Motivation: avoid unnecessary recompiles. I held off on commenting on this until the new Cabal version with this support was announced, but given that it may be a while before that support filters down for everyone to be able to use it, I don't want to hold off any longer. Pinging @wolftune and @3noch from #771, and @borsboom and @chrisdone for general feedback.

Here's my idea:

  • Inside ~/.stack, we have a package database for all libraries ever built with that version of GHC, Cabal version, etc. In other words: namespaced in the normal way we namespace these things.
  • In order to work around Cabal limitations: when we configure with Cabal, we give it a temporary package database so it registers there and does not have an opportunity to unregister other packages in the mega-database. We then manually copy that registered package info into the mega-database.
  • Instead of having a snapshot database in ~/.stack, the snapshot database moves into the project .stack-work directory. However, instead of containing the libraries themselves, it just contains a copy of the register files from the ~/.stack mega-database.
  • We ensure that this database contains only the packages used by our project, including packages from the global database. That way: stack ghci and friends just include that database and the local database, and don't need to worry about conflicting packages.
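The register-then-copy flow in the bullets above could be sketched roughly as follows. This is only an illustration, not Stack's actual code: the function names `register_into_mega_db` and `populate_project_db` are hypothetical, writing a text file stands in for the real Cabal register step, and a real package database would presumably also need its cache refreshed afterwards (e.g. with `ghc-pkg recache`).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def register_into_mega_db(mega_db: Path, conf_name: str, conf_text: str) -> Path:
    """Register into a throwaway database, then copy the .conf into the mega-database."""
    mega_db.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the Cabal register step (assumption): it only ever sees
        # the temporary database, so it cannot unregister anything elsewhere.
        temp_conf = Path(tmp) / conf_name
        temp_conf.write_text(conf_text)
        # Manually copy the registered package info into the mega-database.
        target = mega_db / conf_name
        shutil.copy2(temp_conf, target)
    return target


def populate_project_db(mega_db: Path, project_db: Path, needed: Iterable[str]) -> List[Path]:
    """Copy only the register files this project needs into its own database."""
    project_db.mkdir(parents=True, exist_ok=True)
    copied = []
    for conf_name in needed:
        # Only the small .conf file is copied; the libraries stay where they were built.
        copied.append(Path(shutil.copy2(mega_db / conf_name, project_db / conf_name)))
    return copied
```

Because the project database only ever receives the files listed in `needed`, tools pointed at it never see unrelated packages from the mega-database.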

Concerns:

  • There's no intelligent story any more for deleting just one snapshot; you'd really have to wipe out all of ~/.stack. However, given how much less storage and recompiling will be needed, I think that's a worthwhile tradeoff.
  • There's some tricky business around executables. We may end up needing to recompile executables more often if we can't figure out something smarter.
@snoyberg

Contributor

snoyberg commented Aug 30, 2015

A possible shortcut to implementation, which may also make it easier to support GHC 7.8. Starting from today's implementation:

  1. After installing a snapshot package, reregister into a mega-database
  2. When asked to build a snapshot library, first check if it exists in the mega-database (with matching flags, dependencies, etc) and, if so, copy to the snapshot database
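As a rough sketch of the two steps above, the mega-database can be thought of as a cache keyed on everything that affects the build. The names `BuildKey` and `build_or_reuse` are hypothetical, the key fields are an assumption about what "matching flags, dependencies, etc" would cover, and a plain dict stands in for the real package database.

```python
from typing import Callable, Dict, FrozenSet, NamedTuple, Tuple


class BuildKey(NamedTuple):
    """Everything that must match before a previous build can be reused (assumed fields)."""
    name: str
    version: str
    flags: FrozenSet[Tuple[str, bool]]
    dependencies: FrozenSet[str]


def build_or_reuse(
    mega_db: Dict[BuildKey, str],
    key: BuildKey,
    build: Callable[[BuildKey], str],
) -> Tuple[str, bool]:
    """Return (conf, reused). Build only when no matching entry exists."""
    if key in mega_db:
        # Step 2: a matching build exists, so reuse its registration.
        return mega_db[key], True
    conf = build(key)
    # Step 1: after installing, re-register the result into the mega-database.
    mega_db[key] = conf
    return conf, False
```

Changing any flag or dependency produces a different key, so a mismatched build is never reused by accident.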

snoyberg added a commit that referenced this issue Aug 31, 2015

@snoyberg

Contributor

snoyberg commented Aug 31, 2015

I've opened PR #884 about this, closing.

@snoyberg snoyberg closed this Aug 31, 2015

@snoyberg snoyberg removed the in progress label Aug 31, 2015

chrisdone added a commit that referenced this issue Aug 31, 2015

Merge pull request #884 from commercialhaskell/878-shared-database
Share binary package builds between snapshots #878
@rvion

Contributor

rvion commented Aug 31, 2015

yay 👍

@wolftune

Contributor

wolftune commented Aug 31, 2015

Wow, that was fast! If I understand correctly, this fix copies files rather than linking them, so some redundant files take up a little disk space. But we still get the main benefit of skipping redundant compilation, and we keep the ability to trash individual LTS version directories because each LTS snapshot directory is still fully self-contained, right?

@snoyberg

Contributor

snoyberg commented Aug 31, 2015

Not quite. It still copies the executables (which is a mistake on my part: it should hard link them on Unix systems, and there's even a FIXME about it). Libraries and data files, however, are not copied at all and remain in the original directories, meaning that the snapshot is not fully self-contained. I wrote up a blog post about this.

The file which is copied between snapshots is the .conf file inside the package database, but that one is tiny.
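For illustration, the hard-link-instead-of-copy idea could look something like the sketch below. It is not the code behind the FIXME; `link_or_copy` is a hypothetical name, and falling back to a plain copy whenever linking fails is an assumption about the desired behavior.

```python
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Hard link src to dst where possible, otherwise fall back to copying."""
    try:
        # A hard link shares the file's storage, so no extra disk space is used.
        os.link(src, dst)
        return "linked"
    except OSError:
        # Linking can fail, e.g. across filesystems or where it is unsupported.
        # Note that this fallback also overwrites dst if it already exists.
        shutil.copy2(src, dst)
        return "copied"
```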

@wolftune

Contributor

wolftune commented Aug 31, 2015

Okay, that sounds great, maximally efficient (once that FIXME is done). So, if someone were to carelessly trash an LTS directory because no project still used that LTS, would Stack know that libraries and data files (and executables, after the FIXME) were missing from the newer LTS and then do the necessary recompiling to get them back into the right LTS directory? (It seems okay to let this be broken and tell people not to do that, but it's even better if this won't break things.)

@snoyberg

Contributor

snoyberg commented Aug 31, 2015

No, it doesn't do intelligent recovery. The main reason for that is that it would add a major performance hit to stack: instead of being able to trust ghc-pkg's reporting of available libraries, it would need to check for the presence of files. In theory we could have a lint-like command that would remove broken packages, but I'd much rather just tell people "don't do that."
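A lint-like check of that kind might be sketched as below. No such Stack command exists as far as this thread says; `broken_packages` is a hypothetical name, and the parsing is deliberately simplified (it only looks at single-line `library-dirs:` and `import-dirs:` fields and does not handle paths containing spaces).

```python
from pathlib import Path
from typing import List

# Fields in a registered package's .conf file that point at directories on disk.
DIR_FIELDS = ("library-dirs:", "import-dirs:")


def broken_packages(package_db: Path) -> List[str]:
    """List .conf files that reference at least one directory that no longer exists."""
    broken = set()
    for conf in package_db.glob("*.conf"):
        for line in conf.read_text().splitlines():
            for field in DIR_FIELDS:
                if line.startswith(field):
                    # Simplification: whitespace-separated paths on the same line only.
                    dirs = line[len(field):].split()
                    if any(not Path(d).is_dir() for d in dirs):
                        broken.add(conf.name)
    return sorted(broken)
```

Touching the filesystem for every package is exactly the cost described above, which is why this would make more sense as an occasional manual command than as part of every build.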

@wolftune

Contributor

wolftune commented Aug 31, 2015

Got it, thanks for the clarity. But a full refresh would obviously work if someone wants to actually trash .stack and .stack-work and recompile everything. I'm clear about it now, cheers.

@snoyberg

Contributor

snoyberg commented Aug 31, 2015

And "that was fast" because I was on a train to and from Tel Aviv for 3 hours yesterday :)

I wrote up a blog post about this implementation, and I think I'll publish it tomorrow. It contains some of these details. And you're mentioned by name actually :)

@3noch

Member

3noch commented Aug 31, 2015

This is awesome. How does this relate to Cabal's newer "backpacking" features? Perhaps your blog covers that.

@snoyberg

Contributor

snoyberg commented Aug 31, 2015

Yes it does

@snoyberg

Contributor

snoyberg commented Sep 1, 2015

@3noch

Member

3noch commented Sep 1, 2015

Perfect.
