Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite fetchgitLocal #10176

Merged
merged 1 commit into from
Oct 8, 2015
Merged

Rewrite fetchgitLocal #10176

merged 1 commit into from
Oct 8, 2015

Conversation

Ericson2314
Copy link
Member

For practical purposes, here are the changes in behavior:

  • When fetching from a subdirectory of a repo, do not rebuild because of
    changes elsewhere in the repo
  • Fetch (not-ignored) untracked files too

It does this by letting git hash and export the directory in question, which I believes makes for a cleaner implementation than the ad-hoc copying and hashing that was there before.

We have multiple library in our repo (see https://github.com/unisonweb/platform/blob/master/env.nix) and it was very annoying to needless rebuild everything if anything changes.

@pchiusano

For practical purposes, here are the changes in behavior:
 - When fetching from a subdirectory of a repo, do not rebuild because of
   changes elsewhere in the repo
 - Fetch (not-ignored) untracked files too

It does this by letting git hash and export the directory in question,
which I believes makes for a cleaner implementation than the ad-hoc copying
and hashing that was there before.

cp $ROOT/.git/index $ROOT/.git/index-user # backup index
git reset # reset index
git add . # add current directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This git add will be reversed by the mv command on line 24, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.

@puffnfresh
Copy link
Member

I like it 👍

@Ericson2314
Copy link
Member Author

Thanks!

gridaphobe added a commit that referenced this pull request Oct 8, 2015
@gridaphobe gridaphobe merged commit 9be18c4 into NixOS:master Oct 8, 2015
@gridaphobe
Copy link
Contributor

👍 thanks!

@edolstra
Copy link
Member

edolstra commented Oct 8, 2015

Eh. I'm not in favor of using builtins.currentTime in Nixpkgs. It's an undocumented misfeature that shouldn't be relied on.

@gridaphobe
Copy link
Contributor

@edolstra is there a better alternative?

This use of builtins.currentTime is essentially a hack to mark the computed hash as impure, so nix will recompute it every time. Without this step, nix is too clever and skips rebuilding the local package because the the source directory is the only other input and it is invariant.

Is there a way to mark expressions as impure? That's what we really want to do for the hash computation.

@Ericson2314
Copy link
Member Author

@edolstra First of all, the old version used the same basic trick, I did not introduce it.

The idea, (before and after my changes) is the first derivation is impure, and generates a "token" used syntactically (via a dynamic import, readFile, etc). The second derivation, taking that "token" as a syntactic input (i.e. we have a dynamic import or readFile, both of which are monadic bind-like constructs) is thus pure---ideally the token would be a fixed-output hash to "proove" the purity.

I'd say the root problem is our treatment of local paths. It's impossible to write a derivation that relies on paths outside of the nix-store and isn't a fixed out, and expect it to be deterministic. Our current trick of copying local paths to the store amounts to the same thing as this fetch git local stuff: an initial impure step carefully constructed so as not to invalidate caching downstream.

Perhaps we ought to generalize this with impure derivations: they get built every time, but are hashed according to their contents not their inputs so as not to affect downstream. Instead of always copying local paths to the store, hopefully sandboxed builds can be the default so such paths just aren't accessible at all.

Note the once we have an intentional store, the content-hashing part of such impure derivations would simply fall out of the normal rules. Also, everything here could in fact be done in one big impure derivation, and downstream would still be cached just as well.

@Ericson2314
Copy link
Member Author

@edolstra Also, this could be rewritten with a src = ./. to copy the entire git repo, and then extract the correct directory, which I suppose is the conventional way of doing such things today. But that would be much inferior. First of all, since the git repo contains all the history, it can be be indefinitely bigger than the working copy. Second of all, the hash then tracks a bunch of irrelevant other state and thus one gets many more spurious rebuilds.

@gridaphobe
Copy link
Contributor

@Ericson2314 that's exactly why I wrote fetchgitLocal in the first place. Using src = ./. causes needless recompilation due to build artifacts, and can also be extremely slow since it has to hash all of the build artifacts.

@Ericson2314
Copy link
Member Author

Sure! Just want to make why things are the way they are for @edolstra or anyone else that is interested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants