Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an optional `hash` parameter to `builtins.fetchGit` #3216

Open
wants to merge 1 commit into
base: master
from

Conversation

@Ma27
Copy link
Member

Ma27 commented Nov 9, 2019

cc @globin @xbreak


This is particularly useful to ensure the purity e.g. of a repo which is
pinned to a git tag (tags can be force-pushed which makes builds silently irreproducible)
or when trying to evaluate a nix expression without internet connectivity (if the
cache-entry is expired, the eval will break as the git repo can't be re-fetched).

This change adds an optional hash parameter (with an SRI-hash[1]) to the builtin
fetchGit and checks if there's a content-addressable path which matches the hash
(the hash can be determined using nix to-sri --type sha256 $(nix-prefetch-git ...)).

If no such path exists, it will be attempted to fetch the Git repository in
order to compare the hashes. Please note that caching is still used here, so
if the repo is already fetched and the only hash in the expression changes,
the evaluation will fail pretty fast.

[1] https://www.w3.org/TR/SRI/

Copy link
Member

FRidh left a comment

I also started to have my doubts about refs like in #2582. For example

builtins.fetchGit {
  url = https://github.com/NixOS/nixpkgs.git;
  ref = "refs/pull/1024/head;
}

While I want to be able to fetch them, they are mutable. Maybe we should discourage the use of them?

doc/manual/expressions/builtins.xml Outdated Show resolved Hide resolved
doc/manual/expressions/builtins.xml Outdated Show resolved Hide resolved
@Ma27

This comment has been minimized.

Copy link
Member Author

Ma27 commented Nov 9, 2019

While I want to be able to fetch them, they are mutable

The ref argument is needed to e.g. fetch git tags (checking out a git tag as a rev didn't work here, as far as I understand the fetchgit code from nixpkgs, ref/rev is mixed together, so rev = "v1.2.3" works there, but not with builtins.fetchGit).

@Ericson2314

This comment has been minimized.

Copy link
Member

Ericson2314 commented Nov 13, 2019

@matthewbauer

This comment has been minimized.

Copy link
Member

matthewbauer commented Nov 13, 2019

While I want to be able to fetch them, they are mutable

The ref argument is needed to e.g. fetch git tags (checking out a git tag as a rev didn't work here, as far as I understand the fetchgit code from nixpkgs, ref/rev is mixed together, so rev = "v1.2.3" works there, but not with builtins.fetchGit).

I would consider fetchGit with "rev" to be fixed-output and fetchGit with "ref" to be not fixed-output. In Git, branches and tags are considered references, and are not content-addressable (although tags require force push to change). Perhaps another arg could be added like tag or namedRev for only the tag subset of refs.

@Ma27

This comment has been minimized.

Copy link
Member Author

Ma27 commented Nov 13, 2019

@globin and I actually discussed about this last weekend :)

The reason why ref is currently used is because tags can be fetched with it and we use tags (although those are not 100% content-addressable) pretty often in nixpkgs.

Perhaps another arg could be added like tag or namedRev for only the tag subset of refs.

That's a good idea actually! Will implement this probably next weekend :)

@edolstra

This comment has been minimized.

Copy link
Member

edolstra commented Nov 14, 2019

Could you replace sha256 with generic hash argument containing an SRI hash, as @FRidh suggests? We've been trying to move away from sha1/sha256/... arguments (see e.g. <nix/fetchurl.nix>).

@edolstra

This comment has been minimized.

Copy link
Member

edolstra commented Nov 14, 2019

I would consider fetchGit with "rev" to be fixed-output and fetchGit with "ref" to be not fixed-output.

To be clear, neither is fixed-output because they're not derivations. But they both produce content-addressable store paths. The difference is that with rev you'll always get the same CA store path and with ref you won't necessarily. But in neither case can you compute the store path from the rev or ref alone.

Morally speaking, fetching using a ref and a hash is not worse than using rev.

@matthewbauer

This comment has been minimized.

Copy link
Member

matthewbauer commented Nov 14, 2019

Morally speaking, fetching using a ref and a hash is not worse than using rev.

Won't fetching using a ref and a hash inevitably lead to hash mismatch errors, though? Perhaps to Nix it's all the same, but not to Git. The rev commit hash is content-addressable: derived from its parent commit(s), working tree, and commit metadata. Nix could be taught one day to recognize it as such, making the hash argument here optional. On the other hand we could never do the same for ref as it is only a pointer to a commit.

@globin

This comment has been minimized.

Copy link
Member

globin commented Nov 14, 2019

Yeah but we often want to pin on a tag and guard that with the hash to notice if people force push a new version and still have a pure evaluation.

@edolstra

This comment has been minimized.

Copy link
Member

edolstra commented Nov 14, 2019

Nix could be taught one day to recognize it as such, making the hash argument here optional.

That would be tricky because the rev is a hash over the entire history rather than the current tree. So unless we copy the entire history to the store, Nix won't be able to verify the rev.

BTW with flakes it's fine to use just a tag/branch because it will get locked to a specific rev in the lock file.

@Ma27 Ma27 force-pushed the Ma27:sha256-support-for-builtins.fetchGit branch from 0872bb5 to 1b1b6ab Nov 14, 2019
@Ma27 Ma27 changed the title Add an optional `sha256` parameter to `builtins.fetchGit` Add an optional `hash` parameter to `builtins.fetchGit` Nov 14, 2019
@Ma27

This comment has been minimized.

Copy link
Member Author

Ma27 commented Dec 6, 2019

@edolstra may I ask if there's anything else to add/fix? :)

@Ericson2314

This comment has been minimized.

Copy link
Member

Ericson2314 commented Dec 9, 2019

@edolstra Matt and I were also talking about using tree hashes, which Nix could verify without storing history.

This is particularly useful to ensure the purity e.g. of a repo which is
pinned to a git tag (tags can be force-pushed which makes builds silently irreproducible)
or when trying to evaluate a nix expression without internet connectivity (if the
cache-entry is expired, the eval will break as the git repo can't be re-fetched).

This change adds an optional `hash` parameter (with an SRI-hash[1]) to the builtin
`fetchGit` and checks if there's a content-addressable path which matches the hash
(the hash can be determined using `nix to-sri --type sha256 $(nix-prefetch-git ...)`).

If no such path exists, it will be attempted to fetch the Git repository in
order to compare the hashes. Please note that caching is still used here, so
if the repo is already fetched and the only hash in the expression changes,
the evaluation will fail pretty fast.

[1] https://www.w3.org/TR/SRI/
@Ma27 Ma27 force-pushed the Ma27:sha256-support-for-builtins.fetchGit branch from 1b1b6ab to 24e14cc Dec 21, 2019
@Ma27

This comment has been minimized.

Copy link
Member Author

Ma27 commented Dec 21, 2019

Just updated the patch to work with the refactorings on the store API on master :)

@Ma27

This comment has been minimized.

Copy link
Member Author

Ma27 commented Jan 17, 2020

@edolstra @Ericson2314 is there anything I can address here atm? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.