Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nix-prefetch-git: don't fetch everything when given a hash #130040

Merged
merged 1 commit into from Jul 17, 2021

Conversation

Atemu
Copy link
Member

@Atemu Atemu commented Jul 12, 2021

Motivation for this change

It's hugely inefficient as we can't use shallow cloning (--depth=1).

This has been tested and adapted for quite a few hosts fetchgit is used on in
Nixpkgs. For those where fetching the hash directly doesn't work (most notably
git.savannah.gnu.org), we simply fall back to the old method.

@danielfullmer this might be especially interesting for robotnix since all its sources are fetchgits with hashes as rev and most have huge histories we don't need, right? I haven't done in-depth benchmarking but fetching vendor/oneplus went from 5GiB to 1.26GiB and ~4min to 2min.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Relase notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

It's hugely inefficient as we can't use shallow cloning (--depth=1).

This has been tested and adapted for quite a few hosts fetchgit is used on in
Nixpkgs. For those where fetching the hash directly doesn't work (most notably
git.savannah.gnu.org), we simply fall back to the old method.
@r-rmcgibbo
Copy link

r-rmcgibbo commented Jul 12, 2021

Result of nixpkgs-review pr 130040 at 11df411 run on aarch64-linux 1

2 packages failed to build:
12 packages built successfully:
  • bundix
  • cabal2nix
  • crate2nix
  • dep2nix
  • go2nix
  • haskellPackages.cabal2nix-unstable
  • haskellPackages.nvfetcher
  • nix-prefetch-git
  • nix-prefetch-scripts
  • nix-update-source
  • nvfetcher
  • vgo2nix

Note that build failures may predate this PR, and could be nondeterministic or hardware dependent.
Please exercise your independent judgement. Does something look off? Please file an issue or reach out on IRC.


Result of nixpkgs-review pr 130040 at 11df411 run on x86_64-linux 1

2 packages failed to build:
13 packages built successfully:
  • bundix
  • cabal2nix
  • crate2nix
  • crystal2nix
  • dep2nix
  • go2nix
  • haskellPackages.cabal2nix-unstable
  • haskellPackages.nvfetcher
  • nix-prefetch-git
  • nix-prefetch-scripts
  • nix-update-source
  • nvfetcher
  • vgo2nix

Note that build failures may predate this PR, and could be nondeterministic or hardware dependent.
Please exercise your independent judgement. Does something look off? Please file an issue or reach out on IRC.

@jonringer
Copy link
Contributor

cc @Mic92 @infinisil

@danielfullmer
Copy link
Contributor

@Atemu I can also confirm this would significantly decrease the download requirements for various repos in the lineageos flavor, since they do not provide tagged releases. (e.g. vendor/google goes from 3GiB to 1.1GiB)

Thanks!

@Mic92
Copy link
Member

Mic92 commented Jul 17, 2021

Result of nixpkgs-review pr 130040 run on x86_64-linux 1

2 packages failed to build:
  • haskellPackages.update-nix-fetchgit
  • update-nix-fetchgit
13 packages built:
  • bundix
  • cabal2nix
  • crate2nix
  • crystal2nix
  • dep2nix
  • go2nix
  • haskellPackages.cabal2nix-unstable
  • haskellPackages.nvfetcher
  • nix-prefetch-git
  • nix-prefetch-scripts
  • nix-update-source
  • nvfetcher
  • vgo2nix

@Mic92 Mic92 merged commit 6f18141 into NixOS:master Jul 17, 2021
@Atemu Atemu deleted the nix-prefetch-git-shallow-hash branch July 17, 2021 13:54
@Atemu
Copy link
Member Author

Atemu commented Jul 17, 2021

Thanks!

@chkno
Copy link
Member

chkno commented Feb 3, 2022

11df411 changes the hash of some fetches, causing hash-mismatch failures.

Example affected package: python3Packages.brotli

Before this change:

$ nix-build --option substitute false . -A python3Packages.brotli.src
this derivation will be built:
  /nix/store/xmvxd0h7ca3cwpnjciv21m4rxllwvx7v-source.drv
building '/nix/store/xmvxd0h7ca3cwpnjciv21m4rxllwvx7v-source.drv'...
exporting https://github.com/google/brotli.git (rev v1.0.9) into /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source
Initialized empty Git repository in /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/
remote: Enumerating objects: 6539, done.
remote: Counting objects: 100% (460/460), done.
remote: Compressing objects: 100% (286/286), done.
remote: Total 6539 (delta 225), reused 319 (delta 159), pack-reused 6079
Receiving objects: 100% (6539/6539), 39.19 MiB | 10.45 MiB/s, done.
Resolving deltas: 100% (4138/4138), done.
From https://github.com/google/brotli
 * [new branch]      custom-dictionary -> origin/custom-dictionary
 * [new branch]      master            -> origin/master
 * [new branch]      recompression     -> origin/recompression
 * [new branch]      shared_static     -> origin/shared_static
 * [new branch]      upd09b            -> origin/upd09b
 * [new branch]      v1.0              -> origin/v1.0
 * [new tag]         v0.1.0            -> v0.1.0
 * [new tag]         v0.2.0            -> v0.2.0
 * [new tag]         v0.3.0            -> v0.3.0
 * [new tag]         v0.4.0            -> v0.4.0
 * [new tag]         v0.5.2            -> v0.5.2
 * [new tag]         v0.6.0            -> v0.6.0
 * [new tag]         v1.0.0            -> v1.0.0
 * [new tag]         v1.0.1            -> v1.0.1
 * [new tag]         v1.0.2            -> v1.0.2
 * [new tag]         v1.0.3            -> v1.0.3
 * [new tag]         v1.0.4            -> v1.0.4
 * [new tag]         v1.0.5            -> v1.0.5
 * [new tag]         v1.0.6            -> v1.0.6
 * [new tag]         v1.0.7            -> v1.0.7
 * [new tag]         v1.0.8            -> v1.0.8
 * [new tag]         v1.0.9            -> v1.0.9
Switched to a new branch 'fetchgit'
Deleted remote-tracking branch origin/custom-dictionary (was 5244106).
Deleted remote-tracking branch origin/master (was f4153a0).
Deleted remote-tracking branch origin/recompression (was 0c5603e).
Deleted remote-tracking branch origin/shared_static (was fb139e4).
Deleted remote-tracking branch origin/upd09b (was a05a6a0).
Deleted remote-tracking branch origin/v1.0 (was c6333e1).
Enumerating objects: 5990, done.
Counting objects: 100% (5990/5990), done.
Compressing objects: 100% (5848/5848), done.
Writing objects: 100% (5990/5990), done.
Total 5990 (delta 4028), reused 1567 (delta 0), pack-reused 0
Enumerating objects: 5990, done.
Nothing new to pack.
/nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source

After this change:

$ nix-build --option substitute false . -A python3Packages.brotli.src
this derivation will be built:
  /nix/store/hxf6y3fsbmriyc6mmsn5zbj0kl0dny91-source.drv
building '/nix/store/hxf6y3fsbmriyc6mmsn5zbj0kl0dny91-source.drv'...
exporting https://github.com/google/brotli.git (rev v1.0.9) into /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source
Initialized empty Git repository in /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/
remote: Enumerating objects: 377, done.
remote: Counting objects: 100% (377/377), done.
remote: Compressing objects: 100% (330/330), done.
remote: Total 377 (delta 28), reused 164 (delta 6), pack-reused 0
Receiving objects: 100% (377/377), 22.60 MiB | 10.66 MiB/s, done.
Resolving deltas: 100% (28/28), done.
From https://github.com/google/brotli
 * branch            e61745a6b7add50d380cfd7d3883dd6c62fc2c71 -> FETCH_HEAD
Switched to a new branch 'fetchgit'
Enumerating objects: 377, done.
Counting objects: 100% (377/377), done.
Compressing objects: 100% (336/336), done.
Writing objects: 100% (377/377), done.
Total 377 (delta 28), reused 348 (delta 0), pack-reused 0
Enumerating objects: 377, done.
Nothing new to pack.
error: hash mismatch in fixed-output derivation '/nix/store/hxf6y3fsbmriyc6mmsn5zbj0kl0dny91-source.drv':
         specified: sha256-8CtCUt1UbK+QAv8Kk7tqesnBHiNvlNvjMQSfFHpOt+U=
            got:    sha256-AnhOM1mtiMf3ryly+jiTYOAGhyDSoDvjC3Fp+F4StEU=

Diff between python3Packages.brotli.src's fetch result before and after this change:

$ diff -Nur /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source 
diff -Nur /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/info/refs /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/info/refs
--- /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/info/refs   1969-12-31 16:00:01.000000000 -0800
+++ /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/info/refs   1969-12-31 16:00:01.000000000 -0800
@@ -1,17 +1 @@
 e61745a6b7add50d380cfd7d3883dd6c62fc2c71       refs/heads/fetchgit
-d811b186c5037b434d56ddb831ceccdf5a954687       refs/tags/v0.1.0
-7f7a2fb48cec63c0459ec6b6e7260810bfb01819       refs/tags/v0.2.0
-98ed7a23a83d64133b0a36a884e489bffb0eb864       refs/tags/v0.3.0
-29d31d5921b0a2b323ac24e7f7d0cdc9a3c0dd08       refs/tags/v0.4.0
-66c14517cf8afcc1a1649a7833ac789366eb0b51       refs/tags/v0.5.2
-46c1a881b41bb638c76247558aa04b1591af3aa7       refs/tags/v0.6.0
-c60563591a9a86196f19987c81dde4384a088861       refs/tags/v1.0.0
-5b4769990dc14a2bd466d2599c946c5652cba4b2       refs/tags/v1.0.1
-0ad94eed00420bf1154cb16a289aa27efbb30c01       refs/tags/v1.0.2
-533843e3546cd24c8344eaa899c6b0b681c8d222       refs/tags/v1.0.3
-c6333e1e79fb62ea088443f192293f964409b04e       refs/tags/v1.0.4
-b601fe817bd3217cb144bbb380a43cae8e847388       refs/tags/v1.0.5
-6eba239a5bb553fd557b7d78f7da8f0059618b9e       refs/tags/v1.0.6
-d6d98957ca8ccb1ef45922e978bb10efca0ea541       refs/tags/v1.0.7
-db361a0bb901d6a71c7cbf1370d97b3703482e3b       refs/tags/v1.0.8
-e61745a6b7add50d380cfd7d3883dd6c62fc2c71       refs/tags/v1.0.9
Binary files /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/info/commit-graph and /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/info/commit-graph differ
diff -Nur /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/info/packs /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/info/packs
--- /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/info/packs  1969-12-31 16:00:01.000000000 -0800
+++ /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/info/packs  1969-12-31 16:00:01.000000000 -0800
@@ -1,2 +1,2 @@
-P pack-fd16202a7e51865a14d05403d4684e1aa23a263c.pack
+P pack-6c928cc90e4d56875142dd7ee3f04ba1d9a25c5c.pack
 
Binary files /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/pack/pack-6c928cc90e4d56875142dd7ee3f04ba1d9a25c5c.idx and /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/pack/pack-6c928cc90e4d56875142dd7ee3f04ba1d9a25c5c.idx differ
Binary files /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/pack/pack-6c928cc90e4d56875142dd7ee3f04ba1d9a25c5c.pack and /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/pack/pack-6c928cc90e4d56875142dd7ee3f04ba1d9a25c5c.pack differ
Binary files /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/pack/pack-fd16202a7e51865a14d05403d4684e1aa23a263c.idx and /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/pack/pack-fd16202a7e51865a14d05403d4684e1aa23a263c.idx differ
Binary files /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/objects/pack/pack-fd16202a7e51865a14d05403d4684e1aa23a263c.pack and /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/objects/pack/pack-fd16202a7e51865a14d05403d4684e1aa23a263c.pack differ
diff -Nur /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/packed-refs /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/packed-refs
--- /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/packed-refs 1969-12-31 16:00:01.000000000 -0800
+++ /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/packed-refs 1969-12-31 16:00:01.000000000 -0800
@@ -1,18 +1,2 @@
 # pack-refs with: peeled fully-peeled sorted 
 e61745a6b7add50d380cfd7d3883dd6c62fc2c71 refs/heads/fetchgit
-d811b186c5037b434d56ddb831ceccdf5a954687 refs/tags/v0.1.0
-7f7a2fb48cec63c0459ec6b6e7260810bfb01819 refs/tags/v0.2.0
-98ed7a23a83d64133b0a36a884e489bffb0eb864 refs/tags/v0.3.0
-29d31d5921b0a2b323ac24e7f7d0cdc9a3c0dd08 refs/tags/v0.4.0
-66c14517cf8afcc1a1649a7833ac789366eb0b51 refs/tags/v0.5.2
-46c1a881b41bb638c76247558aa04b1591af3aa7 refs/tags/v0.6.0
-c60563591a9a86196f19987c81dde4384a088861 refs/tags/v1.0.0
-5b4769990dc14a2bd466d2599c946c5652cba4b2 refs/tags/v1.0.1
-0ad94eed00420bf1154cb16a289aa27efbb30c01 refs/tags/v1.0.2
-533843e3546cd24c8344eaa899c6b0b681c8d222 refs/tags/v1.0.3
-c6333e1e79fb62ea088443f192293f964409b04e refs/tags/v1.0.4
-b601fe817bd3217cb144bbb380a43cae8e847388 refs/tags/v1.0.5
-6eba239a5bb553fd557b7d78f7da8f0059618b9e refs/tags/v1.0.6
-d6d98957ca8ccb1ef45922e978bb10efca0ea541 refs/tags/v1.0.7
-db361a0bb901d6a71c7cbf1370d97b3703482e3b refs/tags/v1.0.8
-e61745a6b7add50d380cfd7d3883dd6c62fc2c71 refs/tags/v1.0.9
diff -Nur /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/shallow /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/shallow
--- /nix/store/hzmm1ws2xmhyl8j91whc9vjq03xn5dbw-source/.git/shallow     1969-12-31 16:00:00.000000000 -0800
+++ /nix/store/dh3mb74v22xv2nbfgz90r8j4dcvv7jnb-source/.git/shallow     1969-12-31 16:00:01.000000000 -0800
@@ -0,0 +1 @@
+e61745a6b7add50d380cfd7d3883dd6c62fc2c71

Seems like this is intentional and the thing to do is to go update the other packages' fetch hashes like #158043 ?

@Atemu
Copy link
Member Author

Atemu commented Feb 4, 2022

No, I think the problem is that the .git dir is included in the output. I'd consider that a bug though, the hash should not depend on some internal data format. I'm not even sure it stays stable anyways? I guess we only fetch a specific rev but I'm not sure git will always pack it in the exact same way there.

We can disable this optimisation when the git dir is included in the output though.

@jbgi
Copy link
Contributor

jbgi commented Jun 30, 2022

We can disable this optimisation when the git dir is included in the output though.

also this optimization ignores deepClone = true.

@Atemu
Copy link
Member Author

Atemu commented Jun 30, 2022

@jbgi could you open a more detailed bug report on that?

@jbgi
Copy link
Contributor

jbgi commented Jul 1, 2022

yes. ☝️

@ajs124
Copy link
Member

ajs124 commented Jul 6, 2022

This should maybe have updated the comment in fetchgit/default.nix, because it still claims that git does not support fetching revisions.

Maybe something like:

--- a/pkgs/build-support/fetchgit/default.nix
+++ b/pkgs/build-support/fetchgit/default.nix
@@ -32,17 +32,20 @@ in
 }:
 
 /* NOTE:
-   fetchgit has one problem: git fetch only works for refs.
-   This is because fetching arbitrary (maybe dangling) commits may be a security risk
-   and checking whether a commit belongs to a ref is expensive. This may
-   change in the future when some caching is added to git (?)
+   fetchgit had one problem: git fetch only worked for refs.
+   This was because fetching arbitrary (maybe dangling) commits might have been a security risk
+   and checking whether a commit belongs to a ref was expensive. This changed in 2015
+   with the release of git 2.5.0 https://github.com/git/git/blob/master/Documentation/RelNotes/2.5.0.txt#L106
    Usually refs are either tags (refs/tags/*) or branches (refs/heads/*)
    Cloning branches will make the hash check fail when there is an update.
    But not all patches we want can be accessed by tags.
 
-   The workaround is getting the last n commits so that it's likely that they
+   The workaround was getting the last n commits so that it it's likely that they
    still contain the hash we want.
 
+   As the new behaviour is optional, we try the new one, but fall back.
+   This change was introduced in 11df41199ba8a1497d8530dbf478ef10147a93bb.
+
    for now : increase depth iteratively (TODO)
 
    real fix: ask git folks to add a

@infinisil
Copy link
Member

infinisil commented Sep 2, 2023

This should either have been fixed or reverted when such regressions were discovered.. (Edit: Oh, it was discovered rather late.. No tests..?)

Edit: #252865

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/deepclone-in-fetchgit-does-not-actually-yield-a-deep-clone/15028/5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants