Nix cached truncated files downloaded #4533

Open
oxalica opened this issue Feb 8, 2021 · 27 comments
Labels
feature Feature request or proposal

Comments

@oxalica
Contributor

oxalica commented Feb 8, 2021

Describe the bug

On some network or proxy server errors, a download ends too early and yields a truncated file. Nix still caches the truncated file, so every subsequent run fails immediately with truncated gzip input, without refetching.

This issue happens because:

  1. The response for a GitHub tarball URL does NOT contain Content-Length, so curl cannot validate the output length.
  2. Some network errors, such as the proxy server being killed, do not RESET the connection. curl then returns zero but produces a truncated file.
  3. Nix caches the output after curl returns zero but before extracting it, so the truncated file ends up in the cache.
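
The effect of an early close can be reproduced locally without a proxy. The following is a minimal sketch that assumes only standard gzip and coreutils; the helper name is_complete_gzip and the file names are made up for illustration, and it does not reflect how Nix itself checks downloads. It shows that a file cut short looks like any other file until something actually tries to decompress it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the file is a complete gzip stream.
is_complete_gzip() {
  gzip -t "$1" 2>/dev/null
}

demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT

# Build a valid gzip file, then cut it short to mimic an interrupted download.
head -c 100000 /dev/urandom | gzip > "$demo_dir/full.gz"
head -c 50000 "$demo_dir/full.gz" > "$demo_dir/truncated.gz"

for name in full truncated; do
  if is_complete_gzip "$demo_dir/$name.gz"; then
    echo "$name: complete"
  else
    echo "$name: truncated or corrupt"
  fi
done
```

Without a Content-Length header, a check like this on the payload is one of the few ways to notice the problem after the transfer has "succeeded".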

Steps To Reproduce

  1. Set up a proxy server.
  2. Run nix-build -A osu-lazer https://github.com/r-ryantm/nixpkgs/archive/cc77f910fc1a06b7cb7eb43639c5904540483c70.tar.gz, or use any other GitHub archive URL.
  3. During the download, kill the proxy server. nix fails with truncated gzip input.
  4. Run the command again. It immediately fails with truncated gzip input again, without any network access.

Expected behavior
Nix should cache the downloaded file only if it can be unpacked successfully,
so that the next run can re-download the file and rebuild.

nix-env --version output
nix-env (Nix) 2.4pre20201205_a5d85d0

@oxalica oxalica added the bug label Feb 8, 2021
@edolstra
Member

edolstra commented Feb 8, 2021

This sounds like a bug in the proxy server. There is not much we can do if it's serving corrupted files...

@oxalica
Contributor Author

oxalica commented Feb 8, 2021

@edolstra

This sounds like a bug in the proxy server.

kill -9 produces the same result. I think it's the kernel cleanup behavior (close instead of reset),
which is not simple to control.

On the Nix side, extracting (or otherwise verifying) the file before caching it would be enough to fix this issue.
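
A minimal sketch of that verify-before-cache ordering, written as a shell function rather than as a patch to Nix. The name cache_if_valid and the directory layout are hypothetical, and the check used here (gzip -t plus listing the archive with tar) is just one possible form of verification.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not Nix's actual code: move a finished download into a
# cache directory only if the gzip stream is complete and tar can list it.
cache_if_valid() {
  local downloaded=$1 cache_dir=$2
  if gzip -t "$downloaded" 2>/dev/null &&
     tar -tzf "$downloaded" >/dev/null 2>&1; then
    mkdir -p "$cache_dir"
    mv "$downloaded" "$cache_dir/"
  else
    echo "refusing to cache incomplete archive: $downloaded" >&2
    rm -f "$downloaded"
    return 1
  fi
}

# Demo: one complete tarball and one cut short after 20 bytes.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src"
echo hello > "$work/src/file.txt"
tar -czf "$work/good.tar.gz" -C "$work" src
head -c 20 "$work/good.tar.gz" > "$work/bad.tar.gz"

cache_if_valid "$work/good.tar.gz" "$work/cache"
cache_if_valid "$work/bad.tar.gz" "$work/cache" || echo "bad.tar.gz was not cached"
```

With this ordering a truncated download is discarded instead of being remembered, so the next run has to fetch it again.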

@oxalica
Contributor Author

oxalica commented Mar 12, 2021

Fixed on the curl side: curl/curl@d1f4007
We can just wait for the next curl release.

@oxalica oxalica closed this as completed Mar 12, 2021
@rembo10

rembo10 commented Dec 14, 2021

Hm, I think I'm experiencing this. I don't have a great internet connection at the moment, and I think the download was interrupted. Not using a proxy server, just nixpkgs.url = "nixpkgs/nixos-21.11" (my only input)

Now if I try to run it again I get the same error as above:

error: failed to extract archive (truncated gzip input)

Using nix 2.4

@corwin-of-amber

This kept happening and I was unable to figure out which file was truncated. The error does not specify the location and I was unable to turn on more verbosity (tried -L following this #1904, had no effect).

I ended up having to reinstall Nix. This is a very bad user experience. Please reopen this issue.

@charlesbaynham

charlesbaynham commented Jun 15, 2022

I ran into this too (no proxy server involved) - I tried a nix-store --verify --repair --check-contents to repair my store but this didn't work for some reason.

In the end, I had to do nix-store --gc and clear out my cache completely.

@dzmitry-lahoda

vscode ➜ /workspaces/composable (dz/byog-container) $ nix run github:ComposableFi/Composable/49473d1e4a86abe62abfad5648532dab3cef15ec#devnet-xcvm-up -L --show-trace --
error: failed to extract archive (truncated gzip input)

       … while fetching the input 'github:ComposableFi/Composable/49473d1e4a86abe62abfad5648532dab3cef15ec'

Using Cachix.

Half an hour ago it worked on another machine. It is a new cache and we have a 1TB plan.

@dzmitry-lahoda

nix-store --verify --repair --check-contents did not help.

@domenkozar domenkozar reopened this Aug 30, 2022
@domenkozar
Member

@dzmitry-lahoda are you able to nix-store --delete the offending store path?

@dzmitry-lahoda

Nice idea, I will try it next time. I garbage-collected the store, so I cannot retest now, but I guess deleting the specific package will work.

@omnibs

omnibs commented Apr 1, 2023

Can confirm nix-store --delete [nix-store-path] works, but it's not terribly easy to figure out what that path is.

It happened to me on the initial fetch of nixpkgs, and through a lot of trial and error I figured out that it would be in a /nix/store/*nixpkgs-src* path. Some of those paths were dirs, some were files. I guessed the dirs were successfully downloaded and unpacked ones, so I ignored those. Running gzip -t on the remaining paths helped me spot my truncated gzip.
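
That search can be sketched as a small loop. The function name find_truncated_gzips is made up for illustration, and the glob is the one from this comment, so it may need adjusting for other inputs (later comments show paths ending in -source). Note that any regular file that is not a gzip stream at all is also reported, since gzip -t fails on it too.

```shell
#!/usr/bin/env bash
# Print every regular file among the arguments that fails `gzip -t`.
# Directories are skipped, on the assumption that they are unpacked sources.
find_truncated_gzips() {
  local path
  for path in "$@"; do
    [ -f "$path" ] || continue
    gzip -t "$path" 2>/dev/null || echo "$path"
  done
}

# Assumed pattern; prints nothing if no file matches or all files are intact.
find_truncated_gzips /nix/store/*nixpkgs-src*
```

Each path it prints is a candidate for nix-store --delete.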

@jualvarez

@omnibs
Disclaimer: Newbie on Nix here!

I had the same issue and found that running (in my case) nix develop with the --debug option would output the exact path that I needed to delete.

ignoring disappeared cache entry '{"rev":"0a023762fc097047c0a16fa4d2bc3ef6012f4f44","type":"git-tarball"}'
ignoring disappeared cache entry '{"name":"source","type":"tarball","url":"https://api.github.com/repos/NixOS/nixpkgs/tarball/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}'
using cache entry '{"name":"source","type":"file","url":"https://api.github.com/repos/NixOS/nixpkgs/tarball/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}' -> '{"etag":"\"01fb19345a43fc5eec91d21637476fecd2906297c4c917c27ae83bd43c1607a4\"","url":"https://codeload.github.com/NixOS/nixpkgs/legacy.tar.gz/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}', '/nix/store/s1vb9z51ynxvi9c7q8398wsrv6yrj9vk-source'
error: failed to extract archive (truncated gzip input)

Then running

nix-store --delete /nix/store/s1vb9z51ynxvi9c7q8398wsrv6yrj9vk-source

Worked fine. But your post pointed me in the right direction. Thanks!
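
To avoid reading the debug output by eye, the path can be pulled out with grep. This is a rough sketch that assumes the log lines look like the ones quoted above and that the debug output is written to stderr; the function name last_cached_store_path is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `--debug` output on stdin and print the store path
# named by the last "using cache entry" line.
last_cached_store_path() {
  grep 'using cache entry' \
    | grep -o "'/nix/store/[^']*'" \
    | tail -n 1 \
    | tr -d "'"
}

# Example usage (assumes the debug log goes to stderr, hence 2>&1):
#   nix develop --debug 2>&1 | last_cached_store_path
```

The printed path can then be passed to nix-store --delete as shown above.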

@dzmitry-lahoda

dzmitry-lahoda commented Jul 27, 2023

Same here, provider is github:

    osmosis-src.flake = false;
    osmosis-src.url = github:osmosis-labs/osmosis/v16.1.1;

My colleague and I suddenly started getting these.

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ nix --version
nix (Nix) 2.16.0
dz@pop-os:~/github.com/informalsystems/cosmos.nix$ nix show-config
accept-flake-config = false
access-tokens = 
allow-dirty = true
allow-import-from-derivation = true
allow-new-privileges = false
allow-symlinked-store = false
allow-unsafe-native-code-during-evaluation = false
allowed-impure-host-deps = 
allowed-uris = 
allowed-users = *
auto-allocate-uids = false
auto-optimise-store = false
bash-prompt = 
bash-prompt-prefix = 
bash-prompt-suffix = 
build-hook = /nix/store/y49q7bwh5n5ybz8skxhpdypai032dsml-nix-2.16.0/bin/nix __build-remote
build-poll-interval = 5
build-users-group = nixbld
builders = @/etc/nix/machines
builders-use-substitutes = false
commit-lockfile-summary = 
compress-build-log = true
connect-timeout = 0
cores = 20
diff-hook = 
download-attempts = 5
download-speed = 0
eval-cache = true
experimental-features = flakes nix-command
extra-platforms = i686-linux x86_64-v1-linux x86_64-v2-linux x86_64-v3-linux
fallback = false
filter-syscalls = true
flake-registry = https://channels.nixos.org/flake-registry.json
fsync-metadata = true
gc-reserved-space = 8388608
hashed-mirrors = 
http-connections = 25
http2 = true
id-count = 8388608
ignore-try = false
ignored-acls = security.csm security.selinux system.nfs4_acl
impersonate-linux-26 = false
keep-build-log = true
keep-derivations = true
keep-env-derivations = false
keep-failed = false
keep-going = false
keep-outputs = false
log-lines = 10
max-build-log-size = 0
max-free = 18446744073709551615
max-jobs = 1
max-silent-time = 0
max-substitution-jobs = 16
min-free = 0
min-free-check-interval = 5
nar-buffer-size = 33554432
narinfo-cache-negative-ttl = 3600
narinfo-cache-positive-ttl = 2592000
netrc-file = /etc/nix/netrc
nix-path = /home/dz/.nix-defexpr/channels nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixpkgs /nix/var/nix/profiles/per-user/root/channels
plugin-files = 
post-build-hook = 
pre-build-hook = 
preallocate-contents = false
print-missing = true
pure-eval = true
require-sigs = true
restrict-eval = false
run-diff-hook = false
sandbox = relaxed
sandbox-build-dir = /build
sandbox-dev-shm-size = 50%
sandbox-fallback = true
sandbox-paths = /bin/sh=/nix/store/7b943a2k4amjmam6dnwnxnj8qbba9lbq-busybox-static-x86_64-unknown-linux-musl-1.35.0/bin/busybox
secret-key-files = 
show-trace = false
ssl-cert-file = /etc/ssl/certs/ca-certificates.crt
stalled-download-timeout = 300
start-id = 872415232
store = auto
substitute = true
substituters = https://cache.nixos.org/
sync-before-registering = false
system = x86_64-linux
system-features = benchmark big-parallel kvm nixos-test uid-range
tarball-ttl = 3600
timeout = 0
trace-function-calls = false
trace-verbose = false
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs= composable-community.cachix.org-1:GG4xJNpXJ+J97I8EyJ4qI5tRTAJ4i7h+NK2Z32I8sK8= helix.cachix.org-1:ejp9KQpR1FBI2onstMQ34yogDm4OgU2ru6lIwPvuCVs= mitchellh-nixos-config.cachix.org-1:bjEbXJyLrL1HZZHBbO4QALnI5faYZppzkU4D2s0G8RQ=
trusted-substituters = https://cache.nixos.org/ https://composable-community.cachix.org/ https://devenv.cachix.org/ https://nix-community.cachix.org/
trusted-users = dz root dzmitry-lahoda
use-case-hack = false
use-cgroups = false
use-registries = true
use-sqlite-wal = true
use-xdg-base-directories = false
user-agent-suffix = 
warn-dirty = true

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ 

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ uname -a
Linux pop-os 5.19.0-76051900-generic #202207312230~1663791054~22.04~28340d4 SMP PREEMPT_DYNAMIC Wed S x86_64 x86_64 x86_64 GNU/Linux

@pwaller
Contributor

pwaller commented Aug 4, 2023

Shouldn't there be a nar hash verified before corrupted files make it as far downstream as being extracted?

@dzmitry-lahoda

dzmitry-lahoda commented Aug 25, 2023

I reproduced this. It actually happens when the download just stops (not only via nix; that is a GitHub issue for some items).

But when it happens, I get this

image

I run nix store delete with the path from -L --debug. Then it starts again and fails.

So I think it is a Nix issue, because it must not store partially downloaded files in the store. This violates the store's integrity; for static derivations with well-known hashes this seems unacceptable.

I am on nix 2.17.

When GitHub started working again, I cleaned the store and curl downloaded fine, but nix was still getting stuck, as if it were caching the bad internet connection. I restarted the nix daemon.

Nix consistently gets stuck on

f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz'...
starting download of https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz
[3.3/0.0 MiB DL] downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar

The same value all the time.

But after a minute I got this:

did not find cache entry for '{"name":"source","type":"tarball","url":"https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
performing daemon worker op: 11
performing daemon worker op: 1
ignoring disappeared cache entry '{"name":"source","type":"file","url":"https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz'...
starting download of https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz
[64.7/84.6 MiB DL] downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.t

Fine. It feels like some Nix parameters and its usage of curl lead to the issue, but nix should not cache bad files anyway.

@dzmitry-lahoda

@edolstra

This sounds like a bug in the proxy server. There is not much we can do if it's serving corrupted files...

This is not the case. The issue is with how nix handles downloads.

@CorbanR

CorbanR commented Sep 4, 2023

I am running into the exact same issue

 - system: `"aarch64-darwin"`
 - host os: `Darwin 22.6.0, macOS 13.5.1`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.17.0`
 - channels(root): `"nixpkgs"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixpkgs`

I see it when trying to run

nix search nixpkgs#rclone
error:
       … while fetching the input 'github:NixOS/nixpkgs/nixpkgs-unstable'

       error: cannot get archive member name: truncated gzip input

I have an alias that runs sudo nix-collect-garbage && nix-collect-garbage && sudo nix-store --verify --check-contents --repair && sudo nix-store --optimise which seems to fix the issue. Although that is usually my last resort command when I'm running into issues.

@flemzord

flemzord commented Sep 4, 2023

I have the same problem in my CI: https://github.com/formancehq/stack/actions/runs/6077877454/job/16488949571

@domenkozar
Member

This is an issue with github that's happening since yesterday.

@cor

cor commented Sep 5, 2023

This is an issue with github that's happening since yesterday.

Same for us

@PlumpMath

I wonder when it will be over? I've been overhauling the entire flake.nix config settings because it hasn't been updated for several days. I mainly removed all the repositories that took a long time during the nix flake update. Lol.
It works well on the MacBook I have, but it doesn't install on the Cylinder Mac Pro... NixOS works well on other PCs. Hmm... I'm switching all darwin systems to determinate systems with nix. Anyway, it's only being installed and updated properly on a MacBook intel-mac. I'm not sure what the difference is, but there's no doubt that the GitHub server has gone haywire. So, I got separate access-tokens for each computer. The first one I received was for the MacBook.

@jimmidyson

jimmidyson commented Sep 5, 2023

Latest from https://status.github.com:

We have mitigated the impact on download and raw file operations and are seeing recovery on response times but are continuing to monitor.

🤞

@PlumpMath

PlumpMath commented Sep 6, 2023

For the Mac Pro 2013 (which can only be installed up to Monterey), I wondered why I was getting that error. Once I removed commercial-emacs, the build started without any issues. It's uncertain whether the nix community will offer the option to select commercial-emacs, but in any case, others might find this information useful. Naturally, there were no issues on other MacBooks with the latest Ventura installed. It seems there might be an issue when using repository sources not managed by nixpkgs, considering both the Mac version and the fact that there was an error on nixos as well.

@duijf

duijf commented Sep 6, 2023

If you used builtins.fetchTarball and you know which archive is broken, this is how you can fix this on a machine:

$ cat fetch.nix
builtins.fetchTarball {
  sha256 = "sha256-6GQ9ib4dA/r1leC5VUpsBo0BmDvNxLjKrX1iyL+h8mc=";
  url = "https://github.com/NixOS/nixpkgs/archive/e43e2448161c0a2c4928abec4e16eae1516571bc.tar.gz";
}

# Repro: if you get this error, you know you have the right archive:
$ nix-instantiate fetch.nix
error: cannot get archive member name: truncated gzip input

# Force nix to download the file again:
$ sudo nix-instantiate --option tarball-ttl 0 fetch.nix

@duijf

duijf commented Sep 6, 2023

This looks very similar to behaviour you see in cache-poisoning related issues.

Regardless of how the file was originally downloaded or how the upstream server behaved, the local Nix CLI / daemon should not cache invalid archives. In our particular case, we observed that Nix was caching these invalid archives for 1.5 days.

@tylerd-canva

My understanding is part of the issue here is that fetchTarball only verifies the checksum after it is extracted. In this case the content is incomplete -- an invalid archive -- and therefore fails to extract. The checksum is never validated. But either way, it's weird to cache an artifact that never makes it to the checksum step.

@ditsuke
Member

ditsuke commented Nov 27, 2023

Ran into this today, and indeed I would expect a truncated cached asset to be invalidated or not be stored in the cache at all until it's validated.
