nix-channel segfault: Fatal error: glibc detected an invalid stdio handle #2733

Closed
noonien opened this issue Mar 17, 2019 · 11 comments

@noonien

noonien commented Mar 17, 2019

% sha256sum $(readlink -f $(which nix-channel))
975bccf1d28b989c218ef3a795b5bd4a0d65c37d7330a4ac6943da2e0621ed9d  /nix/store/6qd3aj1dxsxd9ksa2n6nlk7sm3sva32d-nix-2.1.3/bin/nix-channel
% sudo sh -c 'ulimit -c unlimited; while true; do nix-channel --update || exit; done'
unpacking channels...
unpacking channels...
Fatal error: glibc detected an invalid stdio handle
sh: line 1: 19507 Aborted                 (core dumped) nix-channel --update

I can provide more info, and a core dump if needed, just ask here or in #nixos.

@mweinelt
Member

# coredumpctl dump 3533
           PID: 3533 (nix-channel)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Wed 2019-03-20 14:11:14 CET (11min ago)
  Command Line: nix-channel --update nixos
    Executable: /nix/store/kjjbqc6q8brqz87jil6w5hrym3di75k7-nix-2.2/bin/nix
 Control Group: /user.slice/user-1000.slice/session-2.scope
          Unit: session-2.scope
         Slice: user-1000.slice
       Session: 2
     Owner UID: 1000 (hexa)
       Boot ID: dfd08345e94c404ba355802ef8157a97
    Machine ID: 44d0f9f15ad7490f89336ab4d36ddcf9
      Hostname: nyx
       Storage: /var/lib/systemd/coredump/core.nix-channel.0.dfd08345e94c404ba355802ef8157a97.3533.1553087474000000.lz4
       Message: Process 3533 (nix-channel) of user 0 dumped core.
Refusing to dump core to tty (use shell redirection or specify --output).

To me this looks like a curl issue. I have a core dump as well, and I can enable debug symbols if necessary. I'm hexa- on #nixos.

(gdb) bt full
#0  0x00007f5b0e8f9be0 in raise () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#1  0x00007f5b0e8fadc1 in abort () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#2  0x00007f5b0e93b2ac in __libc_message () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#3  0x00007f5b0e93b2ce in __libc_fatal () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#4  0x00007f5b0e93ba18 in _IO_vtable_check () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#5  0x00007f5b0e9324b5 in fwrite () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
#6  0x00007f5b0e4b9e5d in Curl_debug () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#7  0x00007f5b0e4ba000 in Curl_infof () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#8  0x00007f5b0e4f3654 in http2_conncheck () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#9  0x00007f5b0e4c0677 in extract_if_dead () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#10 0x00007f5b0e4c4520 in Curl_connect () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#11 0x00007f5b0e4d60c7 in multi_runsingle () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#12 0x00007f5b0e4d73b3 in curl_multi_perform () from /nix/store/z1hlaaq623sl3iif64jwckwickfgzn15-curl-7.64.0/lib/libcurl.so.4
No symbol table info available.
#13 0x00007f5b0f18a9b5 in nix::CurlDownloader::workerThreadMain() () from /nix/store/kjjbqc6q8brqz87jil6w5hrym3di75k7-nix-2.2/lib/libnixstore.so
No symbol table info available.
#14 0x00007f5b0f18c51c in nix::CurlDownloader::workerThreadEntry() () from /nix/store/kjjbqc6q8brqz87jil6w5hrym3di75k7-nix-2.2/lib/libnixstore.so
No symbol table info available.
#15 0x00007f5b0ef0bd7f in ?? () from /nix/store/hlnxw4k6931bachvg5sv0cyaissimswb-gcc-7.4.0-lib/lib/libstdc++.so.6
No symbol table info available.
#16 0x00007f5b0ea83ef7 in start_thread () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libpthread.so.0
No symbol table info available.
#17 0x00007f5b0e9b722f in clone () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
No symbol table info available.
(gdb) 

@michojel

michojel commented Jun 15, 2019

I can reproduce this when running as root in a systemd service with Nix 2.2.2 on NixOS 19.03.172866.4649b6ef4b5 (Koi).
It works fine in a terminal, though.

The command is: ${pkgs.nix}/bin/nix-channel --update nixos-unstable

@nh2
Contributor

nh2 commented Jun 17, 2019

This happened to me when installing Nix on CircleCI (it doesn't happen deterministically; previous installs succeeded):

https://circleci.com/gh/nh2/static-haskell-nix/4

performing a single-user installation of Nix...
copying Nix to /nix/store.................................
initialising Nix database...
Nix: creating /home/circleci/.nix-profile
installing 'nix-2.2.2'
building '/nix/store/2c4l83wkdfrj7ra9k07l0aqvym1d5z20-user-environment.drv'...
created 6 symlinks in user environment
Fatal error: glibc detected an invalid stdio handle
Aborted (core dumped)
Exited with code 134

Apparently the currently known workaround is to retry :(
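
Since retrying is the only workaround mentioned so far, it can be wrapped in a small helper for CI scripts. The following is only a sketch: the `retry` function name, the `RETRY_DELAY` variable, and the attempt count are illustrative assumptions, not part of Nix or its installer.

```shell
# Hypothetical helper: run a command up to N times, stopping at the first success.
# Usage: retry <max_attempts> <command> [args...]
# RETRY_DELAY (seconds between attempts, default 1) is an illustrative knob.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while true; do
    # Run the command; return immediately if it succeeds.
    "$@" && return 0
    status=$?
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts (last exit status $status)" >&2
      return "$status"
    fi
    echo "attempt $attempt failed with status $status, retrying..." >&2
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-1}"
  done
}

# Example (not executed here):
#   retry 3 nix-channel --update
```

A crashed run exits with a non-zero status (134 in the log above, which is 128 plus the signal number of SIGABRT), so the helper treats it like any other failure. It only hides the crash; it does not address the cause.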

@flokli
Contributor

flokli commented Jul 31, 2019

Encountered the same segfault and dump that @mweinelt posted, in my case while invoking nix-shell, which led to things being downloaded.

@flokli
Contributor

flokli commented Jul 31, 2019

I was able to reproduce this multiple times at the same step in CI (with multiple substituters present, by the way). I'm now trying to reproduce it with a Nix build that has debug symbols enabled, to get a more detailed core dump.

However, as CI is flaky for other reasons too, and Nix is pretty slow when built with -Og, this might take a while (or never finish at all).

@arcnmx
Member

arcnmx commented Jul 31, 2019

It's notable that this occurs very frequently (roughly 50% of the time?) just by running the install script on an Azure Pipelines Ubuntu machine, if that helps anyone trying to reproduce it.

@flokli
Contributor

flokli commented Aug 5, 2019

So, we managed to get a somewhat better setup for reproducing this, and were able to narrow it down a bit:

  • The bug is already present in Nix 2.2.2, not just master.
  • Setting --option substituters "" caused the bug to disappear.
  • Disabling HTTP/2 with --option http2 false caused the bug to disappear. This requires "Fix http2 = false having no effect" (#2977), which is not included in Nix 2.2.2, or some manual fiddling with gdb to call curl_easy_setopt(req, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);

@domenkozar
Member

domenkozar commented Oct 2, 2019

TL;DR: Setting http2 = false in /etc/nix/nix.conf works around it in Nix 2.3.

Note that this won't affect the nix-channel calls made during installation, so I'm not sure yet how to fix that.
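
For scripted setups, the setting can be applied idempotently. This is a minimal sketch, assuming a writable configuration file; the `disable_http2` function name is hypothetical, and only the `http2 = false` line and the `/etc/nix/nix.conf` path come from the comment above.

```shell
# Hypothetical helper: make sure a Nix configuration file contains "http2 = false".
# Usage: disable_http2 [path-to-nix.conf]   (defaults to /etc/nix/nix.conf)
disable_http2() {
  conf=${1:-/etc/nix/nix.conf}
  touch "$conf" || return 1
  if grep -Eq '^[[:space:]]*http2[[:space:]]*=' "$conf"; then
    # An http2 setting already exists: rewrite it in place via a temp file.
    tmp=$(mktemp) || return 1
    sed -E 's/^[[:space:]]*http2[[:space:]]*=.*$/http2 = false/' "$conf" > "$tmp" &&
      cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
  fi
  # Make sure the file ends with a newline before appending.
  if [ -n "$(tail -c 1 "$conf")" ]; then
    printf '\n' >> "$conf"
  fi
  printf 'http2 = false\n' >> "$conf"
}

# Example (not executed here, needs root):
#   disable_http2 /etc/nix/nix.conf
```

Running the helper twice leaves a single `http2 = false` line, so it is safe to call from a provisioning script that runs on every boot.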

@flokli
Contributor

flokli commented Nov 22, 2019

I also see this in unattended startup scripts installing nix, here during a nix-channel --update:

Nov 22 09:09:57 buildkite-ubuntu-vjg6 sudo[5725]: buildkite-agent : TTY=unknown ; PWD=/var/lib/buildkite-agent ; USER=root ; ENV=HOME=/root NIX_SSL_CERT_FILE=/nix/var/nix/profiles/default/etc/ssl/certs/ca-bundle.crt ; COMMAND=/nix/store/6chjfy4j6hjwj5f8zcbbdg02i21x1qsi-nix-2.3.1/bin/nix-channel --update nixpkgs
Nov 22 09:09:57 buildkite-ubuntu-vjg6 sudo[5725]: pam_unix(sudo:session): session opened for user root by (uid=0)
Nov 22 09:09:57 buildkite-ubuntu-vjg6 kernel: show_signal_msg: 9 callbacks suppressed
Nov 22 09:09:57 buildkite-ubuntu-vjg6 kernel: nix-channel[5735]: segfault at 8 ip 00007fc6fab6c330 sp 00007fc6dde2c270 error 4 in libc-2.27.so[7fc6fab20000+13d000]
Nov 22 09:09:57 buildkite-ubuntu-vjg6 kernel: Code: 85 db 0f 84 22 01 00 00 8b 01 49 89 fc 49 89 f2 49 89 c9 25 00 80 00 00 75 59 4c 8b 81 88 00 00 00 64 48 8b 2c 25 10 00 00 00 <49> 3b 68 08 74 3e be 01 00 00 00 83 3d 76 74 14 00 00 74 09 f0 41
Nov 22 09:09:57 buildkite-ubuntu-vjg6 sudo[5725]: pam_unix(sudo:session): session closed for user root
Nov 22 09:09:57 buildkite-ubuntu-vjg6 startup-script[2118]: INFO startup-script: /tmp/nix-binary-tarball-unpack.qPzVeaDyPk/unpack/nix-2.3.1-x86_64-linux/install-multi-user: line 216:  5725 Segmentation fault      sudo "$@"

By now I'm fairly convinced this has something to do with how stdout and stderr are connected, and whether they are attached to a terminal.
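
To gather data on the terminal hypothesis, a script could record how its standard streams are connected just before it runs nix-channel. This is only a diagnostic sketch; the `describe_stdio` name is hypothetical and the output format is arbitrary.

```shell
# Hypothetical diagnostic: report whether stdin, stdout and stderr are terminals.
describe_stdio() {
  for fd in 0 1 2; do
    # `[ -t N ]` is true when file descriptor N is attached to a terminal.
    if [ -t "$fd" ]; then
      echo "fd $fd: terminal"
    else
      echo "fd $fd: not a terminal"
    fi
  done
}

# Example (not executed here): log the state, then run the update.
#   describe_stdio >> /tmp/nix-stdio.log; nix-channel --update
```

Comparing the logged state of crashing and non-crashing runs would show whether the crash correlates with non-terminal streams. It would not by itself explain why.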

@danbst
Contributor

danbst commented Dec 14, 2019

I was not able to reproduce the crash with the OP's instructions, but I found that removing ~/.cache/nix makes the problem occur more often:

root@ip-10-0-61-30:~# while nix-channel --update; do rm -rf /root/.cache/nix; done
unpacking channels...
unpacking channels...
unpacking channels...
unpacking channels...
unpacking channels...
unpacking channels...
unpacking channels...
Fatal error: glibc detected an invalid stdio handle
Aborted (core dumped)

I can also confirm the crash is in libcurl. Maybe it's time to upgrade it from 7.64.0 to the latest release (7.67.0)?

@edolstra
Member

This should be fixed in Nix 2.3.3 (which links against curl 7.68.0).
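
Scripts that carry a workaround may want to apply it only to affected versions. The sketch below compares version strings with `sort -V`, which assumes GNU coreutils or a compatible `sort`; the `version_at_least` name is hypothetical, and the way the version is extracted in the commented example assumes that `nix --version` prints the version as its last word.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which is not part of POSIX.
version_at_least() {
  # The smaller of the two versions sorts first; $1 >= $2 exactly when that is $2.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (not executed here; assumes the version is the last word of the output):
#   installed=$(nix --version | awk '{print $NF}')
#   if version_at_least "$installed" 2.3.3; then
#     echo "fix should be included"
#   else
#     echo "workaround still needed"
#   fi
```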

cbley-da added a commit to digital-asset/daml that referenced this issue May 9, 2022
This patch works around an old problem in libcurl, when using HTTP/2.

The problem was fixed with nix version 2.3.3, see NixOS/nix#2733 (comment).