regressions due to patchelf: 0.9 -> 0.10 #69213

Closed
yorickvP opened this issue Sep 21, 2019 · 22 comments
Labels
0.kind: bug (Something is broken); 0.kind: regression (Something that worked before working no longer); 6.topic: nixos (Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS)

Comments

@yorickvP
Contributor

Describe the bug

During stage1:
Starting udev
[ 0.589855] systemd-udevd[114]: Assertion 'resolve_name_timing >= 0 && resolve_name_timing < _RESOLVE_NAME_TIMING_MAX' failed at src/udev/udev-rules.c:1282, function udev_rules_new(). Aborting.

After this, the system is unable to mount any disks.

To Reproduce
Steps to reproduce the behavior:

  1. Build a NixOS system from nixpkgs 41af38f.
  2. Boot it.

Expected behavior
The system boots.

Additional context
This is on a Xen VM with the root pivoted. Config: https://gist.github.com/yorickvP/837cc52589b609803b0d28b3c3369e84

cc: @abbradar

@yorickvP yorickvP added the 0.kind: bug Something is broken label Sep 21, 2019
@veprbl veprbl added the 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS label Sep 21, 2019
@vcunat
Member

vcunat commented Sep 21, 2019

The problem is avoided now on master after f8a8fc6, but we still need to fix it, so I'm leaving this open.

@vcunat vcunat added the 0.kind: regression Something that worked before working no longer label Sep 21, 2019
@vcunat
Member

vcunat commented Sep 22, 2019

I can confirm that the patchelf upgrade (#58715) caused this, along with some other issues (41af38f#commitcomment-35178328), though I still fail to understand why or how.

@vcunat vcunat changed the title from "udev aborts on assertion resolve_name_timing" to "regressions due to patchelf: 0.9 -> 0.10" Sep 22, 2019
vcunat added a commit that referenced this issue Sep 22, 2019
This is a partial revert of #58715.  Bumping the default caused problems
described in #69213.  I tested that the vscode corruption happened even
with the 0.10 pre-release, so I'm keeping patchelfUnstable on 0.10
(patchelfUnstable shouldn't cause a large rebuild anyway)
@vcunat
Member

vcunat commented Sep 22, 2019

vscode: the wrapping is the same, but the binary gets corrupted by patchelf: NixOS/patchelf#170 (comment). I find it likely that both regressions (and probably more) were caused by the bugs discussed there.

For now I downgraded the default patchelf in 4e5b465, so that we can immediately start integrating the rest of staging-next (security fixes, in particular).

@booxter
Contributor

booxter commented Sep 24, 2019

@vcunat thanks for handling the mess. What's the procedure to validate a bump for patchelf before merging it?

@vcunat
Member

vcunat commented Sep 25, 2019

I'm not aware of any such formalities. First we should verify at least some of those known regressions (say, test vscode). Then we might do a run in a separate Hydra jobset, although I didn't discover this regression by looking at the staging-next jobset, and it still got to nixpkgs master.

@vvs-

vvs- commented Nov 17, 2019

Just found this issue. I'm unable to install vscode and vscodium: it fails with "virtual address space underrun!", apparently because of patchelf. I'm on NixOS 19.03 i686, though.

@vcunat
Member

vcunat commented Nov 18, 2019

Hmm, I'm getting a different error

git checkout 3f92c2124a5 # current 19.03
env NIXPKGS_ALLOW_UNFREE=1 nix build -f . pkgsi686Linux.vscode
./result/bin/code
# /nix/store/gkbm49f5a096knqphy1iwim024hzvxrx-vscode-1.35.1/lib/vscode/code: error while loading shared libraries: libnode.so: cannot open shared object file: No such file or directory

19.03 always used patchelf 0.9 by default, so I can't see why it would be affected by the very same issue.

Note: on 19.09 and later i686 isn't supported anymore due to upstream: #64308 (comment) ... and 19.03 isn't really a supported nixpkgs branch anymore.
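
For a "cannot open shared object file" error like the one above, one quick check is to list the libraries that the dynamic loader fails to resolve. The following is only a minimal sketch and not something taken from this thread: the helper name `missing_libs` is made up for illustration, and the `./result/lib/vscode/code` path assumes the `nix build` invocation shown above. `ldd` generally has to match the binary's architecture, so an i686 binary may need a 32-bit `ldd`.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read `ldd`-style output on stdin and
# print the name of every dependency that the loader reported as "not found".
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage (path assumed from the build command above):
#   ldd ./result/lib/vscode/code | missing_libs
```

An empty result only means that every listed dependency resolved; it does not rule out other kinds of corruption in a patched binary.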

@flokli
Contributor

flokli commented Dec 26, 2019

I'd really prefer if patchelf from nixpkgs could point to the latest official release. If vscode and vscodium don't work with the latest patchelf, we could keep a patchelf_0_9 attribute until the bugs are fixed in patchelf.

Are we able to reproduce the udev error with a NixOS VM test, or is it a Xen-specific thing?

@flokli
Contributor

flokli commented Dec 26, 2019

Urgh, turns out patchelf is part of stdenv, so autoPatchelfHook (which is what's used to patchelf vscode and vscodium) will pick whatever is the default there :-/

@domenkozar
Member

@vcunat I've merged a bunch of patchelf patches, including the possible fix. Could you set up staging using patchelfUnstable from recent master?

@vcunat
Member

vcunat commented Jun 3, 2020

I had to hack around some issues in 15bfb6b; we'll probably want a patchelf release/tarball before merging this to nixpkgs master. Jobset is running now: https://hydra.nixos.org/eval/1591215?compare=1591203

@flokli
Contributor

flokli commented Jun 3, 2020

@vcunat nice, let's hope this works out!

IMHO, while we do that, we should also rename patchelf/unstable.nix to patchelf/default.nix and replace all usages of patchelfUnstable with patchelf. These were all hacks that wanted features from patchelf >= 0.10, and we can remove all of them.

@vcunat
Member

vcunat commented Jun 3, 2020

Sure, I didn't want to hassle with such details at this moment.

@flokli
Contributor

flokli commented Jun 3, 2020

sure :-)

@domenkozar
Member

@vcunat I've tried vscode but I get:

$ nix-build -A vscode
error: anonymous function at /home/ielectric/dev/nixpkgs/pkgs/build-support/fetchurl/boot.nix:5:1 called with unexpected argument 'recursiveHash', at /home/ielectric/dev/nixpkgs/pkgs/build-support/fetchzip/default.nix:17:2
(use '--show-trace' to show detailed location information)

@vcunat
Member

vcunat commented Jun 4, 2020

That's an error for almost anything in nixpkgs, and the commit mentioned above fixes it for me.

Still, I was unable to use vscode for testing this, because in the past only the i686 version appeared to be broken, and that platform isn't supported by current vscode. (With some more work, I can imagine applying the patchelf update to the old nixpkgs and trying that.)

EDIT: so I skipped that, thinking that we need to check the whole of nixpkgs anyway.

@domenkozar
Member

I can launch vscode-with-extensions with your additional commit without problems.

@domenkozar
Member

@vcunat It's looking pretty good (6k builds to go).

The only weird failure so far is:

/nix/store/psw68f7ngn728fng2xh9vzsdiri9a8r5-pytest-check-hook/nix-support/setup-hook: line 40:   113 Illegal instruction     /nix/store/2v0r743sczdnagsja18vq89hw4qdqm4d-python3-3.7.7/bin/python3.7 -m pytest -k "not test_load_cuda_params_to_cpu and not test_pickle_load"

@domenkozar
Member

The jobset has only 30 failing builds, and none of them seem related. Seems like we can release 0.11?

@domenkozar
Member

To me it looks like @edolstra can release patchelf 0.11, but I'd invite anyone to test other things that were broken by 0.10.

@domenkozar
Member

0.11 was released, and a PR for staging is at #89927.

@monoidal

Since #89927 was merged, should this be closed?
