pcloud crashes with SIGSEGV on recent nixpkgs unstable #226339

Open
r-vdp opened this issue Apr 15, 2023 · 32 comments

Comments

@r-vdp
Contributor

r-vdp commented Apr 15, 2023

Describe the bug

Running pcloud from nixos-unstable crashes with a segmentation fault:

$ NIXPKGS_ALLOW_UNFREE=1 nix run --impure 'github:nixos/nixpkgs/nixos-unstable#pcloud'
fish: Job 1, 'NIXPKGS_ALLOW_UNFREE=1 nix run …' terminated by signal SIGSEGV (Address boundary error)

The same build used to work before, so I guess there is an ABI incompatibility in one of the runtime dependencies.

@r-vdp r-vdp changed the title Pcloud crashes with SIGSEGV on recent nixpkgs unstable pcloud crashes with SIGSEGV on recent nixpkgs unstable Apr 15, 2023
chenlijun99 added a commit to chenlijun99/dotfiles that referenced this issue Apr 16, 2023
This reverts commit 60b5575.

pCloud doesn't work. See NixOS/nixpkgs#226339
@naturallaw777

Having the same issue with NixOS unstable on my machine.

@juanibiapina
Contributor

@Patryk27 hey, sorry to tag you, but maybe you can help?

@LovingMelody
Contributor

LovingMelody commented May 15, 2023

Does anyone know a version of nixpkgs I can pin where pcloud is still working?

You can try following the revisions in this flake.

@r-vdp
Contributor Author

r-vdp commented May 15, 2023

@LovingMelody
Contributor

@juanibiapina this is what works for me currently: https://git.sr.ht/~r-vdp/nixos-config/tree/main/item/flake.nix#L131

Unable to figure out why your hash is different, it seems to be the same archive in nixpkgs

@Patryk27
Member

> @Patryk27 hey, sorry to tag you, but maybe you can help?

Thanks for tagging - I'll take a look later today or tomorrow!

@Patryk27
Member

Patryk27 commented May 17, 2023

Interestingly, it crashes somewhere in icu-60:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff707aa03 in icu_60::UnicodeString::copyFrom(icu_60::UnicodeString const&, signed char) () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so

(gdb) bt
#0  0x00007ffff707aa03 in icu_60::UnicodeString::copyFrom(icu_60::UnicodeString const&, signed char) () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so
#1  0x00007ffff6ed494e in icu_60::BasicTimeZone::BasicTimeZone(icu_60::UnicodeString const&) () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so
#2  0x00007ffff6f7bf55 in icu_60::SimpleTimeZone::SimpleTimeZone(int, icu_60::UnicodeString const&) () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so
#3  0x00007ffff6f78feb in icu_60::TimeZone::detectHostTimeZone() () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so
#4  0x00007ffff6f7909f in icu_60::TimeZone::createDefault() () from /nix/store/hspijmy7kfrs0f12h7b56f07lnx3522v-pcloud-1.12.0/app/libnode.so
#5  0x00000000030c8ac5 in ?? ()
#6  0x000000000387f733 in ?? ()
#7  0x000000000247bfa6 in ?? ()
#8  0x0000000003880394 in ?? ()
#9  0x0000000004aeae46 in main ()

... but faking a timezone:

(gdb) set env TZ=xd
(gdb) r

... causes it to both print a weird warning message:

[2774010:0517/180552.216458:FATAL:memory_linux.cc(36)] Out of memory.

... and segfault somewhere else - during v8's initialization:

Thread 1 "pcloud" received signal SIGSEGV, Segmentation fault.
0x00007ffff783b072 in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
(gdb) bt
#0  0x00007ffff783b072 in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#1  0x00007ffff783efdf in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#2  0x00007ffff783f03e in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#3  0x00007ffff70d84e7 in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#4  0x00007ffff762842b in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#5  0x00007ffff6e633ef in v8::base::CallOnceImpl(long*, std::__1::function<void ()>) () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#6  0x00007ffff76283b0 in ?? () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#7  0x00007ffff7802419 in v8::V8::Initialize() () from /nix/store/rhlr6x0199x6xj02cywqidp4f8vcq71y-pcloud-1.12.0/app/libnode.so
#8  0x000000000393411b in ?? ()
#9  0x000000000393114b in ?? ()
#10 0x0000000004b418cb in atom::JavascriptEnvironment::Initialize() ()
#11 0x0000000004b41731 in atom::JavascriptEnvironment::JavascriptEnvironment() ()
#12 0x0000000004b3800e in atom::AtomBrowserMainParts::PostEarlyInitialization() ()
#13 0x00000000035a28d7 in ?? ()
#14 0x00000000035d0a9a in ?? ()
#15 0x00000000035a1c7f in ?? ()
#16 0x000000000387fb78 in ?? ()
#17 0x000000000247c414 in ?? ()
#18 0x0000000003880394 in ?? ()
#19 0x0000000004aeae46 in main ()

The weird warning message and the crashes don't happen when I check out an older nixpkgs (e.g. git checkout a575c243c23e2851b78c00e9fa245232926ec32f) and apply the newest pcloud on top of it - in that case everything just works.

This low-key suggests an ABI mismatch on the libnode.so <-> nixpkgs' icu boundary -- maybe we're linking stuff that accidentally produces UnicodeString using e.g. icu-70 and passes instances of that class into libnode.so, not sure yet.

I'm trying to bisect what's changed but it takes a while 😅

@naturallaw777

Thanks so much @Patryk27 for looking into this!

@Patryk27
Member

Okie, it looks like #209870 is somehow at fault here:

$ git checkout fdd49f1bcd8a7f0b5e29f550d698b2abe5c540cd
$ NIXPKGS_ALLOW_UNFREE=1 nix run --impure .#pcloud
# seems to work
$ git checkout 5f57c2e0f97a83bf5691ac3a29da6ef9b44535c4
$ NIXPKGS_ALLOW_UNFREE=1 nix run --impure .#pcloud
# Segmentation fault

I'm not yet sure what's actually causing it, though 🤔

@Patryk27
Member

Patryk27 commented May 20, 2023

More funky stuff: it looks like the first divergence happens on a readlink() somewhere in uprv_tzname_60() -- if I put a breakpoint there, the correct version has:

(gdb) print (char*)$rdi
$1 = 0x7ffff6abb9b6 "/etc/localtime"
(gdb) bt
#0  0x00007ffff4afff49 in readlink () from /nix/store/59nni10pyifs8mdmdxibqqihk87mk0w6-glibc-2.35-224/lib/libc.so.6
#1  0x00007ffff742f564 in uprv_tzname_60 () from /nix/store/qnx9ryj7qdmg5mnb7gb597dqimwh6jpv-pcloud-1.11.0/app/libnode.so
#2  0x00007ffff7378eeb in icu_60::TimeZone::detectHostTimeZone() () from /nix/store/qnx9ryj7qdmg5mnb7gb597dqimwh6jpv-pcloud-1.11.0/app/libnode.so
#3  0x00007ffff737909f in icu_60::TimeZone::createDefault() () from /nix/store/qnx9ryj7qdmg5mnb7gb597dqimwh6jpv-pcloud-1.11.0/app/libnode.so
#4  0x00000000030c8ac5 in ?? ()
#5  0x000000000387f733 in ?? ()
#6  0x000000000247bfa6 in ?? ()
#7  0x0000000003880394 in ?? ()
#8  0x0000000004aeae46 in main ()

... while the invalid version does:

(gdb) print (char*)$rdi
$1 = 0x7ffff66bb9b6 'X' <repeats 200 times>...
(gdb) bt
#0  0x00007ffff46b9f69 in readlink () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/libc.so.6
#1  0x00007ffff702f564 in uprv_tzname_60 () from /nix/store/0x8qmm4klmi2rzfia6ykin5wcn9ai27j-pcloud-1.12.0/app/libnode.so
#2  0x00007ffff6f78eeb in icu_60::TimeZone::detectHostTimeZone() () from /nix/store/0x8qmm4klmi2rzfia6ykin5wcn9ai27j-pcloud-1.12.0/app/libnode.so
#3  0x00007ffff6f7909f in icu_60::TimeZone::createDefault() () from /nix/store/0x8qmm4klmi2rzfia6ykin5wcn9ai27j-pcloud-1.12.0/app/libnode.so
#4  0x00000000030c8ac5 in ?? ()
#5  0x000000000387f733 in ?? ()
#6  0x000000000247bfa6 in ?? ()
#7  0x0000000003880394 in ?? ()
#8  0x0000000004aeae46 in main ()

(note that the two traces above use different glibc versions, but I've just checked with the newer pcloud downgraded to glibc-2.35-224 and the same thing happens)

Edit: what's even weirder is that judging by the source code:

https://github.com/unicode-org/icu/blob/dcae2a648060dce170fc47f37dbe40e1ec9db394/icu4c/source/common/putil.cpp#L1142

... this $rdi should point somewhere into read-only memory (wherever TZDEFAULT lies); it's not dynamically allocated or anything 🤔

Edit 2: yeah, it looks like for some reason the invalid libnode.so gets the address wrong -- the correct code has:

(gdb) find 0x7ffff6a00000, 0x7ffff7f2c000, "/etc/localtime"
0x7ffff6abb9b6
1 pattern found.
(gdb) p/x $rdi
$1 = 0x7ffff6abb9b6 <-- ok, matches the found address

... while the invalid code uses a different address for some reason:

(gdb) find 0x7ffff6400000, 0x7ffff7dbd000, "/etc/localtime"
0x7ffff79fe776
1 pattern found.
(gdb) p/x $rdi
$5 = 0x7ffff64bb9b6 <-- lol what

(the addresses for find were taken from info proc map)

@Patryk27
Member

Patryk27 commented May 21, 2023

Okie, I think I've found the culprit - patchelf generates an invalid libnode.so file, which is probably caused by NixOS/patchelf#482.

Downgrading patchelf doesn't help here though, which suggests that there is another issue in patchelf that has merely been exacerbated by #209870.

No fix just yet, although I'm open to ideas 😅

@StefanSchroeder
Contributor

Could you offer a workaround while this patchelf-error is not fixed? For non-flake users?

@r-vdp
Contributor Author

r-vdp commented Jun 2, 2023

> Could you offer a workaround while this patchelf-error is not fixed? For non-flake users?

@StefanSchroeder you can do essentially the same as I did, but instead of getting the 22.11 nixpkgs through a flake input, you get it with fetchTarball.
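
For readers without flakes, the fetchTarball approach might look roughly like the sketch below. It is not taken from the linked config; the file name `pcloud-pinned.nix` is hypothetical, and the tarball URL assumes the usual GitHub archive link for the `nixos-22.11` branch. For a reproducible build you would pin a specific commit and add a `sha256`.

```nix
# pcloud-pinned.nix (hypothetical file name)
let
  # Assumption: branch tarball of the 22.11 release; pin a commit and a
  # sha256 instead if you need the result to be reproducible.
  pinned = fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz";

  # pcloud is unfree, so the pinned package set has to allow unfree packages.
  oldPkgs = import pinned { config.allowUnfree = true; };
in
# Evaluates to the pcloud derivation from the older package set.
oldPkgs.pcloud
```

Building it with `nix-build pcloud-pinned.nix` should produce a `result` symlink to the pcloud package built against the older dependency set.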

@StefanSchroeder
Contributor

> > Could you offer a workaround while this patchelf-error is not fixed? For non-flake users?
>
> @StefanSchroeder you can do essentially the same as I did, but instead of getting the 22.11 nixpkgs through a flake input, you get it with fetchTarball.

Ummmh, my Nix-fu is not that well developed. Would you guide me through it? I installed 23.05 yesterday. I can examine and patch pcloud/default.nix. But as far as I understand, the error is outside of pcloud. So, where exactly would I put the fetchTarball?

@r-vdp
Contributor Author

r-vdp commented Jun 2, 2023

> > > Could you offer a workaround while this patchelf-error is not fixed? For non-flake users?
> >
> > @StefanSchroeder you can do essentially the same as I did, but instead of getting the 22.11 nixpkgs through a flake input, you get it with fetchTarball.
>
> Ummmh, my Nix-fu is not that well developed. Would you guide me through it? I installed 23.05 yesterday. I can examine and patch pcloud/default.nix. But as far as I understand, the error is outside of pcloud. So, where exactly would I put the fetchTarball?

@StefanSchroeder a basic NixOS module could look something like this: https://gist.github.com/R-VdP/999dd803c96aee0cb6a176eadef0978e

I included the part that updates pcloud to the latest 1.13.0 release.

@StefanSchroeder
Contributor

> @StefanSchroeder a basic NixOS module could look something like this: https://gist.github.com/R-VdP/999dd803c96aee0cb6a176eadef0978e
>
> I included the part that updates pcloud to the latest 1.13.0 release.

When I import that file pcloud.nix from my configuration.nix, I get this error:

error: anonymous function at /etc/nixos/pcloud.nix:1:1 called with unexpected argument 'config'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:519:8:

          518|       # works.
          519|     in f (args // extraArgs);
             |        ^
          520|
(use '--show-trace' to show detailed location information)
building Nix...
error: anonymous function at /etc/nixos/pcloud.nix:1:1 called with unexpected argument 'config'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:519:8:

          518|       # works.
          519|     in f (args // extraArgs);
             |        ^
          520|
(use '--show-trace' to show detailed location information)
building the system configuration...
error: anonymous function at /etc/nixos/pcloud.nix:1:1 called with unexpected argument 'config'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:519:8:

          518|       # works.
          519|     in f (args // extraArgs);
             |        ^
          520|
(use '--show-trace' to show detailed location information)

@Patryk27
Member

Patryk27 commented Jun 3, 2023

If you're doing:

imports = [
    ./that-file.nix
];

... then you'll have to change config.environment.systemPackages into just environment.systemPackages and, possibly, change { pkgs, lib }: into { pkgs, lib, ... }: - then the expression should work 🙂
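
To illustrate, a module that can be listed in `imports` might be shaped like the sketch below. This is not the content of the gist; the `nixos-22.11` tarball URL and the `oldPkgs` name are assumptions for illustration. The key point is the `...` in the argument list: the module system calls the function with more arguments than `pkgs` and `lib` (such as `config`), and a function without `...` rejects them with the "called with unexpected argument" error shown above.

```nix
# /etc/nixos/pcloud.nix
{ lib, pkgs, ... }: # `...` accepts the extra arguments (config, options, ...)
let
  # Assumption: older package set fetched from the 22.11 branch tarball.
  oldPkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {
    inherit (pkgs.stdenv.hostPlatform) system;
    config.allowUnfree = true; # pcloud is unfree
  };
in
{
  # Options are set directly at the top level of the module,
  # without a `config.` prefix.
  environment.systemPackages = [ oldPkgs.pcloud ];
}
```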

@StefanSchroeder
Contributor

That worked!

@r-vdp
Contributor Author

r-vdp commented Jun 3, 2023

Ah yeah, that should have been { lib, pkgs, ... } indeed.

@naturallaw777

Any news on when this will be merged? Thanks everyone!

@Patryk27
Member

No idea, unfortunately - it looks like patchelf has an unresolved bug, so it's not just a matter of waiting for something to get merged somewhere, but rather waiting for someone to fix that bug; I don't have enough time on my hands to take a stab at it, though 👀

@naturallaw777

Thanks for the response and all the work that you contributed! Right now my skill set is not very strong in this area; I wish I could help more. Maybe in the future as I learn more.

@r-vdp
Contributor Author

r-vdp commented Jun 25, 2023

@Patryk27: I had a quick look today, and I noticed that pcloud uses autoPatchelfHook which, unless I misread something, doesn't actually use https://github.com/NixOS/patchelf but instead uses https://github.com/eliben/pyelftools.
I wasn't aware that we had multiple implementations of patchelf in NixOS.

So I guess we might need to report the issue in that repo.

It could also be that actually using the patchelf tool would not cause the corrupt files, but I didn't have the time to try this yet.

@Patryk27
Member

Patryk27 commented Jul 2, 2023

fwiw, @r-vdp, it looks like that hook uses pyelftools to scan the binary files, but all the patching ultimately ends up happening through patchelf - e.g.:

This also matches my observations in NixOS/patchelf#482 (comment) where changing the version of patchelf generates files that are broken in different ways, which suggests that something might be going on there 👀

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/pcloud-gives-segmentation-fault/31330/2

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-do-i-use-specific-nixpkgs-versions-in-overlays-using-a-flake-config/33064/1

@r-vdp r-vdp mentioned this issue Oct 18, 2023
12 tasks
@Patryk27
Member

Patryk27 commented Feb 4, 2024

Status: I've gotten back to fixing this bug - currently I've narrowed the issue down to this place:

https://github.com/NixOS/patchelf/blob/7c2f768bf9601268a4e71c2ebe91e2011918a70f/src/patchelf.cc#L1294

... which doesn't implement support for rewriting values other than STT_SECTION - in the case of libnode.so (which is the reason pcloud doesn't start) the issue is that extending its RPATH causes the .rodata section to get relocated without updating other references that point into this section (or at least that's my hunch and that's what inspecting with eu-elflint suggests).

I think this wasn't a problem before #209870 got merged, because that change made autoPatchelfHook notice that libnode.so depends on libgcc_s.so.1, so now it tries to include /nix/store/blahblah-libgcc/lib in libnode.so's RPATH, triggering the unimplemented patching code in patchelf (but that change itself is alright, it's patchelf which is the culprit here).

tl;dr when building pcloud, patchelf modifies pcloud's libraries (shuffling stuff around in memory), but then forgets to update other parts of the binary, which continue to refer to the old memory addresses instead of the new ones - this causes a SIGSEGV when trying to start the app

I'll report back once I have more information.

Related:

Progress

Patch is ready!
NixOS/patchelf#544

@philg-dev

philg-dev commented Apr 30, 2024

> @StefanSchroeder a basic NixOS module could look something like this: https://gist.github.com/R-VdP/999dd803c96aee0cb6a176eadef0978e
>
> I included the part that updates pcloud to the latest 1.13.0 release.

Thanks a lot for providing this workaround.

Just in case anybody else still needs to use this workaround...
I've updated the pCloud version used by this workaround to the current 1.14.5 from the pCloud website as follows:

      version = "1.14.5";
      code = "XZ0AMJ0ZdrENNeVMNI4Tz3lO1nxr577ryOMV";
      # Archive link's codes: https://www.pcloud.com/release-notes/linux.html
      src = pkgs.fetchzip {
        url = "https://api.pcloud.com/getpubzip?code=${code}&filename=${prev.pname}-${version}.zip";
        hash = "sha256-a577iWPrke3EizG03m0+hjSoPzA4wDai/QMX2Zl7MF0=";
      };
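
For context, the fragment above refers to `prev.pname`, which suggests it sits inside an `overrideAttrs` callback. A self-contained sketch of how it could fit together follows; the `oldPkgs` argument (a package set from an older nixpkgs in which pcloud still starts) and the function wrapper are assumptions for illustration, not part of the original workaround.

```nix
# Assumption: `pkgs` is the current package set, `oldPkgs` an older one
# in which pcloud still starts.
{ pkgs, oldPkgs }:
oldPkgs.pcloud.overrideAttrs (prev:
  let
    version = "1.14.5";
    code = "XZ0AMJ0ZdrENNeVMNI4Tz3lO1nxr577ryOMV";
  in
  {
    inherit version;
    # Archive link's codes: https://www.pcloud.com/release-notes/linux.html
    src = pkgs.fetchzip {
      url = "https://api.pcloud.com/getpubzip?code=${code}&filename=${prev.pname}-${version}.zip";
      hash = "sha256-a577iWPrke3EizG03m0+hjSoPzA4wDai/QMX2Zl7MF0=";
    };
  })
```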

@StefanSchroeder
Contributor

Even though the title of this issue no longer applies, because I am reporting for 24.05-stable here,
pcloud still continues to segfault with version 1.14.5. Tried with the recently released stable version.

@VanDoge

VanDoge commented Jun 6, 2024

> Even though the title of this issue no longer applies, because I am reporting for 24.05-stable here, pcloud still continues to segfault with version 1.14.5. Tried with the recently released stable version.

can confirm issue persists on 24.05

@StefanSchroeder
Contributor

@Patryk27 you had done some work on this. It seems that NixOS/patchelf#544 might be the solution.
It has been open for a while and has seen some review and improvements. Is there any way to push this forward?

@Patryk27
Member

Patryk27 commented Jun 7, 2024

So, the remaining issue is a couple of tests failing on less popular architectures - if you scroll down to the bottom of the pull request, right after Some checks were not successful, you'll see that the tests succeed on x86, but fail for ppc64le, arm64v8 etc - it's possibly related to me misunderstanding some piece of logic:

NixOS/patchelf#544 (comment)

I haven't had much time to re-investigate it, but I do have a couple of spare hours this weekend, so I'll try giving it another shot.

That being said, you can still apply the patch on your system in the meantime, you don't have to wait until it's merged.
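
As a rough sketch of what applying the patch locally could look like: the expression below builds a patchelf with the pull request's diff added. The patch URL assumes GitHub's usual `.patch` view of a pull request, the hash is a deliberate placeholder that Nix will reject while reporting the real value, and the patch may not apply cleanly to whichever patchelf version your nixpkgs ships. Wiring the result into pcloud's build is not shown here.

```nix
{ pkgs ? import <nixpkgs> { } }:
pkgs.patchelf.overrideAttrs (prev: {
  patches = (prev.patches or [ ]) ++ [
    (pkgs.fetchpatch {
      # Assumption: GitHub serves the pull request as a patch at this URL.
      url = "https://github.com/NixOS/patchelf/pull/544.patch";
      # Placeholder; replace with the hash Nix prints on the first build.
      hash = pkgs.lib.fakeHash;
    })
  ];
})
```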

9 participants