Segfault on Alpine Linux 3.13 / musl 1.2.2 with prebuilt binaries #2570

Closed
cnorthwood opened this issue Feb 9, 2021 · 9 comments

Comments

@cnorthwood

Are you using the latest version? Is the version currently in use as reported by npm ls sharp the same as the latest version as reported by npm view sharp dist-tags.latest?

Yes

What are the steps to reproduce?

On the alpine:3.13 Docker image, install Sharp and attempt to load an image and run .metadata() on it. Node exits with a segmentation fault.

Running it with gdb attached, I get the following backtrace:

Thread 8 "node" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 212]
0x00007f32ddaedc59 in ?? () from /lib/ld-musl-x86_64.so.1
(gdb) bt
#0  0x00007f32ddaedc59 in ?? () from /lib/ld-musl-x86_64.so.1
#1  0x000000007e000000 in ?? ()
#2  0x00007f32d8abc900 in ?? ()
#3  0x00007f32d6bed610 in ?? ()
#4  0x00007f32daaf2820 in ?? ()
#5  0x00007f32daaf38d0 in ?? ()
#6  0x00007f32d89ddbba in vips::VImage::call_option_string(char const*, char const*, vips::VOption*) ()
   from /app/node_modules/sharp/build/Release/../../vendor/8.10.5/lib/libvips-cpp.so.42
#7  0x00007f32d89f9ece in vips::VImage::new_from_file(char const*, vips::VOption*) ()
   from /app/node_modules/sharp/build/Release/../../vendor/8.10.5/lib/libvips-cpp.so.42
#8  0x00007f32d8a26f38 in sharp::OpenInput(sharp::InputDescriptor*) () from /app/node_modules/sharp/build/Release/sharp.node
#9  0x00007f32d8a2f2dc in MetadataWorker::Execute() () from /app/node_modules/sharp/build/Release/sharp.node
#10 0x00007f32d8a2bb94 in Napi::AsyncWorker::OnAsyncWorkExecute(napi_env__*, void*) () from /app/node_modules/sharp/build/Release/sharp.node
#11 0x000055d831f6c17e in ?? ()
#12 0x00007f32ddb1d160 in ?? () from /lib/ld-musl-x86_64.so.1
#13 0x0000000000000000 in ?? ()

What is the expected behaviour?

For it not to segfault

Are you able to provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem?

docker run --rm -it alpine:3.13
apk add nodejs npm
npm install sharp
(outside the container) docker cp image.jpg <container ID>:/image.jpg
node -e 'new require("sharp")("image.jpg").metadata()'

Are you able to provide a sample image that helps explain the problem?

I don't think this is related to the image, I've tried with several.

What is the output of running npx envinfo --binaries --system?

/ # npx envinfo --binaries --system
npx: installed 1 in 0.988s

  System:
    OS: Linux 4.19 Alpine Linux
    CPU: (6) x64 Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
    Memory: 11.66 GB / 15.64 GB
    Container: Yes
    Shell: 1.32.1 - /bin/ash
  Binaries:
    Node: 14.15.4 - /usr/bin/node
    npm: 6.14.10 - /usr/bin/npm
@cnorthwood
Author

Installing vips-dev and forcing Sharp to build from source seems to be a workaround for this.

@lovell
Owner

lovell commented Feb 9, 2021

Hello, thank you for the clear details in this report, I can reproduce this locally.

Interestingly, using the alpine:3.12 image does not fail, nor do the node:*-alpine images (based on 3.11), which is what the prebuilt binaries are compiled on.

This smells a bit like an ABI change somewhere between Alpine 3.12 / musl 1.1.24 and Alpine 3.13 / musl 1.2.2.
https://abi-laboratory.pro/index.php?view=compat_report&l=musl&v1=1.2.0&v2=1.2.1&obj=66fd5&kind=abi

It's possible that the change in lovell/sharp-libvips#81 will help, but otherwise this could be a case of having to (pre)build for musl 1.1.x and musl 1.2.x separately.

Initially we can prevent the use of the current prebuilt binaries for musl >=1.2.0 approximately here - https://github.com/lovell/sharp/blob/master/install/libvips.js#L84

@lovell changed the title from "Segfault on Alpine Linux" to "Segfault on Alpine Linux 3.13 / musl 1.2.2 with prebuilt binaries" on Feb 9, 2021
@lovell
Owner

lovell commented Feb 9, 2021

The time64 change (https://musl.libc.org/time64.html), introduced in musl 1.2.0 to deal with the "epochalypse" (the year 2038 problem), is a likely candidate for ABI mismatches.
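
For readers unfamiliar with the year 2038 problem, here is a minimal JavaScript sketch that emulates a seconds-since-epoch counter stored in a signed 32-bit integer. The helper names (`toTime32`, `isoFromSeconds`) are made up for this illustration and are not taken from musl or sharp. As far as I understand the linked page, time64 widens `time_t` on 32-bit architectures only, so treat this as background rather than a diagnosis for the x86_64 crash reported here.

```javascript
// Largest value a signed 32-bit time_t can hold (seconds since the Unix epoch).
const INT32_MAX = 2 ** 31 - 1;

// Emulate storing a seconds count in a signed 32-bit integer.
// The bitwise OR forces JavaScript's ToInt32 conversion, which wraps on overflow.
function toTime32(seconds) {
  return seconds | 0;
}

// Render a seconds-since-epoch value as an ISO 8601 UTC timestamp.
function isoFromSeconds(seconds) {
  return new Date(seconds * 1000).toISOString();
}

const lastSecond = isoFromSeconds(toTime32(INT32_MAX));
const wrapped = isoFromSeconds(toTime32(INT32_MAX + 1));

console.log(lastSecond); // 2038-01-19T03:14:07.000Z
console.log(wrapped);    // 1901-12-13T20:45:52.000Z
```

One second past the 32-bit limit, the counter wraps to a date in 1901, which is why a C library would want a 64-bit `time_t` and why changing the width of such a basic type can alter struct layouts and function signatures.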

@davidwindell

Just to say, we're experiencing this also on Alpine 3.13. It took a while to track it down and find this issue because sharp is used by our vendor (Vue Storefront) and all we knew was image generation was causing a segfault.

@lovell
Owner

lovell commented Feb 20, 2021

Commit 9f2f920 adds an install-time check that prevents prebuilt binaries from being used with musl >= v1.2.0. This will be part of sharp v0.27.2.
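
A rough sketch of what such an install-time gate could look like, in plain JavaScript with no dependencies. This is not the code from commit 9f2f920: the function names (`parseMuslVersion`, `compareVersions`, `canUsePrebuilt`) are hypothetical, and the sample text is only an assumption about what the musl dynamic loader prints when run without arguments. Refusing prebuilt binaries when the musl version cannot be determined is likewise a choice made for this sketch.

```javascript
// Extract [major, minor, patch] from text such as "Version 1.2.2".
// Returns null when no version can be found.
function parseMuslVersion(loaderOutput) {
  const match = /Version (\d+)\.(\d+)\.(\d+)/.exec(loaderOutput);
  return match ? match.slice(1, 4).map(Number) : null;
}

// Compare two [major, minor, patch] triples: -1, 0 or 1.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Decide whether prebuilt binaries may be used.
// Only musl >= 1.2.0 is rejected; an unknown musl version is rejected too,
// on the assumption that falling back to a source build is the safer option.
function canUsePrebuilt(libcFamily, version) {
  if (libcFamily !== 'musl') return true;
  if (version === null) return false;
  return compareVersions(version, [1, 2, 0]) < 0;
}

// Assumed sample of loader output, for illustration only.
const sample = 'musl libc (x86_64)\nVersion 1.2.2\nDynamic Program Loader\n';
const allowed = canUsePrebuilt('musl', parseMuslVersion(sample));
console.log(allowed ? 'use prebuilt binaries' : 'build from source instead');
```

With the sample above the gate rejects the prebuilt binaries, whereas a musl 1.1.24 system (Alpine 3.12) or a glibc system would pass.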

For sharp v0.28.0+, if lovell/sharp-libvips#81 doesn't fix this, we can consider separate prebuilt binaries for musl 1.1.x vs 1.2.x.

@lovell added the enhancement label and removed the triage label on Feb 20, 2021
@lovell added this to the v0.28.0 milestone on Feb 22, 2021
@lovell
Owner

lovell commented Mar 7, 2021

I've made a little progress on this.

It looks like even libvips binaries built on Alpine 3.13 fail in the same manner, which is unexpected.

The following two commands are very similar, and when run with sharp v0.27.1 on Alpine 3.13 (as this issue was reported against) one works but the other segfaults:

$ node -e "require('sharp')({create:{width:1,height:1,channels:3,background:'red'}}).sharpen().toBuffer().then(console.log)"
<Buffer ff 00 00>
$ node -e "require('sharp')({create:{width:1,height:1,channels:3,background:'red'}}).sharpen(1).toBuffer().then(console.log)"
Segmentation fault (core dumped)

The difference is:

sharp/src/operations.cc

Lines 194 to 199 in dcf913c

VImage sharpen = VImage::new_matrixv(3, 3,
  -1.0, -1.0, -1.0,
  -1.0, 32.0, -1.0,
  -1.0, -1.0, -1.0);
sharpen.set("scale", 24.0);
return image.conv(sharpen);

vs:

sharp/src/operations.cc

Lines 206 to 207 in dcf913c

return image.sharpen(
  VImage::option()->set("sigma", sigma)->set("m1", flat)->set("m2", jagged))

VImage::option() is a bit unusual in that it is created in sharp but deleted in libvips, which has caused problems with the memory allocator on Windows before.

musl 1.2.1 introduced a new "mallocng" memory allocator, so that's the next thing to investigate.
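
As an aside on the first (non-crashing) code path quoted above: the kernel entries sum to 24, so dividing by the scale of 24 leaves flat areas unchanged, which is consistent with the `<Buffer ff 00 00>` output for a 1x1 red image. Below is a small JavaScript sketch of that convolution on a single channel. The kernel and scale are copied from the quoted excerpt; the function name `convolve3x3`, the edge handling (repeating the nearest pixel) and the rounding and clamping to 0..255 are assumptions for illustration and may not match what libvips does internally.

```javascript
// 3x3 kernel and scale copied from the quoted operations.cc excerpt.
const KERNEL = [
  [-1, -1, -1],
  [-1, 32, -1],
  [-1, -1, -1],
];
const SCALE = 24;

// Limit v to the range 0..max.
const clamp = (v, max) => Math.min(Math.max(v, 0), max);

// Convolve one channel, given as a 2D array of 8-bit values.
// Out-of-bounds samples repeat the nearest edge pixel (assumption).
function convolve3x3(channel) {
  const height = channel.length;
  const width = channel[0].length;
  return channel.map((row, y) =>
    row.map((_, x) => {
      let sum = 0;
      for (let ky = -1; ky <= 1; ky++) {
        for (let kx = -1; kx <= 1; kx++) {
          const sample = channel[clamp(y + ky, height - 1)][clamp(x + kx, width - 1)];
          sum += KERNEL[ky + 1][kx + 1] * sample;
        }
      }
      // Divide by the scale, then round and clamp to 8 bits (assumption).
      return clamp(Math.round(sum / SCALE), 255);
    })
  );
}

// The 1x1 red image from the commands above, one channel at a time.
const red = convolve3x3([[255]]);
const green = convolve3x3([[0]]);
console.log(red, green); // [ [ 255 ] ] [ [ 0 ] ]
```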

@lovell
Owner

lovell commented Mar 8, 2021

Found it. The segfault occurs when delete is called here https://github.com/libvips/libvips/blob/master/cplusplus/VImage.cpp#L113-L119 on a std::list, which immediately made me suspect https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html

The prebuilt sharp binary is built with C++11 "old" ABI but it looks like the libvips binaries for musl-based Linux were built with C++11 "new" ABI. This hasn't been a problem with the "oldmalloc" allocator, but it is incorrect behaviour between sharp and libvips, and musl's new "mallocng" allocator has exposed this.

I'm going to switch all prebuilt Linux binaries to use "new" ABI for the next release, so the forthcoming libvips v8.10.6 and sharp v0.28.0 binaries will be aligned for both glibc and musl Linux.

lovell/sharp-libvips@07de78c

@lovell
Owner

lovell commented Mar 8, 2021

Commit 885e959 adds Alpine 3.13 / Node.js 14 to the x64 CI matrix and loosens the semver check. This will be in v0.28.0.

@lovell
Owner

lovell commented Mar 29, 2021

v0.28.0 now available with prebuilt binaries that will work on musl v1.2.x (as well as musl v1.1.24+).
