Failed to load shared library #17601

Closed
gl84 opened this issue Aug 8, 2022 · 12 comments · Fixed by #17628

Comments

@gl84
Contributor

gl84 commented Aug 8, 2022

Please include the following in your bug report:

I recently updated my emscripten version and now get: "Error in loading dynamic library libXXX.wasm: RuntimeError: null function or function signature mismatch".

Bisecting found that since llvm-roll https://chromium.googlesource.com/emscripten-releases/+/c3f55f5c179c073c569684a0640727edb7c5d10e the loading is broken (somewhere between 3.1.14 and 3.1.15).

@sbc100: Do you have an idea if this could be related somehow to not applying data relocations at static constructor time?

Version of emscripten/emsdk:
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.15-git (ae675c6)
clang version 15.0.0 (https://github.com/llvm/llvm-project 9f94d63a6a7e00a792fb0e05dbb5ad08313ec9bc)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /build/emsdk/upstream/bin

Full link command and output with -v appended:
em++ -O3 -o libXXX.wasm static_library1.a static_library2.a -fexceptions -fPIC -s SIDE_MODULE=2 -s DISABLE_EXCEPTION_CATCHING=0 -s "EXPORTED_FUNCTIONS=['_custom_function1','_custom_function1']"

Any help would be appreciated! Thanks in advance!

@sbc100
Collaborator

sbc100 commented Aug 8, 2022

Can you confirm it's still broken with latest?

It's hard to tell what it might be without more info. Does the error come with a stack trace, perhaps? Does it reproduce in an -O0 build?

@gl84
Contributor Author

gl84 commented Aug 8, 2022

Thanks for your suggestions! They helped me find out that my main program is missing some exports that my plugin needs.

Do you know a clean and automatic way to ensure all exports from the main program are created without recording the plugin in the dylink section?

@sbc100
Collaborator

sbc100 commented Aug 8, 2022

Yes, you can pass the library when linking the main program and then add -sNO_AUTOLOAD_DYLIBS.

@gl84
Contributor Author

gl84 commented Aug 9, 2022

Ah nice!

It seems to be a global flag, which means there is no grouping. Nevertheless, it's much easier to keep track of library deps than of the functions that need to be present.

Thanks!

@gl84 gl84 closed this as completed Aug 9, 2022
@gl84 gl84 reopened this Aug 9, 2022
@gl84
Contributor Author

gl84 commented Aug 9, 2022

Unfortunately the missing exports had nothing to do with my problem (this only showed up due to missing DCE with -O0).

I debugged a bit further and the call stack is:
.....
postInstantiation
__wasm_call_ctors
....
invoke_vii

In the invoke_vii we get a valid wasm table entry. The call getWasmTableEntry(index)(a1,a2); leads immediately to a function signature mismatch.

From the stacktrace it seems that __dynamic_cast produces this function signature mismatch...
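
For readers unfamiliar with the invoke_vii wrapper mentioned above, here is a minimal sketch of how such a wrapper forwards a call through the table and why a trap raised anywhere below it surfaces to the caller. It is modeled on the wrappers Emscripten generates, but every helper here (wasmTable, stackSave, stackRestore, setThrew, getWasmTableEntry) is a hypothetical stand-in, and the trap is simulated with a plain Error:

```javascript
// Hypothetical stand-ins for the runtime pieces the real wrapper relies on.
const wasmTable = [];
let stackPointer = 0;
let threw = 0;
const stackSave = () => stackPointer;
const stackRestore = (sp) => { stackPointer = sp; };
const setThrew = (flag) => { threw = flag; };
const getWasmTableEntry = (index) => wasmTable[index];

// Sketch of an invoke wrapper for a void function taking two ints.
function invoke_vii(index, a1, a2) {
  const sp = stackSave();
  try {
    getWasmTableEntry(index)(a1, a2);
  } catch (e) {
    stackRestore(sp);
    // Only numeric values (C++ exception pointers) are swallowed here;
    // anything else, such as a trap, is rethrown to the caller.
    if (e !== e + 0) throw e;
    setThrew(1);
  }
}

// Entry 1 simulates a function that traps somewhere further down the call chain.
wasmTable[1] = () => {
  throw new Error('null function or function signature mismatch');
};
// Entry 2 simulates a function that throws a C++ exception (a numeric pointer).
wasmTable[2] = () => { throw 1234; };

let surfaced = null;
try {
  invoke_vii(1, 0, 0);
} catch (e) {
  surfaced = e.message;
}
invoke_vii(2, 0, 0);
console.log(surfaced, threw);
```

The table entry itself can be perfectly valid, as observed above. In this sketch the error only has to be thrown by something the entry calls for it to propagate out through the wrapper.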

@sbc100
Collaborator

sbc100 commented Aug 9, 2022

Can you reproduce it at -O2 with assertions enabled? If possible, it would be great to reduce this down to a small test case that shows what the issue is.

How are you loading your library? dlopen? Is this happening during the dlopen call?

@sbc100
Collaborator

sbc100 commented Aug 9, 2022

Does it happen with -sSIDE_MODULE=1 too?

@gl84
Contributor Author

gl84 commented Aug 9, 2022

I can reproduce it on the latest tot build with -sSIDE_MODULE=1 -O2 and with -sASSERTIONS=2 enabled when I'm opening the library with dlopen.

I'll try to create a reduced test case.

@gl84
Contributor Author

gl84 commented Aug 9, 2022

While trying to create the small test case I encountered a wasm-ld crash with the following stack trace:

/build/emsdk/upstream/emscripten/em++ -O2 -g -o /build/libXXX.wasm /build/XXX/wasm_32/install/lib/libXXX.a -fexceptions -s SIDE_MODULE=1 -s DISABLE_EXCEPTION_CATCHING=0
wasm-ld: /b/s/w/ir/cache/builder/emscripten-releases/llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From *) [To = lld::wasm::DefinedFunction, From = const lld::wasm::FunctionSymbol]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /build/emsdk/upstream/bin/wasm-ld -o /build/libXXX.wasm --whole-archive /build/XXX/wasm_32/install/lib/libXXX.a -L/build/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/pic --no-whole-archive -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --import-memory --export-dynamic --export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=__main_argc_argv --export-if-defined=__wasm_apply_data_relocs --export-if-defined=setThrew --export-if-defined=stackSave --export-if-defined=stackRestore --export-if-defined=stackAlloc --export-if-defined=__wasm_call_ctors --export-if-defined=__errno_location --export-if-defined=malloc --export-if-defined=free --export-if-defined=__cxa_is_pointer_type --export-if-defined=__cxa_can_catch --experimental-pic -shared
#0 0x00007f95c719bd03 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/build/emsdk/upstream/bin/../lib/libLLVM-16git.so+0x1b54d03)
#1 0x00007f95c7199b0e llvm::sys::RunSignalHandlers() (/build/emsdk/upstream/bin/../lib/libLLVM-16git.so+0x1b52b0e)
#2 0x00007f95c719c1cf SignalHandler(int) Signals.cpp:0:0
#3 0x00007f95c501a520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x42520)
#4 0x00007f95c506ea7c pthread_kill (/usr/lib/x86_64-linux-gnu/libc.so.6+0x96a7c)
#5 0x00007f95c501a476 gsignal (/usr/lib/x86_64-linux-gnu/libc.so.6+0x42476)
#6 0x00007f95c50007f3 abort (/usr/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
#7 0x00007f95c500071b (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
#8 0x00007f95c5011e96 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#9 0x000055a0b7b8f6cf lld::wasm::FunctionSymbol::getFunctionIndex() const (/build/emsdk/upstream/bin/wasm-ld+0x4b36cf)
#10 0x000055a0b7ba8909 lld::wasm::ElemSection::writeBody() (/build/emsdk/upstream/bin/wasm-ld+0x4cc909)
#11 0x000055a0b7b9df4a lld::wasm::SyntheticSection::finalizeContents() Writer.cpp:0:0
#12 0x000055a0b7b99f61 lld::wasm::(anonymous namespace)::Writer::run() Writer.cpp:0:0
#13 0x000055a0b7b90e06 lld::wasm::writeResult() (/build/emsdk/upstream/bin/wasm-ld+0x4b4e06)
#14 0x000055a0b7b71fa9 lld::wasm::(anonymous namespace)::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) Driver.cpp:0:0
#15 0x000055a0b7b6cbb9 lld::wasm::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/build/emsdk/upstream/bin/wasm-ld+0x490bb9)
#16 0x000055a0b787d686 lldMain(int, char const**, llvm::raw_ostream&, llvm::raw_ostream&, bool) lld.cpp:0:0
#17 0x000055a0b787ce8a main (/build/emsdk/upstream/bin/wasm-ld+0x1a0e8a)
#18 0x00007f95c5001d90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#19 0x00007f95c5001e40 __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#20 0x000055a0b787c99a _start (/build/emsdk/upstream/bin/wasm-ld+0x1a099a)
em++: error: '/build/emsdk/upstream/bin/wasm-ld -o /build/libXXX.wasm --whole-archive /build/XXX/wasm_32/install/lib/libXXX.a -L/build/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/pic --no-whole-archive -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --import-memory --export-dynamic --export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=__main_argc_argv --export-if-defined=__wasm_apply_data_relocs --export-if-defined=setThrew --export-if-defined=stackSave --export-if-defined=stackRestore --export-if-defined=stackAlloc --export-if-defined=__wasm_call_ctors --export-if-defined=__errno_location --export-if-defined=malloc --export-if-defined=free --export-if-defined=__cxa_is_pointer_type --export-if-defined=__cxa_can_catch --experimental-pic -shared' failed (received SIGABRT (-6))

Which means I'm kind of stuck as I've no idea if this is somehow related or is an independent wasm-ld bug.

@gl84
Contributor Author

gl84 commented Aug 10, 2022

@sbc100: Thanks for your help.

It seems that when dlopening a shared library __wasm_call_ctors is called without a preceding call to __wasm_apply_data_relocs:

var init = moduleExports['__wasm_call_ctors'];
if (init) {
  if (runtimeInitialized) {
    init();
  } else {
    // we aren't ready to run compiled code yet
    __ATINIT__.push(init);
  }
}
var applyRelocs = moduleExports['__wasm_apply_data_relocs'];
if (applyRelocs) {
  if (runtimeInitialized) {
    applyRelocs();
  } else {
    __RELOC_FUNCS__.push(applyRelocs);
  }
}

Swapping and calling __wasm_apply_data_relocs before __wasm_call_ctors fixes the crash. Nevertheless I'm not sure if this is the right fix...
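
A minimal sketch of the swapped order described above, run against stub exports instead of a real side module. The helper name initSideModule, its parameters, and the fakeExports object are hypothetical and exist only for illustration. This is not the actual patch:

```javascript
// Hypothetical helper: runs (or queues) a side module's startup hooks so that
// data relocations are always handled before static constructors.
function initSideModule(moduleExports, runtimeInitialized, relocQueue, initQueue) {
  const applyRelocs = moduleExports['__wasm_apply_data_relocs'];
  const init = moduleExports['__wasm_call_ctors'];

  // Relocations first: constructors may read relocated data such as vtables.
  if (applyRelocs) {
    if (runtimeInitialized) {
      applyRelocs();
    } else {
      relocQueue.push(applyRelocs);
    }
  }

  // Static constructors second.
  if (init) {
    if (runtimeInitialized) {
      init();
    } else {
      // Not ready to run compiled code yet, so defer.
      initQueue.push(init);
    }
  }
}

// Stub exports that only record the order in which they are called.
const calls = [];
const fakeExports = {
  __wasm_call_ctors: () => calls.push('ctors'),
  __wasm_apply_data_relocs: () => calls.push('relocs'),
};

// Simulate a dlopen that happens after the runtime is initialized.
initSideModule(fakeExports, true, [], []);
console.log(calls.join(' -> ')); // prints "relocs -> ctors"
```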

@ryanking13
Contributor

ryanking13 commented Aug 11, 2022

Hi, the Pyodide team is also hitting this error, and retrying __wasm_call_ctors after __wasm_apply_data_relocs fixes the crash. We are also not sure whether that is a proper fix, though.

@sbc100
Collaborator

sbc100 commented Aug 11, 2022

Indeed, that certainly looks like the wrong order to me. I wonder how that isn't covered by our tests... I guess that more of our tests are using load-time dynamic linking.

Notice that when libraries are loaded before the runtime is initialized, they are run in the other (correct) order:

emscripten/src/preamble.js

Lines 266 to 270 in 2b2e142

#if RELOCATABLE
callRuntimeCallbacks(__RELOC_FUNCS__);
#endif
<<< ATINITS >>>
callRuntimeCallbacks(__ATINIT__);
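
To make the startup ordering concrete, here is a small simulation of the two queues being drained in sequence. The callRuntimeCallbacks below is a simplified stand-in written for this sketch (an assumption, not the runtime's implementation), and relocFuncs and atInit stand in for __RELOC_FUNCS__ and __ATINIT__:

```javascript
// Simplified stand-in for the runtime's callback runner: drain the queue in order.
function callRuntimeCallbacks(callbacks) {
  while (callbacks.length > 0) {
    callbacks.shift()();
  }
}

const startupLog = [];

// Queues filled while libraries are loaded before the runtime is initialized.
// Constructors are queued first here to show that queue order, not push order,
// decides what runs first.
const atInit = [() => startupLog.push('ctors:libA')];
const relocFuncs = [() => startupLog.push('relocs:libA')];

// Hypothetical runtime init: relocation callbacks always run before ATINIT callbacks.
function initRuntimeSketch() {
  callRuntimeCallbacks(relocFuncs);
  callRuntimeCallbacks(atInit);
}

initRuntimeSketch();
console.log(startupLog.join(' -> ')); // prints "relocs:libA -> ctors:libA"
```

Because each kind of callback lives in its own queue, the load-time path gets the correct order regardless of the order in which the callbacks were pushed.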

sbc100 added a commit that referenced this issue Aug 11, 2022
This bug applied to libraries loaded after the runtime initialized.

Fixes: #17601
sbc100 added a commit that referenced this issue Aug 12, 2022
This bug applied to libraries loaded after the runtime initialized.

Fixes: #17601