Teach dynamic linking to handle library -> library dependencies
Currently Emscripten can create shared libraries (DSOs) and link them to the main module. However, a shared library itself cannot be linked to another shared library.

The lack of support for DSO -> DSO linking becomes problematic when several shared libraries all need to use some other functionality that should itself be shared, while linking that functionality into the main module is not an option for size reasons. My particular use case is SciPy support for Pyodide:

    pyodide/pyodide#211,
    pyodide/pyodide#240

where several of the scipy `*.so` modules need to link to LAPACK. If we link LAPACK statically into each of those `*.so` files, the compiled size blows up (see pyodide/pyodide#211 (comment)), and if we link LAPACK statically into the main module, the main module size also increases ~2x, which is not an option, since LAPACK functionality is not needed by every Pyodide user.

This patch therefore adds support for DSO -> DSO linking:

1. Similarly to how main module -> side module linking already works, when building a side module it is now possible to specify

       -s RUNTIME_LINKED_LIBS=[...]

   with the list of shared libraries that the side module needs to link to.

2. To store that information for asm.js, similarly to how it is currently handled for the main module (which always has a js part), we transform RUNTIME_LINKED_LIBS into

       libModule.dynamicLibraries = [...]

   (see src/preamble_sharedlib.js)

3. For a wasm module, in order to store the information about which libraries the module links to, we could in theory use the "module" attribute in wasm imports. However, Emscripten currently almost always uses just "env" for that "module" attribute, e.g.

       (import "env" "abortStackOverflow" (func $fimport$0 (param i32)))
       (import "env" "_ffunc1" (func $fimport$1))
       ...

   so we have to embed the information about required libraries for the dynamic linker somewhere else.
   What I came up with is to extend the "dylink" section with information about which shared libraries a shared library needs. This is similar to DT_NEEDED entries in ELF.

   (see tools/shared.py)

4. Then, the dynamic linker (loadDynamicLibrary) is reworked to handle that information:

   - for asm.js, after loading a libModule, we check libModule.dynamicLibraries and post-load them recursively. (It would be better to load the needed modules before the libModule, but for js we currently have to eval the whole libModule's code first to be able to read .dynamicLibraries.)

   - for wasm, the needed libraries are loaded before the wasm module in question is instantiated.

     (see changes to loadWebAssemblyModule for details)

5. Since we also have to teach dlopen() to handle needed libraries, and since dlopen was already duplicating loadDynamicLibrary() code in many ways, instead of adding more duplication, dlopen is now reworked to use loadDynamicLibrary itself.

   This moves the functionality that keeps track of loaded DSOs, their handles, refcounts, etc. into the dynamic linker itself, with loadDynamicLibrary now accepting various flags (global/nodelete) to handle e.g. the RTLD_LOCAL/RTLD_GLOBAL and RTLD_NODELETE dlopen cases (RTLD_NODELETE semantics are needed for initially-linked-in libraries).

   Also, since dlopen was using FS to read libraries, while loadDynamicLibrary was previously using Module['read'] and friends, loadDynamicLibrary now also accepts an fs interface which, if provided, is used as an FS-like interface to load library data; if it is not provided, the native loading capabilities of the environment are still used.

   Another aspect related to deduplication is that loadDynamicLibrary now also uses preloaded/precompiled wasm modules, which were previously used only by dlopen (see a5866a5 "Add preload plugin to compile wasm side modules async (emscripten-core#6663)").

   (see changes to dlopen and loadDynamicLibrary)

6. The functionality to asynchronously load dynamic libraries is also integrated into loadDynamicLibrary.
   Libraries were asynchronously preloaded for the case when Module['readBinary'] is absent (browser, see 3446d2a "preload wasm dynamic libraries when we can't load them synchronously"). Since this codepath also needed to be taught about DSO -> DSO dependencies, the most straightforward thing to do was to teach loadDynamicLibrary to do its work asynchronously (under a flag) and to switch the preloading to use

       loadDynamicLibrary(..., {loadAsync: true})

   (see changes to src/preamble.js and loadDynamicLibrary)

7. A test is added to verify linking/dlopening a DSO that needs another library. browser.test_dynamic_link is also amended to verify linking to a DSO with dependencies.

With the patch I've made sure that all core tests (including test_dylink_* and test_dlfcn_*) pass for asm{0,1,2} and binaryen{0,1,2}.

However, since I cannot get the full browser tests to pass even on pristine incoming (1.38.19-2-g77246e0c1 as of today; there are many failures for both Firefox 63.0.1 and Chromium 70.0.3538.67), I did not try to verify the full browser tests with my patch. But I've made sure that

    browser.test_preload_module
    browser.test_dynamic_link

pass.

The "other" kind of tests also do not pass on pristine incoming for me, so I did not try to verify "other" with my patch.

Thanks beforehand,
Kirill

P.S.

This is the first time I have done anything with WebAssembly/Emscripten, and only the second time with JavaScript, so please forgive me if I missed something.

P.P.S.

I can split the patch into smaller steps if that will help review.

/cc @kripken, @juj, @sbc100, @max99x, @junjihashimoto, @mdboom, @rth
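
For illustration only, the dependency-first load order described above for wasm (needed libraries are loaded before the module that needs them, with refcounts tracked by the linker) could be sketched roughly as follows. This is not the code from the patch: the in-memory table `libs`, the function `loadLibrary`, and the `loaded`/`loadOrder` bookkeeping are hypothetical names, and the `needed` arrays merely stand in for the needed-libraries list stored in the "dylink" section.

```javascript
// Hypothetical stand-in for library files: name -> metadata.
// `needed` plays the role of the DT_NEEDED-like list (an assumption
// for this sketch, not the real "dylink" encoding).
const libs = {
  'libblas.so':   { needed: [] },
  'liblapack.so': { needed: ['libblas.so'] },
  'linalg.so':    { needed: ['liblapack.so'] },
  'sparse.so':    { needed: ['liblapack.so'] },
};

const loaded = new Map(); // name -> { name, refcount }
const loadOrder = [];     // order in which libraries were "instantiated"

function loadLibrary(name) {
  const existing = loaded.get(name);
  if (existing) {
    // Already loaded: only bump the refcount, do not load twice.
    existing.refcount++;
    return existing;
  }
  const meta = libs[name];
  if (!meta) throw new Error('library not found: ' + name);

  const dso = { name, refcount: 1 };
  loaded.set(name, dso); // register first so dependency cycles terminate

  // Load everything this library needs before "instantiating" it.
  for (const dep of meta.needed) loadLibrary(dep);

  loadOrder.push(name);  // placeholder for real instantiation
  return dso;
}

loadLibrary('linalg.so');
loadLibrary('sparse.so');
console.log(loadOrder.join(' '));
// prints "libblas.so liblapack.so linalg.so sparse.so"
console.log(loaded.get('liblapack.so').refcount); // prints 2
```

In this sketch LAPACK is loaded once and shared by both modules, which is the size benefit the patch is after; flags such as global/nodelete and the asynchronous path are deliberately left out.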