Teach dynamic linking to handle library -> library dependencies
Currently Emscripten allows creating shared libraries (DSOs) and linking
them to the main module. However, a shared library itself cannot be linked to
another shared library.

The lack of support for DSO -> DSO linking becomes a problem when several
shared libraries all need the same functionality that should itself be shared,
while linking that functionality into the main module is not an option for size
reasons. My particular use case is SciPy support for Pyodide:

	pyodide/pyodide#211,
	pyodide/pyodide#240

where several of the SciPy `*.so` modules need to link to LAPACK. If we link
LAPACK statically into each of those `*.so` files, the compiled size blows up

	pyodide/pyodide#211 (comment)

and if we link LAPACK statically into the main module, the main module size
also increases ~2x, which is not an option, since LAPACK functionality is not
needed by every Pyodide user.

This patch therefore adds support for DSO -> DSO linking:

1. similarly to how it already works for main module -> side module
   linking, when building a side module it is now possible to specify

	-s RUNTIME_LINKED_LIBS=[...]

   with the list of shared libraries that the side module needs to link to.

2. to store that information, for asm.js, similarly to how it is currently
   handled for the main module (which always has a JS part), we transform
   RUNTIME_LINKED_LIBS to

	libModule.dynamicLibraries = [...]

   (see src/preamble_sharedlib.js)

3. for a wasm module, to store the information about which libraries the
   module links to, we could in theory use the "module" attribute in wasm
   imports. However, Emscripten currently almost always uses just "env" for
   that "module" attribute, e.g.

	(import "env" "abortStackOverflow" (func $fimport$0 (param i32)))
	(import "env" "_ffunc1" (func $fimport$1))
	...

   and so we have to embed the information about required libraries for
   the dynamic linker somewhere else.

   What I came up with is to extend the "dylink" section with information about
   which shared libraries a shared library needs. This is similar to DT_NEEDED
   entries in ELF.

   (see tools/shared.py)

4. then, the dynamic linker (loadDynamicLibrary) is reworked to handle that information:

   - for asm.js, after loading a libModule, we check libModule.dynamicLibraries
     and post-load them recursively. (it would be better to load needed modules
     before the libModule, but for js we currently have to first eval whole
     libModule's code to be able to read .dynamicLibraries)

   - for wasm the needed libraries are loaded before the wasm module in
     question is instantiated.

     (see changes to loadWebAssemblyModule for details)

5. since we also have to teach dlopen() to handle needed libraries, and since dlopen
   was already duplicating loadDynamicLibrary() code in many ways, instead of
   adding more duplication, dlopen is now reworked to use loadDynamicLibrary
   itself.

   This moves the functionality to keep track of loaded DSOs, their handles,
   refcounts, etc. into the dynamic linker itself, with loadDynamicLibrary now
   accepting various flags (global/nodelete) to handle e.g.
   RTLD_LOCAL/RTLD_GLOBAL and RTLD_NODELETE dlopen cases (RTLD_NODELETE
   semantics are needed for initially-linked-in libraries).

   Also, since dlopen was using FS to read libraries, and loadDynamicLibrary was
   previously using Module['read'] and friends, loadDynamicLibrary now also
   accepts an fs interface. If provided, it is used as an FS-like interface to
   load library data; if not, the native loading capabilities of the
   environment are still used.

   Another aspect related to deduplication is that loadDynamicLibrary now also
   uses preloaded/precompiled wasm modules, that were previously only used by
   dlopen (see a5866a5 "Add preload plugin to compile wasm side modules async
   (emscripten-core#6663)").

   (see changes to dlopen and loadDynamicLibrary)

6. The functionality to asynchronously load dynamic libraries is also
   integrated into loadDynamicLibrary.

   Libraries were already preloaded asynchronously when Module['readBinary']
   is absent (browser, see 3446d2a "preload wasm dynamic libraries when we
   can't load them synchronously").

   Since this codepath also had to learn about DSO -> DSO dependencies,
   the most straightforward thing to do was to teach loadDynamicLibrary to do
   its work asynchronously (under a flag) and to switch the preloading to use

	loadDynamicLibrary(..., {loadAsync: true})

   (see changes to src/preamble.js and loadDynamicLibrary)

7. A test is added to verify linking/dlopening a DSO that needs another library.

   browser.test_dynamic_link is also amended to verify linking to DSO with
   dependencies.
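
Steps 4 and 5 above can be pictured with a toy linker. This is only a sketch
under stated assumptions, not the code in this patch: `createLinker`, the
in-memory `registry`, and the library names are hypothetical, dependencies are
assumed to inherit the flags of the library that needs them, and `nodelete` is
modelled here by pinning the refcount at Infinity. It shows needed libraries
being instantiated first (the wasm order), and handles and refcounts living in
one place.

```javascript
// Toy model of a dynamic linker. All names are hypothetical; this is not
// Emscripten's loadDynamicLibrary.
// `registry` stands in for library files: name -> { needed: [...], exports: {...} }.
function createLinker(registry) {
  const loadedLibs = {};      // handle -> { refcount, name, module, global, published }
  const loadedLibNames = {};  // name -> handle
  const globalSymbols = {};   // symbols published by libraries loaded with `global`
  const loadOrder = [];       // names in the order they were instantiated
  let nextHandle = 1;

  function applyFlags(rec, flags) {
    // Assumption of this sketch: a nodelete library can never reach refcount 0.
    if (flags.nodelete) rec.refcount = Infinity;
    if (flags.global) rec.global = true;
    if (rec.global && rec.module && !rec.published) {
      Object.keys(rec.module).forEach(function (sym) {
        // The first definition of a symbol wins.
        if (!(sym in globalSymbols)) globalSymbols[sym] = rec.module[sym];
      });
      rec.published = true;
    }
  }

  function load(name, flags) {
    flags = flags || {};
    const known = loadedLibNames[name];
    if (known) {
      // Already loaded (or being loaded): bump the refcount, upgrade flags.
      const rec = loadedLibs[known];
      rec.refcount++;
      applyFlags(rec, flags);
      return known;
    }
    const lib = registry[name];
    if (!lib) throw new Error('Could not find dynamic lib: ' + name);

    // Register the handle before recursing so dependency cycles terminate.
    const handle = nextHandle++;
    const rec = { refcount: 1, name: name, module: null, global: false, published: false };
    loadedLibs[handle] = rec;
    loadedLibNames[name] = handle;

    // Needed libraries are loaded before this one is "instantiated".
    (lib.needed || []).forEach(function (dep) { load(dep, flags); });

    rec.module = lib.exports;
    loadOrder.push(name);
    applyFlags(rec, flags);
    return handle;
  }

  function close(handle) {
    const rec = loadedLibs[handle];
    if (!rec) return 1;  // unopened handle
    if (--rec.refcount === 0) {
      delete loadedLibNames[rec.name];
      delete loadedLibs[handle];
    }
    return 0;
  }

  return { load: load, close: close, loadedLibs: loadedLibs,
           loadedLibNames: loadedLibNames, globalSymbols: globalSymbols,
           loadOrder: loadOrder };
}

// Usage with made-up library names: the dependency is instantiated first.
const linker = createLinker({
  'libmod.so': { needed: ['liblapack.so'], exports: { _solve: function () {} } },
  'liblapack.so': { needed: [], exports: { _dgesv: function () {} } },
});
const handle = linker.load('libmod.so');
console.log(linker.loadOrder);  // [ 'liblapack.so', 'libmod.so' ]
```

For brevity the sketch does not release dependencies or unpublish global
symbols when a library is closed.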

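The idea in step 6, one loader that works either synchronously or under an
async flag, can be sketched as follows. The helper names (`makeIO`,
`instantiate`, `loadLibrary`, `preloadLibraries`) and the in-memory file table
are assumptions made for illustration; only the `loadAsync` flag name comes
from the text above.

```javascript
// Hypothetical I/O layer: `files` maps a library name to its bytes.
function makeIO(files) {
  return {
    readSync: function (name) {
      if (!(name in files)) throw new Error("failed to load '" + name + "'");
      return files[name];
    },
    readAsync: function (name) {
      return new Promise(function (resolve, reject) {
        if (name in files) resolve(files[name]);
        else reject(new Error("failed to load '" + name + "'"));
      });
    },
  };
}

// Stand-in for instantiating a module from its bytes.
function instantiate(name, bytes) {
  return { name: name, size: bytes.length };
}

// One entry point for both modes: returns the module directly, or a Promise
// of the module when flags.loadAsync is set.
function loadLibrary(name, flags, io) {
  if (flags && flags.loadAsync) {
    return io.readAsync(name).then(function (bytes) {
      return instantiate(name, bytes);
    });
  }
  return instantiate(name, io.readSync(name));
}

// Preload a list of libraries and call onReady exactly once when all are in.
function preloadLibraries(names, io, onReady) {
  return Promise.all(names.map(function (name) {
    return loadLibrary(name, { loadAsync: true }, io);
  })).then(function (modules) {
    onReady(modules);
    return modules;
  });
}
```

Promise.all resolves with results in the order of the input list regardless of
which read finishes first, so no manual counting of finished downloads is
needed.
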
With the patch I've made sure that all core tests (including test_dylink_* and
test_dlfcn_*) are passing for asm{0,1,2} and binaryen{0,1,2}.

However, since I cannot get the full browser tests to pass even on pristine incoming
(1.38.19-2-g77246e0c1 as of today; there are many failures for both Firefox
63.0.1 and Chromium 70.0.3538.67), I did not try to verify the full browser tests
with my patch. But I've made sure that

	browser.test_preload_module
	browser.test_dynamic_link

are passing.

The "other" tests also do not pass on pristine incoming for me, so
I did not try to verify "other" with my patch.

Thanks beforehand,
Kirill

P.S.

This is the first time I have done anything with WebAssembly/Emscripten, and only
the second time with JavaScript, so please forgive me if I missed something.

P.P.S.

I can split the patch into smaller steps if that would help review.

/cc @kripken, @juj, @sbc100, @max99x, @junjihashimoto, @mdboom, @rth
navytux committed Nov 15, 2018
1 parent 77246e0 commit 876d284
Showing 11 changed files with 591 additions and 280 deletions.
1 change: 1 addition & 0 deletions AUTHORS
@@ -371,3 +371,4 @@ a license to everyone to use it as detailed in LICENSE.)
* Altan Özlü <altanozlu7@gmail.com>
* Mary S <ipadlover8322@gmail.com>
* Martin Birks <mbirks@gmail.com>
* Kirill Smelkov <kirr@nexedi.com> (copyright owned by Nexedi)
2 changes: 1 addition & 1 deletion emcc.py
@@ -2604,7 +2604,7 @@ def do_binaryen(target, asm_target, options, memfile, wasm_binary_target,
shared.Building.eval_ctors(final, wasm_binary_target, binaryen_bin, debug_info=debug_info)
# after generating the wasm, do some final operations
if shared.Settings.SIDE_MODULE:
-wso = shared.WebAssembly.make_shared_library(final, wasm_binary_target)
+wso = shared.WebAssembly.make_shared_library(final, wasm_binary_target, shared.Settings.RUNTIME_LINKED_LIBS)
# replace the wasm binary output with the dynamic library. TODO: use a specific suffix for such files?
shutil.move(wso, wasm_binary_target)
if not shared.Settings.WASM_BACKEND and not DEBUG:
126 changes: 33 additions & 93 deletions src/library.js
@@ -1696,20 +1696,25 @@ LibraryManager.library = {
// being compiled. Not sure how to tell LLVM to not do so.
// ==========================================================================

#if MAIN_MODULE == 0
dlopen: function(/* ... */) {
abort("To use dlopen, you need to use Emscripten's linking support, see https://github.com/kripken/emscripten/wiki/Linking");
},
dlclose: 'dlopen',
dlsym: 'dlopen',
dlerror: 'dlopen',
dladdr: 'dlopen',
#else

$DLFCN: {
error: null,
errorMsg: null,
loadedLibs: {}, // handle -> [refcount, name, lib_object]
loadedLibNames: {}, // name -> handle
},
// void* dlopen(const char* filename, int flag);
dlopen__deps: ['$DLFCN', '$FS', '$ENV'],
dlopen__deps: ['$LDSO', '$DLFCN', '$FS', '$ENV'],
dlopen__proxy: 'sync',
dlopen__sig: 'iii',
dlopen: function(filenameAddr, flag) {
#if MAIN_MODULE == 0
abort("To use dlopen, you need to use Emscripten's linking support, see https://github.com/kripken/emscripten/wiki/Linking");
#endif
// void *dlopen(const char *file, int mode);
// http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
var searchpaths = [];
@@ -1739,128 +1744,62 @@ LibraryManager.library = {
}
}

if (DLFCN.loadedLibNames[filename]) {
// Already loaded; increment ref count and return.
var handle = DLFCN.loadedLibNames[filename];
DLFCN.loadedLibs[handle].refcount++;
return handle;
}
// We don't care about RTLD_NOW and RTLD_LAZY.
var flags = {
global: Boolean(flag & 256), // RTLD_GLOBAL
nodelete: Boolean(flag & 4096), // RTLD_NODELETE

var lib_module;
if (filename === '__self__') {
var handle = -1;
lib_module = Module;
} else {
if (Module['preloadedWasm'] !== undefined &&
Module['preloadedWasm'][filename] !== undefined) {
lib_module = Module['preloadedWasm'][filename];
} else {
var target = FS.findObject(filename);
if (!target || target.isFolder || target.isDevice) {
DLFCN.errorMsg = 'Could not find dynamic lib: ' + filename;
return 0;
}
FS.forceLoadFile(target);
fs: FS, // load libraries from provided filesystem
}

try {
#if WASM
// the shared library is a shared wasm library (see tools/shared.py WebAssembly.make_shared_library)
var lib_data = FS.readFile(filename, { encoding: 'binary' });
if (!(lib_data instanceof Uint8Array)) lib_data = new Uint8Array(lib_data);
//err('libfile ' + filename + ' size: ' + lib_data.length);
lib_module = loadWebAssemblyModule(lib_data);
#else
// the shared library is a JS file, which we eval
var lib_data = FS.readFile(filename, { encoding: 'utf8' });
lib_module = eval(lib_data)(
alignFunctionTables(),
Module
);
#endif
} catch (e) {
try {
handle = loadDynamicLibrary(filename, flags)
} catch (e) {
#if ASSERTIONS
err('Error in loading dynamic library: ' + e);
err('Error in loading dynamic library ' + filename + ": " + e);
#endif
DLFCN.errorMsg = 'Could not evaluate dynamic lib: ' + filename + '\n' + e;
return 0;
}
}

// Not all browsers support Object.keys().
var handle = 1;
for (var key in DLFCN.loadedLibs) {
if (DLFCN.loadedLibs.hasOwnProperty(key)) handle++;
}

// We don't care about RTLD_NOW and RTLD_LAZY.
if (flag & 256) { // RTLD_GLOBAL
for (var ident in lib_module) {
if (lib_module.hasOwnProperty(ident)) {
// When RTLD_GLOBAL is enable, the symbols defined by this shared object will be made
// available for symbol resolution of subsequently loaded shared objects.
//
// We should copy the symbols (which include methods and variables) from SIDE_MODULE to MAIN_MODULE.
//
// Module of SIDE_MODULE has not only the symbols (which should be copied)
// but also others (print*, asmGlobal*, FUNCTION_TABLE_**, NAMED_GLOBALS, and so on).
//
// When the symbol (which should be copied) is method, Module._* 's type becomes function.
// When the symbol (which should be copied) is variable, Module._* 's type becomes number.
//
// Except for the symbol prefix (_), there is no difference in the symbols (which should be copied) and others.
// So this just copies over compiled symbols (which start with _).
if (ident[0] == '_') {
Module[ident] = lib_module[ident];
}
}
}
}
DLFCN.errorMsg = 'Could not load dynamic lib: ' + filename + '\n' + e;
return 0;
}
DLFCN.loadedLibs[handle] = {
refcount: 1,
name: filename,
module: lib_module
};
DLFCN.loadedLibNames[filename] = handle;

return handle;
},
// int dlclose(void* handle);
dlclose__deps: ['$DLFCN'],
dlclose__deps: ['$LDSO', '$DLFCN'],
dlclose__proxy: 'sync',
dlclose__sig: 'ii',
dlclose: function(handle) {
// int dlclose(void *handle);
// http://pubs.opengroup.org/onlinepubs/009695399/functions/dlclose.html
if (!DLFCN.loadedLibs[handle]) {
if (!$LDSO.loadedLibs[handle]) {
DLFCN.errorMsg = 'Tried to dlclose() unopened handle: ' + handle;
return 1;
} else {
var lib_record = DLFCN.loadedLibs[handle];
var lib_record = $LDSO.loadedLibs[handle];
if (--lib_record.refcount == 0) {
if (lib_record.module.cleanups) {
lib_record.module.cleanups.forEach(function(cleanup) { cleanup() });
}
delete DLFCN.loadedLibNames[lib_record.name];
delete DLFCN.loadedLibs[handle];
delete $LDSO.loadedLibNames[lib_record.name];
delete $LDSO.loadedLibs[handle];
}
return 0;
}
},
// void* dlsym(void* handle, const char* symbol);
dlsym__deps: ['$DLFCN'],
dlsym__deps: ['$LDSO', '$DLFCN'],
dlsym__proxy: 'sync',
dlsym__sig: 'iii',
dlsym: function(handle, symbol) {
// void *dlsym(void *restrict handle, const char *restrict name);
// http://pubs.opengroup.org/onlinepubs/009695399/functions/dlsym.html
symbol = Pointer_stringify(symbol);

if (!DLFCN.loadedLibs[handle]) {
if (!$LDSO.loadedLibs[handle]) {
DLFCN.errorMsg = 'Tried to dlsym() from an unopened handle: ' + handle;
return 0;
} else {
var lib = DLFCN.loadedLibs[handle];
var lib = $LDSO.loadedLibs[handle];
symbol = '_' + symbol;
if (!lib.module.hasOwnProperty(symbol)) {
DLFCN.errorMsg = ('Tried to lookup unknown symbol "' + symbol +
@@ -1897,7 +1836,7 @@ LibraryManager.library = {
}
},
// char* dlerror(void);
dlerror__deps: ['$DLFCN'],
dlerror__deps: ['$LDSO', '$DLFCN'],
dlerror__proxy: 'sync',
dlerror__sig: 'i',
dlerror: function() {
@@ -1925,6 +1864,7 @@ LibraryManager.library = {
{{{ makeSetValue('info', QUANTUM_SIZE*3, '0', 'i32') }}};
return 1;
},
#endif // MAIN_MODULE != 0

// ==========================================================================
// pwd.h
2 changes: 1 addition & 1 deletion src/library_browser.js
@@ -244,7 +244,7 @@ var LibraryBrowser = {
// promises to run in series.
this['asyncWasmLoadPromise'] = this['asyncWasmLoadPromise'].then(
function() {
-return loadWebAssemblyModule(byteArray, true);
+return loadWebAssemblyModule(byteArray, {loadAsync: true, nodelete: true});
}).then(
function(module) {
Module['preloadedWasm'][name] = module;
34 changes: 14 additions & 20 deletions src/preamble.js
@@ -1619,37 +1619,31 @@ addOnPreRun(function() { addRunDependency('pgo') });
}}}

addOnPreRun(function() {
function runPostSets() {
if (Module['asm']['runPostSets']) {
Module['asm']['runPostSets']();
}
}
function loadDynamicLibraries(libs) {
if (libs) {
libs.forEach(function(lib) {
loadDynamicLibrary(lib);
// libraries linked to main never go away
loadDynamicLibrary(lib, {global: true, nodelete: true});
});
}
if (Module['asm']['runPostSets']) {
Module['asm']['runPostSets']();
}
runPostSets();
}
// if we can load dynamic libraries synchronously, do so, otherwise, preload
#if WASM
if (Module['dynamicLibraries'] && Module['dynamicLibraries'].length > 0 && !Module['readBinary']) {
// we can't read binary data synchronously, so preload
addRunDependency('preload_dynamicLibraries');
var binaries = [];
Module['dynamicLibraries'].forEach(function(lib) {
fetch(lib, { credentials: 'same-origin' }).then(function(response) {
if (!response['ok']) {
throw "failed to load wasm binary file at '" + lib + "'";
}
return response['arrayBuffer']();
}).then(function(buffer) {
var binary = new Uint8Array(buffer);
binaries.push(binary);
if (binaries.length === Module['dynamicLibraries'].length) {
// we got them all, wonderful
loadDynamicLibraries(binaries);
removeRunDependency('preload_dynamicLibraries');
}
});
Promise.all(Module['dynamicLibraries'].map(function(lib) {
return loadDynamicLibrary(lib, {loadAsync: true, global: true, nodelete: true});
})).then(function() {
// we got them all, wonderful
runPostSets();
removeRunDependency('preload_dynamicLibraries');
});
return;
}
14 changes: 14 additions & 0 deletions src/preamble_sharedlib.js
@@ -4,5 +4,19 @@
// Runtime essentials
//========================================

{{{
(function() {
// add in RUNTIME_LINKED_LIBS, if provided
//
// for side module we only set Module.dynamicLibraries - and loading them
// will be handled by dynamic linker runtime in the main module.
if (RUNTIME_LINKED_LIBS.length > 0) {
return "if (!Module['dynamicLibraries']) Module['dynamicLibraries'] = [];\n" +
"Module['dynamicLibraries'] = " + JSON.stringify(RUNTIME_LINKED_LIBS) + ".concat(Module['dynamicLibraries']);\n";
}
return '';
})()
}}}

// === Body ===

5 changes: 2 additions & 3 deletions src/settings.js
@@ -683,9 +683,8 @@ var MAIN_MODULE = 0;
// Corresponds to MAIN_MODULE (also supports modes 1 and 2)
var SIDE_MODULE = 0;

-// If this is a main module (MAIN_MODULE == 1), then
-// we will link these at runtime. They must have been built with
-// SIDE_MODULE == 1.
+// If this is a shared object (MAIN_MODULE == 1 || SIDE_MODULE == 1), then we
+// will link these at runtime. They must have been built with SIDE_MODULE == 1.
var RUNTIME_LINKED_LIBS = [];

// If set to 1, this is a worker library, a special kind of library that is run
