Implement dlopen for self-contained shared libraries. #278

Closed
alexey-milovidov opened this issue Sep 27, 2021 · 6 comments

@alexey-milovidov

alexey-milovidov commented Sep 27, 2021

TL;DR: open and use dynamic libraries from a statically linked executable.

Use case:

I have a .so library that does not require any function from my executable. It may require loading other libraries, though.
I want to load and run the code from this library within my statically linked executable.
I also want all dl-related functions (like dl_iterate_phdr, dladdr) to work.
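
To illustrate the last requirement, here is a minimal sketch of how dl_iterate_phdr is typically used on Linux to enumerate loaded ELF objects. The helper names print_object and count_loaded_objects are made up for this example. After loading a library, the expectation in this issue is that the new object would also show up in this list.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Callback invoked by dl_iterate_phdr once per loaded ELF object. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data) {
    (void)size;
    int *count = data;
    /* An empty name usually denotes the main program (or the vDSO). */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(unnamed: main program or vDSO)";
    printf("%s: base=%p, %u program headers\n", name,
           (void *)info->dlpi_addr, (unsigned)info->dlpi_phnum);
    ++*count;
    return 0; /* returning non-zero would stop the iteration */
}

/* Hypothetical helper: prints every loaded object and returns how many
 * objects the loader reported. */
int count_loaded_objects(void) {
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling count_loaded_objects() before and after a load is one simple way to check whether the loader bookkeeping was updated.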

Notes:

When performing dynamic linking, I don't want the library to link against any code in my executable. E.g. if the library uses any libc function, the system libc.so will be opened and linked. It will use its own version of errno, its own view of __environ, etc. This may cause various kinds of trouble but should work for at least some libraries. To make it work, we can prepare the same ABI of global variables and functions that the dynamic loader from glibc provides.

References:

  1. Dynamic loader from Musl: https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c It is mostly unused in practice.
  2. Dynamic loader from LLVM-libc: https://github.com/llvm/llvm-project/blob/main/libc/loader/linux/x86_64/start.cpp It is at an early stage of development and incomplete.
  3. Dynamic loader from Diet Libc: https://github.com/ensc/dietlibc/blob/master/ldso.c
  4. Stubs: https://github.com/jart/cosmopolitan/blob/8f52c0d7734de16e664e5c4e67e97ffd1a50c3b7/libc/runtime/ldso.c
  5. Explanation of how bad this idea is: https://www.openwall.com/lists/musl/2012/12/08/4
@pkulchenko
Collaborator

Is this a duplicate of #137? Or is the twist here that the loaded library doesn't depend on anything in the executable? If you can avoid all the dependencies, why not launch it as a standalone executable and use IO or mmap for IPC?

Explanation of how bad this idea is: https://www.openwall.com/lists/musl/2012/12/08/4

I think the arguments here are well presented. I've been supporting dynamic loading for cosmopolitan libs in #137 and some other tickets, but given the aggressive optimizations that @jart is applying to remove any code that is not being used, I don't see how it can be made to work with arbitrary libraries that may depend on any code that was optimized away. Surely the libraries themselves can be made self-sufficient, but then one would need to make sure that all internal structures have the same format/size/etc. (similar to the FILE example in the post). Maybe @jart has a plan...

@alexey-milovidov
Author

alexey-milovidov commented Sep 28, 2021

Let's assume the library can use FILE, make libc calls, etc. internally, but it provides a self-contained API that does not depend on anything. E.g. instead of passing a FILE* to a method, it provides opaque * mylibrary_file_create(); void mylibrary_file_destroy(opaque *).

Then it can open and use its own copy of libc. This looks dangerous but can work. It requires some magic: preparing the in-memory structures that libc's dynamic loader normally sets up.
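
The opaque-handle style of API mentioned above could look roughly like the sketch below. The type name mylibrary_file and the function mylibrary_file_write are assumptions made for illustration. Only the create/destroy pair comes from the description. The point is that FILE never crosses the library boundary, and allocation and deallocation both happen inside the library, so they use the same libc copy.

```c
#include <stdio.h>
#include <stdlib.h>

/* Public part of the API: callers only ever see an incomplete type. */
typedef struct mylibrary_file mylibrary_file;

mylibrary_file *mylibrary_file_create(void);
int mylibrary_file_write(mylibrary_file *f, const char *text); /* hypothetical */
void mylibrary_file_destroy(mylibrary_file *f);

/* Private part: the layout, including the FILE*, stays inside the library,
 * so the caller's libc never has to agree on what FILE looks like. */
struct mylibrary_file {
    FILE *stream;
};

mylibrary_file *mylibrary_file_create(void) {
    mylibrary_file *f = malloc(sizeof *f);
    if (!f) return NULL;
    f->stream = tmpfile(); /* temporary file, used here only as a stand-in */
    if (!f->stream) {
        free(f);
        return NULL;
    }
    return f;
}

/* Returns 0 on success, -1 on failure. */
int mylibrary_file_write(mylibrary_file *f, const char *text) {
    return fputs(text, f->stream) == EOF ? -1 : 0;
}

/* Frees with the same allocator that created the object. */
void mylibrary_file_destroy(mylibrary_file *f) {
    if (!f) return;
    fclose(f->stream);
    free(f);
}
```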

I imagine that it's in the spirit of Cosmopolitan Libc. If anyone is going to do this sort of crazy stuff, this is the only place where it can happen.

Yes, running libraries in a separate process and doing IPC is better. Also see https://github.com/google/sandboxed-api

@alexey-milovidov
Author

Another interesting idea (might not work):

Instead of loading a shared library, run a new process. This new process acts as a dynamic loader and resolves all the dependencies of the shared library (loading other shared libraries, including libc, and linking with them). But it interprets the LOAD segments of the ELF file by allocating shared memory in /dev/shm. It also hooks the mmap function so that it operates on shared memory too. The mmap calls may carry address hints so that the virtual addresses are predefined, and the main process maps the same shared memory at the same virtual addresses. The loaded process listens for RPC commands via a pipe, and dlsym calls in the main process generate shims that perform the RPC.
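
The RPC part of this idea can be sketched in isolation. The example below is only an illustration under simplifying assumptions: it forks a worker instead of running a real loader process, it does not set up any shared memory, and library_add stands in for a function that would live in the loaded library. All names (add_request, remote_add, worker_loop) are made up for the sketch.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct add_request { int32_t a, b; };

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Stand-in for a function exported by the loaded library. */
static int32_t library_add(int32_t a, int32_t b) { return a + b; }

/* Worker side: serve requests until the request pipe is closed. */
static void worker_loop(int req_fd, int resp_fd) {
    struct add_request req;
    while (read_full(req_fd, &req, sizeof req) == 0) {
        int32_t result = library_add(req.a, req.b);
        if (write_full(resp_fd, &result, sizeof result) != 0) break;
    }
}

/* Shim on the main-process side: forwards the call over a pipe and waits
 * for the reply. It spawns one worker per call to keep the sketch short.
 * Returns 0 on success, -1 on failure. */
int remote_add(int32_t a, int32_t b, int32_t *out) {
    int req[2], resp[2];
    if (pipe(req) != 0) return -1;
    if (pipe(resp) != 0) { close(req[0]); close(req[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]); close(resp[0]); close(resp[1]);
        return -1;
    }
    if (pid == 0) {            /* child: act as the RPC worker */
        close(req[1]);
        close(resp[0]);
        worker_loop(req[0], resp[1]);
        _exit(0);
    }

    close(req[0]);             /* parent keeps only its ends of the pipes */
    close(resp[1]);
    struct add_request r = { a, b };
    int rc = -1;
    if (write_full(req[1], &r, sizeof r) == 0 &&
        read_full(resp[0], out, sizeof *out) == 0)
        rc = 0;
    close(req[1]);             /* EOF on the request pipe stops the worker */
    close(resp[0]);
    waitpid(pid, NULL, 0);
    return rc;
}
```

A real implementation would keep one long-lived worker and would pass pointers into the shared mapping rather than copying arguments by value.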

@alexey-milovidov
Author

Also see iree-org/iree#5504.

@jacereda
Contributor

What I'm doing in https://github.com/jacereda/cosmogfx is loading a shared binary linked against libdl and jumping back to the original executable code, providing the libdl entry points. I guess this could be wrapped into ape/loader.c to make this transparent and provide a functional dlopen.

We could have the first (static) executable (loader) load a helper binary (helper) and jump to its entry point, which just jumps back to loader, handing over the libdl symbol addresses. loader then loads the APE executable, passing it the libdl info.

Does that make sense?

@mrdomino
Collaborator

This now exists; it's called cosmo_dlopen.
