Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

re-add support for building as a static library #975

Open
derekbruening opened this issue Nov 28, 2014 · 23 comments
Open

re-add support for building as a static library #975

derekbruening opened this issue Nov 28, 2014 · 23 comments

Comments

@derekbruening
Copy link
Contributor

From bruen...@google.com on November 13, 2012 14:24:59

long, long ago in a galaxy far, far away, DR supported being built as a static library. we now want to revive that support.

issues we need to address include:

  • code that looks for DR library bounds
  • code that makes .data read-only
  • may be ok to leave the extra data sections

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=975

@derekbruening
Copy link
Contributor Author

From rnk@google.com on November 26, 2012 11:12:23

I've started working on this for Linux. I'd like to punt making it work on Windows for now.

Status: Started
Owner: rnk@google.com

@derekbruening
Copy link
Contributor Author

From rnk@google.com on November 26, 2012 12:38:55

For plain DR, the static library support seems pretty straightforward.

  • For module bounds, we can say DR has bounds [0, 0), so nothing is in DR.
  • On Linux, the section protections are off by default, and haven't been a problem yet.

We have to make some decisions about client support, though. Clients could always be linked statically, in which case I see two ways forward:

  1. Have DR lookup "dr_init" in the exe, call it if it exists, and go from there. This has the disadvantage that we can't support multiple clients without extra work.
  2. Let the exe do its own initialization between dr_app_setup() and dr_app_start(). If someone is only planning to use the static build with the app API, this is more natural and requires less indirection. It also allows the app to link in multiple clients, and decide which one it's actually going to call later. It glosses over the concept of client ids and client options, however. If the app has the client options stored away somewhere else, though, it's not clear if it really needs this information.
  3. Add a new dr_app_register_client() API that takes a function pointer. This solves the client_id issue, but doesn't seem worth it to me.
  4. Use the drconfig library to generate a config file with clients and options. We'll check this out at takeover, but we currently assume that clients are DSOs that need to be loaded.

My preference is to implement 1 now since it's the least work and our existing clients are all compatible with it. Later, when we have a use case for linking in two clients and deciding which one to use, we can implement another strategy, or go back to being a shared library.

Another concern is that we have lots of checks for clients based on client_lib being empty or not. We could do something like setting client_lib to ";0;", in which case DR may do something close to the right thing. We'd have to make it not add the exe to dynamo_areas.

@derekbruening
Copy link
Contributor Author

From bruen...@google.com on November 26, 2012 14:07:57

  • For module bounds, we can say DR has bounds [0, 0), so nothing is in DR.

this may be too simplistic: there are a bunch of tests of whether in DR. though mostly what I'm thinking of are DR-owned memory tests, and this is just the DR library. we should go through all the tests and make sure they're ok.

  • On Linux, the section protections are off by default, and haven't been a problem yet.

No, they're not: DR marks .data as read-only at init time. This will break the app.

@derekbruening
Copy link
Contributor Author

From rnk@google.com on November 27, 2012 08:27:20

Another problem: we leak hidden symbols to the application. For example, our versions or __errno_location and strcpy will preempt the ones the app expects in libc.

Maybe before we create the archive, we can use "ld -r" to create a single object file (dynamorio.o), and then use some linker options to hide non-exported symbols.

@derekbruening
Copy link
Contributor Author

From bruen...@google.com on November 27, 2012 09:13:53

I'd like to punt making it work on Windows for now.

No, let's design a cross-platform solution here. So long as we avoid any ELF-specific design decisions and implementation hacks that rely on toolchain-specific tricks, it shouldn't be much more work to make it work on Windows.

@derekbruening
Copy link
Contributor Author

From rnk@google.com on December 05, 2012 11:21:22

To avoid leaking symbols into the application, it looks like we can use a combination of "ld -r" and "objcopy --localize-hidden" in a post build hook on dynamorio. I'm not sure what the equivalent is for Windows.

@derekbruening
Copy link
Contributor Author

From rnk@google.com on December 05, 2012 11:21:40

I meant to link the stackoverflow article I found: http://stackoverflow.com/questions/393980/restricting-symbols-in-a-linux-static-library

@derekbruening
Copy link
Contributor Author

From bruen...@google.com on December 06, 2012 08:34:05

for taking over on windows if we assume it's compiled with MSVC we can use .CRT$XCU

@derekbruening
Copy link
Contributor Author

From bruen...@google.com on August 02, 2013 07:14:35

some notes from a discussion that never made their way into this issue:

what about private libs? do we need separate static copy or just interp vs
native is enough? on linux do we redirect anything?

yes, we wrap malloc.

can wrap via linker flag: but that affects whole app.

  1. have DR interp "unwrap" if see __wrap_alloc()
  2. have duplicate copy of priv lib that wraps: then need to combine ELF files

Owner: peter.goodman

@derekbruening
Copy link
Contributor Author

We're re-reviving this!

My proposal is to throw out the top-level STATIC_LIBRARY build type and instead have a side-by-side dynamorio_static library target that we create in every build, just like we do with the extension libraries. This will make it easier to maintain and test.

@derekbruening
Copy link
Contributor Author

Since it's a pain to set DR runtime options, we should just commit to -code_api being on by default for dynamorio_static.

@derekbruening
Copy link
Contributor Author

I'm leaning toward combining proposals 1 and 2 above: we enable -code_api and assume that client code could be run at any time, even if there's no dr_init, but we also look for and call dr_init and dr_client_main (aside: should we drop dr_init for static to reduce collision risk with the app?)

@derekbruening
Copy link
Contributor Author

derekbruening commented Aug 25, 2016

Further issues:

  • For non-dr_init model, client will not receive init-time module events. We should just document this: client should use module iterator after initializing DR.
  • Drop dr_init support for static b/c of risk of collision?
  • How set DR options? Have to set env var I guess. Or should we take them in as a param to dr_app_setup()?
  • How set client options? If client is not invoked by DR, options come from app. If client has a dr_client_main: simplest to again have options come from the app.

@derekbruening
Copy link
Contributor Author

I wish github had an auto-link feature other than on closing an issue.

Here is what I've done:

commit 4f70937

i#975 static DR: rename create_thread to avoid name conflicts

On Windows we have no solution to hide non-exported symbols in
dynamorio_static like we do on Linux.  In our own tests the only conflict
is create_thread.  We rename the core's to our_create_thread.  We have yet
to find a great solution to this general issue on Windows.

commit 8db6180

i#975 static DR: refactor STATIC_LIBRARY to a side-by-side target

This is a big refactoring of the STATIC_LIBRARY top-level CMake
configuration and separate build into instead using a side-by-side
dynamorio_static target.

Removes STATIC_LIBRARY from all CMake code.  It is no longer a separate
build choice.

Adds a dynamorio_static static library build of DR that is always built.

Disables data section protection for dynamorio_static.

Adds a -static option to drrun to set the env var for auto-takeover of an
app using dynamorio_static.

Adds a new CMake utility configure_DynamoRIO_static() to make it easy to
link an app with dynamorio_static.

Renames the existing static_* tests to drdecode_* to make it clear what
they are testing.

Adds a new test static_explicit.c to test explicit use of the start/stop
API with dynamorio_static.

DR still does not look for clients without runtime options being set: that
will be added in future work.

commit c268019

i#975 static DR: client existence and loading

Implements the proposed model for dynamorio_static: it allows any part of
the app to acts as the client, but also looks for and invokes a dr_init or
dr_client_main routine, supporting multiple methods of including a client
in the application.

Switches to accessing the client lib via the system loader when it's the
same as the application on Windows (the code for Linux was added in the
past).

Replaces checks for !IS_INTERNAL_STRING_OPTION_EMPTY(client_lib) with a new
macro CLIENTS_EXIST(), which is set to true for STATIC_LIBRARY.

Sets -code_api to true for STATIC_LIBRARY by default, and sets its
dependences via moving that code into options.c.

For STATIC_LIBRARY, "loads" the app itself as the client, and makes the
absence of an init routine non-fatal.

Updates the static_startstop test and adds two new tests: static_noclient
and static_noinit.

@derekbruening
Copy link
Contributor Author

I tried to make dynamorio_static support cross-platform, though on Windows we have the name conflict risk and no auto-takeover, on Android there were several issues I fixed but it still crashes (#1996), on Mac it fails to build (#1997), and on ARM the start/stop API is NYI (lumped under #1551).

@derekbruening
Copy link
Contributor Author

Other missing pieces here are documenting how to use dynamorio_static (incl init-time module differences) and adding an auto-takeover test (and implementing .CRT$XCU for Windows).

@derekbruening
Copy link
Contributor Author

For the client options: above we just said that the app could pass them to dr_client_main. Except it's not easy to do so for clients written separately from the app w/o some convention on a global var or env var to pass the string, and then somebody has to parse it: so better for instrument.c to centralize the parsing.

The code inside DR that parses options has a fixed buffer size. There could be tons of options added to our tracer? Maybe better as exported symbol from rest of app instead of env var? So DR would use dlsym to look it up, for static?

I'm going with something simple for now, and we can change it later if we need to. Leaning toward just using the existing -client_lib and ignoring the path, b/c app might want to set DR ops too (-disable_traces or sthg) and can do that in same env var.

@derekbruening
Copy link
Contributor Author

For the app setting DYNAMORIO_OPTIONS, we have a problem: on UNIX we cache environ in our constructor. If the app then calls setenv() and DYNAMORIO_OPTIONS was not already in the environment, it will make a copy of the env block, leaving our our_environ cached pointer pointing to the originals and thus not seeing the app's change. The plan is to eliminate the cache for static builds.

@derekbruening
Copy link
Contributor Author

To help avoid malloc mid-run, yet allow it at init and exit, as we're doing with drmemtrace (#2006), I'm adding a feature where a client library can request that the private loader, in debug build, complain if malloc & co. are called outside of init or exit.

@derekbruening
Copy link
Contributor Author

To elaborate: drmemtrace's use of C++ and droption incurs malloc and free calls at init time (libstdc++ lib init, droption static initializers, droption_parser_t::parse_argv) and exit time (droption). We would have to rewrite the tracer in C and not use droption to avoid these.

derekbruening added a commit that referenced this issue Feb 28, 2018
Adds a feature where a client library can request that the private loader
complain if malloc & co. are called at any time other than process init or
exit, to help clients that want to support being linked statically with the
app.

Fixes drcachesim to use placement new for its offline custom module data
allocations.

Issue: #975, #2006
@derekbruening
Copy link
Contributor Author

I want to avoid changing the hot redirect_malloc() path so I'm using a var decl to request these static import checks. Yet, the checks fire in drcachesim with -use_physical, so we need a way to disable them dynanically. We can't switch from the var decl to pure runtime b/c client code runs too
late to change imports. I don't want to add conditionals to the main
hot-path redirect_malloc(). How about a combo: we add
dr_allow_unsafe_static_behavior() which overrides the decl?

derekbruening added a commit that referenced this issue Feb 28, 2018
Adds a feature where a client library can request that the private loader
complain if malloc & co. are called at any time other than process init or
exit, to help clients that want to support being linked statically with the
app.  Because it has to be early, the feature is triggered by a variable
declaration DR_DISALLOW_UNSAFE_STATIC.  It can be overridden
dynamically by a new API routine dr_allow_unsafe_static_behavior().

Fixes drcachesim to use placement new for its offline custom module data
allocations.

Issue: #975, #2006
derekbruening added a commit that referenced this issue Mar 1, 2018
Fixes a Mac build error introduced by 3bc3315.
Fixes a Mac test build error that shows up on newer toolchains.

Issue: #975
derekbruening added a commit that referenced this issue Mar 1, 2018
Fixes a Mac build error introduced by 3bc3315.
Fixes a Mac test build error that shows up on newer toolchains.

Issue: #975
derekbruening added a commit that referenced this issue Jul 12, 2018
Adds support for calling dr_app_setup();dr_app_stop_and_cleanup() with
no start in between.  This is useful to use DR as a decode/encode
library when it's statically linked and also used for instrumentation,
as that setup precludes using drdecodelib, which relies on redirecting
heap allocation via name redirection.

Add a test to api.static_noclient.

Issue: #975
derekbruening added a commit that referenced this issue Jul 12, 2018
Adds support for calling dr_app_setup();dr_app_stop_and_cleanup() with
no start in between.  This is useful to use DR as a decode/encode
library when it's statically linked and also used for instrumentation,
as that setup precludes using drdecodelib, which relies on redirecting
heap allocation via name redirection.

Adds a test to api.static_noclient.  Unfortunately this hits #2040 on Windows
and we disable it there.

Issue: #975
derekbruening added a commit that referenced this issue Mar 11, 2019
The static DR libs are quite large and an enclosing project (such as
Dr. Memory) may not want to include them.  We add a new variable
DO_DR_INSTALL_STATIC_DR here to control this behavior.

Issue: #975
derekbruening added a commit that referenced this issue Jun 27, 2019
Adds dr_standalone_exit() with support for subsequent DR use,
including full attach.  This is a better solution than setup;detach
support added in de99d45 for the use case of performing DR
decode/encode with DR statically linked (which precludes using
drdecodelib), followed by a separate use of DR for instrumentation.

Updates the test of this use case in api.static_noclient.

Issue: #975
derekbruening added a commit that referenced this issue Jun 27, 2019
Adds dr_standalone_exit() with support for subsequent DR use,
including full attach.  This is a better solution than setup;detach
support added in de99d45 for the use case of performing DR
decode/encode with DR statically linked (which precludes using
drdecodelib), followed by a separate use of DR for instrumentation.

Updates the test of this use case in api.static_noclient.

Issue: #975
derekbruening added a commit that referenced this issue Aug 5, 2020
Resets a flag used to auto-initialize when heap routines are called
without an explicit dr_standalone_init(), enable drdecode and other
use cases.  Without this reset, a prior dr_standalone_exit() can cause
a subsequent crash when using implicit-initialization interfaces.

Issue: #975
derekbruening added a commit that referenced this issue Aug 5, 2020
Resets a flag used to auto-initialize when heap routines are called
without an explicit dr_standalone_init(), enable drdecode and other
use cases.  Without this reset, a prior dr_standalone_exit() can cause
a subsequent crash when using implicit-initialization interfaces.

Issue: #975
derekbruening added a commit that referenced this issue Aug 17, 2020
A std::unordered_set, even using dr_allocator_t, raises transparency
risks when statically linked on Windows (from lock functions and other
non-allocator resources).  We thus create our own resource-isolated
class to track GPR register inclusion for drmemtrace elision
optimizations.

Adds unit tests of the set class to burst_traceopts.

Further tested manually by ensuring the same number of elided
addresses is seen as with std::unordered_set in the burst_traceopts
test:
  [drmemtrace]: Reconstructed 622 elided addresses.

Even further tested by ensuring that raw2trace using the new set
correctly operates on a raw file that was created using
std::unordered_set.

Issue: #975, #4403
derekbruening added a commit that referenced this issue Aug 18, 2020
A std::unordered_set, even using dr_allocator_t, raises transparency
risks when statically linked on Windows (from lock functions and other
non-allocator resources).  We thus create our own resource-isolated
class to track GPR register inclusion for drmemtrace elision
optimizations.

Adds unit tests of the set class, which are linked into burst_traceopts.

Further tested manually by ensuring the same number of elided
addresses is seen as with std::unordered_set in the burst_traceopts
test:
  [drmemtrace]: Reconstructed 622 elided addresses.

Even further tested by ensuring that raw2trace using the new set
correctly operates on a raw file that was created using
std::unordered_set.

Issue: #975, #4403
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant