-
Notifications
You must be signed in to change notification settings - Fork 859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change MCA component build style default to static #8132
Change MCA component build style default to static #8132
Conversation
@jsquyres You had brought this up a couple months ago. I'm not sure I have strong preferences on whether or not we should take this patch, but here's how we should make the change in component build style. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do remember bringing this up a few months ago (I think after talking with @rhc54 about it...?). I don't remember if there was a reason that I didn't end up filing a PR like this (e.g., a technical or UX reason). ...or were we waiting for The Great 3rd Party Updates to be completed before doing this?
Can one of the admins verify this patch? |
25b088b
to
3830582
Compare
@jsquyres my recollection was that I said I'd look into it and wasn't sure how hard it would be. Turns out, not very :). |
d2e44ac
to
e68a639
Compare
@jsquyres your readme changes look good to me. Thanks! |
Do we need/want to port this to PMIx and PRRTE? |
does it mean that if one component depends on external shared library which is not available, then OpenMPI would not be able to ignore just this one component? |
Yes. Most of our customers don't seem to do this. For those that do, adding --enable-mca-dso= is probably the recommended course of action. |
If we think this is the right default behavior, yes. I think there's an open question about whether we want to do this. There was a question about ability to make the change, and I wanted to show that it wasn't a hard change... |
Both of these concerns are legit (reducing the IO traffic and not breaking loading the OMPI library on missing external dependencies). We can take a middle-ground approach, where we allow components with no external dependencies (BTL sm, tcp, some mpools) to be statically linked in, while everything that has external dependencies remains as dynamic libraries (except when specificaly overwritten with the option described above). We can select the desired behavior either in the configure.m4 or as a file in the component directory (similar to .ignore). |
This is a good idea. I don't know if I want this to always be the behavior -- e.g., for the Cisco distribution of Open MPI, I'm fine if the core Open MPI libraries depend on libfabric. But a generic Linux distribution may make a different choice, for example. Perhaps |
@jsquyres suggestion sounds good to me |
TL;DR : I'd propose keeping treematch in its own private I have a testcase proposal And I started a PR to change a bunch of symbols to static and/or add But there were objections at least to modifying the treematch MCA. And that MCA in particular is littered with unreasonably generic exported symbol names, like So for this PR I'd at least propose treematch be kept in its own |
We had a discussion today about this PR, and I just wanted to clarify something for myself. We are not eliminating the DSO mechanism, just changing the default for (most or all) of the components in the tree. So I can still build a DSO an drop it into the DSO lib search path and it will try to include it in the framework component loading. Is that correct? I say this because it is common for us to ship components that, depending on the client's system configuration, that component may not load because of missing libraries. That's ok and we depend on Open MPI's fallback to the next best component in that case. This is another instance where we never want this component to be linked against libmpi.so since the dependent library may not be present and we still want to run. |
@jjhursey that is one of the proposals on the table - but not the only one. 😄 |
@rhc54 and I talked about this on the phone today:
In short:
|
Is there a timeout by when we want to try to merge this PR? Just to put a time box on the discussion. |
No, not really. The main issue I think is that if we want to do this, we should give our sister projects time to follow our lead (i.e., PMIx / PRRTE). |
After I posted yesterday, I re-read everything more carefully and was reminded of the treematch symbol pollution issues. That should be addressed; we don't want poorly-named symbols polluting the MPI application namespace. @jjhursey and I talked through this a bit on the phone today. Things I'll look into:
I'll check these things out. |
We started a good discussion on the 1/12 call, but ran out of time, so we'll tackle this earlier on 1/19 call for those who are interested. |
Failed to get to this on 1/19 due to release branch discussions. |
The symbol pollution issue is a real one, and we should figure it out. But keep in mind that the symbol pollution problem already exists and isn't because of this PR. This PR makes the effects of it more pronounced, but the problem already exists. |
On the option @jsquyres listed about what if treematch was its own separate library and libmpi was linked against it, I think that still exposes the end user to all the genericly named symbols in libtreeematch.so. We'd be able to say "hey, it's not OUR fault, those symbols aren't in our code" but the user experience would still be bad. If we dlopened libtreematch.so and accessed everything in it via function pointer I believe that would be safe. In the olden days Platform MPI did that for several external libraries like libibverbs. I'd still lean toward the rename.h approach though |
@jsquyres also to put on your radar screen, a proposed testcase for generic symbol names being exported: open-mpi/ompi-tests-public#3 It currently accepts a pretty broad list of prefixes. And even though there might be discussions still about whether to fix this with prefixes or by going all in on DECLSPEC and only exporting a strict list, the testcase wouldn't care about that, it just looks at the results. |
e68a639
to
fb5f2a1
Compare
Jumping back into this conversation... We load DSOs into the global scope today (because reasons) and allow customers to build as static libraries, so it seems like we should just fix treematch, rather than screw with defaults. I think what George is suggesting with allowing components to set a default might make sense, although it's more work than I can commit to doing right now (particularly chasing through setting a proper default for all the components). To clarify for Josh, this does not change the ability for Open MPI to search for components built out of tree (or similar). This only changes whether components are built as static or dso libraries by default. For at least 10 years, we've split the logic between looking for dsos (the |
opal_mca.m4 was setting enable_dlopen after it was processed, and the dlopen argument checking was overriding configure arguments that belong to opal_mca. Clean up the late change in enable_dlopen and handle the --disable-dlopen case in opal_mca instead of overriding user options. Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Remove an environment variable that probably did something at one point, but is dead code now. Git history is hard to follow due to some reverts in the SVN days. Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Default to building MCA components into the library, rather than as dso objects, since most users are not taking advantage of per-library packaging, and this results in less stat/open/etc. calls during application startup. Clean up the static/dso selection logic for componets at the same time. Before, it was relatively hard to describe what would happen if there was a mix of --enable-mca-dso and --enable-mca-static arguments. Now, the code will first look for a component-specific argument, then a framework-level argument, then a global argument. If there is a tie, static is preferred. Update the NEWS and README.md to document the changes. Jeff Squyres wrote the original update to the README file, which was adopted to the new README.md format. Signed-off-by: Brian Barrett <bbarrett@amazon.com>
fb5f2a1
to
930260c
Compare
@bwbarrett - This looks ready to go. Please PR to v5.0 after merging. |
This patch series primarily changes the default component build style from dso to static. At the same time, it cleans up some minor nits and clarifies the selection logic when there are conflicting --enable-mca-dso and --enable-mca-static arguments.