New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Identify statically-linked symbols that are needed by modules #3683
Conversation
mesonbuild/scripts/undefsymbols.py
Outdated
| # libraries into the main program. | ||
|
|
||
| import os, sys | ||
| from .. import mesonlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Flake8]
[F401] '..mesonlib' imported but unused
mesonbuild/scripts/undefsymbols.py
Outdated
| # touched. This information is used to include functions from static | ||
| # libraries into the main program. | ||
|
|
||
| import os, sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Flake8]
[F401] 'os' imported but unused
07ad5bf
to
7eb1cfa
Compare
mesonbuild/backend/ninjabackend.py
Outdated
| self.generate_shsym(outfile, target) | ||
| self.generate_shsym(rust + crstr, outfile, target) | ||
| if isinstance(target, build.SharedModule): | ||
| self.generate_undefsym(rust + crst, outfile, target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Flake8]
[F821] undefined name 'rust'
mesonbuild/backend/ninjabackend.py
Outdated
| @@ -1332,7 +1326,9 @@ def generate_rust_target(self, target, outfile): | |||
| element.add_item('cratetype', cratetype) | |||
| element.write(outfile) | |||
| if isinstance(target, build.SharedLibrary): | |||
| self.generate_shsym(outfile, target) | |||
| self.generate_shsym(rust + crstr, outfile, target) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Flake8]
[F821] undefined name 'rust'
82c3562
to
688f379
Compare
mesonbuild/backend/ninjabackend.py
Outdated
| self.generate_shsym(outfile, target) | ||
| self.generate_shsym('rust' + crstr, outfile, target) | ||
| if isinstance(target, build.SharedModule): | ||
| self.generate_undefsym('rust' + crst, outfile, target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Flake8]
[F821] undefined name 'crst'
| # libraries into the main program. | ||
|
|
||
| import sys | ||
| from ..mesonlib import Popen_safe |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
mesonbuild/compilers/compilers.py
Outdated
| @@ -49,7 +49,7 @@ | |||
| c_suffixes = lang_suffixes['c'] + ('h',) | |||
| # List of languages that can be linked with C code directly by the linker | |||
| # used in build.py:process_compilers() and build.py:get_dynamic_linker() | |||
| clike_langs = ('d', 'objcpp', 'cpp', 'objc', 'c', 'fortran',) | |||
| clike_langs = ('d', 'objcpp', 'cpp', 'objc', 'c', 'fortran', 'vala',) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please open a separate PR for this, it's bad practice to mix commits from unrelated changes together. It also means that your work will get pushed faster.
Please include the sentence "Fixes #791" in your commit message so the issue is closed automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I was planning to do so but I wasn't sure how to get CI to work correctly on this PR. Thanks!
cf33dec
to
0c3587b
Compare
|
I think I will reimplement this after adding the above feature to Ninja. However the main ideas (scripts, rules) will be the same. |
Do you have an estimate for how much bloat this adds?
When would this happen? And also, how would linking a subset of the library then work in the same case? |
For one of the binaries built by QEMU, which only links with glib (libglib-2.0.a and libgthread-2.0.a) and glibc: So the extra size is about 12% of the binary. The space taken by glib itself in the binary grows by a factor of 2, from ~630 KiB to ~1380 KiB.
In the case of QEMU, there is a "utility functions" static library used by both the emulators and the other smaller programs distributed with QEMU. Whenever some functionality is only required in the emulators, the static library includes stubs for use in the smaller tools, while the emulators override the stubs. whole_archive would cause duplicate definitions in the emulators, and linking would fail. In the past we tried other mechanisms such as weak symbols, but static libraries turned out to be the easiest and most portable way to implement this kind of stubbing. |
These are needed so that the correct set of files from static libraries is included in the executable. At build time, ninja will first collect undefined symbols from the modules, then build the executable and the import library, then finally link the module.
Instead of passing a $CROSS variable in the "build" directive, define separate rules. This introduces some duplication in the rules, but removes it from the targets because the --cross-host=... option depends only on whether the target is native or cross-compiled. Previously, it was specified via $CROSS once per shared library or module. While at it, extract the code function to generate SHSYM rules into a separate function, and generalize the function that generates SHSYM "build" directives. We will use them shortly to generate a list of undefined symbols in the static library. That usage in turn will actually require to have separate rules for each compiler.
d575f0a
to
f6bc04e
Compare
Even though shared modules are more or less incompatible with statically linked binaries, it is of course possible to link static libraries into a binary that uses shared modules. Whenever an executable is statically linked with a library, the linker omits library functions that are not used by the executable. However, this could include functions that are not used by the executable, but are needed by shared modules. When this happens, the shared module will fail to load. One common workaround in that case is to use link_whole, but that is not always possible and might seriously bloat the executable. This commit implements an alternative solution, inspired by the operation of QEMU's Makefiles. Module object files are processed (for example with "nm") to find undefined symbols, and the undefined symbols are included in the executable through the linker's "-u" option. (There exists equivalents in the MSVC world for both steps, even though they are not implemented yet). This commit adds the logic to the generic code and to the ninja backend. The next commit is where we actually start to use "nm". Note that, unlike symbolextractor, this is not just an optimization. However, if for example nm is not found in the toolchain file for cross-compilation, behavior is unchanged compared to before this series: a module that has dependencies on a statically-linked library will fail to link, but everything else will work normally.
Finally, this patch adds support to undefsymbols for the nm utility.
Linker option -u is incompatible with -bitcode_bundle. Recognize this and avoid using the .undef files on the linker command line. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
Looks like all CI tests pass now. Sorry for the noise! |
|
Am I correct in reading the code that this is to adding a link argument like I'm concerned about adding something automatic like this for what is a fairly niche use case. One of the things we've learned that changes like these always break someone. |
Yes. Note however that What do you suggest? A built-in option? |
|
Well, AIUI this can be done without these changes with something like this: required_symbols_args = ['-Wl,--require-defined=bar', ...]
executable(...,
link_args: required_symbol_args)The downside being that you don't get the list of symbols automatically. On the other hand listing them out is more explicit which some people consider to be an advantage. I guess it depends how often the list of symbols changes... |
It can change quite often. With our current build system it's totally transparent so I don't know how often it would change, but it would likely be a blocker for switching to Meson. We never had a compatibility problem. I guess it would be okay to keep the logic in the build system instead of moving it to Meson. There are two issues for this:
but there is no way for |
|
Well one way of doing this currently, but which granted is a bit hacky is to do something like: # UNDEF_DEFINITION START
<add here all the link args needed>
# UNDEF_DEFINITION END
run_target(# a script that takes the files as input, opens this file and updates the linker
# command list above
)This would not be automatic (you'd need to execute the run target manually when symbols change) but it would work for all backends and cases. |
|
The set of symbols depends on the target (e.g. Linux/Windows/Mac) and especially for libraries such as glib it's not very predictable when it changes. The same patch, backported to a stable branch, might necessitate adding a symbol or two. I guess CI would catch mistakes but still I don't think it would be acceptable for other developers. The only solution I see is being able to add implicit dependencies to the link rule via |
|
(Of course the manual solution was already considered and discarded when we came up with the current scheme using nm). |
|
Ok, I got it to work. The trick is: 1) to build the modules in two phases using For the full solution, I would use |
|
I actually woke up last night and then immediately realized that
If you have set up nm in the cross file (which you should) then doing |
|
I managed to build the join of undefs & defs as a custom target, but I wonder what tricks I could use to transform the result (content of target file) as link_args. I wish I could have a custom_target().stdout() for example. (similar to run_command() run result object). Any idea? |
|
You can use |
Even though shared modules are more or less incompatible with statically linked binaries (glibc's support for libdl inside statically linked binaries, for example, is very limited), it is of course possible to link static libraries into a binary that uses shared modules. These could be either external packages, or libraries that are internal to the program being built.
Whenever an executable is statically linked with a library, the linker omits library functions that are not used by the executable. However, some of these omitted library functions, while not used by the executable, might be needed by shared modules. When this happens, the shared module will fail to load.
Because the library is statically linked, adding it to the module may not do the right thing. The most common workaround in that case is to use link_whole, but that might seriously bloat the executable and might not even always work. This pull request implements an alternative solution. It is inspired by the operation of QEMU's Makefiles, which I'm looking at for a conversion to Meson.
Module object files are processed with "nm" to find undefined symbols, and the undefined symbols are included in the executable through the linker's "-u" option. There exists equivalents in the MSVC world for both steps, but they are not implemented yet. The process is automatic and guided entirely by syntax that is already present in the Meson language.
Note that, unlike symbolextractor, this is not just an optimization. However, if for example nm is not found in the toolchain file for cross-compilation, behavior is unchanged compared to before this pull request: a module that has dependencies on a statically-linked library will fail to link, but apart from this special case everything will work normally.