Catch-all fallback for libs detection #10621
Conversation
Another question to ask ourselves: how many packages are non-standard? If only 10-15 out of 2000, then one might as well provide a custom `libs` handler. But regardless, let's wait for others to comment.
(Branch force-pushed from fed9fae to 877c7b2.)
I can answer this question for the set of 98 packages that I have built recently, but maybe someone has a larger Spack package set to run this experiment on?
So, here is the answer to @davydden on my Spack installation. Using the following little Python script...

```python
#!/usr/bin/env python3
#
# Run this script in your Spack prefix to evaluate how badly the
# _libs_default_handler would mess with your library packages.

import os
import re

# We treat as a library every file whose name follows this regular expression
lib_regex_pattern = r"^lib{0}\.(l?a|so)$"

# List all folders which contain libraries, and libraries inside each of them
lib_list = []
lib_regex = re.compile(lib_regex_pattern.format('.*'))
for path, _dirs, files in os.walk('opt/spack'):
    lib_file_list = None
    for file in files:
        if lib_regex.match(file):
            if lib_file_list is None:
                lib_list.append((path, []))
                lib_file_list = lib_list[-1][1]
            lib_file_list.append(file)

# Check which library paths would be excluded by the default libs handler
altered_lib_paths = []
discarded_lib_paths = []
for lib_path, lib_files in lib_list:
    # Extract the package name (assumes the usual
    # opt/spack/<platform>/<compiler>/<package>-<version>-<hash> layout)
    package_version = lib_path.split('/')[4]
    package = '-'.join(package_version.split('-')[:-2])

    # Build a regex matching the default libs handler's name expectations
    package_libname = package
    if package_libname.startswith('lib'):
        package_libname = package_libname[3:]
    package_lib_re = lib_regex_pattern.format(package_libname.replace('-', '.'))
    package_lib_regex = re.compile(package_lib_re)

    # Check which libraries would be rejected by these expectations
    altered_path = False
    discarded_path = True
    for lib_name in lib_files:
        if not package_lib_regex.match(lib_name):
            if not altered_path:
                print('In library path', lib_path, 'from package', package)
                altered_lib_paths.append(lib_path)
                altered_path = True
            print('- Library', lib_name, 'would be excluded from libs()')
        else:
            discarded_path = False

    # Would this path still be recognized as a library path?
    if discarded_path:
        print('- This would exclude the entire library path from libs()!')
        discarded_lib_paths.append(lib_path)

    # Put breathing space between reports
    if altered_path:
        print()

# Print our conclusions
print('Overall, of', len(lib_list), 'library paths found in the Spack prefix')
print('-', len(altered_lib_paths), 'would have missing libraries in libs()')
print('-', len(discarded_lib_paths), 'would be discarded from libs() entirely')
```

I estimated that among my 98 installed Spack packages, there are 67 library paths. 51 of these would see some of their libraries ignored by the current default libs() handler, among which 39 would be discarded entirely, potentially breaking the package due to a missing RPATH. Therefore, it seems that for the libraries that I use at least, the failure rate of the default libs() handler is very high.

Note that this script does not call into the actual Spack library search routines, but only matches my understanding of what they do. Therefore, if I misunderstood something, it may have false negatives and false positives here and there. However, looking at the log, most detected failures of the default libs() handler seem legit to me.
Overall I think your suggestion makes perfect sense... except in the case of external packages. This would definitely be a disaster for external packages, and would lead to a lot of difficult-to-diagnose build failures that aren't reproducible for others. We could try to hack something where we only add ...

I've been away from Spack for a while. Did we ever merge those PRs that suggested replacing ...?
We did #8136, and frankly I am happy that we did so. I am sure we can figure out issues like this one.
How would one detect if a package is internal or external in the default libs handler? Is that exposed by the spec object?
^ @alalazo would know
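For context, a sketch of how the handler could branch on such a flag is shown below. `Spec` here is a minimal stand-in for Spack's real class (which, if memory serves, exposes an `external` attribute), and `default_libs` is entirely hypothetical:

```python
# Sketch: guard the catch-all fallback on spec.external.
# "Spec" is a toy stand-in for Spack's Spec class; only the `external`
# attribute matters for this illustration.
import glob
import os

class Spec:
    def __init__(self, prefix, external=False):
        self.prefix = prefix
        self.external = external

def default_libs(spec):
    """Name-based search first; catch-all only for non-external packages."""
    # Crude package-name guess from the last prefix component
    name = spec.prefix.rstrip('/').split('/')[-1]
    if name.startswith('lib'):
        name = name[3:]
    hits = glob.glob(os.path.join(spec.prefix, 'lib', 'lib%s.*' % name))
    if hits or spec.external:
        # External prefixes (e.g. /usr) may hold many unrelated libraries,
        # so never fall back to enumerating everything there.
        return sorted(hits)
    # Internal package: the prefix belongs to this package alone, so a
    # catch-all enumeration of its libraries is safe.
    return sorted(glob.glob(os.path.join(spec.prefix, 'lib', 'lib*.so')))
```

With this guard, a Spack-built `gettext` would get `libintl` via the catch-all, while an external `gettext` registered at a shared prefix would simply get an empty list rather than every library in that prefix.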
(Branch force-pushed from 877c7b2 to e9e3217.)
@scheibelp Thanks for the explanation! The latest commit implements the policy suggested by @adamjstewart of only enabling the catch-all fallback for internal packages.
I think we overlooked one detail. The previous heuristic looked for a single library (modulo the comment over ...).

@HadrienG2 Can you write "realistic" unit tests for this, based on a couple of packages that are considered relevant? You can mock their installation directory using ...
@alalazo I think I need to understand better why the order of libs() should matter to Spack. The only circumstance I can think of is when passing libs arguments to a linker, because linkers never underwent a proper redesign since the single-pass punch-card era. However, this is not the job of Spack, but that of the build system of dependent packages. Hand-written Makefiles, CMake modules, etc. which depend on libraries will take care of listing library dependencies in the right order; all they need from Spack is a list of library search paths to look into. As far as I know, Spack does not try to replace these components, and therefore does not need to know the libraries' dependency order.

Even if it did, it could not generate a proper dependency list anyway, because it does not know which libs a dependent package links to. All Spack could do without additional metadata from both the library package and the dependent package is to provide a catch-all libs list, which would be hopelessly pessimistic for large modular frameworks like Qt or ROOT. Can you explain to me in which circumstance the order of Spack's libs() list should matter?

As for unit tests, considering that the Spack developer documentation on them is empty and that I've never worked on a large Python project before, I could indeed use a quick tour of what is needed to add a new test to Spack, which useful facilities I should know about, and some pointers to similar existing tests that I could use as a template.
That's exactly the use case I had in mind. Spack doesn't try to be a substitute for the build system underneath, but sometimes needs to know how to link to a library in a certain configuration. This is particularly true for hand-written Makefiles; in many cases the upstream authors expect that a user will edit the Makefile to supply proper linking lines. You can check ...
No, but packages in Spack can reply to queries. The dependent makes a query, the package replies with the correct list of libraries. This also works through virtual dependencies, so you can say:

```python
mpi_cxx_libs = spec['mpi:cxx'].libs
```

in a dependent and you'll get the list of libraries that are needed to link to the (deprecated) C++ interface of the underlying MPI.
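Such a query returns a library list object that can also be rendered into linker flags. The toy class below sketches that idea; the property names are modeled after Spack's `LibraryList`, but this is an illustrative approximation rather than Spack's actual code:

```python
import os

class LibraryList:
    """Simplified stand-in for a list of library files with flag rendering."""
    def __init__(self, files):
        self.files = list(files)

    @property
    def directories(self):
        # Unique parent directories, in first-seen order
        seen = []
        for f in self.files:
            d = os.path.dirname(f)
            if d not in seen:
                seen.append(d)
        return seen

    @property
    def names(self):
        # 'libfoo.so' -> 'foo'
        out = []
        for f in self.files:
            base = os.path.basename(f).split('.')[0]
            out.append(base[3:] if base.startswith('lib') else base)
        return out

    @property
    def search_flags(self):
        return ' '.join('-L%s' % d for d in self.directories)

    @property
    def link_flags(self):
        return ' '.join('-l%s' % n for n in self.names)

    @property
    def ld_flags(self):
        return self.search_flags + ' ' + self.link_flags

libs = LibraryList(['/opt/mpi/lib/libmpi_cxx.so', '/opt/mpi/lib/libmpi.so'])
print(libs.ld_flags)  # -L/opt/mpi/lib -lmpi_cxx -lmpi
```

Note how the emitted `-l` flags preserve the list's order, which is exactly where a dependency-ordered libs() result would matter to a single-pass linker.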
Spack uses pytest underneath. You can run the current unit tests on your platform by giving the following command:

```console
$ spack test
```

An example fixture is in spack/lib/spack/spack/test/conftest.py (lines 719 to 741 at commit b721338), while an example of a test in the same PR that uses the fixture is in spack/lib/spack/spack/test/llnl/util/filesystem.py (lines 220 to 239 at commit b721338).
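As a sketch of what such a "realistic" unit test could look like, the snippet below uses pytest's `tmp_path` fixture to mock a gettext-like installation prefix; `find_libraries_in_prefix` is a hypothetical stand-in for the catch-all handler, not a real Spack function:

```python
import glob
import os

def find_libraries_in_prefix(prefix):
    """Hypothetical catch-all: list every lib*.a / lib*.so under prefix."""
    patterns = [os.path.join(prefix, '**', 'lib*.a'),
                os.path.join(prefix, '**', 'lib*.so')]
    found = []
    for pattern in patterns:
        found.extend(glob.glob(pattern, recursive=True))
    return sorted(found)

def test_gettext_layout(tmp_path):
    # Mock gettext's installation directory: no libgettext, but libintl
    # and a libgettextlib, which the name-based heuristic would miss
    libdir = tmp_path / 'lib'
    libdir.mkdir()
    for name in ('libintl.so', 'libgettextlib.so'):
        (libdir / name).write_text('')
    libs = find_libraries_in_prefix(str(tmp_path))
    assert [os.path.basename(l) for l in libs] == \
        ['libgettextlib.so', 'libintl.so']
```

Under pytest, `tmp_path` is injected automatically as a fresh per-test `pathlib.Path`, so the mocked prefix never leaks between tests.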
@alalazo Thanks for the explanations. I see the use case for a dependency-ordered libs() list better now, but still think it is a bit too specialized to warrant penalizing library package writers over the lack thereof (by having them reverse-engineer and write a dependency-ordered list in their package.py). A quick grep for "def libs(" in Spack's built-in repo fuels this suspicion, with only 47 packages out of ~3000 overriding the default libs handler (as flawed as it may currently be). Therefore, to answer your original question...
I think this should not be considered a fatal error, but at best an optional and off-by-default warning. This way, our message for library package writers becomes a lot friendlier: "Spack has a default libs() implementation which will produce an unordered list of all libraries exported by the package. If you export at most one library, if the libraries which you export are independent, if you expect dependent packages to use a reasonably modern build system such as CMake, SCons or Meson, or if dependent packages hardcode their linker flags, then you may safely rely on this default implementation. If, on the other hand, you belong to the minority of packages that care about library order because the libs() list is used in dependent packages in order to produce a linker command line, you will need to write your own libs() handler, which exposes the library list in dependency order." What do you think? (Meanwhile, I will start thinking about some good unit tests to add to this PR.)
I would write something much shorter, as it might show up for multiple packages in a DAG, but I like the idea of a warning. This will still give us pointers to packages that potentially need a custom `libs` handler.
(Branch force-pushed from fe3b472 to 3e61be6.)
Ah, sorry, I did not intend the earlier prose as something that is actually displayed by spack, but only as a user-side interpretation of the proposed Spack behaviour. Here is something which would be more suitable as a CLI warning (which would only be displayed if a package provides multiple libraries): "By default, package libraries are listed in an arbitrary order. If you intend to use the libs() query as input to a linker, you may need to provide a dependency-ordered override." And then we can link to a documentation page containing a longer-form text explaining...
Users of the libs() query who know in advance that they won't care about library order, such as the link path extractor, should then be able to disable the warning, for example by using a dedicated query or by passing a special ...
(Branch force-pushed from 3e61be6 to 2e6c998.)
@scheibelp @alalazo is anything holding this PR? @scheibelp would like to merge this prior to #10628, whereas #9032 is blocked by #10628.
Two things are holding it right now: it could use some unit tests, and a warning when multiple libs are found. Both can probably be postponed to a later PR if you're in a hurry.
I am not in a hurry, it’s just that the other PR is hanging since September 2018, whereas I prefer to have things done and out of the way.
@scheibelp @alalazo ping.
(Branch force-pushed from 19ad3bf to d0f310e, then to 35c35db, then to 302921c.)
Ping?
@scheibelp ping?
(Branch force-pushed from 08044aa to e0aad58.)
@HadrienG2 could you please resolve conflicts?
(Branch force-pushed from e0aad58 to 2830c12.)
@davydden Done!
(Branch force-pushed from 2830c12 to a574505, then to c596b45, then to c6efed5.)
@scheibelp ping?
This has been waiting long enough and I am tired of perpetually rebasing it, so I'll be closing it. I'll keep the branch around for a while on my repo in case someone feels like taking it over.
The `_libs_default_handler` implicitly assumes that a library package called `(lib)?Something` will always produce a library called `libSomething`. This is not always the case; here are some counter-examples:

- `gettext` produces several `libgettextSomething`s and a `libintl`, but no `libgettext`
- `intel-Something` produces `libSomething`, without the `intel` prefix

Currently, the result is that these libraries will not be correctly added to the RPATH, leading to problems like #10617. One possibility is to override libs() for every such package, but that gets tedious quickly. As an alternative, this PR proposes to have the default libs handler try harder and enumerate every library in the package's Spack prefix instead.
I am quite convinced that this is a decent heuristic for library packages that are built using Spack, as the prefix will be specific to that package. But I am not sure whether this heuristic works, or can be made to work, for external packages, where many libraries can end up sharing a common prefix. Extra review on this specific point would be most welcome.
See also #10617 for prior discussion.
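As a rough illustration of the proposed fallback, the sketch below enumerates every library under a prefix and derives the RPATH directories from it. This is a stand-alone approximation for discussion, not the actual implementation in this PR:

```python
import glob
import os

def catchall_rpath_dirs(prefix):
    """Enumerate every library under `prefix` and return the unique,
    sorted set of directories that dependents would need on their RPATH."""
    libs = []
    for pattern in ('lib*.so', 'lib*.a', 'lib*.la'):
        # '**' with recursive=True also matches libraries nested in
        # subdirectories of lib/, not just the top-level lib/ folder
        libs += glob.glob(os.path.join(prefix, '**', pattern), recursive=True)
    return sorted({os.path.dirname(lib) for lib in libs})
```

On a gettext-like prefix holding, say, `lib/libintl.so` and `lib64/libfoo.a` (hypothetical layout), this returns both `lib` and `lib64` directories, whereas a name-based search for `libgettext.*` would find nothing and leave the RPATH incomplete.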