-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test package detection in a systematic way #18175
Test package detection in a systematic way #18175
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few questions. Also I think a function in package.py
would be easier to use than a domain specific language. More details in the comments.
This YAML file contains enough information for Spack to mock an environment | ||
and try to check if the detection logic yields the results that are expected. | ||
|
||
As a general rule, attributes at the top-level of ``detection_test.yaml`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about having a domain-specific language vs. adding functions to the package itself - why not have the user write a function in package.py
?
def test_inspections(self):
bin1 = 'bin/clang-3.9'
script1 = """\
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)"
echo "Target: x86_64-pc-linux-gnu"
echo "Thread model: posix"
echo "InstalledDir: /usr/bin"
"""
bin2 = ...
result2 = ...
specs = ['llvm@3.9.1 +clang~lld~lldb']
return [(bin1, result1), (bin2, result2)], specs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this as a domain specific language, more like an additional YAML file that packagers can provide to specify tests for package detection and aggregate knowledge on different cases. Given that:
- Our users in general need to deal with YAML files a lot (so they are used to them)
- This particular file is not mandatory and is documented
I wouldn't consider this much of an issue. Maybe there's a better section to put documentation on it rather than the one I used?
A few considerations. The YAML file has no behavior, only static information to setup a unit test on a core function of the:
$ spack external find ...
command. The logic for this unit test is the same for all packages. In some sense the information in the YAML file is of the same kind of what we can see in a parametrized fixture for other tests. The only difference is that here this information relates to different packages and is more verbose, so I decided to put it in a YAML file alongside the corresponding package.py
rather than aggregating it in a single test file.
There's also a thing that worries me in taking the approach of an additional method. The package class is already overcharged with a lot of different responsibilities and I don't think it would help readability or maintenance to also add the responsibility of unit testing its own detection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this as a domain specific language, more like an additional YAML file that packagers can provide
Our users in general need to deal with YAML files a lot (so they are used to them)
I think what you are asking is: “if we have config files, then why do you think this is bad?” Overall I would say the problem is that this doesn’t provide any of the benefits that config files typically provide to Spack users:
Users that write detection_test.yaml
have to be familiar with the detection API and the specific implementation of the detection function in the associated package.py
file. Users editing config files don’t need to know Spack internals, but users writing detection_test.yaml
do.
Generally we keep config files as key/value stores or otherwise make them as simple as possible. Even the most complex config files we have in spack are entries which are collections of properties. detection_test.yaml
specifies properties (like file locations) along with behavior (scripts) and outcomes (the expected spec output): that is more-involved than any config file we have. This also forces users to be familar with YAML syntax constraints which don’t appear in other config files, like the text block specification:
output: |
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)"
echo "Target: x86_64-pc-linux-gnu"
echo "Thread model: posix"
echo "InstalledDir: /usr/bin"
Since the user has to understand (and in fact, coordinate with) the internals (i.e. the external detection function in package.py
), I think this behavior should be specified as a function rather than a YAML file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you are asking is: “if we have config files, then why do you think this is bad?” Overall I would say the problem is that this doesn’t provide any of the benefits that config files typically provide to Spack users:
That's not the point. As I tried to explain above this YAML is not configuration - it's data fed into a unit test. I concur this part is true:
Users that write detection_test.yaml have to be familiar with the detection API and the specific implementation of the detection function in the associated package.py file.
even though I don't see that as an issue, but then I don't understand the alternative solution of adding a method to the package. It shares the same charateristics (be familiar with the detection API and specific implementation of other functions in the package) AND it couples the package class with unit tests on it. Overall it seems to me no concern would be solved but more would be added.
TL;DR I pushed a modification in 55335df that moves the YAML files into a subfolder of the directory where we keep other unit test data files and moved the docs to the developer section. I hope this could be a good way to mitigate the concern you have over the user facing side of this PR and the one I have over adding an additional function to packages and coupling them to unit tests. The description at the top of the PR has been updated too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it seems to me no concern would be solved but more would be added.
I want to avoid forcing users to switch between paradigms: if a user needs to be familiar with the external detection implementation of a package, then they will be reading Python. I don't want to force them to switch between Python and a YAML-based specification of behavior - I think they might as well write a Python function.
I don't understand the alternative solution of adding a method to the package. It shares the same characteristics (be familiar with the detection API and specific implementation of other functions in the package)
AND it couples the package class with unit tests on it
I think that is a good thing. They are already coupled, I don't see why separating them into different files changes that.
Put another way: what do you think is wrong with placing the testing logic in the package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the data in the YAML file is of the same kind as this:
spack/lib/spack/spack/test/compilers/detection.py
Lines 27 to 38 in 77a28b8
('Arm C/C++/Fortran Compiler version 19.0 (build number 73) (based on LLVM 7.0.2)\n' # NOQA | |
'Target: aarch64--linux-gnu\n' | |
'Thread model: posix\n' | |
'InstalledDir:\n' | |
'/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_RHEL-7_aarch64-linux/bin\n', # NOQA | |
'19.0.0.73'), | |
('Arm C/C++/Fortran Compiler version 19.3.1 (build number 75) (based on LLVM 7.0.2)\n' # NOQA | |
'Target: aarch64--linux-gnu\n' | |
'Thread model: posix\n' | |
'InstalledDir:\n' | |
'/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_RHEL-7_aarch64-linux/bin\n', # NOQA | |
'19.3.1.75') |
so I'll counter a question with a question: do you think we should move that into the corresponding compiler class to be returned from a method:
class Arm(Compiler):
def input_for_version_detection(self):
return [
('Arm C/C++/Fortran Compiler version 19.0 (build number 73) (based on LLVM 7.0.2)\n' # NOQA
'Target: aarch64--linux-gnu\n'
'Thread model: posix\n'
'InstalledDir:\n'
'/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_RHEL-7_aarch64-linux/bin\n', # NOQA
'19.0.0.73'),
('Arm C/C++/Fortran Compiler version 19.3.1 (build number 75) (based on LLVM 7.0.2)\n' # NOQA
'Target: aarch64--linux-gnu\n'
'Thread model: posix\n'
'InstalledDir:\n'
'/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_RHEL-7_aarch64-linux/bin\n', # NOQA
'19.3.1.75') ]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think is wrong with placing the testing logic in the package?
In the current form the unit test has knowledge of:
- The behavior of
spack external find
(or functions called within that module) - The data needed to mock a filesystem structure and the expected result of a search
and drives the test - the package class doesn't know about the test, as it's not its responsibility.
With a method in the package class we have a unit test that calls a package that knows how the unit test is structured and returns the appropriate input - the unit test knows about the package and the package knows about the unit test and this makes the entire structure more coupled.
Being that 55335df makes it clear that the data is for unit tests and moves it away from packages, would that be good enough for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I'll counter a question with a question: do you think we should move that into the corresponding compiler class to be returned from a method:
Yes: all the compiler classes will eventually be packages anyways, so it would fit my expectation to move the example output into the compiler class (in the same way that I want to move the example package detection output into the package class).
With a method in the package class we have a unit test that calls a package that knows how the unit test is structured and returns the appropriate input - the unit test knows about the package and the package knows about the unit test and this makes the entire structure more coupled.
I disagree with the first point of this and therefore with the conclusion, specifically you say:
With a method in the package class we have a unit test that calls a package that knows how the unit test is structured and returns the appropriate input
Just because the package would need to follow a specific API to return example version output doesn't mean that it needs to know about the structure of the unit test. The API it needs to follow is:
class ExampleOutput(object):
def __init__(self):
self.expected_version = '1.0
self.exe_to_output = {
'bin/exe1': 'version 1.0',
'bin/exe2': '1.0'
...
}
class Package(object):
def version_examples(self):
# Return a set of ExampleOutput objects
return []
But it wouldn't need to know how the unit test works
* - ``layout`` | ||
- Specifies the filesystem tree used for the test | ||
- List of objects | ||
- Yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "Yes" mean here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part renders in the docs as a table. It describes in details the attributes that can be added to the YAML file:
The "Yes" is under the "Required" column, to mean that the field is required. It's in fact a bit silly that now every field is required, but this was done thinking of future extensions to e.g. check that an external attribute is computed in a certain way. That wouldn't be required since not all packages have external attributes. If you think it's of no use at the moment I can remove the entire column.
285e263
to
5cb22f8
Compare
Closing and re-opening this PR to see if doing so will result in reporting the status of the three outstanding checks. |
55335df
to
2a828d6
Compare
2a828d6
to
382bb28
Compare
382bb28
to
1d7d87b
Compare
1d7d87b
to
d56648f
Compare
d56648f
to
12f6484
Compare
I prefer this PR to the alternative implementations |
12f6484
to
1c856f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alalazo sorry about the delay in getting this reviewed.
- layout: | ||
- subdir: [bin] | ||
name: gcc | ||
output: "echo 7.5.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this take "7.5.0"
and have the internal logic figure out how to make it print that output? echo
isn't part of the output we care about at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are cases where we want to mock some specific behavior from an executable. For instance, a gcc
that fails with an error message and an exitcode if called with --version
, but returns some output if called with -dumpversions
. To deal with that I renamed output
to script
, since that seems more appropriate given the nature of that field.
lib/spack/docs/packaging_guide.rst
Outdated
@@ -5388,6 +5388,86 @@ follows: | |||
|
|||
.. _package-lifecycle: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may be in the wrong place now
"""""""""""""""""""""""""" | ||
Tests for PATH inspections | ||
"""""""""""""""""""""""""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we anticipate any other package detection method that we should be including tests for? We have the module-based detection for cray, but (a) I don't think we currently test it, because we can't test on the cray PE, and (b) I think we plan to deprecate that support as soon as we possibly can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fwiw, we still have module based search for Cray and we might add other inspection method too - e.g. maybe querying system package managers etc.
lib/spack/docs/packaging_guide.rst
Outdated
- layout: | ||
- subdir: [bin] | ||
name: clang-3.9 | ||
output: | | ||
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | ||
echo "Target: x86_64-pc-linux-gnu" | ||
echo "Thread model: posix" | ||
echo "InstalledDir: /usr/bin" | ||
- subdir: [bin] | ||
name: clang++-3.9 | ||
output: | | ||
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | ||
echo "Target: x86_64-pc-linux-gnu" | ||
echo "Thread model: posix" | ||
echo "InstalledDir: /usr/bin" | ||
results: | ||
- spec: 'llvm@3.9.1 +clang~lld~lldb' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that this format is very obtuse. Would we lose any information if we switched to something like:
- layout: | |
- subdir: [bin] | |
name: clang-3.9 | |
output: | | |
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | |
echo "Target: x86_64-pc-linux-gnu" | |
echo "Thread model: posix" | |
echo "InstalledDir: /usr/bin" | |
- subdir: [bin] | |
name: clang++-3.9 | |
output: | | |
echo "clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | |
echo "Target: x86_64-pc-linux-gnu" | |
echo "Thread model: posix" | |
echo "InstalledDir: /usr/bin" | |
results: | |
- spec: 'llvm@3.9.1 +clang~lld~lldb' | |
- layout: | |
- executable: bin/clang-3.9 | |
output: | | |
"clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | |
"Target: x86_64-pc-linux-gnu" | |
"Thread model: posix" | |
"InstalledDir: /usr/bin" | |
- executable: bin/clang++-3.9 | |
output: | | |
"clang version 3.9.1-19ubuntu1 (tags/RELEASE_391/rc2)" | |
"Target: x86_64-pc-linux-gnu" | |
"Thread model: posix" | |
"InstalledDir: /usr/bin" | |
specs: | |
- 'llvm@3.9.1 +clang~lld~lldb' |
This would seem to remove unnecessary verbosity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in d40f73f
1c856f3
to
7d54d74
Compare
7ee19e6
to
f6c8e3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is your plan to extend this to test extra_attributes
in a later PR?
Otherwise mostly looks good, change requests noted inline
lib/spack/spack/audit.py
Outdated
def packages_with_detection_tests(): | ||
"""Return the list of packages with a corresponding detection_test.yaml file.""" | ||
import spack.config | ||
import spack.util.path | ||
|
||
# Directories where we have repositories | ||
repo_dirs = [spack.util.path.canonicalize_path(x) for x in spack.config.get("repos")] | ||
|
||
# Compute which files need to be tested | ||
to_be_tested = [] | ||
for repo_dir in repo_dirs: | ||
pattern = os.path.join(repo_dir, "packages", "**", "detection_test.yaml") | ||
pkgs_with_tests = [os.path.basename(os.path.dirname(x)) for x in glob.glob(pattern)] | ||
to_be_tested.extend(pkgs_with_tests) | ||
|
||
return to_be_tested |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will incorrectly flag a package as having a detection_test.yaml if the package is overridden in a higher priority repo and the overriding package does not inherit from the one with the tests (as the tests will now be invalid).
I think a possible solution to this would be to add a PackageBase instance method get_detection_test_info
. This method would:
- check for a
detection_tests.yaml
in the current directory - check any superclass in MRO below
PackageBase
for a detection_tests.yaml` file - return any files found (so tests could be layered between base and inheritor packages)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See b9c42ab This requires to add a file like:
includes:
- "builtin.gcc"
to reuse the detection tests file for a custom GCC, but still runs in 0.8s
instead of half a minute.
lib/spack/spack/detection/test.py
Outdated
result = [] | ||
pkg_dir = pathlib.Path(repository.filename_for_package_name(pkg_name)).parent | ||
detection_tests_yaml = pkg_dir / "detection_test.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be improved by my recommendation above re: an instance method to find detection test files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #18175 (comment)
c6115f8
to
fbf335c
Compare
Yes, that would be the idea. I'd like to check which extra attributes we need besides |
fbf335c
to
f53c57a
Compare
This commit adds a new audit type that, based on the content of a `detection_test.yaml` file placed alongside `package.py`, performs a series of synthetic detection tests.
Detection tests are now executed on fully qualified packages, for instance on "builtin.gcc" instead of just "gcc". This allows to test package customizations in user defined repositories. Extend the detection_tests.yaml file to allow inclusion of tests from other packages.
f53c57a
to
fe79717
Compare
fe79717
to
0888ce0
Compare
Outdated by an overhaul of the PR (now the tests are in an audit)
@becker33 How should we proceed with this? If there are doubts on
I think, at least initially, the percentage of people who'll add a detection test in their repo which overrides a detection test in built-in will be next to zero - so I wouldn't block this PR on that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think that we'd be better off using a package attribute and the time savings isn't relevant compared to modeling it properly, but I recognize I'm being outvoted on this issue.
This PR adds a new audit sub-command to check that detection of relevant packages is performed correctly in a few scenarios mocking real use-cases. The data for each package being tested is in a YAML file called detection_test.yaml alongside the corresponding package.py file. This is to allow encoding detection tests for compilers and other widely used tools, in preparation for compilers as dependencies.
closes #21343
closes #28092
fixes #39633
This PR adds a new audit sub-command to check that detection of relevant packages is performed correctly in a few scenarios mocking real use-cases. The data for each package being tested is in a YAML file called
detection_test.yaml
alongside the correspondingpackage.py
file.This is to allow encoding detection tests for compilers and other widely used tools, in preparation for compilers as dependencies.
The audit can be run with:
Modifications:
gcc
,llvm
andintel