New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce stage testing #124
Conversation
Congratulations! The build has finished successfully. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment. |
Great stuff! As written in the chat, note that Ubuntu kernels don't have SELinux support. We may still be able to label our files using our userspace tool, but if running tools causes relabelling (which sounds wrong), then I would expect that to not happen on Ubuntu. For added/deleted trees, I totally agree that we should export all the info about added files, but for the deleted ones I guess only the root of the subtree is sufficient, no? Agreed, don't test @core, test just a minimal set. I think also that we should test stages on minimal/empty images. Some may not work at the moment, but I think (as mentioned before) that stages should be "write only" to the extent possible, so in general should not require anything particular to be installed. Mabye we need a very minimal dnf stage that puts in place the filesystem etc, but that should be it. General comment on SELinux: we may in practice want to always do the SELinux stage last, so not care so much if applying stages afterwords changes labels, but obviously that is not ideal. Of your four proposals, I would suggest number 2, but with maybe even without SELinux if we change the stage to read the labels from the build root, not the target (in which case the selinux package of course must be installed there instead). |
tree-diff
Outdated
hasher.update(buf) | ||
buf = os.read(fd, BLOCK_SIZE) | ||
|
||
return f"sha256-{hasher.hexdigest()}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sha512:
is the usual prefix, rather than sha512-
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
tree-diff
Outdated
import json | ||
import os | ||
|
||
|
||
def hash_file(fd): | ||
BLOCK_SIZE = 65536 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you measure this? I used 4k elsewhere (one page). If it makes a difference, we should use the same everywhere :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did no measures, I've changed it to 4096 to be consistent. We can do some benchmarks later if needed. :)
test/osbuildtest.py
Outdated
@@ -34,4 +34,4 @@ def run_osbuild(self, pipeline): | |||
raise e from None | |||
|
|||
result = json.loads(r.stdout) | |||
return result["tree_id"], result["output_id"] | |||
return f"{self.store}/refs/{result['tree_id']}", f"{self.store}/refs/{result['output_id']}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not 100% sure about this. I quite liked the current outputs, as they were about the semantics of what we produced, and could potentially be used for other things than just dereferencing the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. This is only test code, but I liked the symmetry of it returning something similar to what osbuild --json
on the command line.
Is constructing the part inconvenient or impossible somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the fact that subclasses of osbuildtest.TestCase don't need to know anything about self.store
. But yeah, it makes sense to return the ids, I will make a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks!
.travis.yml
Outdated
script: sudo env "PATH=$PATH" python3 -m test --case locale --build-pipeline samples/build-from-yum.json | ||
- name: stage-tests | ||
before_install: sudo apt-get install -y systemd-container yum | ||
script: sudo env "PATH=$PATH" "OSBUILD_TEST_BUILD_PIPELINE=samples/base-from-yum.json" python3 -m unittest test.test_stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use samples/build-from-yum.json
instead. It's quicker. I've removed the other one, because it's confusing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by quicker? When I'm using base-from-yum
the pipeline works like this:
- build f27
- build f30
- build f30
On the contrary with build-from-yum
:
- build f27
- build f30
I think the first pipeline should be used, it sounds more reasonable to build the final tree by the same Fedora version. Currently, this is solved by using build step (F30) in each of test pipelines AND using build-pipelined argument (F27). IMO, this is confusing (which build pipeline goes first?) and it means I need to put the buildstep in each test pipeline, which is cluttering them a lot.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it's quicker because it has one stage instead of two.
The pipeline you're building always needs to include its build pipeline, because its hash must include all the tools that were used to assemble the final image. The --build-pipeline
parameter is there to get a host system into something that we know can produce the build pipeline. It is always run first. The Sorry if this is confusing.
The first case is indeed the one that we should use (for the reason you mention). The external pipeline makes a f27, to make it possible to build the "real" pipeline (f30 built on f30) on Ubuntu.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that --build-pipeline
gets also included into the tree id. But I was wrong, thanks for the explanation. :) Therefore I added the build stage to all the test pipelines. :)
test/osbuildtest.py
Outdated
shutil.rmtree(self.store) | ||
@classmethod | ||
def tearDownClass(cls): | ||
shutil.rmtree(cls.store) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm doing the same for testing image-info, but I think we need to be a bit careful here, where we actually test osbuild itself. Tests should be as independent from each other as possible.
I'm not entirely against it. If we go this way, please include a comment pointing this out as the doc string for this test class.
test/osbuildtest.py
Outdated
@@ -34,4 +34,4 @@ def run_osbuild(self, pipeline): | |||
raise e from None | |||
|
|||
result = json.loads(r.stdout) | |||
return result["tree_id"], result["output_id"] | |||
return f"{self.store}/refs/{result['tree_id']}", f"{self.store}/refs/{result['output_id']}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. This is only test code, but I liked the symmetry of it returning something similar to what osbuild --json
on the command line.
Is constructing the part inconvenient or impossible somewhere?
test/osbuildtest.py
Outdated
diff = ('\n' + '\n'.join(difflib.ndiff( | ||
pprint.pformat(tree_diff1).splitlines(), | ||
pprint.pformat(tree_diff2).splitlines()))) | ||
raise AssertionError(diff) from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise from None
removes the error message you've put in the AssertionErrors
above. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by introducing TreeAssertionError
and removing the ugly try-except-reraise thing.
def init_tests(): | ||
test_dir = 'test/stages_tests' | ||
for test in os.listdir(test_dir): | ||
setattr(TestDescriptions, f"test_{test}", generate_test_case(f"{test_dir}/{test}")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer using TestCase.subTest()
for parameterized tests. No need to change to that, just wanted to mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setUp
and tearDown
don't work with subTest()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Seriously?! Thanks for pointing that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested this once again today's morning and yeah, it works this way. But I guess it's a correct behaviour, if setUp got called before each subTest, it would got called twice - once before test method and then before subtest, which would be imho really confusing.
try: | ||
child_fd = os.open(dirent.name, os.O_DIRECTORY, dir_fd=dir_fd1) | ||
except OSError: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which OSError
do you want to ignore? I feel like this except handler might be too broad. Maybe something like this?
I don't think we need to include it. There's other ways to test it without bloating the size of the repo.
Ah sorry I didn't read this before putting my comment. I vote (3). It's the easiest solution. We are not testing that the osbuild store works, but that stages are doing the right thing. We have other ways to test the store. |
Idea with "write only" stages is certainly great. But in real use case we will probably always write e.g. users or firewall settings after dnf run. That's why I want to use full @core as base in all tests. |
0fe93c8
to
c832f61
Compare
In my opinion it is better to dump everything and then filter out unneeded entries elsewhere.
We need to know the exact difference of modified files in both trees. Outputting the whole files into a diff might make a huge diff file, therefore only their hashes are written.
4ea44cf
to
719b986
Compare
The stage testing is based on an output from the tree-diff tool. During one test two pipelines are run and their outputs are compared using tree-diff. The diff is then compared with expected diff included in the repository.
For stages testing it is too slow to rebuild the a image containing @core packages every time. Let's just reuse the a image for all tests. This should speed up the test running time a LOT.
719b986
to
c43a458
Compare
This version is imho ready to be merged. There's a brief description how we solved my discussion points:
|
@ondrejbudai this looks really great! I think there is still one outstanding comment about OSError exception from Lars. Otherwise, I'm happy to merge this. |
Can be done as a follow up. Thanks Ondřej! This is really nice 🎉 |
Greetings, this a pretty huge PR, grab yourself a cup of coffee and hear my words!
This PR introduces stage tests: The stage testing is based on an output from the tree-diff tool. During one test two pipelines are run and their outputs are compared using tree-diff. The diff is then compared with expected diff included in the repository.
There are currently some issues with this PR I would like to discuss. One of them is blocker and we shall not merge this PR before it's solved.
Blocker
SELinux
SELinux works differently on Fedora host than on Ubuntu host. On Fedora users and timezone stages change selinux labels on some files. On Ubuntu it doesn't happen. See also #89
We need to investigate what's exactly happening: Why Fedora changes the labels and why it doesn't happen on Ubuntu.
Currently, there's a diff from Fedora, which causes the tests to fail on CI (which is running Ubuntu). But if you run the tests on Fedora, everything is alright. I will investigate images built on Ubuntu to see how selinux labels are applied there.
Discussions
How should diffs be "deep"
I've changed the behaviour of
tree-diff
to dump the whole added/deleted subtree instead of just its root. This is imho good, because we can detect more stuff. However, we are not currently not dumping anything else than file names. Therefore we are not able to check for content/permissions/selinux when file is added (or deleted, but doesn't make any sense, I guess?).Should we dump more information about added (deleted) files?
Testing of dnf stage
The dnf stage test is weird, the diff is huge... What can we do about that? Install only a small package on top of @Core image (pipeline with dnf stage installing @Core packages)? Currently, it feels to me like we are testing the dnf itself...
Preserving store between tests
Preserving store between tests sounds bad. Unfortunately, it takes 1 hour (!!) to run the current 9 stage tests without preserving the store. So, I've changed the tests to preserve the store between them. This works right now, but can have unexpected consequences in future mainly due to not ideal reproducibility of osbuild pipelines. Possible solutions:
Just don't install @Core every time!
Yeah, that would probably speed up the tests drastically. But do we want to test the other stages on empty image? Doesn't sound like a realistic usecase of stage...
Introduce some kind of base image, which all stage tests should use (core packages + selinux maybe?).
Seems like an interesting idea for me, the only downside is that test framework will be probably a little bit more complex? Or not?
Leave as it is in current PR
Yeah, it works, but I'm scared it will break sometime and debugging might be hard.
1 hour test? We don't mind!
I mind! It's just too much.