System tests: add test tags #19302

edsantiago · 2023-07-20T21:10:48Z

BATS 1.8.0 introduces tags: metadata that can be applied to
a single test or one entire file, then used for filtering
in a test run.

Issue #19299 introduces the possibility of using OpenQA
for podman reverse dependency testing: continuous CI on
all packages that can affect podman, so we don't go two
months with no bodhi builds then get caught by surprise
when systemd or kernel or crun change in ways that break us.

This PR introduces one bats tag, "distro-integration". The intention
is for OpenQA (or other) tests to install the podman-tests package
and run:

bats --filter-tags distro-integration /usr/share/podman/test/system

Goal is to keep the test list short and sweet: we do not
need to test command-line option parsing. We DO need to
test interactions with systemd, kernel, nethack, and other
critical components.

Signed-off-by: Ed Santiago santiago@redhat.com

None

edsantiago · 2023-07-20T21:21:43Z

This is a preliminary, probably embarrassingly incomplete list of tests. Before we start in on the optimum subset, let's follow along in #19299 to see if this approach is even possible.

Oh, and I forgot to link to BATS tag documentation.

edsantiago · 2023-07-20T22:04:39Z

Discussion on #19299 is positive; looks like this is doable on the OpenQA front.

Okay, @containers/podman-maintainers, have at it. Please review my initial tags, then (the hard part) search for what I've missed.

AdamWill · 2023-07-20T22:06:29Z

I can actually PoC this before merging, even. I'm doing a scratch build (on F38 because build on Rawhide appears to be failing); if that succeeds I can just monkeypatch the test on openQA staging and run it against the scratch build and you can see how it will look.

edsantiago · 2023-07-20T22:09:07Z

uh, not yet maybe? We still have a bit of a problem with mismatched container-selinux. If you're Ok with having the test fail, though, then yeah, I'd love to see a PoC run.

AdamWill · 2023-07-20T22:11:55Z

sure, i don't care about tests failing on staging. they do that a lot. :P

AdamWill · 2023-07-20T22:15:01Z

Watch it live! https://openqa.stg.fedoraproject.org/tests/3003796#live (and probably get to laugh at me when it turns out I messed up the monkey edit somehow)

AdamWill · 2023-07-20T22:39:10Z

OK, so it ran and failed, so you can kinda see how it would look:
https://openqa.stg.fedoraproject.org/tests/3003796

it shows that the 'podman' test step is the one that failed, and the red outlined 'Failed' step shows you where it failed - if you click on that it tells you "# Test died: command 'bats --filter-tags openqa /usr/share/podman/test/system' failed at /usr/lib/os-autoinst/autotest.pm line 387." (openQA was designed around screenshot matching; originally all failures were screenshot match failures, so it kinda assumes every failure will be a thumbnail. when 'advanced' support for running stuff at consoles and checking the exit code was added, it was kinda shoehorned in with a 'fake thumbnail' like this). The screenshots around that show you what was on the screen at the time of the failure.

If we make it redirect the output of the test command to a file and upload that file, you would find it on the Logs & Assets tab. I can hack that in quick and run it again.

One wider thing to note, though: update testing is kind of a "full service" situation. I hate failed update tests. I never want there to be any failed update tests. Any time an update test fails, I will notice it and investigate it, and usually turn it into something more actionable - a bug report with the relevant logs attached, or whatever. In general, you (the upstream/package maintainer) aren't required to be looking at the openQA raw results, though you are of course welcome to.

AdamWill · 2023-07-20T22:42:08Z

Fun story: before the somewhat-clever console support was added in years ago, we used to just literally print the exit code (with an identifier before it) and screenshot match on that, if it wasn't a 0, the test failed!

edsantiago · 2023-07-20T22:51:28Z

Test failure looks pretty much like what I was expecting, although, is there any way to capture previous lines? All I see is the end (final) summary lines. This might be consistent with what you say about screenshot-capture origins. So, yes, I think capturing text logs would be nice.

I'm feeling hopeful about this. Thanks for all your efforts.

AdamWill · 2023-07-20T22:52:00Z

yeah, to get previous lines we'd definitely just want to upload the text output. I'm re-running the test with that wired in right now.

edsantiago · 2023-07-20T23:10:33Z

In general, you (the upstream/package maintainer) aren't required to be looking at the openQA raw results, though you are of course welcome to.

It seems insane not to: of course we want to have early warning of integration problems. And it seems cruel to ask you to diagnose and report problems with our code/tests. We, too strive for zero errors, but we don't have that and never will as long as networks are involved.

How do we (or, say, me) subscribe to openqa results for a given component?

Also, my work week is long since over, but I'll try to check in tomorrow morning.

AdamWill · 2023-07-20T23:27:25Z

Well, I'm the QA guy, it's kinda my job :P But of course, you certainly can. There isn't a real convenient 'off the shelf' way to subscribe to results, but the way you'd basically want to do it is via fedora messaging. Usually, when an openQA test finishes, three messages get published: a 'native' openQA message, a message in the kinda-standardized ci-messages format, and a message from ResultsDB when the result of the test is submitted there. For these trial-run tests only the 'native' openQA message happens, because there isn't a ci-messages standard for tests run on a scratch build, and we don't submit results for tests on scratch builds to ResultsDB (as they're assumed to usually be this kinda 'experiemental' test).

Here's the 'openQA native' message from the trial run earlier: https://apps.stg.fedoraproject.org/datagrepper/v2/id?id=da3f88c8-57de-4278-9778-e31ec1d4d4fe&is_raw=true&size=extra-large . For a test run on an update (as is the usual case), the BUILD would be "Update-(updatealias)", here it is "Kojitask-(taskid)-NOREPORT", where NOREPORT is a magic string that tells the scheduler/reporter not to submit the result to resultsdb.

Here's how a ci-messages message and a resultsdb result message for a completed update test generally look.

What I guess you'd want to do is just write a consumer for one of those types of message and have it ping you somehow if it sees a failure of the 'podman' test. Any of the messages should provide enough info for the ping to tell you what update the test failed on and generate a URL to the test (the messages have the test ID in them, the URL format is just https://openqa[.stg].fedoraproject.org/tests/(testid) ). We do also run the test on composes, which produces similar messages but with somewhat different information about the 'thing under test' (it'll be a compose ID, not a Bodhi update alias).

I dunno if you've dealt with fedora messaging before, but if not, the docs are at https://fedora-messaging.readthedocs.io/en/stable/ . Writing a simple consumer is pretty easy, there's a sample at https://fedora-messaging.readthedocs.io/en/stable/user-guide/consuming.html (https://fedora-messaging.readthedocs.io/en/stable/user-guide/configuration.html#consumer-options explains what to put in the TOML config file).

AdamWill · 2023-07-20T23:29:06Z

Here's a new run of the test where I redirected the output to a text file and uploaded it: https://openqa.stg.fedoraproject.org/tests/3003804

You can find the output on the Logs & Assets tab as podman-podmanbats.txt - https://openqa.stg.fedoraproject.org/tests/3003804/logfile?filename=podman-podmanbats.txt

AdamWill · 2023-07-20T23:32:17Z

I guess it might be nice if the new FMN let you just configure yourself to get notified of openQA failures, but I dunno if I have enough time to look at implementing that.

Luap99 · 2023-07-21T08:06:01Z

Goal is to keep the test list short and sweet: we do not
need to test command-line option parsing. We DO need to
test interactions with systemd, kernel, nethack, and other
critical components.

Ok I fully I understand that but the problem is with a selective list we will always miss certain things.
If we have to do a selective list we should get everyone on the team suggest possible candidates because I at least see that critical network tests are missing.

edsantiago · 2023-07-21T15:48:53Z

If we have to do a selective list we should get everyone on the team suggest possible candidates because I at least see that critical network tests are missing.

That's exactly what I'm requesting, it just got lost in the conversation.

AdamWill · 2023-07-21T15:58:09Z

Yeah, as this will gate other updates, please focus on critical features that would constitute a broken experience if the tests fail.

edsantiago · 2023-07-26T14:36:47Z

Since no response: I revisited my list of tagged tests, added a few more (including network).

@containers/podman-maintainers, in the interests of getting something going in terms of , can we merge this, then fine-tune the essential-test list later? I don't see any possible way that this PR introduces any risk to existing CI or gating tests (but please review with that in mind).

vrothberg

LGTM

test/system/190-run-ipcns.bats

openshift-ci · 2023-07-26T14:41:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [edsantiago,vrothberg]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

AdamWill · 2023-07-26T15:12:56Z

btw, I hate to bikeshed, but 'openqa' seems like a possibly over-targeted name. can anyone think of a name which means more 'tests that may be useful as integration tests for distros', or however we want to describe the real concept here?

edsantiago · 2023-07-26T17:05:55Z

Naming is usually the hardest part, so this is the time to bikeshed. fullsystem? rolling? minimal? distro?

integration is probably the most intuitive, but unfortunately we already have an existing set of (misnamed) "integration tests" and the ambiguity would be too painful to consider.

AdamWill · 2023-07-26T17:16:22Z

downstream-integration? distro-integration?

BATS 1.8.0 introduces tags: metadata that can be applied to a single test or one entire file, then used for filtering in a test run. Issue containers#19299 introduces the possibility of using OpenQA for podman reverse dependency testing: continuous CI on all packages that can affect podman, so we don't go two months with no bodhi builds then get caught by surprise when systemd or kernel or crun change in ways that break us. This PR introduces one bats tag, "distro-integration". The intention is for OpenQA (or other) tests to install the podman-tests package and run: bats --filter-tags distro-integration /usr/share/podman/test/system Goal is to keep the test list short and sweet: we do not need to test command-line option parsing. We *DO* need to test interactions with systemd, kernel, nethack, and other critical components. Signed-off-by: Ed Santiago <santiago@redhat.com>

edsantiago · 2023-07-26T19:52:39Z

Going once, going twice, distro-integration it is (for now).

AdamWill · 2023-07-26T19:55:06Z

cool! once this makes it to official downstream (Fedora) builds, I can put it into openQA for reals. it would be best if this can land in F37, F38 and Rawhide so I can run the additional test unconditionally; if it can only go in Rawhide, I can just only run the test on Rawhide updates.

vrothberg · 2023-07-27T07:20:17Z

Until the 4.7 release (probably somewhere in September), it will only land in Rawhide.

vrothberg

@containers/podman-maintainers PTAL

rhatdan · 2023-07-28T12:29:05Z

/lgtm

AdamWill · 2023-07-28T16:14:42Z

cool. poke me when it lands in Rawhide and I will add the openQA part to run only on Rawhide updates for now.

[ Clean cherry-pick of containers#19302. This is a low-risk change with potentially very high ROI: the opportunity to catch interaction problems with updates in other system components. ] BATS 1.8.0 introduces tags: metadata that can be applied to a single test or one entire file, then used for filtering in a test run. Issue containers#19299 introduces the possibility of using OpenQA for podman reverse dependency testing: continuous CI on all packages that can affect podman, so we don't go two months with no bodhi builds then get caught by surprise when systemd or kernel or crun change in ways that break us. This PR introduces one bats tag, "distro-integration". The intention is for OpenQA (or other) tests to install the podman-tests package and run: bats --filter-tags distro-integration /usr/share/podman/test/system Goal is to keep the test list short and sweet: we do not need to test command-line option parsing. We *DO* need to test interactions with systemd, kernel, nethack, and other critical components. Signed-off-by: Ed Santiago <santiago@redhat.com>

AdamWill · 2023-08-11T21:42:25Z

It looks like this landed in 4.6.1 which is landing in Fedora across all branches, so I'll look at extending the openQA tests in production. thanks!

AdamWill · 2023-08-11T21:49:40Z

hmm, actually, 4.6.1 is failing the Fedora CI checks ATM, so maybe I'll wait for that to be sorted.

AdamWill · 2023-08-17T05:59:18Z

OK, this is now done in openQA.

edsantiago added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 20, 2023

openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jul 20, 2023

edsantiago mentioned this pull request Jul 20, 2023

Review tests + request for additional test ideas for Podman #19299

Closed

edsantiago force-pushed the openqa branch from 91c265b to 48d3141 Compare July 25, 2023 14:23

vrothberg approved these changes Jul 26, 2023

View reviewed changes

test/system/190-run-ipcns.bats Outdated Show resolved Hide resolved

edsantiago force-pushed the openqa branch 3 times, most recently from ef10d62 to 83a9f29 Compare July 26, 2023 15:01

edsantiago force-pushed the openqa branch from 83a9f29 to 33891e8 Compare July 26, 2023 19:51

vrothberg reviewed Jul 28, 2023

View reviewed changes

openshift-ci bot assigned rhatdan Jul 28, 2023

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 28, 2023

openshift-merge-robot merged commit 4153e97 into containers:main Jul 28, 2023
84 of 88 checks passed

edsantiago deleted the openqa branch July 28, 2023 17:38

edsantiago mentioned this pull request Jul 31, 2023

unmounting/EINVAL flake #18831

Closed

edsantiago mentioned this pull request Aug 1, 2023

[v4.6] System tests: add test tags #19473

Merged

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Nov 16, 2023

github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

System tests: add test tags #19302

System tests: add test tags #19302

edsantiago commented Jul 20, 2023 •

edited

edsantiago commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023 •

edited

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

Luap99 commented Jul 21, 2023

edsantiago commented Jul 21, 2023

AdamWill commented Jul 21, 2023

edsantiago commented Jul 26, 2023

vrothberg left a comment

openshift-ci bot commented Jul 26, 2023

AdamWill commented Jul 26, 2023

edsantiago commented Jul 26, 2023

AdamWill commented Jul 26, 2023

edsantiago commented Jul 26, 2023

AdamWill commented Jul 26, 2023

vrothberg commented Jul 27, 2023

vrothberg left a comment

rhatdan commented Jul 28, 2023

AdamWill commented Jul 28, 2023

AdamWill commented Aug 11, 2023

AdamWill commented Aug 11, 2023

AdamWill commented Aug 17, 2023

System tests: add test tags #19302

System tests: add test tags #19302

Conversation

edsantiago commented Jul 20, 2023 • edited

edsantiago commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023 • edited

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

edsantiago commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

AdamWill commented Jul 20, 2023

Luap99 commented Jul 21, 2023

edsantiago commented Jul 21, 2023

AdamWill commented Jul 21, 2023

edsantiago commented Jul 26, 2023

vrothberg left a comment

Choose a reason for hiding this comment

openshift-ci bot commented Jul 26, 2023

AdamWill commented Jul 26, 2023

edsantiago commented Jul 26, 2023

AdamWill commented Jul 26, 2023

edsantiago commented Jul 26, 2023

AdamWill commented Jul 26, 2023

vrothberg commented Jul 27, 2023

vrothberg left a comment

Choose a reason for hiding this comment

rhatdan commented Jul 28, 2023

AdamWill commented Jul 28, 2023

AdamWill commented Aug 11, 2023

AdamWill commented Aug 11, 2023

AdamWill commented Aug 17, 2023

edsantiago commented Jul 20, 2023 •

edited

AdamWill commented Jul 20, 2023 •

edited