Skip to content

Conversation

martinpitt
Copy link
Contributor

We use "behaviour" (BE) in the majority of cases, so change the "behavior"s (AE) too.

Please don't merge this yet!

I would like to use this as a "dummy" PR to see how the tests run against the current Ubuntu development version, and possibly fix them. Issue #4575 was a bit embarrassing, and while it's a bug in libseccomp, it's still a complete showstopper for users. So let's discuss the balance between the added burden of test failures due to other upstream project changes (like the kernel, util-linux, or said libseccomp) and catching such issues early. With all the work that's going on with cgroups v2, testing on more current kernels probably also couldn't hurt.

@martinpitt martinpitt added needs-discussion 🤔 ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR labels Nov 7, 2016
@evverx
Copy link
Member

evverx commented Nov 8, 2016

catching such issues early

Well, nobody knows how to reproduce these tests manually.
How will we debug such issues?

@martinpitt
Copy link
Contributor Author

The amd64 failure (logind test) is/was due to issue #4602 , fix pushed and test retried. The i386 failure reproduces issue #4575. The s390x failure is a bit surprising, it seems timedated/hostnamed etc. don't start up; I'll look at that in more detail.

@evverx : Indeed, that's a bit of a problem. They are fairly easy to reproduce on a Debian/Ubuntu system, but essentially impossible on a Fedora/RPM based one. So for the time being, realistically it's only @mbiebl, @fsateler, or me who can debug them, and if neither of us is around we could ignore regressions on that (or disable them entirely again). At least we'd still have some historical data which helps to investigate when an issue started to happen.

@martinpitt
Copy link
Contributor Author

Now zesty/amd64 only fails on the newly introduced CLI test and spots PR #4609 (i. e. this branch just needs rebasing). So amd64 is good now.

@martinpitt
Copy link
Contributor Author

On s390x, it's the new RestrictAddressFamilies=AF_UNIX in systemd-timedated.service and others which landed post-232: This breaks the daemons with

systemd-timedated[3751]: Failed to get system bus connection: Protocol not supported

So this is very similar to issue #4575 except that s390x is big-endian 64 bit (but nevertheless it involves seccomp).

@martinpitt
Copy link
Contributor Author

RestrictAddressFamilies=AF_UNIX which landed post-232

Sorry, it landed in 232, we just reverted it in the Debian package because it breaks too much. That's why our downstream CI tests are still green.

@martinpitt martinpitt changed the title NEWS: consistent spelling of "behaviour" run tests against Ubuntu development version? Nov 13, 2016
@martinpitt martinpitt removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Nov 15, 2016
@martinpitt
Copy link
Contributor Author

Tests all pass now that the downstream package has a workaround for issue #4575.

@evverx
Copy link
Member

evverx commented Nov 17, 2016

At least we'd still have some historical data which helps to investigate when an issue started to happen

For the record:
TEST-13-NSPAWN-SMOKE fails on Yakkety (against the master-branch too): #4670 (comment)

I dist-upgraded and can't reproduce the issue

@keszybz
Copy link
Member

keszybz commented Nov 18, 2016

So let's discuss the balance between the added burden of test failures due to other upstream project changes (like the kernel, util-linux, or said libseccomp) and catching such issues early. With all the work that's going on with cgroups v2, testing on more current kernels probably also couldn't hurt.

I think the burden is entirely worth it. A lot of development happens on Fedora machines, testing on newest Ubuntu will be quite beneficial. And more testing is better, we certainly don't have nearly enough different testbeds.

What do we need to enable those new tests? Merge this?

@evverx
Copy link
Member

evverx commented Nov 18, 2016

And more testing is better

We test a very limited subset of the systemd's features. So, this is a false sense of safety.

But I'm not against tests of course :-)

@martinpitt
Copy link
Contributor Author

@keszybz : No, this merge was just a place to do test runs without creating noise on a "real" PR. We need to add more webhooks with release=zesty if we want this.

@evverx tested on yakkety (Ubuntu 16.10), which might actually be a better compromise/next step: It has more up-to-date components (kernel 4.8 instead of 4.4, util-linux 2.28.2 instead of 2.27.1, newer libseccomp etc.) but it's stable, unlike the development release which gets constant jitter. So let me do infra runs against that release and see how it goes.

As for this PR itself, this can be merged if we care about the consistent spelling of "behaviour", but certainly not a biggie.

@evverx: Yes, we certainly lack tests for the plethora of Protect*=, cgroup settings, many corner cases of resolved and so on. Right now they tell you that systemd boots and generally works in its environment (kernel, util-linux, etc.), you get a display manager, d-bus and networking, and you can survive several reboots without failures. I. e. we can at least be sure that it's not a complete trainwreck that breaks your OS, and it's at least possible to deliver an update to fix a regression.

But of course I fully agree that adding tests to the other features that we have is entirely worthwhile. :-)

@evverx
Copy link
Member

evverx commented Nov 18, 2016

yakkety (Ubuntu 16.10), which might actually be a better compromise/next step

👍

Unrelated to this PR @martinpitt , I'm not sure about Github notifications. I prepared a patch: https://gist.github.com/evverx/6805f228a00f5cd362d3f0cd9d30f5c6 . Please, take a look.

@martinpitt
Copy link
Contributor Author

@evverx : nice one, thanks! Applied.

@poettering
Copy link
Member

getting CI hooked up against the current ubuntu devel version would be particularly useful to fix the seccomp issue properly... So I am all for it!

@martinpitt
Copy link
Contributor Author

So yakkety failed, in TEST-04-JOURNAL as @evverx mentioned. I'll try and rebase against current master.

But indeed we can just try the devel series for a while, and if it gets too annoying we'll just revert it (data beats speculation). And we of course always have the option to ignore failures, but that's annoying and error prone over extended time periods.

@evverx
Copy link
Member

evverx commented Nov 19, 2016

@martinpitt

in TEST-04-JOURNAL as @evverx mentioned

I didn't mention that.

Device       Boot  Start    End Sectors  Size Id Type
/dev/loop7p1        2048 800767  798720  390M 83 Linux
/dev/loop7p2      800768 819199   18432    9M 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
The file /dev/loop7p1 does not exist and no size was specified.
F: Failed to mkfs -t ext3
Makefile:4: recipe for target 'setup' failed

I'm not sure. Maybe, we should

diff --git a/test/test-functions b/test/test-functions
index c0128b8..bf48e03 100644
--- a/test/test-functions
+++ b/test/test-functions
@@ -342,6 +342,7 @@ EOF
     local _label="-L systemd"
     # mkfs.reiserfs doesn't know -L. so, use --label instead
     [[ "$FSTYPE" == "reiserfs" ]] && _label="--label systemd"
+    udevadm settle
     if ! mkfs -t "${FSTYPE}" ${_label} "${LOOPDEV}p1" -q; then
         dfatal "Failed to mkfs -t ${FSTYPE}"
         exit 1

@evverx
Copy link
Member

evverx commented Dec 10, 2016

Hm, I'm reading 9d06297

Without the "udevadm settle" systemd unmounted mnt while the script was
operating on mnt.

Of course the question is, why there was a REMOVE in the first place,
but this is not part of this patch.

And this looks like #4611 (comment)

Also dracutdevs/dracut@e54d961

We use "behaviour" (BE) in the majority of cases, so change the
"behavior"s (AE) too.
@martinpitt
Copy link
Contributor Author

I retriggered the tests to see how far the new "test-seccomp" holds up. This is tracked in issue #5215.

@keszybz
Copy link
Member

keszybz commented Sep 15, 2017

So what's the status here — OK to close this?

@martinpitt
Copy link
Contributor Author

Yes, it's just clutter and not precious. If that comes up again, I can reopen or open a new PR. That said, I consider running a test against the latest stable ubuntu too, to test against more recent plumbing/kernel versions.

@martinpitt martinpitt closed this Sep 15, 2017
@martinpitt martinpitt deleted the typos branch September 15, 2017 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants