test: Fedora 26 has branched start testing there #6294

Merged
merged 10 commits into cockpit-project:master on Apr 13, 2017


mvollmer
Member

@mvollmer mvollmer commented Apr 7, 2017

Now that Fedora 26 has branched, we won't be releasing into the
main Fedora 25 channels any longer.

We don't yet remove support for Fedora 24 because other pull
requests are coming that migrate Avocado and Selenium stuff
later.

Also remove a HACK that's supposedly fixed on Fedora 26.

@cockpituous
Contributor

bot: Image refresh for fedora-26

@mvollmer
Member Author

mvollmer commented Apr 7, 2017

Building the image fails right now somewhere during kickstarting with

No such interface 'org.freedesktop.NetworkManager.Settings.Connection' on object at path /org/freedesktop/NetworkManager/Settings/2 (19)

Let's see what the bot says, and then I'll file a bug.

@mvollmer
Member Author

> Let's see what the bot says, and then I'll file a bug.

Bots don't seem to pick up fedora-26 images yet, but kickstarting started working for me again locally.

@martinpitt
Member

Some of the storage tests should be fixed by PR #6293.

@mvollmer
Member Author

> Some of the storage tests should be fixed by PR #6293.

Nice! I was just starting to figure out the same things.

@mvollmer
Member Author

@petervo, could you look at the kubernetes failures?

@mvollmer
Member Author

Cockpit expects NetworkManager 1.6.0 to have certain checkpoint bugs fixed. Looks like they might still exist. I'll check.

@mvollmer
Member Author

Rebased after merging #6293.

@martinpitt
Member

Wrt. the "internal error in login process": This is a real SELinux policy bug, not a test problem. The relevant message seems to be:

audit[1348]: AVC avc:  denied  { execute_no_trans } for  pid=1348 comm="cockpit-ws" path="/usr/libexec/cockpit-ssh" dev="dm-0" ino=4888910 scontext=system_u:system_r:cockpit_ws_t:s0 tcontext=system_u:object_r:cockpit_ws_exec_t:s0 tclass=file permissive=0

audit[1348]: SYSCALL arch=c000003e syscall=59 success=no exit=-13 a0=55f9dcc9866f a1=7ffceb63ec80 a2=55f9dd59ce50 a3=7faed38a4da0 items=0 ppid=1249 pid=1348 auid=4294967295 uid=989 gid=985 euid=989 suid=989 fsuid=989 egid=985 sgid=985 fsgid=985 tty=(none) ses=4294967295 comm="cockpit-ws" exe="/usr/libexec/cockpit-ws" subj=system_u:system_r:cockpit_ws_t:s0 key=(null)
audit: PROCTITLE proctitle=2F7573722F6C6962657865632F636F636B7069742D7773002D2D6E6F2D746C73

and if I add a self.machine.execute("setenforce 0") at the start of the test case, it works.
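The PROCTITLE value in the audit record above is hex-encoded here, with a NUL byte separating the arguments. As a rough aid for reading such records, here is a minimal POSIX shell sketch that decodes it. The function name decode_proctitle is made up for illustration, and the sketch assumes an even-length string of hex digits.

```shell
# Hypothetical helper: decode a hex-encoded audit proctitle value.
# NUL bytes (argument separators) are printed as spaces.
decode_proctitle() {
    hex=$1
    out=
    # Consume the string two hex digits (one byte) at a time.
    while [ "${#hex}" -ge 2 ]; do
        rest=${hex#??}
        byte=${hex%"$rest"}
        hex=$rest
        code=$((0x$byte))
        if [ "$code" -eq 0 ]; then
            out="$out "
        else
            # Convert the byte value to an octal escape that printf understands.
            out="$out$(printf "\\$(printf '%03o' "$code")")"
        fi
    done
    printf '%s\n' "$out"
}

# Example, using the value from the record above:
#   decode_proctitle 2F7573722F6C6962657865632F636F636B7069742D7773002D2D6E6F2D746C73
```

For the record above this yields the command line of the denied process, /usr/libexec/cockpit-ws --no-tls.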

This changed recently in Fedora 26. However, both in F25 and F26 the labels on /usr/libexec/cockpit-ssh are the same: system_u:object_r:cockpit_ws_exec_t:s0

Trying to debug this:

# journalctl -b | grep AVC > /tmp/log
# audit2allow  -i /tmp/log -e

#============= cockpit_ws_t ==============
# 
#  scontext="system_u:system_r:cockpit_ws_t:s0" tcontext="system_u:object_r:cockpit_ws_exec_t:s0"
#  class="file" perms="execute_no_trans"
#  comm="cockpit-ws" exe="" path=""
#  message="Apr 11 05:04:07 localhost.localdomain audit[1321]: AVC avc:  denied
#   { execute_no_trans } for  pid=1321 comm="cockpit-ws"
#   path="/usr/libexec/cockpit-ssh" dev="dm-0" ino=4888910
#   scontext=system_u:system_r:cockpit_ws_t:s0
#   tcontext=system_u:object_r:cockpit_ws_exec_t:s0 tclass=file permissive=0 "
allow cockpit_ws_t cockpit_ws_exec_t:file execute_no_trans;

With audit2allow -i /tmp/log -M local && semodule -i local.pp it then works.
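To make the mapping from the denial to the generated rule explicit, here is a small shell sketch that derives the same allow line from a single AVC message with plain text processing. It is only an illustration of what audit2allow reports above, not a replacement for it: the function name avc_to_allow is hypothetical, and it assumes a denial with exactly one permission and contexts of the form user:role:type:level.

```shell
# Hypothetical helper: turn one AVC denial line into the matching allow rule.
# Assumes a single permission inside the braces.
avc_to_allow() {
    line=$1
    # Permission between "{ ... }".
    perm=$(printf '%s\n' "$line" | sed -n 's/.*denied *{ *\([a-z_]*\) *}.*/\1/p')
    # Third field (the type) of the source and target contexts.
    src=$(printf '%s\n' "$line" | sed -n 's/.*scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
    tgt=$(printf '%s\n' "$line" | sed -n 's/.*tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
    # Object class, e.g. "file".
    cls=$(printf '%s\n' "$line" | sed -n 's/.*tclass=\([^ ]*\).*/\1/p')
    if [ -z "$perm" ] || [ -z "$src" ] || [ -z "$tgt" ] || [ -z "$cls" ]; then
        return 1
    fi
    printf 'allow %s %s:%s %s;\n' "$src" "$tgt" "$cls" "$perm"
}

# Example: feed it the first AVC line from the journal.
#   avc_to_allow "$(journalctl -b | grep AVC | head -n 1)"
```

Applied to the denial quoted earlier, this prints the same rule that audit2allow emitted.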

I just found https://bugzilla.redhat.com/show_bug.cgi?id=1381331 which seems to be about this. We have this hack in our spec file:

%post dashboard
# HACK: Until policy changes make it downstream
# https://bugzilla.redhat.com/show_bug.cgi?id=1381331
test -f %{_bindir}/chcon && chcon -t cockpit_ws_exec_t %{_libexecdir}/cockpit-ssh
%endif

But without it it doesn't work either.

I followed up on the Fedora bug.

So a canned fix would be this, but I don't know where to place it:

cat <<EOF > /tmp/local.te
module local 1.0;

require {
	type cockpit_ws_exec_t;
	type cockpit_ws_t;
	class file execute_no_trans;
}

allow cockpit_ws_t cockpit_ws_exec_t:file execute_no_trans;
EOF

checkmodule -M -m -o /tmp/local.mod /tmp/local.te
semodule_package -o /tmp/local.pp -m /tmp/local.mod
semodule -i /tmp/local.pp

We could put it into the tests, but it's a fix that you really need at runtime. For that we presumably need some spec modules for compiling a SELinux policy "properly" and ship the compiled one in the package? Or do we need to wait until it goes into selinux-policy?

@martinpitt
Member

martinpitt commented Apr 11, 2017

Looking at the kubernetes tests, it looks like there's a bunch of adjustments necessary: First thing is

Invalid Authentication Config: open 1 permission denied

which causes kube-apiserver.service to fail. The latter has User=kube, so we need this:

chgrp kube /etc/kubernetes/{ca,server}.key; chmod 640 /etc/kubernetes/{ca,server}.key

I pushed this fix. But this will require an image rebuild (I just hacked it in with vm-run --maintain for now).
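A quick way to confirm that the fix took effect inside an image is to check the mode bits of the key files. The following is a sketch only: the helper name group_can_read is hypothetical, it relies on GNU stat's -c option, and it checks just the group read bit, not which group owns the file.

```shell
# Hypothetical helper: succeed if FILE's mode grants read access to its group.
# Relies on GNU stat (-c '%a' prints the octal mode, e.g. 640).
group_can_read() {
    mode=$(stat -c '%a' "$1") || return 2
    # Interpret the mode as octal and pick out the group digit.
    group_digit=$(( (0$mode >> 3) & 7 ))
    [ $(( group_digit & 4 )) -ne 0 ]
}

# Example:
#   for f in /etc/kubernetes/ca.key /etc/kubernetes/server.key; do
#       group_can_read "$f" || echo "$f is not group-readable"
#   done
```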

After that, it fails with

kube-apiserver[3749]: E0411 07:31:50.850427    3749 server.go:254] Failed to create clientset: parse 127.0.0.1:8080: first path segment in URL cannot contain colon

This was reported in kubernetes/kubernetes#38380 and is apparently some fallout from a Go library change. There was a fix/workaround in kubernetes, but this might not yet have landed in Fedora 26? Allegedly this happens when parsing a URL without a scheme, but the only two occurrences that we have already include a scheme:

# grep -r 127.0.0.1:8080 /etc /var
/etc/kubernetes/config:KUBE_MASTER="--master=http://127.0.0.1:8080"
/etc/kubernetes/kubelet:KUBELET_API_SERVER="--api-servers=http://127.0.0.1:8080"

So I'm not sure how we can work around that. I reproduced in a clean Fedora 26 env and reported a bug.
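The grep above can be turned into a small check that flags any configured address lacking a scheme, since a bare 127.0.0.1:8080 is the kind of input that produces the "first path segment in URL cannot contain colon" error. This is a sketch under assumptions: the helper names are hypothetical, and it only handles lines of the form VAR="--flag=value" as seen in the two files listed.

```shell
# Hypothetical helper: print the value after the last "=" of a line such as
# KUBE_MASTER="--master=http://127.0.0.1:8080", without the closing quote.
flag_value() {
    v=${1##*=}
    printf '%s\n' "${v%\"}"
}

# Hypothetical helper: succeed if the address starts with a scheme like "http://".
has_scheme() {
    case $1 in
        [A-Za-z]*://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   grep -rh 127.0.0.1:8080 /etc/kubernetes | while IFS= read -r line; do
#       has_scheme "$(flag_value "$line")" || echo "no scheme: $line"
#   done
```

In this case both configured values pass the check, which matches the observation that the failing address must come from somewhere else.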

@mvollmer
Member Author

The lsblk change is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1441175, and they say it's a real regression.

@mvollmer
Member Author

Re check-service failure: It seems that one can't have WantedBy=default.target anymore, maybe because default.target is just a symlink to some other target. Also see #6324.
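To find test units that would be affected, one could scan unit files for that setting. A rough sketch; the function name and the example path are hypothetical and not taken from the test suite.

```shell
# Hypothetical helper: succeed if the unit file lists default.target in WantedBy=.
wants_default_target() {
    grep -Eq '^[[:space:]]*WantedBy=(.*[[:space:]])?default\.target([[:space:]]|$)' "$1"
}

# Example (the path is an assumption):
#   wants_default_target /etc/systemd/system/test.service && echo "uses default.target"
```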

@martinpitt
Member

I pushed a ridiculously ugly hack to adjust the SELinux policy. The original plan was to only do this if the rules shipped by selinux-policy don't allow this yet, but that is difficult: one needs something like this to find out:

# sesearch -s cockpit_ws_t -t cockpit_ws_exec_t -c file -p execute -A
allow cockpit_ws_t cockpit_ws_exec_t:file { entrypoint execute getattr ioctl lock open read };

But that would require setools-console and setools-python3 as new dependencies, and the command also takes painfully long. I didn't find a better way to query the current permissions of cockpit_ws_t; suggestions welcome. But this policy snippet should be additive, so it shouldn't collide with selinux-policy-targeted once that gets updated.
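For reference, if sesearch were available, its output could be checked for the missing permission roughly as follows. This is a sketch only: the function name rule_grants is hypothetical, and it assumes rules in the single-line "allow SRC TGT:CLASS { perms };" form shown above.

```shell
# Hypothetical helper: read "allow" rules on stdin and succeed if PERM is
# one of the listed permissions.
rule_grants() {
    perm=$1
    # Strip "allow SRC TGT:CLASS", braces and semicolon, then put one
    # permission per line so grep -x matches whole names only.
    sed -n 's/^allow [^ ]* [^ ]* //p' | tr -d '{};' | tr -s ' ' '\n' | grep -qx -- "$perm"
}

# Example:
#   sesearch -s cockpit_ws_t -t cockpit_ws_exec_t -c file -p execute -A \
#       | rule_grants execute_no_trans || echo "policy workaround still needed"
```

Matching whole names matters here, because execute is present in the shipped rule while execute_no_trans is not.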

If you think this is too ugly (and we most definitely should not release F26 with that!), I'm also okay with dropping that commit and adding a known issue instead.

Contributor

@stefwalter stefwalter left a comment

Can we add a big nasty error message in the %post script to highlight the SELinux fail?

@@ -358,6 +358,16 @@ Cockpit support for remoting to other servers, bastion hosts, and a basic dashbo
# HACK: Until policy changes make it downstream
# https://bugzilla.redhat.com/show_bug.cgi?id=1381331
test -f %{_bindir}/chcon && chcon -t cockpit_ws_exec_t %{_libexecdir}/cockpit-ssh
%if 0%{?fedora} > 0 && 0%{?fedora} >= 26
if type semodule >/dev/null 2>&1; then
Contributor

echo "HACK: Workaround for broken SELinux policy: https://bugzilla.redhat.com/show_bug.cgi?id=1381331" >&2

Member

Sure! The current test finished and confirmed we are down to one failing test (selinux troubleshooter), so I pushed a fixup.

@mvollmer
Member Author

About check-setroubleshoot:

We don't seem to get any notifications about new alerts from setroubleshootd. That's why the page stays empty. Manual reloading shows the expected alert.

Once an alert is shown, getting its details fails. Setroubleshootd returns a backtrace as a D-Bus error:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/dbus/service.py", line 707, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python3.6/site-packages/setroubleshoot/server.py", line 584, in get_alert
    _("If ") + alert.substitute(plugin.get_if_text(avc, args)),
  File "/usr/share/setroubleshoot/plugins/catchall_boolean.py", line 60, in get_if_text
    txt=seobject.boolean_desc(args[0])
AttributeError: module 'seobject' has no attribute 'boolean_desc'

@mvollmer
Member Author

> We don't seem to get any notifications about new alerts from setroubleshootd.

The reason for that also seems to be the AttributeError. The journal shows

Exception during AVC analysis: module 'seobject' has no attribute 'boolean_desc'

@mvollmer
Member Author

Rebased, still needs a new image. I'll make it.

@mvollmer
Member Author

> Rebased, still needs a new image. I'll make it.

That didn't work, but it might tomorrow. Let's just use what we have.

Member

@martinpitt martinpitt left a comment

I would squash the commits "test: Add known PCP issue also for fedora-26" and the "New" image into the first one. Otherwise this looks good to me now, thanks!

Someone else should also review though, as I meddled with this PR a lot too.

mvollmer and others added 9 commits April 13, 2017 14:14
It conflicts with plain docker...
We need to pass extra flags to activate the fixes in NM that we need
for them.  Let's do that separately.
kube-apiserver.service runs with "User=kube" on Fedora 26, so make sure
it can read the keys.
With Fedora 26's SELinux version, cockpit_ws_t needs the
"execute_no_trans" capability to run cockpit-ssh. Apply a ridiculously
ugly %post hack until it gets fixed properly in the policy. See
<https://bugzilla.redhat.com/show_bug.cgi?id=1381331>.
@mvollmer mvollmer merged commit 4bb89ae into cockpit-project:master Apr 13, 2017