launching aardvark-dns with systemd user on centos7 fails due to session dbus perms? #473

Closed
chrishecker opened this issue Oct 28, 2022 · 11 comments

Comments

@chrishecker

chrishecker commented Oct 28, 2022

I assume centos7 is not a supported platform but please bear with me for a second, I've got it almost working well, I'm super close, and there's just one weird thing I could use some advice/help on...

Backstory: I gave up on docker years ago after I decided it was too much of a security risk, but when I checked back in on containers recently I saw podman was a thing and was natively rootless, and I wanted to give it a go since containers are cool but not so cool I want to run a root docker daemon. I played around with the centos7 version 1.x podman and it was great, but I wanted some of the newer podman features, specifically the better rootless network stuff.[1]

Anyway, I went about building and installing and updating things that seemed like they needed updating, and ended up with this:

  • netavark v1.2.0
  • aardvark-dns v1.2.0
  • slirp4netns v1.2.0
  • podman v4.3.0
  • iptables v1.8.8
  • kernel v6.0.3
  • dbus v1.14.5 (this isn't installed system-wide, but I've built and tested a local user dbus-daemon with it, same results)
  • systemd v219
    (this is centos 7 original 😬 but I (un)patched it to support session/user systemd
    and added user dbus.service and dbus.socket files)
  • I'm probably forgetting other stuff I've updated, it's been a long week.

I have basically everything working, including per user dbus launching with session systemd automatically and working as expected with dbus-test-tool and dbus-monitor and whatnot.

The only remaining problem is aardvark-dns won't launch properly in the containers, and it's due to a dbus authn issue.

Here's what happens:

[checker] ~$ podman network create test
test
[checker] ~$ podman run -dt --network=test --name echo busybox /bin/nc -lk -p 1111 -e echo hello
Failed to start transient scope unit: Operation not permitted
180dd8c8ec4b9eae28457b6cfb1fc9633586744fa78f59643037a2498fc5139f
[checker] ~$ cat /run/user/1000/containers/networks/aardvark-dns/test
10.89.0.1
180dd8c8ec4b9eae28457b6cfb1fc9633586744fa78f59643037a2498fc5139f 10.89.0.30  echo,180dd8c8ec4b
[checker] ~$ podman run -it --network=test  busybox /bin/sh
Failed to start transient scope unit: Operation not permitted
/ # ping echo
ping echo
ping: bad address 'echo'
/ # ping 10.89.0.30
ping 10.89.0.30
PING 10.89.0.30 (10.89.0.30): 56 data bytes
64 bytes from 10.89.0.30: seq=0 ttl=64 time=0.212 ms
64 bytes from 10.89.0.30: seq=1 ttl=64 time=0.076 ms
  C-c C-c^C
--- 10.89.0.30 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.076/0.144/0.212 ms
/ # nc 10.89.0.30 1111
nc 10.89.0.30 1111
hello
  C-c C-c^Cpunt!
/ #

Okay, so the Failed to start transient scope unit: Operation not permitted is obviously an issue. There used to be way more errors before I got systemd user dbus working, including the dbus-daemon leaking (containers/podman#4483, containers/podman#9727, etc.), and ERRO[0000] failed to move the rootless netns slirp4netns process to the systemd user.slice: dbus: invalid bus address (no transport), but once I got session dbus working smoothly, all those went away.

Debugging Failed to start transient scope unit: Operation not permitted led me to this in the --log-level trace for the run:

[DEBUG netavark::dns::aardvark] Spawning aardvark server
[DEBUG netavark::dns::aardvark] start aardvark-dns: ["systemd-run", "-q", "--scope", "--user", "/usr/libexec/podman/aardvark-dns", "--config", "/run/user/1000/containers/networks/aardvark-dns", "-p", "53", "run"]
Failed to start transient scope unit: Operation not permitted

so I set about debugging that. The command runs fine from a normal shell, but it turns out this command also fails in the same way inside a podman unshare --rootless-netns shell:

[checker] ~$ podman unshare --rootless-netns
[root] ~$ systemd-run -q --scope --user /usr/libexec/podman/aardvark-dns --config /run/user/1000/containers/networks/aardvark-dns -p 53 run
Failed to start transient scope unit: Operation not permitted

which is nice because this is a lot easier to debug than a full container. I debugged systemd-run with gdb because I'd already built it to remove the centos patch that disables user systemd (which works fine, others have done it), and systemd-run was failing to talk on the /run/user/1000/systemd/private socket to the mothership. You can see that here:

...
sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL ", iov_len=15}, {iov_base="30", iov_len=2}, {iov_base="\r\nNEGOTIATE_UNIX_FD\r\nBEGIN\r\n", iov_len=28}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 45
...
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="REJECTED\r\nERROR\r\nERROR\r\n", iov_len=256}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS, cmsg_data={pid=13971, uid=0, gid=0}}], msg_controllen=32, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
...
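
A note for readers decoding the trace above: in the D-Bus EXTERNAL mechanism the optional argument after AUTH EXTERNAL is the uid the client claims, written as a decimal string and then hex-encoded, so the 30 here is the ASCII character "0", i.e. root. Below is a minimal Python sketch that pulls the claimed uid out of a captured greeting. The function name claimed_uid is made up for illustration and is not part of any dbus tooling.

```python
from typing import Optional


def claimed_uid(greeting: bytes) -> Optional[int]:
    """Return the uid claimed in an AUTH EXTERNAL line, or None if absent."""
    # A client's very first byte on the socket is a NUL; drop it if present.
    for line in greeting.lstrip(b"\0").split(b"\r\n"):
        parts = line.split()
        if parts[:2] == [b"AUTH", b"EXTERNAL"]:
            if len(parts) < 3:
                return None  # no initial response, so no uid was claimed
            # The argument is the decimal uid string, hex-encoded byte by byte.
            decimal = bytes.fromhex(parts[2].decode("ascii")).decode("ascii")
            return int(decimal)
    return None
```

Fed the bytes from the sendmsg call above, this yields uid 0, which is what systemd-run sees as its own uid inside the namespace.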

Turns out no dbus apps will run in the rootless-netns. Here's the trace on busctl --user, which tries to connect to /run/user/1000/bus, the normal (non-systemd/private) dbus for users:

sendto(3, "AUTH EXTERNAL 30\r\n", 18, MSG_NOSIGNAL, NULL, 0) = 18
poll([{fd=3, events=POLLIN}], 1, -1)    = 1 ([{fd=3, revents=POLLIN}])
read(3, "REJECTED EXTERNAL DBUS_COOKIE_SH"..., 2048) = 46

It looks like the dbus sockets are available, but the dbus-daemon is using the peer credentials unix socket EXTERNAL authn and rejecting the connection, maybe because of the uid mapping? But isn't root in my unshare --rootless-netns shell uid 1000 outside the namespace? Here is somebody else who hacked his dbus authn off to work around this.

So then I went and built and debugged the latest dbus-daemon, dbus-test-tool, etc. It is indeed failing the authn in handle_server_data_external_mech, because the uid it gets off the SO_PEERCRED is 1000 for the connection from the dbus app inside the netns, but the app itself is passing 0 in the AUTH EXTERNAL greeting, and so these fail _dbus_credentials_are_superset. In debugging the daemon, I noticed a bunch of dbus connections as the unshare is set up, but those seem to work because they're not passing the uid in the auth line, although it's hard to tell if they're happening before or after the namespace is set up.
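
The mismatch described above can be approximated in a few lines of Python. This is a simplified stand-in for the daemon's logic, not its actual code: the helper names are hypothetical, only uids are compared (the real check compares full credential sets), and SO_PEERCRED is Linux-only.

```python
import os
import socket
import struct
from typing import Optional, Tuple


def peer_credentials(sock: socket.socket) -> Tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer as recorded by the kernel."""
    # struct ucred is { pid_t pid; uid_t uid; gid_t gid; } on Linux.
    fmt = "iII"
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def external_auth_ok(sock: socket.socket, claimed_uid: Optional[int]) -> bool:
    """Accept if the client claimed no uid, or claimed the uid the kernel reports."""
    _pid, kernel_uid, _gid = peer_credentials(sock)
    return claimed_uid is None or claimed_uid == kernel_uid


if __name__ == "__main__":
    server_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    print("kernel says:", peer_credentials(server_side))
    print("claiming own uid:", external_auth_ok(server_side, os.getuid()))
    print("claiming another uid:", external_auth_ok(server_side, os.getuid() + 1))
```

In the failing case from this thread the kernel-reported uid is 1000 while the client claims 0, so the second branch of the comparison is what rejects the connection.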

Then I tried launching a dbus-daemon inside the netns and I got that to work with aardvark-dns launching, but that seems like a weird thing to have to do.

The other hack I tried was renaming /bin/systemd-run to /bin/systemd-runx so that netavark failed to find it here

if Path::new(SYSTEMD_CHECK_PATH).exists() && Aardvark::is_executable_in_path(SYSTEMD_RUN) {

which also allowed aardvark to launch in the container.
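
netavark itself is Rust; the Python sketch below only mirrors the decision that the quoted check makes, to show why hiding systemd-run changes the launch path. The argument list is copied from the debug log earlier in this issue. The value "/run/systemd/system" for SYSTEMD_CHECK_PATH is an assumption (it is not shown in this thread), and the function name and its use_systemd override are hypothetical.

```python
import os
import shutil
from typing import List, Optional

SYSTEMD_CHECK_PATH = "/run/systemd/system"  # assumption: not confirmed in this thread
SYSTEMD_RUN = "systemd-run"


def aardvark_command(binary: str, config_dir: str, port: int,
                     use_systemd: Optional[bool] = None) -> List[str]:
    """Build the aardvark-dns argv, wrapped in systemd-run when available."""
    if use_systemd is None:
        # Same shape as the Rust condition: systemd is booted AND systemd-run is in PATH.
        use_systemd = (os.path.exists(SYSTEMD_CHECK_PATH)
                       and shutil.which(SYSTEMD_RUN) is not None)
    cmd = [binary, "--config", config_dir, "-p", str(port), "run"]
    if use_systemd:
        cmd = [SYSTEMD_RUN, "-q", "--scope", "--user"] + cmd
    return cmd
```

With systemd-run renamed, the second half of the condition is false and the binary is started directly, which is the behavior observed above.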

Okay, so questions:

  1. Is this supposed to work? I assume because rootless netavark and aardvark-dns are the new hotness that this is just supposed to work with user sessions and dbus and whatnot? netavark seems to require user session systemd to not complain on creation of the container?
  2. If it's supposed to work, which part of my system is buggy? Once I built the latest dbus, that seems to be the last bit of old centos 7 code that was running, so now it isn't the old systemd that's rejecting the dbus connection, even the latest dbus-test-tool talking to the latest dbus-daemon is sending the wrong (root) uid in the connection and getting rejected, so there's something else going on here maybe?

Thanks for reading all this rambling, maybe somebody with more of a systemd/dbus/container/namespace clue than I have can help out!

Chris

[1] although this was working great (and still is): https://github.com/AkihiroSuda/podman-network-create-for-rootless-podman

@Luap99
Member

Luap99 commented Oct 28, 2022

Yes, we require a systemd user session when systemd is booted and systemd-run exists.
I have no idea how to debug this and do not have time for it, since as you said we do not support centos7.
If you already patch most things, I recommend removing this check so it will just start the process normally. We use systemd-run because systemd likes to kill all processes that were spawned in your session when you close it, e.g. an ssh connection. Then dns stops working even though your containers are still running.
I don't even know if this is the case with your old systemd version, so you might not need it anyway.

@chrishecker
Author

chrishecker commented Oct 28, 2022

I have no idea how to debug this and do not have time for it, since as you said we do not support centos7.

Oh yeah, I wasn't expecting you guys to debug this or anything, just was hoping for something pointing in the right direction. I think I've gotten all the centos7 stuff out of the way, and it's just down to this dbus netns uid thing...I assume my first recipe above is just supposed to work with dns enabled in there?

@chrishecker
Author

Oh, and the other question is if you launch a namespace, is the user dbus (usually /run/user/1000/bus) supposed to be mapped in and work? Like even ignoring aardvark, if you do this what happens?

[checker] ~$ podman unshare --rootless-netns
[root] ~$ dbus-test-tool echo

I get a failed to connect due to the uid thing with dbus. What happens on a working system that launches aardvark?

@Luap99
Member

Luap99 commented Oct 28, 2022

On my fedora 36 laptop:

$ podman unshare --rootless-netns dbus-test-tool echo
Failed to connect to bus: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

Also busctl --user works.

@chrishecker
Author

Wait, busctl --user works in the unshare namespace but dbus-test-tool doesn't? That's odd if so...

@Luap99
Member

Luap99 commented Oct 28, 2022

No, dbus-test-tool also works: it successfully connects to dbus, see the org.freedesktop.DBus.Error.NoReply response. I get the same without the namespace.

@chrishecker
Author

Huh, when I run dbus-test-tool echo I get the name back, like:

[checker] ~$ dbus-test-tool echo
:1.182

@chrishecker
Author

As expected, commenting out the systemd-run part of the aardvark launch works and dns works, etc. I am very curious about this dbus in namespaces, maybe I'll launch a cheap vps with latest rocky or whatever and debug it to see how it's handling the uid thing, it's very curious where it could be going wrong...

@chrishecker
Author

Okay, I am testing this on rocky 9...it works, but there's a couple interesting things. First, user systemd isn't enabled by default, podman warns and you have to loginctl enable-linger user, which starts up the session systemd and creates the dbus sockets. However, systemctl --user status errors with Failed to connect to bus: No medium found which is kind of weird? Oh, wait, if I manually set DBUS_SESSION_BUS_ADDRESS then dbus-monitor and systemctl --user status start working.
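
The manual fix mentioned above amounts to pointing DBUS_SESSION_BUS_ADDRESS at the per-user socket. Here is a small Python sketch, assuming the socket lives at /run/user/<uid>/bus as on the systems in this thread (or under $XDG_RUNTIME_DIR when that is set). The helper name is hypothetical.

```python
import os
from typing import Optional


def session_bus_address(uid: Optional[int] = None) -> str:
    """Return a unix:path= address for the user session bus socket."""
    if uid is not None:
        runtime_dir = f"/run/user/{uid}"
    else:
        # Prefer the runtime dir the login session exported, if any.
        runtime_dir = os.environ.get("XDG_RUNTIME_DIR") or f"/run/user/{os.getuid()}"
    return f"unix:path={runtime_dir}/bus"


if __name__ == "__main__":
    # Only fill the variable in when the session did not already set it.
    os.environ.setdefault("DBUS_SESSION_BUS_ADDRESS", session_bus_address())
    print(os.environ["DBUS_SESSION_BUS_ADDRESS"])
```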

The one difference between this setup and mine that I immediately notice is their dbus socket listener is dbus-broker and not dbus-daemon...I wonder if dbus-broker deals with container uid mismatch better...more debugging after dinner.

@chrishecker
Author

Okay, I think I have finally figured out what's going on here. This isn't exactly a dbus issue, I mean it's a dbus issue, but it's not an issue with dbus-daemon or dbus-broker, because none of the podman stuff is calling the normal user dbus, and in fact I think normal dbus actually is busted going from inside a namespace to outside a namespace (on rocky 9 it doesn't work, and your error above on Fedora 36 for dbus-test-tool echo is not a successful run either). The problem is all happening with the private systemd "dbus", which is listened to directly by systemd itself by its internal copy of the dbus code, and is at /run/user/1000/systemd/private. I haven't exactly nailed it down yet, but I think my old version 219 of systemd does not have the right code in it to deal with connections from inside namespaces. Sadly, there's no way I can update systemd, seems too risky, so that might be the end of this journey unless I can patch 219 easily, but hopefully all this will help somebody else who encounters one of the many problems I worked through in a different more supported setup (there are certainly lots of folks out there with weird edge cases with matching error messages I found when doing all this). I'll probably post again if I figure out exactly where in systemd or systemd-run the problem is, but otherwise I think this is somewhat resolved in my mind.

I guess the one remaining question is whether dbus is supposed to work from inside a netns to outside, and I don't think it is? Or, at least, it doesn't right now even on supported systems like rocky 9 and fedora 36. My guess is you'd have to do some user id preservation, or make the client dbus lib understand it needs to use the _CONTAINERS_ROOTLESS_UID or whatever in its AUTH EXTERNAL connections to the dbus.

Chris

@chrishecker
Author

chrishecker commented Oct 29, 2022

OMG FINALLY. I knew there was no deep unresolvable mystery here, there had to be somebody doing the wrong thing. After a bunch of debugging of systemd and systemd-run, I found that on the rocky 9 version the uid simply isn't sent from the systemd-run client (because it comes on the SO_PEERCRED anyway and so it's just a waste and messes up namespaces), and so I found the changelist where they fixed it in the systemd project (systemd/systemd@1ed4723) and backported it to my 219 systemd, which I was building anyway because for some reason Red Hat patched out all the user/session stuff (Patch0004 for those interested, you also need dbus.socket and dbus.service for user space). Works perfectly, acts just exactly like rocky 9 now even with the unpatched systemd-run netavark, woot!

For anybody in the future trying to get dbus working from inside to outside a rootless container even on a supported os, I assume a similar patch would need to be made to the dbus libs, since it looks like they always append the UID (at least as of the date of this post: https://gitlab.freedesktop.org/dbus/dbus/-/blob/master/dbus/dbus-auth.c#L1231), so if for some reason you're trying to tunnel dbus out of the rootless container to the user/session dbus, this is probably why it's failing (the AUTH EXTERNAL uid\r\n needs to be switched to not pass the uid and do the DATA\r\n trick like in the comment on that systemd commit).
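
To make the two greetings concrete, here is a hedged Python sketch that builds both. The uid-carrying form reproduces the bytes seen in the strace output earlier in this issue. The uid-less form follows the D-Bus specification's description of EXTERNAL without an initial response (the server asks for DATA and the client answers with an empty DATA line); sending the client's lines up front in a single write is my reading of the approach described above, not a copy of systemd's code, and the function name is made up.

```python
from typing import Optional


def external_greeting(uid: Optional[int] = None) -> bytes:
    """Build the client's opening bytes for D-Bus EXTERNAL authentication."""
    if uid is None:
        # Claim nothing: the server falls back to the kernel's SO_PEERCRED uid.
        auth = b"AUTH EXTERNAL\r\nDATA\r\n"
    else:
        # Decimal uid string, hex-encoded: uid 0 becomes "30".
        hex_uid = str(uid).encode("ascii").hex().encode("ascii")
        auth = b"AUTH EXTERNAL " + hex_uid + b"\r\n"
    # Leading NUL byte, then the auth lines, then fd negotiation and BEGIN.
    return b"\0" + auth + b"NEGOTIATE_UNIX_FD\r\nBEGIN\r\n"
```

The uid-less form sidesteps the namespace problem because the only uid the server ever sees is the one the kernel attaches to the socket.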

Anyway, I think I'm really actually done now, mystery solved, phew.

@Luap99 Luap99 closed this as completed Dec 16, 2022