
podman doesn't start aardvark-dns when in an LXC container #18783

Open
abalmos opened this issue Jun 2, 2023 · 22 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. network Networking related issue or feature stale-issue

Comments

@abalmos

abalmos commented Jun 2, 2023

Issue Description

Inside an LXC container, podman does not start aardvark-dns. This breaks container DNS resolution when a podman network is used, because podman still rewrites the container's /etc/resolv.conf to point at the aardvark-dns instance it expects to be running.

I do not see this issue in a very similar environment running on bare metal rather than in an LXC container.

After starting the aardvark-dns daemon by hand with /usr/libexec/podman/aardvark-dns --config /run/containers/storage/networks/aardvark-dns -p 53 run, everything works as expected for the remainder of the LXC container's lifetime.

I understand that this environment may not be available to the podman developers. I am willing to do some debugging, but so far I have not been able to identify any error or other logging that might indicate the cause.

Steps to reproduce the issue

  1. Start (in my case) a Fedora 38 LXC container.
  2. podman network create test
  3. podman run --rm -it --network test fedora:38 bash
  4. getent hosts example.org <-- this returns nothing.

Without the --network flag, getent hosts example.org returns 2606:2800:220:1:248:1893:25c8:1946 example.org.

After starting aardvark-dns on the host with /usr/libexec/podman/aardvark-dns --config /run/containers/storage/networks/aardvark-dns -p 53 run, DNS resolution works even with the --network flag.
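
A minimal sketch of that workaround, using the paths from this report (start aardvark-dns on the LXC host, then verify DNS from a container attached to the test network):

# on the LXC host: start aardvark-dns by hand with the config netavark generated
/usr/libexec/podman/aardvark-dns \
    --config /run/containers/storage/networks/aardvark-dns -p 53 run

# from another shell: name resolution now works on the podman network
podman run --rm --network test fedora:38 getent hosts example.org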

Describe the results you received

DNS resolution doesn't work.

Describe the results you expected

DNS resolution to work.

podman info output

If you are unable to run podman info for any reason, please provide the podman version, operating system and its version and the architecture you are running.

Fedora 38


Client:       Podman Engine
Version:      4.5.0
API Version:  4.5.0
Go Version:   go1.20.2
Built:        Fri Apr 14 15:42:22 2023
OS/Arch:      linux/amd64

I cannot easily upgrade the environment to podman 4.5.1, but based on the changelog I do not think this issue is addressed there.



### Podman in a container

Yes

### Privileged Or Rootless

Privileged

### Upstream Latest Release

No

### Additional environment details

Within an LXC container.

### Additional information

Additional information like issue happens only occasionally or issue happens with a particular architecture or on a particular setting
@abalmos abalmos added the kind/bug Categorizes issue or PR as related to a bug. label Jun 2, 2023
@abalmos
Author

abalmos commented Jun 2, 2023

I should note that in our actual environment we are using Quadlet. It may be that Quadlet is the reason the DNS server is not started.

@Luap99
Member

Luap99 commented Jun 5, 2023

Please add --log-level debug to your podman run commands; that should show how aardvark-dns is started.
Is systemd running in your LXC container? If not, does /run/systemd/system/ exist? We try to use systemd-run in that case.
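
Concretely, something along these lines should show what we need (a sketch; adjust the network and image names to your setup):

# inside the LXC container: is systemd PID 1, and does the marker directory exist?
ps -p 1 -o comm=
ls -d /run/systemd/system

# re-run with debug logging to see how aardvark-dns is launched
podman run --rm -it --network test --log-level debug fedora:38 bash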

@Luap99 Luap99 added the network Networking related issue or feature label Jun 5, 2023
@aither64

I'm not sure whether it is the same issue, but I have also hit aardvark-dns not starting in an LXC container, so perhaps this will help. It happens as follows:

  1. Create an unprivileged LXC container, so that it is running in a user namespace
  2. Login to the container as a non-root user
  3. sudo podman network create test
  4. sudo podman run --rm -it --network test --log-level debug fedora:38 bash

The issue here is that, because podman is running in a user namespace (the LXC container), it treats the environment as rootless. In rootless mode netavark uses systemd-run --user to start aardvark-dns, which requires a proper systemd user session. sudo does not create such a session, so systemd-run fails:

[DEBUG netavark::dns::aardvark] Spawning aardvark server
[DEBUG netavark::dns::aardvark] start aardvark-dns: ["systemd-run", "-q", "--scope", "--user", "/usr/libexec/podman/aardvark-dns", "--config", "/run/containers/storage/networks/aardvark-dns", "-p", "53", "run"]
Failed to connect to bus: No medium found

It works when you log in as root directly and do not use sudo/su to switch privileges. It could be the same issue if Quadlet does not run podman within a proper user session, which seems possible.
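
One way to see the missing piece from a shell, plus a possible workaround (a sketch only; machinectl shell is my assumption here, the point being that systemd-run --user needs a real login session so it can find root's user bus):

# systemd-run --user locates the user manager via the session environment;
# under sudo/su these are usually unset, hence "Failed to connect to bus"
echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR"
echo "DBUS_SESSION_BUS_ADDRESS=$DBUS_SESSION_BUS_ADDRESS"

# a proper PAM login session for root (for example via machinectl on a
# systemd host) makes the same systemd-run --user invocation work
machinectl shell root@.host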

@abalmos abalmos closed this as completed Jun 19, 2023
@abalmos
Author

abalmos commented Jun 19, 2023

Sorry for the lack of reply, I couldn't find much free time while on a work trip.

I think @aither64 is exactly right. I should have clarified that I am running these commands as root within a rootless LXC container. It appears that podman is somehow misidentifying the situation as a rootless podman environment and adding --user to the systemd-run command. I don't think it should in this case.

@abalmos abalmos reopened this Jun 19, 2023
@Luap99
Member

Luap99 commented Jun 20, 2023

Do you have the CAP_SYS_ADMIN and CAP_NET_ADMIN capabilities in the LXC container?
It seems like podman automatically switches to rootless mode when run as root without CAP_SYS_ADMIN, because it would not work otherwise.

@giuseppe Do we need to check for _CONTAINERS_ROOTLESS_UID != 0 instead when we decide whether to talk to systemd? This is not exclusive to the aardvark-dns startup; we do the same for the systemd healthchecks.

@giuseppe
Member

@Luap99 yes, that seems like a better check. We need to talk to the --user session only when running as UID != 0, not when we lack CAP_SYS_ADMIN.

@abalmos
Author

abalmos commented Jun 20, 2023

@Luap99 I do have CAP_SYS_ADMIN and CAP_NET_ADMIN in the LXC container.

[abalmos@CT103 ~]$ sudo capsh --print | grep -i "CAP_SYS_ADMIN"
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read,cap_perfmon,cap_bpf,cap_checkpoint_restore
[abalmos@CT103 ~]$ sudo capsh --print | grep -i "CAP_NET_ADMIN"
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read,cap_perfmon,cap_bpf,cap_checkpoint_restore

@Luap99
Member

Luap99 commented Jun 20, 2023

AFAIK the bounding set does not mean you have those caps; it just means you may gain them via exec().

$ grep Cap /proc/self/status 
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000
$ sudo grep Cap /proc/self/status 
CapInh:	0000000000000000
CapPrm:	000001ffffffffff
CapEff:	000001ffffffffff
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000

CapEff is important here
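
If you want the hex masks translated back into capability names, capsh can decode them, e.g.:

# decode the CapEff mask of the current shell into capability names
capsh --decode="$(awk '/CapEff/ {print $2}' /proc/$$/status)"
# an all-zero mask decodes to an empty list, i.e. no effective capabilities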

@giuseppe
Member

You have them in the bounding set, which means you can gain them, but you do not currently have them.

@abalmos
Author

abalmos commented Jun 20, 2023

Of course, that makes sense. Correct then: I don't have those capabilities:

$ grep Cap /proc/self/status
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000

@aither64

@abalmos did you run that as root in the LXC container? I do have them as root:

# grep Cap /proc/self/status
CapInh:	0000000000000000
CapPrm:	000001ffffffffff
CapEff:	000001ffffffffff
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000

@abalmos
Author

abalmos commented Jun 20, 2023

@aither64 You're right; as root I seem to have them.

$ grep Cap /proc/self/status
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
$ sudo grep Cap /proc/self/status
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
# grep Cap /proc/self/status
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000

@Luap99
Member

Luap99 commented Jul 13, 2023

Can you show me a full podman run with --log-level debug? That should give some clarity on what is happening. podman info would also be helpful.

@aither64

The following commands are run as root from an LXC container, but without a proper systemd user session as I've described above.

podman run --rm -it --network test --log-level debug fedora:38 bash

root@vps:~# podman network create test

root@vps:~# podman run --rm -it --network test --log-level debug fedora:38 bash
INFO[0000] podman filtering at log level debug
DEBU[0000] Called run.PersistentPreRunE(podman run --rm -it --network test --log-level debug fedora:38 bash)
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] Unknown filesystem type 0x13372fc1 reported for /var/lib/containers/storage
DEBU[0000] overlay: storage already configured with a mount-program
DEBU[0000] backingFs=, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 25
DEBU[0000] Successfully loaded network test: &{test 18ac048c4bf7f829ae083a84046b1b8c9aaa110bc16bc7939ef270b8546384e4 bridge podman1 2023-06-19 08:45:25.319348489 +0200 CEST [{{{10.89.0.0 ffffff00}} 10.89.0.1 }] false false true [] map[] map[] map[driver:host-local]}
DEBU[0000] Successfully loaded 2 networks
DEBU[0000] Pulling image fedora:38 (policy: missing)
DEBU[0000] Looking up image "fedora:38" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf"
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/000-shortnames.conf"
DEBU[0000] Trying "registry.fedoraproject.org/fedora:38" ...
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Found image "fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage
DEBU[0000] Found image "fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c)
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Looking up image "registry.fedoraproject.org/fedora:38" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Trying "registry.fedoraproject.org/fedora:38" ...
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Found image "registry.fedoraproject.org/fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage
DEBU[0000] Found image "registry.fedoraproject.org/fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c)
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Looking up image "fedora:38" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Trying "registry.fedoraproject.org/fedora:38" ...
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Found image "fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage
DEBU[0000] Found image "fedora:38" as "registry.fedoraproject.org/fedora:38" in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c)
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Inspecting image ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Inspecting image ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c
DEBU[0000] Inspecting image ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c
DEBU[0000] Inspecting image ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c
DEBU[0000] Inspecting image ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c
DEBU[0000] using systemd mode: false
DEBU[0000] No hostname set; container's hostname will default to runtime default
DEBU[0000] Loading seccomp profile from "/usr/share/containers/seccomp.json"
DEBU[0000] Allocated lock 0 for container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] exporting opaque data as blob "sha256:ded636d6da3e0389777047036bf6161df6a208e17655ee99aca4af042198c50c"
DEBU[0000] Created container "caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb"
DEBU[0000] Container "caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb" has work directory "/var/lib/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata"
DEBU[0000] Container "caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb" has run directory "/run/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata"
DEBU[0000] Handling terminal attach
INFO[0000] Received shutdown.Stop(), terminating! PID=461067
DEBU[0000] Enabling signal proxying
DEBU[0000] could not get XDG_RUNTIME_DIR
DEBU[0000] Ignoring global metacopy option, the mount program doesn't support it
DEBU[0000] overlay: mount_data=lowerdir=/var/lib/containers/storage/overlay/l/ZQZG5KTOJ7XHPRLYIEA4563LZC,upperdir=/var/lib/containers/storage/overlay/dc905444c12c11a4ff0a8799e3762434c2f9c4771bc5cb3cd082f08649e1dcc4/diff,workdir=/var/lib/containers/storage/overlay/dc905444c12c11a4ff0a8799e3762434c2f9c4771bc5cb3cd082f08649e1dcc4/work,nodev,volatile
DEBU[0000] Made network namespace at /run/user/0/netns/netns-bb906e7c-7708-8385-7e63-b8c63540555b for container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb
DEBU[0000] Mounted container "caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb" at "/var/lib/containers/storage/overlay/dc905444c12c11a4ff0a8799e3762434c2f9c4771bc5cb3cd082f08649e1dcc4/merged"
[DEBUG netavark::network::validation] "Validating network namespace..."
[DEBUG netavark::commands::setup] "Setting up..."
[INFO netavark::firewall] Using iptables firewall driver
DEBU[0000] Created root filesystem for container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb at /var/lib/containers/storage/overlay/dc905444c12c11a4ff0a8799e3762434c2f9c4771bc5cb3cd082f08649e1dcc4/merged
[DEBUG netavark::network::bridge] Setup network test
[DEBUG netavark::network::bridge] Container interface name: eth0 with IP addresses [10.89.0.3/24]
[DEBUG netavark::network::bridge] Bridge name: podman1 with IP addresses [10.89.0.1/24]
[DEBUG netavark::network::core_utils] Setting sysctl value for net.ipv4.ip_forward to 1
[DEBUG netavark::network::core_utils] Setting sysctl value for /proc/sys/net/ipv6/conf/eth0/autoconf to 0
[INFO netavark::network::netlink] Adding route (dest: 0.0.0.0/0 ,gw: 10.89.0.1, metric 100)
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-EE26B0DD4AF7E created on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_2 exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_2 exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_3 exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_3 exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_FORWARD exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_FORWARD exists on table filter
[DEBUG netavark::firewall::varktables::helpers] rule -d 10.89.0.0/24 -j ACCEPT created on table nat and chain NETAVARK-EE26B0DD4AF7E
[DEBUG netavark::firewall::varktables::helpers] rule ! -d 224.0.0.0/4 -j MASQUERADE created on table nat and chain NETAVARK-EE26B0DD4AF7E
[DEBUG netavark::firewall::varktables::helpers] rule -s 10.89.0.0/24 -j NETAVARK-EE26B0DD4AF7E created on table nat and chain POSTROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -d 10.89.0.0/24 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::varktables::helpers] rule -s 10.89.0.0/24 -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::iptables] Adding firewalld rules for network 10.89.0.0/24
[DEBUG netavark::firewall::firewalld] Adding subnet 10.89.0.0/24 to zone trusted as source
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-SETMARK exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-SETMARK exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-MASQ exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-MASQ exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-DNAT exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-DNAT exists on table nat
[DEBUG netavark::firewall::varktables::helpers] rule -j MARK --set-xmark 0x2000/0x2000 exists on table nat and chain NETAVARK-HOSTPORT-SETMARK
[DEBUG netavark::firewall::varktables::helpers] rule -j MASQUERADE -m comment --comment 'netavark portfw masq mark' -m mark --mark 0x2000/0x2000 exists on table nat and chain NETAVARK-HOSTPORT-MASQ
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL exists on table nat and chain PREROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL exists on table nat and chain OUTPUT
[DEBUG netavark::dns::aardvark] Spawning aardvark server
[DEBUG netavark::dns::aardvark] start aardvark-dns: ["systemd-run", "-q", "--scope", "--user", "/usr/libexec/podman/aardvark-dns", "--config", "/run/containers/storage/networks/aardvark-dns", "-p", "53", "run"]
Failed to connect to bus: No medium found
[DEBUG netavark::commands::setup] {
"test": StatusBlock {
dns_search_domains: Some(
[
"dns.podman",
],
),
dns_server_ips: Some(
[
10.89.0.1,
],
),
interfaces: Some(
{
"eth0": NetInterface {
mac_address: "3e:aa:e5:f5:9e:b0",
subnets: Some(
[
NetAddress {
gateway: Some(
10.89.0.1,
),
ipnet: 10.89.0.3/24,
},
],
),
},
},
),
},
}
[DEBUG netavark::commands::setup] "Setup complete"
DEBU[0000] Adding nameserver(s) from network status of '["10.89.0.1"]'
DEBU[0000] Adding search domain(s) from network status of '["dns.podman"]'
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription
DEBU[0000] Setting Cgroup path for container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb to /libpod_parent/libpod-caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] Workdir "/" resolved to host path "/var/lib/containers/storage/overlay/dc905444c12c11a4ff0a8799e3762434c2f9c4771bc5cb3cd082f08649e1dcc4/merged"
DEBU[0000] Created OCI spec for container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb at /var/lib/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata/config.json
DEBU[0000] /usr/bin/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/bin/conmon args="[--api-version 1 -c caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb -u caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata -p /run/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata/pidfile -n optimistic_clarke --exit-dir /run/libpod/exits --full-attach -l journald --log-level debug --syslog -t --conmon-pidfile /run/containers/storage/overlay-containers/caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --network-config-dir --exit-command-arg --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/lib/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev,metacopy=on --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb]"
DEBU[0000] Received: 461125
INFO[0000] Got Conmon PID as 461123
DEBU[0000] Created container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb in OCI runtime
DEBU[0000] Attaching to container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb
DEBU[0000] Received a resize event: {Width:119 Height:69}
DEBU[0000] Starting container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb with command [bash]
DEBU[0000] Started container caee546ed51f5522141eab5bd36246f2e465be4fbbe238d43bf6706c7b3e4ffb
DEBU[0000] Notify sent successfully
[root@caee546ed51f /]#

podman info

root@vps:~# podman info
host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.94
    systemPercent: 0.01
    userPercent: 0.05
  cpus: 8
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    version: "38"
  eventLogger: journald
  hostname: vps
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.164
  linkmode: dynamic
  logDriver: journald
  memFree: 3535634432
  memTotal: 4294967296
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.5-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 772h 27m 50.00s (Approximately 32.17 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 64424509440
  graphRootUsed: 1385693184
  graphStatus:
    Backing Filesystem: 
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.5.1
  Built: 1685123928
  BuiltTime: Fri May 26 19:58:48 2023
  GitCommit: ""
  GoVersion: go1.20.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.1

You might notice the backing filesystem is reported as unknown; it is ZFS with a modified signature -- I don't think that matters in this case. I think the issue is related to podman being run in a user namespace:

root@vps:~# cat /proc/self/uid_map
         0   10616832     524288

@Luap99
Member

Luap99 commented Jul 21, 2023

You might notice the backing file system being unknown, it is ZFS with a modified signature -- I don't think this matters in this case. I think it can be related to podman being run in a user namespace:

root@vps:~# cat /proc/self/uid_map
0 10616832 524288

That is it: if I read the logs correctly, podman's rootless.IsRootless() works as expected, while the c/storage function unshare.IsRootless() used by the networking code does not.

@giuseppe PTAL, it seems concerning that the two functions behave differently. I'm not sure why you added the uid-mapping check?
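
For anyone following along, the two notions of "rootless" can be told apart from a shell (illustrative checks only, not the actual implementations in podman or c/storage):

# podman's notion: effective uid 0 plus CAP_SYS_ADMIN
id -u
capsh --decode="$(awk '/CapEff/ {print $2}' /proc/$$/status)" | grep -o cap_sys_admin

# c/storage's notion: are we still in the initial user namespace?
# the initial namespace has the identity mapping "0 0 4294967295";
# any other mapping (like the one shown above) means a user namespace
cat /proc/self/uid_map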

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@abalmos
Author

abalmos commented Oct 6, 2023

I just wanted to report that this issue is not stale. We noticed it again today after some system updates.

@Luap99
Member

Luap99 commented Oct 17, 2023

@giuseppe Can you respond to #18783 (comment)?

The fact that podman and c/storage disagree about what IsRootless() means is concerning to me. c/storage checks for the initial user namespace, while podman is happy with uid 0 and CAP_SYS_ADMIN.

I think the podman check makes more sense in the context of nested containers. There should be no need for the network code to fall back to the rootless logic, since that makes things much slower with the extra slirp4netns setup.

@giuseppe
Member

yeah it is messy :/

The c/storage meaning is tied to what the kernel allows us to do, since some features are not available when we are not in the initial user namespace (e.g. idmapped mounts except for tmpfs, or not using "-o user" for overlay), while the expectation for rootless podman is to behave more like root when running in a user namespace, especially for the nested-podman use case.

Should the network code really be something like useSlirp := rootless.IsRootless() || !HasCapNetAdmin(), since it also needs CAP_NET_ADMIN?

@Luap99
Copy link
Member

Luap99 commented Oct 19, 2023

I think checking NET_ADMIN for the network code would make sense, but that likely needs a bigger change that I don't want to make right now just to fix this.
I feel like it would make sense for c/storage to have two functions: IsRootless(), which keeps the podman logic, and IsInitUserns(), to check for the cases where that is what actually matters.

My main issue is that basically all the code I moved to c/common breaks in these nested-container scenarios, because I changed it from rootless.IsRootless() in podman to unshare.IsRootless() from c/storage.

@abalmos
Author

abalmos commented Nov 21, 2023

@Luap99 Does #1740 fix this, or is the issue still standing? It is very surprising every time I come across it. Thanks!

@Luap99
Member

Luap99 commented Nov 21, 2023

No, this is not fixed by any of the XDG changes.
I guess the best fix for this issue is to just check the uid: if it is 0, use the system session; otherwise use the user session.
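
Expressed with systemd-run directly, the proposed behaviour would look roughly like this (a sketch only, reusing the aardvark-dns invocation from the logs above; this is not what netavark does today):

if [ "$(id -u)" -eq 0 ]; then
    # uid 0: talk to the system instance of systemd
    systemd-run -q --scope /usr/libexec/podman/aardvark-dns \
        --config /run/containers/storage/networks/aardvark-dns -p 53 run
else
    # any other uid: use the per-user instance
    systemd-run -q --scope --user /usr/libexec/podman/aardvark-dns \
        --config /run/containers/storage/networks/aardvark-dns -p 53 run
fi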
