lxc-start veth bridge misconfiguration (no master) #4433

Open
yegorius opened this issue Apr 8, 2024 · 4 comments
Labels
Incomplete (waiting on more information from reporter)

Comments

@yegorius

yegorius commented Apr 8, 2024

Required information

  • Distribution: ArchLinux
  • Distribution version: rolling
> lxc-start --version
6.0.0
> uname -a
Linux 6.7.6-zen1-2-zen #1 ZEN SMP PREEMPT_DYNAMIC x86_64 GNU/Linux
> cat /proc/self/cgroup
0::/user.slice/user-1000.slice/session-1.scope
lxc-checkconfig
LXC version 6.0.0

--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Warning: newuidmap is not setuid-root
Warning: newgidmap is not setuid-root
Network namespace: enabled
Namespace limits:
  cgroup: 256831
  ipc: 256831
  mnt: 256831
  net: 256831
  pid: 256831
  time: 256831
  user: 256831
  uts: 256831

--- Control groups ---
Cgroups: enabled
Cgroup namespace: enabled
Cgroup v1 mount points: 
Cgroup v2 mount points: 
 - /sys/fs/cgroup
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled, loaded
Macvlan: enabled, not loaded
Vlan: enabled, not loaded
Bridges: enabled, loaded
Advanced netfilter: enabled, loaded
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: enabled
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: enabled
CONFIG_NETLINK_DIAG: enabled
File capabilities: enabled
cat /proc/1/mounts
dev /dev devtmpfs rw,nosuid,relatime,size=32874420k,nr_inodes=8218605,mode=755,inode64 0 0
sys /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
run /run tmpfs rw,nosuid,nodev,mode=755,inode64 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
/dev/nvme0n1p2 / f2fs rw,lazytime,relatime,background_gc=on,nogc_merge,discard,discard_unit=block,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,barrier,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,checkpoint_merge,fsync_mode=posix,memory=normal,errors=continue 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=37,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=9236 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,nosuid,nodev,relatime,pagesize=2M 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
systemd-1 /efi autofs rw,relatime,fd=52,pgrp=1,timeout=120,minproto=5,maxproto=5,direct,pipe_ino=4179 0 0
systemd-1 /mnt/webdav/domicus autofs rw,relatime,fd=53,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=4184 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,nr_inodes=1048576,inode64 0 0
/dev/nvme0n1p3 /home f2fs rw,lazytime,relatime,background_gc=on,nogc_merge,discard,discard_unit=block,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,barrier,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,checkpoint_merge,fsync_mode=posix,memory=normal,errors=continue 0 0
tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=6576072k,nr_inodes=1644018,mode=700,uid=1000,gid=100,inode64 0 0
gvfsd-fuse /run/user/1000/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=100 0 0
rw,nosuid,nodev,relatime,user_id=1000,group_id=100,allow_other,max_read=16384 0 0
binder /dev/binderfs binder rw,relatime,max=1048576 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0

Issue description

There is no network connection inside a freshly created, unmodified container after lxc-start.
The guest distro doesn't matter: I tested with Alpine edge as well as the latest Waydroid.
I did some debugging and found some clues.
For the veth network type, lxc-start should attach the host-side end of the container's veth pair to a bridge, and the debug log confirms that it does:

> lxc-start ... -l debug
...
lxc-start test 20240408183711.793 INFO     network - ../src/lxc/network.c:netdev_configure_server_veth:745 - Attached "veth0xYE0v" to bridge "lxcbr0"
lxc-start test 20240408183711.793 DEBUG    network - ../src/lxc/network.c:netdev_configure_server_veth:876 - Instantiated veth tunnel "veth0xYE0v <--> vethNYVz4E"
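
For anyone who wants to script this check, here is a minimal sketch that pulls the host-side veth name and the bridge name out of a debug log. It assumes the log line format shown above; the helper name find_attachments is made up for illustration and is not part of LXC.

```python
import re

# Matches the INFO line that lxc-start printed in the log above when it
# attached the host-side veth. The exact wording is taken from that log
# and may differ in other LXC versions (assumption).
ATTACH_RE = re.compile(r'Attached "(?P<veth>[^"]+)" to bridge "(?P<bridge>[^"]+)"')


def find_attachments(log_text):
    """Return (veth, bridge) pairs reported as attached in the log text."""
    return [(m.group("veth"), m.group("bridge")) for m in ATTACH_RE.finditer(log_text)]


if __name__ == "__main__":
    import sys

    # Usage: lxc-start ... -l debug -o /dev/stdout | python3 this_script.py
    for veth, bridge in find_attachments(sys.stdin.read()):
        print(veth, bridge)
```

The extracted names can then be compared against what the kernel actually reports for the bridge.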

But there is still no network in the container: on Alpine edge, for example, udhcpc fails to get a DHCP lease.
On the host, the output of bridge link is empty, and so is the output of ip link show master lxcbr0.
Bridge state in sysfs and in ip link:

> ls -l /sys/class/net/lxcbr0/brif/
total 0
> ip link
...
31: lxcbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
36: veth0xYE0v@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether fe:95:81:4b:e9:bd brd ff:ff:ff:ff:ff:ff link-netnsid 0

Here you can see that the bridge is DOWN (NO-CARRIER) and that veth0xYE0v has no master. A correctly configured bridge should be UP, and veth0xYE0v should list lxcbr0 as its master.
If I now run ip link set dev veth0xYE0v master lxcbr0, everything starts to work: the container receives an IP address, the bridge goes UP, and so on.
So the bridge attachment made by lxc-start somehow does not take effect.
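
The same membership check can be done programmatically. This is a rough sketch that reads the standard sysfs entries (the bridge's brif directory and the interface's master symlink); the function names are hypothetical, and the sysfs_net parameter exists only so the code can be pointed at a test directory.

```python
import os


def bridge_ports(bridge, sysfs_net="/sys/class/net"):
    """List the interfaces enslaved to `bridge` (entries of <bridge>/brif).

    Returns an empty list if the brif directory does not exist.
    """
    brif = os.path.join(sysfs_net, bridge, "brif")
    try:
        return sorted(os.listdir(brif))
    except FileNotFoundError:
        return []


def master_of(iface, sysfs_net="/sys/class/net"):
    """Return the name of the master device of `iface`, or None if it has none."""
    link = os.path.join(sysfs_net, iface, "master")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


def is_attached(iface, bridge, sysfs_net="/sys/class/net"):
    """True only if both sides agree that `iface` is a port of `bridge`."""
    return master_of(iface, sysfs_net) == bridge and iface in bridge_ports(bridge, sysfs_net)
```

On the affected host, is_attached("veth0xYE0v", "lxcbr0") would be expected to return False right after lxc-start and True after the manual ip link set ... master command.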
Digging deeper, we can see that lxc-start does end up calling br_add_if in the kernel, the same as ip link set ... master:

bpftrace -e 'kfunc:br_add_if { printf("%s: %s %s\n", comm, args.br->dev->name, args.dev->name); }'
Attaching 1 probe...
lxc-start: lxcbr0 veth0xYE0v
ip: lxcbr0 veth0xYE0v

The only difference is that iproute2 configures the network layer through netlink, whereas lxc-start calls ioctl(SIOCBRADDIF), which is the older, legacy way of configuring bridge ports.
So either lxc-start misses something during network configuration, or something is wrong with the underlying host.
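
To make the ioctl path concrete, here is a small sketch of what a SIOCBRADDIF request looks like from user space. It is not LXC's code, only an illustration under assumptions: the 40-byte struct ifreq layout of 64-bit Linux, and helper names (build_ifreq, bridge_add_port_ioctl) that are made up here. The constant value comes from linux/sockios.h.

```python
import fcntl
import socket
import struct

SIOCBRADDIF = 0x89A2  # from <linux/sockios.h>
IFNAMSIZ = 16


def build_ifreq(bridge, ifindex):
    """Pack a struct ifreq with ifr_name=bridge and ifr_ifindex=ifindex.

    Assumes the 40-byte struct ifreq of 64-bit Linux: a 16-byte name
    followed by a 24-byte union whose first member is the int ifindex.
    """
    name = bridge.encode()
    if len(name) >= IFNAMSIZ:
        raise ValueError("interface name too long")
    # "16s" pads the name with NUL bytes, "20x" pads the rest of the union.
    return struct.pack("16si20x", name, ifindex)


def bridge_add_port_ioctl(bridge, port):
    """Attach `port` to `bridge` via the legacy ioctl (needs CAP_NET_ADMIN)."""
    ifindex = socket.if_nametoindex(port)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock, SIOCBRADDIF, build_ifreq(bridge, ifindex))
```

Calling bridge_add_port_ioctl("lxcbr0", "veth0xYE0v") as root should go through the same br_add_if kernel function as the bpftrace output shows, which makes it a way to test the ioctl path in isolation from lxc-start.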

Steps to reproduce

  1. Set USE_LXC_BRIDGE="true" in /etc/default/lxc
  2. Start lxc-net
  3. Create a new container with Alpine Linux edge amd64
  4. Start the container with lxc-start -n test -F
  5. Observe the init output: udhcpc failed to get a DHCP lease

Additional info

The kernel log doesn't contain anything unusual.

Container log
> lxc-start -n test -F

   OpenRC 0.54 is starting up Linux 6.7.6-zen1-2-zen (x86_64) [LXC]

 * /proc is already mounted
 * Mounting /run ... [ ok ]
 * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Migrating /var/lock to /run/lock ... [ ok ]
 * Creating user login records ... [ ok ]
 * Cleaning /tmp directory ... [ ok ]
 * Remounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting /dev/shm ... [ ok ]
 * Starting busybox syslog ... [ ok ]
 * Starting busybox crond ... [ ok ]
 * Starting networking ... *   lo ... [ ok ]
 *   eth0 ...udhcpc: started, v1.36.1
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc failed to get a DHCP lease
udhcpc: no lease, forking to background
 [ ok ]

Welcome to Alpine Linux 3.19
Kernel 6.7.6-zen1-2-zen on an x86_64 (/dev/console)

test login:

Container config file
# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template:
# Template script checksum (SHA-1): f568fbbaa379c008dd8abe57067fd20be66ad75a
# For additional config options, please look at lxc.container.conf(5)

# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)

# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.arch = linux64

# Container specific configuration
lxc.rootfs.path = dir:/var/lib/lxc/test/rootfs
lxc.uts.name = test

# Network configuration
lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:95:df:78

@DevonSchwartz

I am a student from UT and we are contributing to open source repositories for our final project. Could we be assigned this issue?

@DevonSchwartz

Also, which file in the git repo would we need to change to replicate this issue?

@stgraber
Member

@yegorius do you have NetworkManager or a similar network management tool running on that system?

@stgraber stgraber added the Incomplete Waiting on more information from reporter label Apr 30, 2024
@stgraber
Member

@DevonSchwartz it's not clear yet that this is a bug in LXC; we've seen this kind of issue come from external tools interfering with LXC. If this were a general problem, we'd have seen a LOT of people complaining about it :)
