Arch LXD (host + client): container not getting ipv4 address #4071

Closed
CountZukula opened this Issue Dec 4, 2017 · 14 comments

CountZukula commented Dec 4, 2017

Required information

config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----------
    -----END CERTIFICATE-----
  certificate_fingerprint: fdc630712d08eff1508eaf7cc281cea768ece662c75639fbeb6b4dfa9badda42
  driver: lxc
  driver_version: 2.1.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.13.16-1-hardened
  server: lxd
  server_pid: 194
  server_version: "2.20"
  storage: btrfs
  storage_version: "4.13"

Issue description

I installed Arch in a VirtualBox VM and started both an Ubuntu and an Arch container on it. The Ubuntu container gets an IPv4 address; the Arch container doesn't. The host seems to behave well: it hands out IPs, the containers start OK, and unprivileged mode works. The Arch container starts correctly but only gets an IPv6 address, never an IPv4 one. dhcpcd reports that it finds no valid interfaces (output below). As far as I can tell, LXD's networking is working fine, so I'm at a loss where to look next. I suspect the archlinux image is misbehaving. To be clear, my goal is to get networking working in the Arch container: an IPv4 address and access to the outside network.

[root@testcontainer ~]# dhcpcd       
dev: loaded udev
no valid interfaces found
no interfaces have a carrier
forked to background, child pid 22431

Steps to reproduce

  1. install Arch on VirtualBox (with linux-hardened, to enable unprivileged containers)
  2. install and configure LXD / LXC from the AUR
  3. launch an Arch container (lxc launch images:archlinux/current/amd64) and an Ubuntu container (lxc launch ubuntu:16.04)
  4. the Ubuntu container gets an IPv4 address, the Arch container doesn't
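
For anyone following along, the reproduction and the IPv4 check can be sketched in shell roughly as follows. The container names (archtest, ubuntutest) and the function names are hypothetical; the image aliases are the ones from step 3, and the parser assumes the `lxc info` layout produced by the LXD version in this report (2.20), which may differ in other releases.

```shell
#!/bin/sh
# Sketch only: container and function names are made up for illustration.
# LXC can be overridden (e.g. LXC=echo) to dry-run without touching LXD.
LXC="${LXC:-lxc}"

# Launch one Arch and one Ubuntu container from the images named in step 3.
launch_test_containers() {
  "$LXC" launch images:archlinux/current/amd64 archtest
  "$LXC" launch ubuntu:16.04 ubuntutest
}

# Reads `lxc info <container>` output on stdin and succeeds only if
# eth0 carries an IPv4 ("inet") address, as opposed to "inet6" only.
has_eth0_ipv4() {
  awk '$1 == "eth0:" && $2 == "inet" { found = 1 } END { exit !found }'
}

# Example use on the host:
#   launch_test_containers
#   lxc info archtest | has_eth0_ipv4 || echo "archtest has no IPv4 on eth0"
```

The check deliberately ignores `lo`, since the loopback address is present even when DHCP never ran.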

Information to attach

arch testcontainer config

[user@host ~]$ lxc config show testcontainer
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Archlinux current amd64 (20171203_01:27)
  image.os: Archlinux
  image.release: current
  image.serial: "20171203_01:27"
  volatile.base_image: 36aaa2701180327bf39efc7c70958be877d1fbef5f4762b9ebfefd9515ea847f
  volatile.eth0.hwaddr: 00:16:3e:ae:fb:95
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- default
stateful: false
description: ""

LXD uses the lxdbr0 bridge, and both containers (Arch and Ubuntu) are attached to it

[user@host ~]$ lxc network list
+--------+----------+---------+-------------+---------+
|  NAME  |   TYPE   | MANAGED | DESCRIPTION | USED BY |
+--------+----------+---------+-------------+---------+
| enp0s3 | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| lxcbr0 | bridge   | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| lxdbr0 | bridge   | YES     |             | 2       |
+--------+----------+---------+-------------+---------+

configuration of the arch container

[user@host ~]$ lxc info testcontainer
Name: testcontainer
Remote: unix://
Architecture: x86_64
Created: 2017/12/03 21:13 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 609
Ips:
  eth0:	inet6	fd42:4c81:1f9c:71a5:216:3eff:feae:fb95	veth7XH4QA
  eth0:	inet6	fe80::216:3eff:feae:fb95	veth7XH4QA
  lo:	inet	127.0.0.1
  lo:	inet6	::1
Resources:
  Processes: 12
  CPU usage:
    CPU usage (in seconds): 240
  Memory usage:
    Memory (current): 237.16MB
    Memory (peak): 278.21MB
  Network usage:
    eth0:
      Bytes received: 400.45kB
      Bytes sent: 766B
      Packets received: 1076
      Packets sent: 9
    lo:
      Bytes received: 0B
      Bytes sent: 0B
      Packets received: 0
      Packets sent: 0

ip addr output for the arch container

[root@testcontainer ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:ae:fb:95 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd42:4c81:1f9c:71a5:216:3eff:feae:fb95/64 scope global dynamic mngtmpaddr 
       valid_lft 3579sec preferred_lft 3579sec
    inet6 fe80::216:3eff:feae:fb95/64 scope link 
       valid_lft forever preferred_lft forever

Member

brauner commented Dec 4, 2017

Does ArchLinux still provide dhclient? If so, can you try and reproduce the same error with dhclient inside the container?

CountZukula commented Dec 4, 2017

Dhclient is not available on the container. Looking around for it did set me on the path to some failed services inside the containers:

systemd-networkd.service
systemd-resolved.service
systemd-networkd.socket

Not sure if networkctl should be running by default, but here's the output of it (saw another question somewhere refer to it):

[root@testcontainer ~]# networkctl status
WARNING: systemd-networkd is not running, output will be incomplete.

●        State: n/a
       Address: fd42:4c81:1f9c:71a5:216:3eff:feae:fb95 on eth0
                fe80::216:3eff:feae:fb95 on eth0
       Gateway: fe80::3cc7:fbff:feb3:7373 on eth0
[root@testcontainer ~]# networkctl
WARNING: systemd-networkd is not running, output will be incomplete.

IDX LINK             TYPE               OPERATIONAL SETUP     
  1 lo               loopback           n/a         unmanaged 
  4 eth0             ether              n/a         unmanaged 

2 links listed.

From what I've read, you would normally add a profile under /etc/netctl/ and then enable it (for example, a static ethernet connection on eth0). I'm not sure that's the right course here, though, since eth0 is already getting an IPv6 address from somewhere (I'm guessing from LXD on the host).

CountZukula commented Dec 4, 2017

I'm also seeing some errors in journalctl (in the container), but not sure whether they're related:

Dec 04 17:02:49 testcontainer systemd[1]: getty@lxc-tty2.service: Failed to set invocation ID on control group /system.slice/system-getty.slice/getty@lxc-tty2.service, ignoring: Operation not permitted
Dec 04 17:02:49 testcontainer systemd[1]: Started Getty on lxc/tty2.
-- Subject: Unit getty@lxc-tty2.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit getty@lxc-tty2.service has finished starting up.
-- 
-- The start-up result is RESULT.
Dec 04 17:02:49 testcontainer agetty[25846]: /dev/lxc/tty2: cannot open as standard input: No such file or directory
Dec 04 17:02:49 testcontainer agetty[25845]: /dev/lxc/tty5: cannot open as standard input: No such file or directory

CountZukula commented Dec 4, 2017

Ok, managed to dig a bit further w.r.t. the failed services. I'm still assuming the networkd service is responsible for getting the ipv4 address? I am testing an arch container on an Ubuntu laptop as well (lxd 2.20) which exhibits the same behaviour:

[root@testcontainer ~]# systemctl status systemd-networkd
● systemd-networkd.service - Network Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2017-12-04 17:01:07 UTC; 2h 7min ago
     Docs: man:systemd-networkd.service(8)
  Process: 25769 ExecStart=/usr/lib/systemd/systemd-networkd (code=exited, status=237/KEYRING)
 Main PID: 25769 (code=exited, status=237/KEYRING)

Dec 04 17:01:07 testcontainer systemd[1]: Failed to start Network Service.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Service has no hold-off time, scheduling restart.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 5.
Dec 04 17:01:07 testcontainer systemd[1]: Stopped Network Service.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Start request repeated too quickly.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
Dec 04 17:01:07 testcontainer systemd[1]: Failed to start Network Service.
[root@testcontainer ~]# journalctl _PID=25769
-- Logs begin at Sun 2017-12-03 21:13:32 UTC, end at Mon 2017-12-04 19:08:34 UTC. --
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed to change ownership of session keyring: Permission denied
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed to set up kernel keyring: Permission denied
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed at step KEYRING spawning /usr/lib/systemd/systemd-networkd: Permission denied

The same error crops up with the resolved service.
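
In case it helps others confirm they are hitting the same thing, here is a small shell sketch that looks for the `status=237/KEYRING` exit code seen above. The function name is hypothetical, and it only pattern-matches the `systemctl status` text as it appears in this comment; other systemd versions may word the output differently.

```shell
#!/bin/sh
# Sketch only: the function name is made up for illustration.

# Reads `systemctl status <unit>` output on stdin and succeeds if the
# unit exited at systemd's KEYRING setup step (exit status 237).
failed_at_keyring_step() {
  grep -q 'status=237/KEYRING'
}

# Example use inside the container, for the two units that failed here:
#   for unit in systemd-networkd systemd-resolved; do
#     if systemctl status "$unit" 2>/dev/null | failed_at_keyring_step; then
#       echo "$unit failed at the KEYRING step"
#     fi
#   done
```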

Member

stgraber commented Dec 5, 2017

lxc profile set default security.syscalls.blacklist "keyctl errno 38"

And then restart your containers, that should take care of that.

The reason is that the networkd systemd unit makes use of the kernel keyring, which doesn't work inside unprivileged containers right now. The line above makes the keyctl system call return errno 38 (ENOSYS, "function not implemented"), which is enough of a workaround to get things going again.
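
Put together, applying the workaround and restarting a container could look roughly like the sketch below. The function names are hypothetical and the profile defaults to `default` as in the command above; the config key and value are exactly the ones given in this comment.

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.
# LXC can be overridden (e.g. LXC=echo) to dry-run without touching LXD.
LXC="${LXC:-lxc}"

# Make keyctl return errno 38 (ENOSYS) for containers using the profile.
apply_keyctl_workaround() {
  "$LXC" profile set "${1:-default}" security.syscalls.blacklist "keyctl errno 38"
}

# Print the current value so the change can be checked.
show_keyctl_workaround() {
  "$LXC" profile get "${1:-default}" security.syscalls.blacklist
}

# The setting only takes effect once the container is restarted.
restart_container() {
  "$LXC" restart "$1"
}

# Example use on the host:
#   apply_keyctl_workaround default
#   show_keyctl_workaround default
#   restart_container testcontainer
```

Because the key is set on the profile, every container using that profile picks it up on its next restart.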

CountZukula commented Dec 5, 2017

Magical, Stéphane. The container now has a proper IPv4 address without any extra effort. Thanks!

CountZukula closed this Dec 5, 2017

fizzy123 commented Dec 19, 2017

I was running into this problem and realized that, for now, this fix only works on the feature branch and not on the stable branch. Commenting so other people don't get stuck like I did.

cquike commented Jul 15, 2018

Hi,
I am having the same problem trying to start a vanilla Fedora 28 container on a freshly installed Ubuntu 18.04 host. By the way, these are privileged containers started by root. Luckily, the syscall fix from three comments above works fine (with lxc 3.0.1 from Ubuntu 18.04).
I wonder, however, whether there is a way to get this working out of the box, since I find the lack of interoperability between two of the major Linux distributions unfortunate. I don't know where a proper solution would lie. Should lxd have more permissive defaults for that syscall? Should networkd be more resilient to failures opening the keyring? What do you think?
Thanks a lot!

Member

stgraber commented Jul 15, 2018

We've suggested a fix for systemd in the past but it didn't get included... Either that needs to be included or someone has to figure out unprivileged keyring use at the kernel level.

cquike commented Jul 15, 2018

Thank you very much for the info. Do you have the link to the proposed systemd change?

Member

stgraber commented Jul 15, 2018

@brauner might

Member

brauner commented Jul 16, 2018

@xnox might :)

Contributor

xnox commented Jul 18, 2018

cquike commented Jul 18, 2018

Thank you!
