Ergonomic creation of `macvtap` devices inside containers #45546

kroese · 2023-05-16T21:30:48Z

Description

Currently, creating a TAP interface for use with macvtap from inside a Docker container is extremely complicated.

To create the /dev/tapX device node for the interface using mknod, the container needs a device_cgroup_rules entry for the corresponding major/minor numbers. These numbers cannot be known in advance, so you cannot include them in the docker-compose file.

The only workaround is to let the container read these values, display them to the user via the log, and then have the user manually add them to the container's device_cgroup_rules configuration.

It would be much more user-friendly if a special permission, like NET_ADMIN, could be added that allows the creation of TAP devices without having to specify any cgroup rules, since the numbers are different on each system.


corhere · 2023-05-17T15:50:49Z

> create the /dev/tapX file for the interface using mknod

Huh? The kernel documentation says that a tun/tap device is created by opening /dev/net/tun and issuing an ioctl, and /dev/tapXX will appear.

Are you trying to "mount" an existing tap device on the host into the container? Use docker run --device /dev/tapX for that.

kroese · 2023-05-17T16:04:18Z

@corhere Maybe the problem is specific to macvtap, not to normal tuntap bridges.

If you create a macvtap interface like this (from inside the container):

ip link add link eth0 name vtap address xx type macvtap mode bridge
ip link set vtap up

no corresponding /dev/tapXX will appear in the Docker container, and you need to create it manually using mknod.

My current code to work around this is:

TAP_NR=$(</sys/class/net/"${VTAP}"/ifindex)
TAP_PATH="/dev/tap${TAP_NR}"

# Create the device node (there is no udev in the container, so this must be done manually)
IFS=: read -r MAJOR MINOR < <(cat /sys/devices/virtual/net/"${VTAP}"/tap*/dev)
(( MAJOR < 1 )) && echo "Cannot find: /sys/devices/virtual/net/${VTAP}" && exit 18

{ mknod "${TAP_PATH}" c "$MAJOR" "$MINOR" ; rc=$?; } || :
(( rc != 0 )) && echo "Cannot mknod: ${TAP_PATH} ($rc)" && exit 20

{ exec 30>>"$TAP_PATH"; rc=$?; } 2>/dev/null || :

if (( rc != 0 )); then
    echo "Cannot create TAP interface ($rc). Please add the following docker settings to your "
    echo "container: --device-cgroup-rule='c ${MAJOR}:* rwm' " && exit 21
fi

This works, but because it requires the user to manually add the resulting cgroup number to the compose file, it is far from user-friendly.
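
The manual step this leaves to the user can be sketched roughly as follows. This is only an illustration: `cgroup_rule_for` is a hypothetical helper name, and it assumes the sysfs `dev` file contains a single `MAJOR:MINOR` line, as the script above also assumes.

```shell
# Hypothetical helper: read a sysfs "dev" file (contents: MAJOR:MINOR) and
# print the device cgroup rule that the user would have to add by hand.
cgroup_rule_for() (
    dev_file=$1
    major=
    minor=
    # read may return non-zero (e.g. no trailing newline); validate below instead.
    IFS=: read -r major minor < "$dev_file" || true
    case $major in
        ''|*[!0-9]*) exit 1 ;;   # empty or non-numeric: no usable major number
    esac
    # Allow every minor under this major, as the script above suggests.
    printf 'c %s:* rwm\n' "$major"
)
```

Inside the container, `cgroup_rule_for /sys/devices/virtual/net/"${VTAP}"/tap*/dev` would print the rule, which the user then has to copy into `device_cgroup_rules` in the compose file (or pass via `--device-cgroup-rule`) before restarting the container.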

corhere · 2023-05-17T16:54:33Z

Could you not do something like this?

services:
  foo:
    devices:
      - "/dev/tap${TAP_NR}:/dev/my-vtap"

$ TAP_NR=$(</sys/class/net/"${VTAP}"/ifindex) docker compose up

kroese · 2023-05-17T17:02:50Z

I am not sure I understand what you mean.

I create the macvtap interface on the container side, not on the host side, because the users just download the container from DockerHub and start it.

If they first need to create a macvtap on their host system, it will be even more difficult to install than the current situation, where they just have to add the cgroup numbers.

corhere · 2023-05-17T17:15:12Z

Sorry, I misunderstood. I thought you wanted to project an existing macvtap device on the host side into a container, not create a vtap interface inside the container.

corhere · 2023-05-17T17:57:54Z

Very relevant kernel discussion for exactly this sort of use-case. The patches were never merged. It does raise an interesting point, though: devices are not namespaced, so I'm not even sure how dockerd could determine, in a generic way, which container(s) to project a dynamically-created device node into without needing specific knowledge about macvtap devices.

If you are okay with the macvtap device bridging to an interface in the host network namespace, I think that creating the macvtap interfaces on the host side and mounting them into the container is the only other viable workaround. You could wrap the logic to set up the tap device and start the container in a script to make it more user-friendly.
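
A minimal sketch of such a wrapper, to be run as root on the host, might look like this. The function names, the interface and image arguments, and the overridable sysfs root are all hypothetical; it assumes the host creates the `/dev/tapN` node automatically, with N equal to the interface's ifindex.

```shell
# Hypothetical host-side wrapper; names and arguments are placeholders.

# Print the /dev/tapN path for a macvtap interface, where N is its ifindex.
# The second argument overrides the sysfs root (default: /sys/class/net).
tap_path_for() (
    sysfs_root=${2:-/sys/class/net}
    ifindex=$(cat "$sysfs_root/$1/ifindex") || exit 1
    printf '/dev/tap%s\n' "$ifindex"
)

# Create the macvtap interface on the host, then start the container with
# the matching device node mapped to a fixed path inside it.
run_with_macvtap() (
    vtap=$1 parent=$2 image=$3
    ip link add link "$parent" name "$vtap" type macvtap mode bridge || exit 1
    ip link set "$vtap" up || exit 1
    tap=$(tap_path_for "$vtap") || exit 1
    docker run --rm --device "$tap:/dev/my-vtap" "$image"
)
```

For example, `run_with_macvtap vtap0 eth0 some/image` would bridge to `eth0` on the host and expose the tap device as `/dev/my-vtap` in the container, so the container never needs to know the device numbers.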

kroese · 2023-05-17T18:16:31Z

I really want to avoid having to run any scripts on the host.

I can dynamically create /dev/net/tun using mknod from inside the container without any problems (as long as NET_ADMIN is set). I hoped it would be just as simple for other /dev/tap devices.
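
For comparison, the /dev/net/tun case can be sketched as below. The helper name `ensure_tun_node` is hypothetical; the sketch relies on the TUN/TAP driver's fixed character device number 10:200, which is exactly what the macvtap case lacks.

```shell
# Sketch: create /dev/net/tun inside a container that has NET_ADMIN.
# 10:200 is the fixed character device number of the TUN/TAP driver.
ensure_tun_node() (
    node=${1:-/dev/net/tun}
    [ -c "$node" ] && exit 0            # already present, nothing to do
    mkdir -p "$(dirname "$node")" || exit 1
    mknod "$node" c 10 200 || exit 1
    chmod 666 "$node"
)
```

Because the numbers are the same on every system, this needs no per-host configuration, whereas a macvtap node gets a dynamically allocated major number.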

kroese added the kind/feature and status/0-triage labels May 16, 2023

kroese changed the title ~~Support creation of TAP devices~~ Support creation of MACVTAP devices May 17, 2023

corhere removed the status/0-triage label May 17, 2023

neersighted changed the title ~~Support creation of MACVTAP devices~~ Ergonomic creation of macvtap devices inside containers May 30, 2023
