Ergonomic creation of macvtap devices inside containers #45546

Open
kroese opened this issue May 16, 2023 · 7 comments
Labels
kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@kroese

kroese commented May 16, 2023

Description

Currently, creating a TAP interface for use with macvtap from inside a Docker container is extremely complicated.

To create the /dev/tapX file for the interface using mknod, you need device_cgroup_rules permissions for the corresponding major/minor numbers. But these numbers cannot be known in advance, so you cannot include them in the docker-compose file.

The only workaround is to let the container read these values, display them to the user via the log, and then have the user manually add them to the container's device_cgroup_rules configuration.

It would be much more user-friendly if a special permission could be added, like NET_ADMIN, that would allow the creation of TAP devices without having to specify any cgroup rules, because the numbers are different on each system.
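
For reference, a minimal sketch of how the major number could be looked up by hand. It assumes the macvtap driver registers its character devices under the name "macvtap" in /proc/devices once the module is loaded; the function names and the optional file argument are hypothetical and only there to make the sketch testable.

```shell
# Look up the character-device major number registered for a driver.
# Assumption: the driver is listed by name in the "Character devices"
# section of /proc/devices. The second argument is a hypothetical hook
# that lets the function read a different file, e.g. in a test.
chardev_major() {
    local name devices_file
    name="$1"
    devices_file="${2:-/proc/devices}"
    awk -v name="$name" '
        /^Character devices:/ { in_char = 1; next }
        /^Block devices:/     { in_char = 0 }
        in_char && $2 == name { print $1; found = 1; exit }
        END { exit !found }
    ' "$devices_file"
}

# Print the matching docker flag, or fail if the driver is not listed.
macvtap_cgroup_flag() {
    local major
    major=$(chardev_major macvtap "${1:-/proc/devices}") || return 1
    printf -- "--device-cgroup-rule='c %s:* rwm'\n" "$major"
}
```

Calling macvtap_cgroup_flag prints the flag to pass to docker run. This still has to run before the container is started, which is exactly the manual step this request would like to remove.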

@kroese kroese added the kind/feature and status/0-triage labels May 16, 2023
@corhere
Contributor

corhere commented May 17, 2023

> create the /dev/tapX file for the interface using mknod

Huh? The kernel documentation says that a tun/tap device is created by opening /dev/net/tun and issuing an ioctl, and /dev/tapXX will appear.

Are you trying to "mount" an existing tap device on the host into the container? Use docker run --device /dev/tapX for that.

@kroese
Author

kroese commented May 17, 2023

@corhere Maybe the problem is specific to macvtap, not to normal tuntap bridges.

If you create a macvtap interface like this (from inside the container):

ip link add link eth0 name vtap address xx type macvtap mode bridge
ip link set vtap up

no corresponding /dev/tapXX will appear in the Docker container, and you need to create it manually using mknod.

My current code to work around this is:

TAP_NR=$(</sys/class/net/"${VTAP}"/ifindex)
TAP_PATH="/dev/tap${TAP_NR}"

# Create the dev file (there is no udev in the container, so this must be done manually)
IFS=: read -r MAJOR MINOR < <(cat /sys/devices/virtual/net/"${VTAP}"/tap*/dev)
(( MAJOR < 1 )) && echo "Cannot find: /sys/devices/virtual/net/${VTAP}" && exit 18

{ mknod "${TAP_PATH}" c "$MAJOR" "$MINOR" ; rc=$?; } || :
(( rc != 0 )) && echo "Cannot mknod: ${TAP_PATH} ($rc)" && exit 20

{ exec 30>>"$TAP_PATH"; rc=$?; } 2>/dev/null || :

if (( rc != 0 )); then
    echo "Cannot create TAP interface ($rc). Please add the following docker settings to your "
    echo "container: --device-cgroup-rule='c ${MAJOR}:* rwm' " && exit 21
fi

This works, but because it requires the user to manually add the resulting cgroup number to the compose file, it is far from user-friendly.

@corhere
Contributor

corhere commented May 17, 2023

Could you not do something like this?

services:
  foo:
    devices:
      - "/dev/tap${TAP_NR}:/dev/my-vtap"

$ TAP_NR=$(</sys/class/net/"${VTAP}"/ifindex) docker compose up

@kroese
Author

kroese commented May 17, 2023

I am not sure I understand what you mean.

I create the macvtap interface on the container side, not on the host side, because the users just download the container from DockerHub and start it.

If they first need to create a macvtap on their host system, it will be even more difficult to install than the current situation, where they just have to add the cgroup numbers.

@kroese kroese changed the title from "Support creation of TAP devices" to "Support creation of MACVTAP devices" May 17, 2023
@corhere
Contributor

corhere commented May 17, 2023

Sorry, I misunderstood. I thought you wanted to project an existing macvtap device on the host side into a container, not create a vtap interface inside the container.

@corhere
Contributor

corhere commented May 17, 2023

There is a very relevant kernel discussion for exactly this sort of use case; the patches were never merged. It does raise an interesting point, though: devices are not namespaced, so I'm not sure how dockerd could determine, in a generic way, which container(s) to project a dynamically created device node into without needing specific knowledge about macvtap devices.

If you are okay with the macvtap device bridging to an interface in the host network namespace, I think that creating the macvtap interfaces on the host side and mounting them into the container is the only other viable workaround. You could wrap the logic to set up the tap device and start the container in a script to make it more user-friendly.
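
A rough sketch of what such a wrapper might look like, not a tested implementation. It assumes that udev on the host creates /dev/tapN for the new interface, with N being the interface index; the function names and the parent, vtap and image arguments are placeholders.

```shell
# Map a macvtap interface name to its device node.
# Assumption: the node is called /dev/tapN, where N is the interface index.
# The second argument is a hypothetical hook for pointing at another sysfs root.
tap_node_for() {
    local vtap sysfs idx
    vtap="$1"
    sysfs="${2:-/sys/class/net}"
    idx=$(cat "$sysfs/$vtap/ifindex") || return 1
    printf '/dev/tap%s\n' "$idx"
}

# Create the macvtap interface on the host, then start the container with
# the device node passed through. Needs root on the host. udev may need a
# moment to create the node, so a real script might have to wait for it.
start_with_macvtap() {
    local parent vtap image node
    parent="$1"
    vtap="$2"
    image="$3"
    ip link add link "$parent" name "$vtap" type macvtap mode bridge || return 1
    ip link set "$vtap" up || return 1
    node=$(tap_node_for "$vtap") || return 1
    docker run --rm --device "$node" "$image"
}
```

It would be invoked as something like start_with_macvtap eth0 vtap0 some/image, where all three values are made up for illustration.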

@kroese
Author

kroese commented May 17, 2023

I really want to avoid having to run any scripts on the host.

I can dynamically create /dev/net/tun using mknod from inside the container without any problems (as long as NET_ADMIN is set). I hoped it would be just as simple for other /dev/tap devices.
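
For comparison, this is roughly the /dev/net/tun case. Its device numbers are fixed (major 10, minor 200, as given in the kernel's TUN/TAP documentation), so nothing has to be discovered at runtime, which is probably a large part of why it is simpler. The MKNOD variable and the directory argument are hypothetical hooks added only so the sketch can be tried without root.

```shell
# Create the TUN/TAP control node if it is missing.
# 10 and 200 are the fixed major/minor numbers of /dev/net/tun.
# MKNOD and the directory argument are illustrative hooks, not part of
# any Docker or kernel interface.
ensure_tun_node() {
    local dir
    dir="${1:-/dev/net}"
    # Nothing to do if the character device already exists.
    [ -c "$dir/tun" ] && return 0
    mkdir -p "$dir" || return 1
    "${MKNOD:-mknod}" "$dir/tun" c 10 200
}
```

A macvtap node has no such fixed numbers, so the same three lines cannot be written ahead of time for /dev/tapN.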

@neersighted neersighted changed the title from "Support creation of MACVTAP devices" to "Ergonomic creation of macvtap devices inside containers" May 30, 2023

2 participants