Feature request description
For a container that joins a podman-managed bridge, a veth pair is created: one end is moved into the container's network namespace and the other end is attached to the podman bridge.
I disabled podman's firewall support and use my own static nftables rules, so I can fully control which container can talk to which other container, which container can talk to the internet, and which container can talk to which device on the host network. This is done using bridge chains (formerly ebtables).
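For context, the per-container setup is roughly equivalent to the following (a simplified sketch, not what podman actually runs; the interface and bridge names are made up):

```
# create the veth pair (names are examples; podman generates its own)
ip link add vethXYZ type veth peer name ctr-eth0
# move one end into the container's network namespace
ip link set ctr-eth0 netns <container-netns>
# attach the host end to the podman-managed bridge and bring it up
ip link set vethXYZ master podman1
ip link set vethXYZ up
```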
There are two issues:
- The veth names are not configurable, so I don't know which port belongs to which container, which prevents me from writing static rules.
- Bridge chains don't have access to the name of the bridge, which complicates things on systems with multiple bridges (especially non-container bridges such as virbr0; those currently use vnet instead of veth prefixes, so it may be a bad example).
Suggest potential solution
Allow setting the name of the veth so I can simply match on it using static nftables rules. With that, I wouldn't even have to care about the bridge name anymore, because I would know which port (veth) belongs to which container, no matter which bridge it is on.
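For illustration, a minimal sketch of what such static rules could look like, assuming the requested option let me name the interfaces veth-web and veth-db (hypothetical names):

```
table bridge filter {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # allow ARP so the containers can resolve each other (simplified)
        ether type arp accept

        # the web container may talk to the db container; everything else is dropped
        iifname "veth-web" oifname "veth-db" accept
        iifname "veth-db" oifname "veth-web" accept
    }
}
```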
Have you considered any alternatives?
These are my current workarounds for these issues:
- Use hard-coded MAC addresses for every container, so I can match on MACs instead of port names (a sketch of such a rule follows after this list). This has a big limitation though: it only works for containers without the NET_ADMIN capability, because a container with that capability could simply change its MAC address to bypass the firewall.
- Use an OCI prestart hook to insert/remove jumps to bridge-specific chains: https://github.com/M1cha/homeserver/blob/fff0ca9953544534c98f1b56a2ebb5f0b736ff2b/config/usr/local/bin/update-bridge-rules#L31 While this works well when all bridges are managed by podman and thus run this hook, it introduces a race condition when other software creates veths as well:
  - podman creates veth1 and starts the container
  - podman stops the container, which removes veth1
  - some other software creates veth1, so the old firewall rules apply to the new port for a short time
  - podman updates the rules to remove the veth1 rule (I haven't even implemented that yet; currently I only have a prestart hook)
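The MAC-based workaround from the first point boils down to rules along these lines (a sketch; the address is a placeholder for a statically assigned MAC):

```
table bridge filter {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # match the container by its pinned MAC instead of its (unknown) veth name;
        # only safe without NET_ADMIN, since the container could otherwise change its MAC
        ether saddr 52:54:00:11:22:33 accept
    }
}
```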
Additional context
My main motivation for doing manual firewalling is that podman's implementation isn't expressive enough to control exactly which packets a container can send and receive. And it doesn't have to be, thanks to how powerful nftables is nowadays. But static nftables rules don't always have all the information they need.