[feature request] nftables support #26824

Open
senden9 opened this issue Sep 22, 2016 · 101 comments
Labels
area/networking, kind/feature

Comments

@senden9

senden9 commented Sep 22, 2016

Docker seems to be optimized for iptables at the moment. Are there any plans to support nftables in future versions of Docker?

My workaround at the moment is to deactivate the iptables integration via --iptables=false and then set the right rules for nftables by hand.
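
A minimal sketch of the workaround described above, not a tested setup. It only renders the two pieces of text involved: the `iptables: false` entry for /etc/docker/daemon.json (the config-file form of --iptables=false) and one hand-written nftables masquerade rule. The subnet 172.17.0.0/16 (Docker's usual default bridge range), the uplink name eth0, and the table name docker_nat are assumptions for illustration; adjust them to the host.

```python
import json


def daemon_config() -> str:
    """Config-file form of `dockerd --iptables=false`."""
    return json.dumps({"iptables": False}, indent=2)


def masquerade_ruleset(subnet: str = "172.17.0.0/16", uplink: str = "eth0") -> str:
    """Render a NAT table that masquerades container traffic leaving via `uplink`.

    `docker_nat` is a made-up table name; subnet and uplink are assumptions.
    """
    return (
        "table ip docker_nat {\n"
        "    chain postrouting {\n"
        "        type nat hook postrouting priority 100; policy accept;\n"
        f'        ip saddr {subnet} oifname "{uplink}" masquerade\n'
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    print(daemon_config())
    print(masquerade_ruleset())
```

Published ports would additionally need DNAT rules written by hand, which is the part Docker normally automates.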

@thaJeztah
Member

I'm not aware of plans in this direction

ping @aboch is this planned? Worth doing?

@thaJeztah added the kind/feature and area/networking labels Sep 22, 2016
@aboch
Contributor

aboch commented Sep 22, 2016

I remember @mrjana had thought of using nftables last year. He knows more about the plan.
From what I read online, it seems nftables made it into kernel 3.13. Given docker supports up to linux 3.10, it may not be possible to move to nftables yet.

@mrjana
Contributor

mrjana commented Sep 22, 2016

Yeah nftables are not in the kernel until 3.14 and we can't use it to generally replace iptables yet.

@Yamakaky

Yamakaky commented Dec 4, 2016

Maybe add it as an option? That way, those who have the latest kernel can use it. Currently I have to disable my nftables firewall to get the network working, it's fine on my machine but it's not an option on a server.

@itagent

itagent commented Apr 4, 2017

Any new ideas or progress? We are in transition to nftables and really would appreciate

@ford-perfect

+1 for optional nftables support
Just some dates:
Linux LTS 3.10 has its projected EOL in October 2017.
Debian 7.0's kernel is not supported anyway, but Debian 8.0's has nftables.
RHEL-7.3 EOL is not until 2024-06 and it runs 3.10, so there is a conflict here;
but the proposal is for optional nftables support in addition to the existing iptables support.

@gdahlm

gdahlm commented Jun 1, 2017

I wanted to add that RHEL 7 does have nftables as a tech preview. nftables would greatly simplify IPv6 and allow simpler implementations of throttling and of very useful tools like connection tracking or load balancing.

@Gunni

Gunni commented Apr 16, 2018

I want to add that I've been using nftables on CentOS 7 for over a year now, I believe, on dozens of different servers, both with and without NAT, using IPv6 and more, and have had no issues other than understanding the parse errors when I mess up. I'm using Ansible to manage and generate the nftables rules file and atomically reload the service to apply new rules, or do nothing if the file fails to parse.

And since nftables applies the entire ruleset in one atomic operation, there is no moment when the system is in a partially configured state.

In my opinion, I would NOT use nftables integration with Docker unless I could control which file Docker puts its rules into, control the imports into my current ruleset myself, and have Docker only issue reload commands to nftables (reload meaning nft -f , or systemctl, which does it correctly).

I currently manage docker nat rules using ansible/manually.
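
The "apply new rules, or do nothing if the file fails to parse" flow above can be sketched roughly as follows. This is an illustration under assumptions, not the Ansible setup the comment describes: the function name `reload_ruleset` and the injectable `run` parameter are made up here so the logic can be exercised without root. It relies on `nft -c -f FILE` (check only) and `nft -f FILE` (load the file as one transaction).

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(cmd), check=False).returncode


def reload_ruleset(path: str, run: Runner = _run) -> bool:
    """Apply `path` only if nft can parse it; return True when it was applied."""
    # `nft -c -f` parses and validates the file without touching the live ruleset.
    if run(["nft", "-c", "-f", path]) != 0:
        return False
    # `nft -f` commits the whole file in a single transaction.
    return run(["nft", "-f", path]) == 0
```

For the reload to replace the old rules rather than add to them, the file itself would normally begin with `flush ruleset`, so that the flush and the new rules are committed together.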

@ojab

ojab commented Jun 21, 2018

Meanwhile iptables is officially deprecated.

@cpuguy83
Member

I don't see the reason to bother with nftables when the whole community seems to be (rightly) pushing for bpf.

@ojab

ojab commented Jun 21, 2018

nftables uses bpf internally. If you meant bpfilter, it's not there yet.

@cpuguy83
Member

Sure, it uses bpf internally, but that doesn't really make it any better than without bpf; it's more about deduplication.
Even with bpf in the backend, nftables is still "slightly better" than iptables.

For that matter, isn't iptables using nftables in the backend? (Don't quote me on that, I think I read that somewhere at some point, haven't looked into it).

@tianon
Member

tianon commented Aug 12, 2018

According to https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables (which I'd imagine is pretty authoritative), using nftables and iptables at the same time is highly discouraged:

Beware of using both the nft and the legacy tools at the same time. That means using both x_tables and nf_tables kernel subsystems at the same time, and could lead to unexpected results.

I'd been playing with firewalld for building a router system and got tired of the way firewalld does things, so I was evaluating nftables, but the fact that I'd then have to disable Docker's iptables behavior and handle Docker's routing rules myself is a bit of a hurdle.

I've looked at doing eBPF, but it doesn't seem like there's nearly as many good examples (even nftables is a bit low on examples, but I've managed to find a few people doing things similar enough to what I need that I'm comfortable), so I don't really think it's totally fair to tell folks "we should just go straight to BPF instead" yet.

Just to include what I've found for reference, here's a couple folks who've worked on getting what Docker needs implemented in nftables:

I think docker network create's ability to create arbitrary bridges is going to further complicate this, but for my own use case I'll be able to dictate a fixed number of Docker networks, so that won't be a huge deal (just bringing it up in case folks in the future find this and need to implement something similar).

On implementation details, is the current iptables/firewalld code tightly coupled with the rest of the networking system, or is it already abstracted out reasonably enough that eBPF or nftables could theoretically be implemented as an optional backend? Is there perhaps a way we could make that code pluggable, or at least pluggability friendly? Even just having Docker write out to a file the set of things it would've asked iptables to do would be an improvement; isn't it mostly port openings and masquerade settings?
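
To make the "optional backend" and "write out to a file" ideas above concrete, here is a small sketch. Everything in it is hypothetical: `FirewallBackend`, `RecordingBackend`, and the two operations are invented for illustration and are not Docker's or libnetwork's actual interfaces. It shows one way the port-publishing and masquerade requests could be collected as nftables-style rule text instead of being applied directly.

```python
from abc import ABC, abstractmethod
from typing import List


class FirewallBackend(ABC):
    """Hypothetical interface; not Docker's or libnetwork's real API."""

    @abstractmethod
    def masquerade(self, subnet: str) -> None:
        """Request source NAT for traffic leaving `subnet`."""

    @abstractmethod
    def publish_port(self, proto: str, host_port: int,
                     container_ip: str, container_port: int) -> None:
        """Request forwarding of a host port to a container port."""


class RecordingBackend(FirewallBackend):
    """Collects requests as nftables-style rule text instead of applying them."""

    def __init__(self) -> None:
        self.rules: List[str] = []

    def masquerade(self, subnet: str) -> None:
        self.rules.append(f"ip saddr {subnet} masquerade")

    def publish_port(self, proto: str, host_port: int,
                     container_ip: str, container_port: int) -> None:
        self.rules.append(
            f"{proto} dport {host_port} dnat to {container_ip}:{container_port}"
        )

    def dump(self) -> str:
        """Return the recorded rules, one per line, ready to be written to a file."""
        return "\n".join(self.rules) + "\n"
```

An administrator could then include the dumped text from chains in their own ruleset, which is close to the file-based control asked for earlier in the thread.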

(Not trying to be a bother, just trying to add some additional information about why folks might care about this and brainstorm ideas for how it could maybe move forward without being too invasive. ❤️)

Edit (2018-08-13): #35777 is also relevant (even with --iptables=false, Docker still currently touches iptables to create DOCKER-USER).

@cpuguy83
Member

is the current iptables/firewalld code tightly coupled with the rest of the networking system

It is horribly coupled right now. It's basically all the original iptables code from years ago moved out of docker/docker into docker/libnetwork and mostly not touched except to add more cruft to it to support custom chains (remember when docker didn't use its own chain?) and firewalld, among other things.

@pentago

pentago commented Jan 2, 2019

Hi all, any new progress on this?

nftables is becoming the default in Debian 10 (Buster), due in a couple of months, and with it will come a wave of adoption in derivative distros such as Ubuntu, I guess.

With upgrade season and the iptables deprecation closing in quickly, something will need to happen.

@cpuguy83
Member

cpuguy83 commented Jan 2, 2019 via email

@camAtGitHub

Red Hat Enterprise Linux 8 (currently in beta), from the release notes: The nftables framework replaces iptables in the role of the default network packet filtering facility.
That means CentOS 8 will more than likely follow suit as well...

@mavenugo
Contributor

Thanks @camAtGitHub for the pointer.

The iptables, ip6tables, ebtables and arptables tools are replaced by nftables-based drop-in replacements with the same name. While external behavior is identical to their legacy counterparts, internally they use nftables with legacy netfilter kernel modules through a compatibility interface where required.

makes it the correct transition path. No change required for existing software making use of netfilter based iptables.

@tianon
Member

tianon commented Jan 29, 2019 via email

@cpuguy83
Member

I'd go for the latter, but I've also had PR's open for multiple years untouched on libnetwork.

@elboulangero
Contributor

A Debian user reported that, indeed, they can't use Docker on a machine where an nftables-based firewall is enabled: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921600

@schka17

schka17 commented Aug 11, 2020

It looks like it would work on Debian buster with the following conditions:

  • use an ip and ipv6 table instead of inet
  • name all chains exactly as in iptables: INPUT, OUTPUT & FORWARD

Source: https://ehlers.berlin/blog/nftables-and-docker/
Edit: I can confirm it :).

Thank you for this tip. The nftables ruleset I was troubleshooting currently looks like this:

nft list ruleset | egrep "table|hook|chain"
table inet filter {
	chain global {
	chain INPUT {
		type filter hook input priority 0; policy drop;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
table ip nat {
	chain PREROUTING {
		type nat hook prerouting priority 0; policy accept;
	chain INPUT {
		type nat hook input priority 100; policy accept;
	chain POSTROUTING {
		type nat hook postrouting priority 100; policy accept;
	chain OUTPUT {
		type nat hook output priority -100; policy accept;
	chain DOCKER {
table ip filter {
	chain INPUT {
		type filter hook input priority 0; policy accept;
	chain FORWARD {
		type filter hook forward priority 0; policy accept;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
	chain DOCKER {
	chain DOCKER-ISOLATION-STAGE-1 {
	chain DOCKER-ISOLATION-STAGE-2 {
	chain DOCKER-USER {
table inet f2b-table-docker {
	chain f2b-chain {
		type filter hook forward priority -1; policy accept;

and it does not work as expected ..

Notice there is an input and output chain in the ip nat table....
Docker did that, because I didn't (checking now)

I can confirm it works after creating an /etc/nftables.conf that contains only these empty chains:
`table ip nat {
    chain PREROUTING {
        type nat hook prerouting priority -100; policy accept;
    }

    chain INPUT {
        type nat hook input priority 100; policy accept;
    }

    chain POSTROUTING {
        type nat hook postrouting priority 100; policy accept;
    }

    chain OUTPUT {
        type nat hook output priority -100; policy accept;
    }
}

table ip filter {
    chain INPUT {
        type filter hook input priority 0; policy accept;
    }

    chain FORWARD {
        type filter hook forward priority 0; policy accept;
    }

    chain OUTPUT {
        type filter hook output priority -100; policy accept;
    }
}
`
then restart nftables and docker service. I didn't add nftable rules yet, but this is the next step
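
The two conditions quoted in this comment (tables in the `ip` family, chains named as in iptables) can be checked mechanically. The sketch below is an assumption-laden illustration: it expects the parsed output of `nft -j list ruleset`, and the `REQUIRED` mapping simply mirrors the chain names used in the configuration above rather than any list published by Docker.

```python
from typing import Any, Dict, List

# Chain names taken from the configuration in this comment (an assumption,
# not an official list of what Docker requires).
REQUIRED = {
    ("ip", "filter"): {"INPUT", "FORWARD", "OUTPUT"},
    ("ip", "nat"): {"PREROUTING", "INPUT", "POSTROUTING", "OUTPUT"},
}


def missing_chains(ruleset: Dict[str, Any]) -> List[str]:
    """Return 'family table chain' entries from REQUIRED that `ruleset` lacks.

    `ruleset` is the dict obtained by parsing `nft -j list ruleset` as JSON.
    """
    present = set()
    for item in ruleset.get("nftables", []):
        chain = item.get("chain")
        if chain:
            present.add((chain["family"], chain["table"], chain["name"]))
    return sorted(
        f"{family} {table} {name}"
        for (family, table), names in REQUIRED.items()
        for name in names
        if (family, table, name) not in present
    )
```

A chain called OUTPUT in an `inet` table does not count here, which matches the first condition in the quoted list.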

@mvorisek

mvorisek commented Dec 8, 2021

Is there any progress on this issue? Can Docker check if nftables is present and use it instead of iptables? The commands should be the same, only the grammar is different.

@tao12345666333
Contributor

I want to play with this package https://github.com/google/nftables

@camAtGitHub

Hello again. Red Hat Enterprise Linux 9 (just released), from the release notes: The ipset and iptables-nft (iptables shim) packages have been deprecated.

When loading iptables modules you get a kernel message: Warning: <module_name> - this driver is not recommended for new deployments. It continues to be supported in this RHEL release, but it is likely to be removed in the next major release [1]

IPTables days (read: years) are numbered!

@marek22k

marek22k commented Jun 7, 2023

Is there any news? It's been a while since the issue was created.

@polarathene
Contributor

polarathene commented Jun 7, 2023

There is work towards this in the networking support, starting with an IPVS feature that if successful may establish the way forward for introducing nftables support later IIRC.

@Gunni

Gunni commented Jun 7, 2023

I'd like to re-emphasize that it should be done in a way that supports atomic rule replacement.

Being able to modify my nftables file and reload(1) my firewall without it ever going down is amazing, and I also don't want my docker rules to vanish if I reload the firewall either.

(1) AFAIK on systemd systems reload is implemented with a service that has
ExecReload=/sbin/nft 'flush ruleset; include "/etc/sysconfig/nftables.conf";'

@trainzkid

(1) AFAIK on systemd systems reload is implemented with a service that has ExecReload=/sbin/nft 'flush ruleset; include "/etc/sysconfig/nftables.conf";'

While I agree with your point, I do want to make a note that fail2ban doesn't play nice with reloading like that, and as such, not all systemd systems provide a reload field for nftables. I know my Arch systems don't. That may be more of a conversation for fail2ban though, not nftables..

@trainzkid

It looks like it would work on Debian buster with the following conditions:

  • use an ip and ipv6 table instead of inet
  • name all chains exactly as in iptables: INPUT, OUTPUT & FORWARD

Source: https://ehlers.berlin/blog/nftables-and-docker/
Edit: I can confirm it :).

Thank you for this tip. The nftables ruleset I was troubleshooting currently looks like this:

nft list ruleset | egrep "table|hook|chain"
table inet filter {
	chain global {
	chain INPUT {
		type filter hook input priority 0; policy drop;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
table ip nat {
	chain PREROUTING {
		type nat hook prerouting priority 0; policy accept;
	chain INPUT {
		type nat hook input priority 100; policy accept;
	chain POSTROUTING {
		type nat hook postrouting priority 100; policy accept;
	chain OUTPUT {
		type nat hook output priority -100; policy accept;
	chain DOCKER {
table ip filter {
	chain INPUT {
		type filter hook input priority 0; policy accept;
	chain FORWARD {
		type filter hook forward priority 0; policy accept;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
	chain DOCKER {
	chain DOCKER-ISOLATION-STAGE-1 {
	chain DOCKER-ISOLATION-STAGE-2 {
	chain DOCKER-USER {
table inet f2b-table-docker {
	chain f2b-chain {
		type filter hook forward priority -1; policy accept;

and it does not work as expected ..

Notice there is an input and output chain in the ip nat table....
Docker did that, because I didn't (checking now)

I can confirm it works after creating an /etc/nftables.conf that contains only these empty chains:
`table ip nat {
    chain PREROUTING {
        type nat hook prerouting priority -100; policy accept;
    }

    chain INPUT {
        type nat hook input priority 100; policy accept;
    }

    chain POSTROUTING {
        type nat hook postrouting priority 100; policy accept;
    }

    chain OUTPUT {
        type nat hook output priority -100; policy accept;
    }
}

table ip filter {
    chain INPUT {
        type filter hook input priority 0; policy accept;
    }

    chain FORWARD {
        type filter hook forward priority 0; policy accept;
    }

    chain OUTPUT {
        type filter hook output priority -100; policy accept;
    }
}
`
then restart nftables and docker service. I didn't add nftable rules yet, but this is the next step

Is policy accept required for all of those, or do the chains just need to be there for it to work? The last thing I want to do is leave the input chain as default accept.

@aviallon

FYI, we had to remove almost every usage of Docker in our organisation because of this.
Now everything is migrated to podman.

@jakubgs

jakubgs commented Feb 13, 2024

You can use both nftables and iptables using different priority hooks, see this answer:
https://unix.stackexchange.com/questions/657545/nftables-whitelisting-docker

This does actually work. The answer mentions adding marks to packets handled by iptables, but it's not actually necessary.
The mark is there only if you want to handle packets filtered by iptables separately in nftables.

The key thing is to use higher hook priority than 0, which is the default for iptables:

type filter hook forward priority 10; policy drop;

Which means packets are first filtered by iptables, and then by nftables, allowing you to use both.
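
The ordering argument above can be illustrated with a toy model. This is only a sketch of the rule that base chains attached to the same netfilter hook run in ascending priority order; the `BaseChain` type and the `owner` labels are made up for illustration and do not correspond to any real API.

```python
from typing import List, NamedTuple


class BaseChain(NamedTuple):
    owner: str     # label only, e.g. "iptables" or "nftables"
    hook: str      # netfilter hook the chain is attached to
    priority: int  # lower values run first


def evaluation_order(chains: List[BaseChain], hook: str) -> List[str]:
    """Owners of the base chains on `hook`, lowest priority value first."""
    on_hook = [c for c in chains if c.hook == hook]
    return [c.owner for c in sorted(on_hook, key=lambda c: c.priority)]


chains = [
    BaseChain("nftables", "forward", 10),  # the chain from the comment above
    BaseChain("iptables", "forward", 0),   # iptables filter table default
    BaseChain("nftables", "input", 10),
]
print(evaluation_order(chains, "forward"))  # prints ['iptables', 'nftables']
```

One caveat worth keeping in mind: an accept verdict in the earlier chain does not skip the later one, so a `policy drop` chain at priority 10 still needs its own rules accepting whatever forwarded traffic should pass.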

@Talkless

You can use both nftables and iptables using different priority hooks

But this will not work if you want to use configurators like firewalld, foomuuri, etc., running in nftables mode, as everything will probably be flushed by these tools?

@cpuguy83
Member

docker supports firewalld.
If firewalld is detected it will insert rules there rather than directly.

Is there some specific case with firewalld that is broken?
Having not used the firewalld integration for anything other than some test years ago, I'm definitely blind to this personally.

@Talkless

docker supports firewalld.

Yes, but I believe it does so only if firewalld runs in iptables mode, not in nftables mode.

@westurner

westurner commented Apr 11, 2024

FWIU podman 5.0 w/ containers/netavark has support for nftables and aardvark-dns:

From https://github.com/containers/podman/blob/7b95a8af80da7dcefe0bcbce3a24002a038c55a9/cni/README.md#L1-L4:

**Note**: The CNI backend is deprecated and will be removed
in the next major Podman version 5.0, in preference of Netavark,
see **[podman-network(1)](../docs/source/markdown/podman-network.1.md)** on how to change the backend.

There are a wide variety of different [CNI](https://github.com/containernetworking/cni) network configurations

netavark:

fwiw also pasta is a replacement for slirp4netns:
