[feature request] nftables support #26824

Open
senden9 opened this issue Sep 22, 2016 · 87 comments

Comments

@senden9 senden9 commented Sep 22, 2016

Docker seems to be optimized for iptables at the moment. Are there any plans to support nftables in future versions of Docker?

My workaround at the moment is to deactivate the iptables integration via --iptables=false and then set the right rules for nftables by hand.
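
A minimal sketch of what this workaround could look like, assuming the default docker0 bridge and the default 172.17.0.0/16 subnet (both are assumptions, adjust them to your setup). It only writes two files into a scratch directory so they can be reviewed first; the table name docker_nat and the OUT_DIR, BRIDGE and SUBNET variables are made up for illustration.

```shell
#!/bin/sh
# Sketch only: generate the config for running Docker with --iptables=false
# plus a hand-written nftables NAT rule. Nothing is installed or loaded here.
set -eu

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"   # hypothetical scratch directory
BRIDGE="${BRIDGE:-docker0}"          # assumed: Docker's default bridge name
SUBNET="${SUBNET:-172.17.0.0/16}"    # assumed: Docker's default bridge subnet

# daemon.json equivalent of passing --iptables=false to dockerd.
cat > "$OUT_DIR/daemon.json" <<EOF
{
  "iptables": false
}
EOF

# Masquerade container traffic that leaves through anything but the bridge.
cat > "$OUT_DIR/docker-nat.nft" <<EOF
table ip docker_nat {
	chain postrouting {
		type nat hook postrouting priority 100; policy accept;
		ip saddr $SUBNET oifname != "$BRIDGE" masquerade
	}
}
EOF

echo "wrote $OUT_DIR/daemon.json and $OUT_DIR/docker-nat.nft"
```

After review, daemon.json would go to /etc/docker/daemon.json and the ruleset could be loaded with nft -f. Published ports would additionally need DNAT rules, which this sketch does not cover.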

Member

@thaJeztah thaJeztah commented Sep 22, 2016

I'm not aware of plans in this direction

ping @aboch is this planned? Worth doing?

Contributor

@aboch aboch commented Sep 22, 2016

I remember @mrjana had thought of using nftables last year. He knows more about the plan.
From what I read online, it seems nftables made it into kernel 3.13. Given Docker supports kernels as old as Linux 3.10, it may not be possible to move to nftables yet.

Contributor

@mrjana mrjana commented Sep 22, 2016

Yeah nftables are not in the kernel until 3.14 and we can't use it to generally replace iptables yet.

@Yamakaky Yamakaky commented Dec 4, 2016

Maybe add it as an option? That way, those who have the latest kernel can use it. Currently I have to disable my nftables firewall to get the network working, it's fine on my machine but it's not an option on a server.

@itagent itagent commented Apr 4, 2017

Any new ideas or progress? We are in transition to nftables and really would appreciate

@ford-perfect ford-perfect commented Apr 7, 2017

+1 for optional nftables support
Just some dates:
Linux LTS 3.10 has its projected EOL in October 2017.
Debian 7.0's kernel is not supported anyway, but Debian 8.0's one has nftables.
RHEL-7.3 EOL is not until 2024-06 and it runs 3.10, so there is a conflict here;
but the proposal is for optional nftables support in addition to the existing iptables support.

@gdahlm gdahlm commented Jun 1, 2017

I wanted to add that RHEL 7 does have nftables as a tech preview, and it would greatly simplify IPv6 as well as allowing for simpler implementations of throttling and very useful tools like connection tracking or load-balancing.

@Gunni Gunni commented Apr 16, 2018

I want to add that I've been using nftables on CentOS 7 for over a year now, I believe, on dozens of different servers, both with and without NAT, using IPv6 and more, and have had no issues other than understanding the parse errors when I mess up. I'm using Ansible to manage and generate the nftables rules file and atomically reload the service to apply new rules, or do nothing if it fails to parse.

And since nftables applies the entire ruleset in one atomic operation, there is no moment when the system is in a partially configured state.

In my opinion, I would NOT use nftables integration with Docker unless I could control which file Docker puts rules into, control the imports into my current ruleset myself, and have Docker only issue reload commands to nftables (reload meaning nft -f, or systemctl, which does it correctly).
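
The "reload or do nothing if it fails to parse" behaviour could be sketched roughly like this. The function name apply_ruleset and the NFT variable are made up for illustration; the sketch assumes the ruleset file starts with flush ruleset, so that loading it replaces the previous rules in one transaction.

```shell
#!/bin/sh
# Sketch only: check a ruleset file first and load it only if the check passes.
set -eu

NFT="${NFT:-nft}"   # overridable, so the function can be tried without root

apply_ruleset() {
	ruleset="$1"
	# --check validates the file without changing the running ruleset.
	if ! "$NFT" --check --file "$ruleset"; then
		echo "ruleset $ruleset failed validation, keeping current rules" >&2
		return 1
	fi
	# Loading a file with nft -f applies its contents as a single transaction.
	"$NFT" --file "$ruleset"
}

# Example (needs root): apply_ruleset /etc/nftables.conf
```

A configuration management tool would run something equivalent after templating the rules file, so a typo never leaves the machine half-configured.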

I currently manage docker nat rules using ansible/manually.

@ojab ojab commented Jun 21, 2018

Meanwhile iptables is officially deprecated.

Member

@cpuguy83 cpuguy83 commented Jun 21, 2018

I don't see the reason to bother with nftables when the whole community seems to be (rightly) pushing for bpf.

@ojab ojab commented Jun 21, 2018

nftables uses bpf internally. If you've implied bpfilter — it's not there yet.

Member

@cpuguy83 cpuguy83 commented Jun 21, 2018

Sure it uses bpf internally, but it's not really any better than without bpf but rather about deduplication.
Even with bpf in the backend, nftables is still "slightly better" than iptables.

For that matter, isn't iptables using nftables in the backend? (Don't quote me on that, I think I read that somewhere at some point, haven't looked into it).

Member

@tianon tianon commented Aug 12, 2018

According to https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables (which I'd imagine is pretty authoritative), using nftables and iptables at the same time is highly discouraged:

Beware of using both the nft and the legacy tools at the same time. That means using both x_tables and nf_tables kernel subsystems at the same time, and could lead to unexpected results.

I'd been playing with firewalld for building a router system and got tired of the way firewalld does things, so I was evaluating nftables, but the fact that I'd then have to disable Docker's iptables behavior and handle Docker's routing rules myself is a bit of a hurdle.

I've looked at doing eBPF, but it doesn't seem like there's nearly as many good examples (even nftables is a bit low on examples, but I've managed to find a few people doing things similar enough to what I need that I'm comfortable), so I don't really think it's totally fair to tell folks "we should just go straight to BPF instead" yet.

Just to include what I've found for reference, here's a couple folks who've worked on getting what Docker needs implemented in nftables:

I think docker network create's ability to create arbitrary bridges is going to further complicate this, but for my own use case I'll be able to dictate a fixed number of Docker networks, so that won't be a huge deal (just bringing it up in case folks in the future find this and need to implement something similar).

On implementation details, is the current iptables/firewalld code tightly coupled with the rest of the networking system, or is it already abstracted out reasonably enough that eBPF or nftables could theoretically be implemented as an optional backend? Is there perhaps a way we could make that code pluggable, or at least pluggability friendly? Even just having Docker write out to a file the set of things it would've asked iptables to do would be an improvement; isn't it mostly port openings and masquerade settings?

(Not trying to be a bother, just trying to add some additional information about why folks might care about this and brainstorm ideas for how it could maybe move forward without being too invasive. ❤️)

Edit (2018-08-13): #35777 is also relevant (even with --iptables=false, Docker still currently touches iptables to create DOCKER-USER).

Member

@cpuguy83 cpuguy83 commented Aug 15, 2018

is the current iptables/firewalld code tightly coupled with the rest of the networking system

It is horribly coupled right now. It's basically all the original iptables code from years ago moved out of docker/docker into docker/libnetwork and mostly not touched except to add more cruft to it to support custom chains (remember when docker didn't use its own chain?) and firewalld, among other things.

@pentago pentago commented Jan 2, 2019

Hi all, any new progress on this?

nftables is becoming the default in Debian 10 (Buster), due in a couple of months, and with it comes a wave of adoption in derivative distros such as Ubuntu, I guess.

With upgrade season and the iptables deprecation closing in quickly, something will need to happen.

Member

@cpuguy83 cpuguy83 commented Jan 2, 2019

@camAtGitHub camAtGitHub commented Jan 16, 2019

Red Hat 8 (currently in Beta) - Notes: The nftables framework replaces iptables in the role of the default network packet filtering facility.
That means CentOS 8 will more than likely follow suit as well...

Contributor

@mavenugo mavenugo commented Jan 29, 2019

Thanks @camAtGitHub for the pointer.

The iptables, ip6tables, ebtables and arptables tools are replaced by nftables-based drop-in replacements with the same name. While external behavior is identical to their legacy counterparts, internally they use nftables with legacy netfilter kernel modules through a compatibility interface where required.

makes it the correct transition path. No change required for existing software making use of netfilter based iptables.

Member

@tianon tianon commented Jan 29, 2019

Member

@cpuguy83 cpuguy83 commented Jan 29, 2019

I'd go for the latter, but I've also had PR's open for multiple years untouched on libnetwork.

Contributor

@elboulangero elboulangero commented Feb 28, 2019

A Debian user reported that they indeed can't use Docker on a machine where an nftables-based firewall is enabled: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921600

Contributor

@liskin liskin commented May 3, 2020

@cpuguy83 Yes, right. Detecting which one to use without firewalld is a daunting task indeed, as one might happily use both on the same system. :-/

@Yajo Yajo commented May 4, 2020

Here is a workaround for Fedora 32: https://bugzilla.redhat.com/show_bug.cgi?id=1817022#c2

Member

@cpuguy83 cpuguy83 commented May 4, 2020

Would love to hear from @arkodg on this.

I've only looked a little at the firewalld docs over the weekend, but it does seem like we could at least enhance our firewalld support to use their zones API instead of just passing through iptables configuration and it would use whatever backend firewalld is configured with.
This, of course, is only part of a solution to supporting nftables, because not everyone uses firewalld.

@darkbasic darkbasic commented May 4, 2020

Depending on firewalld would be ok IMO and it would be the easiest way to support both iptables and nftables.

@Sispheor Sispheor commented May 4, 2020

I can confirm that a lot of people don't use firewalld as it is not really DevOps ready.
Also, as iptables now seems to be deprecated, the natural continuation would be to switch to its replacement, nftables, wouldn't it?

@Gunni Gunni commented May 4, 2020

https://bugzilla.redhat.com/show_bug.cgi?id=1830618 just got closed with WONTFIX

RHEL 8 does not support Docker, and instead provides Podman, which offers a command line-compatible container management solution for single nodes usecases.

Podman already works with nftables (and has done so since release), and so this RFE is not relevant to it. Given this, and the fact that we do not support Docker on RHEL8, I am going to close this as WONTFIX. We'd be happy to help with any Podman bugs you encounter if you choose to migrate, but we're no longer accepting RFEs against Docker.

@Sispheor Yes indeed, I've been using it for years now. Deploying rules with Ansible is as simple as modifying a text file, getting the diff and everything; the atomic rule replacement is also a game changer compared to the dark times of iptables.

@Sispheor Sispheor commented May 4, 2020

Yep you are right, With Ansible it's ok. It's what I do. So we need a wrapper around the wrapper ;).

Anyway, I still have issues with firewalld too. I can't get Docker and firewalld with some personal rules working together.

Contributor

@arkodg arkodg commented May 4, 2020

@cpuguy83 I think there are 3 things being discussed in this thread :)

  1. Native NFT library - docker / libnetwork supports iptables today and should natively support an nftables library. We are always open to receiving PR contributions for such enhancements. This would definitely help, since today, on newer distros that support nft, the iptables binary translates rules into the nft subsystem at a suboptimal speed

  2. Docker rules getting wiped out during a systemd nftables service restart - worth exploring if something like /usr/sbin/iptables-nft-restore --noflush will work in the ExecStart section

  3. Firewalld with nftables backend not playing well with Docker - Even though Docker uses direct passthrough with Firewalld, default Firewalld block/deny rules have higher priority, breaking container networking. I'm not a firewalld expert but raised moby/libnetwork#2548 to include docker interfaces in the firewalld trusted zone

@lee-jnk lee-jnk commented May 5, 2020

Is it possible to find a solution in Red Hat's Podman code?

https://github.com/containers/libpod

According to

https://bugzilla.redhat.com/show_bug.cgi?id=1830618

"Podman already works with nftables"

and docker images from Docker Hub

example :

podman run hello-world

Contributor

@arkodg arkodg commented May 5, 2020

@lee-jnk AFAIK podman offloads networking to cni (https://github.com/containernetworking/plugins) which does have an open PR for nftables containernetworking/plugins#462

moving to cni would be great, but there are too many interdependencies between moby, swarm and libnetwork to make it happen right now

@lee-jnk lee-jnk commented May 5, 2020

@arkodg

I don't know how hard it is for the developers to go through all the code.

But as both Red Hat and Docker have their code out in the open, it should be easier to share code compared to closed source projects.

thanks.

@lee-jnk lee-jnk commented May 6, 2020

Maybe we can submit to

https://summerofcode.withgoogle.com/

if the time is not over.

Author

@senden9 senden9 commented May 6, 2020

if the time is not over.

All organizations wishing to be a part of GSoC 2020 must complete their application by February 5, 2020 20:00 (Central European Standard Time).

@Sispheor Sispheor commented May 8, 2020

Has anybody found a workaround? Even with firewalld?

I tried this:

firewall-cmd --permanent --zone=trusted --add-interface=docker0
firewall-cmd --permanent --zone=trusted --add-interface=br-00cb9dee87e3
firewall-cmd --reload

So docker is working. But my other rules are now ignored.

@pixelprogrammer pixelprogrammer commented May 19, 2020

I updated Fedora to 32 from 31 and I ran into these issues everyone here is having. However, after several possible solutions, I gave up. I just made firewalld use the iptables for the backend to get it working. Here is what I did:

Edit firewalld.conf
The file is located at /etc/firewalld/firewalld.conf
Change this line:

FirewallBackend=nftables

to the following:

FirewallBackend=iptables

Then restart the firewalld service to load the updated settings:

sudo service firewalld restart
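
For anyone who wants to script the edit above, a rough sketch follows. The function name set_firewall_backend is made up; the sketch assumes the file already contains a FirewallBackend= line and keeps a .bak copy next to it.

```shell
#!/bin/sh
# Sketch only: switch the FirewallBackend setting in a firewalld.conf file.
set -eu

set_firewall_backend() {
	conf="$1"      # path to firewalld.conf
	backend="$2"   # "iptables" or "nftables"
	cp "$conf" "$conf.bak"   # keep a backup next to the original
	# Rewrite only the FirewallBackend line; all other settings stay untouched.
	sed "s/^FirewallBackend=.*/FirewallBackend=$backend/" "$conf.bak" > "$conf"
}

# Example (needs root):
#   set_firewall_backend /etc/firewalld/firewalld.conf iptables
#   service firewalld restart
```

Calling it again with nftables as the second argument switches back once Docker works with the nftables backend.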

I would prefer a solution using nftables but I just needed things fixed on my local machine so I can get back to work. I don't know if I am going to be expecting some issues down the road but this has got it working for me for now.

I will check in on this thread if I see a solution that seems to be agreed upon or if the Docker team comes up with a fix.

Keep up the good work everyone. This sounds like it's an annoying issue to resolve.

@emanuelduss emanuelduss commented May 23, 2020

It looks like it would work on Debian buster with the following conditions:

  • use an ip and ipv6 table instead of inet
  • name all chains exactly as in iptables: INPUT, OUTPUT & FORWARD

Source: https://ehlers.berlin/blog/nftables-and-docker/

Edit: I can confirm it :).

@jooola jooola commented May 24, 2020

@mindfuckup Does your firewall prevent the outside world from accessing Docker published ports? I had a problem where the firewall would forward all ports exposed by Docker to the world.

@Sispheor Sispheor commented May 24, 2020

This is the script that saved me

It's based on the good old iptables. It will do the job until nftables is supported.

@emanuelduss emanuelduss commented May 24, 2020

@mindfuckup Does your firewall prevent the outside world from accessing Docker published ports? I had a problem where the firewall would forward all ports exposed by Docker to the world.

Yes, but only if you have any INPUT rule that allows it.

@Sispheor Sispheor commented May 24, 2020

@jooola try the script I pointed out. It works well to block ports exposed by Docker while letting Docker update iptables to keep internal networking working.

@lee-jnk lee-jnk commented May 25, 2020

I don't know if changing from nftables to iptables will break something else in some other app/service or the os itself.

@lee-jnk lee-jnk commented Jun 26, 2020

One solution in Fedora.

https://fedoramagazine.org/docker-and-fedora-32/

@hanscees hanscees commented Jun 26, 2020

It looks like it would work on Debian buster with the following conditions:

  • use an ip and ipv6 table instead of inet
  • name all chains exactly as in iptables: INPUT, OUTPUT & FORWARD

Source: https://ehlers.berlin/blog/nftables-and-docker/

Edit: I can confirm it :).

Thank you for this tip. The nftables ruleset I was troubleshooting looks like this now:

nft list ruleset | egrep "table|hook|chain"
table inet filter {
	chain global {
	chain INPUT {
		type filter hook input priority 0; policy drop;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
table ip nat {
	chain PREROUTING {
		type nat hook prerouting priority 0; policy accept;
	chain INPUT {
		type nat hook input priority 100; policy accept;
	chain POSTROUTING {
		type nat hook postrouting priority 100; policy accept;
	chain OUTPUT {
		type nat hook output priority -100; policy accept;
	chain DOCKER {
table ip filter {
	chain INPUT {
		type filter hook input priority 0; policy accept;
	chain FORWARD {
		type filter hook forward priority 0; policy accept;
	chain OUTPUT {
		type filter hook output priority 0; policy accept;
	chain DOCKER {
	chain DOCKER-ISOLATION-STAGE-1 {
	chain DOCKER-ISOLATION-STAGE-2 {
	chain DOCKER-USER {
table inet f2b-table-docker {
	chain f2b-chain {
		type filter hook forward priority -1; policy accept;

and it does not work as expected ..

Notice there is an input and output chain in the ip nat table....
Docker did that, because I didn't (checking now)

@greenpau greenpau commented Jul 30, 2020

@lee-jnk AFAIK podman offloads networking to cni (https://github.com/containernetworking/plugins) which does have an open PR for nftables containernetworking/plugins#462

@arkodg , I am getting back to working on containernetworking/plugins#462. Would appreciate any help reviewing, providing feedback on the implementation.

@schka17 schka17 commented Aug 11, 2020

It looks like it would work on Debian buster with the following conditions:

  • use an ip and ipv6 table instead of inet
  • name all chains exactly as in iptables: INPUT, OUTPUT & FORWARD

Source: https://ehlers.berlin/blog/nftables-and-docker/
Edit: I can confirm it :).

I can confirm it works when creating an otherwise empty /etc/nftables.conf with this content:

table ip nat {
	chain PREROUTING {
		type nat hook prerouting priority -100; policy accept;
	}
	chain INPUT {
		type nat hook input priority 100; policy accept;
	}
	chain POSTROUTING {
		type nat hook postrouting priority 100; policy accept;
	}
	chain OUTPUT {
		type nat hook output priority -100; policy accept;
	}
}

table ip filter {
	chain INPUT {
		type filter hook input priority 0; policy accept;
	}
	chain FORWARD {
		type filter hook forward priority 0; policy accept;
	}
	chain OUTPUT {
		type filter hook output priority -100; policy accept;
	}
}

Then restart the nftables and docker services. I haven't added any nftables rules yet, but that is the next step.
