K3s does not clean up kube-router iptables rules when restarting with --disable-kube-router. #7244

Closed
brandond opened this issue Apr 7, 2023 · 11 comments

@brandond
Member

brandond commented Apr 7, 2023

If K3s is initially started with the network policy controller enabled, and it is subsequently disabled, the controller's iptables rules are left in place, with a snapshot of whatever policies were last applied.

For example, after the following process:

  1. Install K3s
  2. Add disable-network-policy: true to /etc/rancher/k3s/config.yaml
  3. Restart K3s

I still see the kube-router rules in place, and linked from the main INPUT chain:

systemd-node-1:/ # kubectl get node -o yaml | grep args
      k3s.io/node-args: '["server","--disable-network-policy","true"]'

systemd-node-1:/ # iptables -L INPUT
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-ROUTER-INPUT  all  --  anywhere             anywhere             /* kube-router netpol - 4IA2OSFRMVNDXBVV */
KUBE-PROXY-FIREWALL  all  --  anywhere             anywhere             ctstate NEW /* kubernetes load balancer firewall */
KUBE-NODEPORTS  all  --  anywhere             anywhere             /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere             /* KUBE-ROUTER rule to explicitly ACCEPT traffic that comply to network policies */ mark match 0x20000/0x20000

systemd-node-1:/ # iptables -L KUBE-ROUTER-INPUT
Chain KUBE-ROUTER-INPUT (1 references)
target     prot opt source               destination
RETURN     all  --  anywhere             ip-10-43-0-0.us-west-2.compute.internal/16  /* allow traffic to primary/secondary cluster IP range - EKROEGTNIJ3AP3LC */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
KUBE-POD-FW-CW5CGIAU2RMGPF37  all  --  ip-10-42-0-6.us-west-2.compute.internal  anywhere             /* rule to jump traffic from POD name:coredns-59b4f5bbd5-xdkpc namespace: kube-system to chain KUBE-POD-FW-CW5CGIAU2RMGPF37 */
KUBE-POD-FW-Y63Y2GINM6RLFJPG  all  --  ip-10-42-0-8.us-west-2.compute.internal  anywhere             /* rule to jump traffic from POD name:traefik-56b8c5fb5c-rtm64 namespace: kube-system to chain KUBE-POD-FW-Y63Y2GINM6RLFJPG */
KUBE-POD-FW-H6QOHGM7PLZN2VTH  all  --  ip-10-42-0-7.us-west-2.compute.internal  anywhere             /* rule to jump traffic from POD name:svclb-traefik-51ae277a-q67fp namespace: kube-system to chain KUBE-POD-FW-H6QOHGM7PLZN2VTH */
KUBE-POD-FW-6OJWO5XJCIDPXSJ3  all  --  ip-10-42-0-4.us-west-2.compute.internal  anywhere             /* rule to jump traffic from POD name:local-path-provisioner-76d776f6f9-xxzxx namespace: kube-system to chain KUBE-POD-FW-6OJWO5XJCIDPXSJ3 */
KUBE-POD-FW-26R3DURX34T42MXF  all  --  ip-10-42-0-5.us-west-2.compute.internal  anywhere             /* rule to jump traffic from POD name:metrics-server-7b67f64457-gtr6j namespace: kube-system to chain KUBE-POD-FW-26R3DURX34T42MXF */

The KUBE-ROUTER chains and rules should all be removed when the controller is disabled.
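
The reproduction and the check above can be roughly sketched as follows. The helper names are hypothetical, `systemctl restart k3s` assumes a systemd-managed install, and the `KUBE-NWPLCY` prefix is an assumption about kube-router's per-policy chain names (only `KUBE-ROUTER` and `KUBE-POD-FW` appear in the output above).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of K3s.

# Append the setting to a K3s config file unless it is already present.
disable_network_policy() {
  config_file="$1"
  mkdir -p "$(dirname "$config_file")"
  touch "$config_file"
  if ! grep -q '^disable-network-policy:' "$config_file"; then
    echo 'disable-network-policy: true' >> "$config_file"
  fi
}

# Count lines in an iptables-save dump (read from stdin) that mention
# kube-router chains. KUBE-NWPLCY is an assumed prefix; adjust as needed.
count_kube_router_lines() {
  grep -c -E 'KUBE-(ROUTER|POD-FW|NWPLCY)' || true
}

# On a real node, as root (assumes a systemd-managed K3s service):
#   disable_network_policy /etc/rancher/k3s/config.yaml
#   systemctl restart k3s
#   iptables-save | count_kube_router_lines   # non-zero means rules were left behind
```

Reading the dump from stdin keeps the check usable on saved output from another node, not only on the live ruleset.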

brandond added this to the v1.27.0+k3s1 milestone Apr 7, 2023
brandond moved this to Next Up in K3s Development Apr 7, 2023
@rbrtbnfgl
Contributor

How did you restart K3s?
The K3s uninstall script should have the command to remove the kube-router iptables rules.

@brandond
Member Author

brandond commented Apr 7, 2023

I guess a reboot or an uninstall/reinstall would also clear everything out, yeah, but if that's what we want to recommend, we at least need to document it. I wasn't aware that we weren't cleaning them up when the controller is disabled.

@brandond
Member Author

brandond commented Apr 7, 2023

I'll leave it unassigned, sorry. We can talk it through with the team to decide how to address it.

@CJCShadowsan

CJCShadowsan commented Apr 19, 2023

I am having these exact issues... KUBE-ROUTER-INPUT is huge:

Chain KUBE-ROUTER-INPUT (4294831666 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  0.0.0.0/0            0.0.0.0/0            /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767
RETURN     udp  --  0.0.0.0/0            0.0.0.0/0            /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767
RETURN     all  --  0.0.0.0/0            10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  0.0.0.0/0            0.0.0.0/0            /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767
RETURN     udp  --  0.0.0.0/0            0.0.0.0/0            /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767

Well, maybe not exact... What is going on here?

@CJCShadowsan

Also not just that, but this:

[root@cca8101 ~]# iptables -L KUBE-ROUTER-INPUT
Chain KUBE-ROUTER-INPUT (1 references)
target     prot opt source               destination
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */
RETURN     tcp  --  anywhere             anywhere             /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports ndmps:filenet-powsrm
RETURN     udp  --  anywhere             anywhere             /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:filenet-powsrm
RETURN     all  --  anywhere             10.44.0.0/16         /* allow traffic to primary cluster IP range - LHPS74K4BTHQE7ZJ */

It seems the RETURN rules duplicate themselves ad infinitum...?
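
One way to quantify this kind of duplication is to count rule lines that occur more than once. A small sketch: the function name is hypothetical, and it reads `iptables -S` style output from stdin so it can also be run against saved output.

```shell
#!/bin/sh
# Sketch only: count_duplicate_rules is a hypothetical helper name.

# Print each input line that occurs more than once, prefixed by its count.
count_duplicate_rules() {
  sort | uniq -c | awk '$1 > 1'
}

# On a node, as root:
#   iptables -S KUBE-ROUTER-INPUT | count_duplicate_rules
```

An empty result means every rule in the chain is unique.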

@rbrtbnfgl
Contributor

rbrtbnfgl commented Apr 19, 2023

Did you ever reboot this node? How many times did you restart K3s?

@CJCShadowsan

CJCShadowsan commented Apr 19, 2023

I'll leave these here for visibility, but:

This was k3s 1.23.16+k3s1, on Rocky Linux 8.7.

The root cause of my errors above was:

#3117

Adding the --prefer-bundled-bin option to my service resulted in everything working correctly, including what I thought was a Longhorn issue but in fact turned out to be this.

This is a pretty horrible issue with OS-provided nftables and k3s!
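
To see which iptables variant the host provides, `iptables --version` on 1.8.x builds typically reports the backend in parentheses. A sketch that classifies that string (the function name is hypothetical, and older builds that print no backend fall through to `unknown`):

```shell
#!/bin/sh
# Sketch only: iptables_backend is a hypothetical helper name.

# Classify the backend from an `iptables --version` string.
iptables_backend() {
  case "$1" in
    *nf_tables*) echo nf_tables ;;
    *legacy*)    echo legacy ;;
    *)           echo unknown ;;
  esac
}

# On a node:
#   iptables_backend "$(iptables --version)"
```

If the host reports `nf_tables` and the chains show the duplication above, the --prefer-bundled-bin option is the workaround that applied in this case.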

@brandond
Member Author

Yes, I have no idea why distros continue to package such a broken version of nftables.

@caroline-suse-rancher
Contributor

At this point, this issue probably needs to be added to the docs before we can resolve it. @rbrtbnfgl Would you be willing to do that?

caroline-suse-rancher moved this from Next Up to Needs Additional in K3s Development Jun 5, 2023
@rbrtbnfgl
Contributor

If kube-router is still enabled when K3s is restarted, it should clean up all the previous rules. The install script was updated to clean up those rules. I don't know whether we support changing the configuration of an already running instance.

@brandond
Member Author

brandond commented Jun 6, 2023

We just need to document that kube-router leaves rules behind when disabled, and that they can be cleaned up with a manual set of commands and/or by re-running the install script. I don't see any reason why we wouldn't support toggling it on or off after the fact; users just need to know how to clean up after it since it won't clean up after itself.
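
As a rough sketch of what such a manual cleanup could look like (not an official procedure): filter the kube-router lines out of a ruleset dump and load the result back. The function name is hypothetical, the `KUBE-NWPLCY` prefix is an assumption about kube-router's per-policy chain names, and this leaves alone any ipsets kube-router may have created.

```shell
#!/bin/sh
# Sketch only: strip_kube_router is a hypothetical helper name.

# Drop every line of an iptables-save dump (read from stdin) that mentions
# kube-router chains. KUBE-NWPLCY is an assumed prefix; adjust as needed.
strip_kube_router() {
  grep -v -E 'KUBE-(ROUTER|POD-FW|NWPLCY)'
}

# On a node, as root, after disabling the controller and restarting K3s:
#   iptables-save  | strip_kube_router | iptables-restore
#   ip6tables-save | strip_kube_router | ip6tables-restore
```

It is worth reviewing the filtered dump before piping it into `iptables-restore`, since the restore replaces the tables it is given.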
