Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

custom iptables version monitor plugin #844

Merged
merged 1 commit into from Jan 3, 2024

Conversation

aojea
Copy link
Member

@aojea aojea commented Dec 20, 2023

iptables has two kernel backends, legacy and nft.

Quoting https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables

It is also important to note that while iptables-nft
can supplant iptables-legacy, you should never use them simultaneously.

However, we don't want to block the node operations because of this
reason, as there is no enough evidence this is causing big issues in the
wild, so we just signal and warn about this situation.

Once we have more information we can revisit this decision and
keep it as is or move it to permanent.

The plugin runs every day to avoid causing problems on large systems.

https://github.com/kubernetes-sigs/iptables-wrappers/blob/97b01f43a8e8db07840fc4b95e833a37c0d36b12/iptables-wrapper-installer.sh

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 20, 2023
@aojea
Copy link
Member Author

aojea commented Dec 20, 2023

/assign @thockin @danwinship @SergeyKanzhelev

@danwinship
Copy link

hosts and containers using the host can have different iptables versions, these versions are incompatible and can cause problems if both are present in the kernel.

"incompatible" and "can cause problems" are overstating things. As long as your rules are self-contained and aren't trying to interact with or override another component's rules, it generally doesn't matter whether you're using the same iptables mode as anyone else. (Unless the system only has kernel support for one of the two modes, but this PR doesn't help people running into that.)

I think we got a little bit confused about this back when iptables-1.8 first came out, because (a) we were also dealing with various bugs in iptables-1.8.[0-2] at the same time, without fully understand what was going on; (b) at the time, kube-proxy depended on iptables chains that were created by kubelet, so kube-proxy and kubelet specifically needed to be using the same iptables mode; (c) kube-proxy purports to poke holes in the system firewall, which can only work if kube-proxy and the firewall both use the same iptables mode.

The bugs eventually got fixed, KEP-3178 made kube-proxy no longer depend on kubelet's rules, and the "poking holes in the system firewall" use case turns out to be slightly dubious and probably unnecessary. (Some distros have moved to nft-based firewalls now, which kube-proxy's firewall-hole-poking code won't work with even if you're using iptables-nft, and no one has filed any bugs about this.)

So I'm not sure it really makes sense to call this out as a "problem"...

@aojea
Copy link
Member Author

aojea commented Dec 20, 2023

well, at least if there are two systems using the different versions they can override each other rules ... I do not know exactly the problems today, but I remember having segfaults because of mixed versions https://bugzilla.netfilter.org/show_bug.cgi?id=1476

We want this plugin to run just to prevent any possible problems , as is impossible to say that there is zero risk

@danwinship
Copy link

segfaults because of mixed versions https://bugzilla.netfilter.org/show_bug.cgi?id=1476

That's not "segfaults because of mixed versions", that's "segfaults because of a bug in the iptables binaries that you happened to run into on a node with mixed versions".

@danwinship
Copy link

We want this plugin to run just to prevent any possible problems , as is impossible to say that there is zero risk

It seems to me that this is more likely to generate false positives and warn people that there is a problem when everything is actually working fine.

@aojea
Copy link
Member Author

aojea commented Dec 20, 2023

It seems to me that this is more likely to generate false positives and warn people that there is a problem when everything is actually working fine.

interesting, should I add a threshold to the number of lines?
At least for GKE COS clusters I don't expect to have rules from both modes in a node

@aojea
Copy link
Member Author

aojea commented Dec 21, 2023

Based on Dan's feedback and npd docs, I'm going to update this to only report events and metrics to get signal, based on the feedback we can assess the criticality of the problem with real data and update to Condition if necessary

node-problem-detector uses Event and NodeCondition to report problems to apiserver.

NodeCondition: Permanent problem that makes the node unavailable for pods should be reported as NodeCondition.
Event: Temporary problem that has limited impact on pod but is informative should be reported as Event.

iptables has two kernel backends, legacy and nft.

Quoting https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables

> It is also important to note that while iptables-nft
> can supplant iptables-legacy, you should never use them simultaneously.

However, we don't want to block the node operations because of this
reason, as there is no enough evidence this is causing big issues in the
wild, so we just signal and warn about this situation.

Once we have more information we can revisit this decision and
keep it as is or move it to permanent.
@aojea
Copy link
Member Author

aojea commented Dec 21, 2023

#6: Error fetching NPD metrics: error fetching NPD metrics: {prow 35.188.211.187 curl http://localhost:20257/metrics
curl: (7) Failed to connect to localhost port 20257 after 0 ms: Couldn't connect to server

/test pull-npd-e2e-test

at first sight in does not look related to my PR

@danwinship
Copy link

Based on Dan's feedback and npd docs, I'm going to update this to only report events and metrics to get signal, based on the feedback we can assess the criticality of the problem with real data and update to Condition if necessary

OK. I don't really know much about node-problem-detector...

@hakman
Copy link
Member

hakman commented Jan 2, 2024

/lgtm
/cc @vteratipally @mmiranda96 @MartinForReal

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 2, 2024
if [ "$num_legacy_lines" -gt 0 ] && [ "$num_nft_lines" -gt 0 ]; then
echo "Found rules from both versions, iptables-legacy: ${num_legacy_lines} iptables-nft: ${num_nft_lines}"
echo $NONOK
elif [ "$num_legacy_lines" -gt 0 ] && [ "$num_nft_lines" -eq 0 ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if the following conditions are necessary. From my understanding, there is no rule found doesn't mean the node is using one backend..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a Kubernetes cluster, kubelet always install an iptables rule to solve this specific problem https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/3178-iptables-cleanup#iptables-wrapper

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These mechanism of detecting the backends by counting rules has demonstrated to be the most effective and widely used

@vteratipally
Copy link
Collaborator

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, hakman, vteratipally

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 3, 2024
@k8s-ci-robot k8s-ci-robot merged commit e9eddcc into kubernetes:master Jan 3, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants