Fix for wrong iptables binary (#49279) + general iptables wrapper correctness pass #49703
Conversation
```go
		ExistingRules: existingRules,
	}, nil
}
return IptablesVersion{}, fmt.Errorf("iptables save binary: %s not present", iptablesSaveBin)
```
We can capture the `err` on line 75 and use the following log message:

```diff
- return IptablesVersion{}, fmt.Errorf("iptables save binary: %s not present", iptablesSaveBin)
+ return IptablesVersion{}, fmt.Errorf("error executing the %s binary: %v", iptablesSaveBin, err)
```
Yeah, the reason I did not capture the `err` on 75 is that it's redundant - if `iptables-save --version` fails, `iptables-save` will fail, and vice versa, so I only check and return for the first failure. There is no real scenario where the first would not fail but the second would, so there is no need to capture both.
Ok, let's change the log message to include the `execErr`. The current message could be misleading to users - i.e., the failure need not be due to a missing binary. As you know, there could be many other reasons for the command to fail.
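For readers following along, a minimal sketch of what the agreed-upon change might look like. `IptablesVersion` and `iptablesSaveBin` mirror names from the diff above; the surrounding package, function, and struct scaffolding are assumptions for illustration, not the actual Istio source:

```go
package iptableswrapper

import (
	"fmt"
	"os/exec"
)

// IptablesVersion is a stand-in for the real struct in the diff above.
type IptablesVersion struct{}

// checkSaveBinary is a hypothetical helper: run the save binary's version
// check and surface the real exec error instead of assuming the binary is
// missing.
func checkSaveBinary(iptablesSaveBin string) (IptablesVersion, error) {
	if _, execErr := exec.Command(iptablesSaveBin, "--version").CombinedOutput(); execErr != nil {
		// The binary may be present but still fail (permissions, missing
		// kernel modules, etc.), so include execErr in the message.
		return IptablesVersion{}, fmt.Errorf("error executing the %s binary: %v", iptablesSaveBin, execErr)
	}
	return IptablesVersion{}, nil
}
```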
@sridhargaddam you were right! there is a difference :D
@bleggett thanks for the PR. I noticed a small typo in the log message and pushed the following PR to address it.
#49848
CC: @howardjohn
```go
// binary is `legacy`, or vice-versa - we must match the binaries we have in our $PATH to what rules are actually defined
// in our current netns context.
//
// Q: Why not simply "use the host default binary" at $PATH/iptables?
```
An alternate way to select between the two modes at run time is to use the approach explained in the following repo - https://github.com/kubernetes-sigs/iptables-wrappers
I'm not completely familiar with the Istio code base. Should we update this code as well? (It's forcing iptables to always use legacy.)

Line 29 in 2c8dcd4:

```
&& update-alternatives --set iptables /usr/sbin/iptables-legacy \
```
> An alternate way to select between the two modes at run time is to use the approach explained in the following repo - https://github.com/kubernetes-sigs/iptables-wrappers

Yes - as mentioned in the PR description, this will not work for us because we do not always have access to the host binaries, because we are not always running in the host FS. `kubelet` does, so k8s is fine doing it this way. Their heuristic for detecting which binary to use (checking for existing rules) is similar to what we are doing here, however.

> I'm not completely familiar with the Istio code base. Should we update this code as well? (It's forcing iptables to always use legacy.)
>
> Line 29 in 2c8dcd4:
> `&& update-alternatives --set iptables /usr/sbin/iptables-legacy \`

It doesn't matter: with this PR we will prefer the `legacy` or `nft` binaries explicitly, so the "container default" binary is moot - we can't rely on the container default being the same as the host default, so the container default isn't relevant to our selection heuristic.

With this PR we no longer need to force a container default, so we can remove that line in a followup if we wish - it won't change anything, however.
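To make the rules-checking heuristic mentioned above concrete, here is a rough sketch in the spirit of (but not copied from) this PR and the kubernetes-sigs/iptables-wrappers approach. The function name and line-parsing details are assumptions for illustration:

```go
package iptableswrapper

import (
	"os/exec"
	"strings"
)

// hasExistingRules is an illustrative helper: run a save binary (e.g.
// "iptables-nft-save") and report whether any tables, chains, or rules
// are defined for that backend in the current netns.
func hasExistingRules(saveBin string) bool {
	out, err := exec.Command(saveBin).CombinedOutput()
	if err != nil {
		return false
	}
	for _, line := range strings.Split(string(out), "\n") {
		// Anything that isn't a comment or blank line (table headers,
		// chain declarations, rules) means this backend is in use.
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !strings.HasPrefix(trimmed, "#") {
			return true
		}
	}
	return false
}
```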
> Yes - as mentioned in the PR description this will not work for us because we do not always have access to the host binaries because we are not always running in the host FS.

We don't need access to the host binaries inside the pod. We can add the logic to the Dockerfile. As an example, you can see the following file - https://github.com/submariner-io/submariner/blob/3248d61fd6e20a6cb1d877ce75e30ba2a226faa6/package/Dockerfile.submariner-route-agent#L39

However, one requirement is that the pod should have privileges to run the `iptables` command (I guess the pod will have them, as it needs to program some iptables rules).

> It doesn't matter, as with this PR we will prefer legacy or nft binaries explicitly, so the "container default" binary is moot - we can't rely on the container default being the same as the host default, so the container default isn't relevant to our selection heuristic.

Right, I understand that. If the container default points to the appropriate binary, then we don't have to make any changes in the Istio Go code. Anyway, I don't have any strong opinions on the approach, and even the approach of selecting iptables-nft/legacy inside the pod would work.
> > Yes - as mentioned in the PR description this will not work for us because we do not always have access to the host binaries because we are not always running in the host FS.
>
> We don't need access to the host binaries inside the pod. We can add the logic to the Dockerfile. As an example, you can see the following file - https://github.com/submariner-io/submariner/blob/3248d61fd6e20a6cb1d877ce75e30ba2a226faa6/package/Dockerfile.submariner-route-agent#L39

Since we are jumping into arbitrary pod network namespaces at runtime in some contexts, we cannot select the correct binary except at runtime, dynamically, at that point. So it is not something we can accurately detect in the Dockerfile, even if we wanted to.

> Right, I understand that. If the container default points to the appropriate binary then we don't have to make any changes in the Istio Go code. Anyway, I don't have any strong opinions on the approach, and even the approach of selecting iptables-nft/legacy inside the pod would work.

The problem is that the container default can never point to the appropriate binary in all cases where the wrapper is invoked, no matter what we do with it. So we cannot rely on the system default binary.
```go
// 3. If so, use `nft` binary set
// 4. Otherwise, see if we have `legacy` binary set, and use that.
// 5. Otherwise, see if we have `iptables` binary set, and use that (detecting whether it's nft or legacy).
func (r *RealDependencies) DetectIptablesVersion(ipV6 bool) (IptablesVersion, error) {
```
should we have some logging around this? (I may have missed it)
Only thing that stopped me was that the logger wasn't already imported here, but I might just add it and put in some logging, yeah.
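To condense the selection order from the snippet above (steps 3-5) in one place: a hypothetical sketch, reusing the illustrative `hasExistingRules` helper from earlier in this conversation. None of these names are the actual Istio functions:

```go
package iptableswrapper

import (
	"fmt"
	"os/exec"
)

// hasExistingRules is the illustrative helper sketched earlier in this
// thread; stubbed here only so the example is self-contained.
func hasExistingRules(saveBin string) bool { return false }

// detectBinary is an illustrative condensation of the documented order:
// prefer nft if nft rules exist, then legacy, then the plain binary.
func detectBinary() (string, error) {
	// Step 3: use the nft binary set if nft tables already have rules.
	if path, err := exec.LookPath("iptables-nft"); err == nil && hasExistingRules("iptables-nft-save") {
		return path, nil
	}
	// Step 4: otherwise use the legacy binary set, if present.
	if path, err := exec.LookPath("iptables-legacy"); err == nil {
		return path, nil
	}
	// Step 5: otherwise use the plain `iptables` binary, detecting
	// separately whether it is nft- or legacy-backed.
	if path, err := exec.LookPath("iptables"); err == nil {
		return path, nil
	}
	return "", fmt.Errorf("no iptables binary found in $PATH")
}
```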
Please provide a description of this PR:
Background
In looking at a fix for #49279, I realized that our iptables binary selection logic across ambient and sidecar was fundamentally incorrect, and the effort to fix it bled into the correctness rework I had wanted to do on both the inpod and sidecar iptables wrappers since merging inpod.
For context, there are 3 basic iptables binaries we use: `iptables`, `iptables-save`, and `iptables-restore`.
But with the kernel legacy/nft split, there are parallel binaries for each of those, and again for IPv6 tables, so the actual potential binary set on any given Linux machine can be some combination of:

- `iptables-legacy`, `iptables-legacy-save`, `iptables-legacy-restore`
- `iptables-nft`, `iptables-nft-save`, `iptables-nft-restore`
- `ip6tables-legacy`, `ip6tables-nft`, ...

...and on and on - the matrix is something like 12 different binaries, potentially. Which is ridiculous, but common.
We use our `iptables` wrapper lib 4 different ways across sidecar and ambient, and in each case "which iptables binary should we use" is a question with a different answer.

If, for instance, the host has `iptables-legacy` and `iptables-nft` in $PATH, which should we use? We should see if rules are defined in `nft` at all and prefer that, but if no rules are in `nft` tables, and the `legacy` binary is available and rules are defined in `legacy`, we should use the `legacy` binary. If no rules are defined in either, we should use the system default.

If we are running directly on the host (not in a container), we can just assume whatever binary (legacy or nft) is aliased to `iptables` is correct, use that, and we're done. This is what k8s does for the kubelet.

However, we do something a little more weird than k8s - we also have CNI, and that is running iptables from inside a container against the host netns. The CNI container ships its own iptables binaries that may differ from the host's. So we have to match the rules present in the host netns (which we can see) with the correct binary we have access to (only our own, baked into the container) - we cannot rely purely on the k8s logic here, because we are not only concerned with a kubelet-style context.
So basically, selecting the right binary for one of the above cases is a heuristic with 2 inputs:

1. Which iptables binaries are actually present in our current $PATH (the container's, or the host's).
2. Which rules (legacy or nft) are actually defined in the netns we are operating against.
This was the cause of #49279 - we were using the in-container default binary, without inspecting host rules to see if that was actually the one we should use in the host context.
Really, calling code should never assume which binary variant of `iptables` is selected, and should simply define the type of operation (`insert`/`save`/`restore`) and let the detection logic pick the correct binary given the environmental context.

But a lot of our code just assumes binary names, and when used in some contexts, the result is simply incorrect.
What this does
This fixes that by reworking the wrapper lib to have binary detection logic that works in all contexts where we use it, relying fully on binary autodetection in all spots, properly handling `iptables`/`ip6tables` variants (which, because of successive refactorings, we were not handling correctly, as the updated test goldens show), and fixing some incomplete or simply incorrect `iptables` tests.

The end result is that if our iptables wrapper is used, the correct `iptables` binary should be selected regardless of whether the context is a pod, the host, a container, or the host filesystem, or what "default" binary is configured in either the container or the host.

Improves but does not fix