Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly fail when we are asked to add a duplicate ip to the ipset #51081

Merged
merged 3 commits into from
May 17, 2024

Conversation

bleggett
Copy link
Contributor

@bleggett bleggett commented May 15, 2024

A user is (somehow) able to produce a scenario where, without istio-cni restarting, a pod IP goes missing from the host node ipset, causing an active pod to begin failing host node healthchecks.

Since we destructively mutate the ipset contents in exactly 3 spots:

  1. Flush on istio-cni startup
  2. Flush/destroy on istio-cni shutdown
  3. Removing pod IPs on K8S pod unlabel/delete events

and no istio-cni restart happens as per logs, that leaves (3) as the only way this could manifest, barring outside interference with the contents of the ipset.

The only scenario where I can imagine this happening is if

  • two pods reuse the same IP relatively close to each other on the same node.
  • K8S pod add and pod remove events for those two pods both overlap, and come in to us out-of-order - we add the new pod with the reused IP, and then somehow later get a remove event for the old pod with the original use of the IP, leading to the set being out of sync.

This PR changes the addPodIPToIpset logic, so that if we try to add a pod IP that is already there, we log/return an error - previously, we would silently overwrite the existing entry.

This will at least help us see if the above happens and cause the pod with the overlapping IP to fail to start - otherwise we should never legitimately have an overwrite.

(We might have missed this in testing months back because we were using ip:port maps, and the odds of two pods using the same IP AND the same ports is even tinier)

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
@bleggett bleggett requested a review from a team as a code owner May 15, 2024 20:58
@istio-policy-bot istio-policy-bot added area/ambient Issues related to ambient mesh area/networking labels May 15, 2024
@istio-testing istio-testing added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 15, 2024
@bleggett bleggett added the cherrypick/release-1.22 Set this label on a PR to auto-merge it to the release-1.22 branch label May 15, 2024
Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
@istio-testing istio-testing added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 15, 2024
Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
cni/pkg/nodeagent/net.go Show resolved Hide resolved
cni/pkg/nodeagent/net_test.go Show resolved Hide resolved
@hzxuzhonghu
Copy link
Member

Can the node level interception be deprecated now

@bleggett
Copy link
Contributor Author

bleggett commented May 17, 2024

Can the node level interception be deprecated now

What do you mean, exactly?

We don't do node level interception at all, but we still do need to do node level SNAT for kubelet healthchecks.

This doesn't change that, this just works around a bug that could exist there.

@hzxuzhonghu
Copy link
Member

OK, NVM, I thought is was intercetion on node

@hzxuzhonghu
Copy link
Member

we still do need to do node level SNAT for kubelet healthchecks.

Why still need this, now if ambient behaves like sidecar interception, we can eliminate host ipset completely

@bleggett
Copy link
Contributor Author

we still do need to do node level SNAT for kubelet healthchecks.

Why still need this, now if ambient behaves like sidecar interception, we can eliminate host ipset completely

Even for sidecars we must special-case host kubelet probes - with sidecars we just do it with webhook rewrites for the pod manifest.

Ambient doesn't have mutating webhooks, so we do it this way instead.

Sidecar mutates the pod manifest to redirect the health probes, ambient mutates the source IP to redirect the health probes. It's tricky.

@hzxuzhonghu
Copy link
Member

Yes, we know the src addrss of kubelete probe, so can do that better than sidecar via bypass this kind of traffic

@bleggett
Copy link
Contributor Author

bleggett commented May 17, 2024

Yes, we know the src addrss of kubelete probe, so can do that better than sidecar via bypass this kind of traffic

We don't know the source address of the kubelet probe. The current approach in ambient doesn't need to know it.

Mechanisms to reliably derive the kubelet IP from the perspective of the pod are very CNI dependent.

We don't bother to derive the pod-facing kubelet IP with ambient, we just use iptables to tell us if the packet originated locally on the node, from the host node netns.

Also, not all packets coming from the kubelet IP originate from the local kubelet process - the kubelet source IP is just not a reliable check. Which is why we don't use it.

It's tricky. Asserting policy based on "kubelet src IP" is just an unavoidably unsound approach, no matter how you slice it.

// Since we purge on restart of CNI, and remove pod IPs from the set on every pod removal/deletion,
// we _shouldn't_ get any overwrite/overlap, unless something is wrong and we are asked to add
// a pod by an IP we already have in the set (which will give an error, which we want).
if err := hostsideProbeSet.AddIP(pip, ipProto, podUID, false); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we hit the error, we fail to the CNI layer which retries and now the ordering should be correct. Right? It will recover?

Copy link
Contributor Author

@bleggett bleggett May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Inelegant, but it will work itself out for now (if the hypothesis about the cause is in fact correct)

@istio-testing istio-testing merged commit 143b9f6 into istio:master May 17, 2024
28 checks passed
@istio-testing
Copy link
Collaborator

In response to a cherrypick label: #51081 failed to apply on top of branch "release-1.22":

Applying: Explicitly fail when we are asked to add a duplicate ip to the ipset
Using index info to reconstruct a base tree...
M	cni/pkg/nodeagent/net.go
Falling back to patching base and 3-way merge...
Auto-merging cni/pkg/nodeagent/net.go
CONFLICT (content): Merge conflict in cni/pkg/nodeagent/net.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Explicitly fail when we are asked to add a duplicate ip to the ipset
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

@istio-testing
Copy link
Collaborator

In response to a cherrypick label: new issue created for failed cherrypick: #51131

bleggett added a commit to bleggett/istio that referenced this pull request May 17, 2024
…stio#51081)

* Explicitly fail when we are asked to add a duplicate ip to the ipset

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

* Add relnotes

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

* Test fixups

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

---------

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
bleggett added a commit to bleggett/istio that referenced this pull request May 17, 2024
…stio#51081)

* Explicitly fail when we are asked to add a duplicate ip to the ipset

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

* Add relnotes

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

* Test fixups

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

---------

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
istio-testing pushed a commit that referenced this pull request May 18, 2024
…51081) (#51132)

* Explicitly fail when we are asked to add a duplicate ip to the ipset



* Add relnotes



* Test fixups



---------

Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
luksa pushed a commit to luksa/istio that referenced this pull request Oct 14, 2024
* upstream/master:
  Automator: update ztunnel@master in istio/istio@master (istio#51143)
  Automator: update proxy@master in istio/istio@master (istio#51142)
  Automator: update go-control-plane in istio/istio@master (istio#51141)
  Automator: update ztunnel@master in istio/istio@master (istio#51139)
  Automator: update proxy@master in istio/istio@master (istio#51138)
  Work around k8s 1.23/24 templating bugs (istio#51135)
  Drop legacy references to 1.22 (istio#51127)
  Automator: update proxy@master in istio/istio@master (istio#51133)
  Workaround weird cluster versions (istio#51128)
  Explicitly fail when we are asked to add a duplicate ip to the ipset (istio#51081)
  State what we did, rather than why (which is somewhat moot) (istio#51123)
  `istio-cni` - use templated env from cm, simplify config (istio#51050)
  Automator: update ztunnel@master in istio/istio@master (istio#51122)
  rename confused function (istio#51118)
  Automator: update proxy@master in istio/istio@master (istio#51120)
  Clean up to improve readability (istio#51117)
  Fix data race in discovery filter (istio#51048)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ambient Issues related to ambient mesh area/networking cherrypick/release-1.22 Set this label on a PR to auto-merge it to the release-1.22 branch size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants