Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8s: delete IPs from ipcache for no running Pods #13220

Merged
merged 4 commits into from
Sep 25, 2020

Conversation

aanm
Copy link
Member

@aanm aanm commented Sep 18, 2020

In Kubernetes, a Job creates a pod which will complete with either
the "Succeeded" or "Failed" PodPhase. Kubernetes will leave these
Pods around until the Job is deleted by the operator. As soon the pod
enters either one of the previously described PodPhases, Kubelet will
send a CNI delete event to Cilium agent which will then release the
allocated IP addresses of that pod, making the IP address available
again.

If not disabled, Cilium will create a Cilium Endpoint for each Pod in
the cluster that has its network managed by Cilium.

Cilium agent populates the ipcache with the information retrieved from
Pods and Cilium Endpoints events, in case of duplicated information,
ipcache will be stored with the state from Cilium Endpoints.

In a unlikely case of Cilium agent not running and the Pod enters the
"Succeeded" state, it will mean the Cilium agent will not be available
to delete the Cilium Endpoint created for that Pod.

To complement this fix, Cilium agents will also prune Cilium Endpoints
of not running pods on start up.

Delete dangling Cilium Endpoints for completed Kubernetes Jobs.

Fixes #12953

@aanm aanm added kind/bug This is a bug in the Cilium logic. priority/high This is considered vital to an upcoming release. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. needs-backport/1.7 labels Sep 18, 2020
@aanm aanm requested a review from a team September 18, 2020 17:45
@aanm aanm requested a review from a team as a code owner September 18, 2020 17:45
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.8.4 Sep 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.7.10 Sep 18, 2020
@aanm aanm added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Sep 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 18, 2020
@aanm
Copy link
Member Author

aanm commented Sep 18, 2020

test-me-please

Copy link
Member

@nathanjsweet nathanjsweet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit. Otherwise, LGTM.

operator/k8s_cep_gc.go Outdated Show resolved Hide resolved
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! A minor nit and a question regarding potentially simplifying logic by using sync.Once.

operator/k8s_cep_gc.go Outdated Show resolved Hide resolved
operator/k8s_cep_gc.go Outdated Show resolved Hide resolved
Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit only.

operator/k8s_cep_gc.go Outdated Show resolved Hide resolved
@aanm aanm force-pushed the pr/fix-complete-pod-statuses branch from ed7b66a to 3ad73f7 Compare September 21, 2020 12:40
@aanm
Copy link
Member Author

aanm commented Sep 21, 2020

test-me-please

@aanm
Copy link
Member Author

aanm commented Sep 21, 2020

retest-gke

operator/k8s_cep_gc.go Show resolved Hide resolved
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to simplify the flow / make it more digestible.

@aanm aanm added the dont-merge/blocked Another PR must be merged before this one. label Sep 21, 2020
@aanm aanm marked this pull request as draft September 21, 2020 18:56
@aanm aanm changed the title k8s: garbage collect IPs of no running Pods k8s: delete IPs from ipcache for no running Pods Sep 23, 2020
@aanm aanm removed the dont-merge/blocked Another PR must be merged before this one. label Sep 23, 2020
@aanm aanm force-pushed the pr/fix-complete-pod-statuses branch from 3ad73f7 to 6b22034 Compare September 23, 2020 10:28
@aanm aanm removed the sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. label Sep 23, 2020
@aanm
Copy link
Member Author

aanm commented Sep 23, 2020

test-me-please

@aanm aanm marked this pull request as ready for review September 23, 2020 10:28
@aanm aanm requested a review from a team September 23, 2020 10:28
@aanm aanm requested a review from a team as a code owner September 23, 2020 10:28
@aanm aanm force-pushed the pr/fix-complete-pod-statuses branch from 6b22034 to 414bb72 Compare September 23, 2020 10:35
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Needs backport from master in 1.7.10 Sep 23, 2020
@aanm
Copy link
Member Author

aanm commented Sep 23, 2020

test-me-please

Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to have nits below.

operator/k8s_cep_gc.go Outdated Show resolved Hide resolved
operator/k8s_cep_gc.go Show resolved Hide resolved
Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at everything except operator/k8s_cep_gc.go, LMK if you'd like another look at that.

daemon/cmd/state.go Outdated Show resolved Hide resolved
pkg/k8s/watchers/endpointsynchronizer.go Show resolved Hide resolved
…ted"

This reverts commit 8068f1a.

This reverted commit introduces a regression where Cilium Endpoints can
be left around after the Cilium Endpoint was locally deleted. Although
it was a scale optimization for non existing docker images, the security
aspect will overlap the scalability concern initially thought.

Signed-off-by: André Martins <andre@cilium.io>
To avoid wasting resources in Cilium and to avoid leftover
CiliumEndpoints from populating the ipcache, we should not watch for
CiliumEndpoints when disable-endpoint-crd is set to true.

Signed-off-by: André Martins <andre@cilium.io>
This field is essential to understand if the pod is still running or
not.

Signed-off-by: André Martins <andre@cilium.io>
In Kubernetes, a Job creates a pod which will complete with either
the "Succeeded" or "Failed" PodPhase. Kubernetes will leave these
Pods around until the Job is deleted by the operator. As soon the pod
enters either one of the previously described PodPhases, Kubelet will
send a CNI delete event to Cilium agent which will then release the
allocated IP addresses of that pod, making the IP address available
again.

If not disabled, Cilium will create a Cilium Endpoint for each Pod in
the cluster that has its network managed by Cilium.

Cilium agent populates the ipcache with the information retrieved from
Pods and Cilium Endpoints events, in case of duplicated information,
ipcache will be stored with the state from Cilium Endpoints.

In a unlikely case of Cilium agent not running and the Pod enters the
"Succeeded" state, it will mean the Cilium agent will not be available
to delete the Cilium Endpoint created for that Pod.

To complement this fix, Cilium agents will also prune Cilium Endpoints
of not running pods on start up.

Signed-off-by: André Martins <andre@cilium.io>
@aanm aanm force-pushed the pr/fix-complete-pod-statuses branch from 414bb72 to 6a160dd Compare September 25, 2020 08:43
@aanm
Copy link
Member Author

aanm commented Sep 25, 2020

test-me-please

@aanm
Copy link
Member Author

aanm commented Sep 25, 2020

hit #13224

@aanm aanm merged commit b3adc4d into cilium:master Sep 25, 2020
@aanm aanm deleted the pr/fix-complete-pod-statuses branch September 25, 2020 11:37
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.8 in 1.8.4 Sep 25, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.8 to Backport done to v1.8 in 1.8.4 Sep 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. priority/high This is considered vital to an upcoming release. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.8.4
Backport done to v1.8
Development

Successfully merging this pull request may close these issues.

Conflicting endpoint when complete Job pod shares IP with an active pod
8 participants