Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add an option to wait for kube-proxy #20517

Merged
merged 1 commit into from
Jul 14, 2022
Merged

Conversation

michi-covalent
Copy link
Contributor

@michi-covalent michi-covalent commented Jul 13, 2022

Description

This commit is to add the flag in helm, which will enable init
container waiting for kube-proxy if required. The main reason is
to avoid any potential race condition between kube-proxy and
cilium agent. More context can be found in below related PR.

Relates: #20123

Signed-off-by: Michi Mutsuzaki michi@isovalent.com

Testing

(tam) The changes are done original by Michi, I just add below minor things:

  • Add check for kubeProxyReplacement not to be strict
  • Add securityContext as privileged for running iptables command
  • Correct manifest syntax error

Testing was done locally by setting this flag enabled (and kube proxy
replacement is not set).

$ helm install cilium install/kubernetes/cilium -n kube-system --set waitForKubeProxy=true

$ ksyslo ds/cilium --timestamps -c wait-for-kube-proxy
2022-07-14T09:44:18.298138416Z :KUBE-PROXY-CANARY - [0:0]
2022-07-14T09:44:18.298293549Z Found KUBE-IPTABLES-HINT or KUBE-PROXY-CANARY iptables rule in 'iptables-nft-save -t mangle'

$ ksysex ds/cilium --  iptables-nft-save -t mangle | grep -E '^:(KUBE-IPTABLES-HINT|KUBE-PROXY-CANARY)'
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), wait-for-kube-proxy (init)
:KUBE-PROXY-CANARY - [0:0]

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 13, 2022
Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'd like to see if the --check flag can be used instead of grepping, but this works too.

- |
while true
do
if iptables-nft-save -t mangle | grep -E '^:(KUBE-IPTABLES-HINT|KUBE-PROXY-CANARY)'; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is finding one or the other enough? I'm assuming the rules are atomically set at the same time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes my understanding, KUBE-IPTABLES-HINT was what the old versions used the newer versions use KUBE-PROXY-CANARY so if we ever find both things are strange already.

- |
while true
do
if iptables-nft-save -t mangle | grep -E '^:(KUBE-IPTABLES-HINT|KUBE-PROXY-CANARY)'; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing a save and grep could you use the --check flag to check for the chain and a particular rule?

       -C, --check chain rule-specification
              Check whether a rule matching the specification does exist in the selected chain. This command uses the same logic as -D to  find  a  matching
              entry, but does not alter the existing iptables configuration and uses its exit code to indicate success or failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure... I've never used --check maybe @joestringer knows the answer.

@sayboras sayboras added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Jul 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 14, 2022
@sayboras sayboras added area/helm Impacts helm charts and user deployment experience dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jul 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 14, 2022
@sayboras sayboras added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). labels Jul 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 14, 2022
@sayboras sayboras force-pushed the pr/michi/iptables-struggle branch 2 times, most recently from 2b5ac9e to 981ec7e Compare July 14, 2022 09:49
This commit is to add the flag in helm, which will enable init
container waiting for kube-proxy if required. The main reason is
to avoid any potential race condition between kube-proxy and
cilium agent. More context can be found in below related PR.

Relates: #20123

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
@sayboras sayboras marked this pull request as ready for review July 14, 2022 09:54
@sayboras sayboras requested a review from a team July 14, 2022 09:54
@sayboras sayboras requested review from a team as code owners July 14, 2022 09:54
@sayboras
Copy link
Member

sayboras commented Jul 14, 2022

/test

Job 'Cilium-PR-K8s-1.24-kernel-net-next' failed:

Click to show.

Test Name

K8sServicesTest Checks N/S loadbalancing Tests NodePort with sessionAffinity from outside

Failure Output

FAIL: Timed out after 240.000s.

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.24-kernel-net-next so I can create one.

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self approved :shameless:

@sayboras
Copy link
Member

sayboras commented Jul 14, 2022

/test-1.24-net-next

Job 'Cilium-PR-K8s-1.24-kernel-net-next' failed:

Click to show.

Test Name

K8sServicesTest Checks E/W loadbalancing (ClusterIP, NodePort from inside cluster, etc) Checks in-cluster KPR Tests NodePort with externalTrafficPolicy=Local

Failure Output

FAIL: k8s2 host unexpectedly connected to service "http://192.168.56.11:30083", it should fail

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.24-kernel-net-next so I can create one.

@sayboras sayboras requested a review from a team July 14, 2022 15:39
@aanm
Copy link
Member

aanm commented Jul 14, 2022

/mlh new-flake Cilium-PR-K8s-1.24-kernel-net-next

@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.11.7 Jul 15, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.13 Jul 15, 2022
@joestringer joestringer added this to Needs backport from master in 1.10.14 Jul 15, 2022
@joestringer joestringer removed this from Needs backport from master in 1.10.13 Jul 15, 2022
@joestringer joestringer added this to Needs backport from master in 1.11.8 Jul 15, 2022
@joestringer joestringer removed this from Needs backport from master in 1.11.7 Jul 15, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.12 in 1.12.0 Jul 18, 2022
@ldelossa ldelossa added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed backport-pending/1.12 labels Jul 18, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.12 to Backport done to v1.12 in 1.12.0 Jul 18, 2022
@gandro gandro mentioned this pull request Jul 21, 2022
6 tasks
@tklauser tklauser added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Aug 11, 2022
@tklauser tklauser moved this from Needs backport from master to Backport done to v1.11 in 1.11.8 Aug 11, 2022
@tklauser tklauser moved this from Needs backport from master to Backport done to v1.10 in 1.10.14 Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/helm Impacts helm charts and user deployment experience area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
No open projects
1.10.14
Backport done to v1.10
1.11.8
Backport done to v1.11
1.12.0
Backport done to v1.12
Development

Successfully merging this pull request may close these issues.

None yet