Add wildcard tolerations to kube-proxy #56589

Merged
4 changes: 4 additions & 0 deletions cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml
@@ -22,6 +22,10 @@ spec:
             - matchExpressions:
               - key: cloud.google.com/gke-accelerator
                 operator: Exists
+      tolerations:
+      - key: "nvidia.com/gpu"
+        effect: "NoSchedule"
+        operator: "Exists"
       hostNetwork: true
       hostPID: true
       volumes:
1 change: 0 additions & 1 deletion cluster/addons/fluentd-gcp/fluentd-gcp-ds.yaml
@@ -107,7 +107,6 @@ spec:
         effect: "NoSchedule"
       - operator: "Exists"
         effect: "NoExecute"
-      #TODO: remove this toleration once #44445 is properly fixed.
Member

Why remove this comment? The issue hasn't been closed.

Member Author

We don't want this toleration to be removed.

Member

"We don't" is different from "we never will." A TODO indicates that we don't now but might want to eventually. Is this issue obsolete? Should we close it?

Member Author

#44445 contains multiple issues.

  • The request in the issue's title ("Improvement: fluentd-gcp to get same toleration as kube-proxy") will be fixed by this PR.
  • The daemonset controller regression mentioned in the bug was fixed a while back.
  • I think the only part still unfixed is what the three comments starting at #44445 (comment) describe: users can't modify system addons on managed services like GKE, so they can't use NoSchedule or NoExecute taints unless these addons tolerate them, because adding such taints would leave the nodes without these "required" addons.

But even if we fix that (by allowing users to modify the tolerations of system addons), I think the default should still be that these addons tolerate all taints. Users who really want to can then use that ability to remove these wildcard tolerations.

Also, system addons that run on every node need a wildcard NoSchedule toleration for issue #55080 (PR #55839).
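
To make the interaction concrete, here is a minimal sketch of a user-applied taint next to the wildcard tolerations that let a per-node addon ignore it. The node name, taint key `dedicated`, and value `batch` are hypothetical and do not appear in this PR; the two tolerations mirror the ones this PR adds to kube-proxy.

```yaml
# Node side: a taint a user might add.
# The node name, key, and value are hypothetical, for illustration only.
apiVersion: v1
kind: Node
metadata:
  name: example-node
spec:
  taints:
  - key: "dedicated"
    value: "batch"
    effect: "NoSchedule"
---
# Pod side: fragment of a pod spec, not a complete object.
# With no key and operator "Exists", each entry matches any taint
# that carries the listed effect.
tolerations:
- operator: "Exists"
  effect: "NoExecute"
- operator: "Exists"
  effect: "NoSchedule"
```

A pod without a matching `NoSchedule` toleration would not be placed on the tainted node, which is the situation the third bullet above describes for managed addons.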

cc @vishh

Contributor

+1. All system addons that are expected to run on every GKE node should tolerate all taints and effects. Certain addons, like the GPU plugins, need to tolerate only GPU-specific taints.
I think this is slightly different from #44445, where cluster-level system addons will not run at all if a user taints all GKE nodes. That is a separate feature and is not tied to the comment at all.
@mikedanese thoughts?
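
The two scopes described here can be sketched side by side. The first fragment is the key-scoped toleration this PR adds to the NVIDIA GPU device plugin. The second is not part of this PR; it is shown only to illustrate that a single entry with neither a key nor an effect matches every taint and every effect.

```yaml
# Key-scoped: tolerates only NoSchedule taints whose key is nvidia.com/gpu
# (taken from the daemonset.yaml change in this PR).
tolerations:
- key: "nvidia.com/gpu"
  operator: "Exists"
  effect: "NoSchedule"
---
# Fully wildcard (illustrative only, not in this PR): an empty key with
# operator "Exists" and no effect tolerates all taints and all effects.
tolerations:
- operator: "Exists"
```

The PR lists `NoExecute` and `NoSchedule` as separate wildcard entries instead, which leaves `PreferNoSchedule` taints outside the tolerations.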

       - operator: "Exists"
         effect: "NoSchedule"
       terminationGracePeriodSeconds: 30
5 changes: 5 additions & 0 deletions cluster/addons/kube-proxy/kube-proxy-ds.yaml
@@ -28,6 +28,11 @@ spec:
       hostNetwork: true
       nodeSelector:
         beta.kubernetes.io/kube-proxy-ds-ready: "true"
+      tolerations:
+      - operator: "Exists"
+        effect: "NoExecute"
+      - operator: "Exists"
+        effect: "NoSchedule"
       containers:
       - name: kube-proxy
         image: {{pillar['kube_docker_registry']}}/kube-proxy:{{pillar['kube-proxy_docker_tag']}}
5 changes: 5 additions & 0 deletions cluster/saltbase/salt/kube-proxy/kube-proxy.manifest
@@ -65,6 +65,11 @@ metadata:
 spec:
   {{pod_priority}}
   hostNetwork: true
+  tolerations:
+  - operator: "Exists"
+    effect: "NoExecute"
+  - operator: "Exists"
+    effect: "NoSchedule"
   containers:
   - name: kube-proxy
     image: {{pillar['kube_docker_registry']}}/kube-proxy:{{pillar['kube-proxy_docker_tag']}}