refactor(vpcCni): set node anti-affinity to not deploy to fargate #291
Conversation
2e45a6b to aeb38ea (force-pushed)
nodejs/eks/cni/aws-k8s-cni.yaml (outdated)

@@ -76,6 +76,8 @@ spec:
             operator: In
             values:
               - amd64
+          - key: "eks.amazonaws.com/compute-type"
+            operator: DoesNotExist
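For context, this is roughly what the full nodeAffinity stanza of the `aws-node` DaemonSet looks like with the change applied. The os/arch expressions and the indentation are assumed from the upstream manifest around this hunk; only the last expression is added by this PR. Because every expression in a single `matchExpressions` list must be satisfied, any node that carries the `eks.amazonaws.com/compute-type` label is excluded from scheduling:

```yaml
# Sketch of the DaemonSet's node affinity after this change (the os/arch
# expressions are assumed from the upstream aws-k8s-cni.yaml).
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: "beta.kubernetes.io/os"
              operator: In
              values:
                - linux
            - key: "beta.kubernetes.io/arch"
              operator: In
              values:
                - amd64
            # Added by this PR: skip any node carrying the Fargate compute-type label.
            - key: "eks.amazonaws.com/compute-type"
              operator: DoesNotExist
```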
This is diverging from the upstream CNI spec, right?
How does EKS natively handle this? They install the CNI by default - https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html - so what happens if you just manually create a cluster and add Fargate profiles that cause all pods to go to Fargate? Do you still end up with DaemonSets trying to launch pods on the fake nodes?
What’s not clear to me is how this is supposed to work - I see no mention of this issue in EKS docs or in eksctl issues.
> This is diverging from the upstream CNI spec, right?

Yes.

> How does EKS natively handle this?

Docs are lacking, so it's hard to tell.

> what happens if you just manually create a cluster and add Fargate profiles that cause all pods to go to Fargate? Do you still end up with DaemonSets trying to launch pods on the fake nodes?

For an all-Fargate cluster, the `aws-node` DaemonSet still gets created, but with this change you don't see any Pods attempting to run on the Fargate nodes, as the node labels no longer match for them. Prior to this change, we'd see `aws-node` pods stuck in the `Pending` state, trying to run on the Fargate nodes and not succeeding. That's no longer the case in this scenario, since the DS will only start Pods on EC2 or managed nodes, if they exist.
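To make "the node labels no longer match" concrete, a Fargate-backed node advertises the compute-type label, which trips the `DoesNotExist` expression. The node name and extra labels below are purely illustrative; only the compute-type label (stated in this PR) matters here:

```yaml
# Hypothetical (abridged) metadata for a Fargate-backed node.
apiVersion: v1
kind: Node
metadata:
  name: fargate-ip-192-168-1-23.us-west-2.compute.internal  # illustrative name
  labels:
    eks.amazonaws.com/compute-type: fargate  # this label blocks aws-node scheduling
    kubernetes.io/os: linux                  # illustrative
    kubernetes.io/arch: amd64                # illustrative
```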
> What’s not clear to me is how this is supposed to work - I see no mention of this issue in EKS docs or in eksctl issues.

Agreed. The docs are slim for this scenario, both for the hybrid cluster case and for an all-Fargate cluster.
FWIW, the initial attempt at this PR was to move the creation of the `VpcCni` resource from `createCore()` into the create function for self-managed or managed node groups, to exclude it for Fargate nodes. That change would either alter the established APIs and break existing clusters that expect to pass `vpcCniOptions` into the Cluster definition, or result in attempting to deploy the `aws-node` DS per node group used - causing resource dupes on Pulumi's end and attempts at creating multiple DS objects in the k8s control plane (e.g. if you had many node groups).

Setting the node anti-affinity in the CNI spec seemed like the less intrusive change overall, given that our end goal is to not deploy it on Fargate nodes while still ensuring the DS exists if the cluster mixes and matches node types along with Fargate nodes.
cc @lblackstone Thoughts?
This change seems reasonable to me. Affinity is the right way to avoid scheduling on nodes that don't support particular workloads.
We should just make sure the divergence is documented.
It appears there is a divergence between what AWS installs in practice, and what they have in the VPC CNI repo on GitHub.
Opened an issue to understand whether that's intentional - but we should align with what they actually install (which this PR almost does - we should match them exactly):
Done
52e6dd1 to 715f1e3 (force-pushed)
Feedback has been addressed. PTAL @lukehoban @lblackstone
1b64081 to 5a67a4b (force-pushed)
5a67a4b to 02d3bc2 (force-pushed)
Proposed changes
This change sets a node anti-affinity for the AWS CNI DaemonSet so that it only deploys to nodes that are not labeled for Fargate with `eks.amazonaws.com/compute-type=fargate`.

By default, the DaemonSet uses a `RollingUpdate` [strategy](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/#performing-a-rolling-update) with a `maxUnavailable` of 1 pod. This means that on a Pulumi update to this change for existing clusters, the `aws-node` pods will update one at a time until all are updated with the new node anti-affinity. This will not cause downtime for the cluster or running pods.

In summary, this change allows clusters to:

- Mix and match the type of nodes used (EC2 / self-managed, AWS managed, and/or Fargate),
- Deploy the aws-cni to the cluster for any configuration of nodes, and
- Although the DS will always be created, it will only deploy to nodes that are not Fargate-backed.
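For reference, the rollout behavior described above corresponds to the DaemonSet defaults. A minimal sketch of the relevant stanza, written out explicitly (these are the Kubernetes default values, not something added by this PR):

```yaml
# DaemonSet rollout settings assumed here: RollingUpdate with at most one
# aws-node pod unavailable at a time, so the anti-affinity change rolls out
# node by node without cluster downtime.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
```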
Related issues (optional)
Fixes #286