Fixed a bug where hostname topology launched extra nodes #866

Merged
merged 1 commit into from
Nov 28, 2021

Conversation

ellistarn
Contributor

1. Issue, if available:

2. Description of changes:
This bug is caused by the interaction of two things: the provisioners controller compares the state of the current provisioner to the persisted provisioner state, and hostname topology mutates that state. The scheduling logic mutates the provisioner's requirements, so the comparison sees a difference and inadvertently force-refreshes the provisioner while nodes are being launched. The refresh drains the provisioner, which can happen after capacity is launched but before the pods are bound. This results in multiple scale-outs (and eventual self-healing / scale-in).

The bug is due to me violating my own principles about not mutating state: https://github.com/aws/karpenter/blob/5d5798b5fefc757ef353889204c56138d8042066/pkg/controllers/provisioning/scheduling/topology.go#L99. In the long term, this mutation will be removed as part of a scheduling refactor.
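
For readers unfamiliar with the failure mode, here is a minimal sketch of how an in-place mutation can trip a state comparison, and how working on a copy avoids it. This is not Karpenter's actual code: `Provisioner`, `Requirement`, `injectHostnameMutating`, `injectHostnameCopying`, and `needsRefresh` are hypothetical stand-ins invented for illustration, and appending a hostname requirement is only an assumed example of the kind of mutation described above.

```go
package main

import (
	"fmt"
	"reflect"
)

// Requirement and Provisioner are hypothetical, simplified types.
// They are NOT the real Karpenter API types.
type Requirement struct {
	Key    string
	Values []string
}

type Provisioner struct {
	Name         string
	Requirements []Requirement
}

// DeepCopy returns a copy that shares no slices with the receiver.
func (p Provisioner) DeepCopy() Provisioner {
	out := Provisioner{Name: p.Name, Requirements: make([]Requirement, len(p.Requirements))}
	for i, r := range p.Requirements {
		out.Requirements[i] = Requirement{Key: r.Key, Values: append([]string(nil), r.Values...)}
	}
	return out
}

// injectHostnameMutating models the buggy pattern: scheduling adds a
// hostname requirement directly to the shared provisioner.
func injectHostnameMutating(p *Provisioner, hostname string) {
	p.Requirements = append(p.Requirements,
		Requirement{Key: "kubernetes.io/hostname", Values: []string{hostname}})
}

// injectHostnameCopying models the safer pattern: scheduling works on a
// private copy and leaves the shared provisioner untouched.
func injectHostnameCopying(p Provisioner, hostname string) Provisioner {
	out := p.DeepCopy()
	out.Requirements = append(out.Requirements,
		Requirement{Key: "kubernetes.io/hostname", Values: []string{hostname}})
	return out
}

// needsRefresh models a controller that refreshes the provisioner whenever
// its in-memory state differs from the persisted state.
func needsRefresh(current, persisted Provisioner) bool {
	return !reflect.DeepEqual(current, persisted)
}

func main() {
	persisted := Provisioner{
		Name: "default",
		Requirements: []Requirement{
			{Key: "topology.kubernetes.io/zone", Values: []string{"us-west-2a"}},
		},
	}

	// Buggy pattern: the in-memory provisioner drifts from the persisted one.
	current := persisted.DeepCopy()
	injectHostnameMutating(&current, "node-a")
	fmt.Println("mutating: needsRefresh =", needsRefresh(current, persisted))

	// Safer pattern: the in-memory provisioner still matches the persisted one.
	current = persisted.DeepCopy()
	scheduled := injectHostnameCopying(current, "node-a")
	fmt.Println("copying:  needsRefresh =", needsRefresh(current, persisted))
	fmt.Println("scheduled requirements:", len(scheduled.Requirements))
}
```

In the mutating variant the comparison reports a difference even though nothing changed in the persisted spec, which is the kind of spurious refresh described above. In the copying variant the shared provisioner stays equal to the persisted one, and only the scheduler's private copy carries the extra requirement.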

3. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify bot commented Nov 27, 2021

✔️ Deploy Preview for karpenter-docs-prod ready!

🔨 Explore the source changes: 709562c

🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/61a28319b520000007b1b528

😎 Browse the preview: https://deploy-preview-866--karpenter-docs-prod.netlify.app

@ellistarn changed the title from "Fixed a bug where hostname topology caused extra nodes to be launched" to "Fixed a bug where hostname topology launched extra nodes" Nov 27, 2021
Contributor

@JacobGabrielson JacobGabrielson left a comment

/lgtm

@JacobGabrielson JacobGabrielson merged commit cbdaa40 into aws:main Nov 28, 2021
@ellistarn ellistarn deleted the hostnametopology branch November 28, 2021 21:11