fix: Fix a few karpenter-core issues #19

Merged
merged 1 commit into from
Oct 2, 2023

@Fei-Guo (Collaborator) commented Sep 30, 2023

This change fixes a few karpenter-core issues:

  1. Turn off the garbagecollector controller for now. This controller tries to delete the Machine CR if the node has not shown up within 10 seconds after the machine is marked launched. In the current implementation, it always throws errors when a new Machine CR is created and the node object has not yet shown up. I will send a follow-up change to fix the garbagecollector controller.

  2. After registering the machine, always populate the machine's status.allocatable and status.capacity from the node status. Otherwise, the machine status is initialized with instanceType info, which is inconsistent with the actual node status. This leads to errors reported by the consistency controller.

  3. Remove the populateInflight call. It is difficult to control when the node.kubernetes.io/instance-type label is added to the node object, so this function always reports errors before the label is populated. We don't need this functionality in gpu-provisioner, so remove the code to avoid confusing error logs.

  4. Remove unnecessary debug logs.
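
As a rough illustration of item 2, the sketch below copies capacity and allocatable from a node's reported status into a machine's status, replacing whatever instance-type estimate was there before. It is not the code from this PR: `ResourceList`, `NodeStatus`, `MachineStatus`, `copyResources`, and `syncStatusFromNode` are hypothetical, simplified stand-ins (quantities are plain strings rather than Kubernetes resource quantities), and the sample values are made up.

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical stand-in for a Kubernetes
// resource list; quantities are kept as plain strings for brevity.
type ResourceList map[string]string

// NodeStatus is a trimmed-down, hypothetical view of what a node reports.
type NodeStatus struct {
	Capacity    ResourceList
	Allocatable ResourceList
}

// MachineStatus is a trimmed-down, hypothetical view of the machine status.
type MachineStatus struct {
	Capacity    ResourceList
	Allocatable ResourceList
}

// copyResources returns an independent copy so that later changes to the
// node's maps do not leak into the machine status.
func copyResources(in ResourceList) ResourceList {
	out := make(ResourceList, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// syncStatusFromNode overwrites the machine's capacity and allocatable with
// the values the node reports, discarding any instance-type estimate.
func syncStatusFromNode(m *MachineStatus, n NodeStatus) {
	m.Capacity = copyResources(n.Capacity)
	m.Allocatable = copyResources(n.Allocatable)
}

func main() {
	// Status as it might look when initialized from instance-type info
	// (illustrative values only).
	machine := MachineStatus{
		Capacity:    ResourceList{"memory": "16Gi"},
		Allocatable: ResourceList{"memory": "16Gi"},
	}
	// Status as the registered node might actually report it.
	node := NodeStatus{
		Capacity:    ResourceList{"memory": "15Gi"},
		Allocatable: ResourceList{"memory": "14Gi"},
	}

	syncStatusFromNode(&machine, node)
	fmt.Println(machine.Capacity["memory"], machine.Allocatable["memory"])
}
```

Copying the maps instead of assigning them directly keeps the machine status from aliasing the node object, which is one plausible way to keep a consistency check comparing the two from seeing spurious differences.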

@Fei-Guo changed the title from "Fix a few karpenter-core issues" to "fix: Fix a few karpenter-core issues" on Sep 30, 2023
@helayoty (Collaborator) left a comment:

Just one comment about logs

@Fei-Guo merged commit 4b10388 into main on Oct 2, 2023
3 of 4 checks passed
@Fei-Guo deleted the fguo-dev1 branch on October 2, 2023 19:44