Add retries in pluto to handle eventual-consistent EC2 private DNS names #3363

Closed
etungsten opened this issue Aug 24, 2023 · 0 comments · Fixed by #3364
Labels
area/kubernetes K8s including EKS, EKS-A, and including VMW status/needs-triage Pending triage or re-evaluation

Comments

Contributor

etungsten commented Aug 24, 2023

In rare cases, calls to EC2 DescribeInstances return an empty string for the private DNS name of a newly launched instance. The API response settles once the instance's private DNS name becomes consistent. This is a problem because pluto derives the node name from the instance's private DNS name.

This issue is described in more detail in kubernetes/cloud-provider-aws#635. The fix for the in-tree cloud provider is in kubernetes/kubernetes#118421, which would cover all pre-1.27 clusters.

However, for 1.27+ clusters we no longer use the in-tree cloud provider and instead depend on the external cloud provider (see #3033). We therefore need to add similar retry logic to pluto to fix this issue.
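To illustrate the kind of retry logic meant here, below is a minimal sketch, not the actual change made in pluto. It assumes a hypothetical `fetch` callable that performs one DescribeInstances call and returns the instance's private DNS name (possibly empty). The function name, attempt count, and backoff values are illustrative assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_private_dns_name(
    fetch: Callable[[], Optional[str]],
    max_attempts: int = 5,
    base_delay_secs: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `fetch` until it returns a non-empty private DNS name.

    `fetch` is a hypothetical stand-in for a single DescribeInstances call
    that extracts the private DNS name from the response. The defaults for
    `max_attempts` and `base_delay_secs` are assumptions, not tuned values.
    """
    for attempt in range(max_attempts):
        name = fetch()
        if name:  # None and "" both mean "not consistent yet"
            return name
        if attempt + 1 < max_attempts:
            # Exponential backoff: base, 2*base, 4*base, ...
            sleep(base_delay_secs * (2 ** attempt))
    raise TimeoutError(
        f"private DNS name still empty after {max_attempts} attempts"
    )
```

Bounding the number of attempts keeps a node from hanging indefinitely at boot if the name never appears; failing loudly is preferable to registering with an empty node name.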

See awslabs/amazon-eks-ami#1383 for the corresponding fix on the EKS optimized AMI side.

@etungsten etungsten added area/kubernetes K8s including EKS, EKS-A, and including VMW status/needs-triage Pending triage or re-evaluation labels Aug 24, 2023
@etungsten etungsten self-assigned this Aug 24, 2023