Cluster-autoscaler does not work with VPC-Endpoint (AWS/EKS) #2829
Comments
This was updated in master 3 months ago, but there has been no release since. Any idea when a new release will be cut?
/assign
Thanks. I can help resolve this issue and request a newer release. I think what we can do is bump the SDK version; users can then set the env variable AWS_STS_REGIONAL_ENDPOINTS=regional. The SDK client will pick up the env variable and resolve the right endpoint. Is that correct?
@Jeffwan That is correct. It's resolved in the version that is already in master.
See go.mod in master. The upstream fix was here: https://github.com/aws/aws-sdk-go/pull/2779/files
Fixes #2532
@ajohnstone Thanks. I plan to do a few cherry-picks soon; I will make the change and include it in the new release.
Just made the changes on 1.15. I will make the changes for the rest of the versions.
Sorry, for now only 1.15.6 supports this case (https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.15.6). The changes for 1.14, 1.16 and 1.17 are not included in this release. They will be merged, and in the short term you can build an image yourself. If you need any help, let me know.
We are just migrating to 1.15, thanks a lot. If I can find time I will extend the AWS documentation.
Successfully tested with 1.15.6, thanks again. I opened #3052. From my side this could be closed; not sure if you want to keep it open until the other versions support it.
I will leave it open to track changes in other branches. @maust Thanks for the contribution. I will review the doc change.
Hi @Jeffwan, any plan to fix this in the other versions (1.16.x -> 1.17.x)?
Hello, is there any info on when this will be released in a 1.16.x or 1.17.x release? Thanks
1.16.6 and 1.17.3 have been released. Please download the latest version. I will close the issue. Thanks everyone for all your feedback.
/close
@Jeffwan: Closing this issue.
We have a quite locked-down network in AWS (no internet connectivity at all). AWS services are reachable only via VPC endpoints, and on-premises systems only via Direct Connect. At the same time, we would like to use IAM roles.
Kubernetes: 1.14
Cluster-Autoscaler: 1.14.7
When using cluster-autoscaler, it cannot fetch credentials via STS using the IAM role. To my understanding, the issue is caused by cluster-autoscaler not using the regional STS endpoint (https://sts.eu-central-1.amazonaws.com) but the global one (https://sts.amazonaws.com) instead. With VPC endpoints it is not possible to replace the global endpoint.
With github.com/aws/aws-sdk-go v1.25.18 (see https://github.com/aws/aws-sdk-go/blob/master/CHANGELOG.md), configuration of regional STS endpoints was introduced via the env variable AWS_STS_REGIONAL_ENDPOINTS=regional.
I tried setting the region via the env variable AWS_REGION, together with AWS_STS_REGIONAL_ENDPOINTS, but the global endpoint is still used.
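For reference, this is roughly the behavior I would expect from an SDK version that honors the variable. It is only a simplified model written in Python for illustration, not the aws-sdk-go resolver itself: the function name `sts_endpoint` is made up, and I am assuming that any value other than `regional` (including the unset default, which I call `legacy` here) ends up on the global endpoint.

```python
import os

GLOBAL_STS = "https://sts.amazonaws.com"


def sts_endpoint(region, env=None):
    """Return the STS URL a client would call under this simplified model.

    Hypothetical helper: it only mirrors the behavior described in this
    issue and is not taken from the SDK source.
    """
    env = os.environ if env is None else env
    # Assumption: an unset variable behaves like "legacy".
    mode = env.get("AWS_STS_REGIONAL_ENDPOINTS", "legacy").strip().lower()
    if mode == "regional" and region:
        # Regional form, e.g. https://sts.eu-central-1.amazonaws.com
        return f"https://sts.{region}.amazonaws.com"
    # Simplification: everything else falls back to the global endpoint.
    return GLOBAL_STS


if __name__ == "__main__":
    print(sts_endpoint("eu-central-1", {}))
    print(sts_endpoint("eu-central-1", {"AWS_STS_REGIONAL_ENDPOINTS": "regional"}))
```

With the vendored SDK in 1.14 the variable seems to have no effect at all, which matches what I see: the first case is what happens regardless of the setting.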
After looking at the 1.14 branch of cluster-autoscaler, it looks like v1.23.22 is used (see https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-release-1.14/cluster-autoscaler/vendor/github.com/aws/aws-sdk-go/CHANGELOG.md).
I also checked the other cluster-autoscaler branches:
So I would assume that supporting such a use case would be possible by upgrading the aws-sdk-go version to >= v1.25.18 - let me know if I can be of help.
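The version check itself is trivial, but to make the comparison explicit, here is a small Python sketch. The helper names `parse_version` and `supports_regional_sts` are hypothetical, and it assumes plain `vMAJOR.MINOR.PATCH` tags like the ones in the aws-sdk-go changelog.

```python
def parse_version(tag):
    """Turn a tag such as 'v1.25.18' into a comparable tuple (1, 25, 18)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


# First aws-sdk-go release with AWS_STS_REGIONAL_ENDPOINTS, per the changelog.
MIN_SDK = parse_version("v1.25.18")


def supports_regional_sts(tag):
    """True if the given aws-sdk-go tag is at least v1.25.18."""
    return parse_version(tag) >= MIN_SDK


if __name__ == "__main__":
    # v1.23.22 is the version vendored in the 1.14 branch.
    print(supports_regional_sts("v1.23.22"))
    print(supports_regional_sts("v1.25.18"))
```

Tuples compare element by element, so v1.23.22 sorts before v1.25.18 without any string tricks.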
logs:
E0213 16:05:54.390164 1 aws_manager.go:259] Failed to regenerate ASG cache: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials
caused by: RequestError: send request failed
caused by: Post https://sts.amazonaws.com/: dial tcp 54.239.29.25:443: i/o timeout
F0213 16:05:54.390200 1 aws_cloud_provider.go:330] Failed to create AWS Manager: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials
caused by: RequestError: send request failed
caused by: Post https://sts.amazonaws.com/: dial tcp 54.239.29.25:443: i/o timeout
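In case someone wants to check their own logs for the same symptom, here is a rough Python sketch that pulls the dialed host out of the `caused by: Post ...` lines. The names `dialed_hosts` and `uses_global_sts` are made up, and the regular expression only assumes the line shape shown in the log above.

```python
import re
from urllib.parse import urlparse

# Matches e.g. "Post https://sts.amazonaws.com/: dial tcp 54.239.29.25:443: ..."
POST_URL = re.compile(r"Post (\S+): dial tcp")


def dialed_hosts(log_text):
    """Collect the hostnames of all URLs the client tried to POST to."""
    return {urlparse(m.group(1)).hostname for m in POST_URL.finditer(log_text)}


def uses_global_sts(log_text):
    """True if any request in the log went to the global STS endpoint."""
    return "sts.amazonaws.com" in dialed_hosts(log_text)


if __name__ == "__main__":
    sample = (
        "caused by: Post https://sts.amazonaws.com/: "
        "dial tcp 54.239.29.25:443: i/o timeout\n"
    )
    print(dialed_hosts(sample))
    print(uses_global_sts(sample))
```

If the host is `sts.amazonaws.com` rather than `sts.<region>.amazonaws.com`, the client is still resolving the global endpoint.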
Attached you can find the kubernetes deployment yaml.
deployment.txt