-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow longer name for AWS/Azure cluster and refactoring #1648
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Michaelvll, and nice tests! Shall we also test a name length close to AWS's char limit?
Good catch! Actually, we have to reduce the limit to 246 for AWS because of the limitation of DescribeInstance API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Michaelvll. Just some minor comments and a question about some additional characters added by the autoscaler. May be worth updating PR description with each cloud's at-limit test?
Great point! I added the tests for each cloud, and seems there are multiple corner cases that are caught by those tests. I fixed the cluster name limit for AWS and Azure. Thanks! |
…#1648) * Allow longer name for AWS/Azure cluster * Refactor the cluster name length checking * Refactor cloud.supports and optimize the failover speed * format * Refactor cloud features support * fix logging * UX fix * rename * Add comments * format * address comment * fix aws length * address comments * Add comment
…#1648) * Allow longer name for AWS/Azure cluster * Refactor the cluster name length checking * Refactor cloud.supports and optimize the failover speed * format * Refactor cloud features support * fix logging * UX fix * rename * Add comments * format * address comment * fix aws length * address comments * Add comment
This PR
cloud.supports
tocloud.check_features_are_supported
and change the boolean return value to exception raising.InvalidClusterNameError
andClusterUserIdentityError
to directly move to the next cloud, as those errors apply to the whole cloud.Previously:
Now:
Tested (run the relevant ones):
sky launch -c i-am-using-a-very-long-name-for-the-aws-cluster-it-should-be-more-than-63-characters-and-hopefully-it-works-this-sentence-has-135-chars --cloud aws
sky launch -c i-am-using-a-very-long-name-for-the-aws-cluster-it-should-be-more-than-63-characters-and-hopefully-it-works-by-checking-the-document-of-aws-we-found-that-the-cluster-name-length-should-be-less-than-250-due-to-the-tag-limit-this-sentence-has-246-c --num-nodes 2 --cloud aws -t t3.micro
(and 245 does not work)sky launch -c i-am-using-a-very-long-name-for-the-aws-cluste --cloud azure --region southcentralus --num-nodes 2
sky launch -c i-am-using-a-very-long-name-for-the-aws-cluster-it-should-be-mor --cloud lambda
workssky launch -c i-am-using-a-very-long-name-for-the-azure-cluster-it-should-be-more-than-63-characters-and-hopefully-it-works-this-sentence-has-136-chars --cloud azure
: correctly fails for the Azuresky launch -c i-am-using-a-very-long-name-for-the-aws-cluster-it-should-be-more-than-63-characters-and-hopefully-it-works-this-sentence-has-135-chars --gpus A100:8
, it successfully and quickly failover through Lambda/GCP/Azure and AWS (skip the first three clouds with a single try and try all the regions on AWS).