Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[autoscaler][aws] Use subnets in only one VPC #14868

Merged
merged 8 commits into from
Mar 24, 2021

Conversation

DmitriGekhtman
Copy link
Contributor

@DmitriGekhtman DmitriGekhtman commented Mar 23, 2021

Why are these changes needed?

The default behavior of _configure_subnets in aws/config.py is to fetch all available subnets across all available VPCs
and place these subnets into the SubnetIDs field of the head and worker nodes.
However, _configure_security group generates only a security group in the VPC of one of the subnets.
AWSNodeProvider.create_node then might attempt to launch into a subnet outside of the VPC of the security group, leading to the error observed here #5053 (comment)

This PR fixes the bug by filtering subnets so that we only use subnets belonging to one VPC.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@DmitriGekhtman
Copy link
Contributor Author

Pushing for initial comment.

Working on a unit test for this.

@DmitriGekhtman
Copy link
Contributor Author

added test to test_autoscaler_aws.py
test_autoscaler_aws.py passes for me locally

@DmitriGekhtman DmitriGekhtman added the release-blocker P0 Issue that blocks the release label Mar 23, 2021
Copy link
Contributor

@pcmoritz pcmoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix, this is great :)

@DmitriGekhtman DmitriGekhtman added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Mar 24, 2021
@DmitriGekhtman
Copy link
Contributor Author

tests look fine
@pcmoritz could you merge this?

@AmeerHajAli AmeerHajAli added the P0 Issues that should be fixed in short order label Mar 24, 2021
@pcmoritz pcmoritz merged commit 1045856 into ray-project:master Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P0 Issues that should be fixed in short order release-blocker P0 Issue that blocks the release tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants