-
Notifications
You must be signed in to change notification settings - Fork 314
[integ-test] rotate instance types for some integration tests #6645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1. Unlike rotating OS, instance types are region dependent. Therefore, we cannot use general Jinja variables like `{{ OS_X86_1 }}`. We need to use region specific Jinja variables like `{{ US_EAST_1_INSTANCE_TYPE_0 }}`
2. For code efficiency, this commit only populates three large AWS regions. The code is extendable if more regions should be added.
3. This commit rotates instance types only on `test_essential_features` and `test_cluster_with_gpu_health_checks`. The code is extendable if more tests should be added.
4. Improve `test_cluster_with_gpu_health_checks` to be able to run on both x86 and arm
Signed-off-by: Hanwen <hanwenli@amazon.com>
| ) | ||
| for index in range(len(gpu_instances)): | ||
| instance_type = gpu_instances[(today_number + index) % len(gpu_instances)] | ||
| result[f"{region_jinja}_GPU_INSTANCE_TYPE_{index}"] = instance_type[: -len(".xlarge")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we removing ".xlarge" if we just add it back in the develop.yaml. Especially since we are only looking at xlarge instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make the approach more extensible.
For example test_dcv_configuration uses .2xlarge. If we lock down to .xlarge, this mechanism cannot be used by test_dcv_configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but right now we are already locked down to .xlarge due to this line if instance_type["InstanceType"].endswith(".xlarge"). If you really wanted to make it extendable, then shouldn't _get_instance_type_parameters() have a parameter that takes in a list of instance sizes and have the output be a map where the key is the instance size and the value is the list of instance types.
This bug was introduced by aws#6645 Signed-off-by: Hanwen <hanwenli@amazon.com>
This bug was introduced by #6645 Signed-off-by: Hanwen <hanwenli@amazon.com>
This bug was introduced by aws#6645 Signed-off-by: Hanwen <hanwenli@amazon.com>
{{ OS_X86_1 }}. We need to use region specific Jinja variables like{{ US_EAST_1_INSTANCE_TYPE_0 }}test_essential_featuresandtest_cluster_with_gpu_health_checks. The code is extendable if more tests should be added.test_cluster_with_gpu_health_checksto be able to run on both x86 and armChecklist
developadd the branch name as prefix in the PR title (e.g.[release-3.6]).Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.