feat(aws): AMI architecture detection and cross-validation#664
Conversation
Pull Request Test Coverage Report for Build 21985915080Details
💛 - Coveralls |
There was a problem hiding this comment.
Pull request overview
This PR adds AMI architecture detection and cross-validation to catch mismatches between AMI architecture and instance type architecture before instance creation.
Changes:
- Added
describeImageArchhelper to query AMI architecture from EC2 - Added
getInstanceTypeArchto query supported architectures for instance types - Updated
resolveImageForNodeto return architecture when explicit ImageId is provided - Added architecture cross-validation in
DryRun()to detect mismatches - Updated test mocks to include ProcessorInfo with SupportedArchitectures
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/provider/aws/image.go | Added describeImageArch and getInstanceTypeArch helpers; updated ResolvedImage to include Architecture field; modified resolveImageForNode to query architecture for explicit ImageIds; updated resolveOSToAMI and setLegacyAMI to store architecture |
| pkg/provider/aws/dryrun.go | Added architecture compatibility validation that compares AMI architecture against instance type supported architectures |
| pkg/provider/aws/image_test.go | Added comprehensive unit tests for describeImageArch, getInstanceTypeArch, resolveImageForNode with explicit ImageId, and DryRun architecture validation (both mismatch and match cases) |
| pkg/provider/aws/mock_ec2_test.go | Updated DescribeInstanceTypes mock to return ProcessorInfo with SupportedArchitectures |
| pkg/provider/aws/aws_test.go | Updated test mocks to include ProcessorInfo for instance types |
| pkg/provider/aws/aws_ginkgo_test.go | Updated test mocks to include ProcessorInfo for instance types |
pkg/provider/aws/image.go
Outdated
| // Store the resolved architecture for cross-validation in DryRun | ||
| if p.Spec.Image.Architecture == "" { | ||
| p.Spec.Image.Architecture = "x86_64" // Legacy default | ||
| } |
There was a problem hiding this comment.
The setLegacyAMI function should store the normalized architecture value that findLegacyAMI uses internally. Currently, if a user specifies Architecture as "amd64" or "aarch64", findLegacyAMI normalizes it to "x86_64" or "arm64" internally, but this normalized value is not stored back in p.Spec.Image.Architecture. This could cause the architecture validation in DryRun to fail incorrectly when comparing unnormalized user input (e.g., "amd64") against EC2 API values (e.g., "x86_64").
pkg/provider/aws/image.go
Outdated
| } | ||
|
|
||
| p.Spec.Image.ImageId = &resolved.ImageID | ||
| p.Spec.Image.Architecture = arch |
There was a problem hiding this comment.
The resolveOSToAMI function should normalize the architecture before storing it in p.Spec.Image.Architecture. The AMI resolver normalizes the architecture internally (e.g., "amd64" to "x86_64"), but this function stores the un-normalized value. This could cause the architecture validation in DryRun to fail incorrectly when comparing user input like "amd64" against EC2 API values like "x86_64". Consider calling ami.NormalizeArch(arch) before storing.
48116ce to
822c00a
Compare
Add describeImageArch helper to query an AMI's architecture from EC2 and getInstanceTypeArch to query instance type supported architectures. Update resolveImageForNode to return architecture for all code paths including explicit ImageId. Cross-validate AMI architecture against instance type supported architectures in DryRun() to catch mismatches (e.g., arm64 AMI + x86_64 instance type) with a clear error message before instance creation fails. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
822c00a to
00f7c51
Compare
Summary
describeImageArchhelper to query AMI architecture from EC2ResolvedImageto includeArchitecturefieldresolveImageForNodeto return architecture for explicit ImageId (was skipping all arch detection)getInstanceTypeArchto query instance type supported architectures viaProcessorInfoDryRun()resolveOSToAMIto store resolved architectureCatches arm64 AMI + x86_64 instance type mismatches with a clear error message instead of cryptic boot failures.
Test plan
describeImageArchwith mock EC2 clientresolveImageForNodereturning architecture for explicit ImageIdDryRun()detecting architecture mismatchDryRun()succeeding when architectures matchgo test ./pkg/provider/aws/...passes (84/84)golangci-lint run ./...passes (0 issues)