feat(aws): AMI architecture detection and cross-validation #664

Merged
ArangoGutierrez merged 1 commit into NVIDIA:main from ArangoGutierrez:fix/arm64-ami-arch-validation
Feb 13, 2026

@ArangoGutierrez
Collaborator

Summary

  • Add describeImageArch helper to query AMI architecture from EC2
  • Update ResolvedImage to include Architecture field
  • Update resolveImageForNode to return architecture for an explicit ImageId (this path previously skipped all architecture detection)
  • Add getInstanceTypeArch to query instance type supported architectures via ProcessorInfo
  • Cross-validate AMI arch against instance type arch in DryRun()
  • Update resolveOSToAMI to store resolved architecture

Catches arm64 AMI + x86_64 instance type mismatches with a clear error message instead of cryptic boot failures.
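
As a rough illustration of the cross-validation described above, the check can be sketched as a membership test of the AMI architecture in the instance type's supported architectures. This is a minimal sketch, not the code merged in this PR: the helper name validateArch, its signature, and the error wording are assumptions, and the instance types in main are only example values.

```go
package main

import (
	"fmt"
	"strings"
)

// validateArch is a hypothetical helper (not the PR's implementation).
// It returns an error when amiArch is not one of the architectures that
// the instance type reports as supported.
func validateArch(amiArch, instanceType string, supported []string) error {
	for _, s := range supported {
		if s == amiArch {
			return nil
		}
	}
	return fmt.Errorf(
		"architecture mismatch: AMI is %s but instance type %s supports [%s]",
		amiArch, instanceType, strings.Join(supported, ", "))
}

func main() {
	// Example values only: an arm64 AMI paired with an x86_64 instance type.
	fmt.Println(validateArch("arm64", "g4dn.xlarge", []string{"x86_64"}))
	// Matching architectures produce a nil error.
	fmt.Println(validateArch("arm64", "g5g.xlarge", []string{"arm64"}))
}
```

Running the check during DryRun() rather than at launch is what turns a late boot failure into an early, readable error.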

Test plan

  • Unit tests for describeImageArch with mock EC2 client
  • Unit test for resolveImageForNode returning architecture for explicit ImageId
  • Unit test for DryRun() detecting architecture mismatch
  • Unit test for DryRun() succeeding when architectures match
  • go test ./pkg/provider/aws/... passes (84/84)
  • golangci-lint run ./... passes (0 issues)
  • CI passes

Copilot AI review requested due to automatic review settings February 13, 2026 11:00
@coveralls
coveralls commented Feb 13, 2026

Pull Request Test Coverage Report for Build 21985915080

Details

  • 74 of 82 (90.24%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.6%) to 48.07%

Changes Missing Coverage      Covered Lines   Changed/Added Lines   %
pkg/provider/aws/dryrun.go    18              21                    85.71%
pkg/provider/aws/image.go     56              61                    91.8%

Totals
Change from base Build 21955389842: 0.6%
Covered Lines: 2565
Relevant Lines: 5336

💛 - Coveralls

Copilot AI left a comment

Pull request overview

This PR adds AMI architecture detection and cross-validation to catch mismatches between AMI architecture and instance type architecture before instance creation.

Changes:

  • Added describeImageArch helper to query AMI architecture from EC2
  • Added getInstanceTypeArch to query supported architectures for instance types
  • Updated resolveImageForNode to return architecture when explicit ImageId is provided
  • Added architecture cross-validation in DryRun() to detect mismatches
  • Updated test mocks to include ProcessorInfo with SupportedArchitectures

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Summary per file:

  • pkg/provider/aws/image.go: Added describeImageArch and getInstanceTypeArch helpers; updated ResolvedImage to include Architecture field; modified resolveImageForNode to query architecture for explicit ImageIds; updated resolveOSToAMI and setLegacyAMI to store architecture
  • pkg/provider/aws/dryrun.go: Added architecture compatibility validation that compares AMI architecture against instance type supported architectures
  • pkg/provider/aws/image_test.go: Added comprehensive unit tests for describeImageArch, getInstanceTypeArch, resolveImageForNode with explicit ImageId, and DryRun architecture validation (both mismatch and match cases)
  • pkg/provider/aws/mock_ec2_test.go: Updated DescribeInstanceTypes mock to return ProcessorInfo with SupportedArchitectures
  • pkg/provider/aws/aws_test.go: Updated test mocks to include ProcessorInfo for instance types
  • pkg/provider/aws/aws_ginkgo_test.go: Updated test mocks to include ProcessorInfo for instance types

Comment on lines 252 to 255

    // Store the resolved architecture for cross-validation in DryRun
    if p.Spec.Image.Architecture == "" {
        p.Spec.Image.Architecture = "x86_64" // Legacy default
    }

Copilot AI Feb 13, 2026

The setLegacyAMI function should store the normalized architecture value that findLegacyAMI uses internally. Currently, if a user specifies Architecture as "amd64" or "aarch64", findLegacyAMI normalizes it to "x86_64" or "arm64" internally, but this normalized value is not stored back in p.Spec.Image.Architecture. This could cause the architecture validation in DryRun to fail incorrectly when comparing unnormalized user input (e.g., "amd64") against EC2 API values (e.g., "x86_64").

}

p.Spec.Image.ImageId = &resolved.ImageID
p.Spec.Image.Architecture = arch

Copilot AI Feb 13, 2026

The resolveOSToAMI function should normalize the architecture before storing it in p.Spec.Image.Architecture. The AMI resolver normalizes the architecture internally (e.g., "amd64" to "x86_64"), but this function stores the un-normalized value. This could cause the architecture validation in DryRun to fail incorrectly when comparing user input like "amd64" against EC2 API values like "x86_64". Consider calling ami.NormalizeArch(arch) before storing.
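
To make the normalization concern concrete, here is a minimal sketch of alias handling applied before comparison. The function normalizeArch below is a hypothetical stand-in written for illustration; it is not the body of ami.NormalizeArch, whose actual behavior is not shown in this conversation. The alias pairs (amd64 to x86_64, aarch64 to arm64) are the ones named in the review comments; the case-folding and whitespace trimming are extra assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeArch is a hypothetical normalizer: it maps common aliases to the
// architecture names the EC2 API reports, and leaves unknown values as-is.
func normalizeArch(arch string) string {
	switch strings.ToLower(strings.TrimSpace(arch)) {
	case "amd64", "x86_64":
		return "x86_64"
	case "arm64", "aarch64":
		return "arm64"
	default:
		return arch
	}
}

func main() {
	userInput := "amd64"  // as a user might write it in the spec
	fromEC2 := "x86_64"   // as the EC2 API reports it

	// A raw string comparison reports a mismatch for equivalent values.
	fmt.Println(userInput == fromEC2)
	// Comparing normalized values avoids the false mismatch.
	fmt.Println(normalizeArch(userInput) == normalizeArch(fromEC2))
}
```

Storing the normalized value once, at resolution time, means later comparisons in DryRun() do not each need to remember to normalize.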

@ArangoGutierrez force-pushed the fix/arm64-ami-arch-validation branch from 48116ce to 822c00a on February 13, 2026 11:33
Add describeImageArch helper to query an AMI's architecture from EC2
and getInstanceTypeArch to query instance type supported architectures.
Update resolveImageForNode to return architecture for all code paths
including explicit ImageId. Cross-validate AMI architecture against
instance type supported architectures in DryRun() to catch mismatches
(e.g., arm64 AMI + x86_64 instance type) with a clear error message
before instance creation fails.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
@ArangoGutierrez force-pushed the fix/arm64-ami-arch-validation branch from 822c00a to 00f7c51 on February 13, 2026 11:55
@ArangoGutierrez merged commit 53ea515 into NVIDIA:main on Feb 13, 2026
19 checks passed