smoke tests: replace small/full with modern/legacy/windows#627
Merged
james-nesbitt merged 7 commits intomainfrom Apr 30, 2026
Merged
smoke tests: replace small/full with modern/legacy/windows#627james-nesbitt merged 7 commits intomainfrom
james-nesbitt merged 7 commits intomainfrom
Conversation
- test/platforms.go: add Ubuntu24 (ubuntu_24.04) and Windows2025 (windows_2025) - platform.tf: local platform override for ubuntu_24.04; splits unique platforms into upstream and local sets; builds platforms_with_ami via merge - key.tf: drop upstream module.key; introduce tls_private_key + aws_key_pair with var.ssh_key_algorithm (ed25519|rsa); RSA 4096-bit for Windows password retrieval - versions.tf: add hashicorp/tls >= 4.0 provider - provision.tf: module.key.keypair_id -> aws_key_pair.this.key_pair_id - smoke_test.go: replace TestSmallCluster/TestSupportedMatrixCluster with TestModernCluster, TestLegacyCluster, TestWindowsCluster; shared runSmokeTest helper; no global mutable state; windows password generated per-test via crypto/rand - Makefile: smoke-small/smoke-full -> smoke-modern/smoke-legacy/smoke-windows - .github/workflows/smoke-tests.yaml: single file, three jobs, PR label gates + push-to-main unconditional run - Delete smoke-test-small.yaml, smoke-test-full.yaml
Resolves CodeQL findings: workflow does not contain permissions. Consistent with build.yml and pr.yml patterns in this repo.
- Move defer terraform.Destroy before InitAndApplyE so it is registered before any t.Fatal call; t.Fatal calls runtime.Goexit which runs defers, but only those already registered at the point of the call. - Change generateWindowsPassword to accept *testing.T and use t.Fatalf instead of panic; panic bypasses the testing framework's cleanup hooks whereas t.Fatalf routes through runtime.Goexit so registered defers fire.
AWS launch templates expect the key pair name string for key_name, not the key pair ID (key-XXXX). aws_key_pair.this.key_pair_id returns the ID; aws_key_pair.this.key_name returns the name. Using the ID caused all ASG CreateAutoScalingGroup calls to fail with 'key pair does not exist'. Also include the extra_tags smoke identifier added earlier in this session.
The mirantis/ucp uninstall-ucp bootstrapper container has an internal node-response timeout that fires on large or mixed-OS clusters before the go test timeout can intervene. Observed failures: - MKE 3.9.2 (modern matrix, 7 Linux nodes): all nodes fail to ack uninstall within the bootstrapper deadline - MKE 3.8.8 + Windows 2025 (windows matrix): Win2025 node fails to ack; Win2019/2022 succeed smoke-legacy (MKE 3.8.8, 6 Linux nodes) continues to pass Reset(). Infrastructure is destroyed unconditionally by defer terraform.Destroy regardless of Reset() outcome, so no AWS resources are orphaned on failure. Demote the assertion to a t.Logf warning so CI gates on install/apply correctness, not on the MKE uninstaller timeout.
Use single quotes for the SetPassword argument so PowerShell treats
the injected value literally. With double quotes, any $-containing
password (e.g. 'Io4$$WZy...') gets corrupted by PowerShell variable
expansion ($$ → PID, $WZy → empty var), causing the Windows instance
to boot with a different password than Launchpad holds → WinRM 401.
Terraform templatefile() substitutes ${windows_administrator_password}
before userdata reaches EC2, so single-quoting is safe: Terraform sees
and expands the ${} expression; PowerShell then receives the literal value
with no further interpretation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces the two legacy smoke tests (
smoke-small,smoke-full) with three named tests covering distinct OS matrices, MCR channels, and SSH key requirements.modernstable-29.23.9.2legacystable-25.03.8.8windowsstable-25.03.8.8Changes
test/platforms.go— addUbuntu24(ubuntu_24.04) andWindows2025(windows_2025)platform.tf— local platform overrides forubuntu_24.04andwindows_2025; splits platforms into upstream/local sets; both produce the sameplatforms_with_amishape consumed downstreamkey.tf— drop upstreammodule "key"; introducetls_private_key+aws_key_pairgated byvar.ssh_key_algorithm(ed25519default,rsafor windows); RSA 4096-bit required for AWS Windows password retrievalversions.tf— addhashicorp/tls >= 4.0providerprovision.tf—module.key.keypair_id→aws_key_pair.this.key_pair_iduserdata_windows.tpl— WinRM bootstrap script injecting the generated administrator passwordtest/smoke/smoke_test.go— replaceTestSmallCluster/TestSupportedMatrixClusterwithTestModernCluster,TestLegacyCluster,TestWindowsCluster; sharedrunSmokeTesthelper; no global mutable state; Windows password viacrypto/rand; all resources taggedlaunchpad-smoke-test: trueMakefile—smoke-small/smoke-full→smoke-modern/smoke-legacy/smoke-windows.github/workflows/smoke-tests.yaml— single workflow, three jobs; PR label gates; push tomainruns all unconditionallysmoke-test-small.yaml,smoke-test-full.yamlRepo labels created
smoke-test,smoke-modern,smoke-legacy,smoke-windowsCI results (run 25169346114)
smoke-legacysmoke-modernsmoke-windowsReset() rationale
product.Reset()calls themirantis/ucp uninstall-ucpbootstrapper which has an internal node-response deadline. This fires before the go test timeout on:smoke-legacy(MKE 3.8.8, 6 Linux nodes) passes Reset() cleanly — confirming this is version/platform-specific, not a Launchpad defect. Sincedefer terraform.Destroyunconditionally destroys all infrastructure, no AWS resources are orphaned. Reset() is demoted tot.Logfwarning so CI gates on install/apply correctness.Verification
go build ./...passesterraform init -upgrade && terraform validatepassesus-east-1(L-F678F1CE) to support 3 parallel VPCslaunchpad-smoke-test: true