Validate published Aspire CLI builds end-to-end from AzDO + GH #17532
radical wants to merge 9 commits into
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17532
Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17532"
Re-running the failed jobs in the CI workflow for this pull request because 2 jobs were identified as retry-safe transient failures in the CI run attempt.
Matched test failure patterns (1 test)
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
…Strategy
Channel-based installs (quality=dev / staging / release) go through aka.ms
aliases whose targets can be stale, so the install itself can't catch
"channel pointer didn't get updated after publish". Add an optional
post-install version assertion: ExpectedVersion on a strategy causes
VerifyAspireCliVersionAsync to run `aspire --version` after install and
fail with CLI_VERSION_MISMATCH if it doesn't match.
The selector layer exposes this as:
- WithExpectedVersion(version) builder for callers constructing a
strategy directly. Validates the value against the same regex
FromVersion uses — the value is interpolated unquoted into a bash
equality check, so the regex doubles as a shell-safety guard.
- ASPIRE_E2E_EXPECTED_VERSION env var applied as an override in
Detect() when the chosen strategy doesn't already carry a
deterministic ExpectedVersion (LocalArchive from nupkg, DotnetTool
with explicit version still win). Whitespace is trimmed and
treated as unset so a blank workflow_dispatch input doesn't trip
the assertion.
Tests cover the builder, both whitespace and invalid-format rejection,
the env override on quality + version strategies, the no-op for
deterministic strategies, and trim/whitespace handling on the env path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
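The override rules described in this commit can be sketched roughly as follows. This is a JavaScript illustration only, not the C# in CliInstallStrategy.cs: the function names, the strategy object shape, and VERSION_PATTERN are all assumptions made for the example (the commit does not show the real regex), and comparing the reported version by exact string equality is also assumed.

```javascript
// Illustrative sketch only: names, object shape, and VERSION_PATTERN are
// assumptions, not the API or regex used by CliInstallStrategy.cs.
const VERSION_PATTERN = /^[0-9]+(\.[0-9]+)*(-[0-9A-Za-z.]+)?(\+[0-9A-Za-z.]+)?$/;

// Builder: rejects blank and badly formatted values. Because the value is
// later interpolated into a shell comparison, the pattern also acts as a
// safety guard against shell metacharacters.
function withExpectedVersion(strategy, version) {
  if (typeof version !== 'string' || version.trim() === '') {
    throw new TypeError('expected version must be a non-empty string');
  }
  if (!VERSION_PATTERN.test(version)) {
    throw new TypeError(`expected version has an invalid format: ${version}`);
  }
  return { ...strategy, expectedVersion: version };
}

// Env override: blank or whitespace-only input counts as unset, and a
// strategy that already carries an expected version keeps its own value.
function applyEnvOverride(strategy, env) {
  const raw = (env.ASPIRE_E2E_EXPECTED_VERSION ?? '').trim();
  if (raw === '') return strategy;
  if (strategy.expectedVersion) return strategy;
  return withExpectedVersion(strategy, raw);
}

// Post-install check: throws when the installed CLI reports another version.
function verifyCliVersion(strategy, reportedVersion) {
  const actual = reportedVersion.trim();
  if (strategy.expectedVersion && actual !== strategy.expectedVersion) {
    throw new Error(
      `CLI_VERSION_MISMATCH:expected=${strategy.expectedVersion} actual=${actual}`);
  }
}
```

With these definitions, a blank workflow_dispatch input leaves the strategy unchanged, while a quality-based strategy combined with ASPIRE_E2E_EXPECTED_VERSION=13.4.0 fails the check if the channel resolves to any other build.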
The release-publish-nuget pipeline dispatched release-github-tasks.yml
via an inline pwsh block + a per-script-call ASPIRE_BOT_APP_ID/
PRIVATE_KEY env mapping. Add a second AzDO dispatcher (for the upcoming
validate-published-build workflow) and they'd both repeat the same
JSON-encode + secret-mapping ceremony.
Extract:
- dispatch-github-workflow-steps.yml — reusable template covering the
pwsh dispatch step, JSON-via-env input passing (avoids embedding a
JSON literal in the script body), and the post-step that surfaces
the dispatched run URL. Takes workflowFile, workflowRef, inputs,
and a noWait switch (fire-and-forget vs wait-for-completion).
- dispatch-github-workflow.ps1 — renamed from
dispatch-release-github-tasks.ps1 (the body was already generic).
Adds a -NoWait switch so fire-and-forget callers skip the run-id
resolution + polling path.
Refactor DispatchGitHubTasksJob to invoke the new template. As part of
that, lift the source-build release-branch derivation up to PrepareJob's
deriveReleaseVersion step (releaseBranchEffective) so it can be reused
by future consumers — DispatchGitHubTasksJob now reads it via
stageDependencies rather than recomputing inline. Drop ReleaseBranchDerived.
No behavior change to release-github-tasks.yml's dispatch.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
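The JSON-via-env input passing can be illustrated with a small sketch. The real dispatcher is a PowerShell script; the JavaScript below only mirrors the idea, and the env var name DISPATCH_INPUTS_JSON, the function name, and the returned object shape are hypothetical. The request path and the ref/inputs body fields follow GitHub's REST endpoint for creating a workflow dispatch event.

```javascript
// Illustrative sketch only: DISPATCH_INPUTS_JSON and buildDispatchRequest
// are hypothetical names, not taken from dispatch-github-workflow.ps1.
function buildDispatchRequest(env, { owner, repo, workflowFile, workflowRef }) {
  // Inputs arrive as a JSON string in an env var, so the script body never
  // embeds a JSON literal and no shell quoting of input values is needed.
  const inputs = JSON.parse(env.DISPATCH_INPUTS_JSON ?? '{}');
  if (inputs === null || typeof inputs !== 'object' || Array.isArray(inputs)) {
    throw new TypeError('dispatch inputs must be a JSON object');
  }
  // Stringify every value so the payload is uniform regardless of how the
  // caller typed its inputs.
  const stringInputs = Object.fromEntries(
    Object.entries(inputs).map(([key, value]) => [key, String(value)]));
  return {
    method: 'POST',
    path: `/repos/${owner}/${repo}/actions/workflows/` +
      `${encodeURIComponent(workflowFile)}/dispatches`,
    body: { ref: workflowRef, inputs: stringInputs },
  };
}
```

The sketch only builds the request; sending it, resolving the run id, and the wait versus fire-and-forget choice are left out.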
Shared Node helper for workflows that auto-create a tracking issue
when they fail, modelled on the existing inline pattern in
tests-daily-smoke.yml and deployment-tests.yml but using the search
API for exact-title dedup (no listForRepo-window churn, no substring
collision between e.g. "13.4" and "13.4.1").
Callers compose their own title/body/labels (the parts that genuinely
differ — artifact parsing, prose, dedup key). The helper owns only
search → exact-title-match → comment-or-create.
Loaded from workflows via the established
require(${GITHUB_WORKSPACE}/.github/workflows/...) pattern used by
create-failing-test-issue.js and workflow-command-helpers.js.
Tests in tests/Infrastructure.Tests/WorkflowScripts/ drive the helper
against a fake Octokit, covering:
- buildDedupQuery shape with JSON-escaped phrase quoting
- create path when no existing issue matches
- comment path when an exact-title match exists
- exact-title-match rejection of substring hits (e.g. "13.4" search
returning a "13.4.1" issue must NOT collapse)
- create path on search failure (don't lose the failure report)
No caller yet; introduced separately so the workflow that consumes
it stays focused.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
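A minimal sketch of the search → exact-title-match → comment-or-create flow follows. It is not the helper from this PR: the function signature, the return value, and the search qualifiers (is:issue, is:open, in:title) are assumptions. It expects an Octokit-style client exposing rest.search.issuesAndPullRequests, rest.issues.createComment, and rest.issues.create.

```javascript
// Illustrative sketch only: the signature, return shape, and search
// qualifiers are assumptions, not the helper's actual code.
function buildDedupQuery({ owner, repo, title }) {
  // JSON.stringify wraps the title in double quotes and escapes embedded
  // quotes, so the title is searched as a phrase.
  return `repo:${owner}/${repo} is:issue is:open in:title ${JSON.stringify(title)}`;
}

async function createOrCommentTrackingIssue(github, { owner, repo, title, body, labels = [] }) {
  let items = [];
  try {
    const result = await github.rest.search.issuesAndPullRequests({
      q: buildDedupQuery({ owner, repo, title }),
    });
    items = result.data.items;
  } catch (err) {
    // Search failed: fall through and create, so the failure report is
    // not lost.
  }
  // Search matches substrings, so require an exact title match: a "13.4"
  // report must not collapse into a "13.4.1" issue.
  const existing = items.find((item) => item.title === title);
  if (existing) {
    await github.rest.issues.createComment({
      owner, repo, issue_number: existing.number, body,
    });
    return { action: 'commented', number: existing.number };
  }
  const created = await github.rest.issues.create({ owner, repo, title, body, labels });
  return { action: 'created', number: created.data.number };
}
```

Callers would still compose their own title, body, and labels, as the commit message describes; the sketch covers only the dedup decision.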
Adds a workflow that runs the full Aspire.Cli.EndToEnd.Tests suite
against a CLI installed via a published channel (quality + version
inputs). Splits every test class into its own matrix job so the shape
matches PR validation.
Quality drives the channel install; the version input flows into
ASPIRE_E2E_EXPECTED_VERSION so `aspire --version` post-install asserts
the channel actually resolved to the requested build. This catches
"publish succeeded but channel pointer is stale" failures that a
versioned install path would silently mask.
On failure or cancellation (per-leg timeout, GHA infra cancel), opens
or comments on a (version, quality) tracking issue via the shared
create-failure-tracking-issue helper. validate-published-build is
dispatched fire-and-forget from AzDO, so without this nobody sees a
failure unless they're watching the Actions UI.
Also adds a thin job template,
dispatch-validate-published-build-job.yml, so the AzDO callers added in
a later commit can share the dispatch shape.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wires both AzDO pipelines to dispatch validate-published-build.yml
(fire-and-forget) so each production-branch build and release exercises
the public install path end-to-end:
- azure-pipelines.yml gains a dispatch_validate_published_build stage
that fires when _PackagesPublished=true (main, release/*,
internal/release/*). Quality derives from the source branch — main
publishes to 'daily' (quality=dev) per build_sign_native.yml,
release branches publish to 'staging'. Workflow ref tracks the
source branch so the dispatched workflow YAML matches the test
source for that channel.
- release-publish-nuget.yml gains a ValidatePublishedBuild stage that
dispatches with quality=release and the just-published version. It
depends on Release (the stage that publishes NuGet packages and
promotes the channel pointer), not GitHubTasks, because the channel
pointer — not the GitHub release — is what aka.ms/dotnet/.../release
resolves to. PrepareArtifacts is required to have succeeded so the
version macro is populated; Release.result == 'Skipped' is permitted
so operator-driven partial reruns work. Two new advanced parameters
cover skip and ref override.
Both callers use the dispatch-github-workflow-steps template added in
the previous commit, plus a thin
dispatch-validate-published-build-job.yml that pins the workflow
filename and matrix shape.
Dispatched fire-and-forget because validate-published-build is
informational signal, not a release gate — blocking on CLI E2E
flakiness would punish releases for noise unrelated to the release
itself. The dispatched workflow opens a tracking issue on failure so
the signal isn't silently lost.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ssertion
Optional workflow_dispatch input flows into ASPIRE_E2E_EXPECTED_VERSION
so an operator triaging a channel outage can dispatch the smoke run
with the version the channel should currently be pointing at and have
`aspire --version` assert it post-install. Scheduled runs leave it
empty (current behavior).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the new stage to the release-process overview, documents the two
new advanced parameters (SkipValidatePublishedBuild,
ValidatePublishedBuildWorkflowRef), and updates the Step 5 monitoring
narrative: GitHubTasks is no longer the final stage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d4d8b89 to
a9687ea
…moke-workflow
# Conflicts:
# docs/release-process.md
# eng/pipelines/azure-pipelines.yml
# eng/pipelines/release-publish-nuget.yml
What this adds
A new post-publish validation path that exercises a just-published Aspire CLI build with the full CLI E2E suite and catches stale-channel-pointer failures, wired into both AzDO pipelines that produce a publish.
Three observable changes for operators and CI:
- GitHub Actions workflow Validate Published Build (`.github/workflows/validate-published-build.yml`): `workflow_dispatch` with `quality` (dev/staging/release) + `version`. Runs the full `Aspire.Cli.EndToEnd.Tests` class-split matrix. Installs via the requested channel (so the same channel-resolution path real users hit gets exercised) and asserts post-install that `aspire --version` matches the supplied `version` (so a stale `aka.ms/dotnet/.../{quality}` pointer fails the run loudly).
- AzDO main build pipeline (`eng/pipelines/azure-pipelines.yml`): new fire-and-forget `dispatch_validate_published_build` stage that runs once `build` succeeds and `_PackagesPublished=true`. Quality + dispatched ref are derived from the source branch: `main` → `quality=dev, ref=main`; `release/X.Y` → `quality=staging, ref=release/X.Y`; `internal/release/X.Y` → `quality=staging, ref=release/X.Y` (the `internal/` prefix is stripped so the workflow YAML is loaded from the public mirror branch).
- AzDO release pipeline (`eng/pipelines/release-publish-nuget.yml`): new `ValidatePublishedBuild` stage after `GitHubTasks` dispatches with `quality=release`, version from the derived release version, and workflow ref from the source build's branch. New `SkipValidatePublishedBuild` + `ValidatePublishedBuildWorkflowRef` advanced parameters for partial-failure re-runs and ref overrides.
Both AzDO dispatches are fire-and-forget: the CLI E2E suite is informational signal, not a release gate. Blocking the release on a CLI E2E run that's susceptible to transient GH Actions/Docker/test flakiness would punish releases for noise unrelated to the release itself. The dispatched run URL is printed in the AzDO log; re-runs go through the GitHub Actions UI.
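The branch-to-channel mapping used by the main build pipeline can be written out as a small function. This is a JavaScript sketch of the rule as described above, not the pipeline's YAML: the function name, stripping a refs/heads/ prefix, and returning null for branches that do not publish are assumptions.

```javascript
// Illustrative sketch only: deriveValidationTarget is a hypothetical name,
// and returning null for non-publishing branches is an assumption.
function deriveValidationTarget(sourceBranch) {
  // Accept both "refs/heads/main" and "main".
  const branch = sourceBranch.replace(/^refs\/heads\//, '');
  if (branch === 'main') {
    return { quality: 'dev', ref: 'main' };
  }
  // release/X.Y and internal/release/X.Y both map to staging; the
  // internal/ prefix is dropped so the workflow YAML is loaded from the
  // public mirror branch.
  const match = /^(?:internal\/)?(release\/.+)$/.exec(branch);
  if (match) {
    return { quality: 'staging', ref: match[1] };
  }
  return null;
}
```

For example, internal/release/13.4 yields quality=staging with ref=release/13.4, matching the third case in the list above.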
Why this matters
Before this PR there was no automated verification that a publish actually reached the public install path. The previous in-pipeline installer validation (`prepare_installers` Full mode) downloads from the versioned `https://ci.dot.net/public/aspire/{ver}/aspire-cli-{rid}-{ver}.zip` URL, which by construction serves the version it names, so a publish that succeeded for assets but failed to flip a channel pointer would still pass installer validation and ship undetected.
The new workflow exercises `aka.ms/dotnet/9/aspire/{ga|rc|daily}/daily` directly, which is the URL real users hit. Combined with the post-install `aspire --version` assertion, the path now catches both "channel URL is reachable" and "channel points at the expected build."
Why depending on `build` (not `prepare_installers`)
Arcade v3 publishing (`enablePublishUsingPipelines: true` on the build job) uploads native archives to ci.dot.net during the `build` stage itself, not in a separate post-build stage. `prepare_installers`' own Full validation works for exactly that reason and likewise only depends on `build`. Dispatching after `build` (in parallel with `prepare_installers`) removes ~20–30 minutes of latency on the validation signal at no correctness cost.
Shared plumbing
- `eng/pipelines/scripts/dispatch-github-workflow.ps1`: renamed from `dispatch-release-github-tasks.ps1` (the body was already generic). Added `-NoWait` switch so callers can fire-and-forget without the run-id-resolve + poll path. Inputs are passed as a hashtable.
- `eng/pipelines/templates/dispatch-github-workflow-steps.yml`: generic dispatch step template. Inputs flow as an `object` parameter that's serialized via AzDO's `convertToJson()` at template expansion and parsed by the pwsh body from an env var. This avoids the AzDO restriction on template directives inside string scalars and dodges PowerShell-quoting concerns for input values.
- `eng/pipelines/templates/dispatch-validate-published-build-job.yml`: single-job template specific to `validate-published-build.yml`. Both pipelines invoke it with just `quality` + `version` + `ref` (default `main`).
Channel-pointer assertion in test infra
- `tests/Shared/CliInstallStrategy.cs`: new `WithExpectedVersion(string)` builder and a new `ASPIRE_E2E_EXPECTED_VERSION` env var applied as an override at the tail of `Detect()`. When set on a strategy that doesn't already carry a deterministic ExpectedVersion (which `LocalArchive` and `DotnetTool with explicit version` do), the override populates `ExpectedVersion` and the existing `VerifyAspireCliVersionAsync` path runs `aspire --version` post-install and fails with `CLI_VERSION_MISMATCH:expected=X actual=Y` on a stale channel.
- `tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliInstallStrategyTests.cs`: 6 new tests cover the override-apply path on quality + version strategies, the deterministic-ExpectedVersion no-op, empty-string handling, and the builder including its `ArgumentException` on empty input.
Daily smoke convenience
`.github/workflows/tests-daily-smoke.yml` gains an optional `expectedVersion` `workflow_dispatch` input that flows into `ASPIRE_E2E_EXPECTED_VERSION`. Scheduled runs leave it empty (current behavior); operators can dispatch with a specific value to verify channel resolution on demand during outage triage.
Validation
End-to-end exercised against live infra prior to this PR being shipped:
- `gh workflow run "Daily CLI Smoke Tests" --ref ankj/staging-cli-smoke-workflow -f quality=staging -f expectedVersion=13.4.0` (run): proved the channel install + `ASPIRE_E2E_EXPECTED_VERSION` assertion path on actual GitHub Actions infra.
- `workflow_dispatch` against the GitHub API as the `aspire-repo-bot` GitHub App.
- `CliInstallStrategyTests` class passes (62 / 62) including the 6 new tests.
Checklist