Remove redundant -restore from Queue Tests pipeline step #54590

Draft
MichaelSimons wants to merge 12 commits into main from michaelsimons/speed-up-ci-test-queuing

@MichaelSimons
Member

Summary

The Queue Tests pipeline step was passing -restore to eng/common/build.ps1/build.sh, causing a redundant sequential NuGet restore of 200+ test projects that were already restored during the Build step (via sdk.slnx).

Changes

  • Remove -restore from both Windows and Linux Queue Tests steps in eng/pipelines/templates/jobs/sdk-build.yml
  • Guard CleanOutStage0ToolsetsAndRuntimes behind the restore flag in both eng/restore-toolset.ps1 and eng/restore-toolset.sh to prevent unbound variable errors on Linux. DOTNET_INSTALL_DIR is only set by InitializeCustomSDKToolset, which returns early without -restore.
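
The guard in the second change can be illustrated with a minimal bash sketch. This is not the contents of eng/restore-toolset.sh: the function bodies, the restore variable name, and the install path are stand-ins assumed for illustration. It only shows why a function that reads DOTNET_INSTALL_DIR has to be skipped when the function that sets it returned early under set -u.

```shell
#!/usr/bin/env bash
# Sketch only: function bodies are stand-ins, not the real eng/restore-toolset.sh.
set -u                      # unset variables are errors, as described for build.sh
unset DOTNET_INSTALL_DIR    # start the demo from a clean state

restore="${restore:-false}" # assumed to mirror whether -restore was passed

InitializeCustomSDKToolset() {
  # Returns early without -restore, so DOTNET_INSTALL_DIR is never assigned.
  if [[ "$restore" != true ]]; then
    return 0
  fi
  DOTNET_INSTALL_DIR="${TMPDIR:-/tmp}/dotnet-sketch"  # hypothetical location
}

CleanOutStage0ToolsetsAndRuntimes() {
  # Reads DOTNET_INSTALL_DIR; under set -u this aborts if the variable is unset.
  echo "cleaning stage 0 toolsets under $DOTNET_INSTALL_DIR"
}

InitializeCustomSDKToolset

# The guard: only clean up when restore ran and therefore set DOTNET_INSTALL_DIR.
if [[ "$restore" == true ]]; then
  CleanOutStage0ToolsetsAndRuntimes
else
  echo "restore not requested; skipping stage 0 cleanup"
fi
```

Run with no environment, the sketch takes the skip branch; run as restore=true, it assigns the variable first and the cleanup function can read it safely.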

Why this is safe

  1. The build step already restores all test projects via sdk.slnx
  2. The Helix SDK is resolved via global.json (no NuGet restore needed)
  3. The per-project RID-specific restore in BuildSDKCustomXUnitProjects (XUnitRunner.targets) remains active; it adds the RID target to existing assets files needed for publish
  4. CleanOutStage0ToolsetsAndRuntimes only needs to run during the build step (which still passes -restore)

Expected impact

Eliminates the ~6 minute overhead from RestoreSDKCustomXUnitProjects sequentially evaluating/restoring 200+ already-restored test projects.

Depends on

MichaelSimons and others added 8 commits June 3, 2026 14:54
- Register microsoft.dotnet.helix.jobmonitor as a local .NET tool in
  dotnet-tools.json (v11.0.0-beta.26277.111)
- Add toolset dependency in eng/Version.Details.xml sourced from the VMR
  at the same version and SHA as other arcade toolset dependencies
- Add /p:EnableHelixJobMonitor=true to Helix test submission steps so
  build agents submit jobs and exit immediately instead of blocking
- Rename test steps from 'Run Tests' to 'Queue Tests' to accurately
  reflect that the step only submits work; results are published by the
  monitor job
- Add HelixJobMonitor job to .vsts-pr.yml (public, 120-min timeout)
- Add HelixJobMonitor job to .vsts-ci.yml (internal, with HelixApiAccessToken,
  gated on the same condition as other test-build jobs)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove category name substitution from Queue Tests display name
- Remove redundant condition: succeeded() from both Queue Tests steps
- Bump Microsoft.DotNet.Arcade.Sdk and Microsoft.DotNet.Helix.Sdk in
  global.json from 26261.101 to 26277.111 to match Version.Details.xml
  and enable EnableHelixJobMonitor support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the explicit 120-minute timeout override from both PR and CI
pipelines so the arcade template default of 360 minutes is used.
The previous value was too short for the full pipeline to complete
and publish all test results.

Also update the Helix Job Monitor tool to 11.0.0-beta.26303.111
to pick up bug fixes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The longest observed work item across recent CI runs is ~33 minutes
(BlazorWebAssembly.Tests.dll.1 on Windows FullFramework). A 60-minute
timeout provides ~1.8x headroom while reducing wasted Helix machine
time for genuinely hung test processes.
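
The headroom figure above can be checked with a short calculation. This is a sketch using only the two numbers from the message; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Ratio of the proposed timeout to the longest observed work item.
longest_minutes=33   # longest observed work item, per the commit message
timeout_minutes=60   # proposed work item timeout

headroom="$(awk -v t="$timeout_minutes" -v l="$longest_minutes" \
  'BEGIN { printf "%.1f", t / l }')"
echo "headroom: ${headroom}x"
```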

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The build step already restores all test projects via sdk.slnx, so the
test queuing step's -restore flag was causing a redundant sequential
NuGet restore of 200+ test projects. The Helix SDK is resolved via
global.json (no NuGet restore needed), and RID-specific restores are
already handled inside BuildSDKCustomXUnitProjects unconditionally.

This eliminates:
- Sequential evaluation/restore of 200+ already-restored test projects
- Redundant shared framework installs from eng/restore-toolset.ps1
- Redundant native tools initialization

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CleanOutStage0ToolsetsAndRuntimes uses DOTNET_INSTALL_DIR which is
only initialized when InitializeDotNetCli runs during restore. When
-restore is not passed (e.g., the Queue Tests step which only runs
-test), the variable is unbound and build.sh's 'set -u' causes a
failure. Guard the call with the same restore check used by
InitializeCustomSDKToolset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@MichaelSimons MichaelSimons force-pushed the michaelsimons/speed-up-ci-test-queuing branch 2 times, most recently from b2d8576 to 86dba6d Compare June 4, 2026 17:25
MichaelSimons and others added 3 commits June 5, 2026 11:35
Places the Monitor Helix Jobs job first in the pipeline so it appears
at a predictable position in the AzDO UI — making it easy to check
test progress and failures as they come in.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@MichaelSimons MichaelSimons force-pushed the michaelsimons/speed-up-ci-test-queuing branch 4 times, most recently from 2593bdc to 07d668c Compare June 5, 2026 20:05
The Queue Tests step was spending significant time on redundant restore
operations:

1. The top-level Arcade NuGet graph restore (triggered by -restore flag)
   was restoring UnitTests.proj and running RestoreSDKCustomXUnitProjects
   which sequentially restored all 200+ test projects.

2. BuildSDKCustomXUnitProjects already does its own per-project RID restore
   before each publish (which is the only restore that's actually needed,
   since it adds RuntimeIdentifiers to the assets file).

This change:
- Removes -restore from the Queue Tests pipeline step (both Windows/Linux)
- Empties RestoreSDKCustomXUnitProjects (redundant with the per-project
  restore in BuildSDKCustomXUnitProjects)

The restore-toolset.sh/ps1 guards (already present in the base branch)
ensure that skipping -restore doesn't cause unbound variable errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@MichaelSimons MichaelSimons force-pushed the michaelsimons/speed-up-ci-test-queuing branch from 07d668c to a3d29e9 Compare June 5, 2026 20:44