Remove redundant -restore from Queue Tests pipeline step #54590
Draft
MichaelSimons wants to merge 12 commits into
- Register microsoft.dotnet.helix.jobmonitor as a local .NET tool in dotnet-tools.json (v11.0.0-beta.26277.111)
- Add toolset dependency in eng/Version.Details.xml sourced from the VMR at the same version and SHA as other arcade toolset dependencies
- Add /p:EnableHelixJobMonitor=true to Helix test submission steps so build agents submit jobs and exit immediately instead of blocking
- Rename test steps from 'Run Tests' to 'Queue Tests' to accurately reflect that the step only submits work; results are published by the monitor job
- Add HelixJobMonitor job to .vsts-pr.yml (public, 120-min timeout)
- Add HelixJobMonitor job to .vsts-ci.yml (internal, with HelixApiAccessToken, gated on the same condition as other test-build jobs)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove category name substitution from Queue Tests display name
- Remove redundant condition: succeeded() from both Queue Tests steps
- Bump Microsoft.DotNet.Arcade.Sdk and Microsoft.DotNet.Helix.Sdk in global.json from 26261.101 to 26277.111 to match Version.Details.xml and enable EnableHelixJobMonitor support
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the explicit 120-minute timeout override from both PR and CI pipelines so the arcade template default of 360 minutes is used. The previous value was too short for the full pipeline to complete and publish all test results.
Also update the Helix Job Monitor tool to 11.0.0-beta.26303.111 to pick up bug fixes.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The longest observed work item across recent CI runs is ~33 minutes (BlazorWebAssembly.Tests.dll.1 on Windows FullFramework). A 60-minute timeout provides ~1.8x headroom while reducing wasted Helix machine time for genuinely hung test processes.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
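The headroom figure can be checked with a one-line calculation. This is only a sketch: the `headroom` helper is a hypothetical name introduced here, and the two inputs are the 60-minute timeout and the ~33-minute longest work item quoted above.

```shell
#!/usr/bin/env bash
# headroom TIMEOUT_MIN LONGEST_MIN
# Prints the ratio of the timeout to the longest observed work item,
# rounded to one decimal place. (Hypothetical helper, for illustration only.)
headroom() {
  LC_ALL=C awk -v t="$1" -v w="$2" 'BEGIN { printf "%.1f\n", t / w }'
}

# 60-minute timeout vs. a ~33-minute work item
headroom 60 33
```

The same helper shows why the earlier 120-minute and 360-minute values were far looser than a 33-minute worst case requires.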
The build step already restores all test projects via sdk.slnx, so the test queuing step's -restore flag was causing a redundant sequential NuGet restore of 200+ test projects. The Helix SDK is resolved via global.json (no NuGet restore needed), and RID-specific restores are already handled inside BuildSDKCustomXUnitProjects unconditionally.
This eliminates:
- Sequential evaluation/restore of 200+ already-restored test projects
- Redundant shared framework installs from eng/restore-toolset.ps1
- Redundant native tools initialization
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CleanOutStage0ToolsetsAndRuntimes uses DOTNET_INSTALL_DIR, which is only initialized when InitializeDotNetCli runs during restore. When -restore is not passed (e.g., the Queue Tests step, which only runs -test), the variable is unbound and build.sh's 'set -u' causes a failure. Guard the call with the same restore check used by InitializeCustomSDKToolset.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
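The failure mode and the guard can be illustrated with a small standalone script. This is a minimal sketch, not the actual contents of restore-toolset.sh: the `restore` variable and the functions `clean_out_stage0` and `run_cleanup_if_restoring` are hypothetical stand-ins, and only `DOTNET_INSTALL_DIR` and `set -u` come from the description above.

```shell
#!/usr/bin/env bash
set -u  # as in build.sh: expanding an unset variable is a fatal error

# Hypothetical stand-in for the flag set when -restore is passed.
restore=false

# Hypothetical stand-in for the cleanup routine. It expands
# DOTNET_INSTALL_DIR, so calling it while that variable is unset
# would abort the script under 'set -u'.
clean_out_stage0() {
  echo "cleaning under $DOTNET_INSTALL_DIR"
}

# Guarded call: the cleanup runs only when restore was requested,
# which is the only case in which DOTNET_INSTALL_DIR is initialized.
run_cleanup_if_restoring() {
  if [ "$restore" = true ]; then
    clean_out_stage0
  else
    echo "skipping cleanup: restore not requested"
  fi
}

run_cleanup_if_restoring
```

With `restore=false` the function never touches `DOTNET_INSTALL_DIR`, so a test-only invocation no longer trips the unbound-variable check.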
This reverts commit e6cd2f0.
This reverts commit ad90f67.
b2d8576 to
86dba6d
Places the Monitor Helix Jobs job first in the pipeline so it appears at a predictable position in the AzDO UI, making it easy to check test progress and failures as they come in.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2593bdc to
07d668c
The Queue Tests step was spending significant time on redundant restore operations:
1. The top-level Arcade NuGet graph restore (triggered by the -restore flag) was restoring UnitTests.proj and running RestoreSDKCustomXUnitProjects, which sequentially restored all 200+ test projects.
2. BuildSDKCustomXUnitProjects already does its own per-project RID restore before each publish (which is the only restore that's actually needed, since it adds RuntimeIdentifiers to the assets file).
This change:
- Removes -restore from the Queue Tests pipeline step (both Windows/Linux)
- Empties RestoreSDKCustomXUnitProjects (redundant with the per-project restore in BuildSDKCustomXUnitProjects)
The restore-toolset.sh/ps1 guards (already present in the base branch) ensure that skipping -restore doesn't cause unbound variable errors.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
07d668c to
a3d29e9
Summary
The Queue Tests pipeline step was passing -restore to eng/common/build.ps1/build.sh, causing a redundant sequential NuGet restore of 200+ test projects that were already restored during the Build step (via sdk.slnx).
Changes
Why this is safe
Expected impact
Eliminates the ~6 minute overhead from RestoreSDKCustomXUnitProjects sequentially evaluating/restoring 200+ already-restored test projects.
Depends on