[WIP] Add the Helix Job Monitor #127749

Draft
premun wants to merge 8 commits into dotnet:main from premun:prvysoky/helix-job-monitor

Conversation

@premun
Member

@premun premun commented May 4, 2026

Copilot AI review requested due to automatic review settings May 4, 2026 13:35
@premun premun added the NO-MERGE (The PR is not ready for merge yet; see discussion for detailed reasons) and NO-REVIEW (Experimental/testing PR, do NOT review it) labels May 4, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @akoeplinger, @matouskozak, @simonrozsival
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new Azure Pipelines “Helix Job Monitor” stage/job template under eng/common/core-templates and wires it into the runtime pipeline so a dedicated job can monitor Helix work across stages.

Changes:

  • Adds a new stage template (eng/common/core-templates/stages/helix-job-monitor.yml) that wraps a job template to run the monitor tool.
  • Adds a new job template (eng/common/core-templates/job/helix-job-monitor.yml) intended to acquire/install the monitor tool and run it with pipeline/Helix context.
  • Integrates the stage into eng/pipelines/runtime.yml with monitorAllStages: true.

Reviewed changes

Copilot reviewed 1 out of 3 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| eng/pipelines/runtime.yml | Adds the new Helix Job Monitor stage template to the runtime pipeline. |
| eng/common/core-templates/stages/helix-job-monitor.yml | New stage wrapper for the Helix Job Monitor job template. |
| eng/common/core-templates/job/helix-job-monitor.yml | New job template that downloads/installs and runs the Helix Job Monitor tool. |

Comment on lines +178 to +184
echo "Tool DLL: $toolDll"
echo "##vso[task.setvariable variable=HelixJobMonitorDll]$toolDll"
displayName: Install Helix Job Monitor

- bash: |
set -euo pipefail

Comment on lines +220 to +225
# Tool was installed from a local nupkg; run the DLL via the repo-local dotnet.
export DOTNET_ROOT="$(Build.SourcesDirectory)/.dotnet"
./eng/common/dotnet.sh exec "$(HelixJobMonitorDll)" "${toolArgs[@]}"
displayName: Monitor Helix Jobs
env:
SYSTEM_ACCESSTOKEN: $(System.AccessToken)
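
The two hunks above pass the tool path between steps through a pipeline variable: the install step prints a `task.setvariable` logging command, and the monitor step reads the result as `$(HelixJobMonitorDll)`. A minimal sketch of that hand-off follows. The helper name `set_pipeline_variable` and the example path are assumptions made for illustration and are not part of the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the Azure Pipelines logging command that sets a
# variable for later steps in the same job.
set_pipeline_variable() {
  local name=$1 value=$2
  echo "##vso[task.setvariable variable=${name}]${value}"
}

# Made-up path for illustration; the real install step computes $toolDll itself.
toolDll="/tmp/tools/helix-job-monitor.dll"
set_pipeline_variable HelixJobMonitorDll "$toolDll"
```

A variable set this way only becomes visible to steps that run after the one that printed it, which matches the install-then-monitor ordering in the template.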
Comment on lines +22 to +26
# Optional explicit tool version. Only honored when 'toolNupkgArtifactName' is set; in the
# default code path the version is taken from the consuming repo's .config/dotnet-tools.json.
- name: toolVersion
type: string
default: '11.0.0-ci'
Comment on lines +7 to +11
# Pool override. When empty the template selects a default azurelinux pool based on the team project.
- name: pool
type: object
default: {}

Comment on lines +116 to +125
- task: DownloadPipelineArtifact@2
  displayName: Download Helix Job Monitor artifact
  inputs:
    buildType: specific
    project: public
    pipeline: arcade-pr
    runId: 1407423
    artifactName: Artifacts_Windows_NT_Release
    itemPattern: '${{ parameters.toolNupkgArtifactSubPath }}/${{ parameters.toolPackageId }}.*.nupkg'
    targetPath: $(Agent.TempDirectory)/helix-job-monitor-nupkg
Comment on lines +22 to +55
# NuGet package id of the Helix job monitor tool.
- name: toolPackageId
type: string
default: Microsoft.DotNet.Helix.JobMonitor

# Console command exposed by the installed tool package.
- name: toolCommand
type: string
default: dotnet-helix-job-monitor

# Optional explicit tool version. When empty, the latest available version is installed.
- name: toolVersion
type: string
default: ''

# Optional NuGet feed used as an additional source when installing the tool.
- name: toolSource
type: string
default: ''

# JobMonitorOptions: --helix-base-uri.
- name: helixBaseUri
type: string
default: https://helix.dot.net/

# Helix API access token forwarded via the HELIX_ACCESSTOKEN environment variable.
- name: helixAccessToken
type: string
default: ''

# JobMonitorOptions: --polling-interval-seconds.
- name: pollingIntervalSeconds
type: number
default: 30

stages:
- stage: ${{ parameters.stageName }}
dependsOn: ${{ parameters.dependsOn }}
Comment on lines +28 to +34
# Optional NuGet feed used as an additional source when installing the tool. Only honored
# when 'toolNupkgArtifactName' is set; in the default code path the tool is restored from
# the consuming repo's .config/dotnet-tools.json manifest and no extra feeds are consulted.
- name: toolSource
type: string
default: ''

Comment on lines +182 to +185
- bash: |
set -euo pipefail

toolArgs=(
Copilot AI review requested due to automatic review settings May 4, 2026 14:16
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 6 changed files in this pull request and generated 7 comments.

Comment thread eng/pipelines/runtime.yml Outdated
Comment on lines +2077 to +2081
eq(variables['isRollingBuild'], true))

- template: /eng/common/core-templates/stages/helix-job-monitor.yml
  parameters:
    monitorAllStages: true
\ No newline at end of file
Comment on lines +90 to +99
- template: /eng/common/core-templates/job/helix-job-monitor.yml
parameters:
jobName: ${{ parameters.jobName }}
helixBaseUri: ${{ parameters.helixBaseUri }}
helixAccessToken: ${{ parameters.helixAccessToken }}
timeoutInMinutes: ${{ parameters.timeoutInMinutes }}
organization: ${{ parameters.organization }}
repository: ${{ parameters.repository }}
prNumber: ${{ parameters.prNumber }}
monitorAllStages: ${{ parameters.monitorAllStages }}
Comment on lines +1 to +99
parameters:
# Stage identifier.
- name: stageName
  type: string
  default: Helix_Job_Monitor

# Optional list of stages this stage depends on.
- name: dependsOn
  type: object
  default: []

# Optional stage condition expression.
- name: condition
  type: string
  default: ''

# Job identifier produced inside the stage.
- name: jobName
  type: string
  default: HelixJobMonitor

# NuGet package id of the Helix job monitor tool.
- name: toolPackageId
  type: string
  default: Microsoft.DotNet.Helix.JobMonitor

# Console command exposed by the installed tool package.
- name: toolCommand
  type: string
  default: dotnet-helix-job-monitor

# Optional explicit tool version. When empty, the latest available version is installed.
- name: toolVersion
  type: string
  default: ''

# Optional NuGet feed used as an additional source when installing the tool.
- name: toolSource
  type: string
  default: ''

# JobMonitorOptions: --helix-base-uri.
- name: helixBaseUri
  type: string
  default: https://helix.dot.net/

# Helix API access token forwarded via the HELIX_ACCESSTOKEN environment variable.
- name: helixAccessToken
  type: string
  default: ''

# JobMonitorOptions: --polling-interval-seconds.
- name: pollingIntervalSeconds
  type: number
  default: 30

# JobMonitorOptions: --max-wait-minutes. Also used as the job/stage timeout.
- name: timeoutInMinutes
  type: number
  default: 360

# JobMonitorOptions: --organization (owner segment of the source repository).
- name: organization
  type: string
  default: ''

# JobMonitorOptions: --repository (name of the source repository).
- name: repository
  type: string
  default: ''

# JobMonitorOptions: --pr-number. Required for PR validation pipelines.
- name: prNumber
  type: string
  default: ''

# When true, the monitor tracks Helix jobs and pipeline jobs across every stage of the
# build. When false, the monitor only tracks jobs that belong to the same Azure DevOps stage as
# the monitor job itself (i.e. this stage).
- name: monitorAllStages
  type: boolean
  default: false

stages:
- stage: ${{ parameters.stageName }}
  dependsOn: ${{ parameters.dependsOn }}
  ${{ if ne(parameters.condition, '') }}:
    condition: ${{ parameters.condition }}
  jobs:
  - template: /eng/common/core-templates/job/helix-job-monitor.yml
    parameters:
      jobName: ${{ parameters.jobName }}
      helixBaseUri: ${{ parameters.helixBaseUri }}
      helixAccessToken: ${{ parameters.helixAccessToken }}
      timeoutInMinutes: ${{ parameters.timeoutInMinutes }}
      organization: ${{ parameters.organization }}
      repository: ${{ parameters.repository }}
      prNumber: ${{ parameters.prNumber }}
      monitorAllStages: ${{ parameters.monitorAllStages }}
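
The `JobMonitorOptions` comments in the template above name the command-line flag behind each parameter. The sketch below shows one way a bash step might assemble those flags into a `toolArgs` array while skipping parameters left empty. The flag names come from the template comments. The helper names (`add_arg`, `build_tool_args`) and the environment variable names are assumptions for illustration, and the job template in this PR may build its arguments differently.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: append "--flag value" to the global toolArgs array,
# but only when the value is non-empty.
add_arg() {
  local flag=$1 value=$2
  if [[ -n "$value" ]]; then
    toolArgs+=("$flag" "$value")
  fi
}

# Hypothetical: rebuild toolArgs from environment variables that stand in for
# the template parameters (variable names are illustrative only).
build_tool_args() {
  toolArgs=()
  add_arg --helix-base-uri "${HELIX_BASE_URI:-}"
  add_arg --polling-interval-seconds "${POLLING_INTERVAL_SECONDS:-}"
  add_arg --max-wait-minutes "${MAX_WAIT_MINUTES:-}"
  add_arg --organization "${ORGANIZATION:-}"
  add_arg --repository "${REPOSITORY:-}"
  add_arg --pr-number "${PR_NUMBER:-}"
}

# Demo: fall back to the template's default base URI so the array is never empty.
HELIX_BASE_URI="${HELIX_BASE_URI:-https://helix.dot.net/}"
build_tool_args
printf '%s\n' "${toolArgs[@]}"
```

Keeping each flag and its value as separate array elements means the later `"${toolArgs[@]}"` expansion passes values containing spaces intact.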
Comment on lines +7 to +11
# Pool override. When empty the template selects a default azurelinux pool based on the team project.
- name: pool
type: object
default: {}

Comment on lines +22 to +30
# Optional explicit tool version. Only honored when 'toolNupkgArtifactName' is set; in the
# default code path the version is taken from the consuming repo's .config/dotnet-tools.json.
- name: toolVersion
type: string
default: '11.0.0-ci'

# Optional NuGet feed used as an additional source when installing the tool. Only honored
# when 'toolNupkgArtifactName' is set; in the default code path the tool is restored from
# the consuming repo's .config/dotnet-tools.json manifest and no extra feeds are consulted.
Comment on lines +116 to +125
- task: DownloadPipelineArtifact@2
  displayName: Download Helix Job Monitor artifact
  inputs:
    buildType: specific
    project: public
    pipeline: arcade-pr
    runId: 1407423
    artifactName: Artifacts_Windows_NT_Release
    itemPattern: '${{ parameters.toolNupkgArtifactSubPath }}/${{ parameters.toolPackageId }}.*.nupkg'
    targetPath: $(Agent.TempDirectory)/helix-job-monitor-nupkg
Comment on lines +164 to +171
pushd "$(Build.SourcesDirectory)" > /dev/null

# Update .NET SDK version in global.json
if [ -f "global.json" ]; then
sed -i 's/"11.0.100-preview.3.26170.106"/"11.0.0-preview.4.26210.111"/g' global.json
fi

./eng/common/dotnet.sh tool install \
Copilot AI review requested due to automatic review settings May 5, 2026 09:16
@premun premun force-pushed the prvysoky/helix-job-monitor branch from 63f15a7 to b11ee5f May 5, 2026 09:16
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 11 changed files in this pull request and generated 6 comments.

Comment thread global.json Outdated
@@ -14,7 +14,7 @@
},
"msbuild-sdks": {
"Microsoft.DotNet.Arcade.Sdk": "11.0.0-beta.26211.102",
Comment thread NuGet.config
<add key="dotnet11-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11-transport/nuget/v3/index.json" />
<add key="dotnet-diagnostics-tests" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-diagnostics-tests/nuget/v3/index.json" />

<!-- TODO: Remove -->
Comment on lines +1 to +6
parameters:
# Azure DevOps job identifier.
- name: jobName
type: string
default: HelixJobMonitor

Comment on lines +7 to +11
# Pool override. When empty the template selects a default azurelinux pool based on the team project.
- name: pool
type: object
default: {}

Comment on lines +28 to +34
# Optional NuGet feed used as an additional source when installing the tool. Only honored
# when 'toolNupkgArtifactName' is set; in the default code path the tool is restored from
# the consuming repo's .config/dotnet-tools.json manifest and no extra feeds are consulted.
- name: toolSource
type: string
default: ''

Comment on lines +27 to +41
# Console command exposed by the installed tool package.
- name: toolCommand
type: string
default: dotnet-helix-job-monitor

# Optional explicit tool version. When empty, the latest available version is installed.
- name: toolVersion
type: string
default: ''

# Optional NuGet feed used as an additional source when installing the tool.
- name: toolSource
type: string
default: ''

@premun premun force-pushed the prvysoky/helix-job-monitor branch from b11ee5f to 6e31c0d May 5, 2026 09:40
Copilot AI review requested due to automatic review settings May 5, 2026 09:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 11 changed files in this pull request and generated 3 comments.

Comment thread global.json Outdated
Comment on lines 15 to 18
"msbuild-sdks": {
"Microsoft.DotNet.Arcade.Sdk": "11.0.0-beta.26211.102",
"Microsoft.DotNet.Helix.Sdk": "11.0.0-beta.26211.102",
"Microsoft.DotNet.Helix.Sdk": "11.0.0-beta.26254.2",
"Microsoft.DotNet.SharedFramework.Sdk": "11.0.0-beta.26211.102",
Comment on lines +7 to +11
# Pool override. When empty the template selects a default azurelinux pool based on the team project.
- name: pool
type: object
default: {}

Comment thread NuGet.config
<add key="dotnet11-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11-transport/nuget/v3/index.json" />
<add key="dotnet-diagnostics-tests" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-diagnostics-tests/nuget/v3/index.json" />

<!-- TODO: Remove -->
Copilot AI review requested due to automatic review settings May 5, 2026 12:09
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 61 changed files in this pull request and generated 7 comments.

Comment thread NuGet.config
<add key="dotnet11" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11/nuget/v3/index.json" />
<add key="dotnet11-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11-transport/nuget/v3/index.json" />
<add key="dotnet-diagnostics-tests" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-diagnostics-tests/nuget/v3/index.json" />
<!-- TODO: Remove -->
Comment on lines +118 to +124
pool:
  ${{ if eq(variables['System.TeamProject'], 'public') }}:
    name: $(DncEngPublicBuildPool)
    demands: ImageOverride -equals build.azurelinux.3.amd64.open
  ${{ else }}:
    name: $(DncEngInternalBuildPool)
    demands: ImageOverride -equals build.azurelinux.3.amd64
Comment on lines +194 to +201
- bash: |
    globalJsonPath="$(Build.SourcesDirectory)/global.json"
    # Update existing global.json to set sdk.version and tools.dotnet
    cat "$globalJsonPath" | jq '.sdk.version = "11.0.100-preview.3.26170.106" | .tools.dotnet = "11.0.100-preview.3.26170.106"' > "$globalJsonPath.tmp" && mv "$globalJsonPath.tmp" "$globalJsonPath"
  displayName: Prepare global.json

- bash: ./eng/common/dotnet.sh tool restore
  displayName: Restore Helix Job Monitor
Comment thread eng/common/tools.sh
Comment on lines 560 to +563
# Returns a full path to an Arcade SDK task project file.
function GetSdkTaskProject {
local taskName=$1
local toolsetDir
toolsetDir="$(dirname "$_InitializeToolset")"
local proj="$toolsetDir/$taskName.proj"
if [[ -a "$proj" ]]; then
echo "$proj"
return
fi
# TODO: Remove this fallback once all supported versions use the new layout.
local legacyProj="$toolsetDir/SdkTasks/$taskName.proj"
if [[ -a "$legacyProj" ]]; then
echo "$legacyProj"
return
fi
Write-PipelineTelemetryError -category 'Build' "Unable to find $taskName.proj in toolset at: $toolsetDir"
ExitWithExitCode 3
taskName=$1
echo "$(dirname $_InitializeToolset)/SdkTasks/$taskName.proj"
- ${{ if eq(parameters.publishingInfraVersion, 4) }}:
- task: DownloadPipelineArtifact@2
displayName: Download Pipeline Artifacts (V4)
inputs:
- ${{ if eq(parameters.publishingInfraVersion, 4) }}:
- task: DownloadPipelineArtifact@2
displayName: Download Pipeline Artifacts (V4)
inputs:
arguments: >
-BuildId $(BARBuildId)
-PublishingInfraVersion ${{ parameters.publishingInfraVersion }}
-PublishingInfraVersion 3
Copilot AI review requested due to automatic review settings May 5, 2026 13:00
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 61 changed files in this pull request and generated 11 comments.

Comment thread NuGet.config
<add key="dotnet11" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11/nuget/v3/index.json" />
<add key="dotnet11-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet11-transport/nuget/v3/index.json" />
<add key="dotnet-diagnostics-tests" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-diagnostics-tests/nuget/v3/index.json" />
<!-- TODO: Remove -->
Comment on lines +30 to +34
/p:TestScope=${{ parameters.testScope }}
/p:TestRunNamePrefixSuffix=${{ parameters.testRunNamePrefixSuffix }}
/p:HelixBuild=$(Build.BuildNumber)
/p:EnableHelixJobMonitor=${{ parameters.useHelixMonitor }}
${{ parameters.extraHelixArguments }}
Comment thread eng/common/tools.sh
Comment on lines 560 to 564
# Returns a full path to an Arcade SDK task project file.
function GetSdkTaskProject {
local taskName=$1
local toolsetDir
toolsetDir="$(dirname "$_InitializeToolset")"
local proj="$toolsetDir/$taskName.proj"
if [[ -a "$proj" ]]; then
echo "$proj"
return
fi
# TODO: Remove this fallback once all supported versions use the new layout.
local legacyProj="$toolsetDir/SdkTasks/$taskName.proj"
if [[ -a "$legacyProj" ]]; then
echo "$legacyProj"
return
fi
Write-PipelineTelemetryError -category 'Build' "Unable to find $taskName.proj in toolset at: $toolsetDir"
ExitWithExitCode 3
taskName=$1
echo "$(dirname $_InitializeToolset)/SdkTasks/$taskName.proj"
}
Comment thread eng/common/tools.ps1
Comment on lines 616 to 619
# Returns a full path to an Arcade SDK task project file.
function GetSdkTaskProject([string]$taskName) {
$toolsetDir = Split-Path (InitializeToolset) -Parent
$proj = Join-Path $toolsetDir "$taskName.proj"
if (Test-Path $proj) {
return $proj
}
# TODO: Remove this fallback once all supported versions use the new layout.
$legacyProj = Join-Path $toolsetDir "SdkTasks\$taskName.proj"
if (Test-Path $legacyProj) {
return $legacyProj
}
throw "Unable to find $taskName.proj in toolset at: $toolsetDir"
return Join-Path (Split-Path (InitializeToolset) -Parent) "SdkTasks\$taskName.proj"
}
arguments: >
-BuildId $(BARBuildId)
-PublishingInfraVersion ${{ parameters.publishingInfraVersion }}
-PublishingInfraVersion 3
Comment on lines +28 to +34
# Optional NuGet feed used as an additional source when installing the tool. Only honored
# when 'toolNupkgArtifactName' is set; in the default code path the tool is restored from
# the consuming repo's .config/dotnet-tools.json manifest and no extra feeds are consulted.
- name: toolSource
type: string
default: ''

Comment on lines +194 to +200
- bash: |
    globalJsonPath="$(Build.SourcesDirectory)/global.json"
    # Update existing global.json to set sdk.version and tools.dotnet
    sed -i 's/"sdk":{[^}]*}/"sdk":{"version":"11.0.100-preview.3.26170.106"}/g' "$globalJsonPath"
    sed -i 's/"tools":{[^}]*}/"tools":{"dotnet":"11.0.100-preview.3.26170.106"}/g' "$globalJsonPath"
  displayName: Prepare global.json

Comment thread eng/common/tools.sh
Comment on lines +416 to +420
if [[ -a "$toolset_location_file" ]]; then
local path=`cat "$toolset_location_file"`
if [[ -a "$path" ]]; then
# return value
_InitializeToolset="$path"
Comment thread eng/common/tools.sh
Comment on lines +440 to 444
local toolset_build_proj=`cat "$toolset_location_file"`

if [[ -a "$toolset_tools_dir/Build.proj" ]]; then
toolset_build_proj="$toolset_tools_dir/Build.proj"
else
Write-PipelineTelemetryError -category 'Build' "Unable to find Build.proj in toolset at: $toolset_tools_dir"
if [[ ! -a "$toolset_build_proj" ]]; then
Write-PipelineTelemetryError -category 'Build' "Invalid toolset path: $toolset_build_proj"
ExitWithExitCode 3
Comment on lines +37 to +41
targetPath: '$(Build.ArtifactStagingDirectory)/artifacts'
artifactName: ${{ coalesce(parameters.artifacts.publish.artifacts.name , 'Artifacts_$(Agent.Os)_$(_BuildConfig)') }}
continueOnError: true
condition: always()
retryCountOnTaskFailure: 10 # for any logs being locked
condition: succeeded()
retryCountOnTaskFailure: 10 # for any files being locked
Labels

area-Infrastructure-mono, NO-MERGE (The PR is not ready for merge yet; see discussion for detailed reasons), NO-REVIEW (Experimental/testing PR, do NOT review it)

2 participants