Helix job monitor: surface work item failures and job errors into AzDO build timeline #16874

Closed. Copilot wants to merge 2 commits into main from copilot/update-job-monitor-console-output

Copilot AI commented May 22, 2026

WaitForHelixJobCompletion was silently swallowing per-work-item failure details and logging job-level errors without a FailureCategory, so neither appeared as a proper annotation in the AzDO build timeline.

Changes

WaitForHelixJobCompletion.cs

  • Fix job-level error to use FailureCategory.Helix (was untagged Log.LogError)
  • Track failed work items across polling iterations in a HashSet<string>. On first detection of each failure, emit Log.LogError(FailureCategory.Test, ...) with the work item name, job name, and console log URI, using the same format and URI construction as CheckHelixJobStatus:

// Report any newly failed work items so this information makes it into the AzDO build timeline.
foreach (string failedWorkItemName in pf.Failed ?? Enumerable.Empty<string>())
{
    string wi = Helpers.CleanWorkItemName(failedWorkItemName);
    if (reportedWorkItemFailures.Add(wi))
    {
        string consoleUri = HelixApi.Options.BaseUri.AbsoluteUri.TrimEnd('/') + $"/api/2019-06-17/jobs/{jobName}/workitems/{Uri.EscapeDataString(wi)}/console";
        Log.LogError(FailureCategory.Test, $"Work item {wi} in job {jobName} has failed. Failure log: {consoleUri}{accessTokenSuffix}");
    }
}

Failures are reported as they appear (each polling iteration), not deferred to CheckHelixJobStatus, giving engineers real-time visibility in the timeline without waiting for the full job to complete.
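
The report-once behaviour described above can be sketched in isolation as follows. This is a minimal illustration, not the code from the PR: FailureReporter, BuildConsoleUri, and the https://helix.example/ base address are hypothetical names chosen for the example, and the sketch leaves out Helpers.CleanWorkItemName, the access token suffix, and the MSBuild logger.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the Helix SDK.
public sealed class FailureReporter
{
    // Names already reported in an earlier polling iteration.
    private readonly HashSet<string> _reported = new HashSet<string>(StringComparer.Ordinal);
    private readonly string _baseUri;
    private readonly string _jobName;

    public FailureReporter(string baseUri, string jobName)
    {
        _baseUri = baseUri.TrimEnd('/');
        _jobName = jobName;
    }

    // Same URI shape as the snippet above; the work item name is escaped
    // so spaces and other reserved characters stay valid in the path.
    public string BuildConsoleUri(string workItem) =>
        $"{_baseUri}/api/2019-06-17/jobs/{_jobName}/workitems/{Uri.EscapeDataString(workItem)}/console";

    // Returns one message per failure that has not been seen in a previous poll.
    public List<string> ReportNewFailures(IEnumerable<string> failedThisPoll)
    {
        var messages = new List<string>();
        foreach (string wi in failedThisPoll ?? Array.Empty<string>())
        {
            // HashSet.Add returns false when the item is already present.
            if (_reported.Add(wi))
            {
                messages.Add($"Work item {wi} in job {_jobName} has failed. Failure log: {BuildConsoleUri(wi)}");
            }
        }
        return messages;
    }
}

public static class Program
{
    public static void Main()
    {
        var reporter = new FailureReporter("https://helix.example/", "job-123");

        // Poll 1 sees one failure; poll 2 sees the same failure plus a new one.
        foreach (string m in reporter.ReportNewFailures(new[] { "tests.dll" }))
            Console.WriteLine(m);
        foreach (string m in reporter.ReportNewFailures(new[] { "tests.dll", "more tests" }))
            Console.WriteLine(m);
    }
}
```

In this sketch the second poll produces a message only for "more tests", because "tests.dll" is already in the set. Keeping the set outside the polling loop is what makes each failure appear once, however many iterations observe it.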

To double check:

Copilot AI requested review from Copilot and removed request for Copilot May 22, 2026 22:28
…ring job monitoring

Agent-Logs-Url: https://github.com/dotnet/arcade/sessions/4ecaafe5-e0d1-44bf-9d46-4d681b570fe0

Co-authored-by: mmitche <8725170+mmitche@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 22, 2026 22:41
Copilot AI changed the title from "[WIP] Update job monitor to print failed work item info to timeline" to "Helix job monitor: surface work item failures and job errors into AzDO build timeline" May 22, 2026
Copilot AI requested a review from mmitche May 22, 2026 22:42
@mmitche mmitche closed this May 23, 2026
Development

Successfully merging this pull request may close these issues.

Job Monitor errors and failed work item info should make it into the timeline
