[v3-3-test] Reduce noise in the daily CI duration trend alert (#69113) #69332

Closed
github-actions[bot] wants to merge 1 commit into v3-3-test from backport-e99daee-v3-3-test

github-actions bot commented Jul 3, 2026

The duration monitor flagged jobs by comparing a single nightly canary run
against the median of the preceding runs, so any one slow run — slow PyPI,
runner queue pressure, a cold cache — tripped the alert. Because a different
run was "latest" each day, a different set of jobs was flagged each day, and
network-bound constraint-resolution jobs that legitimately swing tens of
minutes dominated nearly every alert. The result was a near-daily alert whose
contents swung wildly and carried little signal.

Compare the median of the last few nightly runs against the baseline so the
two sides are symmetric and one unlucky run no longer trips it, and require a
larger absolute jump before flagging individual jobs.
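
A minimal sketch of the comparison described above, not the actual monitor script. The function name, the window sizes, and both thresholds (`recent_n`, `baseline_n`, `min_ratio`, `min_jump_s`) are hypothetical values chosen for illustration; the real script's parameters are not shown in this PR description.

```python
from statistics import median


def is_regression(
    durations: list[float],
    recent_n: int = 3,          # hypothetical: size of the "last few runs" window
    baseline_n: int = 10,       # hypothetical: size of the baseline window
    min_ratio: float = 1.25,    # hypothetical: relative slowdown required
    min_jump_s: float = 300.0,  # hypothetical: absolute slowdown required, in seconds
) -> bool:
    """Decide whether one job looks slower, given oldest-first durations in seconds."""
    if len(durations) <= recent_n:
        return False  # not enough history to form a baseline

    recent = durations[-recent_n:]
    baseline = durations[-(recent_n + baseline_n):-recent_n]

    # Median on both sides: one unlucky run cannot move either value much.
    recent_med = median(recent)
    base_med = median(baseline)

    jump = recent_med - base_med
    # Require both a relative and an absolute increase so that short jobs
    # with small swings are not flagged.
    return jump >= min_jump_s and recent_med >= base_med * min_ratio


# One slow outlier among the recent runs does not trip the check.
one_slow_run = [600.0] * 10 + [600.0, 2400.0, 600.0]
# A sustained slowdown across the recent runs does.
sustained = [600.0] * 10 + [1200.0, 1300.0, 1250.0]
# A large relative change that is small in absolute terms is ignored.
short_job = [60.0] * 10 + [120.0, 130.0, 125.0]
```

Under these assumed thresholds, `is_regression(one_slow_run)` is false, `is_regression(sustained)` is true, and `is_regression(short_job)` is false.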

Pin the monitor to successful (green) canary runs only. A failed or cancelled
canary stops partway, so its truncated wall-clock and per-job durations would
skew the baseline downwards and mask real regressions. The script already
defaults to this, but the guarantee is now explicit at the call site so it
cannot be silently changed.
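
The effect of truncated runs on the baseline can be sketched as follows. The `Run` record and `green_runs` helper are hypothetical and only illustrate the filtering idea; the `conclusion` strings mirror the values GitHub Actions reports for workflow runs, and the durations are made up.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Run:
    """Hypothetical record for one nightly canary run."""
    run_id: int
    conclusion: str    # "success", "failure", "cancelled", ...
    duration_s: float  # wall-clock duration in seconds


def green_runs(runs: list[Run]) -> list[Run]:
    """Keep only runs that completed successfully."""
    return [r for r in runs if r.conclusion == "success"]


# Illustrative data: failed and cancelled runs stop early and look "fast".
runs = [
    Run(1, "success", 3600.0),
    Run(2, "cancelled", 600.0),
    Run(3, "success", 3700.0),
    Run(4, "failure", 900.0),
    Run(5, "success", 3650.0),
]

baseline_all = median(r.duration_s for r in runs)
baseline_green = median(r.duration_s for r in green_runs(runs))
```

With this sample data the unfiltered baseline comes out lower than the green-only one, which is the downward skew the description warns about.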

(cherry picked from commit e99daee)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
