Skip to content

ci/external: skip slot with warning when upstream source breaks#301

Merged
igorpecovnik merged 1 commit intomainfrom
ci-external-skip-broken-sources
May 3, 2026
Merged

ci/external: skip slot with warning when upstream source breaks#301
igorpecovnik merged 1 commit intomainfrom
ci-external-skip-broken-sources

Conversation

@igorpecovnik
Copy link
Copy Markdown
Member

@igorpecovnik igorpecovnik commented May 3, 2026

Symptom

Run 25275701457 — harfbuzz-jammy:jammy:armhf slot dies red on aptly mirror update after exhausting its 3 retries. The matrix view shows ✗, the workflow summary highlights it as a failure, and a re-run is needed even though the only fix is "wait for the upstream mirror to come back".

Fix

Treat upstream breakage as a per-slot skip with a warning, not a job failure. Three exit 1 paths in the "Download method" step (gh: no matching .deb; direct: wget failure / empty body; aptly: mirror update retry exhausted) now route through a new warn_skip helper that:

  • Emits a ::warning:: annotation
  • Appends a ⚠️ skipped: <reason> row to the step summary so the broken slot is tracked alongside the ✅ success rows
  • exit 0 so the matrix slot stays green and sibling slots keep running

Also wrapped aptly's mirror create, snapshot create, and publish in || warn_skip. Without that, set -euo pipefail at the top of the aptly block (line 736) would kill the job before warn_skip got a chance — and a missing/dead Release file on the remote URL is exactly what makes aptly mirror create exit non-zero.

What still fails fast

Workflow-input validation paths in the start job (CHUNK_INDEX / CHUNK_COUNT regex, unknown arch in gh's case) keep their exit 1 — those are caller misconfiguration, not upstream breakage.

Result

The matrix view now reads as: green slots updated, green-with-yellow-warning slots skipped a broken upstream, real red ✗ slots are actual bugs that need fixing. The "harfbuzz-jammy mirror flaked again today" noise drops out of the failure-rerun loop.

Test plan

  • Re-trigger Infrastructure: APT repositories update against this branch with the same input set as run 25275701457; the harfbuzz-jammy slot (or whichever is currently broken) should turn green with a yellow warning instead of red.
  • Confirm the step summary table shows the skipped slot with ⚠️ skipped: <reason> next to the successful slots' .
  • Synthetic failure: temporarily change the URL in one os/external/<x>.conf to https://invalid.example.invalid/, re-run, confirm warn_skip path fires for all three method types.
  • Confirm a successful run with no broken upstreams still produces the same step summary as before (only warn_skip was added; success path is untouched).

Today a single dead upstream (404, deleted GH release, expired apt
mirror, aptly snapshot reset) fails its matrix slot red and leaves
a noisy red ✗ on the workflow's run summary. Run 25275701457 had
harfbuzz-jammy:jammy:armhf go red on aptly mirror update; rerunning
just to chase a transient upstream is a waste of CI minutes and
hides the real signal.

Add `warn_skip <reason>` near the top of the "Download method:"
step. It emits a `::warning::` annotation, appends a "skipped" row
to the step summary alongside the success rows so the broken slot
is still tracked, and exits 0 so the matrix slot stays green and
sibling slots keep running.

Replace the three "upstream broke" exit-1 paths:

  * gh:     no .deb matched any pattern  → warn_skip
  * direct: wget failed / produced no .deb → warn_skip
  * aptly:  mirror update exhausted retries → warn_skip

Wrap the aptly mirror create / snapshot create / publish calls in
`|| warn_skip`. Without this, set -euo pipefail (line 736) would
kill the job before warn_skip got a chance — and a missing/dead
Release file on the remote URL is exactly what triggers a hard
failure in `aptly mirror create`.

Validation exit-1s (CHUNK_INDEX / CHUNK_COUNT regex, unknown arch
in the gh case block) stay as-is — those are workflow-input typos
the caller needs to fix, not upstream breakage.

Net effect on the operator's view: run summary's table now has a
"⚠️ skipped: <reason>" row for each broken upstream alongside the
"✅" rows for successful updates; broken upstreams are visible but
non-blocking, and the GHA matrix view stays mostly green so any
real red ✗ stands out.
@github-actions github-actions Bot added 05 Milestone: Second quarter release size/medium PR with more then 50 and less then 250 lines GitHub Actions GitHub Actions code Needs review Seeking for review labels May 3, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 3, 2026

Walkthrough

The workflow now defines a warn_skip() helper function that converts certain download failures into ::warning:: annotations and summary entries, then exits with exit 0 instead of failing the job. The gh, direct, and aptly methods now call this function when they encounter missing artifacts, failed downloads, or operation errors respectively. This allows matrix slots to complete successfully while documenting which artifacts were skipped.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: skipping workflow slots with warnings when upstream sources fail, which is the core objective of this PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description clearly explains the problem (upstream breakage causing job failures), the solution (new warn_skip helper to skip slots gracefully), and includes detailed test plan with specific verification steps.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci-external-skip-broken-sources

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.github/workflows/infrastructure-download-external.yml (1)

876-886: ⚡ Quick win

Retry loop's mirror recreate lacks || warn_skip protection.

Lines 878 and 881 call aptly mirror create without error handling. If the upstream becomes fully unavailable mid-retry (after initial create succeeded), this will fail under set -euo pipefail and produce a red job—contrary to the PR's goal.

Consider wrapping these calls with || warn_skip for consistent coverage:

Proposed fix
                   aptly -config="$APTLY_CFG" mirror drop "$MIRROR" || true
                   if [[ -n "$COMPONENTS" ]]; then
                     # shellcheck disable=SC2086
-                    aptly -config="$APTLY_CFG" -ignore-signatures "${FILTER_ARGS[@]}" $ADDITIONAL_FILTER -architectures="${{ matrix.arch }}" mirror create "$MIRROR" "$URL" "$DIST" $COMPONENTS
+                    aptly -config="$APTLY_CFG" -ignore-signatures "${FILTER_ARGS[@]}" $ADDITIONAL_FILTER -architectures="${{ matrix.arch }}" mirror create "$MIRROR" "$URL" "$DIST" $COMPONENTS \
+                      || warn_skip "aptly mirror create failed during retry (URL='$URL' DIST='$DIST' COMPONENTS='$COMPONENTS')"
                   else
                     # shellcheck disable=SC2086
-                    aptly -config="$APTLY_CFG" -ignore-signatures "${FILTER_ARGS[@]}" $ADDITIONAL_FILTER -architectures="${{ matrix.arch }}" mirror create "$MIRROR" "$URL" "$DIST"
+                    aptly -config="$APTLY_CFG" -ignore-signatures "${FILTER_ARGS[@]}" $ADDITIONAL_FILTER -architectures="${{ matrix.arch }}" mirror create "$MIRROR" "$URL" "$DIST" \
+                      || warn_skip "aptly mirror create failed during retry (URL='$URL' DIST='$DIST')"
                   fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/infrastructure-download-external.yml around lines 876 -
886, The retry branch calls the aptly mirror create command (the two occurrences
that include "$MIRROR" "$URL" "$DIST" with and without $COMPONENTS) without
fall-through error handling, so under set -euo pipefail a mid-retry failure will
fail the job; update those two aptly -config=... mirror create invocations to
append || warn_skip so that failures are converted into a controlled skip via
the existing warn_skip function rather than causing a hard error, preserving the
current logic that uses warn_skip on other create/update failure paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/infrastructure-download-external.yml:
- Around line 876-886: The retry branch calls the aptly mirror create command
(the two occurrences that include "$MIRROR" "$URL" "$DIST" with and without
$COMPONENTS) without fall-through error handling, so under set -euo pipefail a
mid-retry failure will fail the job; update those two aptly -config=... mirror
create invocations to append || warn_skip so that failures are converted into a
controlled skip via the existing warn_skip function rather than causing a hard
error, preserving the current logic that uses warn_skip on other create/update
failure paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 22725c6b-b56b-4b2d-9dec-b40d417d5b8b

📥 Commits

Reviewing files that changed from the base of the PR and between 1259449 and 35d7eab.

📒 Files selected for processing (1)
  • .github/workflows/infrastructure-download-external.yml

@igorpecovnik igorpecovnik merged commit 331cee9 into main May 3, 2026
6 checks passed
@igorpecovnik igorpecovnik deleted the ci-external-skip-broken-sources branch May 3, 2026 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

05 Milestone: Second quarter release GitHub Actions GitHub Actions code Needs review Seeking for review size/medium PR with more then 50 and less then 250 lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant