ci: retry SonarCloud scan once on transient failure #21604

Merged
taratorio merged 1 commit into main from yperbasis/sonar-scan-retry on Jun 3, 2026

@yperbasis
Member

Problem

Two merge-queue evictions in the last 3 weeks were caused by the SonarCloud scan failing to download the scanner CLI from binaries.sonarsource.com, not by anything in the queued code.

In the merge queue the sonar job fast-cancels the whole CI Gate run on failure, so a CDN blip cancels ~40 min of green sibling jobs and github-merge-queue removes the PR with failed_checks. Per CI-GUIDELINES.md, merge-queue checks must have no false positives; CDN weather is one.

Fix

Give the scan one spaced retry:

  • the first attempt runs with continue-on-error: true
  • if it failed, wait 90s and run the action again
  • if the retry also fails, the job fails as before — a persistent outage still blocks correctly
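The three bullets above could be sketched as workflow steps roughly like this. It is an illustration, not the merged diff: the step id `sonar_scan`, the action reference and version pin, and the `SONAR_TOKEN` wiring are assumptions; only the `continue-on-error: true` setting, the 90s wait, and the step name `SonarCloud scan (retry)` come from this description.

```yaml
# Sketch only: ids, action pin, and env wiring are assumed for illustration.
steps:
  - name: SonarCloud scan
    id: sonar_scan                                 # hypothetical id, read by the steps below
    continue-on-error: true                        # a failure here does not fail the job yet
    uses: SonarSource/sonarqube-scan-action@v5     # assumed action and version
    env:
      SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}      # assumed secret name

  - name: Wait before retrying
    # outcome is the raw result, before continue-on-error is applied
    if: steps.sonar_scan.outcome == 'failure'
    run: sleep 90

  - name: SonarCloud scan (retry)
    if: steps.sonar_scan.outcome == 'failure'
    # no continue-on-error here, so a second failure fails the job
    uses: SonarSource/sonarqube-scan-action@v5     # assumed action and version
    env:
      SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```

Keying both follow-up steps on `outcome == 'failure'` means they are skipped when the first attempt succeeds or is itself skipped.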

Cache-warming-only runs are unaffected: the scan is skipped, so its outcome is skipped and the retry steps skip too. A continue-on-error step reports conclusion: success to the jobs API, so ci-gate's root-cause detection won't flag a run recovered by the retry; a double failure is attributed to SonarCloud scan (retry).
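The outcome versus conclusion distinction can be checked in isolation with a throwaway workflow like the one below. This is a hypothetical demo, not part of this PR; the workflow name, job name, and step ids are made up for illustration.

```yaml
# Hypothetical demo workflow: shows how continue-on-error changes what a step reports.
name: outcome-vs-conclusion-demo
on: workflow_dispatch

jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - name: Always-failing step
        id: flaky
        continue-on-error: true
        run: exit 1

      - name: Show both values
        run: |
          echo "outcome=${{ steps.flaky.outcome }}"        # raw result: failure
          echo "conclusion=${{ steps.flaky.conclusion }}"  # after continue-on-error: success
```

The job finishes green because the failing step's conclusion is success, which is the same property that keeps a retried-and-recovered scan from being flagged.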

Alternatives considered

  • Pre-seeding the runner tool cache: impossible — the action's tc.find() lookup can never match SonarSource's 4-segment version string (semver.clean("8.1.0.6389") is null), so the action's internal tool-cache path is dead code on any runner.
  • Mirroring the scanner zip via scannerBinariesUrl: removes the CDN dependency entirely but adds hosting and per-upgrade maintenance, and the GPG keyserver dependency remains. Can revisit if 403s persist despite the retry.

No tests: CI workflow YAML change (TDD not applicable); validated with actionlint and make lint.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the SonarCloud GitHub Actions workflow to reduce merge-queue false positives caused by transient external download/service failures during the Sonar scan.

Changes:

  • Runs the first SonarCloud scan attempt with continue-on-error: true and captures its outcome via a step id.
  • If the first attempt fails, waits 90 seconds and retries the SonarCloud scan once.
  • Keeps existing behavior for persistent failures (the job still fails if the retry fails) and leaves cache-warming-only runs unaffected.

@yperbasis yperbasis requested a review from taratorio June 3, 2026 12:51
@taratorio taratorio enabled auto-merge June 3, 2026 12:53
@taratorio taratorio added this pull request to the merge queue Jun 3, 2026
Merged via the queue into main with commit 830763a Jun 3, 2026
90 checks passed
@taratorio taratorio deleted the yperbasis/sonar-scan-retry branch June 3, 2026 14:53