Added retries to yarn in CI #26781

Merged
EvanHahn merged 1 commit into main from add-retries-to-ci-yarn on Mar 12, 2026

Conversation

@EvanHahn
Contributor

no ref

The yarn registry has been flaky. This adds a little retrying in case the registry is down.

@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

Walkthrough

The .github/scripts/install-deps.sh script now retries the yarn dependency installation step. The change introduces a helper function and a retry loop that attempts installation up to 4 times, with a linearly increasing backoff between attempts (15, 30, then 45 seconds). Progress logging tracks each attempt. The post-installation logic, including the sqlite3 presence check and potential rebuild, is unchanged. The net change adds 22 lines and removes 2.
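
The full script is not reproduced in this conversation, so the following is only a minimal sketch of a bounded retry loop, pieced together from the fragments quoted in the review (`max_attempts`, the `attempt` counter, and `sleep_seconds=$((attempt * 15))`). `flaky_install`, `retry_install`, and `BACKOFF_STEP` are hypothetical names introduced for illustration; the stub stands in for the real install step so the loop can be exercised without a registry.

```shell
#!/usr/bin/env bash
# Sketch of a bounded retry loop with a wait that grows with each attempt.
# flaky_install, retry_install and BACKOFF_STEP are hypothetical names.

max_attempts=4
BACKOFF_STEP="${BACKOFF_STEP:-15}"   # seconds added per failed attempt

# Stand-in for the real install step: fails until its third call.
calls=0
flaky_install() {
    calls=$((calls + 1))
    [ "$calls" -ge 3 ]
}

retry_install() {
    local attempt sleep_seconds
    for attempt in $(seq 1 "$max_attempts"); do
        echo "Installing dependencies... (attempt ${attempt}/${max_attempts})"

        if flaky_install "$@"; then
            return 0
        fi

        # Give up once the final attempt has failed.
        if [ "$attempt" -eq "$max_attempts" ]; then
            echo "Dependency installation failed after ${max_attempts} attempts"
            return 1
        fi

        # Wait 1x, 2x, 3x the step before the next attempt.
        sleep_seconds=$((attempt * BACKOFF_STEP))
        echo "::warning::Dependency installation failed, retrying in ${sleep_seconds} seconds..."
        sleep "$sleep_seconds"
    done
}
```

Setting `BACKOFF_STEP=0` before calling `retry_install` runs the loop without waiting, which is convenient for trying it locally. The sketch returns from a function rather than calling `exit 1`, purely so it can be sourced and tested.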

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding retry logic to yarn commands in CI to address flaky registry issues. |
| Description check | ✅ Passed | The description is related to the changeset, explaining the motivation (flaky yarn registry) and the solution (adding retries), which aligns with the code changes. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/scripts/install-deps.sh (1)

25-27: Add jitter to the backoff.

The fixed 15/30/45 cadence can cause multiple CI jobs to hit the registry again at the same time after a shared outage. A small random offset will spread retries out and improve the odds of recovery.

💡 Possible tweak
-    sleep_seconds=$((attempt * 15))
+    jitter_seconds=$((RANDOM % 10))
+    sleep_seconds=$((attempt * 15 + jitter_seconds))
     echo "::warning::Dependency installation failed, retrying in ${sleep_seconds} seconds..."
     sleep "$sleep_seconds"
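
To make the effect of the tweak concrete, here is a small sketch that computes the jittered wait for each retry. `backoff_with_jitter` is a hypothetical helper name, not something from the script; `RANDOM` is a bash built-in variable, so this assumes the script runs under bash.

```shell
#!/usr/bin/env bash
# Sketch: linear backoff plus a small random offset.
# backoff_with_jitter is a hypothetical helper; RANDOM is a bash built-in
# variable that yields a pseudo-random non-negative integer.

backoff_with_jitter() {
    local attempt="$1"
    local jitter_seconds=$((RANDOM % 10))   # 0 to 9 extra seconds
    echo $((attempt * 15 + jitter_seconds))
}

for attempt in 1 2 3; do
    echo "attempt ${attempt}: sleep $(backoff_with_jitter "$attempt")s"
done
```

With a modulus of 10, the waits fall in the ranges 15 to 24, 30 to 39, and 45 to 54 seconds, so jobs that failed at the same moment are unlikely to retry in lockstep.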
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/install-deps.sh:
- Around line 16-23: The script currently retries every non-zero exit from
install_dependencies; change install_dependencies to detect transient
registry/network errors (e.g., match yarn/npm stderr for patterns like
ENOTFOUND, EAI_AGAIN, ETIMEDOUT, ECONNREFUSED, 502/503/504/408, "socket hang
up", "ETIMEDOUT", "Request failed" or similar) and return a special retryable
status only for those cases, while returning a non-retryable failure immediately
for deterministic errors (lockfile drift, authentication, package metadata
errors); update the loop that checks install_dependencies (the code using
attempt and max_attempts) to only retry when install_dependencies indicates a
transient/network error and fail fast on non-retryable failures.

---

Nitpick comments:
In @.github/scripts/install-deps.sh:
- Around line 25-27: The backoff uses a fixed cadence via
sleep_seconds=$((attempt * 15)) which can synchronize retries; add a small
random jitter when computing sleep_seconds (e.g., compute a jitter value using
RANDOM or $RANDOM modulo a small max and add/subtract it) and then use that
jittered sleep_seconds in the existing echo and sleep calls so retries are
spread out; update the variables sleep_seconds and the sleep invocation (and
keep attempt unchanged) to include the jitter.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7948207d-c048-46b1-bfba-7ad2502c27d6

📥 Commits

Reviewing files that changed from the base of the PR and between 665ff7c and 72e3a52.

📒 Files selected for processing (1)
  • .github/scripts/install-deps.sh

Comment on lines +16 to +23
    if install_dependencies "$@"; then
        break
    fi

    if [ "$attempt" -eq "$max_attempts" ]; then
        echo "Dependency installation failed after ${max_attempts} attempts"
        exit 1
    fi
Contributor

@coderabbitai coderabbitai bot Mar 11, 2026

⚠️ Potential issue | 🟠 Major

Retry only transient install failures.

Any non-zero yarn install exit now gets retried, including deterministic failures like lockfile drift, bad package metadata, or auth/config issues. That will slow down broken PRs and bury the real failure behind three extra attempts. Please gate retries on transient registry/network errors and fail fast otherwise.

💡 Possible fix
 for attempt in $(seq 1 "$max_attempts"); do
     echo "Installing dependencies with --ignore-scripts... (attempt ${attempt}/${max_attempts})"
 
-    if install_dependencies "$@"; then
+    log_file="$(mktemp)"
+    if install_dependencies "$@" 2>&1 | tee "$log_file"; then
+        rm -f "$log_file"
         break
     fi
+
+    if ! grep -Eqi 'EAI_AGAIN|ECONNRESET|ETIMEDOUT|ESOCKETTIMEDOUT|502 Bad Gateway|503 Service Unavailable' "$log_file"; then
+        rm -f "$log_file"
+        echo "Dependency installation failed with a non-transient error; not retrying"
+        exit 1
+    fi
+    rm -f "$log_file"
 
     if [ "$attempt" -eq "$max_attempts" ]; then
         echo "Dependency installation failed after ${max_attempts} attempts"
         exit 1
     fi
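
One caveat when adapting a sketch like this: in bash, `if cmd | tee file` tests the exit status of the last command in the pipeline (`tee`), not of `cmd`, unless `set -o pipefail` is in effect. Whether the real script enables `pipefail` is not shown in this conversation. The following minimal sketch sidesteps the pipeline by writing output to the log file first and replaying it afterwards; `failing_install` and `run_and_capture` are hypothetical names, and the error message is made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: capture output for later inspection without losing the exit status.
# failing_install and run_and_capture are hypothetical names for this demo.

# Stand-in for an install step that fails with a network-style error.
failing_install() {
    echo "error: connect ETIMEDOUT" >&2
    return 1
}

run_and_capture() {
    local log_file="$1"
    shift
    local status=0
    # Redirect instead of piping, so the command's own status is preserved.
    "$@" >"$log_file" 2>&1 || status=$?
    cat "$log_file"          # replay the captured output for the job log
    return "$status"
}

log_file="$(mktemp)"
if run_and_capture "$log_file" failing_install >/dev/null; then
    echo "install succeeded"
elif grep -Eqi 'EAI_AGAIN|ECONNRESET|ETIMEDOUT' "$log_file"; then
    echo "transient failure, worth retrying"
else
    echo "non-transient failure, failing fast"
fi
rm -f "$log_file"
```

The trade-off is that output appears only after the command finishes. If live output matters in CI, keeping the `tee` pipeline and enabling `set -o pipefail` is the usual alternative.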

Contributor Author

Deciding not to address this.

Contributor

@EvanHahn Understood, no worries! Happy to defer to your judgment on this one.

🐰 hops away quietly

@codecov

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.20%. Comparing base (8c5e239) to head (72e3a52).
⚠️ Report is 28 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #26781      +/-   ##
==========================================
- Coverage   73.20%   73.20%   -0.01%     
==========================================
  Files        1534     1534              
  Lines      120966   121034      +68     
  Branches    14630    14635       +5     
==========================================
+ Hits        88559    88597      +38     
- Misses      31393    31423      +30     
  Partials     1014     1014              
Flag          Coverage Δ
admin-tests   54.31% <ø> (ø)
e2e-tests     73.20% <ø> (-0.01%) ⬇️

@ErisDS

This comment was marked as outdated.

@EvanHahn EvanHahn requested a review from troyciesco March 11, 2026 19:10
@EvanHahn EvanHahn merged commit 60d36d5 into main Mar 12, 2026
60 of 62 checks passed
@EvanHahn EvanHahn deleted the add-retries-to-ci-yarn branch March 12, 2026 14:34
3 participants