🐛 Fix nightly regressions: consistency violations, vitest OOM, API retries #4108
clubanderson merged 2 commits into main
Reverts #4103 (active user count in navbar) — feature explicitly rejected by maintainer. Also disables the Auto-QA check that keeps filing issues about this feature. Signed-off-by: Andrew Anderson <andy@clubanderson.com>
…tries

- Fix unguarded .join() and fetch() without timeout in MissionBrowser.tsx (consistency-test Phase 3 and Phase 5 violations)
- Add maxWorkers/maxForks limits for CI to prevent vitest worker OOM crashes with 600+ test files on 2-core GitHub Actions runners
- Add retry logic with exponential backoff for transient GitHub API errors (502/503/504) in the Auto-QA Tuner copilot-stall-check job

Closes #4087, closes #4088, closes #4089, closes #4091

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Pull request overview
Fixes multiple nightly regressions across the web UI, Vitest CI stability, and Auto-QA workflow resilience to transient GitHub API outages.
Changes:
- Add fetch timeouts (`AbortController`) and join-guarding in `MissionBrowser.tsx` to satisfy consistency-test phases 3/5.
- Reduce Vitest worker/fork concurrency on CI to avoid OOM/worker-termination failures on small GitHub Actions runners.
- Introduce exponential-backoff retries for selected GitHub API calls in the Auto-QA Tuner `copilot-stall-check` job.
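
The first change can be illustrated with a minimal sketch. It assumes the fix follows the usual `AbortController` timeout pattern and an `Array.isArray` guard before `.join()`; the names `withTimeout`, `safeJoin`, and `FETCH_TIMEOUT_MS` are hypothetical and are not taken from `MissionBrowser.tsx`, and the timeout value is an arbitrary placeholder.

```typescript
// Hypothetical names and values: none of these identifiers come from the PR.
const FETCH_TIMEOUT_MS = 10_000;

// Runs an async operation with an AbortSignal that fires after timeoutMs.
// The timer is always cleared, whether the operation resolves or rejects.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = FETCH_TIMEOUT_MS,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}

// Joins only when the value really is an array; anything else yields ''.
function safeJoin(value: unknown, separator = ', '): string {
  return Array.isArray(value) ? value.join(separator) : '';
}
```

A call site might then read `await withTimeout(signal => fetch(url, { signal }))`, so a stalled download rejects instead of hanging, while `safeJoin(mission.tags)` tolerates a missing or malformed field (`mission.tags` being an illustrative property name).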
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `web/vite.config.ts` | Limits Vitest concurrency on CI (workers/forks) to reduce OOM risk. |
| `web/src/components/missions/MissionBrowser.tsx` | Adds timeout-abort for raw GitHub downloads and guards a `.join()` call to satisfy consistency checks. |
| `.github/workflows/auto-qa.yml` | Permanently disables the “active user count” check to prevent repeated re-filing. |
| `.github/workflows/auto-qa-tuner.yml` | Adds retry helper + wraps several GitHub API list calls with exponential backoff. |
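
As a rough illustration of the `web/vite.config.ts` change, the sketch below computes CI-only concurrency limits. The helper `workerLimits` and the `WorkerLimits` type are hypothetical; the PR description only states that `maxWorkers: 2` / `maxForks: 2` are applied on CI, and where `maxForks` sits in the config (shown here under `poolOptions.forks`) is an assumption that depends on the Vitest version in use.

```typescript
// Hypothetical helper: the PR sets these values inline in the Vitest config.
const CI_MAX_WORKERS = 2;

interface WorkerLimits {
  maxWorkers?: number;
  // Assumed location of maxForks; this varies between Vitest versions.
  poolOptions?: { forks: { maxForks: number } };
}

// Returns concurrency caps on CI and no overrides for local runs.
function workerLimits(isCI: boolean): WorkerLimits {
  if (!isCI) return {};
  return {
    maxWorkers: CI_MAX_WORKERS,
    poolOptions: { forks: { maxForks: CI_MAX_WORKERS } },
  };
}
```

The result could be spread into the `test` section of the config, for example `test: { ...workerLimits(Boolean(process.env.CI)) }`, so local runs keep Vitest's default parallelism while 2-core runners start at most two workers.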
Comments suppressed due to low confidence (1)
.github/workflows/auto-qa-tuner.yml:935
- The escalation path uses retries for `issues.listForRepo`, but the subsequent `issues.create` (diagnostic issue creation) is not retried. If GitHub is returning 502/503/504, escalation will still fail and can break the workflow run. Consider wrapping `github.rest.issues.create` in `withRetry(...)` too.
```js
if (!hasDiag) {
  const issueList = escalationNeeded.map(i =>
    `- #${i.number}: ${i.title} (stalled ${i.stalledMinutes}m)`
  ).join('\n');
  await github.rest.issues.create({
    owner, repo,
```
```js
}), `listComments(#${issue.number})`);
const hasStallComment = comments.some(c => c.body.includes('Copilot Stall Detected'));
if (!hasStallComment) {
  await github.rest.issues.createComment({
```
`issues.createComment` is still called without `withRetry`. A transient 502/503/504 during GitHub outages can still fail the stall-check job even after adding retries around the list calls. Consider wrapping `issues.createComment` in `withRetry(...)` (and/or catching transient failures, logging a warning, and continuing).
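
For reference, a retry helper of this kind might look like the sketch below. Only the name `withRetry` and its `(fn, label)` call shape appear in the diff; the attempt count, base delay, injectable `wait` parameter, and the assumption that errors carry a numeric `status` (as Octokit request errors do) are illustrative choices, not the workflow's actual implementation.

```typescript
// Assumed values: the PR does not state the attempt count or base delay.
const TRANSIENT_STATUSES = new Set([502, 503, 504]);
const MAX_ATTEMPTS = 3;
const BASE_DELAY_MS = 1_000;

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Retries fn on 502/503/504 with exponential backoff (1s, 2s, ...).
// Any other error, or a transient error on the last attempt, is rethrown.
async function withRetry<T>(
  fn: () => Promise<T>,
  label: string,
  wait: (ms: number) => Promise<void> = sleep, // injectable so tests need not sleep
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as { status?: number } | null)?.status;
      const transient = status !== undefined && TRANSIENT_STATUSES.has(status);
      if (!transient || attempt >= MAX_ATTEMPTS) throw err;
      const delay = BASE_DELAY_MS * 2 ** (attempt - 1);
      console.warn(`${label}: HTTP ${status}, retrying in ${delay}ms`);
      await wait(delay);
    }
  }
}
```

Wrapping a write such as `issues.createComment` in a helper like this carries one caveat: if GitHub processed the request before returning a 5xx, a retry could post a duplicate comment, which is one reason to consider the log-and-continue alternative instead.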
Summary
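- Fix consistency-test violations in `MissionBrowser.tsx`: unguarded `.join()` (Phase 3) and `fetch()` without timeout/signal (Phase 5)
- Add `maxWorkers: 2` / `maxForks: 2` for CI in vitest config to prevent worker OOM crashes with 600+ test files on 2-core GitHub Actions runners
- Add retry logic with exponential backoff for transient GitHub API errors (502/503/504) in the `copilot-stall-check` job

Closes #4087 #4088 #4089 #4091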
Test plan
- `bash scripts/consistency-test.sh` passes locally (0 errors, 13 warnings)
- `npm run build` succeeds