
🐛 Fix nightly regressions: consistency violations, vitest OOM, API retries #4108

Merged
clubanderson merged 2 commits into main from fix/workflow-issues
Apr 1, 2026

@clubanderson
Collaborator

Summary

  • Fix consistency-test violations in MissionBrowser.tsx: unguarded .join() (Phase 3) and fetch() without timeout/signal (Phase 5)
  • Add maxWorkers: 2 / maxForks: 2 for CI in vitest config to prevent worker OOM crashes with 600+ test files on 2-core GitHub Actions runners
  • Add retry logic with exponential backoff (3 retries, 2s/4s/8s) for transient GitHub API 502/503/504 errors in the Auto-QA Tuner copilot-stall-check job
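
The retry schedule in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in the workflow: the constant names, `backoffDelayMs`, `isTransient`, and the injectable `wait` parameter are all assumptions made for the example, and only the `withRetry(fn, label)` call shape is taken from the review context further down.

```typescript
// Sketch only: every name except the withRetry(fn, label) call shape is hypothetical.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 2000;
const TRANSIENT_STATUSES = new Set([502, 503, 504]);

/** Delay before retry number `retry` (1-based): 2s, 4s, 8s. */
function backoffDelayMs(retry: number): number {
  return BASE_DELAY_MS * 2 ** (retry - 1);
}

/** True when the thrown value carries one of the transient HTTP statuses. */
function isTransient(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  return typeof status === "number" && TRANSIENT_STATUSES.has(status);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Runs `fn`, retrying up to MAX_RETRIES times on transient errors only. */
async function withRetry<T>(
  fn: () => Promise<T>,
  label: string,
  wait: (ms: number) => Promise<void> = sleep, // injectable so tests need not sleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Non-transient errors and exhausted retries propagate to the caller.
      if (!isTransient(err) || attempt >= MAX_RETRIES) throw err;
      const delay = backoffDelayMs(attempt + 1);
      console.warn(
        `${label}: transient error, retry ${attempt + 1}/${MAX_RETRIES} in ${delay}ms`,
      );
      await wait(delay);
    }
  }
}
```

Under these assumptions a call makes at most four attempts (one initial plus three retries) and waits at most 14 seconds in total before giving up.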

Closes #4087 #4088 #4089 #4091

Test plan

  • bash scripts/consistency-test.sh passes locally (0 errors, 13 warnings)
  • npm run build succeeds
  • Nightly test suite should pass with reduced worker concurrency
  • Auto-QA Tuner stall check should survive transient GitHub API outages

Reverts #4103 (active user count in navbar) — feature explicitly
rejected by maintainer. Also disables the Auto-QA check that keeps
filing issues about this feature.

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
…tries

- Fix unguarded .join() and fetch() without timeout in MissionBrowser.tsx
  (consistency-test Phase 3 and Phase 5 violations)
- Add maxWorkers/maxForks limits for CI to prevent vitest worker OOM crashes
  with 600+ test files on 2-core GitHub Actions runners
- Add retry logic with exponential backoff for transient GitHub API errors
  (502/503/504) in the Auto-QA Tuner copilot-stall-check job

Closes #4087, closes #4088, closes #4089, closes #4091

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Copilot AI review requested due to automatic review settings April 1, 2026 12:05
@kubestellar-prow kubestellar-prow bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label Apr 1, 2026
@kubestellar-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign clubanderson for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify

netlify bot commented Apr 1, 2026

Deploy Preview for kubestellarconsole ready!

Name Link
🔨 Latest commit 148fcf2
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarconsole/deploys/69cd0a0094d2b00008e8b6e6
😎 Deploy Preview https://deploy-preview-4108.console-deploy-preview.kubestellar.io

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

👋 Hey @clubanderson — thanks for opening this PR!

🤖 This project is developed exclusively using AI coding assistants.

Please do not attempt to code anything for this project manually.
All contributions should be authored using an AI coding tool such as:

This ensures consistency in code style, architecture patterns, test coverage,
and commit quality across the entire codebase.


This is an automated message.

@kubestellar-prow kubestellar-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 1, 2026
@clubanderson clubanderson merged commit 03a0111 into main Apr 1, 2026
20 of 22 checks passed
@kubestellar-prow kubestellar-prow bot deleted the fix/workflow-issues branch April 1, 2026 12:05
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Thank you for your contribution! Your PR has been merged.

Check out what's new:

Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey

Contributor

Copilot AI left a comment


Pull request overview

Fixes multiple nightly regressions across the web UI, Vitest CI stability, and Auto-QA workflow resilience to transient GitHub API outages.

Changes:

  • Add fetch timeouts (AbortController) and join-guarding in MissionBrowser.tsx to satisfy consistency-test phases 3/5.
  • Reduce Vitest worker/fork concurrency on CI to avoid OOM/worker-termination failures on small GitHub Actions runners.
  • Introduce exponential-backoff retries for selected GitHub API calls in the Auto-QA Tuner copilot-stall-check job.
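
For readers unfamiliar with the first change, a timeout-guarded `fetch` and a guarded `.join()` might look roughly like the sketch below. The names `FETCH_TIMEOUT_MS`, `fetchWithTimeout`, and `safeJoin` are hypothetical, the 10-second value is an assumption, and whether this exact shape satisfies the consistency-test phases is not confirmed by the PR.

```typescript
// Illustrative only: helper names and the timeout value are assumptions.
const FETCH_TIMEOUT_MS = 10_000;

/** fetch() that aborts if no response arrives within `timeoutMs`. */
async function fetchWithTimeout(
  url: string,
  timeoutMs: number = FETCH_TIMEOUT_MS,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}

/** Joins only when the value is really an array; otherwise returns "". */
function safeJoin(value: unknown, separator = ", "): string {
  return Array.isArray(value) ? value.join(separator) : "";
}
```

Note that in this sketch the timer is cleared once the response headers arrive, so reading a slow body is not covered by the timeout.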

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| web/vite.config.ts | Limits Vitest concurrency on CI (workers/forks) to reduce OOM risk. |
| web/src/components/missions/MissionBrowser.tsx | Adds timeout-abort for raw GitHub downloads and guards a .join() call to satisfy consistency checks. |
| .github/workflows/auto-qa.yml | Permanently disables the “active user count” check to prevent repeated re-filing. |
| .github/workflows/auto-qa-tuner.yml | Adds retry helper + wraps several GitHub API list calls with exponential backoff. |
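
As a rough illustration of the Vitest change, the CI-only limits could be computed by a small helper and spread into the `test` section of the config. The helper name `ciConcurrencyLimits` is hypothetical, and nesting `maxForks` under `poolOptions.forks` is an assumption that depends on the Vitest version in use; the PR only states that `maxWorkers: 2` and `maxForks: 2` are set for CI.

```typescript
// Sketch: option placement is an assumption; check it against your Vitest version.
const CI_MAX_WORKERS = 2;

interface ConcurrencyLimits {
  maxWorkers?: number;
  poolOptions?: { forks: { maxForks: number } };
}

/** Returns worker/fork caps on CI and no overrides for local runs. */
function ciConcurrencyLimits(isCI: boolean): ConcurrencyLimits {
  if (!isCI) return {};
  return {
    maxWorkers: CI_MAX_WORKERS,
    poolOptions: { forks: { maxForks: CI_MAX_WORKERS } },
  };
}

// Possible use inside a config file:
//   test: { ...ciConcurrencyLimits(Boolean(process.env.CI)) }
```

Returning an empty object off CI leaves local runs on Vitest's defaults, so developers with more cores keep full parallelism.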
Comments suppressed due to low confidence (1)

.github/workflows/auto-qa-tuner.yml:935

  • The escalation path uses retries for issues.listForRepo, but the subsequent issues.create (diagnostic issue creation) is not retried. If GitHub is returning 502/503/504, escalation will still fail and can break the workflow run. Consider wrapping github.rest.issues.create in withRetry(...) too.
              if (!hasDiag) {
                const issueList = escalationNeeded.map(i =>
                  `- #${i.number}: ${i.title} (stalled ${i.stalledMinutes}m)`
                ).join('\n');
                await github.rest.issues.create({
                  owner, repo,

}), `listComments(#${issue.number})`);
const hasStallComment = comments.some(c => c.body.includes('Copilot Stall Detected'));
if (!hasStallComment) {
await github.rest.issues.createComment({

Copilot AI Apr 1, 2026


issues.createComment is still called without withRetry. A transient 502/503/504 during GitHub outages can still fail the stall-check job even after adding retries around the list calls. Consider wrapping issues.createComment in withRetry(...) (and/or catching transient failures, logging a warning, and continuing).
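
The "catch, warn, and continue" alternative mentioned above could be sketched as a small wrapper. `bestEffort` is a hypothetical name and not part of the workflow; the idea is only that a transient 502/503/504 on a non-critical write is logged and skipped, while any other error still fails the job.

```typescript
// Hypothetical helper: swallows transient failures, rethrows everything else.
async function bestEffort<T>(
  fn: () => Promise<T>,
  label: string,
): Promise<T | undefined> {
  try {
    return await fn();
  } catch (err) {
    const status = (err as { status?: unknown } | null)?.status;
    if (status === 502 || status === 503 || status === 504) {
      // Transient outage: log and let the caller move on to the next issue.
      console.warn(`${label}: skipped after transient HTTP ${status}`);
      return undefined;
    }
    throw err;
  }
}
```

A caller would wrap the comment creation, for example `await bestEffort(() => createStallComment(issue), "createComment")`, where `createStallComment` stands in for the real API call. The trade-off is that a skipped comment is only retried on the next scheduled run.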

@clubanderson
Collaborator Author

🔄 Auto-Applying Copilot Code Review

Copilot code review found 0 code suggestion(s) and 1 general comment(s).

Also address these general comments:

  • .github/workflows/auto-qa-tuner.yml (line 902): issues.createComment is still called without withRetry. A transient 502/503/504 during GitHub outages can still fail

Push all fixes in a single commit. Run cd web && npm run build && npm run lint before committing.


Auto-generated by copilot-review-apply workflow.


Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nightly regression: consistency-test

3 participants