CI: Improve workflow verification retry logic in run-skipped-ci #2000

@justin808

Problem

The run-skipped-ci.yml workflow uses a fixed 5-second wait before verifying that workflows are queued:

await new Promise(resolve => setTimeout(resolve, 5000));

This approach has several issues:

  1. Fragile timing: Under heavy GitHub Actions load, 5 seconds may be insufficient
  2. False warnings: Creates "not yet verified" warnings when workflows are legitimately queued but not yet visible
  3. No retry logic: Single check with no fallback if the initial check fails

Impact

  • Users may see false warnings about workflows not being verified
  • Under load, workflows might not be detected even though they're running
  • No graceful degradation if GitHub API is slow

Location

.github/workflows/run-skipped-ci.yml: Line 116

Proposed Solution

Add retry logic with exponential backoff:

// Retry verification with exponential backoff: waits 5s, 10s, 20s
const maxRetries = 3;
const baseDelay = 5000; // ms

let workflows = [];
for (let attempt = 1; attempt <= maxRetries; attempt++) {
  // The delay doubles on each attempt
  const delay = baseDelay * 2 ** (attempt - 1);
  await new Promise(resolve => setTimeout(resolve, delay));

  workflows = await checkWorkflows();

  if (workflows.length > 0) {
    // Success - workflows found
    break;
  }

  if (attempt < maxRetries) {
    console.log(`Attempt ${attempt} found no workflows, retrying in ${delay * 2}ms...`);
  }
}

Alternatively, increase the fixed wait to a more conservative value (e.g., 10-15 seconds).
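
For reference, the retry idea could be factored into a small helper that is testable without real waits. This is only a sketch under stated assumptions: `backoffDelays`, `verifyWithBackoff`, and the injectable `sleep` option are hypothetical names, not part of the existing workflow, and `check` stands in for whatever function queries the queued workflows. Treating a thrown error as an empty result is one possible way to degrade gracefully when the GitHub API is slow.

```javascript
// Hypothetical helper names and defaults; illustrative only.

// Delay before each attempt: baseDelay, 2x baseDelay, 4x baseDelay, ...
function backoffDelays(maxRetries, baseDelay) {
  return Array.from({ length: maxRetries }, (_, i) => baseDelay * 2 ** i);
}

const defaultSleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Waits, then calls `check`, repeating until `check` returns a non-empty
// array or the attempts run out. Returns the last result either way.
async function verifyWithBackoff(
  check,
  { maxRetries = 3, baseDelay = 5000, sleep = defaultSleep } = {}
) {
  let result = [];
  for (const delay of backoffDelays(maxRetries, baseDelay)) {
    await sleep(delay);
    try {
      result = await check();
    } catch (err) {
      // Assumption: a failed check is treated like "nothing found yet".
      console.log(`Check failed: ${err.message}`);
      result = [];
    }
    if (result.length > 0) break;
  }
  return result;
}
```

With the defaults, the worst case waits 5 + 10 + 20 = 35 seconds before giving up. Inside the workflow script the call might look like `const workflows = await verifyWithBackoff(checkWorkflows);`, with the existing warning emitted only if the returned array is still empty.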

Priority

Low - Current implementation works in most cases, but improvements would make it more reliable under load.

Related

Part of the CI reliability improvements from PR #1995.
