[github-actions] Retry transient GitHub infrastructure failures in workflows #175

## Summary

Some repository workflows still fail on transient GitHub-side infrastructure errors such as `git fetch` HTTP 500 responses during checkout. In those cases the workflow logic is fine, but the run still requires a manual rerun to go green.

## Current Behavior

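Intermittent failures like the following can fail a job even though an immediate rerun of the same workflow succeeds. In this log the fetch is attempted three times, with waits of 10 and 14 seconds between attempts, before the job fails: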

```text
/usr/bin/git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
Error: error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500
Error: fatal: expected flush after ref listing
The process '/usr/bin/git' failed with exit code 128
Waiting 10 seconds before trying again
/usr/bin/git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
Error: error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500
Error: fatal: expected 'packfile'
The process '/usr/bin/git' failed with exit code 128
Waiting 14 seconds before trying again
/usr/bin/git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
remote: Internal Server Error
Error: fatal: unable to access 'https://github.com/php-fast-forward/dev-tools/': The requested URL returned error: 500
Error: The process '/usr/bin/git' failed with exit code 128
```
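
One way to turn a log like this into something a retry mechanism can act on is a small signature table. The following is a minimal sketch in Python, not a proposed implementation: the signature names, the `TRANSIENT_SIGNATURES` table, and the `find_transient_signatures` helper are all hypothetical, and the patterns are derived only from the log above.

```python
from __future__ import annotations

import re

# Hypothetical signature table. The patterns are taken only from the log in
# this issue; a real list would need to be agreed on by maintainers.
TRANSIENT_SIGNATURES = {
    "git-rpc-http-5xx": re.compile(r"RPC failed; HTTP 5\d\d"),
    "url-returned-5xx": re.compile(r"The requested URL returned error: 5\d\d"),
    "remote-internal-server-error": re.compile(r"remote: Internal Server Error"),
}


def find_transient_signatures(log_text: str) -> list[str]:
    """Return the sorted names of all signatures found in log_text."""
    return sorted(
        name
        for name, pattern in TRANSIENT_SIGNATURES.items()
        if pattern.search(log_text)
    )
```

An empty result would mean the failure is treated as deterministic, so unknown errors fail normally by default.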

## Expected Behavior

When a workflow fails because GitHub checkout or another clearly transient GitHub-side operation hits an infrastructure error, the repository SHOULD retry or rerun automatically within a bounded policy. Logic bugs, validation failures, test failures, and deterministic workflow mistakes MUST still fail normally without automatic reruns.
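
The rule above can be sketched as a small decision function. This is an illustration under assumptions, not the agreed design: `RerunDecision`, `decide_rerun`, and the default of three attempts are hypothetical, and `transient_match` stands for whatever signature check the repository ends up defining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RerunDecision:
    rerun: bool
    reason: str  # intended to be printed so retries stay visible in logs


def decide_rerun(transient_match: bool, run_attempt: int, max_attempts: int = 3) -> RerunDecision:
    """Decide whether a failed run should be rerun automatically.

    transient_match: True if the failure matched a known transient signature.
    run_attempt: 1-based attempt number of the run that just failed.
    max_attempts: hypothetical upper bound on total attempts.
    """
    if not transient_match:
        # Logic bugs, test failures, and validation errors fail normally.
        return RerunDecision(False, "not a known transient signature; failing normally")
    if run_attempt >= max_attempts:
        # The bound prevents rerun loops when the infrastructure stays unhealthy.
        return RerunDecision(False, f"attempt {run_attempt} of {max_attempts} reached; giving up")
    return RerunDecision(True, f"transient failure on attempt {run_attempt} of {max_attempts}; rerunning")
```

Checking the transient match before the attempt bound keeps deterministic failures out of the retry path entirely, regardless of how many attempts remain.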

## Scope

Investigate and implement a workflow-level resilience strategy for transient GitHub Actions failures. This covers:

- checkout and fetch failures with HTTP 500 or similar GitHub-side transport errors
- short-lived internal GitHub service failures that disappear on immediate rerun
- a bounded retry or rerun mechanism that does not hide genuine workflow regressions

## Acceptance Criteria

- We define which failure signatures count as transient GitHub or network infrastructure failures.
- Repository workflows can retry or rerun automatically when those transient signatures are detected.
- The retry policy is bounded and visible in logs so maintainers can still diagnose flaky infrastructure.
- Deterministic failures from workflow logic, command failures, tests, or validation do not get retried automatically.
- The implementation documents where the retry policy applies and any intentionally excluded workflows or steps.
- README or docs are updated if maintainers need to understand or tune the behavior.
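
To make the criteria about bounds, visibility, and exclusions concrete, the policy could be modeled as a single documented object. The sketch below is only one possible shape: `RetryPolicy`, its fields, the log line format, and the workflow file names used in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    # Hypothetical tunables that the README or docs would describe.
    max_attempts: int = 3
    excluded_workflows: frozenset = frozenset()

    def applies_to(self, workflow: str) -> bool:
        """Return True unless the workflow is intentionally excluded."""
        return workflow not in self.excluded_workflows

    def log_line(self, workflow: str, run_attempt: int, signature: str) -> str:
        """Format a single line that makes a retry visible in the job log."""
        return (
            f"[retry-policy] workflow={workflow} "
            f"attempt={run_attempt}/{self.max_attempts} signature={signature}"
        )
```

For example, `RetryPolicy(excluded_workflows=frozenset({"release.yml"}))` would leave a hypothetical `release.yml` workflow outside the policy while still covering a hypothetical `tests.yml`.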

## Non-Goals

- Retrying failing tests, lint, changelog validation, or other real quality-signal failures.
- Hiding repeated infrastructure instability without surfacing that retries happened.
- Introducing unbounded rerun loops.
