safe-outputs.call-workflow workers cancel each other under parallel fan-out due to gh-aw-copilot-${{ github.workflow }} job concurrency group #35161

# `safe-outputs.call-workflow` workers cancel each other under parallel fan-out due to `gh-aw-copilot-${{ github.workflow }}` job concurrency group

## TL;DR

The compiler injects a job-level concurrency block on the agent job of every workflow:

```yaml
concurrency:
  group: "gh-aw-copilot-${{ github.workflow }}"
```

In a reusable worker invoked via `workflow_call` (e.g. by a `safe-outputs.call-workflow` orchestrator), `${{ github.workflow }}` resolves to the **caller's** workflow name, not the worker's. As a result, every parallel invocation of the worker from the same caller evaluates the concurrency group to the *same* string. GitHub Actions keeps at most one pending job per concurrency group; each newly queued job displaces the one already waiting, which is cancelled with:

```
Canceling since a higher priority waiting request for gh-aw-copilot-<caller-workflow-name> exists
```

This makes `safe-outputs.call-workflow` lossy as a fan-out primitive whenever ≥3 worker invocations land close together in time. In a recent test of mine, 3 out of 5 worker runs were cancelled.

## Environment

- `gh aw` version: **v0.74.8**
- Engine: `copilot` (default)
- Runner: `ubuntu-latest` on `github.com`
- Behavior reproduces consistently

## Reproduction

Two workflows are needed: an orchestrator, and a worker that the orchestrator fans out to via `safe-outputs.call-workflow`.

### `orchestrator.md` (caller)

```yaml
---
on:
  issues:
    types: [opened]
permissions:
  contents: read
  issues: read
safe-outputs:
  call-workflow:
    workflows: [reviewer-worker]
    max: 1
---

# Orchestrator

Always call the `reviewer_worker` MCP tool with:
- `issue_number`: `${{ github.event.issue.number }}`
- `proposed_comment`: a one-line placeholder string
```

### `reviewer-worker.md` (worker)

```yaml
---
on:
  workflow_call:
    inputs:
      issue_number:
        type: string
        required: true
      proposed_comment:
        type: string
        required: true
      payload:
        type: string
        required: false

permissions:
  contents: read
  issues: read

safe-outputs:
  add-comment:
    max: 1
    target: "*"
---

# Worker

Echo `${{ inputs.proposed_comment }}` to a comment on issue `${{ inputs.issue_number }}` via `add_comment`. No reasoning.
```

### Trigger

Open 5 GitHub issues in quick succession (e.g. via a small bash loop with `gh issue create`). The orchestrator fires 5 times, each invoking the worker once.
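
The source mentions a bash loop; an equivalent minimal sketch in Python is shown here so the burst is easy to reproduce. The function names, issue titles, and body text are placeholders I made up for illustration. The only external call is `gh issue create` with its `--title` and `--body` flags, and it assumes `gh` is installed, authenticated, and run from a checkout of the test repository.

```python
import subprocess
from typing import List


def build_issue_commands(count: int, title_prefix: str = "fan-out repro") -> List[List[str]]:
    """Return one `gh issue create` argv per issue to open (nothing is executed here)."""
    return [
        [
            "gh", "issue", "create",
            "--title", f"{title_prefix} {i}",
            "--body", "Concurrency repro; safe to close.",
        ]
        for i in range(1, count + 1)
    ]


def open_issues(count: int = 5) -> None:
    """Open `count` issues back to back so the orchestrator runs overlap."""
    for cmd in build_issue_commands(count):
        # check=True stops the loop on the first gh failure (e.g. not authenticated).
        subprocess.run(cmd, check=True)
```

Calling `open_issues()` creates the five issues without any delay between them, which is what makes the resulting worker invocations land close together.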

### Expected

5 worker runs complete; 5 comments posted (one per issue). Worker invocations are independent — different `issue_number` inputs, distinct `github.run_id`s, distinct logical work units.

### Actual (~60% loss rate)

Several worker runs are cancelled mid-flight with the message above. The workflow run graph for a cancelled invocation shows:

- `call-{worker} / pre_activation`  ✅
- `call-{worker} / activation`      ✅
- `call-{worker} / agent`           ❌ cancelled
- `call-{worker} / detection`       — skipped
- `call-{worker} / safe_outputs`    — skipped (so no comment posted)
- `call-{worker} / conclusion`      ✅

In one of my real runs, 3 of 5 parallel worker invocations got cancelled this way. The job-level annotation on the cancelled jobs reads:

> Canceling since a higher priority waiting request for `gh-aw-copilot-Issue Triage Agent for dotnet/aspnetcore` exists

— note `Issue Triage Agent for dotnet/aspnetcore` is the **orchestrator's** name, not the worker's.

## Root cause

Inside the generated `reviewer-worker.lock.yml`, the agent job carries:

```yaml
  agent:
    ...
    concurrency:
      group: "gh-aw-copilot-${{ github.workflow }}"
    ...
```

Per [GitHub's documented behavior](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#onworkflow_callinputs), `${{ github.workflow }}` inside a reusable workflow run resolves to the **caller's** workflow name. So when caller `Issue Triage Agent` invokes worker `Reviewer Worker` five times in parallel, all five worker runs evaluate the group as:

```
gh-aw-copilot-Issue Triage Agent
```

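Even without `cancel-in-progress: true`, a concurrency group holds at most one running job and one pending job. When another job arrives while one is already pending, the pending job is cancelled and the new arrival takes its place. A burst of three or more therefore loses every job except the one running and the last one queued, which produces the observed cancellation pattern.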
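
The queueing rule is easier to see as a toy model. The sketch below is not GitHub's implementation; the class and function names are hypothetical, and it assumes every job in the burst arrives before the first one finishes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConcurrencyGroup:
    """Toy model of one concurrency group with cancel-in-progress left unset."""

    running: Optional[str] = None
    pending: Optional[str] = None
    completed: List[str] = field(default_factory=list)
    cancelled: List[str] = field(default_factory=list)

    def arrive(self, job: str) -> None:
        """A job enters the group: it runs, waits, or displaces the waiting job."""
        if self.running is None:
            self.running = job
            return
        if self.pending is not None:
            # Only one pending slot exists, so the older waiter is cancelled.
            self.cancelled.append(self.pending)
        self.pending = job

    def finish_running(self) -> None:
        """The running job completes and the pending job (if any) starts."""
        if self.running is not None:
            self.completed.append(self.running)
        self.running, self.pending = self.pending, None


def simulate_burst(jobs: List[str]) -> ConcurrencyGroup:
    """All jobs arrive while the first is still running, then the group drains."""
    group = ConcurrencyGroup()
    for job in jobs:
        group.arrive(job)
    while group.running is not None:
        group.finish_running()
    return group


if __name__ == "__main__":
    result = simulate_burst([f"worker-{i}" for i in range(1, 6)])
    print("completed:", result.completed)  # ['worker-1', 'worker-5']
    print("cancelled:", result.cancelled)  # ['worker-2', 'worker-3', 'worker-4']
```

Under these assumptions a burst of five loses three runs, matching the 3-of-5 cancellations reported above, while a burst of two loses none.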

I verified the workflow-level concurrency at the top of the worker behaves the same way — but that one was easy to override from frontmatter:

```yaml
concurrency:
  group: gh-aw-reviewer-worker-${{ inputs.issue_number || github.run_id }}
  cancel-in-progress: false
```

The agent-job-level `gh-aw-copilot-...` block, however, is injected by the compiler and, as far as I can tell from the v0.74.8 codegen, cannot be overridden from frontmatter.

## Proposed fix

The agent-job concurrency group should be unique per invocation when the workflow is a `workflow_call` target. Two options:

### Option A — caller-side identity (simplest)

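Use `github.run_id` instead of, or as a suffix to, `github.workflow`. Inside a `workflow_call` worker, `run_id` is the caller's run ID, so it is unique per orchestrator run. Because each orchestrator run in this repro invokes the worker once, it is also unique per worker invocation here; if a single orchestrator run ever invoked the same worker more than once, those invocations would share a `run_id`.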

```yaml
concurrency:
  group: "gh-aw-copilot-${{ github.workflow }}-${{ github.run_id }}"
```

Pro: a one-line change; if it is applied only to `workflow_call` targets, other workflows see no behavior change.
Con: each run gets its own slot, so the gh-aw-copilot serialization at the agent level is effectively disabled wherever the suffix is applied. If that serialization exists to throttle Copilot CLI rate-limit pressure, this option removes the throttle.

### Option B — Worker-aware identity

Replace `${{ github.workflow }}` with a compile-time string identifying the workflow file itself (e.g. the value the compiler already emits as the `GH_AW_WORKFLOW_ID_SANITIZED` env var):

```yaml
concurrency:
  group: "gh-aw-copilot-reviewer-worker"  # compile-time hardcoded
```

Pro: workers retain a serialization slot but in their *own* namespace, distinct from the caller's. Different workers across the repo still each get a slot, matching the original intent.
Con: still funnels all invocations of the *same* worker into one group, even when they're for different inputs. In the fan-out case above, all 5 worker invocations for 5 different issues share `gh-aw-copilot-reviewer-worker`, so by the one-pending-slot rule described under Root cause, a burst of 3 or more can still lose runs to cancellation. On its own, this option only stops the worker from colliding with the caller's group.
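
To compare the two options with the current behavior, the sketch below computes the group string each scheme would produce for five invocations and reports how many land in the most crowded group. It is illustrative only: the `Invocation` fields, the run IDs, and the helper names are hypothetical, and it assumes each invocation comes from a separate orchestrator run (so each has its own `run_id`).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Invocation:
    caller_workflow: str  # what github.workflow resolves to inside the worker
    run_id: int           # github.run_id of the calling orchestrator run
    worker_id: str        # hypothetical compile-time identifier of the worker


def current_group(inv: Invocation) -> str:
    return f"gh-aw-copilot-{inv.caller_workflow}"


def option_a_group(inv: Invocation) -> str:
    return f"gh-aw-copilot-{inv.caller_workflow}-{inv.run_id}"


def option_b_group(inv: Invocation) -> str:
    return f"gh-aw-copilot-{inv.worker_id}"


def largest_group(invocations: List[Invocation], scheme: Callable[[Invocation], str]) -> int:
    """Number of invocations sharing the most crowded concurrency group."""
    return max(Counter(scheme(inv) for inv in invocations).values())


# Five orchestrator runs (made-up run IDs), each calling the same worker once.
FAN_OUT = [
    Invocation("Issue Triage Agent", 1000 + i, "reviewer-worker")
    for i in range(1, 6)
]

if __name__ == "__main__":
    for name, scheme in [
        ("current", current_group),
        ("option A", option_a_group),
        ("option B", option_b_group),
    ]:
        print(f"{name}: {largest_group(FAN_OUT, scheme)} invocation(s) in the largest group")
```

With these inputs the current scheme and Option B both place all five invocations in a single group, while Option A gives each invocation a group of its own.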

## Workarounds (for anyone hitting this today)

1. **Don't fire more than 2 worker invocations in close succession.** With queue size 1, two concurrent runs work (one runs, one queues); three is where the cancellations start. If using `safe-outputs.call-workflow` from an `issues.opened`-triggered orchestrator on a low-traffic repo, you may never hit this in practice.

2. **Space dispatches by ~90s** (the time it takes one worker's agent step to release the slot). This worked for me when retrying the 3 cancelled runs one at a time.

3. **Accept workflow-level serialization in the worker** — don't override the default workflow-level concurrency on the worker. Workers will serialize end-to-end (one at a time, ~5 min each), but no cancellations. Slow but correct.

## Related observation (separate footgun)

When manually adding a `concurrency:` block to a `workflow_call` worker, do NOT use `${{ github.workflow }}` in the group key — it resolves to the caller, not the worker, and will deadlock with the caller's own concurrency group when they share an issue number suffix. Error:

> Canceling since a deadlock was detected for concurrency group: `gh-aw-<caller-name>-<issue#>` between a top level workflow and `<caller-job-name>`

The fix is to hardcode the worker's identity:

```yaml
concurrency:
  group: gh-aw-my-worker-${{ github.run_id }}
  cancel-in-progress: false
```


