docs: Add outcome span attributes and outcomes reference by mnkiefer · Pull Request #34627 · github/gh-aw

mnkiefer · 2026-05-25T10:52:10Z

docs: Add outcome span attributes and outcomes reference

Summary

This PR introduces documentation and specification refinements for the Outcomes model in gh-aw. It adds a new Outcomes reference page, documents gh-aw.outcome.* OTel span attributes in the OpenTelemetry reference, and hardens the safe-output outcome evaluation spec with tightened provenance language and normative rules.

What changed and why

`docs/src/content/docs/reference/outcomes.md` (new file)

A new 110-line reference page covering the Outcomes model end-to-end:

- Defines all outcome states: accepted, rejected, pending, ignored, noop
- Introduces outcome efficiency (effective tokens per accepted outcome)
- Provides the per-safe-output-type acceptance table
- Documents telemetry integration, cost/rollup guidance, and links to related pages

This gives operators and integrators a single canonical reference for understanding how gh-aw evaluates and categorises workflow outputs.
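The outcome efficiency metric is described above as effective tokens per accepted outcome. A minimal Go sketch of that ratio follows; the function name `OutcomeEfficiency` and its signature are hypothetical and not taken from the gh-aw codebase, and treating zero accepted outcomes as "undefined" is an assumption.

```go
package main

import "fmt"

// OutcomeEfficiency is a hypothetical helper (not from gh-aw) that returns
// effective tokens per accepted outcome. The second return value is false
// when no outcome was accepted, because the ratio is undefined in that case
// (assumption: the real implementation may handle this differently).
func OutcomeEfficiency(effectiveTokens, accepted int64) (float64, bool) {
	if accepted <= 0 {
		return 0, false
	}
	return float64(effectiveTokens) / float64(accepted), true
}

func main() {
	// Example figures are made up for illustration.
	if eff, ok := OutcomeEfficiency(120000, 4); ok {
		fmt.Printf("%.0f effective tokens per accepted outcome\n", eff)
	}
	if _, ok := OutcomeEfficiency(120000, 0); !ok {
		fmt.Println("no accepted outcomes: efficiency undefined")
	}
}
```

Lower values mean fewer tokens were spent per accepted result.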

`docs/src/content/docs/reference/open-telemetry.md` (modified)

Added a new "Outcome span attributes" section documenting the gh-aw.outcome.* OTel span attributes emitted for workflow-level outcome rollups:

| Attribute | Description |
| --- | --- |
| `gh-aw.outcome.total` | Total outcome count |
| `gh-aw.outcome.accepted` | Accepted count |
| `gh-aw.outcome.rejected` | Rejected count |
| `gh-aw.outcome.pending` | Pending count |
| `gh-aw.outcome.ignored` | Ignored count |
| `gh-aw.outcome.acceptance_rate` | Acceptance rate |
| `gh-aw.outcome.waste_rate` | Waste rate |

The table is documented as non-exhaustive, with a cross-reference to the new Outcomes reference page.
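As an illustration of how the count attributes above could be assembled for a workflow-level rollup, here is a hedged Go sketch. The `OutcomeCounts` type and `CountAttributes` function are hypothetical names introduced for this example; only the attribute keys come from the table. The two rate attributes are omitted because their exact formulas are not stated here.

```go
package main

import (
	"fmt"
	"sort"
)

// OutcomeCounts is a hypothetical rollup of outcome counts for one workflow run.
type OutcomeCounts struct {
	Total, Accepted, Rejected, Pending, Ignored int64
}

// CountAttributes maps the counts onto the gh-aw.outcome.* attribute keys
// listed in the table. It is a sketch, not the gh-aw emitter.
func CountAttributes(c OutcomeCounts) map[string]int64 {
	return map[string]int64{
		"gh-aw.outcome.total":    c.Total,
		"gh-aw.outcome.accepted": c.Accepted,
		"gh-aw.outcome.rejected": c.Rejected,
		"gh-aw.outcome.pending":  c.Pending,
		"gh-aw.outcome.ignored":  c.Ignored,
	}
}

func main() {
	attrs := CountAttributes(OutcomeCounts{Total: 10, Accepted: 6, Rejected: 2, Pending: 1, Ignored: 1})

	// Sort keys so the printed order is stable.
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%d\n", k, attrs[k])
	}
}
```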

`specs/safe-output-outcome-evaluation.md` (modified) ⚠️ Breaking

Significant provenance model refinements throughout the spec:

- Language hardening: replaced "human" with "visible non-bot actor" / "actor-visible" throughout, including conformance test rows, to remove ambiguity around bot-authored actions
- New "Provenance Limits" section: five normative rules governing what counts as valid provenance
- OTel prefix rename: ghaw.outcome.* → gh-aw.outcome.* (see breaking change below)
- Historical field name clarifications: human_comments, human_edits, zero_touch are explicitly noted as historical field names
- "Current Default Acceptance Map" table: maps each output type to its runtime evaluator function and acceptance behaviour

⚠️ Breaking change

OTel span attribute namespace renamed

The span attribute prefix has changed from ghaw.outcome.* to gh-aw.outcome.*.

Impact: Any consumer, dashboard, or alert rule reading or emitting outcome span attributes using the old ghaw.* prefix must be updated to use gh-aw.*.
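For consumers that need to keep reading older data during the transition, a key-rewriting shim is one option. The Go sketch below is an assumption about how such a consumer-side migration could look; `MigrateKey` and `MigrateAttributes` are hypothetical helpers and are not part of gh-aw.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	oldPrefix = "ghaw.outcome."  // prefix used before this PR
	newPrefix = "gh-aw.outcome." // prefix documented by this PR
)

// MigrateKey rewrites a single attribute key from the old prefix to the new
// one and leaves every other key unchanged.
func MigrateKey(key string) string {
	if strings.HasPrefix(key, oldPrefix) {
		return newPrefix + strings.TrimPrefix(key, oldPrefix)
	}
	return key
}

// MigrateAttributes returns a copy of attrs with outcome keys renamed.
// If both the old and the new key are present, the new key's value wins.
func MigrateAttributes(attrs map[string]any) map[string]any {
	out := make(map[string]any, len(attrs))
	for k, v := range attrs {
		nk := MigrateKey(k)
		if _, exists := attrs[nk]; exists && nk != k {
			continue // prefer the value already stored under the new key
		}
		out[nk] = v
	}
	return out
}

func main() {
	fmt.Println(MigrateKey("ghaw.outcome.accepted")) // gh-aw.outcome.accepted
	fmt.Println(MigrateKey("http.method"))           // http.method
}
```

Dashboards and alert rules that match on the literal attribute name still need their queries updated; the shim only helps code paths that normalize attributes before use.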

Files changed

| File | Change | Impact |
| --- | --- | --- |
| `docs/src/content/docs/reference/outcomes.md` | Added | Medium |
| `docs/src/content/docs/reference/open-telemetry.md` | Modified | Medium |
| `specs/safe-output-outcome-evaluation.md` | Modified | High / Breaking |

Checklist

- New Outcomes reference page covers all outcome states, efficiency metric, and telemetry
- OTel reference updated with gh-aw.outcome.* attribute table (non-exhaustive)
- Spec provenance language de-ambiguated ("visible non-bot actor")
- Normative Provenance Limits section added
- Default Acceptance Map table added to spec
- Breaking OTel prefix rename documented

Generated by PR Description Updater for issue #34627 · sonnet46 1.3M · ◷

Copilot

Pull request overview

This PR updates the outcomes documentation/specs for GitHub Agentic Workflows, aiming to clarify outcome evaluation principles, provide a consolidated outcomes reference page, and document OpenTelemetry outcome-related attributes.

Changes:

- Refines outcome evaluation principles and adds a provenance/actor-visibility section in the outcome evaluation spec.
- Adds a new Outcomes reference page summarizing outcome states, acceptance meaning by safe output type, and rollups.
- Adds an OpenTelemetry reference section documenting outcome-related span attributes.

Show a summary per file

| File | Description |
| --- | --- |
| specs/safe-output-outcome-evaluation.md | Updates evaluation principles/provenance guidance and adds a “current default acceptance map” summary table. |
| docs/src/content/docs/reference/outcomes.md | Introduces a new Outcomes reference page summarizing states, acceptance meaning, and related links. |
| docs/src/content/docs/reference/open-telemetry.md | Adds an “Outcome span attributes” section to the OTel reference. |

Copilot's findings

Comments suppressed due to low confidence (7)

docs/src/content/docs/reference/outcomes.md:46

The /gh-aw/specs/safe-output-outcome-evaluation/ link likely won’t resolve in the docs site: no other docs pages link under /gh-aw/specs/, and the spec content lives in the repo-root specs/ folder (not under docs/src/content/docs). Consider linking to the published spec location (or to the GitHub file), or adding the spec into the docs content so this route exists.

Rows marked `fallback rule` use a generic existence check, not a type-specific rule. For exact rules, edge cases, and conformance details, see [Safe Output Outcome Evaluation Specification](/gh-aw/specs/safe-output-outcome-evaluation/).

docs/src/content/docs/reference/outcomes.md:55

The add_labels row says accepted means “label retention”, but the current Go evaluator can’t verify retention because the manifest doesn’t record which labels were added; it returns pending when any labels exist and only rejected when none exist (pkg/cli/outcome_eval_label.go). Please adjust this row to match current behavior (or clarify that it’s a non-accepting/indeterminate check today).

| `add_comment` | reacted to or replied to | dedicated rule |
| `add_labels` | label retention | limited check |
| `add_reviewer` | review target exists | fallback rule |

specs/safe-output-outcome-evaluation.md:93

In the “Current Default Acceptance Map”, the add_labels row says the evaluator accepts based on “label retention”, but evalAddLabels currently returns pending when labels are present (it cannot verify retention because added labels aren’t recorded) and only rejects on an empty label list (pkg/cli/outcome_eval_label.go:37-47). Please update the table to match current runtime behavior.

| `create_issue` | `evalCreateIssue` | completed/closed |
| `add_comment` | `evalAddComment` | reacted to or replied to |
| `add_labels` | `evalAddLabels` | label retention |
| `add_reviewer` | `evalGenericSticky` fallback | review target exists |
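The behavior this comment describes (pending when any labels are present, rejected when the list is empty, never accepted) can be sketched in Go as follows. This is an illustration of the described behavior only: `evalLabelsSketch` and the `Outcome` type are hypothetical and are not the actual `evalAddLabels` code in pkg/cli/outcome_eval_label.go.

```go
package main

import "fmt"

// Outcome is a hypothetical type standing in for the evaluator's result.
type Outcome string

const (
	Pending  Outcome = "pending"
	Rejected Outcome = "rejected"
)

// evalLabelsSketch mirrors the behavior described in the review comment:
// the labels that were added are not recorded, so retention cannot be
// verified. The check therefore never returns an accepted outcome.
func evalLabelsSketch(currentLabels []string) Outcome {
	if len(currentLabels) == 0 {
		return Rejected // no labels at all on the target
	}
	return Pending // labels exist, but they may not be the ones that were added
}

func main() {
	fmt.Println(evalLabelsSketch(nil))             // rejected
	fmt.Println(evalLabelsSketch([]string{"bug"})) // pending
}
```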

docs/src/content/docs/reference/outcomes.md:63

These rows describe “discussion target exists” as the fallback behavior, but the Go fallback evaluator (evalGenericSticky) only calls GET /issues/{number}. That won’t validate a Discussion’s existence and will likely produce an error outcome. Consider updating the table to reflect the real fallback behavior/limitations for non-issue/PR objects.

| `close_discussion` | none yet | no implemented rule yet |
| `create_discussion` | none yet | no implemented rule yet |
| `update_discussion` | discussion target exists | fallback rule |
| `create_pull_request_review_comment` | none yet | no implemented rule yet |

specs/safe-output-outcome-evaluation.md:95

The evalGenericSticky fallback only checks whether an issue/PR exists via GET /issues/{number} (it does not validate “review target exists”). Please update the “accepted at a glance” wording for fallback rows like add_reviewer to match the actual behavior (generic issue/PR existence only).

| `add_labels` | `evalAddLabels` | label retention |
| `add_reviewer` | `evalGenericSticky` fallback | review target exists |
| `update_issue` | `evalGenericSticky` fallback | issue still exists |
| `update_pull_request` | `evalGenericSticky` fallback | PR still exists |

docs/src/content/docs/reference/outcomes.md:74

Several “fallback rule” entries here (e.g. dispatch_workflow, code scanning alerts, link_sub_issue) imply the evaluator checks the existence of those specific resources. In Go, unknown types fall back to evalGenericSticky, which only calls GET /issues/{number}; for non-issue/PR resources this won’t confirm existence and will likely produce an error or pending outcome. Consider adjusting these summaries to reflect the current fallback limitations.

| `dispatch_workflow` | dispatch target exists | fallback rule |
| `autofix_code_scanning_alert` | alert target exists | fallback rule |
| `create_code_scanning_alert` | alert target exists | fallback rule |
| `link_sub_issue` | sub-issue link target exists | fallback rule |
| `hide_comment` | none yet | no implemented rule yet |

specs/safe-output-outcome-evaluation.md:104

For update_discussion and other non-issue/PR types, the evalGenericSticky fallback won’t validate the target because it calls the Issues REST endpoint (GET /issues/{number}). The table currently implies resource-specific existence checks (“discussion target exists”, “review target exists”, etc.); please revise the accepted summaries so they match actual fallback behavior/limitations.

| `update_discussion` | `evalGenericSticky` fallback | discussion target exists |
| `create_pull_request_review_comment` | `evalReviewComment` | none yet |
| `submit_pull_request_review` | `evalGenericSticky` fallback | PR still exists |
| `reply_to_pull_request_review_comment` | `evalGenericSticky` fallback | review target exists |
| `resolve_pull_request_review_thread` | `evalResolveThread` | none yet |
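To make the limitation concrete, the sketch below shows a fallback that builds the same Issues REST path regardless of the safe output type, so a discussion number is looked up as if it were an issue. `fallbackEndpoint` is a hypothetical function written for illustration; it is not the `evalGenericSticky` implementation, and the owner/repo values in `main` are placeholders.

```go
package main

import "fmt"

// fallbackEndpoint is a hypothetical illustration of a generic fallback.
// The outputType argument is deliberately ignored: every type is resolved
// through the Issues REST endpoint, which is the limitation the review
// comment points out.
func fallbackEndpoint(owner, repo, outputType string, number int) string {
	return fmt.Sprintf("/repos/%s/%s/issues/%d", owner, repo, number)
}

func main() {
	// Placeholder owner/repo; the two calls differ only in output type.
	fmt.Println(fallbackEndpoint("example-org", "example-repo", "update_issue", 42))
	fmt.Println(fallbackEndpoint("example-org", "example-repo", "update_discussion", 42))
	// Both lines print the same path, so the second call does not
	// validate that discussion 42 exists.
}
```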

Files reviewed: 3/3 changed files
Comments generated: 3

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

github-actions · 2026-05-25T11:02:28Z

✅ smoke-ci: safeoutputs CLI comment + comment-memory run (26397041759)

Generated by 🧪 Smoke CI for issue #34627 · ◷

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

mnkiefer · 2026-05-25T11:17:41Z

@copilot Address remaining review comments and resolve merge conflicts.

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Copilot · 2026-05-25T11:30:52Z

> @copilot Address remaining review comments and resolve merge conflicts.

Addressed. Remaining review feedback is handled in a00caf2, and merge conflicts were resolved by merging main and fixing lockfile conflicts in 8ca9be9. Branch is now updated with latest main.

docs: Add outcome span attributes and outcomes reference

81f1252

mnkiefer self-assigned this May 25, 2026

Copilot AI review requested due to automatic review settings May 25, 2026 10:52

Copilot started reviewing on behalf of mnkiefer May 25, 2026 10:52


Copilot AI reviewed May 25, 2026


Comment thread docs/src/content/docs/reference/outcomes.md Outdated

Comment thread specs/safe-output-outcome-evaluation.md Outdated

Comment thread docs/src/content/docs/reference/open-telemetry.md Outdated

mnkiefer and others added 2 commits May 25, 2026 12:58

Potential fix for pull request finding

443edaa

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

235991d

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of mnkiefer May 25, 2026 11:04

Copilot AI and others added 2 commits May 25, 2026 11:11

Plan: address single PR review comment

8fb6ee6

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Clarify outcome attributes table is non-exhaustive

a00caf2

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Copilot finished work on behalf of mnkiefer May 25, 2026 11:17

Copilot started work on behalf of mnkiefer May 25, 2026 11:17

Copilot AI and others added 2 commits May 25, 2026 11:23

Plan: address remaining review and conflicts

d545778

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Merge main into add-outcome-docs and resolve workflow lock conflicts

8ca9be9

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Copilot finished work on behalf of mnkiefer May 25, 2026 11:31

mnkiefer merged commit 209fa60 into main May 25, 2026

mnkiefer deleted the add-outcome-docs branch May 25, 2026 11:45

mnkiefer merged 7 commits into main from add-outcome-docs

3 participants
