Mark agent spans as Langfuse "generation" observations and (opt-in) emit prompt/completion content

### Problem

The agent's prompt and completion text never leave the runner. gh-aw emits token counts and a model name but not the prompt body, the response body, or the OTel `langfuse.observation.type = "generation"` marker. The consequence in Langfuse:

- Agent spans render as plain "span" rows rather than the richer generation rows, so the table shows no model, cost, or usage and there is no prompt-link UI.
- The trace detail view has empty Input / Output panels, which makes it useless for human review, evals, or annotation queues.
- LLM-as-a-Judge evaluators can't run because they need input/output content.

### Proposal

Two changes:

**A. Always (no opt-in):** add `langfuse.observation.type = "generation"` to the dedicated agent span, and to the conclusion span when it carries the token usage because no dedicated agent span exists. In `send_otlp_span.cjs`, near line 1997 (the `agentAttributes = [...attributes, ...usageAttrs]` line):

```javascript
agentAttributes.push(buildAttr("langfuse.observation.type", "generation"));
```

When usage attributes fall through to the conclusion span (around line 1985), add the same `langfuse.observation.type = "generation"` attribute to the conclusion span attributes.
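The two paths in (A) can be sketched as a single helper. This is a minimal illustration, not the code in `send_otlp_span.cjs`: the `buildAttr` shown here is a stand-in that assumes the OTLP/JSON string key/value shape, and `withGenerationMarker`, `buildSpanAttributes`, and the `hasAgentSpan` flag are hypothetical names introduced only for this example.

```javascript
// Stand-in for the buildAttr helper mentioned above. The real helper in
// send_otlp_span.cjs may differ; this assumes the OTLP/JSON KeyValue shape.
function buildAttr(key, value) {
  return { key, value: { stringValue: String(value) } };
}

const OBSERVATION_TYPE_KEY = "langfuse.observation.type";

// Hypothetical helper: returns a copy of attrs that carries the generation
// marker exactly once.
function withGenerationMarker(attrs) {
  if (attrs.some(attr => attr.key === OBSERVATION_TYPE_KEY)) {
    return [...attrs];
  }
  return [...attrs, buildAttr(OBSERVATION_TYPE_KEY, "generation")];
}

// Hypothetical helper: decides which span carries usage and the marker.
// - Dedicated agent span: it always gets usage plus the marker.
// - No agent span: the conclusion span gets the marker only if it carries usage.
function buildSpanAttributes({ attributes, usageAttrs, hasAgentSpan }) {
  if (hasAgentSpan) {
    return {
      agent: withGenerationMarker([...attributes, ...usageAttrs]),
      conclusion: [...attributes],
    };
  }
  const conclusion = [...attributes, ...usageAttrs];
  return {
    agent: null,
    conclusion: usageAttrs.length > 0 ? withGenerationMarker(conclusion) : conclusion,
  };
}
```

Keeping the marker logic in one place would let the same tests exercise both the dedicated-agent-span path and the fallback path.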

**B. Opt-in via frontmatter:** add a new `langfuse` block under `observability`:

```yaml
observability:
  otlp:
    endpoint: ...
  langfuse:
    capture-content: true            # default false
    redact-patterns:                 # optional
      - 'ghp_[A-Za-z0-9]{36,}'
      - 'sk-[A-Za-z0-9]{20,}'
```

When `capture-content: true`:

1. Add a post-agent step that reads `/tmp/gh-aw/agent_output.json` and the resolved prompt body, applies `sanitizeOTLPPayload`-style redaction plus any user-supplied `redact-patterns`, and emits a child span carrying the `langfuse.observation.input` and `langfuse.observation.output` attributes.
2. The default for `capture-content` must be `false` because prompts can carry repo paths, issue bodies, or secrets.
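The opt-in gate and the user-pattern redaction could look roughly like the sketch below. It is an illustration under assumptions: `redactText`, `buildContentAttributes`, and the `[REDACTED]` placeholder are hypothetical, the `observability` argument is assumed to be the parsed frontmatter block shown above, and the built-in `sanitizeOTLPPayload` pass is deliberately left out because its signature is not given here.

```javascript
// Hypothetical helper: replaces every match of each user-supplied pattern.
// An invalid pattern makes `new RegExp` throw, so the step fails closed
// instead of emitting unredacted content.
function redactText(text, patterns) {
  return patterns.reduce(
    (acc, source) => acc.replace(new RegExp(source, "g"), "[REDACTED]"),
    text
  );
}

// Hypothetical helper: returns the content attributes for the child span,
// or an empty array unless capture-content is exactly `true`.
function buildContentAttributes(observability, promptBody, outputBody) {
  const langfuse = (observability && observability.langfuse) || {};
  if (langfuse["capture-content"] !== true) {
    return []; // default: no prompt or completion content leaves the runner
  }
  const patterns = langfuse["redact-patterns"] || [];
  return [
    {
      key: "langfuse.observation.input",
      value: { stringValue: redactText(promptBody, patterns) },
    },
    {
      key: "langfuse.observation.output",
      value: { stringValue: redactText(outputBody, patterns) },
    },
  ];
}
```

Returning an empty array in the disabled case gives the regression test a simple assertion: with `capture-content` unset or `false`, no `langfuse.observation.input` or `langfuse.observation.output` attribute is produced.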

### Acceptance criteria

- (A) Agent / agent-conclusion spans always carry `langfuse.observation.type = "generation"`. The Langfuse trace detail view renders model name, tokens, and cost without extra configuration. Tests cover both the dedicated-agent-span path and the fallback (conclusion-span-carries-usage) path.
- (B) When `capture-content: true`, the post-agent step writes a child span with `langfuse.observation.input` / `langfuse.observation.output` populated. Redaction is applied before the span is emitted.
- When `capture-content` is unset or `false`, no content attributes are emitted (regression test).
- Documentation in `docs/` describes both options and warns about content sensitivity.

### References

- [Langfuse observation type and input/output mapping](https://langfuse.com/integrations/native/opentelemetry#observation-level-attributes)
- [Langfuse prompt linking via OTEL (discussion #9065)](https://github.com/orgs/langfuse/discussions/9065)
- gh-aw existing redaction: `actions/setup/js/send_otlp_span.cjs` → `sanitizeOTLPPayload`

