Problem
The agent's prompt and completion text never leave the runner. gh-aw emits token counts and a model name but not the prompt body, the response body, or the OTel langfuse.observation.type = "generation" marker. The consequence in Langfuse:
- Agent spans render as plain "span" rows rather than the rich generation rows (no model + cost + usage in-table, no prompt-link UI).
- Trace detail view has empty Input / Output panels — useless for human review, evals, or annotation queues.
- LLM-as-a-Judge evaluators can't run because they need input/output content.
Proposal
Two changes:
A. Always (no opt-in): add langfuse.observation.type = "generation" to the dedicated agent span (and to the conclusion span when it carries the token usage because no dedicated agent span exists). In send_otlp_span.cjs near line 1997 (the agentAttributes = [...attributes, ...usageAttrs] line):
agentAttributes.push(buildAttr("langfuse.observation.type", "generation"));
When usage attributes fall through to the conclusion span (near line 1985), add the same langfuse.observation.type = "generation" attribute to the conclusion span attributes.
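The two paths in change A can be sketched roughly as follows. This is a minimal illustration, not the code in send_otlp_span.cjs: the body of buildAttr shown here is an assumption (it follows the OTLP/JSON key-value shape for string values), and markGeneration and its hasAgentSpan flag are hypothetical names introduced only for this example.

```javascript
// Hypothetical stand-in for buildAttr; the real helper in
// send_otlp_span.cjs may differ. The shape is OTLP/JSON KeyValue.
function buildAttr(key, value) {
  return { key, value: { stringValue: String(value) } };
}

const GENERATION_ATTR = buildAttr("langfuse.observation.type", "generation");

// Hypothetical helper: returns the attribute lists for both spans.
// hasAgentSpan is an assumed flag; the real code may detect this differently.
function markGeneration(attributes, usageAttrs, hasAgentSpan) {
  const conclusion = [...attributes];
  if (hasAgentSpan) {
    // Dedicated agent span: usage and the generation marker live there.
    return { agent: [...attributes, ...usageAttrs, GENERATION_ATTR], conclusion };
  }
  if (usageAttrs.length > 0) {
    // Fallback: the conclusion span carries usage, so it gets the marker too.
    conclusion.push(...usageAttrs, GENERATION_ATTR);
  }
  return { agent: null, conclusion };
}
```

Keeping the marker next to the usage attributes means whichever span reports tokens is also the one Langfuse treats as a generation.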
B. Opt-in via frontmatter: new field
observability:
  otlp:
    endpoint: ...
  langfuse:
    capture-content: true   # default false
    redact-patterns:        # optional
      - 'ghp_[A-Za-z0-9]{36,}'
      - 'sk-[A-Za-z0-9]{20,}'
When capture-content: true:
- Add a post-agent step that reads /tmp/gh-aw/agent_output.json and the resolved prompt body, applies sanitizeOTLPPayload-style redaction plus user-supplied redact-patterns, and emits a child span carrying langfuse.observation.input and langfuse.observation.output attributes.
- The default for capture-content must be false because prompts can carry repo paths, issue bodies, or secrets.
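A rough sketch of the opt-in gate and the user-pattern redaction described above, under stated assumptions: redact and buildContentAttributes are hypothetical names, the config object is assumed to mirror the langfuse frontmatter block, and the built-in sanitizeOTLPPayload-style pass is not reproduced here (only the user-supplied redact-patterns are applied).

```javascript
// Hypothetical helper: replaces every match of each user-supplied pattern.
// The built-in sanitizeOTLPPayload-style redaction would run as well;
// it is omitted because its behavior is not shown in this issue.
function redact(text, patterns = []) {
  return patterns.reduce(
    (acc, pattern) => acc.replace(new RegExp(pattern, "g"), "[REDACTED]"),
    text
  );
}

// Hypothetical helper: returns content attributes for the child span,
// or an empty list unless capture-content is explicitly true.
function buildContentAttributes(langfuseConfig, prompt, output) {
  if (!langfuseConfig || langfuseConfig["capture-content"] !== true) {
    return [];
  }
  const patterns = langfuseConfig["redact-patterns"] || [];
  return [
    {
      key: "langfuse.observation.input",
      value: { stringValue: redact(prompt, patterns) },
    },
    {
      key: "langfuse.observation.output",
      value: { stringValue: redact(output, patterns) },
    },
  ];
}
```

Checking for a strict true keeps the default safe: an unset, false, or malformed capture-content value yields no content attributes at all.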
Acceptance criteria
- (A) Agent / agent-conclusion spans always carry langfuse.observation.type = "generation". Langfuse trace detail renders model name, tokens, cost without configuration. Tests cover both the dedicated-agent-span path and the fallback (conclusion-span-carries-usage) path.
- (B) When capture-content: true, post-agent step writes a child span with langfuse.observation.input / output populated. Redaction is applied first.
- When capture-content is unset or false, no content attributes are emitted (regression test).
- Documentation in docs/ describes both options and warns about content sensitivity.
References
- actions/setup/js/send_otlp_span.cjs → sanitizeOTLPPayload