Fix #320: capture prompt + completion content on provider patches by anilmurty · Pull Request #321 · Metabuilder-Labs/tokenjam

anilmurty · 2026-06-26T03:15:29Z

The full-request capture (#209) was incomplete for the provider monkey-patches: with capture.prompts = true, a patch_anthropic / patch_openai / patch_gemini / patch_bedrock LLM span got request_params populated but not gen_ai.prompt.content (the messages) or gen_ai.completion.content. Only litellm.py and sdk/agent.py set them. A span without the prompt isn't self-contained enough to replay — the whole point of #209.

Summary

Added shared record_prompt_content(span, messages) / record_completion_content(span, text) helpers to _request_capture.py, alongside record_request_params/record_request_tools, using the GenAIAttributes.PROMPT_CONTENT / COMPLETION_CONTENT semconv constants (Rule 10).
record_prompt_content serializes as json.dumps(messages) — the exact shape litellm/sdk/agent.py already use, so a replay harness / backfill reading gen_ai.prompt.content gets one consistent serialization regardless of which patch produced the span.
Wired both into all four provider patches, set unconditionally on the LLM-call span. strip_captured_content() at ingest already gates them per the [capture] prompts / completions toggles — same pattern litellm uses; no ingest-gate change, no default change (capture stays off by default).

Per provider

anthropic: messages kwarg + response content text blocks (joined; tool_use blocks skipped). Streaming path captures the prompt; completion there would need stream buffering (out of scope, noted below).
openai: messages kwarg + choices[0].message.content.
gemini: contents (first positional arg or kwarg) + response.text, on both the sync and async generate_content patches.
bedrock: messages/prompt/inputText from the JSON body; completion from the already-parsed response body inside _extract_bedrock_usage (avoids a second read of the response stream).

Also in this PR (the optional refactor)

Refactored litellm.py::_record_prompt_content + the non-streaming completion setter to call the shared helpers, so there's one prompt serialization across every capture path. Dropped the now-duplicate _serialize_prompt (its messages-positional / text-prompt fallbacks are preserved in _record_prompt_content). Behavior-preserving — all existing litellm tests pass.

Tests

tests/unit/test_provider_content_capture.py:

the shared helpers (incl. json.dumps(messages) shape, None no-ops, non-serialisable fallback);
each provider's completion/contents extraction (SDK-independent);
anthropic + openai patches end-to-end through a fake upstream — asserting the span carries PROMPT_CONTENT == json.dumps(messages) and COMPLETION_CONTENT;
gemini + bedrock via the exact capture calls their patches make (their SDKs aren't installed in CI, so the install path early-returns);
the ingest gate: both stripped when [capture] is off, kept when on, and gate independently (prompts on / completions off).

Full suite: 1173 passed. ruff + mypy clean on all touched files.

What's NOT in this PR

Completion content on streaming paths (anthropic/openai stream wrappers): the wrappers don't aggregate the streamed assistant text, so capturing it would mean buffering the stream — a separate concern. The prompt is captured on streaming paths (it's known up front); only streaming completion is deferred.
Framework patches (langchain / langgraph / crewai / autogen / llamaindex) — explicitly out of scope for this issue.

Closes #320

🤖 Generated with Claude Code

The full-request capture (#209) was incomplete for the provider monkey-patches. With capture.prompts = true, a patch_anthropic / patch_openai / patch_gemini / patch_bedrock LLM span got request_params but NOT gen_ai.prompt.content (the messages) or gen_ai.completion.content — only litellm.py and sdk/agent.py set them. A span without the prompt isn't self-contained enough to replay, which is the whole point of #209. Add shared record_prompt_content / record_completion_content helpers to _request_capture.py (using the GenAIAttributes semconv constants, Rule 10). record_prompt_content serializes as json.dumps(messages) — the SAME shape every other capture path uses — so a replay harness / backfill reading gen_ai.prompt.content gets one consistent serialization regardless of which patch produced the span. Each provider patch now pulls its request messages and response text per that provider's shape and records both UNCONDITIONALLY on the LLM-call span; strip_captured_content() at ingest already gates them per the [capture] prompts / completions toggles, so defaults are unchanged (capture off). Per provider: - anthropic: messages kwarg + response content text blocks (joined; tool_use blocks skipped). Streaming path captures the prompt; completion there would need stream buffering (out of scope). - openai: messages kwarg + choices[0].message.content. - gemini: contents (first positional arg or kwarg) + response.text, on both the sync and async generate_content patches. - bedrock: messages/prompt/inputText from the JSON body; completion from the already-parsed response body in _extract_bedrock_usage (no second read of the response stream). Also refactored litellm.py to call the shared helpers so there's ONE prompt serialization across every path (dropped the now-duplicate _serialize_prompt). Framework patches (langchain etc.) are untouched, out of scope for this issue. Tests (tests/unit/test_provider_content_capture.py): the shared helpers, each provider's extraction, the anthropic + openai patches end-to-end through a fake upstream (asserting PROMPT_CONTENT == json.dumps(messages) + COMPLETION_CONTENT), the gemini/bedrock capture calls (their SDKs aren't installed in CI), and the ingest gate stripping both when [capture] is off / keeping them when on. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

anilmurty

Verified against #320, with attention on the one contract that matters most: the serialization. record_prompt_content writes exactly json.dumps(messages), the same shape litellm and sdk/agent already use, with a test asserting it — so a span from any capture path now reads back identically, which is what a replay harness / backfill depends on. Shared helpers in _request_capture.py, semconv constants (Rule 10), wired into all four provider patches, set unconditionally with strip_captured_content gating at ingest (no default change). The litellm consolidation to one serialization is a clean bonus, and the scope discipline is good — streaming completion and framework patches explicitly deferred. Comprehensive tests (helpers, per-provider extraction, anthropic/openai end-to-end, the ingest gate's independent toggles), 1173 passed, ruff + mypy clean, all four CI jobs green.

Ready to merge.

anilmurty commented Jun 26, 2026

View reviewed changes

anilmurty merged commit 21fc84c into main Jun 26, 2026
4 checks passed

anilmurty deleted the fix/320-prompt-completion-capture branch June 26, 2026 03:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix #320: capture prompt + completion content on provider patches#321

Fix #320: capture prompt + completion content on provider patches#321
anilmurty merged 1 commit into
mainfrom
fix/320-prompt-completion-capture

anilmurty commented Jun 26, 2026

Uh oh!

anilmurty left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

anilmurty commented Jun 26, 2026

Summary

Per provider

Also in this PR (the optional refactor)

Tests

What's NOT in this PR

Uh oh!

anilmurty left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant