Fix #320: capture prompt + completion content on provider patches#321
Conversation
The full-request capture (#209) was incomplete for the provider monkey-patches. With capture.prompts = true, a patch_anthropic / patch_openai / patch_gemini / patch_bedrock LLM span got request_params but NOT gen_ai.prompt.content (the messages) or gen_ai.completion.content — only litellm.py and sdk/agent.py set them. A span without the prompt isn't self-contained enough to replay, which is the whole point of #209. Add shared record_prompt_content / record_completion_content helpers to _request_capture.py (using the GenAIAttributes semconv constants, Rule 10). record_prompt_content serializes as json.dumps(messages) — the SAME shape every other capture path uses — so a replay harness / backfill reading gen_ai.prompt.content gets one consistent serialization regardless of which patch produced the span. Each provider patch now pulls its request messages and response text per that provider's shape and records both UNCONDITIONALLY on the LLM-call span; strip_captured_content() at ingest already gates them per the [capture] prompts / completions toggles, so defaults are unchanged (capture off). Per provider: - anthropic: messages kwarg + response content text blocks (joined; tool_use blocks skipped). Streaming path captures the prompt; completion there would need stream buffering (out of scope). - openai: messages kwarg + choices[0].message.content. - gemini: contents (first positional arg or kwarg) + response.text, on both the sync and async generate_content patches. - bedrock: messages/prompt/inputText from the JSON body; completion from the already-parsed response body in _extract_bedrock_usage (no second read of the response stream). Also refactored litellm.py to call the shared helpers so there's ONE prompt serialization across every path (dropped the now-duplicate _serialize_prompt). Framework patches (langchain etc.) are untouched, out of scope for this issue. Tests (tests/unit/test_provider_content_capture.py): the shared helpers, each provider's extraction, the anthropic + openai patches end-to-end through a fake upstream (asserting PROMPT_CONTENT == json.dumps(messages) + COMPLETION_CONTENT), the gemini/bedrock capture calls (their SDKs aren't installed in CI), and the ingest gate stripping both when [capture] is off / keeping them when on. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
anilmurty
left a comment
There was a problem hiding this comment.
Verified against #320, with attention on the one contract that matters most: the serialization. record_prompt_content writes exactly json.dumps(messages), the same shape litellm and sdk/agent already use, with a test asserting it — so a span from any capture path now reads back identically, which is what a replay harness / backfill depends on. Shared helpers in _request_capture.py, semconv constants (Rule 10), wired into all four provider patches, set unconditionally with strip_captured_content gating at ingest (no default change). The litellm consolidation to one serialization is a clean bonus, and the scope discipline is good — streaming completion and framework patches explicitly deferred. Comprehensive tests (helpers, per-provider extraction, anthropic/openai end-to-end, the ingest gate's independent toggles), 1173 passed, ruff + mypy clean, all four CI jobs green.
Ready to merge.
The full-request capture (#209) was incomplete for the provider monkey-patches: with
capture.prompts = true, apatch_anthropic/patch_openai/patch_gemini/patch_bedrockLLM span gotrequest_paramspopulated but notgen_ai.prompt.content(the messages) orgen_ai.completion.content. Onlylitellm.pyandsdk/agent.pyset them. A span without the prompt isn't self-contained enough to replay — the whole point of #209.Summary
record_prompt_content(span, messages)/record_completion_content(span, text)helpers to_request_capture.py, alongsiderecord_request_params/record_request_tools, using theGenAIAttributes.PROMPT_CONTENT/COMPLETION_CONTENTsemconv constants (Rule 10).record_prompt_contentserializes asjson.dumps(messages)— the exact shape litellm/sdk/agent.pyalready use, so a replay harness / backfill readinggen_ai.prompt.contentgets one consistent serialization regardless of which patch produced the span.strip_captured_content()at ingest already gates them per the[capture] prompts/completionstoggles — same pattern litellm uses; no ingest-gate change, no default change (capture stays off by default).Per provider
messageskwarg + responsecontenttext blocks (joined;tool_useblocks skipped). Streaming path captures the prompt; completion there would need stream buffering (out of scope, noted below).messageskwarg +choices[0].message.content.contents(first positional arg or kwarg) +response.text, on both the sync and asyncgenerate_contentpatches.messages/prompt/inputTextfrom the JSONbody; completion from the already-parsed response body inside_extract_bedrock_usage(avoids a second read of the response stream).Also in this PR (the optional refactor)
Refactored
litellm.py::_record_prompt_content+ the non-streaming completion setter to call the shared helpers, so there's one prompt serialization across every capture path. Dropped the now-duplicate_serialize_prompt(itsmessages-positional / text-promptfallbacks are preserved in_record_prompt_content). Behavior-preserving — all existing litellm tests pass.Tests
tests/unit/test_provider_content_capture.py:json.dumps(messages)shape,Noneno-ops, non-serialisable fallback);PROMPT_CONTENT == json.dumps(messages)andCOMPLETION_CONTENT;[capture]is off, kept when on, and gate independently (prompts on / completions off).Full suite: 1173 passed.
ruff+mypyclean on all touched files.What's NOT in this PR
Closes #320
🤖 Generated with Claude Code