Skip to content

Fix #320: capture prompt + completion content on provider patches#321

Merged
anilmurty merged 1 commit into
mainfrom
fix/320-prompt-completion-capture
Jun 26, 2026
Merged

Fix #320: capture prompt + completion content on provider patches#321
anilmurty merged 1 commit into
mainfrom
fix/320-prompt-completion-capture

Conversation

@anilmurty

Copy link
Copy Markdown
Contributor

The full-request capture (#209) was incomplete for the provider monkey-patches: with capture.prompts = true, a patch_anthropic / patch_openai / patch_gemini / patch_bedrock LLM span got request_params populated but not gen_ai.prompt.content (the messages) or gen_ai.completion.content. Only litellm.py and sdk/agent.py set them. A span without the prompt isn't self-contained enough to replay — the whole point of #209.

Summary

  • Added shared record_prompt_content(span, messages) / record_completion_content(span, text) helpers to _request_capture.py, alongside record_request_params/record_request_tools, using the GenAIAttributes.PROMPT_CONTENT / COMPLETION_CONTENT semconv constants (Rule 10).
  • record_prompt_content serializes as json.dumps(messages) — the exact shape litellm/sdk/agent.py already use, so a replay harness / backfill reading gen_ai.prompt.content gets one consistent serialization regardless of which patch produced the span.
  • Wired both into all four provider patches, set unconditionally on the LLM-call span. strip_captured_content() at ingest already gates them per the [capture] prompts / completions toggles — same pattern litellm uses; no ingest-gate change, no default change (capture stays off by default).

Per provider

  • anthropic: messages kwarg + response content text blocks (joined; tool_use blocks skipped). Streaming path captures the prompt; completion there would need stream buffering (out of scope, noted below).
  • openai: messages kwarg + choices[0].message.content.
  • gemini: contents (first positional arg or kwarg) + response.text, on both the sync and async generate_content patches.
  • bedrock: messages/prompt/inputText from the JSON body; completion from the already-parsed response body inside _extract_bedrock_usage (avoids a second read of the response stream).

Also in this PR (the optional refactor)

Refactored litellm.py::_record_prompt_content + the non-streaming completion setter to call the shared helpers, so there's one prompt serialization across every capture path. Dropped the now-duplicate _serialize_prompt (its messages-positional / text-prompt fallbacks are preserved in _record_prompt_content). Behavior-preserving — all existing litellm tests pass.

Tests

tests/unit/test_provider_content_capture.py:

  • the shared helpers (incl. json.dumps(messages) shape, None no-ops, non-serialisable fallback);
  • each provider's completion/contents extraction (SDK-independent);
  • anthropic + openai patches end-to-end through a fake upstream — asserting the span carries PROMPT_CONTENT == json.dumps(messages) and COMPLETION_CONTENT;
  • gemini + bedrock via the exact capture calls their patches make (their SDKs aren't installed in CI, so the install path early-returns);
  • the ingest gate: both stripped when [capture] is off, kept when on, and gate independently (prompts on / completions off).

Full suite: 1173 passed. ruff + mypy clean on all touched files.

What's NOT in this PR

  • Completion content on streaming paths (anthropic/openai stream wrappers): the wrappers don't aggregate the streamed assistant text, so capturing it would mean buffering the stream — a separate concern. The prompt is captured on streaming paths (it's known up front); only streaming completion is deferred.
  • Framework patches (langchain / langgraph / crewai / autogen / llamaindex) — explicitly out of scope for this issue.

Closes #320

🤖 Generated with Claude Code

The full-request capture (#209) was incomplete for the provider monkey-patches.
With capture.prompts = true, a patch_anthropic / patch_openai / patch_gemini /
patch_bedrock LLM span got request_params but NOT gen_ai.prompt.content (the
messages) or gen_ai.completion.content — only litellm.py and sdk/agent.py set
them. A span without the prompt isn't self-contained enough to replay, which is
the whole point of #209.

Add shared record_prompt_content / record_completion_content helpers to
_request_capture.py (using the GenAIAttributes semconv constants, Rule 10).
record_prompt_content serializes as json.dumps(messages) — the SAME shape every
other capture path uses — so a replay harness / backfill reading
gen_ai.prompt.content gets one consistent serialization regardless of which
patch produced the span. Each provider patch now pulls its request messages and
response text per that provider's shape and records both UNCONDITIONALLY on the
LLM-call span; strip_captured_content() at ingest already gates them per the
[capture] prompts / completions toggles, so defaults are unchanged (capture off).

Per provider:
- anthropic: messages kwarg + response content text blocks (joined; tool_use
  blocks skipped). Streaming path captures the prompt; completion there would
  need stream buffering (out of scope).
- openai: messages kwarg + choices[0].message.content.
- gemini: contents (first positional arg or kwarg) + response.text, on both the
  sync and async generate_content patches.
- bedrock: messages/prompt/inputText from the JSON body; completion from the
  already-parsed response body in _extract_bedrock_usage (no second read of the
  response stream).

Also refactored litellm.py to call the shared helpers so there's ONE prompt
serialization across every path (dropped the now-duplicate _serialize_prompt).
Framework patches (langchain etc.) are untouched, out of scope for this issue.

Tests (tests/unit/test_provider_content_capture.py): the shared helpers, each
provider's extraction, the anthropic + openai patches end-to-end through a fake
upstream (asserting PROMPT_CONTENT == json.dumps(messages) + COMPLETION_CONTENT),
the gemini/bedrock capture calls (their SDKs aren't installed in CI), and the
ingest gate stripping both when [capture] is off / keeping them when on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@anilmurty anilmurty left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified against #320, with attention on the one contract that matters most: the serialization. record_prompt_content writes exactly json.dumps(messages), the same shape litellm and sdk/agent already use, with a test asserting it — so a span from any capture path now reads back identically, which is what a replay harness / backfill depends on. Shared helpers in _request_capture.py, semconv constants (Rule 10), wired into all four provider patches, set unconditionally with strip_captured_content gating at ingest (no default change). The litellm consolidation to one serialization is a clean bonus, and the scope discipline is good — streaming completion and framework patches explicitly deferred. Comprehensive tests (helpers, per-provider extraction, anthropic/openai end-to-end, the ingest gate's independent toggles), 1173 passed, ruff + mypy clean, all four CI jobs green.

Ready to merge.

@anilmurty anilmurty merged commit 21fc84c into main Jun 26, 2026
4 checks passed
@anilmurty anilmurty deleted the fix/320-prompt-completion-capture branch June 26, 2026 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[capture] Provider patches capture #209 request_params but not the #195 prompt content (spans not replayable)

1 participant