[internal copy of #29550] fix: passthrough endpoints duplicate logs by mateo-berri · Pull Request #29598 · BerriAI/litellm

mateo-berri · 2026-06-03T16:56:44Z

Automated copy of #29550 into litellm_internal_staging for pr-babysitter.

Original head: mubashir1osmani/litellm:litellm__fix_passthrough_stream_dedup @ 57311e4df3d2

Two bugs caused _PROXY_track_cost_callback to see stream=True + complete_streaming_response=None on every streaming pass-through request, making the dedup guard in dispatch_success_handlers permanently inactive:

1. pass_through_endpoints.py created the Logging object with stream=False for all requests. _is_assembled_stream_success short-circuits on self.stream is not True, so has_dispatched_final_stream_success was never set and any second dispatch went through unchecked. Fix: set logging_obj.stream = True after stream detection.

2. _create_anthropic_response_logging_payload set complete_streaming_response inside the try block after litellm.completion_cost(), so a pricing error caused an early return without setting it on model_call_details. Fix: set complete_streaming_response before the try block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
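The interaction between the stream flag and the dedup guard can be illustrated with a small model. This is only a sketch built from the description above: `SketchLogging` and `run` are hypothetical names, and the method bodies are assumptions about the behaviour, not litellm's actual implementation.

```python
from typing import Any, Dict, List


class SketchLogging:
    """Hypothetical stand-in for the logging object; not litellm's real class."""

    def __init__(self, stream: bool) -> None:
        self.stream = stream
        self.model_call_details: Dict[str, Any] = {}
        self.has_dispatched_final_stream_success = False
        self.dispatched: List[str] = []

    def _is_assembled_stream_success(self) -> bool:
        # Assumed behaviour: the guard short-circuits unless stream is True.
        if self.stream is not True:
            return False
        return self.model_call_details.get("complete_streaming_response") is not None

    def dispatch_success_handlers(self, label: str) -> None:
        if self._is_assembled_stream_success():
            if self.has_dispatched_final_stream_success:
                return  # a second final dispatch is dropped
            self.has_dispatched_final_stream_success = True
        self.dispatched.append(label)


def run(stream_flag: bool) -> List[str]:
    """Dispatch twice and report which dispatches went through."""
    obj = SketchLogging(stream=stream_flag)
    obj.model_call_details["complete_streaming_response"] = {"id": "msg_1"}
    obj.dispatch_success_handlers("first")
    obj.dispatch_success_handlers("second")
    return obj.dispatched


print(run(False))  # stream flag never set: both dispatches go through
print(run(True))   # stream flag set: only the first dispatch goes through
```

In this model, leaving `stream` at False makes the guard a no-op, which matches the duplicate-log symptom the PR describes.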

greptile-apps · 2026-06-03T16:59:01Z

Greptile Summary

This PR fixes duplicate success-callback dispatches for streaming Anthropic passthrough endpoints by ensuring logging_obj.stream is set to True before chunk_processor is invoked, and by recording complete_streaming_response on model_call_details early so the dedup guard in dispatch_success_handlers can fire even when cost calculation raises.

pass_through_endpoints.py: sets logging_obj.stream = True in both the explicit-stream=True branch and the SSE-fallback branch, activating _is_assembled_stream_success's dedup guard.
anthropic_passthrough_logging_handler.py: assigns complete_streaming_response to model_call_details before entering the try block so the key is present even when pricing fails — but does this unconditionally for all calls, not only streaming ones.
Four new regression tests document the pre/post-fix behaviour of the guard; the existing unit test mock is updated to initialise model_call_details as a dict.

Confidence Score: 4/5

Safe to merge for the streaming dedup fix; the non-streaming Anthropic passthrough path now silently drops sync callbacks (langfuse, s3, etc.) as a side-effect.

_create_anthropic_response_logging_payload is called for both streaming and non-streaming Anthropic responses. The unconditional complete_streaming_response write causes success_handler to return early (line 2127 in litellm_logging.py) on every non-streaming call, so any sync callback registered via success_callback = ["langfuse"] or "s3" will stop firing for non-streaming Anthropic passthrough requests. The intended fix is scoped to streaming calls only.

litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py — the early complete_streaming_response write needs a logging_obj.stream is True guard.

Important Files Changed

Filename	Overview
litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py	Adds early `complete_streaming_response` assignment in `_create_anthropic_response_logging_payload` — applied unconditionally to both streaming and non-streaming calls, which will suppress sync callbacks (langfuse, s3) for non-streaming Anthropic passthrough requests.
litellm/proxy/pass_through_endpoints/pass_through_endpoints.py	Sets `logging_obj.stream = True` in both the explicit-stream branch and the SSE-fallback branch; straightforward and correct fix for the dedup guard precondition.
tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_anthropic_passthrough_logging_handler.py	Adds `TestStreamFalseDeduplication` class with four focused regression tests covering the dedup guard behaviour pre/post fix; tests are mock-only and well-structured.
tests/test_litellm/proxy/pass_through_endpoints/test_pass_through_endpoints.py	Adds two integration-level regression tests that confirm `logging_obj.stream` and `model_call_details["stream"]` are set correctly for both explicit-stream and SSE-fallback code paths.
tests/pass_through_unit_tests/test_unit_test_anthropic_pass_through.py	Adds `model_call_details = {}` initialisation to the existing mock so the new `_create_anthropic_response_logging_payload` write doesn't raise a `TypeError`; minimal and correct.

Reviews (6): Last reviewed commit: "fix(pass_through): gate complete_streami..."

…s dict The anthropic passthrough logging payload now records the assembled response on model_call_details before cost calculation, which requires model_call_details to support item assignment. In production it is always a dict; the existing unit test stubbed the logging object with a bare Mock whose attribute is not subscriptable, so the new assignment raised TypeError. Use a real dict to match the production logging object.
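The failure mode described here is standard `unittest.mock` behaviour and can be reproduced in isolation. In the sketch below, `record_response` is a hypothetical helper standing in for the new assignment; only `Mock` itself is real library API.

```python
from unittest.mock import Mock


def record_response(logging_obj, response):
    # Hypothetical stand-in for the new write on model_call_details.
    logging_obj.model_call_details["complete_streaming_response"] = response


# A bare Mock auto-creates model_call_details as another Mock,
# which does not support item assignment.
bare = Mock()
try:
    record_response(bare, {"id": "msg_1"})
    bare_error = None
except TypeError as exc:
    bare_error = exc

# Giving the mock a real dict matches the production object's shape.
fixed = Mock()
fixed.model_call_details = {}
record_response(fixed, {"id": "msg_1"})

print(type(bare_error).__name__)
print(fixed.model_call_details)
```

`MagicMock` would have accepted the assignment silently, but a real dict keeps the test closer to what production code sees.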

mateo-berri · 2026-06-03T17:10:31Z

@greptileai

codecov · 2026-06-03T17:13:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


…_stream_dedup

mateo-berri · 2026-06-03T17:32:37Z

@greptileai

The streaming branch of pass_through_request that marks the logging object as streaming (logging_obj.stream and model_call_details["stream"]) had no unit coverage, so the patch coverage gate flagged it. Add a regression test that drives a streaming pass-through request through pass_through_request and asserts the logging object is flagged as a stream before dispatch.

mateo-berri · 2026-06-03T17:51:46Z

@greptileai

veria-ai · 2026-06-03T18:13:52Z

PR overview

All previously flagged issues have been addressed. No open security concerns remain on this pull request.

Security review

No open security issues remain on this pull request.

Fixed/addressed: 1 · PR risk: 0/10

The auto-detected streaming branch of pass_through_request (when a request that was not flagged as streaming returns a text/event-stream response) sets logging_obj.stream and model_call_details["stream"] but had no unit coverage, so the codecov patch gate failed at 60%. Drive a non-streaming pass-through request whose upstream response is SSE through pass_through_request and assert the logging object is flagged as a stream before dispatch.
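A rough sketch of what the two covered branches do, and how a test might assert on them, follows. The helper names `is_sse_response` and `mark_stream` are hypothetical, the header lookup assumes lower-case keys, and the real logic lives inside pass_through_request and may be structured differently.

```python
from types import SimpleNamespace
from typing import Mapping


def is_sse_response(headers: Mapping[str, str]) -> bool:
    # Assumption: headers are keyed in lower case; parameters after ';' are ignored.
    content_type = headers.get("content-type", "")
    return content_type.split(";")[0].strip().lower() == "text/event-stream"


def mark_stream(logging_obj, requested_stream: bool, headers: Mapping[str, str]) -> bool:
    """Flag the logging object when the request or the response is a stream."""
    is_stream = requested_stream or is_sse_response(headers)
    if is_stream:
        logging_obj.stream = True
        logging_obj.model_call_details["stream"] = True
    return is_stream


# Explicit-stream branch: the request itself asked for streaming.
explicit = SimpleNamespace(stream=False, model_call_details={})
mark_stream(explicit, True, {"content-type": "application/json"})

# SSE-fallback branch: the request was not flagged, but the upstream replied with SSE.
fallback = SimpleNamespace(stream=False, model_call_details={})
mark_stream(fallback, False, {"content-type": "text/event-stream; charset=utf-8"})

# Plain JSON response to a non-streaming request: nothing is flagged.
plain = SimpleNamespace(stream=False, model_call_details={})
mark_stream(plain, False, {"content-type": "application/json"})
```

The regression tests in the PR assert the same two facts on the real logging object: `stream` is True and `model_call_details["stream"]` is True before dispatch.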

mateo-berri · 2026-06-03T18:31:28Z

@greptileai

perform_redaction only scrubs complete_streaming_response when model_call_details["stream"] is True. Setting it unconditionally for non-streaming Anthropic pass-through responses left the assembled response unredacted in model_call_details, which is handed to logging callbacks as kwargs when message logging is disabled. Only record it for actual streaming responses so redaction always applies.
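The redaction gap can be shown with a simplified model. `redact_sketch` and `record_response` are hypothetical functions, and the `"[REDACTED]"` placeholder is an assumption; the sketch only mirrors the condition stated above, that the field is scrubbed when `stream` is True.

```python
from typing import Any, Dict


def redact_sketch(details: Dict[str, Any]) -> Dict[str, Any]:
    """Assumed behaviour: scrub complete_streaming_response only for streams."""
    redacted = dict(details)
    if redacted.get("stream") is True and "complete_streaming_response" in redacted:
        redacted["complete_streaming_response"] = "[REDACTED]"
    return redacted


def record_response(details: Dict[str, Any], response: Any, gated: bool) -> None:
    # gated=False models the unconditional write; gated=True models the fix.
    if not gated or details.get("stream") is True:
        details["complete_streaming_response"] = response


secret = {"content": "user message text"}

# Unconditional write on a non-streaming call: the response survives redaction.
leaky: Dict[str, Any] = {"stream": False}
record_response(leaky, secret, gated=False)
leaky_out = redact_sketch(leaky)

# Gated write on a non-streaming call: the field is never recorded.
safe: Dict[str, Any] = {"stream": False}
record_response(safe, secret, gated=True)
safe_out = redact_sketch(safe)

# Gated write on a streaming call: the field is recorded, then scrubbed.
streamed: Dict[str, Any] = {"stream": True}
record_response(streamed, secret, gated=True)
streamed_out = redact_sketch(streamed)
```

Gating the write on the same flag that the redaction step checks keeps the two conditions aligned, so a recorded response is always one that redaction will cover.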

mateo-berri · 2026-06-03T18:34:50Z

@greptileai

greptile-apps · 2026-06-03T18:38:20Z

+        # Only record complete_streaming_response for actual streaming responses.
+        # perform_redaction scrubs this field only when stream is True, so setting
+        # it on a non-streaming response would bypass message redaction.
+        if logging_obj.model_call_details.get("stream") is True:
+            logging_obj.model_call_details["complete_streaming_response"] = (
+                litellm_model_response


complete_streaming_response set unconditionally, breaks sync callbacks for non-streaming calls

_create_anthropic_response_logging_payload is called for both streaming and non-streaming Anthropic passthrough responses. Setting complete_streaming_response in model_call_details unconditionally causes success_handler to hit its early-return guard at line 2127 (if "complete_streaming_response" in self.model_call_details: return) even for non-streaming calls. Any user with sync callbacks configured (e.g. success_callback = ["langfuse"] or "s3") would have those callbacks silently skipped for all non-streaming Anthropic passthrough requests, because the executor-submitted success_handler exits before iterating the callback list. The guard at line 1635 (if self.stream is not True: return False) prevents _is_assembled_stream_success from misfiring, but success_handler's early-return at 2127 is unconditional. The assignment should be guarded so it only fires for streaming calls.

This is already addressed on HEAD in a5b0053. The complete_streaming_response write in _create_anthropic_response_logging_payload is no longer unconditional; it is now gated on logging_obj.model_call_details.get("stream") is True (line 120). For non-streaming Anthropic pass-through requests, model_call_details["stream"] is never set to True (only the two streaming branches in pass_through_endpoints.py set it), so the key is never written for non-streaming calls, success_handler's early return at line 2127 is not taken, and sync callbacks such as langfuse and s3 keep firing. The guard scopes the assignment to streaming calls exactly as suggested, so there is no regression for non-streaming requests.
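The early-return behaviour discussed in this thread can be modelled in a few lines. `success_handler_sketch` is a hypothetical function that returns the callbacks it would run; it reflects only the condition quoted in the review comment, not the full success_handler.

```python
from typing import Any, Dict, List


def success_handler_sketch(details: Dict[str, Any], callbacks: List[str]) -> List[str]:
    """Return the sync callbacks that would run for this call."""
    # Assumed guard, as quoted in the review: exit early when the key is present.
    if "complete_streaming_response" in details:
        return []
    return list(callbacks)


callbacks = ["langfuse", "s3"]

# Non-streaming call with the unconditional write: callbacks are skipped.
unconditional = {"complete_streaming_response": {"id": "msg_1"}}
skipped = success_handler_sketch(unconditional, callbacks)

# Non-streaming call after the fix: the key is absent, so callbacks run.
gated: Dict[str, Any] = {}
fired = success_handler_sketch(gated, callbacks)

print(skipped)
print(fired)
```

Because the guard checks for key presence rather than the stream flag, the fix has to avoid writing the key at all for non-streaming calls.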

* fix duplicate cost callbacks for anthropic streaming pass-through

  Two bugs caused _PROXY_track_cost_callback to see stream=True + complete_streaming_response=None on every streaming pass-through request, making the dedup guard in dispatch_success_handlers permanently inactive:

  1. pass_through_endpoints.py created the Logging object with stream=False for all requests. _is_assembled_stream_success short-circuits on self.stream is not True, so has_dispatched_final_stream_success was never set and any second dispatch went through unchecked. Fix: set logging_obj.stream = True after stream detection.

  2. _create_anthropic_response_logging_payload set complete_streaming_response inside the try block after litellm.completion_cost(), so a pricing error caused an early return without setting it on model_call_details. Fix: set complete_streaming_response before the try block.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix stream

* add stream to logging obj

* test(pass_through): give mock logging object a real model_call_details dict

  The anthropic passthrough logging payload now records the assembled response on model_call_details before cost calculation, which requires model_call_details to support item assignment. In production it is always a dict; the existing unit test stubbed the logging object with a bare Mock whose attribute is not subscriptable, so the new assignment raised TypeError. Use a real dict to match the production logging object.

* test(pass_through): cover streaming logging-obj stream flag

  The streaming branch of pass_through_request that marks the logging object as streaming (logging_obj.stream and model_call_details["stream"]) had no unit coverage, so the patch coverage gate flagged it. Add a regression test that drives a streaming pass-through request through pass_through_request and asserts the logging object is flagged as a stream before dispatch.

* test(pass_through): cover SSE-response stream flag fallback branch

  The auto-detected streaming branch of pass_through_request (when a request that was not flagged as streaming returns a text/event-stream response) sets logging_obj.stream and model_call_details["stream"] but had no unit coverage, so the codecov patch gate failed at 60%. Drive a non-streaming pass-through request whose upstream response is SSE through pass_through_request and assert the logging object is flagged as a stream before dispatch.

* fix(pass_through): gate complete_streaming_response on stream flag

  perform_redaction only scrubs complete_streaming_response when model_call_details["stream"] is True. Setting it unconditionally for non-streaming Anthropic pass-through responses left the assembled response unredacted in model_call_details, which is handed to logging callbacks as kwargs when message logging is disabled. Only record it for actual streaming responses so redaction always applies.

---------

Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

(cherry picked from commit 2bbdbfa)

mubashir1osmani and others added 4 commits June 2, 2026 17:51

fix stream

2468e16

add

2bbb786

add stream to logging obj

57311e4

Merge branch 'litellm_internal_staging' into litellm__fix_passthrough…

8e1aa10

…_stream_dedup

veria-ai Bot reviewed Jun 3, 2026


Comment thread .../proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py Outdated

greptile-apps Bot reviewed Jun 3, 2026


mateo-berri requested a review from yuneng-berri June 3, 2026 19:03

yuneng-berri approved these changes Jun 3, 2026


mateo-berri merged commit 2bbdbfa into litellm_internal_staging Jun 3, 2026
145 of 146 checks passed

mateo-berri deleted the litellm__fix_passthrough_stream_dedup branch June 3, 2026 19:13

mateo-berri merged 9 commits into litellm_internal_staging from litellm__fix_passthrough_stream_dedup


3 participants
