
fix: unblock SDK security dependency patches #336

Merged

declan-scale merged 1 commit into next from sec/relax-httpx-for-litellm-vuln on Apr 30, 2026

Conversation

@scale-ballen (Contributor) commented Apr 30, 2026

Summary

This PR updates agentex-sdk on the next branch so downstream repos can consume a 0.10.3 release that resolves the dependency blockers currently preventing security remediation.

The original blocker was agentex-sdk==0.10.2 requiring httpx>=0.27.2,<0.28, while patched litellm>=1.83.7 requires httpx==0.28.1. This PR relaxes the SDK bound and raises the litellm floor so downstream consumers can upgrade without dependency conflicts.

While reviewing older security work, I also found PR #289 was never merged. I ported the still-relevant pieces from that PR into this current next-based branch instead of cherry-picking it wholesale, because #289 was based on older code and would revert/remove a lot of current next work.

What Changed

Dependency and security remediation

  • Moved the httpx bound from >=0.27.2,<0.28 to >=0.28.1,<0.29 so the resolver can select the 0.28.1 release that patched litellm requires.
  • Raised litellm from >=1.83.0,<2 to >=1.83.7,<2.
  • Removed the tight fastapi<0.116 cap so the resolver can select patched Starlette-compatible FastAPI releases.
  • Added explicit starlette>=0.49.1 floor for CVE-2025-62727 remediation.
  • Added explicit tornado>=6.5.5 floor for CVE-2026-31958 remediation.
  • Regenerated uv.lock from the current next branch.
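
In pyproject.toml terms, the constraints above amount to roughly the following. This is an illustrative fragment reconstructed from the bullets, not the verbatim file:

```toml
dependencies = [
    "httpx>=0.28.1,<0.29",   # was >=0.27.2,<0.28
    "litellm>=1.83.7,<2",    # floor raised to the patched release
    "fastapi",               # <0.116 cap removed
    "starlette>=0.49.1",     # CVE-2025-62727 remediation floor
    "tornado>=6.5.5",        # CVE-2026-31958 remediation floor
]
```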

Resolved package versions in the lockfile:

| Package | Before | After | Why |
| --- | --- | --- | --- |
| httpx | 0.27.2 | 0.28.1 | Required by patched litellm |
| litellm | 1.83.0 | 1.83.7 | Security remediation floor |
| fastapi | 0.115.14 | 0.136.1 | Allows patched Starlette resolution |
| starlette | 0.46.2 | 1.0.0 | Fixes CVE-2025-62727 range |
| tornado | 6.5.2 | 6.5.5 | Fixes CVE-2026-31958 range |

Runtime compatibility fixes carried forward from #289

  • Replaced RequestIDMiddleware's BaseHTTPMiddleware implementation with pure ASGI middleware.
    • Why: this avoids Starlette BaseHTTPMiddleware streaming-response buffering/loss behavior, which matters now that Starlette can resolve to newer versions.
  • Updated sync and async send_message to consume message responses via streaming response handling.
    • Why: FastACP can emit newline-delimited JSON frames even for non-streaming message send flows; this preserves compatibility with that server behavior.
  • Made SendMessageStreamResponse.result optional.
    • Why: intermediate or terminal stream frames can have result: null; validating those as required caused client-side failures.
  • Switched OpenAI Agents Temporal hooks from workflow.execute_activity_method to workflow.execute_activity.
    • Why: execute_activity is the public API path used by the current Temporal SDK.
  • Updated tutorial test runner logic to use the locally built SDK wheel when available.
    • Why: tutorial CI should validate the wheel built by this PR rather than accidentally testing against the previously published package.
    • Implemented with a Bash array so paths with spaces do not break argument parsing.
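
For context on the middleware change, a pure-ASGI request-ID middleware can be sketched as follows. This is a hypothetical minimal version, not the SDK's actual implementation; it illustrates why raw ASGI avoids BaseHTTPMiddleware's buffering — response events are forwarded one at a time as they arrive, never collected into a buffered body:

```python
# Hypothetical sketch of a pure-ASGI request-ID middleware. The class name
# matches the PR; the body is illustrative only.
import uuid


class RequestIDMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Guard on scope type: only HTTP requests carry headers;
        # lifespan/websocket traffic passes straight through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Header lookup is bytes-based per the ASGI spec.
        headers = dict(scope.get("headers", []))
        request_id = headers.get(b"x-request-id", b"").decode() or str(uuid.uuid4())

        async def send_with_request_id(message):
            # Attach the ID on the response-start event; body chunks
            # (including streaming chunks) are forwarded untouched.
            if message["type"] == "http.response.start":
                message.setdefault("headers", []).append(
                    (b"x-request-id", request_id.encode())
                )
            await send(message)

        await self.app(scope, receive, send_with_request_id)
```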

SDK tracing surface fix

  • Added task_id to the generated span model, create/update params, span resource methods, and Agentex tracing processor calls.
    • Why: current handwritten tracing code already passes task_id, but the generated span types/resources did not expose it. CI surfaced this as pyright errors after dependency resolution started matching the declared pyproject.toml dependencies.

What Was Not Ported From #289

I did not merge #289 wholesale. That branch was based on older main/pre-next state and includes broad generated-code churn, tutorial changes, and reversions that do not apply cleanly to current next. This PR only carries forward the still-relevant security and runtime compatibility fixes.

Validation

Local validation:

  • uv lock --check
  • uv run --with ruff ruff check .
  • uv run --with pyright==1.1.399 --with nox --with respx --with dirty-equals --with httpx-aiohttp --with claude-agent-sdk==0.1.52 pyright
  • uv build
  • uv run --with pytest-xdist --with respx --with dirty-equals pytest
    • 724 passed, 961 skipped

GitHub CI after the latest push:

  • lint: passed
  • build: passed
  • test: passed
  • tutorial-agent workflow permission/discovery/deprecation jobs: passed
  • Socket project report: passed

Note: rye is not installed locally, so local validation used uv equivalents with the same pinned pyright version and CI-resolved dev packages.

Greptile Summary

This PR unblocks security dependency patches by relaxing httpx and fastapi bounds, raising the litellm floor, and adding explicit starlette and tornado floors to address two CVEs. It also ports runtime compatibility fixes from the never-merged PR #289 — replacing BaseHTTPMiddleware with pure ASGI middleware, switching send_message to streaming-response parsing, making SendMessageStreamResponse.result optional, migrating Temporal hooks to workflow.execute_activity, and surfacing task_id across the span types/resources/processor.

Confidence Score: 5/5

Safe to merge — all findings are P2 style/robustness suggestions with no blocking correctness issues.

The dependency updates are well-justified and the lockfile has been regenerated. The ASGI middleware rewrite, Temporal API migration, and span field additions are all correct. The only non-trivial behaviour change is the streaming-based send_message implementation, which is logically sound; the one flagged concern (silent empty-stream fallback) is a robustness improvement suggestion, not a blocking bug. CI (lint, build, test, tutorials) passes.

src/agentex/resources/agents.py — review the empty-stream fallback behaviour in the new send_message streaming implementation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agentex/resources/agents.py | sync/async send_message refactored to consume streaming responses; streaming parse logic silently skips unrecognised frames and falls back to constructing SendMessageResponse from collected task_messages |
| src/agentex/lib/sdk/fastacp/base/base_acp_server.py | RequestIDMiddleware replaced with pure ASGI middleware; correct bytes-based header lookup and scope-type guard |
| pyproject.toml | httpx upper bound raised to <0.29, fastapi upper bound removed, starlette>=0.49.1 and tornado>=6.5.5 floors added for CVE remediation, litellm floor raised to 1.83.7 |
| src/agentex/lib/core/temporal/plugins/openai_agents/hooks/hooks.py | Three execute_activity_method calls migrated to execute_activity (public Temporal SDK API) |
| src/agentex/types/agent_rpc_response.py | SendMessageStreamResponse.result made Optional[TaskMessageUpdate] to handle null intermediate/terminal frames |
| src/agentex/types/span.py | task_id Optional[str] field added to Span model |
| src/agentex/types/span_create_params.py | task_id Optional[str] field added to SpanCreateParams TypedDict |
| src/agentex/types/span_update_params.py | task_id Optional[str] field added to SpanUpdateParams TypedDict |
| src/agentex/resources/spans.py | task_id parameter added to create/update on both sync and async span resources |
| src/agentex/lib/core/tracing/processors/agentex_tracing_processor.py | task_id forwarded in both sync and async tracing processor create calls |
| examples/tutorials/run_agent_test.sh | Bash array used for pytest command so wheel path with spaces does not break argument parsing; CI runner path searched first, then local dist/ as fallback |
| tests/api_resources/test_spans.py | task_id added to create and update test calls for both sync and async span resources |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[send_message called] --> B{agent_id or agent_name?}
    B -->|agent_id| C[with_streaming_response.rpc]
    B -->|agent_name| D[with_streaming_response.rpc_by_name]
    B -->|neither| E[raise ValueError]
    C --> F[Open streaming context]
    D --> F
    F --> G[iter_lines]
    G --> H{line empty?}
    H -->|yes| G
    H -->|no| I["Strip data: prefix if SSE"]
    I --> J[json.loads line]
    J --> K{Validate as SendMessageResponse?}
    K -->|success| L[return immediately]
    K -->|ValidationError| M[Validate as SendMessageStreamResponse]
    M --> N{"result.type == full?"}
    N -->|yes| O[append parent_task_message]
    N -->|no| G
    O --> G
    M -->|ValidationError| G
    J -->|JSONDecodeError| G
    G -->|stream exhausted| P["return SendMessageResponse(id, jsonrpc, result=task_messages)"]
```
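
The flow can be sketched as a plain-dict parsing loop. This is hypothetical: the real code validates frames against the SDK's generated SendMessageResponse / SendMessageStreamResponse pydantic models rather than inspecting dicts:

```python
import json


def parse_stream(lines):
    """Hypothetical sketch of the frame-parsing loop above; dicts stand in
    for the SDK's generated response models."""
    response_meta = {}
    task_messages = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        if line.startswith("data:"):
            line = line[len("data:"):].strip()  # strip SSE prefix if present
        try:
            frame = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparseable frames
        result = frame.get("result")
        if isinstance(result, list):
            # complete non-streaming SendMessageResponse: return immediately
            return frame
        if isinstance(result, dict) and result.get("type") == "full":
            # a 'full' stream update: remember envelope fields, collect message
            response_meta = {"id": frame.get("id"), "jsonrpc": frame.get("jsonrpc")}
            task_messages.append(result.get("parent_task_message"))
        # null/partial results and unrecognised frames are skipped
    # stream exhausted: assemble a response from the collected messages
    return {**response_meta, "result": task_messages}
```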


### Issue 1 of 1
src/agentex/resources/agents.py:516-520
**Empty-stream fallback yields a silent no-op response**

If the server closes the stream without emitting any parseable frames (e.g. a transport error, an empty body, or all lines being skipped), `response_meta` will be `{}` and `task_messages` will be `[]`. The function then returns `SendMessageResponse(id=None, jsonrpc=None, result=[])` — an empty success — rather than surfacing the failure to the caller.

Consider raising when `response_meta` is still empty after exhausting the stream, so network-level failures are not silently swallowed.
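
A minimal sketch of that suggestion, assuming the fallback path is factored into a helper (the names here are hypothetical, not the SDK's, and the real code would return the generated SendMessageResponse model rather than a dict):

```python
class EmptyStreamError(RuntimeError):
    """Raised when the stream closes without any parseable frames."""


def finalize(response_meta, task_messages):
    # Hypothetical guard for the exhausted-stream path: raise instead of
    # returning an empty success when no frame was ever parsed, so
    # network-level failures are surfaced to the caller.
    if not response_meta and not task_messages:
        raise EmptyStreamError(
            "stream closed without any parseable SendMessage frames"
        )
    return {**response_meta, "result": task_messages}
```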

Reviews (4). Last reviewed commit: "fix: allow litellm security patch".

socket-security bot commented Apr 30, 2026

@scale-ballen force-pushed the sec/relax-httpx-for-litellm-vuln branch 2 times, most recently from 209dc1f to 42b9d17 on April 30, 2026 16:24
@scale-ballen changed the title from "fix: allow litellm security patch" to "fix: unblock SDK security dependency patches" Apr 30, 2026
@declan-scale (Contributor) commented Apr 30, 2026

Can you pull the latest next and rebase this branch off of next? The diff looks a little messed up

@scale-ballen force-pushed the sec/relax-httpx-for-litellm-vuln branch from 42b9d17 to 91d33eb on April 30, 2026 18:30
@declan-scale merged commit c980948 into next Apr 30, 2026
13 checks passed
@declan-scale deleted the sec/relax-httpx-for-litellm-vuln branch April 30, 2026 19:38
stainless-app bot mentioned this pull request Apr 30, 2026
declan-scale added a commit that referenced this pull request Apr 30, 2026
* feat(api): api update

* codegen metadata

* chore(internal): more robust bootstrap script

* fix: use correct field name format for multipart file arrays

* feat: support setting headers via env

* fix: allow litellm security patch (#336)

* fix(adk): Always inject headers on execute activity (#337)

* perf(streaming): coalesce per-token publishes to Redis (50ms / 128-char window) (#333)

* perf(streaming): coalesce per-token publishes to Redis (50ms / 128-char window)

Per-token Redis publishes from TemporalStreamingModel were adding ~45s
(56-62%) overhead to agent response latency, mostly from head-of-line
blocking on the model's event loop: each `await streaming_context.stream_update(...)`
inside the OpenAI stream `async for` paused token consumption until the
publish round-trip completed.

This change introduces a `CoalescingBuffer` driven by an `asyncio.Event`,
so the producer never awaits on Redis. Deltas are merged consecutive-only
(preserving character order in every (type, index) channel) and flushed
on a 50ms timer, on a 128-char size threshold, or immediately for the
first delta to keep perceived responsiveness high. The buffer's `close()`
drains remaining deltas before the DONE event, so consumers see the full
sequence in order.

A new `StreamingMode = Literal["off", "per_token", "coalesced"]` lives
in `streaming.py` as the single source of truth and is plumbed through
the adk streaming module, `StreamingService.streaming_task_message_context`,
and `StreamingTaskMessageContext`. Default is `"coalesced"` everywhere,
so all 13+ existing context callers (claude_agents, langgraph, litellm
provider, openai sync provider, etc.) benefit automatically.

* chore(streaming): fix import ordering (ruff I001)

* fix(streaming): address greptile review findings

- _run: when CancelledError is raised mid-flush in the for-loop, re-enqueue
  the in-flight item plus any remaining items in the local `drained` list
  back into self._buf so close()'s final drain can recover them. Previously
  the local `drained` list was unreachable after CancelledError exited the
  for-loop, causing the last coalesced batch to be silently dropped on
  close-during-flush races. Trade-off: the in-flight item may be duplicated
  on the consumer side (Redis pub may have completed before cancel was
  delivered), which is preferable to silent loss for streaming UX.

- _merge_pair: replace `return b` fallback with AssertionError. All six
  current TaskMessageDelta variants have explicit isinstance branches, so
  the fallback is unreachable today. But _can_merge returns True for any
  same-type pair, so adding a 7th delta variant without updating
  _merge_pair would silently drop `a`'s accumulated content. Asserting
  turns a future silent data-loss into an immediate, diagnosable crash.
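
The consecutive-only merge semantics described here can be illustrated with a simplified sketch, treating each delta as a (channel, text) pair, where channel stands in for the (type, index) key on the SDK's TaskMessageDelta variants:

```python
def merge_consecutive(deltas):
    """Merge only adjacent deltas that share a channel. The overall channel
    order and the per-channel character order both survive intact, which is
    what makes the coalesced output equivalent to per-token streaming."""
    merged = []
    for channel, text in deltas:
        if merged and merged[-1][0] == channel:
            last_channel, last_text = merged[-1]
            merged[-1] = (last_channel, last_text + text)  # extend current run
        else:
            merged.append((channel, text))  # channel switch: start a new run
    return merged
```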

* test(streaming): add coalescing-layer tests; loosen one model assertion

After merging the test-suite repair from main (#334) into this branch, one
model test (test_responses_api_streaming) regressed because its
assert_called_with strict-matched all kwargs of streaming_task_message_context
and didn't tolerate the new `streaming_mode='coalesced'` kwarg this PR
adds. Switched to assert_called() + targeted kwarg checks so the test
verifies what it cares about (task_id threading) without locking in
implementation details.

Replaced the ad-hoc smoke scripts that lived in conversation with a real
pytest module at tests/lib/core/services/adk/test_streaming.py covering:

- _delta_char_len, _can_merge, _merge_pair: per-channel correctness +
  None-handling
- _merge_consecutive: pure-text collapse, cross-channel order preservation,
  per-channel reconstruction matches per-token semantics
- CoalescingBuffer: first-delta-immediate flush within ~20ms,
  size-threshold flush before timer fires, multi-delta coalescing within
  one window, idle close, add-after-close no-op
- CoalescingBuffer cancel-during-flush regression test for the P1 fix:
  five queued chunks must all surface across publishes when close()
  cancels mid-flush (asserts substring presence rather than exact
  ordering, since the documented trade-off allows duplicates of the
  in-flight item)
- StreamingTaskMessageContext mode dispatch: "off" suppresses publishes
  but persists full content, "per_token" publishes each delta synchronously,
  "coalesced" batches and persists full content

* chore(streaming): route TemporalStreamingModel logger through make_logger

The model file used raw ``logging.getLogger("agentex.temporal.streaming")``,
which returns a logger with no handler attached and no level configured —
so the existing ``[TemporalStreamingModel] Initialized ... streaming_mode=...``
INFO log was silently dropped, making it impossible to verify at runtime
that a coalesced (or any) streaming mode was actually wired.

Switch to the SDK's ``make_logger`` helper (level=INFO, RichHandler in
local mode, StreamHandler otherwise) used everywhere else in the SDK.
The explicit logger name ``agentex.temporal.streaming`` is preserved so
any external logging configuration targeting that name keeps working.

* feat(api): api update

* release: 0.10.3

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: Brandon Allen <brandon.allen@scale.com>
Co-authored-by: Declan Brady <declan.brady@scale.com>
Co-authored-by: Stas Moreinis <stas.moreinis@scale.com>
