
fix: unblock SDK security dependency patches #336

Merged

declan-scale merged 1 commit into next from sec/relax-httpx-for-litellm-vuln on Apr 30, 2026

Conversation

@scale-ballen (Contributor) commented Apr 30, 2026

Summary

This PR updates agentex-sdk on the next branch so downstream repos can consume a 0.10.3 release that resolves the dependency blockers currently preventing security remediation.

The original blocker was agentex-sdk==0.10.2 requiring httpx>=0.27.2,<0.28, while patched litellm>=1.83.7 requires httpx==0.28.1. This PR relaxes the SDK bound and raises the litellm floor so downstream consumers can upgrade without dependency conflicts.

While reviewing older security work, I also found PR #289 was never merged. I ported the still-relevant pieces from that PR into this current next-based branch instead of cherry-picking it wholesale, because #289 was based on older code and would revert/remove a lot of current next work.

What Changed

Dependency and security remediation

  • Moved the httpx bound from >=0.27.2,<0.28 to >=0.28.1,<0.29 so the resolver can select the 0.28.1 release that patched litellm requires.
  • Raised litellm from >=1.83.0,<2 to >=1.83.7,<2.
  • Removed the tight fastapi<0.116 cap so the resolver can select patched Starlette-compatible FastAPI releases.
  • Added explicit starlette>=0.49.1 floor for CVE-2025-62727 remediation.
  • Added explicit tornado>=6.5.5 floor for CVE-2026-31958 remediation.
  • Regenerated uv.lock from the current next branch.
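
In pyproject.toml terms, the constraints above amount to roughly the following. This is an illustrative fragment reconstructed from the bullets, not the verbatim file:

```toml
dependencies = [
    "httpx>=0.28.1,<0.29",   # was >=0.27.2,<0.28
    "litellm>=1.83.7,<2",    # floor raised to the patched release
    "fastapi",               # <0.116 cap removed
    "starlette>=0.49.1",     # CVE-2025-62727 remediation floor
    "tornado>=6.5.5",        # CVE-2026-31958 remediation floor
]
```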

Resolved package versions in the lockfile:

| Package | Before | After | Why |
| --- | --- | --- | --- |
| httpx | 0.27.2 | 0.28.1 | Required by patched litellm |
| litellm | 1.83.0 | 1.83.7 | Security remediation floor |
| fastapi | 0.115.14 | 0.136.1 | Allows patched Starlette resolution |
| starlette | 0.46.2 | 1.0.0 | Fixes CVE-2025-62727 range |
| tornado | 6.5.2 | 6.5.5 | Fixes CVE-2026-31958 range |

Runtime compatibility fixes carried forward from #289

  • Replaced RequestIDMiddleware's BaseHTTPMiddleware implementation with pure ASGI middleware.
    • Why: this avoids Starlette BaseHTTPMiddleware streaming-response buffering/loss behavior, which matters now that Starlette can resolve to newer versions.
  • Updated sync and async send_message to consume message responses via streaming response handling.
    • Why: FastACP can emit newline-delimited JSON frames even for non-streaming message send flows; this preserves compatibility with that server behavior.
  • Made SendMessageStreamResponse.result optional.
    • Why: intermediate or terminal stream frames can have result: null; validating those as required caused client-side failures.
  • Switched OpenAI Agents Temporal hooks from workflow.execute_activity_method to workflow.execute_activity.
    • Why: execute_activity is the public API path used by the current Temporal SDK.
  • Updated tutorial test runner logic to use the locally built SDK wheel when available.
    • Why: tutorial CI should validate the wheel built by this PR rather than accidentally testing against the previously published package.
    • Implemented with a Bash array so paths with spaces do not break argument parsing.
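
For context on the middleware change, a pure-ASGI request-ID middleware can be sketched as follows. This is a hypothetical minimal version, not the SDK's actual implementation; it illustrates why raw ASGI avoids BaseHTTPMiddleware's buffering — response events are forwarded one at a time as they arrive, never collected into a buffered body:

```python
# Hypothetical sketch of a pure-ASGI request-ID middleware. The class name
# matches the PR; the body is illustrative only.
import uuid


class RequestIDMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Guard on scope type: only HTTP requests carry headers;
        # lifespan/websocket traffic passes straight through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Header lookup is bytes-based per the ASGI spec.
        headers = dict(scope.get("headers", []))
        request_id = headers.get(b"x-request-id", b"").decode() or str(uuid.uuid4())

        async def send_with_request_id(message):
            # Attach the ID on the response-start event; body chunks
            # (including streaming chunks) are forwarded untouched.
            if message["type"] == "http.response.start":
                message.setdefault("headers", []).append(
                    (b"x-request-id", request_id.encode())
                )
            await send(message)

        await self.app(scope, receive, send_with_request_id)
```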

SDK tracing surface fix

  • Added task_id to the generated span model, create/update params, span resource methods, and Agentex tracing processor calls.
    • Why: current handwritten tracing code already passes task_id, but the generated span types/resources did not expose it. CI surfaced this as pyright errors after dependency resolution started matching the declared pyproject.toml dependencies.

What Was Not Ported From #289

I did not merge #289 wholesale. That branch was based on older main/pre-next state and includes broad generated-code churn, tutorial changes, and reversions that do not apply cleanly to current next. This PR only carries forward the still-relevant security and runtime compatibility fixes.

Validation

Local validation:

  • uv lock --check
  • uv run --with ruff ruff check .
  • uv run --with pyright==1.1.399 --with nox --with respx --with dirty-equals --with httpx-aiohttp --with claude-agent-sdk==0.1.52 pyright
  • uv build
  • uv run --with pytest-xdist --with respx --with dirty-equals pytest
    • 724 passed, 961 skipped

GitHub CI after the latest push:

  • lint: passed
  • build: passed
  • test: passed
  • tutorial-agent workflow permission/discovery/deprecation jobs: passed
  • Socket project report: passed

Note: rye is not installed locally, so local validation used uv equivalents with the same pinned pyright version and CI-resolved dev packages.

Greptile Summary

This PR unblocks security dependency patches by relaxing httpx and fastapi bounds, raising the litellm floor, and adding explicit starlette and tornado floors to address two CVEs. It also ports runtime compatibility fixes from the never-merged PR #289 — replacing BaseHTTPMiddleware with pure ASGI middleware, switching send_message to streaming-response parsing, making SendMessageStreamResponse.result optional, migrating Temporal hooks to workflow.execute_activity, and surfacing task_id across the span types/resources/processor.

Confidence Score: 5/5

Safe to merge — all findings are P2 style/robustness suggestions with no blocking correctness issues.

The dependency updates are well-justified and the lockfile has been regenerated. The ASGI middleware rewrite, Temporal API migration, and span field additions are all correct. The only non-trivial behaviour change is the streaming-based send_message implementation, which is logically sound; the one flagged concern (silent empty-stream fallback) is a robustness improvement suggestion, not a blocking bug. CI (lint, build, test, tutorials) passes.

src/agentex/resources/agents.py — review the empty-stream fallback behaviour in the new send_message streaming implementation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agentex/resources/agents.py | sync/async send_message refactored to consume streaming responses; streaming parse logic silently skips unrecognised frames and falls back to constructing SendMessageResponse from collected task_messages |
| src/agentex/lib/sdk/fastacp/base/base_acp_server.py | RequestIDMiddleware replaced with pure ASGI middleware; correct bytes-based header lookup and scope-type guard |
| pyproject.toml | httpx upper bound raised to <0.29, fastapi upper bound removed, starlette>=0.49.1 and tornado>=6.5.5 floors added for CVE remediation, litellm floor raised to 1.83.7 |
| src/agentex/lib/core/temporal/plugins/openai_agents/hooks/hooks.py | Three execute_activity_method calls migrated to execute_activity (public Temporal SDK API) |
| src/agentex/types/agent_rpc_response.py | SendMessageStreamResponse.result made Optional[TaskMessageUpdate] to handle null intermediate/terminal frames |
| src/agentex/types/span.py | task_id Optional[str] field added to Span model |
| src/agentex/types/span_create_params.py | task_id Optional[str] field added to SpanCreateParams TypedDict |
| src/agentex/types/span_update_params.py | task_id Optional[str] field added to SpanUpdateParams TypedDict |
| src/agentex/resources/spans.py | task_id parameter added to create/update on both sync and async span resources |
| src/agentex/lib/core/tracing/processors/agentex_tracing_processor.py | task_id forwarded in both sync and async tracing processor create calls |
| examples/tutorials/run_agent_test.sh | Bash array used for pytest command so wheel path with spaces does not break argument parsing; CI runner path searched first, then local dist/ as fallback |
| tests/api_resources/test_spans.py | task_id added to create and update test calls for both sync and async span resources |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[send_message called] --> B{agent_id or agent_name?}
    B -->|agent_id| C[with_streaming_response.rpc]
    B -->|agent_name| D[with_streaming_response.rpc_by_name]
    B -->|neither| E[raise ValueError]
    C --> F[Open streaming context]
    D --> F
    F --> G[iter_lines]
    G --> H{line empty?}
    H -->|yes| G
    H -->|no| I["Strip data: prefix if SSE"]
    I --> J[json.loads line]
    J --> K{Validate as SendMessageResponse?}
    K -->|success| L[return immediately]
    K -->|ValidationError| M[Validate as SendMessageStreamResponse]
    M --> N{"result.type == full?"}
    N -->|yes| O[append parent_task_message]
    N -->|no| G
    O --> G
    M -->|ValidationError| G
    J -->|JSONDecodeError| G
    G -->|stream exhausted| P["return SendMessageResponse(id, jsonrpc, result=task_messages)"]
```
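
The flow can be sketched as a plain-dict parsing loop. This is hypothetical: the real code validates frames against the SDK's generated SendMessageResponse / SendMessageStreamResponse pydantic models rather than inspecting dicts:

```python
import json


def parse_stream(lines):
    """Hypothetical sketch of the frame-parsing loop above; dicts stand in
    for the SDK's generated response models."""
    response_meta = {}
    task_messages = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        if line.startswith("data:"):
            line = line[len("data:"):].strip()  # strip SSE prefix if present
        try:
            frame = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparseable frames
        result = frame.get("result")
        if isinstance(result, list):
            # complete non-streaming SendMessageResponse: return immediately
            return frame
        if isinstance(result, dict) and result.get("type") == "full":
            # a 'full' stream update: remember envelope fields, collect message
            response_meta = {"id": frame.get("id"), "jsonrpc": frame.get("jsonrpc")}
            task_messages.append(result.get("parent_task_message"))
        # null/partial results and unrecognised frames are skipped
    # stream exhausted: assemble a response from the collected messages
    return {**response_meta, "result": task_messages}
```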


### Issue 1 of 1
src/agentex/resources/agents.py:516-520
**Empty-stream fallback yields a silent no-op response**

If the server closes the stream without emitting any parseable frames (e.g. a transport error, an empty body, or all lines being skipped), `response_meta` will be `{}` and `task_messages` will be `[]`. The function then returns `SendMessageResponse(id=None, jsonrpc=None, result=[])` — an empty success — rather than surfacing the failure to the caller.

Consider raising when `response_meta` is still empty after exhausting the stream, so network-level failures are not silently swallowed.
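
A minimal sketch of that suggestion, assuming the fallback path is factored into a helper (the names here are hypothetical, not the SDK's, and the real code would return the generated SendMessageResponse model rather than a dict):

```python
class EmptyStreamError(RuntimeError):
    """Raised when the stream closes without any parseable frames."""


def finalize(response_meta, task_messages):
    # Hypothetical guard for the exhausted-stream path: raise instead of
    # returning an empty success when no frame was ever parsed, so
    # network-level failures are surfaced to the caller.
    if not response_meta and not task_messages:
        raise EmptyStreamError(
            "stream closed without any parseable SendMessage frames"
        )
    return {**response_meta, "result": task_messages}
```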

Reviews (4). Last reviewed commit: "fix: allow litellm security patch".

socket-security bot commented Apr 30, 2026

@scale-ballen force-pushed the sec/relax-httpx-for-litellm-vuln branch 2 times, most recently from 209dc1f to 42b9d17 on April 30, 2026 16:24
@scale-ballen changed the title from "fix: allow litellm security patch" to "fix: unblock SDK security dependency patches" Apr 30, 2026
@declan-scale (Contributor) commented Apr 30, 2026

Can you pull the latest next and rebase this branch off of next? The diff looks a little messed up

@scale-ballen force-pushed the sec/relax-httpx-for-litellm-vuln branch from 42b9d17 to 91d33eb on April 30, 2026 18:30
@declan-scale merged commit c980948 into next Apr 30, 2026
13 checks passed
@declan-scale deleted the sec/relax-httpx-for-litellm-vuln branch April 30, 2026 19:38
stainless-app bot mentioned this pull request Apr 30, 2026
declan-scale added a commit that referenced this pull request Apr 30, 2026
* feat(api): api update

* codegen metadata

* chore(internal): more robust bootstrap script

* fix: use correct field name format for multipart file arrays

* feat: support setting headers via env

* fix: allow litellm security patch (#336)

* fix(adk): Always inject headers on execute activity (#337)

* perf(streaming): coalesce per-token publishes to Redis (50ms / 128-char window) (#333)

* perf(streaming): coalesce per-token publishes to Redis (50ms / 128-char window)

Per-token Redis publishes from TemporalStreamingModel were adding ~45s
(56-62%) overhead to agent response latency, mostly from head-of-line
blocking on the model's event loop: each `await streaming_context.stream_update(...)`
inside the OpenAI stream `async for` paused token consumption until the
publish round-trip completed.

This change introduces a `CoalescingBuffer` driven by an `asyncio.Event`,
so the producer never awaits on Redis. Deltas are merged consecutive-only
(preserving character order in every (type, index) channel) and flushed
on a 50ms timer, on a 128-char size threshold, or immediately for the
first delta to keep perceived responsiveness high. The buffer's `close()`
drains remaining deltas before the DONE event, so consumers see the full
sequence in order.

A new `StreamingMode = Literal["off", "per_token", "coalesced"]` lives
in `streaming.py` as the single source of truth and is plumbed through
the adk streaming module, `StreamingService.streaming_task_message_context`,
and `StreamingTaskMessageContext`. Default is `"coalesced"` everywhere,
so all 13+ existing context callers (claude_agents, langgraph, litellm
provider, openai sync provider, etc.) benefit automatically.

* chore(streaming): fix import ordering (ruff I001)

* fix(streaming): address greptile review findings

- _run: when CancelledError is raised mid-flush in the for-loop, re-enqueue
  the in-flight item plus any remaining items in the local `drained` list
  back into self._buf so close()'s final drain can recover them. Previously
  the local `drained` list was unreachable after CancelledError exited the
  for-loop, causing the last coalesced batch to be silently dropped on
  close-during-flush races. Trade-off: the in-flight item may be duplicated
  on the consumer side (Redis pub may have completed before cancel was
  delivered), which is preferable to silent loss for streaming UX.

- _merge_pair: replace `return b` fallback with AssertionError. All six
  current TaskMessageDelta variants have explicit isinstance branches, so
  the fallback is unreachable today. But _can_merge returns True for any
  same-type pair, so adding a 7th delta variant without updating
  _merge_pair would silently drop `a`'s accumulated content. Asserting
  turns a future silent data-loss into an immediate, diagnosable crash.
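
The consecutive-only merge semantics described here can be illustrated with a simplified sketch, treating each delta as a (channel, text) pair, where channel stands in for the (type, index) key on the SDK's TaskMessageDelta variants:

```python
def merge_consecutive(deltas):
    """Merge only adjacent deltas that share a channel. The overall channel
    order and the per-channel character order both survive intact, which is
    what makes the coalesced output equivalent to per-token streaming."""
    merged = []
    for channel, text in deltas:
        if merged and merged[-1][0] == channel:
            last_channel, last_text = merged[-1]
            merged[-1] = (last_channel, last_text + text)  # extend current run
        else:
            merged.append((channel, text))  # channel switch: start a new run
    return merged
```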

* test(streaming): add coalescing-layer tests; loosen one model assertion

After merging the test-suite repair from main (#334) into this branch, one
model test (test_responses_api_streaming) regressed because its
assert_called_with strict-matched all kwargs of streaming_task_message_context
and didn't tolerate the new `streaming_mode='coalesced'` kwarg this PR
adds. Switched to assert_called() + targeted kwarg checks so the test
verifies what it cares about (task_id threading) without locking in
implementation details.

Replaced the ad-hoc smoke scripts that lived in conversation with a real
pytest module at tests/lib/core/services/adk/test_streaming.py covering:

- _delta_char_len, _can_merge, _merge_pair: per-channel correctness +
  None-handling
- _merge_consecutive: pure-text collapse, cross-channel order preservation,
  per-channel reconstruction matches per-token semantics
- CoalescingBuffer: first-delta-immediate flush within ~20ms,
  size-threshold flush before timer fires, multi-delta coalescing within
  one window, idle close, add-after-close no-op
- CoalescingBuffer cancel-during-flush regression test for the P1 fix:
  five queued chunks must all surface across publishes when close()
  cancels mid-flush (asserts substring presence rather than exact
  ordering, since the documented trade-off allows duplicates of the
  in-flight item)
- StreamingTaskMessageContext mode dispatch: "off" suppresses publishes
  but persists full content, "per_token" publishes each delta synchronously,
  "coalesced" batches and persists full content

* chore(streaming): route TemporalStreamingModel logger through make_logger

The model file used raw ``logging.getLogger("agentex.temporal.streaming")``,
which returns a logger with no handler attached and no level configured —
so the existing ``[TemporalStreamingModel] Initialized ... streaming_mode=...``
INFO log was silently dropped, making it impossible to verify at runtime
that a coalesced (or any) streaming mode was actually wired.

Switch to the SDK's ``make_logger`` helper (level=INFO, RichHandler in
local mode, StreamHandler otherwise) used everywhere else in the SDK.
The explicit logger name ``agentex.temporal.streaming`` is preserved so
any external logging configuration targeting that name keeps working.

* feat(api): api update

* release: 0.10.3

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: Brandon Allen <brandon.allen@scale.com>
Co-authored-by: Declan Brady <declan.brady@scale.com>
Co-authored-by: Stas Moreinis <stas.moreinis@scale.com>
