
release: 0.11.0 #343

Open
stainless-app[bot] wants to merge 60 commits into main from release-please--branches--main--changes--next


@stainless-app stainless-app Bot commented May 4, 2026

Automated Release PR

0.11.0 (2026-05-05)

Full Changelog: v0.10.4...v0.11.0

Features

  • openai_agents: expose real usage, response_id, plumb previous_response_id, opt-in prompt_cache_key for stateful responses and prompt caching (#335) (ba5d64b)

Chores

  • internal: reformat pyproject.toml (76e0299)
  • internal: version bump (0d318ad)

This pull request is managed by Stainless's GitHub App.

The semver version number is based on included commit messages. Alternatively, you can manually set the version number in the title of this pull request.

For a better experience, it is recommended to use either rebase-merge or squash-merge when merging this pull request.

🔗 Stainless website
📚 Read the docs
🙋 Reach out for help or questions

Greptile Summary

  • Introduces batched span dispatch (on_spans_start/on_spans_end) across the tracing stack: AsyncTracingProcessor gets default fan-out implementations, SGPAsyncTracingProcessor overrides them to coalesce all spans in a drain batch into a single upsert_batch HTTP call, and AsyncSpanQueue._process_items is reworked to group spans by processor before dispatching.
  • Three issues flagged in prior review threads remain unresolved: _spans is populated before the upsert_batch call (orphaned end events on HTTP failure), shutdown() lacks a disabled guard (crashes with AttributeError when disabled spans are in-flight), and the assert type-guard in _process_items is silently elided under python -O.
  • New concern this pass: the default on_spans_start fans out to self.on_span_start, while SGPAsyncTracingProcessor.on_span_start delegates to self.on_spans_start; any future subclass that follows the same delegation pattern without also overriding on_spans_start will hit infinite recursion.
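One way to make the mutual-recursion trap fail loudly, rather than loop, is a re-entrancy check in the default batched method. The sketch below is a simplified stand-in, not the actual `AsyncTracingProcessor`: the class names, the `_fanning_out` context variable, and the error message are all hypothetical, and unlike the real default it does not catch and log per-span exceptions.

```python
import asyncio
import contextvars

# Hypothetical guard: ids of processors currently inside the default fan-out.
_fanning_out = contextvars.ContextVar("_fanning_out", default=frozenset())


class BaseProcessor:
    """Simplified stand-in for a processor interface with a default fan-out."""

    async def on_span_start(self, span):
        raise NotImplementedError

    async def on_spans_start(self, spans):
        active = _fanning_out.get()
        if id(self) in active:
            # The default fan-out was re-entered through on_span_start.
            raise RuntimeError(
                f"{type(self).__name__}.on_span_start delegates to "
                "on_spans_start without overriding it"
            )
        token = _fanning_out.set(active | {id(self)})
        try:
            # Tasks created by gather copy the current context, so the
            # guard is visible inside each per-span call.
            await asyncio.gather(*(self.on_span_start(s) for s in spans))
        finally:
            _fanning_out.reset(token)


class BrokenProcessor(BaseProcessor):
    # Delegates to the batched method but forgets to override it.
    async def on_span_start(self, span):
        await self.on_spans_start([span])


class BatchedProcessor(BaseProcessor):
    # Same delegation, but the batched method is overridden: no cycle.
    def __init__(self):
        self.batches = []

    async def on_span_start(self, span):
        await self.on_spans_start([span])

    async def on_spans_start(self, spans):
        self.batches.append(list(spans))
```

With this guard, `BrokenProcessor` raises a `RuntimeError` on first use instead of spawning tasks without bound. A docstring warning, as the review suggests, remains the lighter-weight option.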

Confidence Score: 4/5

Safe to merge with caveats — the three issues from prior review threads remain open, but they are bounded in scope; the new mutual-recursion trap is P2 only.

Score capped at 4 due to the unresolved P1 shutdown() crash when disabled=True (flagged in a prior thread). No new P0 or P1 issues were introduced in this pass; the mutual-recursion concern and empty-batch HTTP call are P2.

src/agentex/lib/core/tracing/processors/sgp_tracing_processor.py — unresolved shutdown/disabled crash and stale-_spans-on-failure issues from prior review; src/agentex/lib/core/tracing/span_queue.py — assert guard elided under -O.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agentex/lib/core/tracing/processors/sgp_tracing_processor.py | Refactored to delegate single-span methods to batched variants; populates _spans before the HTTP call (stale entry on failure) and shutdown() crashes on disabled=True with in-flight spans; both were flagged in previous review threads, plus a new empty-batch HTTP call issue. |
| src/agentex/lib/core/tracing/processors/tracing_processor_interface.py | Adds default on_spans_start/on_spans_end methods that fan out to per-span calls; design creates a mutual-recursion trap for future subclasses that delegate on_span_start → on_spans_start without also overriding on_spans_start. |
| src/agentex/lib/core/tracing/span_queue.py | Reworked _process_items to group spans by processor and dispatch via batched methods; assert guard for same-event-type precondition is stripped by -O (flagged in prior review thread); logic otherwise sound. |
| tests/lib/core/tracing/processors/test_sgp_tracing_processor.py | Adds batch start/end tests verifying single upsert_batch call and correct span tracking; coverage is thorough for the happy path. |
| tests/lib/core/tracing/processors/test_tracing_processor_interface.py | New test file verifying default fanout behavior, failure isolation, and per-span error logging; well-structured. |
| tests/lib/core/tracing/test_span_queue.py | Adds TestProcessItemsPreconditions and TestAsyncSpanQueueBatchedDispatch suites; mock helper updated to fan out batched calls to per-span mocks for backwards-compatible test assertions. |

Sequence Diagram

sequenceDiagram
    participant Q as AsyncSpanQueue
    participant PI as _process_items
    participant AP as AsyncTracingProcessor (default)
    participant SGP as SGPAsyncTracingProcessor

    Q->>PI: _process_items(starts)
    PI->>PI: group spans by processor
    PI->>SGP: on_spans_start([span_a, span_b, ...])
    SGP->>SGP: populate _spans[id] for each
    SGP->>SGP: if disabled → return
    SGP-->>SGP: upsert_batch(all spans in one HTTP call)

    Note over PI,AP: Processor using default fallback
    PI->>AP: on_spans_start([span_a, span_b])
    AP->>AP: asyncio.gather(on_span_start(a), on_span_start(b))
    AP-->>AP: per-span exceptions caught and logged

    Q->>PI: _process_items(ends)
    PI->>SGP: on_spans_end([span_a, span_b, ...])
    SGP->>SGP: pop _spans, update fields
    SGP-->>SGP: upsert_batch(ended spans)

    Q->>SGP: shutdown()
    SGP-->>SGP: upsert_batch(_spans remaining)
    Note over SGP: crashes if disabled=True with spans in-flight

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
src/agentex/lib/core/tracing/processors/tracing_processor_interface.py:44-65
**Mutual-recursion trap for future processor implementations**

The default `on_spans_start` fans out to `self.on_span_start`, and `SGPAsyncTracingProcessor.on_span_start` delegates back to `self.on_spans_start([span])`. This only works today because `SGPAsyncTracingProcessor` also overrides `on_spans_start`. Any future subclass that follows the same delegation pattern in `on_span_start` (calling `self.on_spans_start([span])`) but forgets to override `on_spans_start` will hit infinite recursion: default `on_spans_start` → `on_span_start` → `on_spans_start` → …. The docstring should explicitly warn that processors overriding `on_span_start` to delegate to `on_spans_start` must also override `on_spans_start`, otherwise they create an infinite loop.

### Issue 2 of 2
src/agentex/lib/core/tracing/processors/sgp_tracing_processor.py:190-195
**`shutdown()` sends an unnecessary HTTP request when `_spans` is empty**

In the normal case where all spans are properly closed before shutdown, `_spans` is empty and `shutdown()` still calls `upsert_batch(items=[])`. This is a pointless network round-trip every time the processor shuts down cleanly. A guard like `if not self._spans: return` (after any disabled check) would avoid the empty call.

Reviews (10): Last reviewed commit: "release: 0.11.0" | Re-trigger Greptile

@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from 364e2b2 to b1d20d6 Compare May 4, 2026 19:56
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from b1d20d6 to b837eeb Compare May 4, 2026 20:22
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from b837eeb to ac067a6 Compare May 4, 2026 22:16
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from ac067a6 to 16b956f Compare May 4, 2026 22:51
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from 16b956f to 2ea4386 Compare May 4, 2026 23:22
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from 2ea4386 to 3b9a668 Compare May 5, 2026 00:22
Comment on lines +107 to +111
event_type = items[0].event_type
assert all(i.event_type == event_type for i in items), (
    "_process_items requires all items to share the same event_type; "
    "callers must split START and END batches before dispatching."
)

P1 assert in production guard defeats data-corruption protection

The code comment correctly identifies this as a potential "silent data-corruption bug," but using assert for the guard means it is silently stripped when Python runs with the -O (optimize) flag. If a caller ever passes a mixed-event-type list, START and END spans would be fed to the wrong batched method with no warning. Use an explicit if/raise instead.
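A minimal sketch of the explicit `if`/`raise` form follows. `EventType`, `QueueItem`, and `require_single_event_type` are hypothetical names used for illustration; the real item type in `span_queue.py` may differ, and `ValueError` is an assumed choice of exception.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    START = "start"
    END = "end"


@dataclass
class QueueItem:
    # Hypothetical stand-in for the queue's item type.
    event_type: EventType


def require_single_event_type(items: list[QueueItem]) -> EventType:
    """Return the shared event type, or raise if the batch is mixed.

    Unlike an assert, this check still runs under `python -O`.
    """
    if not items:
        raise ValueError("expected a non-empty batch")
    event_type = items[0].event_type
    if any(i.event_type != event_type for i in items):
        raise ValueError(
            "_process_items requires all items to share the same event_type; "
            "callers must split START and END batches before dispatching."
        )
    return event_type
```

The error message is carried over from the original `assert`, so existing log searches would still match.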

Path: src/agentex/lib/core/tracing/span_queue.py
Line: 107-111


Comment on lines +141 to 163
  sgp_spans: list[SGPSpan] = []
  for span in spans:
      self._add_source_to_span(span)
      sgp_span = create_span(
          name=span.name,
          span_type=_get_span_type(span),
          span_id=span.id,
          parent_id=span.parent_id,
          trace_id=span.trace_id,
          input=span.input,
          output=span.output,
          metadata=span.data,
      )
      sgp_span.start_time = span.start_time.isoformat()  # type: ignore[union-attr]
      self._spans[span.id] = sgp_span
      sgp_spans.append(sgp_span)

  if self.disabled:
      logger.warning("SGP is disabled, skipping span upsert")
      return
- # TODO(AGX1-198): Batch multiple spans into a single upsert_batch call
- # instead of one span per HTTP request.
- # https://linear.app/scale-epd/issue/AGX1-198/actually-use-sgp-batching-for-spans
  await self.sgp_async_client.spans.upsert_batch(  # type: ignore[union-attr]
-     items=[sgp_span.to_request_params()]
+     items=[s.to_request_params() for s in sgp_spans]
  )

P1 _spans populated before upsert — stale entries on HTTP failure

Spans are added to self._spans before the upsert_batch HTTP call (lines 155–156). If the batch upsert throws (network error, server 5xx), the exception is caught upstream by the queue's _handle, but _spans already holds entries for spans whose start event was never delivered to SGP. A subsequent on_spans_end will find those spans, update them, and send end-only upserts — orphaned end events with no matching start on the server.

The old single-span code registered the span in _spans only after a successful upsert, so failures were cleanly skipped on the end path. Consider populating _spans only after confirming the batch call succeeded, or rolling back entries on exception.
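The first option (register spans only after the batch call succeeds) might look roughly like the sketch below. `FakeSpansClient` and `StartProcessor` are hypothetical stand-ins with plain dicts for spans; they are not the SGP client or the real processor.

```python
import asyncio


class FakeSpansClient:
    """Hypothetical stand-in for the spans client; can simulate a failure."""

    def __init__(self, fail=False):
        self.fail = fail
        self.sent = []

    async def upsert_batch(self, items):
        if self.fail:
            raise ConnectionError("simulated network failure")
        self.sent.append(items)


class StartProcessor:
    def __init__(self, client):
        self._client = client
        self._spans = {}

    async def on_spans_start(self, spans):
        # Build the batch locally first; nothing is tracked yet.
        pending = {span["id"]: dict(span) for span in spans}
        # If this raises, _spans is untouched, so a later end event
        # finds no entry and no orphaned end-only upsert is sent.
        await self._client.upsert_batch(items=list(pending.values()))
        # Register spans only once the start events were delivered.
        self._spans.update(pending)
```

The trade-off is that a span ending while its start batch is still in flight would not yet be tracked; whether that can happen depends on how the queue orders START and END batches, which this sketch does not model.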

Path: src/agentex/lib/core/tracing/processors/sgp_tracing_processor.py
Line: 141-163


@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from 3b9a668 to b702eb9 Compare May 5, 2026 01:22
Comment on lines +154 to 163
      sgp_span.start_time = span.start_time.isoformat()  # type: ignore[union-attr]
      self._spans[span.id] = sgp_span
      sgp_spans.append(sgp_span)

  if self.disabled:
      logger.warning("SGP is disabled, skipping span upsert")
      return
- # TODO(AGX1-198): Batch multiple spans into a single upsert_batch call
- # instead of one span per HTTP request.
- # https://linear.app/scale-epd/issue/AGX1-198/actually-use-sgp-batching-for-spans
  await self.sgp_async_client.spans.upsert_batch(  # type: ignore[union-attr]
-     items=[sgp_span.to_request_params()]
+     items=[s.to_request_params() for s in sgp_spans]
  )

P1 shutdown() crashes with AttributeError when disabled=True and spans are in-flight

on_spans_start now populates self._spans (line 155) before the if self.disabled: return guard (line 158). If any spans are started but not yet ended when shutdown() is called in disabled mode, it reaches self.sgp_async_client.spans.upsert_batch(...) where self.sgp_async_client is None, triggering an AttributeError. Before this PR the disabled path returned before populating _spans, so _spans was always empty at shutdown time and this was never triggered in practice. The fix is to either move the self._spans[span.id] = sgp_span assignment after the if self.disabled guard, or add an early if self.disabled: return check at the top of shutdown() (mirroring how on_spans_end handles it at line 184).
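The second option, an early return at the top of `shutdown()`, can be sketched as follows. The same guard also covers the empty-batch case raised in the summary. `RecordingClient` and `ShutdownProcessor` are hypothetical stand-ins, and clearing `_spans` on the early-return path is an assumption about the desired cleanup, not something the PR specifies.

```python
import asyncio


class RecordingClient:
    """Hypothetical stand-in that records each upsert_batch call."""

    def __init__(self):
        self.calls = []

    async def upsert_batch(self, items):
        self.calls.append(items)


class ShutdownProcessor:
    def __init__(self, client=None, disabled=False):
        # In disabled mode there is no client, mirroring the review's
        # description of sgp_async_client being None.
        self.client = client
        self.disabled = disabled
        self._spans = {}

    async def shutdown(self):
        # Guard 1: disabled mode has no client to call.
        # Guard 2: nothing in flight, so skip the empty HTTP request.
        if self.disabled or not self._spans:
            self._spans.clear()
            return
        await self.client.upsert_batch(items=list(self._spans.values()))
        self._spans.clear()
```

Checking `disabled` before touching the client keeps `shutdown()` safe regardless of whether `on_spans_start` populates `_spans` before or after its own disabled check.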

Path: src/agentex/lib/core/tracing/processors/sgp_tracing_processor.py
Line: 154-163


@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from b702eb9 to 04eafa5 Compare May 5, 2026 03:22
@stainless-app stainless-app Bot force-pushed the release-please--branches--main--changes--next branch from 04eafa5 to 65af241 Compare May 5, 2026 04:22