Skip to content

feat(sdk): default merge events in SDK#155

Merged
namrataghadi-galileo merged 27 commits intomainfrom
feature/59787-merge-events
Apr 4, 2026
Merged

feat(sdk): default merge events in SDK#155
namrataghadi-galileo merged 27 commits intomainfrom
feature/59787-merge-events

Conversation

@namrataghadi-galileo
Copy link
Copy Markdown
Contributor

@namrataghadi-galileo namrataghadi-galileo commented Mar 31, 2026

Summary

This PR makes the SDK the single owner of ControlExecutionEvent creation and emission, moving all observability event reconstruction from the server to the SDK. The /api/v1/evaluation endpoint now returns only evaluation semantics, keeping evaluation and observability concerns cleanly separated.

Key changes:

  • SDK reconstructs both local and server control execution events after receiving the lightweight EvaluationResponse
  • Server /evaluation endpoint is now evaluation-only — no longer builds or ingests observability events
  • Events are emitted exclusively through the SDK's existing observability pipeline (batcher → /api/v1/observability/events → OSS storage)
  • Event identity consistency preserved using ControlDefinition.observability_identity() for composite conditions

Behavior

When SDK observability is enabled

  • check_evaluation_with_local() reconstructs local + server events in the SDK and enqueues one combined batch
  • check_evaluation() reconstructs server events in the SDK and enqueues through the built-in pipeline
  • Trace/span correlation preserved through the tracing resolver and reconstructed control metadata

When SDK observability is disabled

  • Evaluation still works normally
  • No control-execution events are created or emitted

Error handling

  • If local controls succeed but the server request/parsing fails, local events are still enqueued before the error is re-raised
  • Safe failure handling ensures observability data isn't lost

What changed

SDK (sdks/python/src/agent_control/):

  • Added shared event reconstruction helpers in evaluation_events.py
  • Updated evaluation.py to support:
    • SDK reconstruction of local events
    • SDK reconstruction of server events from lightweight EvaluationResponse + cached control definitions
    • Combined enqueue of local + server events through existing SDK batcher
    • Standalone check_evaluation() reconstruction when observability enabled
  • Removed old session-scoped merge_events mode from Python SDK surface

Server (server/src/agent_control_server/):

  • Updated endpoints/evaluation.py to be evaluation-only
  • Returns sanitized EvaluationResponse without observability payloads
  • Removed observability event emission and ingestion from /evaluation endpoint

TypeScript client:

  • Cleaned up generated client drift to match current pure-evaluation contract

Testing

Tests updated to cover:

  • ✅ Default local event enqueue behavior
  • ✅ SDK reconstruction of server events
  • ✅ Local + server combined enqueue behavior
  • ✅ Provider-backed trace/span propagation
  • ✅ Local-event preservation when server call/parsing fails
  • ✅ Pure /evaluation behavior with no server-side observability emission
  • ✅ Response sanitization and helper behavior on server side

Migration notes

  • This is now the default observability behavior (not an optional mode)
  • Direct callers to /api/v1/evaluation no longer get observability automatically unless they also emit events through /api/v1/observability/events

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 91.26984% with 11 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
sdks/python/src/agent_control/evaluation.py 91.54% 6 Missing ⚠️
sdks/python/src/agent_control/evaluation_events.py 89.79% 5 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@lan17 lan17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still two blockers before we merge this. I left inline notes on the partial sink integration and the unconditional reconstruction work on the evaluation hot path.

Copy link
Copy Markdown
Contributor

@lan17 lan17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking issue:

  • server/src/agent_control_server/endpoints/evaluation.py:242-263 now trusts X-Agent-Control-Merge-Events from the caller and skips server-side observability ingestion solely on that header. A direct API caller can set the header and suppress control-execution events without reconstructing/re-emitting them anywhere. The server needs to treat merged mode as a trusted SDK flow, or otherwise keep the default ingestion path for untrusted callers.

@namrataghadi-galileo
Copy link
Copy Markdown
Contributor Author

@lan17 Fixed. The server no longer trusts X-Agent-Control-Merge-Events by itself. Merged mode is now established during init(..., merge_events=True) / initAgent, and the server records that enablement for the initialized (authenticated client, agent) session. On /evaluation, the server only skips observability ingestion if that same client previously initialized that same agent with merge-events enabled; otherwise it keeps the default ingestion path. Since merge_events defaults to False, existing behavior remains unchanged unless merged mode is explicitly turned on.

Copy link
Copy Markdown
Contributor

@lan17 lan17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed current head 9af1755.

The original header-trust issue looks fixed, but there is still one blocker:

  • server/src/agent_control_server/merge_event_sessions.py:28-40 scopes merge-enable state by (client.api_key, agent_name). For session-cookie auth, _authenticate_via_cookie() returns AuthenticatedClient(api_key="", ...) in server/src/agent_control_server/auth.py:118-121, so every cookie-authenticated browser session collapses to the same key ("", agent_name). If one logged-in UI session initializes an agent with merge events enabled, any other logged-in UI session can send X-Agent-Control-Merge-Events: true and suppress server-side ingestion for that agent. The trusted-session key needs to use a per-session or otherwise user-specific identifier, not the empty api_key sentinel.

@lan17
Copy link
Copy Markdown
Contributor

lan17 commented Apr 2, 2026

Stepping back from the line-level issues, I think the direction here is reasonable, but the architecture is taking on too much implicit session and state coupling too quickly. Separating evaluation semantics from observability event delivery is the right idea, and extracting shared event reconstruction logic is a real improvement over duplicating emission behavior in multiple places. Where this starts to feel brittle is that the behavior of a single evaluation request now depends on prior init(...)/initAgent side effects, cached control definitions in the SDK, special headers, and server-side trust state. That is a lot of hidden coordination for a latency-sensitive path, and it makes correctness and debuggability harder than it should be.

For a v1, I would be comfortable with a narrower version of this. I would keep merged event creation scoped only to the local-first SDK path, keep check_evaluation() on the plain server-owned flow, and keep the built-in queue as the only delivery mechanism for now. In other words, make one explicit opt-in contract that says the SDK owns creation of the merged batch for this initialized session, and avoid broadening the public surface area until that protocol is solid. That still gets the main benefit of this work without committing us to a more general session-plus-delivery framework yet.

What I do not think we should do yet is spread this across multiple entry points, multiple trust mechanisms, and alternate delivery sinks at the same time. Once merge mode works in a way that is explicit, robust across auth modes and deployment topologies, and easy to reason about end to end, then adding a configurable sink is a straightforward follow-on. Right now #160 feels like it is layering a flexible delivery abstraction on top of a control plane that is still settling.

So my recommendation would be: land a smaller, clearer version of merged creation first, prove that the ownership model is correct, and only then generalize it. I think the core idea is good. I just do not think we should lock in this much flexibility before the underlying contract is simpler and more stable.

@namrataghadi-galileo namrataghadi-galileo changed the title feat(sdk): 59787 merge events feat(sdk): default merge events in SDK Apr 3, 2026
@namrataghadi-galileo namrataghadi-galileo merged commit 5984a60 into main Apr 4, 2026
8 checks passed
@namrataghadi-galileo namrataghadi-galileo deleted the feature/59787-merge-events branch April 4, 2026 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants