Refactor responses-backed Agent sessions #281
Conversation
It would be good to create an e2e test equivalent to https://llamastack.github.io/docs/getting_started/quickstart#step-3-run-the-demo
Will there need to be another change when conversation support is added?
@raghotham I'm confused by your statement.
- replace legacy `client.alpha.agents.*` paths in both sync and async agent implementations with the `/v1/responses` + `/v1/conversations` flow
- treat each `Agent.create_session()` as a lazily created conversation, caching the returned `conv_…` ID for later turns
- stream turns via `client.responses.create(..., stream=True)` and translate OpenAI `ResponseObjectStream` events into the agent event surface introduced in `lib/agents/stream_events.py`
- run client and builtin tool calls by emitting follow-up responses with `previous_response_id`, mirroring the old turn-resume semantics
- remove the legacy `AgentTurnResponseStreamChunk` dependency, introduce a lightweight `AgentStreamChunk`, and keep tool outputs inside `lib/` only
- clean up auxiliary imports, drop the unused `__future__` pragmas, and ensure the entire module passes `ruff check`

This refactor keeps the public `Agent` API (`create_session`/`create_turn`) intact while aligning the implementation with the stable responses/conversations APIs, so users can interoperate with standard OpenAI-compatible clients going forward.
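The lazy-conversation and streaming-turn flow described above can be sketched roughly as follows. This is an illustrative outline, not the actual implementation: the `client` object, the `conversation=` keyword, and the method names are assumed to follow the OpenAI-compatible client surface mentioned in the commit message.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Agent:
    """Hypothetical sketch of the responses/conversations-backed agent."""
    client: object
    model: str
    _conversation_id: Optional[str] = field(default=None)

    def create_session(self) -> None:
        # The conversation is created lazily: nothing happens server-side
        # until the first turn runs.
        self._conversation_id = None

    def _ensure_conversation(self) -> str:
        # Create the conversation on first use and cache the conv_ ID.
        if self._conversation_id is None:
            conv = self.client.conversations.create()
            self._conversation_id = conv.id
        return self._conversation_id

    def create_turn(self, user_message: str):
        conv_id = self._ensure_conversation()
        # Stream the response; a real implementation would translate each
        # ResponseObjectStream event into the agent event surface.
        stream = self.client.responses.create(
            model=self.model,
            input=user_message,
            conversation=conv_id,
            stream=True,
        )
        yield from stream
```

Follow-up responses for tool resumption would pass `previous_response_id` in the same `responses.create` call, per the commit message.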
This commit implements a high-level turn and step event model that wraps the low-level responses API stream events. The new model provides semantic meaning to agent interactions and distinguishes between server-side and client-side tool execution.

Key changes:
- Add `turn_events.py` with new event dataclasses (`TurnStarted`, `StepProgress`, etc.)
- Add `event_synthesizer.py` for stateful event translation
- Update `Agent` and `AsyncAgent` to use the new event system
- Update `event_logger.py` to work with the new event structures
- Separate server-side tools (`file_search`, `web_search`) from client-side function calls

The turn model represents a complete interaction loop that may span multiple responses, with distinct inference and tool_execution steps. Server-side tools execute within responses and are logged as progress events, while client-side function tools trigger separate tool execution steps.
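A minimal sketch of what the event dataclasses in `turn_events.py` might look like. The exact field names and set of events are assumptions based on the names in the commit message (`TurnStarted`, `StepProgress`, etc.), not the actual definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class TurnStarted:
    turn_id: str

@dataclass
class StepStarted:
    step_id: str
    step_type: str  # "inference" or "tool_execution"
    # metadata carries flags such as server_side for tool steps
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class StepProgress:
    step_id: str
    delta: str  # incremental model text or tool output

@dataclass
class StepCompleted:
    step_id: str

@dataclass
class TurnCompleted:
    turn_id: str
    final_text: str
```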
Major architectural change based on user feedback:
- inference steps = the model thinking/deciding what to do
- tool_execution steps = ANY tool executing (server- OR client-side)

The previous, incorrect design treated server-side tools as progress within inference. In the new, correct design, ALL tools (server and client) appear as tool_execution steps. The difference between server and client tools is operational:
- Server-side (`file_search`, `web_search`, `mcp_call`): execute within the response stream; the synthesizer emits tool_execution boundaries
- Client-side (`function`): break the response stream; `agent.py` emits tool_execution when executing

Both are annotated with `metadata.server_side` for clarity.

Changes:
- Rewrote event_synthesizer to emit tool_execution steps for server-side tools
- Updated event_logger to differentiate server vs. client tools in logs
- Added metadata to `StepStarted` for the `server_side` flag
- Server-side tools now flow: complete inference -> tool_execution step -> new inference
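The server/client classification above can be captured in a few lines. The tool-type names mirror those in the commit message; the helper names themselves are hypothetical, not the synthesizer's actual API.

```python
# Tools that execute inside the response stream, per the commit message.
SERVER_SIDE_TOOL_TYPES = {"file_search", "web_search", "mcp_call"}

def is_server_side(tool_type: str) -> bool:
    """Server-side tools run within the response stream; client-side
    `function` tools break the stream and are executed by agent.py."""
    return tool_type in SERVER_SIDE_TOOL_TYPES

def tool_step_metadata(tool_type: str) -> dict:
    # Both kinds surface as tool_execution steps; the metadata flag
    # records where the tool actually ran.
    return {"server_side": is_server_side(tool_type)}
```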
Three focused tests validate the core architecture:

1. `test_basic_turn_without_tools`
   - Validates a simple text-only turn
   - Verifies the turn_started -> inference step -> turn_completed flow
   - No tool execution steps
2. `test_server_side_file_search_tool` ⭐ KEY TEST
   - Validates that server-side tools appear as tool_execution steps
   - Verifies `metadata.server_side=True`
   - Tests the inference -> tool_execution (server) -> inference flow
3. `test_client_side_function_tool`
   - Validates that client-side tools appear as tool_execution steps
   - Verifies `metadata.server_side=False`
   - Tests the inference -> tool_execution (client) -> inference flow

All tests verify the key principle: tool_execution steps for ALL tools, regardless of where they execute (server or client).
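The sequence check at the heart of tests 2 and 3 can be sketched as a small helper that inspects the ordered step types observed during a turn. This is an illustrative assertion shape, not the actual test code.

```python
from typing import List

def assert_tool_turn_shape(step_types: List[str]) -> None:
    """Assert the inference -> tool_execution -> inference pattern
    described in the tests above."""
    assert step_types[0] == "inference", "turn must start with inference"
    assert "tool_execution" in step_types, "a tool_execution step must appear"
    i = step_types.index("tool_execution")
    assert "inference" in step_types[i + 1:], (
        "a new inference step must follow tool execution"
    )

# The pattern both tool tests expect:
assert_tool_turn_shape(["inference", "tool_execution", "inference"])
```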
Python dataclasses require fields with default values to come after fields without defaults. Reordered all event dataclass fields to fix TypeError: non-default argument follows default argument.
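A minimal reproduction of the error this commit fixes, using a hypothetical event class with one required and one defaulted field:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Wrong order: a field without a default after a field with one raises
# at decoration time.
try:
    @dataclass
    class BrokenStep:
        metadata: Dict[str, Any] = field(default_factory=dict)
        step_id: str  # no default -> TypeError
except TypeError as exc:
    print(exc)  # non-default argument 'step_id' follows default argument

# Fix: required fields first, defaulted fields last.
@dataclass
class FixedStep:
    step_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)
```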
5b00477 to 4fa1653
Landing this!
…3810) This PR updates the Conversation item related types and improves a couple of critical parts of the implementation:

- it creates a streaming output item for the final assistant message output by the model. Until now we only added content parts and included that message in the final response.
- rewrites the conversation update code completely to account for items other than messages (tool calls, outputs, etc.)

## Test Plan

Used the test script from llamastack/llama-stack-client-python#281 for this:

```
TEST_API_BASE_URL=http://localhost:8321/v1 \
pytest tests/integration/test_agent_turn_step_events.py::test_client_side_function_tool -xvs
```