Conversation
- Fix max_tool_calls check order to allow the correct number of calls
- Add close() methods to SQLiteRunStore and SQLiteSessionStore
- Validate worker names in _select_worker_name against known workers
- Remove duplicate CONTEXT_SUMMARY_MAX_ATTEMPTS constant
- Remove the global warning filter from the litellm adapter's module level
- Remove unused MCP wrapper functions
- Add debug logging for vector memory exception handling
- Clarify operator precedence with parentheses in session.py
- Fix docs: the run.failed payload uses an errors list, not an error field
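The max_tool_calls fix above is a classic check-order bug: validating the count after executing a call (or with `>` after an increment) allows one extra call. A minimal sketch of the corrected ordering — the function and parameter names here are illustrative, not the repo's actual API:

```python
def run_tool_calls(calls, max_tool_calls):
    """Execute up to max_tool_calls callables from `calls` (names are illustrative)."""
    executed = []
    for call in calls:
        # Check the limit BEFORE executing, so exactly max_tool_calls calls run.
        if len(executed) >= max_tool_calls:
            break
        executed.append(call())
    return executed
```

With the check placed before execution, `max_tool_calls=2` runs exactly two calls instead of three.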
Add LiteLLM CustomLogger integration for LLM lifecycle events (llm.started, llm.completed, llm.failed) with token usage, latency, and cost tracking. Add cost calculation utilities using LiteLLM pricing data. Refactor worker_runner.py for a simplified, consistent sync/async flow. Update AGENTS.md and adapter documentation. Add tests for the callback handler and async/sync entry point guards.
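The lifecycle-event flow described above (started, completed with latency, failed on exception) can be sketched as a generic wrapper. This is a self-contained illustration, not the PR's actual implementation; `call_llm_with_events`, `emit`, and the payload keys are all assumed names:

```python
import time

def call_llm_with_events(call_fn, model, emit):
    """Run call_fn(), emitting llm.started/completed/failed events around it.

    `emit(event_name, payload)` is an assumed callback; the real PR wires
    these events through LiteLLM's CustomLogger hooks instead.
    """
    emit("llm.started", {"model": model})
    start = time.monotonic()
    try:
        response = call_fn()
    except Exception as exc:
        # Failures emit llm.failed before re-raising to the caller.
        emit("llm.failed", {"model": model, "error": str(exc)})
        raise
    latency_s = time.monotonic() - start
    usage = getattr(response, "usage", None)  # token usage, if the response carries it
    emit("llm.completed", {"model": model, "latency_s": latency_s, "usage": usage})
    return response
```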
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9208b6e883
```python
warnings.filterwarnings("ignore", message="Pydantic serializer warnings")
response = litellm.completion(**litellm_params)
if stream:
    return cast(list[dict[str, Any]], response)
emit_llm_completed(model, response)
```
Emit llm.completed for streaming requests
When stream=True, complete() returns the stream before emit_llm_completed runs, so no llm.completed event (and thus no usage/cost payload) is ever emitted for streaming runs. This means telemetry subscribers will silently miss all streaming calls; additionally, exceptions raised while iterating the stream won’t trigger llm.failed because the try/except only wraps the initial litellm.completion call. Consider emitting completion/failure after the stream is consumed (e.g., in _astream_completion) to cover the streaming path.
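One way to act on this suggestion is to wrap the returned stream in a generator that emits the completion event only after the last chunk has been consumed, and the failure event if iteration raises. A minimal sketch, assuming a generic `emit(event, payload)` callback; `wrap_stream` and the payload fields are illustrative, not the repo's actual helpers:

```python
def wrap_stream(chunks, emit, model):
    """Yield chunks from an LLM stream, emitting lifecycle events at the end.

    llm.completed fires only after the stream is fully consumed, so usage/cost
    payloads cover streaming calls; llm.failed fires if iteration raises.
    """
    count = 0
    try:
        for chunk in chunks:
            count += 1
            yield chunk
    except Exception as exc:
        # Mid-stream errors are otherwise invisible to the try/except that
        # only wraps the initial litellm.completion() call.
        emit("llm.failed", {"model": model, "error": str(exc)})
        raise
    else:
        emit("llm.completed", {"model": model, "chunks": count})
```

In the adapter, `complete()` would return `wrap_stream(response, emit, model)` instead of the raw response when `stream=True`.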
Summary
- Add LiteLLM CustomLogger integration for LLM lifecycle events (llm.started, llm.completed, llm.failed) with token usage, latency, and cost tracking

Changes

Features
- Add adapters/cost.py module for cost utilities

Bug Fixes

Docs