fix: stream stats updates from RemoteConversation via WebSocket #1142
Conversation
This change fixes issue #1087, where the visualizer showed $0.00 cost until `conversation.run()` returned instead of showing real-time updates. The root cause was that stats updates mutated the `usage_to_metrics` dictionary in place, which does not trigger the `__setattr__` hook that broadcasts state change events.

Changes:
- Added `set_on_stats_change` callback to `ConversationStats`
- Modified `register_llm` to invoke the callback when stats change
- Connected the stats callback to state change notification in `ConversationState`
- Added tests verifying that stats changes trigger state update events

`RemoteState` already handles `ConversationStateUpdateEvent` with `key='stats'`, so no changes were needed there. Stats updates are now properly broadcast via WebSocket during agent execution.

Fixes #1087

Co-authored-by: openhands <openhands@all-hands.dev>
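The `__setattr__` detail can be illustrated with a minimal, hypothetical sketch (a simplified stand-in, not the real `ConversationState` class): an in-place dictionary mutation never goes through attribute assignment, so a `__setattr__`-based broadcast hook silently misses it.

```python
# Hypothetical minimal class demonstrating why mutating a dict in place
# bypasses a __setattr__-based change hook (not the real SDK class).
class State:
    def __init__(self):
        # Use object.__setattr__ here so construction doesn't fire the hook.
        object.__setattr__(self, "hooks_fired", [])
        object.__setattr__(self, "usage_to_metrics", {})

    def __setattr__(self, name, value):
        # The hook only fires on attribute *assignment*, never on mutation
        # of an object the attribute already points to.
        object.__setattr__(self, name, value)
        if name == "usage_to_metrics":
            self.hooks_fired.append("broadcast")


state = State()
state.usage_to_metrics["agent"] = {"cost": 0.05}       # in-place mutation: hook never fires
state.usage_to_metrics = dict(state.usage_to_metrics)  # reassignment: hook fires once
```

Only the final reassignment lands in `hooks_fired`, which is why a per-mutation callback mechanism was needed.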
This fixes issue #1087, where RemoteConversation stats were not updated in real-time during agent execution. The previous implementation only triggered stats callbacks when new LLMs were registered, not when existing LLM metrics were updated via `add_cost()`, `add_token_usage()`, etc. As a result, cost and token usage updates were not broadcast via WebSocket during conversation execution.

Changes:

1. Added a callback mechanism to the `Metrics` class:
   - Added `_on_change` private attribute to hold the callback
   - Added `set_on_change()` method to register/remove the callback
   - Added `_notify_change()` method to invoke the callback
   - Modified `add_cost()`, `add_token_usage()`, `add_response_latency()`, and `merge()` to call `_notify_change()` after mutations
2. Updated `ConversationStats.register_llm()`:
   - Sets up the stats callback on newly registered LLM metrics objects, ensuring all metrics updates are propagated to state changes
3. Updated `ConversationStats.set_on_stats_change()`:
   - Wires the callback to all already-registered metrics, handling the case where the callback is set after LLMs are registered
4. Added comprehensive tests:
   - `test_metrics_callback.py`: tests the Metrics callback mechanism
   - `test_stats_streaming_integration.py`: integration tests verifying that metrics updates trigger `ConversationStateUpdateEvent`s

Result: stats are now streamed in real-time via WebSocket during agent execution, allowing visualizers to show live cost updates instead of $0.00 until `conversation.run()` returns.

Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands \codereview-roasted this PR Also, we seem to send back Metrics every time -- which includes a growing list. Is there any opportunity to send back
I'm on it! xingyaoww can track my progress at all-hands.dev
Summary of Changes

I've successfully implemented a fix for issue #1087: remote conversation stats streaming.

Problem

Previously, when LLM metrics were updated during conversation execution (e.g., adding costs or token usage), the state change callbacks were not triggered. This meant remote clients couldn't receive real-time stats updates during long-running conversations.

Solution Architecture

Implemented a three-layer callback mechanism to propagate metrics changes:
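The three layers might be wired up as in the following condensed sketch (class and method names are taken from this PR's description, but the bodies are simplified assumptions, not the real SDK code):

```python
# Layer 1: Metrics fires a callback on every mutation.
# Layer 2: ConversationStats wires that callback for every registered LLM.
# Layer 3 (stand-in): ConversationState would emit a
#   ConversationStateUpdateEvent(key="stats") over the WebSocket.
class Metrics:
    def __init__(self):
        self.accumulated_cost = 0.0
        self._on_change = None

    def set_on_change(self, cb):
        self._on_change = cb

    def add_cost(self, cost):
        self.accumulated_cost += cost
        if self._on_change:
            self._on_change(self)  # layer 1: mutation fires the callback


class ConversationStats:
    def __init__(self):
        self.usage_to_metrics = {}
        self._on_stats_change = None

    def register_llm(self, usage_id, metrics):
        self.usage_to_metrics[usage_id] = metrics
        metrics.set_on_change(lambda _m: self._notify())  # wire newly registered LLMs

    def set_on_stats_change(self, cb):
        self._on_stats_change = cb
        for metrics in self.usage_to_metrics.values():
            metrics.set_on_change(lambda _m: self._notify())  # wire LLMs registered earlier

    def _notify(self):
        if self._on_stats_change:
            self._on_stats_change()


updates = []
stats = ConversationStats()
early = Metrics()
stats.register_llm("early-llm", early)  # registered before the callback is set
stats.set_on_stats_change(lambda: updates.append("stats"))
late = Metrics()
stats.register_llm("late-llm", late)    # registered after the callback is set
early.add_cost(0.01)
late.add_cost(0.02)
```

Both registration orders stream updates, matching the "callback set after LLMs are registered" case the PR calls out.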
Additional Changes
Tests Added
All tests pass and the implementation follows the repository's engineering principles: simple, clear, backward-compatible, and maintainable. The branch
Superseded by #1159
Summary
Fixes #1087 - Enables real-time stats updates in RemoteConversation by streaming stats changes via WebSocket events during agent execution.
Problem
When running the example script `uv run examples/02_remote_agent_server/01_convo_with_local_agent_server.py`, the visualizer would display $0.00 cost during `conversation.run()` execution and only update to the actual cost after completion. This made it impossible to track LLM costs in real-time during agent execution.

Root Cause
The bug occurred because stats callbacks were only triggered when new LLMs were registered, but not when existing LLM metrics were updated during conversation execution.
During agent execution, when LLM calls are made:

- `llm.metrics.add_cost(0.05)` and `llm.metrics.add_token_usage(...)` are called
- No `ConversationStateUpdateEvent` was broadcast

Solution
Implemented a comprehensive callback mechanism in the Metrics class to propagate all metric updates:
1. Added callback mechanism to Metrics class
- `_on_change` private attribute to hold the callback function
- `set_on_change()` method to register/remove callbacks
- `_notify_change()` helper to safely invoke callbacks
- `add_cost()` - triggers after cost is added
- `add_token_usage()` - triggers after token usage is recorded
- `add_response_latency()` - triggers after latency is recorded
- `merge()` - triggers after metrics are merged

2. Connected Metrics callbacks to ConversationStats
- `ConversationStats.set_on_stats_change()`: wires the callback to all already-registered metrics
- `ConversationStats.register_llm()`: sets the callback on newly registered LLM metrics

3. State change propagation (unchanged)
- `ConversationState.set_on_state_change()` already wires the stats callback
- `ConversationState._notify_stats_change()` already creates `ConversationStateUpdateEvent`
- `RemoteState.update_state_from_event()` already updates cached stats

Changes
Modified Files
- `openhands-sdk/openhands/sdk/llm/utils/metrics.py`: Added callback mechanism to Metrics class
- `openhands-sdk/openhands/sdk/conversation/conversation_stats.py`: Updated to set callbacks on all metrics

Test Files
- `tests/sdk/llm/test_metrics_callback.py`: Unit tests for Metrics callback mechanism (7 tests)
- `tests/sdk/conversation/test_stats_streaming_integration.py`: Integration tests for full stats streaming flow (3 tests)

Testing
Unit Tests for Metrics Callbacks
- `test_metrics_change_callback_on_add_cost` - Callback triggered on `add_cost()`
- `test_metrics_change_callback_on_add_token_usage` - Callback triggered on `add_token_usage()`
- `test_metrics_change_callback_on_multiple_updates` - Multiple updates trigger multiple callbacks
- `test_metrics_callback_can_be_cleared` - Callback can be removed
- `test_metrics_callback_exception_does_not_break_update` - Exceptions don't prevent updates
- `test_metrics_merge_triggers_callback` - `merge()` triggers callback
- `test_metrics_add_response_latency_triggers_callback` - Latency updates trigger callback

Integration Tests for Stats Streaming
- `test_metrics_updates_trigger_state_change_events` - Metrics updates create `ConversationStateUpdateEvent`s
- `test_multiple_llms_metrics_updates_all_trigger_events` - Multiple LLMs all stream correctly
- `test_callback_removal_stops_stats_streaming` - Callback removal stops streaming

Existing Tests
Example Flow
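The flow during a run might look like the following simplified stand-in (hypothetical names, not the real SDK API): each LLM call's cost update is now streamed immediately, instead of the total appearing only after `conversation.run()` completes.

```python
# Hypothetical example flow: three LLM calls during a simulated agent run,
# each streaming its running cost the moment the metric changes.
class Metrics:
    def __init__(self):
        self.accumulated_cost = 0.0
        self._on_change = None

    def add_cost(self, cost):
        self.accumulated_cost += cost
        if self._on_change:
            self._on_change(self)


# Stand-in for the WebSocket client: record each streamed stats payload.
streamed = []
metrics = Metrics()
metrics._on_change = lambda m: streamed.append(round(m.accumulated_cost, 4))

# Simulated agent run making three LLM calls.
for step_cost in (0.05, 0.03, 0.02):
    metrics.add_cost(step_cost)

# streamed now holds the live sequence of costs a visualizer would render,
# rather than a single final total.
```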
Backward Compatibility
This change is fully backward compatible: the callback is optional, and behavior is unchanged when none is registered.
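Backward compatibility follows from the callback defaulting to unset, which the following sketch illustrates (simplified class based on the description above, not the real SDK `Metrics`):

```python
# Backward-compatibility sketch: _on_change defaults to None, so code that
# never registers a callback behaves exactly as before the change.
class Metrics:
    def __init__(self):
        self.accumulated_cost = 0.0
        self._on_change = None           # no callback registered by default

    def add_cost(self, cost):
        self.accumulated_cost += cost
        if self._on_change is not None:  # silently skipped when unset
            self._on_change(self)


m = Metrics()
m.add_cost(0.05)  # no callback registered: no error, same result as before
```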
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- `eclipse-temurin:17-jdk`
- `nikolaik/python-nodejs:python3.12-nodejs22`
- `golang:1.21-bookworm`

Pull (multi-arch manifest)
```
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:5f15733-python
```

Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. `5f15733-python`) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. `5f15733-python-amd64`) are also available if needed