Improve websocket message coalescing to handle thundering herds better #118268
During startup, the websocket would frequently disconnect if more than 4096 entities were added back to back. Some MQTT setups have more than 10000 entities. Match the websocket peak value to the maximum expected number of entities.
…ntities' into websocket_match_max_expected_entities
…annot handle it" This reverts commit 439e2d7.
Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
@coderabbitai review
Walkthrough
The changes introduce a delayed token deletion mechanism in the authentication component to avoid immediate connection closure, and enhance the WebSocket API's message handling by managing queue sizes and future releases. The tests are updated to validate the new behavior.
Sequence Diagram(s) (Beta)
sequenceDiagram
participant User
participant HomeAssistant
participant AuthComponent
participant WebSocketAPI
User->>HomeAssistant: Request Authentication
HomeAssistant->>AuthComponent: Validate Token
AuthComponent->>User: Token Validated
User->>WebSocketAPI: Open WebSocket Connection
WebSocketAPI->>WebSocketAPI: Manage Queue Size
WebSocketAPI->>WebSocketAPI: Release Ready Future if Conditions Met
User->>AuthComponent: Request Token Deletion
AuthComponent->>AuthComponent: _delete_current_token_soon()
AuthComponent->>HomeAssistant: Schedule Token Deletion Task (async)
Note right of AuthComponent: Token deleted after delay
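The "token deleted after delay" step in the diagram can be illustrated with a small standalone sketch. This is not the code from this PR: `DELETE_DELAY`, the `tokens` dict, and the helper names are hypothetical, and the real delay and scheduling mechanism are not stated here. It only shows why deferring the deletion lets a response that needs several event loop iterations go out first.

```python
import asyncio

DELETE_DELAY = 0.05  # hypothetical delay in seconds; the real value is not given here


async def demo():
    loop = asyncio.get_running_loop()
    events = []
    tokens = {"current": "refresh-token"}  # stand-in for the user's own token

    def delete_current_token():
        # Deleting the current token is what closes the user's connection.
        tokens.pop("current", None)
        events.append("token deleted")

    def delete_current_token_soon():
        # Defer the deletion instead of running it on the next loop iteration.
        loop.call_later(DELETE_DELAY, delete_current_token)

    async def send_response():
        # Assumption: the response may need a few loop iterations to be sent.
        for _ in range(3):
            await asyncio.sleep(0)
        events.append("response sent")

    delete_current_token_soon()
    await send_response()
    await asyncio.sleep(DELETE_DELAY * 2)  # wait until the deferred deletion has run
    return events, tokens


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the response is recorded before the deletion because the timer fires well after the few loop iterations the response needs.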
I tried using a 5 ms timer, but that increased the latency too much. call_soon seems to be the best way, since it balances the desire for low latency with not overloading the system.
@coderabbitai review
I've had the bootstrap test fail a few times today. I fixed it in #118285
thanks
* dev: (751 commits) Use runtime_data in ping (home-assistant#118332) Fix last_reported_timestamp not being updated when last_reported is changed (home-assistant#118341) Replace pop calls with del where the result is discarded in restore_state (home-assistant#118339) Improve websocket message coalescing to handle thundering herds better (home-assistant#118268) Add cache to more complex entity filters (home-assistant#118344) Reduce the intent response data sent to LLMs (home-assistant#118346) Small speed up to connecting dispatchers (home-assistant#118342) Tweak Assist LLM API prompt (home-assistant#118343) Add Conversation command to timers (home-assistant#118325) LLM Assist API to ignore intents if not needed for exposed entities or calling device (home-assistant#118283) Replace pop calls with del where the result is discarded in entity (home-assistant#118340) Replace pop calls with del where the result is discarded in mqtt (home-assistant#118338) Use del instead of pop in the entity platform remove (home-assistant#118337) Update the recommended model for Google Gen AI (home-assistant#118323) Fix source_change not triggering an update (home-assistant#118312) Several fixes for the Matter climate platform (home-assistant#118322) Use None default for traccar server battery level sensor (home-assistant#118324) [esphome] 100% voice assistant test coverage (home-assistant#118334) Mark sonos group update a background task (home-assistant#118333) Filter timers more when pausing/unpausing (home-assistant#118331) ...
* dev: (8244 commits) Update zwave_js WS APIs for provisioning (home-assistant#117400) Add OSO Energy binary sensors (home-assistant#117174) Add august open action (home-assistant#113795) Add smoke detector temperature to Yale Smart Alarm (home-assistant#116306) Don't report entities with invalid unique id when loading the entity registry (home-assistant#118290) Fix epic_games_store mystery game URL (home-assistant#118314) Use runtime_data in ping (home-assistant#118332) Fix last_reported_timestamp not being updated when last_reported is changed (home-assistant#118341) Replace pop calls with del where the result is discarded in restore_state (home-assistant#118339) Improve websocket message coalescing to handle thundering herds better (home-assistant#118268) Add cache to more complex entity filters (home-assistant#118344) Reduce the intent response data sent to LLMs (home-assistant#118346) Small speed up to connecting dispatchers (home-assistant#118342) Tweak Assist LLM API prompt (home-assistant#118343) Add Conversation command to timers (home-assistant#118325) LLM Assist API to ignore intents if not needed for exposed entities or calling device (home-assistant#118283) Replace pop calls with del where the result is discarded in entity (home-assistant#118340) Replace pop calls with del where the result is discarded in mqtt (home-assistant#118338) Use del instead of pop in the entity platform remove (home-assistant#118337) Update the recommended model for Google Gen AI (home-assistant#118323) ...
* dev: (1785 commits) Update zwave_js WS APIs for provisioning (home-assistant#117400) Add OSO Energy binary sensors (home-assistant#117174) Add august open action (home-assistant#113795) Add smoke detector temperature to Yale Smart Alarm (home-assistant#116306) Don't report entities with invalid unique id when loading the entity registry (home-assistant#118290) Fix epic_games_store mystery game URL (home-assistant#118314) Use runtime_data in ping (home-assistant#118332) Fix last_reported_timestamp not being updated when last_reported is changed (home-assistant#118341) Replace pop calls with del where the result is discarded in restore_state (home-assistant#118339) Improve websocket message coalescing to handle thundering herds better (home-assistant#118268) Add cache to more complex entity filters (home-assistant#118344) Reduce the intent response data sent to LLMs (home-assistant#118346) Small speed up to connecting dispatchers (home-assistant#118342) Tweak Assist LLM API prompt (home-assistant#118343) Add Conversation command to timers (home-assistant#118325) LLM Assist API to ignore intents if not needed for exposed entities or calling device (home-assistant#118283) Replace pop calls with del where the result is discarded in entity (home-assistant#118340) Replace pop calls with del where the result is discarded in mqtt (home-assistant#118338) Use del instead of pop in the entity platform remove (home-assistant#118337) Update the recommended model for Google Gen AI (home-assistant#118323) ...
Proposed change
When entities are added, we may add them in back-to-back tasks, which may return control to the event loop due to I/O or other suspension. This means that the WebSocket sender would run between each task, and the messages would not be coalesced, resulting in a thundering herd of messages to the browser, which would, in some cases, make the UI unresponsive.
To solve this, each time we are about to release the sender's future we now check whether the queue has grown. If it has, we reschedule the release for the next iteration of the event loop. A safety value of PENDING_MSG_MAX_FORCE_READY is used to make sure we don't coalesce too many messages together, so that the payload does not grow too large and the delay in sending does not exceed a few microseconds.
The auth code that deletes the current refresh token (which closes the user's connection) expected that it would always take one event loop iteration to send the response to deleting all the refresh tokens. Since it may now take a few iterations, that code needed a longer delay to ensure the response could be sent before the user's own token was deleted.
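The coalescing approach described above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code in this PR: `CoalescingSender`, its methods, `produce`, and `demo` are hypothetical names, and the cap is passed in as a parameter because the actual value of PENDING_MSG_MAX_FORCE_READY is not stated here.

```python
import asyncio
from collections import deque


class CoalescingSender:
    """Toy sender that delays waking the writer while the queue keeps growing."""

    def __init__(self, loop, max_force_ready):
        self._loop = loop
        self._max_force_ready = max_force_ready  # stand-in for the PR's safety cap
        self._queue = deque()
        self._ready = loop.create_future()  # the writer waits on this future
        self._check = None  # handle of the pending release check, if any
        self._seen_size = 0  # queue size observed at the last check

    def send(self, message):
        self._queue.append(message)
        if len(self._queue) >= self._max_force_ready:
            self._release()  # cap reached: wake the writer right away
        elif self._check is None:
            self._seen_size = len(self._queue)
            self._check = self._loop.call_soon(self._release_or_reschedule)

    def _release_or_reschedule(self):
        size = len(self._queue)
        if size > self._seen_size:
            # The queue grew since the last check: wait one more loop iteration.
            self._seen_size = size
            self._check = self._loop.call_soon(self._release_or_reschedule)
            return
        self._release()

    def _release(self):
        if self._check is not None:
            self._check.cancel()
            self._check = None
        if not self._ready.done():
            self._ready.set_result(None)

    async def writer(self, batches):
        while True:
            await self._ready
            self._ready = self._loop.create_future()
            if self._queue:
                batches.append(list(self._queue))  # one coalesced write
                self._queue.clear()


async def produce(sender, start, count):
    for i in range(count):
        sender.send(start + i)
        await asyncio.sleep(0)  # yield to the event loop, like a task doing I/O


async def demo(producers, per_producer, max_force_ready):
    loop = asyncio.get_running_loop()
    sender = CoalescingSender(loop, max_force_ready)
    batches = []
    writer = asyncio.create_task(sender.writer(batches))
    await asyncio.gather(
        *(produce(sender, p * per_producer, per_producer) for p in range(producers))
    )
    await asyncio.sleep(0.01)  # give the writer time to flush
    writer.cancel()
    await asyncio.gather(writer, return_exceptions=True)
    return batches


if __name__ == "__main__":
    print(asyncio.run(demo(producers=2, per_producer=3, max_force_ready=100)))
```

In this sketch, two producers that yield after every message end up in a single batch as long as the queue stays below the cap. Passing a small cap (for example 4) wakes the writer earlier and splits the same messages across several batches.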