
Permanent error: Client exceeded max pending messages [2]: 512 #26724

Closed
Spirituss opened this issue Sep 19, 2019 · 76 comments

Comments

@Spirituss

Home Assistant release with the issue:

HA 0.98.5

Last working Home Assistant release (if known):
HA 0.91 (0.92.0?)

Operating environment (Hass.io/Docker/Windows/etc.):

HASS.io/Docker/Raspbian Buster

Component/platform:

homeassistant.components.websocket_api.http.connection

Description of problem:
I have to reopen the issue (#23938) since it was unreasonably closed (some people have the same problem but in different environments).
To be more exact: there is no use of Node-RED and no HA automation restarts, but the issue is still present.

Problem-relevant configuration.yaml entries (fill out even if it seems unimportant):

Tons of errors in the error log with no clear reason. I can have more than 15,000 such errors a day.

Traceback (if applicable):

ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.xxxx] Client exceeded max pending messages [2]: 512

Additional information:

@stainlessray

stainlessray commented Oct 31, 2019

I have the same stupid issue. Hass is unusable. As stated in the OP there is no clear reason for this error. Examples in other posts indicate the cause is completely unknown at this time.

Maybe try NOT closing the issue again? You're decentralizing the attempts to troubleshoot with no possible gain in doing so.

@arnonel

arnonel commented Oct 31, 2019

Getting same error

@Spirituss
Author

Updated to 101.2 - HA has become completely unusable. I have hundreds of such errors every minute. All the server CPU is wasted on writing the error log.

@stainlessray

stainlessray commented Nov 2, 2019

Updated to 101.2 - HA has become completely unusable. I have hundreds of such errors every minute. All the server CPU is wasted on writing the error log.

I updated, same result. I'm leaning toward starting from scratch. Such a hassle.

@rainisto

rainisto commented Nov 10, 2019

I'm getting the same error from a fresh hass.io HA 101.3 install. I also tested a plain 101.3 HA venv install without hass.io, and that one works without generating errors in the log, so my guess is that the problem is related to hass.io Docker container permissions/routing.

@mfl76

mfl76 commented Nov 10, 2019

Same issue with 0.100.2.

@browetd

browetd commented Nov 12, 2019

Same error with 0.101.3 (in a virtual environment)... I started tracking it in the logs. Here is the latest extract (HA seems to work fine despite this recurring error message):

2019-11-12 11:13:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.chauffage_salle_de_bain_bas, old_state=<stat$
2019-11-12 11:13:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.atlantic_salle_de_bain_haut, old_state=<stat$
2019-11-12 11:13:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=automation.get_smappee_c_m, old_state=<state automa$
2019-11-12 11:13:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_c, old_state=<state sensor.sma$
2019-11-12 11:13:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:13:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_solar_c, old_state=<state sensor.sma$
2019-11-12 11:13:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:13:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_m, old_state=<state sensor.sma$
2019-11-12 11:13:41 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=automation.get_smappee_c_m, old_state=<state automa$
2019-11-12 11:13:45 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_solar_c, old_state=<state sensor.sma$
2019-11-12 11:13:45 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:13:45 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_m, old_state=<state sensor.sma$
2019-11-12 11:13:45 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_c, old_state=<state sensor.sma$
2019-11-12 11:13:45 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:13:52 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.chauffe_eau_cuisine, old_state=<state switch$
2019-11-12 11:13:52 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.chauffage_salle_de_bain_bas, old_state=<stat$
2019-11-12 11:13:52 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.atlantic_salle_de_bain_haut, old_state=<stat$
2019-11-12 11:14:00 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.time, old_state=<state sensor.time=11:13; fr$
2019-11-12 11:14:00 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.date_time, old_state=<state sensor.date_time$
2019-11-12 11:14:01 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=automation.get_smappee_c_m, old_state=<state automa$
2019-11-12 11:14:07 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_solar_c, old_state=<state sensor.sma$
2019-11-12 11:14:07 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:14:07 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_m, old_state=<state sensor.sma$
2019-11-12 11:14:07 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_c, old_state=<state sensor.sma$
2019-11-12 11:14:07 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:14:17 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.hall_d_entree_io_9605662_2, old_state=<state$
2019-11-12 11:14:17 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.chambre_thomas_io_4977817_2, old_state=<stat$
2019-11-12 11:14:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=climate.hall_d_entree_i2g_actuator, old_state=<stat$
2019-11-12 11:14:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=climate.chambre_thomas_i2g_actuator, old_state=<sta$
2019-11-12 11:14:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=automation.get_smappee_c_m, old_state=<state automa$
2019-11-12 11:14:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.chauffage_salle_de_bain_bas, old_state=<stat$
2019-11-12 11:14:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.chauffe_eau_cuisine, old_state=<state switch$
2019-11-12 11:14:23 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=switch.atlantic_salle_de_bain_haut, old_state=<stat$
2019-11-12 11:14:29 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_c, old_state=<state sensor.sma$
2019-11-12 11:14:29 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:14:29 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_usage_m, old_state=<state sensor.sma$
2019-11-12 11:14:29 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_solar_c, old_state=<state sensor.sma$
2019-11-12 11:14:29 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=sensor.smappee_injected, old_state=<state sensor.sm$
2019-11-12 11:14:41 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Sending {'id': 2, 'type': 'event', 'event': <Event state_changed[L]: entity_id=automation.get_smappee_c_m, old_state=<state automa$
2019-11-12 11:26:09 ERROR (MainThread) [homeassistant.core] Error doing job: Fatal read error on socket transport
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 801, in _read_ready__data_received
    data = self._sock.recv(self.max_size)
TimeoutError: [Errno 110] Connection timed out
2019-11-12 11:26:09 INFO (MainThread) [homeassistant.components.websocket_api.http.connection.2614778736] Connection closed by client
2019-11-12 11:26:09 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2614778736] Disconnected
2019-11-12 11:32:21 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Client exceeded max pending messages [2]: 512
2019-11-12 11:32:21 INFO (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Connection closed by client
2019-11-12 11:32:21 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection.2612512176] Disconnected

@electrofloat

Same here:

Dec  4 08:57:40 localhost hass[18209]: 2019-12-04 08:57:40 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140515727017040] Client exceeded max pending messages [2]: 512
Dec  4 08:57:40 localhost hass[18209]: message repeated 2 times: [ 2019-12-04 08:57:40 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140515727017040] Client exceeded max pending messages [2]: 512]

It seems to happen some time after I connect to HA from the Android web app.

@tomlut

tomlut commented Dec 31, 2019

Found what was causing these errors for me and can reproduce it 100%:

  1. Open and edit a Lovelace card in Firefox (have not tested Chrome yet).
  2. Close the card edit and exit UI edit mode, do not refresh the browser.
  3. Navigate to another view or sidebar menu a few times.
  4. Sooner rather than later Firefox starts throwing up rapidly repeating “Refresh Lovelace” toast notifications in the lower left corner. This may eventually cause the screen to show a blank page with a useless “Refresh Lovelace” button in the centre. This causes thousands of errors in a very short time.

Standard browser refresh clears the error loop.

So I guess I just have to remember to refresh my browser after editing a card.

Possibly a polymer/frontend issue with how HA attempts to refresh Firefox.

@Spirituss
Author

I still have this error message in 0.103.5, but more rarely now.

@andriej
Contributor

andriej commented Mar 30, 2020

Issue in frontend part: home-assistant/frontend#5328

@Spirituss
Author

On 107.7 HA is very unstable. The interface periodically becomes unavailable and the CPU usage of the HA server increases from 5% to more than 40%. During such periods I see tons of errors: ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140484979526352] Client exceeded max pending messages [2]: 512

@browetd

browetd commented Apr 11, 2020

The problem is still there in 0.108.2... After a request from Lovelace (via a button) or through the developer tools to reload automations, I get 102 messages in the same second: "2020-04-11 05:49:36 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.2614746480] Client exceeded max pending messages [2]: 512" ...

@cratermaddie

I'm seeing this issue as well. I noticed that removing every card that used card-templater seemed to resolve the issue for me but that's obviously not an ideal resolution...

@fraschizzato

fraschizzato commented Apr 24, 2020

Same in 0.108.8 - happens randomly, and I have to restart Home Assistant to get it working again.
I never used card-templater, only card-tools.

I solved the continuous errors; here are the steps from my post on community.home-assistant.io:

After many tries I've found that the "problem" is HACS and/or card-tools... I still have to find out whether it is directly related to those integrations or is the sum of other issues... Let me explain:

  • Removed all Lovelace resources from the GUI
  • Removed all HACS integrations and HACS itself
  • Commented out entities that use other custom_components
  • Manually cleaned all the folders of custom components and resources
  • Removed (after 2 restarts, as mentioned in the HACS removal guide) the HACS files in the .storage folder

During the cleanup I found a lot of unused old stuff and some JS that was initially installed manually and then again via HACS... So the issue probably comes from that confusion as much as from the custom_components themselves...
Over the next days I will progressively re-add all the AppDaemon apps, components and entities I removed or commented out to see if the problem reappears...
One thing is sure: until I performed the steps described above, the issue was still there.

Sometimes the issue reoccurs: Hass gets stuck for some minutes and then runs correctly again.
From observation I've seen that it can be related to the sleep mode of a Win10 PC client. If that client has a Chrome tab open with Hass, the error starts when it goes into sleep mode.

@balloob
Member

balloob commented Apr 28, 2020

109 includes a fix to allow websocket API clients to catch up for 5 seconds if they fall behind on consuming messages.
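
As a rough illustration of that behaviour (a minimal sketch only, not the actual homeassistant.components.websocket_api code; the constant names are assumptions, while the numbers mirror the error message and the 5-second window mentioned above): each client gets an outbound queue, and the connection is only dropped if the queue stays above the limit for the whole grace period.

import time
from collections import deque
from typing import Deque, Optional

MAX_PENDING = 512    # mirrors the limit shown in the error message
GRACE_SECONDS = 5.0  # mirrors the catch-up window described above

class ClientQueue:
    """Outbound message queue for one websocket client (illustration only)."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()
        self._over_limit_since: Optional[float] = None

    def send(self, message: str) -> bool:
        """Queue a message; return False when the client should be disconnected."""
        self.pending.append(message)
        if len(self.pending) <= MAX_PENDING:
            self._over_limit_since = None        # client has caught up, reset the timer
            return True
        now = time.monotonic()
        if self._over_limit_since is None:
            self._over_limit_since = now         # grace period starts
            return True
        if now - self._over_limit_since < GRACE_SECONDS:
            return True                          # still allowed to catch up
        return False                             # "unable to keep up" -> disconnect

Under a scheme like this, the older "Client exceeded max pending messages" disconnect becomes the newer "Stayed over 512 for 5 seconds" message seen in later comments; the grace period changes when the client is dropped, not how many messages are being produced.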

@fraschizzato

I think it's related to Chrome tabs that remain open for a long time. More details below:

  • I've tried with and without hacs, appdaemon, custom_components and custom_cards, and (except for custom_cards) I didn't find a direct correlation. For custom_cards there are 2 different causes: the UI not being correctly reloaded (most common - use CTRL+F5 in Chrome to reload) and duplication/overlap of cards/integrations for those who now use HACS but previously added them manually (a lot more buggy - double check that all customizations are removed before using HACS to add them)
  • I found that one automation was not triggered during the ERROR, so I made an MQTT watchdog timer to see exactly when the problem occurs (a rough sketch of the idea follows this list):
  • Initially I found that (with my config) it happens about twice a day. Hass is not responsive during the error, BUT 3 out of 4 times it self-restores - after some time (anywhere between 30 sec and 8 min) Hass works as if nothing had happened. So the error is often not noticeable; it only causes problems if the user tries to use the UI while it is happening.
  • But, MOST IMPORTANT, I've found that the error happens ONLY if there is a client (phone or PC - Chrome or Home Assistant Companion) with the UI open for a long time. Now that I always close any Home Assistant tab after use, the error is gone (12 h without any error so far and counting)
  • It seems, though I'm not sure about this, that it's related to the PC/phone with the open tab going into sleep mode, or (as in my case) to Wi-Fi/network drops
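
A rough sketch of such a watchdog (the broker address, topic and timings below are hypothetical, not the setup described above): a Home Assistant automation is assumed to publish a periodic heartbeat over MQTT, and a small external script warns when heartbeats stop arriving, which is when the websocket errors tend to start.

import time
import paho.mqtt.client as mqtt  # pip install paho-mqtt

BROKER = "192.168.1.2"                # assumed broker address
TOPIC = "homeassistant/heartbeat"     # hypothetical heartbeat topic
TIMEOUT = 120                         # seconds without a heartbeat before warning

last_seen = time.monotonic()

def on_message(client, userdata, msg):
    global last_seen
    last_seen = time.monotonic()      # any message on TOPIC counts as a heartbeat

client = mqtt.Client()                # paho-mqtt 1.x style; 2.x also wants a CallbackAPIVersion
client.on_message = on_message
client.connect(BROKER)
client.subscribe(TOPIC)
client.loop_start()

while True:
    time.sleep(10)
    if time.monotonic() - last_seen > TIMEOUT:
        print(f"No heartbeat for more than {TIMEOUT}s - Home Assistant may be stuck")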

Probably Hass and the frontend do not correctly manage the socket disconnection/reconnection. Hass should stop sending data through the websocket once messages start pending, and meanwhile the frontend should attempt a reconnection (like when Hass is restarted but the UI is kept open).
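
A very rough sketch of that reconnect-with-backoff behaviour (this uses the third-party websockets package and a hypothetical URL; it is not the actual Home Assistant frontend code):

import asyncio
import websockets  # pip install websockets

async def listen(url: str) -> None:
    backoff = 1
    while True:
        try:
            async with websockets.connect(url) as ws:
                backoff = 1                      # reset after a successful connect
                async for message in ws:
                    pass                         # hand the event to the UI here
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(backoff)         # wait before trying again
            backoff = min(backoff * 2, 60)       # exponential backoff, capped at 60 s

# Example (hypothetical host): asyncio.run(listen("ws://homeassistant.local:8123/api/websocket"))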

Hope it helps

@andriej
Contributor

andriej commented Apr 29, 2020

I fully agree with the part of your analysis saying that it's always caused by clients having the UI open for a long time.
In my case the client generates more and more traffic over time (increasing), so HA becomes more and more unresponsive (delays with lights), probably because of the amount of logs.

@SnowViet

I have a similar issue.
It happens when I keep my browser open for a long time (Chrome 81.0.4044.129).
HA keeps logging:

ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140484979526352] Client exceeded max pending messages [2]: 512

HA gets frozen, my automations are really slow, etc.
But when I close the browser (or just the HA page), it works again.

@andriej
Contributor

andriej commented Apr 29, 2020

Check your log size; it's getting frozen because of the gigabytes it has to write to disk ;-)

@SnowViet

I don't think so, the logfile is "only" 100 MB, with more than 550,000 error lines (and it's working for the time being).
Is it frozen because of high CPU usage? I don't know; it's running inside Docker on my NAS and it's CPU limited. I will check the I/O, CPU and RAM usage next time it gets frozen.

@fraschizzato

With more than 800 entities... you can imagine the traffic on the socket... For me everything started happening last week. I updated Hass but don't remember the previous (last working) version. No new entities were added; the only update was to Home Assistant...

Another point: with only the number in the logs I can't identify which device causes the issue. A more detailed client identifier in the error message would be useful...

@fraschizzato

fraschizzato commented Apr 29, 2020

Just found that the client causing all of this is AppDaemon... I don't know yet whether it's AppDaemon itself or a specific app.

Update: AppDaemon is the client that shows up in the logs; it isn't clear whether it is the cause or a symptom.

@cgarwood
Member

cgarwood commented May 1, 2020

I've also been seeing this occasionally on 0.108 and 0.109. It does seem to affect AppDaemon, as my AppDaemon automations all stopped working. However, I still saw the errors continuing in the logs after I stopped AppDaemon. Some combination of closing HA tabs in my browser and an HA restart finally cleared it up (for now).

@fraschizzato

Also in 109.1, with appdaemon 4.0.3

2020-05-01 21:52:31 ERROR (MainThread) [backoff] Giving up get_ratelimit(...) after 5 tries (concurrent.futures._base.TimeoutError)
2020-05-01 21:52:39 ERROR (MainThread) [homeassistant.core] Error doing job: Exception in callback WebSocketResponse._send_heartbeat()
2020-05-01 22:00:08 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
2020-05-01 22:13:11 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140167701502664] Client unable to keep up with pending messages. Stayed over 512 for 5 seconds
2020-05-01 22:13:11 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection.140167705232104] Client unable to keep up with pending messages. Stayed over 512 for 5 seconds
2020-05-01 22:18:23 ERROR (MainThread) [homeassistant.core] Error doing job: Exception in callback WebSocketResponse._send_heartbeat()
2020-05-01 22:18:23 ERROR (MainThread) [homeassistant.core] Error doing job: Exception in callback WebSocketResponse._send_heartbeat()
2020-05-01 22:18:23 ERROR (MainThread) [homeassistant.core] Error doing job: Exception in callback WebSocketResponse._send_heartbeat()

At 22:10 I received the message from the watchdog... Hass was unresponsive and didn't run any automations (the watchdog is based on automated MQTT messages from Hass).

logs from appdaemon:

2020-05-01 22:09:34.040928 INFO revert_back: automation.autostoppannellocamera on
2020-05-01 22:09:34.043889 INFO revert_back: automation.watchdog_spa on
2020-05-01 22:09:53.057904 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
2020-05-01 22:10:58.075173 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
2020-05-01 22:10:58.107157 INFO revert_back: automation.cloro_e_ph off
2020-05-01 22:10:58.110539 INFO revert_back: automation.luci_natale_on off

I don't know what to do at this point.

@Botr16

Botr16 commented May 2, 2020

Same problem here. I noticed it started happening when editing the Lovelace interface, and I had just started using layout-card and card-mod with button-card. As of 0.109 the issues got worse: I don't get the "Connection Lost" popup anymore, but the interface is slow, sluggish, sometimes fails to load, and the page I was working on is now broken:
[screenshot: hassio_tagpage]
Running 0.108 this page was fine, but the socket was overloaded every time I made more than one change to the interface without reloading the page. I have a phone always on and showing Hassio, but this wasn't an issue in the last few months.
I'm not very experienced, so forgive me if I missed something; feel free to ask me anything.

@bdraco
Member

bdraco commented Nov 9, 2020

It's going to depend on the performance of the system, the latency of the connection, and how fast the client can process the messages. On my system, it can easily process 100 state change events per second. On an RPi3 it will fall over.

30 every 5 seconds is a bit high if they all get delivered at the same time. Most, but not all, systems will be able to keep up. I'd expect lower-end hardware to fail at that level. Using force_update should probably be an option for users who have the hardware/setup to handle it.

@RobBie1221
Contributor

@bdraco Thanks! This helps a lot. Having it as an option for the integration is also not really ideal: looking at this specific integration, only maybe 5 out of the 30 entities are really relevant for force_update, but even that could be very personal. That would mean even people with hardware which cannot handle 30 might still want to have it on a handful. I've also done some work on the rfxtrx integration, where force_update is now true for sensor and binary sensor entities, but that may also be a bit of overkill, as some people have 100+ devices in that integration, all of which have a battery and an RSSI entity.

Perhaps in the future this could be added to customization, to allow users to set it per entity...

@pierre2113

pierre2113 commented Nov 25, 2020

Just wondering if anybody else is finding this issue

"[homeassistant.components.websocket_api.http.connection] [1755060880] Client unable to keep up with pending messages. Stayed over 512 for 5 seconds"

related to automations where you are running update_entity on a sensor.

- service: homeassistant.update_entity
  data:
    entity_id: sensor.blueirisprofile

I had a few, in different automations. In HA 0.118.3 the "512 for 5 seconds" error message always appears right after an automation runs. When I comment out update_entity and reload automations, the CPU usage also drops by at least 10%.
I still have a few update_entity calls left; I'm waiting to see if the "...512 for 5" message follows the remaining automations that still have it before I remove the last 2.

@Aham330

Aham330 commented Dec 7, 2020

I have the same issue; it started after updating to 0.118.5, running HassOS 4.17 on an RPi 4 GB.

Running the HACS integration, IPP, Flux LED and LimitlessLED, I get loads of "update of light ... taking longer than 10 seconds" warnings. I would love to see some actual solutions to these issues.

Having said that, I love HA and its enormous flexibility.

@honkerst

honkerst commented Dec 8, 2020

I used to get these messages pretty regularly on an RPi 3B+, several dozen a day. My experience was that the messages can mostly be ignored, but some integrations generate so many state changes that these errors are triggered to the point where Lovelace gets so laggy it's unusable. A few different things would trigger this: early versions of the EdgeOS integration, heavy use of auto-entities, and heavy use of card-mod to change the colours of icons based on state.

I figured my HA install was just outgrowing the Pi. However a few weeks ago I moved my HA install to VirtualBox on an i7 Macbook Pro, and these messages are reduced but I still get them. Maybe around 5 a day now.

@pierre2113

I've reduced my overall CPU usage on the RPi 3B+ from 40% to barely 20% (the average now is only 15%), but I still see a few of these "512 for 5 seconds" messages every day. When I stopped one of the integrations running in the homeassistant Docker container for a week (its hardware died, so I stopped its automations), the message went away completely. That makes me speculate that either an HA resource is synchronized and all integrations have to take turns calling the shared API, or it's Docker restricting CPU usage on each container. My overall RPi 3B+ usage is only 20% max at the Raspbian Buster level, but when I run "docker stats" I see the homeassistant container reach 150% CPU while the RPi 3B+ still has plenty of CPU power to spare. I'll have to find some Docker command to see if I can give more CPU resource to the homeassistant container only; it's the only container that is consistently CPU hungry. I run Home Assistant as a supervised installation, so I may have some flexibility to try these things, but I'm not 100% sure.

@pierre2113

This might help someone else: I've narrowed this down to too many history graphs. My guess is that each graph plotting sensor data queries MariaDB at the same time. When I moved my graphs from one Raspberry Pi to another to offload some of the overhead, the pending-messages errors moved to the new Raspberry Pi along with the custom sensor history graphs I have set up.

@ceesvanegmond

@JIOB Have you fixed this problem? I'm also using full kiosk browser on my tablet and having the same issues.

@JIOB

JIOB commented Feb 28, 2021

@JIOB Have you fixed this problem? I'm also using full kiosk browser on my tablet and having the same issues.

@ceesvanegmond I managed to track my issue to a script I have to automatically switch my FullyKiosk instances on/off. I increased the delay on switching the wall tablet off and the problem went away.

@ceesvanegmond

@JIOB Are you able to share that script? Very interesting!

@JIOB

JIOB commented Feb 28, 2021

@JIOB Are you able to share that script? Very interesting!

Sure, the script essentially keeps my wall tablet on until no motion is detected. I increased the delay in 'hallway_panel_timer_off' from 1 second to 5 and the problem went away. As for the script, I'm sure there are better ways of doing this; however, I have run this script and others like it for a long time and they work for me.

  hallway_timed_panel:
    alias: "Turn on hallway panel and set timer"
    sequence:
      # Cancel ev. old timers
      - service: script.turn_off
        data:
          entity_id: script.hallway_panel_timer_off
      - service: switch.turn_on
        data:
          entity_id: switch.hallway_panel
      # Set new timer
      - service: script.turn_on
        data:
          entity_id: script.hallway_panel_timer_off

  hallway_panel_timer_off:
    alias: "Turn off hallway panel after x"
    sequence:
      - delay:
          seconds: 5
      - service_template: > 
          {% if ((not is_state('sensor.hallway_motion', 'None')))  %}
            script.turn_on
          {% else %}
            switch.turn_off
          {% endif %}
        data_template:
          entity_id: >
            {% if ((not is_state('sensor.hallway_motion', 'None')))  %}
              script.hallway_panel_timer_off_restart
            {% else %}
              switch.hallway_panel
            {% endif %}

  hallway_panel_timer_off_restart:
    alias: "Turn off hallway panel after x (restart)"
    sequence:
      # Cancel current script
      - service: script.turn_off
        data:
          entity_id: script.hallway_panel_timer_off
      # Set new timer
      - service: script.turn_on
        data:
          entity_id: script.hallway_panel_timer_off

@github-actions

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label May 29, 2021
@tomlut

tomlut commented May 29, 2021

Still occurring when backgrounding the iOS app.

@github-actions github-actions bot removed the stale label May 29, 2021
@isabellaalstrom
Contributor

Also still experiencing this.

@rjenx

rjenx commented Jun 1, 2021

Me too. This message has started appearing in my logs; there is no further detail, so I assume I will have to turn up the logging detail.
Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/http.py:124
[140518542877504] Client unable to keep up with pending messages. Stayed over 512 for 5 seconds

Running HASSOS 2021.5.5

@mgabetta

Same here:

Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/http.py:127
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 12:03:38 (1 occurrences)
Last logged: 12:03:38

[547571963792] Client unable to keep up with pending messages. Stayed over 512 for 5 seconds

@ehendrix23
Contributor

Happening here as well. I do not have AppDaemon; it definitely seems to be frontend related. Would it be possible to add some more info when this occurs, to better narrow down what is causing it? It seems that it might not always be one thing, and it can come back from time to time.

@github-actions

github-actions bot commented Nov 1, 2021

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Nov 1, 2021
@github-actions github-actions bot closed this as completed Nov 8, 2021
@wtadler

wtadler commented Nov 8, 2021

This is still happening for me on core-2021.11.1.

@Dominik-1980

For me, too

@AlessandroTischer

I have the same issue on 2021.11.5

@pierre2113

pierre2113 commented Dec 5, 2021

I may have a suggestion. I can't prove it will work for anybody, or whether my idea still needs a tweak. I was getting "Client exceeded max pending messages" because I have too many history graphs. Graphs have to query the database for entity_id, state and last_updated. I'm using MariaDB, so I used DBeaver to connect to the HA database and took a peek at the indexes on the Home Assistant "states" table. The closest index I found that could be used for history graphs is named "ix_states_entity_id_last_updated"; it has only the columns (entity_id, last_updated). So a SQL query used to retrieve the data to draw history graphs would have to use the index and then read the table by row ID to retrieve the state data, in order to plot the state column on the y axis against last_updated on the x axis for a specific entity_id.
The index is efficient and table access is efficient, but there is extra table disk I/O to get all the data needed for one data point in a history graph. With enough history graphs there will be concurrent SQL with contention on the table disk I/O portion.

My hunch is that, in my case, this is where the bottleneck is for my hardware (an RPi 3B+ with a solid state drive). As an experiment I added a new index, "ix_states_entity_id_last_updated2", to optimize the query slightly so it wouldn't have to access the table to get the needed data. The index storage size goes up a little, but everything the query needs is in the index, so there is less disk I/O.

Here's the DDL to add the index for MariaDB; if you don't think the index can be unique for the 3-column combination, remove the UNIQUE keyword.

CREATE UNIQUE INDEX 
  IF NOT EXISTS ix_states_entity_id_last_updated2
    using btree
    ON states (entity_id, state, last_updated);

Here's the DDL to drop the index for MariaDB after you're done with the experiment:

drop index ix_states_entity_id_last_updated2 on states;

I'm still watching my logs for the max pending messages error and I haven't seen it lately; that could be because of my experiment, or because HA 2021.11.5 helped on the solid state drive hardware I use.
It's also quite possible my new index is being ignored completely. I'm still keeping an eye on my logs.
In an Oracle database, with enough concurrent SQL queries, such an index would make the optimizer use it and reduce contention on the table; in Oracle, index data is kept in cache better than table data. I'm not sure whether MariaDB would respond to the index tweak the same way an Oracle database would.

If the experiment is successful, then the existing index "ix_states_entity_id_last_updated" would be replaced with this change; we shouldn't add another index unless there is a need, as the RPi 3B+ has only so much RAM to cache indexes.

The above DDL would then change to the following, assuming MariaDB allows CREATE OR REPLACE to turn a non-unique index into a unique one; if it doesn't, you would have to drop the index first and then add it.

CREATE OR REPLACE UNIQUE INDEX 
  ix_states_entity_id_last_updated
    using btree
    ON states (entity_id, state, last_updated);

@amosyuen
Contributor

amosyuen commented Nov 11, 2022

@pierre2113 Any updates on your experiment?

@pierre2113

I'd forgotten about this error; I had other HA fires to fight. But looking at my logs, I don't see it.
I do know the existing index "ix_states_entity_id_last_updated" on the HA "states" table still has the old definition. Its cardinality is 460 (i.e. the number of distinct values; the bigger it is, the more likely a database engine is to use the index, at least in Oracle), while my index's cardinality is 3202. It's a bottleneck if, as I guess, the table is queried during history graph plotting and you have a lot of history graphs like I do. I don't know if that's true, so I can't say whether my index improved things or the HA engine has been tuned. Any common resource that multiple processes/threads try to use is a concurrency bottleneck, and there is usually more than one in a system. If each thread's time on a common resource is reduced, that helps the concurrency bottleneck.
