WebSocket connection seems to die and not reconnect #96
Comments
When did this start to happen? |
It's hard to pinpoint the exact date, but it started happening about 2 months ago.
I'm running Hassio in docker (using hassio_supervisor on unraid), with deconz installed on a separate server (rpi). Both are connected by ethernet cable directly to the router, so network problems are probably not the cause, though I'm not completely ruling them out. Even if it is in fact network dropouts, I'd expect it to try to reconnect automatically.
I was thinking of ditching hassio and going strictly HA core as a docker, but I need to sort out a few things before I can make the switch. |
Ok, this happened much sooner than expected: it also happens when deconz is running in docker on the same host as HA. |
I'm not sure what it rules out; if this were a widespread issue, more people would report it. Do you mean it just happens after a few minutes? I guess going back to earlier versions of hass to verify whether it stops breaking would be valuable. |
When I posted about this I found people who were having the same problem and decided to abandon deconz completely over this. So it might just be that they didn't bother to report.
I meant that I expected to have to wait for a day or two, but it happened in a matter of a few hours. Then it started happening once every day or two again.
No, when the problem happens the websocket messages do not arrive anymore.
I just went back to as early as 0.118, I'll report back if this happens again with this version. |
The main difficulty is pinpointing the problem, since it's not happening to me. |
Ok, so I spent a few days with 0.118 and also 0.117 - both exhibit the same problem. |
Well, what else are you running in your instance? It could be something else affecting the stability of the system. I made some minor improvements to the retry mechanisms of the websocket, though I don't know how big of an effect they will have. I also improved logging a bit. It will be part of the 2021.4 beta scheduled to be released today. It's really basic code being used, so it's unexpected that it just hangs for you. |
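For readers wondering what a websocket retry mechanism of this kind can look like, here is a minimal sketch of a reconnect loop with capped exponential backoff. It is not the actual pydeconz code: `keep_connected`, `backoff_delays`, and the `connect_once` callback are hypothetical names, and the delay values are assumptions chosen for illustration.

```python
import asyncio


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield retry delays: base, base*factor, ... never exceeding cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


async def keep_connected(connect_once, max_attempts, sleep=asyncio.sleep):
    """Retry connect_once() until it succeeds or max_attempts is reached.

    connect_once is a hypothetical coroutine function that raises
    ConnectionError or asyncio.TimeoutError when the connection fails.
    Returns the number of attempts that were needed.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            await connect_once()
            return attempt
        except (ConnectionError, asyncio.TimeoutError):
            # Wait a little longer after each failure before trying again.
            await sleep(next(delays))
    raise ConnectionError(f"giving up after {max_attempts} attempts")
```

Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real waiting; in normal use the default `asyncio.sleep` applies.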
I have some dockers running on the same host, like: mosquitto, unifi-controller, embyServer, deluge, radarr, sonarr, jackett, bazzaar. That's it, the rest are the hassio dockers, which I'll either be converting to ha core or just use a proper vm. Thank you so much for taking the time to add retry mechanisms and more logs, I really appreciate it! |
It's out since last night ;) I want it as stable as possible. That's why I still refactor it and improve testing. |
So I've switched over to a hassio VM with the latest beta. I even had to delete the deconz integration and set it up again, so I hoped that might do some voodoo wonders. These are the last 4 log lines I see in the debug log. Notice the time gap between the first two and the last two, the problem seems to have occurred at that point.
|
And no crashes or anything? Could you try out disabling all other integrations to see if that affects anything? |
No crashes or any sign of a problem, until I notice that a few of my sensor-based automations have stopped running. Luckily, most of my smart home is based on zigbee, so I was able to disable all other integrations as well as custom_integrations without too much disruption to see what's causing this. Will report back. |
It's possible that I'm starting to celebrate a little too early, because less than 24 hours have passed since I disabled all integrations, but I think there's a good chance that we've finally found the problem! When I started disabling the integrations one by one, I realized that there's a specific custom integration that I added in early December (along with some other additions), and right around that time this problem started happening. I didn't realize that it might have something to do with deconz, so I didn't think of this earlier. |
That sounds promising at least! I should copy parts of HASS issue template to make sure that users with issues disable custom integrations before reporting. :) What integration is it that is problematic? |
So after 2 days with no issues, I'm pretty confident that it was indeed that integration! I'm really sorry for all of the hassle! I literally tried everything I could think of before opening an issue here, and never thought that other integrations could affect one another. However, I've been talking to many people about this issue, and you're the only one who was able to figure it out :) So even though this wasn't the right place - thank you, thank you, and thank you again. 🙏 |
Running latest stable version of all components: HA (2021.3.4), Deconz (2.09.03), and Conbee II (26680700).
At some random point in the day, the websocket connection seems to die.
This causes entities coming from Deconz to not get updated in HA (sensors, lights, etc).
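The symptom described above, a connection that stays open but stops delivering messages, is the kind of failure an idle watchdog can detect. The following is a minimal sketch under stated assumptions, not the pydeconz implementation: `read_until_stale`, `receive`, and `handle` are hypothetical names, and the idle timeout would need to be longer than the quietest normal period of the network (or be paired with a ping) to avoid false alarms.

```python
import asyncio


async def read_until_stale(receive, handle, idle_timeout):
    """Pass messages to handle() until none arrives within idle_timeout seconds.

    receive is a hypothetical coroutine function that returns the next
    message. Returns the number of messages handled, so the caller can
    log it and then decide to reconnect.
    """
    count = 0
    while True:
        try:
            message = await asyncio.wait_for(receive(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # No traffic for idle_timeout seconds: treat the link as stale.
            return count
        handle(message)
        count += 1
```

A caller would typically close the old connection and open a new one whenever this function returns.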
Insights:
The last message I see from pydeconz.websocket looks like a normal state update message: