Rainbird integration still unstable #92857
Comments
Hey there @allenporter, mind taking a look at this issue as it has been labeled with an integration (rainbird)? (message by CodeOwnersMention) |
I was hoping that the move to using update coordinators would resolve this by having only a single caller. Is this with other callers of the device or only home assistant? Which device is this? (I'm not seeing this fwiw) |
Only Home Assistant should be connecting to the device. The only second app I use is the android app from Rainbird, but this is disabled and I use it just for debugging. My unit is ESP-RZXe |
@konikvranik I am curious whether you notice the same behavior when using the app, where it fails that often. (I imagine you'd have to capture with the proxy to know.) I was previously under the impression that this happens when there are multiple clients connecting to the device, so I am wondering why this happens. Are you familiar with other scenarios where the device rejects requests like this? |
Actually, when I enable the Android app, it connects without issues. Only the HA plugin complains about server unavailability. |
I wonder what the difference is between the app and Home Assistant. Is it just that Home Assistant polls more often? Home Assistant used to make multiple requests at the same time, and the client library would retry and also kept state. For me it still failed all the time, flipping between available and unavailable even with the retries, with errors in the logs. I then made Home Assistant stop sending the requests in parallel and took out the retries, since it no longer fails for me; now it just uses a normal update coordinator and lets Home Assistant handle all errors. Maybe we should lower the poll frequency for that device if it can't handle requests that often. Otherwise, I'd like to understand why that request type is failing. One other idea is that a specific request may be failing when multiple are sent. Perhaps we can turn up debug logging for the other parts of the library that log the request details. |
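As an illustration of "stop sending the requests in parallel", here is a minimal sketch that funnels every request through one lock so the device only ever sees a single request in flight. It is not the integration's or the client library's actual code: `SerializedClient`, `send`, and `demo` are hypothetical names, and the fake sender merely stands in for a real HTTP call.

```python
import asyncio


class SerializedClient:
    """Allow only one request in flight at a time (hypothetical wrapper)."""

    def __init__(self, send):
        self._send = send            # assumed coroutine: send(command) -> response
        self._lock = asyncio.Lock()  # one "slot" for the single-request device

    async def request(self, command):
        # Callers queue up here instead of hitting the device concurrently.
        async with self._lock:
            return await self._send(command)


async def demo():
    in_flight = 0
    peak = 0

    async def fake_send(command):
        # Stand-in for the real request; tracks how many calls overlap.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return f"ok:{command}"

    client = SerializedClient(fake_send)
    results = await asyncio.gather(*(client.request(c) for c in ("a", "b", "c")))
    return results, peak
```

Running `asyncio.run(demo())` issues three requests "at once" through `gather`, yet the fake sender never observes more than one of them in flight.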
So what logging should I enable? |
I think the interesting log messages are in As for polling interval, i don't think we can configure that without additional code changes. |
Here is the log with success and also failures: pyrainbird.log |
In the log it seems to succeed, then fail 1 minute later, then succeed 1 minute later, then fail 1 minute later. Back and forth like that. It's always on the first request to get the current active stations. It is entirely consistent. This definitely makes me think we should try to figure out the device behavior that works better, rather than adding in more retries. Right now the update interval in |
I tried it, but it doesn't seem to help. Here is the log right after the HA restart:
The last true is after I reloaded the integration. I'll observe the behavior the whole day and then I'll report the result. |
Thanks, this is really interesting. I'm curious why requesting every two minutes would be more likely to leave the controller in a weird state. Is there something deterministic happening at the two-minute mark? Regarding closing the connection, the rainbird integration uses the default Home Assistant client session when creating the client. According to https://docs.aiohttp.org/en/stable/client_reference.html
When making requests, the rainbird client library uses the The home assistant connector does support multiple connections per host. So, to your point perhaps we can try a more conservative ClientSession tuned specifically for this device that does not allow multiple connections. I think the two possible options could be:
I'm thinking the rainbird app does the second one, since it will block out other hosts while the app is open. Home assistant probably should prefer the first to avoid blocking out the rainbird app. |
I have confirmed the rainbird app sets |
Good catch. So we should ensure that we either close the connection in every case, or reuse the same connection and call the server synchronously, right? |
BTW, Rainbird uses JSON-RPC 2.0, which can call multiple methods in one request, so what about aggregating the requests into batches and sending several of them in one request? |
Yeah, the simplest thing is to limit connections to 1 so maybe we can start with that, then see if it's needed to close the connection. Very cool on json V2. I can definitely experiment with that. |
I did some testing with setting a limit of one connection per host, and noticed that it did close the connection right away. However, I then reverted the change to see if I could observe multiple connections open or a long-lived connection open, and noticed it was already closing them with the old code. Maybe you can check whether the connection is hanging open on your side? Then we can have more conviction that this will fix the issue you're seeing (since in my case, I'm actually not seeing an issue). My rainbird is
But then when I sniff the connection, I see the server/device respond with
So if we see your device is not closing the connection then maybe that will be a hint that we need to close it... |
I tried playing with jsonrpc 2.0 batching but have not had any luck yet. I tried to just send the same command twice and got an invalid request:
|
Be careful, the IDs must differ. The ID is the identifier that matches a request to its response, so the response with a given ID belongs to the method call with the same ID in the request. |
I had the same thought and also tried with an id+1 for second request and got the same result. |
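For reference, this is roughly what a JSON-RPC 2.0 batch looks like according to the specification: a JSON array of request objects, each with its own `id`, with responses matched back by `id` rather than by position. The sketch below only builds and matches such payloads; the method names in the usage note are placeholders, not real Rainbird methods, and as reported in this thread the device answered a batch with an invalid request error, so this shows the protocol shape and not something the controller is known to accept.

```python
import json


def make_request(method, params, request_id):
    """Build one JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def make_batch(calls, first_id=1):
    """Build a batch (a list of requests) with a distinct id per call."""
    return [
        make_request(method, params, first_id + offset)
        for offset, (method, params) in enumerate(calls)
    ]


def match_responses(batch, responses):
    """Pair each request with its response by id; None if no response came back."""
    by_id = {response["id"]: response for response in responses}
    return [by_id.get(request["id"]) for request in batch]


def encode(batch):
    """Serialize the batch as the JSON body of a single HTTP request."""
    return json.dumps(batch)
```

For example, `make_batch([("methodA", {}), ("methodB", {"zone": 1})])` yields two request objects with ids 1 and 2, and `match_responses` restores request order even if the server returns the responses in a different order, which the specification allows.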
Here is sequence of my 200 followed by 503:
I don't see any connection closing there. |
Which means that they just say they use JSON-RPC 2.0, but they don't. Sending multiple methods in one request must be supported by a JSON-RPC compliant server. BTW, what about not sending the keep-alive header? |
I tried to remove the |
Do you see the connections open in |
hi, i'm having the same issue. In the home-assistant logs i see
In the meanwhile I don't see related active connections using:
is there a work around for the time being? |
Home Assistant 2023.7.2 Not sure if this is of any help but my irrigation automation failed earlier.
|
Hi, it needs to include debug information to be helpful at this point similar to the others we're discussing above. |
How can i provide/enable more debug information besides what i have in my home assistant configuration file?
|
From these two posts These are the interesting log components to also enable:
(You can also enable it in the UI and it should enable for |
@konikvranik up for giving this a try on |
Hi @allenporter, thank you for your patience. Now it also reports aiohttp_retry in the log, but I'm afraid that the behavior is still the same. Here is an excerpt from the log:
|
@konikvranik thank you, would you mind including a little more detail? It doesn't appear to be showing details about the update coordinator successes and failures like before. |
I'm sorry @allenporter, I included just the beginning after restart. Here is an excerpt with the errors:
and another one:
It seems that the stacktrace is present just in the first part of the log. All the rest is just like the second excerpt, just errors, no stacktraces. |
Thanks, that is what I was looking for. So we see it retrying three times now, and all three times it gets a device busy error across 5 seconds, and the device still won't respond. Are you sure nothing else is talking to the device? You don't have multiple Home Assistant instances polling it or anything? It is pretty odd that the device is busy for so long. What else? Should it try longer? |
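To make the "should it try longer?" question concrete, here is a minimal sketch of retrying on a busy error with exponential backoff. It is not the `aiohttp_retry` configuration the integration actually uses: `DeviceBusyError` and `call_with_retries` are hypothetical names, and the default delays are illustrative assumptions only.

```python
import asyncio


class DeviceBusyError(Exception):
    """Hypothetical stand-in for the device's 503 "busy" response."""


async def call_with_retries(func, attempts=3, first_delay=1.0, factor=2.0,
                            sleep=asyncio.sleep):
    """Call func(); on DeviceBusyError wait and retry, up to `attempts` calls."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except DeviceBusyError:
            if attempt == attempts:
                raise  # out of attempts: let the caller mark the device unavailable
            await sleep(delay)
            delay *= factor  # wait longer before each further attempt
```

With these defaults the total wait before giving up is 1 + 2 = 3 seconds. If the device really stays busy for about a minute, as reported in this thread, covering that window would take either more attempts or a much larger `first_delay`, which is the trade-off being asked about.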
No other device on my network should be connected to the rainbird. Only my single instance of HA and my phone, but the app wasn't active. My experience is that when it's unavailable, it becomes available again in approx. 1 minute. So if we could use cached results for about one minute, and also cache the start/stop requests for this period and execute them once the device is available, this could help IMHO. |
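The read side of this suggestion (serve cached results for about a minute before declaring the device unavailable) could be sketched as follows. This is only an illustration under assumptions: `GraceCache` is a hypothetical helper, not part of Home Assistant or pyrainbird, the 60 second default simply mirrors the "about one minute" above, and queuing start/stop requests is not covered.

```python
import time


class GraceCache:
    """Keep the last good result and report it as valid for a grace period."""

    def __init__(self, grace_seconds=60.0, clock=time.monotonic):
        self._grace = grace_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._data = None
        self._last_success = None    # time of the last successful poll

    def record_success(self, data):
        """Store fresh data after a successful poll."""
        self._data = data
        self._last_success = self._clock()

    def current(self):
        """Return (data, available); available is False once the grace period ends."""
        if self._last_success is None:
            return None, False
        age = self._clock() - self._last_success
        return self._data, age <= self._grace
```

A poller would call `record_success` on every good response and simply skip it on a busy error; entities would read `current()` and only go unavailable once failures have persisted past the grace period.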
Log file attached. |
@dffffffff that shows a single failure when the device is busy, which is expected since it only supports a single request at a time. Are you seeing the same flip-flopping back and forth between unavailable/available as in the original issue? |
@konikvranik in your latest example, it seems like there are many stop irrigation requests happening. Are you automating stop command retries yourself every 8-10 seconds, or were you running those commands interactively?
Then within 10 seconds there is another:
Then a few seconds later
Then a few seconds later
Then it succeeds. So from What do you experience with the Rainbird app when your device goes unavailable for the ~minute at a time? Is it also rejecting your requests? One other idea I have is to shift the polling interval so that it lines up with device availability. e.g. maybe your device should only poll every 2 minutes and align to the times of availability. |
I'm not aware of any automation regarding stopping. I didn't try the app in sync with HA, but sometimes I also experience issues connecting with the official app. Regarding the 2 min interval, I already tried it at the beginning, but it led to even worse lags. |
Do you have any thoughts about why there may be different stop requests sent a few moments apart? Is this an automation sending stop to a few different sprinklers? |
@allenporter still seeing this issue on Home Assistant 2023.8.2 - happens once or twice per 24 hours, the entities are marked as
|
@dffffffff that says it can't connect to your device so it probably has a poor wireless connection? This is working as intended (the device is unavailable) so it is correctly marking as unavailable and different from this issue where the device is rejecting the requests because it is busy. |
I see, the wifi link module is literally less than a meter from the wifi access point, doubt the connection is poor but i guess that is for another issue/thread :) |
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. |
In #112146 further adjustments were made to limit to only a single connection. |
The problem
The Rainbird integration is still oscillating between available/unavailable states:
What version of Home Assistant Core has the issue?
2023.5.2
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Core
Integration causing the issue
Rainbird
Link to integration documentation on our website
https://www.home-assistant.io/integrations/rainbird
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
I'd suggest adding a timeout after which the controller is marked as unavailable. Until then, the locally cached data would be used.