Automations issue/ ZHA Network busy errors after migrating to Skyconnect dongle #86411
Comments
Hey there @dmulcahey, @Adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha)? (message by CodeOwnersMention) |
The error means:
Your automation is referencing ZHA group entities. Are you rapidly sending messages to ZHA groups with it? If so, that's the cause of the error message. How fast are you sending them? |
Yes, I use ZHA groups. For example, I have 4 lamps in my living room, each with a Zigbee bulb, and all of them are in one ZHA group. |
Having the same issue, also using ZHA groups to control 2-4 lights at once depending on the group. And it's not getting called often: in the example below I call the ZHA groups once (turn off all lights). It's maybe 10 groups with 2-4 lights in each group, but the automation is only calling each light group once. The automation actually turns off all lights (light service) by Area, meaning the Area could potentially contain both the ZHA group and the individual lights in that ZHA group. I usually hide all the individual light entities, but they are still in the same Area as the group.
Another set of errors that might be related: `Logger: homeassistant.components.zha.core.channels.base [0xF4E3:1:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]` Edit: actually forgot to include the logs. |
I had the same issue and found this post. I am now using helpers to create groups instead of ZHA groups and it seems to work a lot better. |
Thanks for the reply @SneakieGargamel. I personally might give that a try; I have heard from multiple people that this might solve the problem. But still, if it's a generally known issue it should be solved, as it does make sense to try to keep the ZHA ID requests to a minimum, and they obviously created ZHA groups for a reason. |
If you are consistently trying to control several groups at once, you should create an additional group with the members of the other groups. Multicast / broadcast traffic has limits on EZSP. See the broadcast section here: https://community.silabs.com/s/article/guidelines-for-large-dense-networks-with-emberznet-pro?language=en_US (the most relevant statement is 8 in a 9 s window). Keep in mind this means all broadcasts, including ones initiated by the stack itself (not just your group messages). |
@dmulcahey that is kind of interesting. So do I understand correctly that using the independent lights' IDs will only worsen the result, as it will probably be more than 8 requests (~50 in my case)? Shouldn't creating a helper group containing the same IDs cause the same issue? |
Helper groups send individual device messages, so they are no different from just addressing lights individually. You can prove this by enabling debug logs and watching the zigpy / bellows logs. Zigpy groups are meant to cut down Zigbee traffic, and they send a single message to the group ID. Again, this can be seen in the NWK traffic. What you should not do is attempt to address several Zigbee groups at the same time. This will flood the network. If that is something you do consistently, you should create an additional Zigbee group with the members of the individual groups and use that. That article / the section I mentioned is specifically talking about broadcast messaging, not about messaging individual devices. |
@dmulcahey Ah I see, appreciate the explanation. Sorry to bother you, but as a bit of a nerd I guess I need to know how some things function. You mentioned that messages to a Zigpy group are meant to reduce traffic, but doesn't the controller need to relay that message to each device ID in that group, causing the same amount of end-to-end traffic as sending it to each device to begin with? Or are the group IDs somehow stored on the end devices? Surely they can't be? |
The coordinator sends a multicast message to the group id so only 1 message is sent by the coordinator. Enable debug logs and you can follow along |
Also, you can read the group cluster stuff in here if you want: https://zigbeealliance.org/wp-content/uploads/2019/12/07-5123-06-zigbee-cluster-library-specification.pdf#page126 |
The limit of 8 broadcasts per 9 seconds is not an EZSP limit. It is a limit in the underlying 802.15.4 network, and it should be the same for every Zigbee 3 coordinator, since each of them forms a Zigbee 3 network. |
I have the same issues with ZHA groups, with SkyConnect. Deconz with the Conbee II works 100% better, regretting the migration of 100+ devices without testing. |
After trying virtually everything, I finally gave up on ZHA and migrated to Zigbee2MQTT (Z2M). Migrating is painful and Z2M is not perfect but it performs much better than ZHA in environments that include 40+ lighting devices such as mine. |
Same issue with ZHA and SkyConnect... my group of 2 lights triggers this problem, but neither a helper group nor the individual lights will provoke it. |
@TheAlphaLaw @atr00 can you enable debug mode, run the actions that cause the issue and then attach the full logs please? |
@dmulcahey Note that, even when the action is successful, it is very slow (it takes 2 to 3 seconds for the lights to change their color, whereas individually or in a helper group it is almost instantaneous). |
Thanks. This is really helpful. |
Happy to help! If you need more tests, just ask me. |
With debug mode on you should also get periodic counter dumps in the logs. Can you attach that too? Your log cuts off right when the command to read them is sent. |
I am not sure what they are... let me know if this is what you need here:
@atr00 I suspect you have an automation that needs debouncing, as you're sending a lot of group requests at once:
# To turn on a single light, you'd need to send at most one `on()` command, and maybe one `move_to_color()`
2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.460 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:49.176 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:49.433 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.017 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:50.194 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.948 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:51.623 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
# `NETWORK_BUSY` immediately after sending the 10th request in 5 seconds
2023-03-08 20:52:52.559 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:52.574 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 118]
2023-03-08 20:52:53.023 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.034 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 125]
2023-03-08 20:52:55.077 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.088 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 126]
...
2023-03-08 20:53:04.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:05.604 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:06.809 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:07.841 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:07.884 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.664 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.836 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.631 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.968 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:10.904 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:11.292 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:11.963 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:13.916 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
Group requests are network-wide broadcasts that are bounced back and forth between all devices to make sure every possible group member has a chance to "hear" the message. If you send this many at once, the network will become too busy, as the error indicates. The limit on the number of group requests/broadcasts is hard-coded in the firmware and can't be raised. You may find it better to switch to a normal light group where each bulb is individually contacted concurrently, as this won't have the same limitation. |
The underlying 802.15.4 network layer has broadcast storm protection, so every router handles only 8 broadcasts in 9 seconds; any beyond that are ignored. |
@puddly I'm actually not even using any automation here. I'm going to the coordinator device where I can see my 2-bulb Zigbee light group and I change the color from there. I'm not sure the issue is only related to automation. And when I don't purposely jam the network by sending consecutive fast requests, the response is very slow (it takes about 3 seconds between the click and the actual color change). When I change individual lights or a helper group I am not facing any issue. When I use Z2M using a dongle with the same chip (EFR32), it works flawlessly, whether on individual lights or on a Z2M Zigbee group. @MattWestb if you see my answer here, this is not happening with Z2M, it's specific to ZHA and SkyConnect. My group has 2 bulbs in it. There's slowness as well on a single broadcast command. |
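To check a downloaded debug log for bursts like the one above, a small script can count request lines per time window. The sketch below assumes the log lines have exactly the format shown in this comment; `request_times` and `max_requests_in_window` are hypothetical helper names, and the script counts every `zigpy.zcl` "Sending request" line, so narrowing it down to group requests only is left to the reader.

```python
import re
from datetime import datetime, timedelta

# Matches zigpy debug lines in the format shown above, e.g.
# 2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [...] Sending request: on()
REQUEST_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) DEBUG "
    r".*\[zigpy\.zcl\].*Sending request: "
)


def request_times(lines):
    """Return the timestamps of all 'Sending request' lines."""
    times = []
    for line in lines:
        match = REQUEST_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return times


def max_requests_in_window(times, window_seconds):
    """Largest number of requests that fall inside any window of that length."""
    window = timedelta(seconds=window_seconds)
    times = sorted(times)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Slide the left edge forward until the window fits again.
        while current - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


sample = [
    "2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()",
    "2023-03-08 20:52:48.460 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()",
    "2023-03-08 20:52:52.574 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 118]",
]
print(max_requests_in_window(request_times(sample), 5))  # 2
```

Run against a full log (for example `open(path).read().splitlines()`), a result well above 8 for a 9 second window would point to the kind of burst described here.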
So this is only when using ZHA groups, right? There were some changes to this with zigpy/bellows#402, but there's still a difference in regards to how fast ZHA light groups react with TI coordinators (zigpy-znp) vs with EZSP coordinators (bellows). |
Watching this as I just moved all my zha devices over and set up lighting automations and am seeing the exact same thing, ~50 lights and ~12 zha groups created in the zigbee config |
I wanted to update with my findings, I fixed this by doing a couple things, not sure which one fixed it. I was using multi protocol firmware and I flashed stock then flashed the latest firmware. I then switched from channel 15 to 20 and re-paired all devices. I have 0 issues like this now and everything is snappy. Not sure if it was a channel change or if there are some issues in multi protocol but it's all good now. |
@jclendineng which firmware did you flash it with? |
I flashed the latest 7.2.2.2 for yellow but you would flash the one for sky connect. It's in the beta folder. I think that may have fixed me. Edit. The default sky connect or yellow firmware is 6.x and very old so the latest firmware would have a lot of qol and bug fixes so you should definitely try it. It was very easy, took me a few minutes to figure out how to flash but it's not too bad. |
I'm having this same issue. Attached logs. SkyConnect with a ZHA group addressing 9 Hue bulbs. I use the HASS HomeKit integration, and as soon as I drag the dimmer slider in iOS for the ZHA group for the lights, it dims briefly and then it becomes unresponsive and the only way to fix it is to delete the homekit bridge and create a new one. I believe the slider events are not debounced at all in the pipeline of HASS homekit bridge/ZHA/SkyConnect and immediately makes them unresponsive. Asking siri to turn them on and off works fine but that's a single action as opposed to dragging the dimmer slider. |
Hello, I also have the problem of network-busy-errors with ZHA groups. I would flash the Yellow with the latest beta NabuCasa_Yellow_EZSP_v7.3.0.0 as @jclendineng suggests. Is there anywhere I can read the actual firmware version of the Silicon Labs EZSP module in the Yellow? I can't find any info in the ZHA integration. |
I have the same issue. I have one group with 3 GU10 spots. I use the group to change the color with one message. Logger: homeassistant.components.automation.wc_foh_spots WC: FOH Spots: Choose at step 1: choice 4: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161> |
Not sure if it's with the multiprotocol plugin, or just the radio, but I'm using a Sonoff ZBDongle-E, which is also based on the EFR32MG21 Multiprotocol SoC, and am also running into this issue regularly after switching over from the original Sonoff Dongle. It all tracks, though, that the problems would start when I moved to multiprotocol, from what others are saying about the way it behaves compared to the original one. For single-item control I have no issues, but when controlling a number of items in a group a few times in quick succession, I get the message in the same way I did before when the items weren't in a Zigbee group. I've already tried changing channels a few times, but it's currently definitely on the best one for my network. Just flashed the newest firmware on my device today and I was able to get it to trigger by changing the color of a group of bulbs a few times in a ten second period, so it's still going on. |
Group commands aren't appropriate if you're going to be sending them frequently like this. With a clear Zigbee channel, you can easily control many dozens of bulbs nearly instantaneously with just a normal light group, which won't suffer from this same "too many requests at once" problem. This isn't something that can really be fixed either because it's an inherent limitation of the protocol: the network is busy and the firmware is preventing you from transmitting more data until the existing group commands finish bouncing around. That being said, there are some things you can tweak to reduce the various broadcast radiuses, which will potentially make it into this upcoming beta. The original problem will still be there, however: groups aren't made to be used frequently enough to hit these limitations. |
I was always under the impression that Zigbee groups were the way to go, but it looks like this worked, at least as far as I can tell. Previously, I'd been having this same error pop up when I controlled the lights individually and it was only solved when I changed to a Zigbee group, so I wasn't aware this could be a solution. Unfortunately, this means I'm going to have to re-create all of my Zigbee groups in HA standard. Small price to pay to get everything working properly again. Thanks for cluing me in on the difference between standard groups and Zigbee groups. Edit: Okay, "solved" is an overstatement, unfortunately. Zigbee Groups solved a lot of problems before I switched to multiprotocol, in that the changes would be simultaneous across devices. Now that I've got everything on HA groups instead of Zigbee groups, I'm not getting the network busy errors, but full-home control takes a long time. I'll tell HA to turn all of my lights on, or change them to a specific color, and it'll go around my house, one by one, changing and turning on lights, and sometimes it'll end before they all change and the light will throw back an error until I turn them off and on again. ZHA groups seem to be the best way to deal with fixtures with more than one light and entire-home control, and I'm not entirely sure why it worked before I switched to multiprotocol and why it's having so many issues now. |
Hi, I've been looking into a lot of posts regarding this. I had a Conbee II which worked fine, and I wanted to test the HA Skyconnect, to add Matter and support HA at the same time. I had no real issues with the Conbee. After a full reset of my ZHA (no backup at all) I redid my network manually: 46 devices, of which 30+ are IKEA lights/outlets and the rest are sensors from Hue, Aqara, IKEA and Frient. Now I'm getting Network Busy 161 errors from time to time from different automations I have had since before the Skyconnect. For example, I have an automation that dims lights, triggered by the play status of a media player: Actions
light.livingroom = ZHA group with 8 devices. And this causes the Network Busy 161 error. Am I exceeding the limit with these actions (I thought the limit was 10 devices)? If so, does the ConBee II handle this better? The automations are the same and it never happened before the SkyConnect dongle. |
@ABEIDO |
Yes, I've been thinking about it, but as I see it I have 3 options and I'm thinking mostly of option 2:
|
Yep, I'm still in the same boat. Things work, but not as well as they did on my earlier device. I really do miss the instant-on that I'd get with my earlier Zigbee device when I was able to use ZHA groups properly, and even though things work, more or less, with standard HA groups, reaction times are slower and I do still get errors occasionally when controlling the entire house (not as often as with the new dongle + ZHA groups, but still more often than when I controlled all of my lights in ZHA groups on the older dongle). I don't want to give up the additional features that the new dongle brings, but it's still a bit of a bummer when stuff doesn't work perfectly/instantly when controlling large groups of lights. |
As for now I've switched to HA groups, and yeah, it's a bit slower. And as many say, ZHA groups are the way to go, but with SkyConnect they're not behaving optimally. I cannot really find any difference in the setup except the dongle itself. I have the same extension cord, channel, location of the device and so on; it's just that ZHA groups act up when using Skyconnect (no multicontrol) and not with the ConBee II. And as my automation showed, I only turn on one ZHA group together with an individual light once and I get the error (not really spamming in my book). I'm about to migrate my HA from an RPi to a micro PC, so during that I will most likely go back to the ConBee II, at least to continue ts. |
I found this the best explanation from @MattWestb. I'm running the SkyConnect 7.3.2.0 beta firmware and created an easy script where 2 lights, which I created a ZHA group for, blink on and off 5 times each with a duration of 1 second. This would multicast 5x off and 5x on to the ZHA group. At the 9th time I receive the network busy error, meaning that the broadcast storm protection kicked in. As stated before, Conbee II and other EmberZNet Serial Protocol (EZSP) controllers might not stick to the protocol specification; Skyconnect does, however, and I can understand if they decide not to lift the broadcast storm protection, as this might overload the Zigbee network. Decided for now to create smaller ZHA groups, and where needed an HA group to "bypass" the multicast storm protection, since HA groups send commands to each individual light. |
All EZSP firmware has the same broadcast setting, since it is locked in the GSDK. It can be changed, but that needs a special patch from Silabs, and the firmware then can't be Zigbee certified, since it is out of standard. |
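The blink experiment above can be replayed against an idealized version of the limit quoted earlier in this thread (8 broadcasts in a 9 second window). This is a sketch under that assumption only: `BroadcastWindow` and `first_rejected` are hypothetical names, the real firmware logic is not shown anywhere in this issue, and a real network also spends part of the budget on broadcasts initiated by the stack itself.

```python
from collections import deque


class BroadcastWindow:
    """Allow at most `limit` broadcasts in any trailing `window` seconds.

    Idealized model of the "8 in a 9 s window" statement, not firmware code.
    """

    def __init__(self, limit=8, window=9.0):
        self.limit = limit
        self.window = window
        self._sent = deque()  # send times still inside the window

    def try_send(self, now):
        # Forget broadcasts that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False  # this is where NETWORK_BUSY would be reported
        self._sent.append(now)
        return True


def first_rejected(times, limiter):
    """1-based index of the first command that is refused, or None."""
    for index, now in enumerate(times, start=1):
        if not limiter.try_send(now):
            return index
    return None


blink = [float(second) for second in range(10)]  # 5x off + 5x on, one per second
print(first_rejected(blink, BroadcastWindow()))  # 9
```

In this model the 9th command is the first one refused, which lines up with the observation above. Spacing the same ten commands two seconds apart never exhausts the budget in the model, which is the effect debouncing or slower scripts aim for.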
Getting I also have an 8 downlight group in the Living Room and two Sonoff Basic ZBR3s in different locations to aid in routing, but devices drop off very frequently. Such a pain. Any advice? I also get |
I also faced this issue trying to set the colour then turn off two lights in the same group. |
Don’t add groups to groups like this. Create an additional zigbee group and add all devices to it for this sort of thing. |
Put ZHA in debug mode and make the error happen. Then disable debug mode and attach the downloaded log file here. |
Something spooky is happening with broadcasts in the latest bug-fix release. I don't use groups much, since IKEA has removed them from all first-gen controllers, but the system is still doing a lot of broadcasts for route discovery and so on.
|
I had the same issue with ZHA and the network becoming congested, most often without an obvious cause. Following some of the advice above, the only thing I have done is to flash the SkyConnect to the latest beta version v7.4.0.0. The way to get there is not trivial, but following the instructions for running the add-on for SSH access and flashing the SkyConnect has made a huge difference to the stability of my Zigbee network. |
Any reason for not doing it the easy way via the web flasher? What version did you upgrade from? |
I disabled Multiprotocol support and it seems to have made a difference. |
I was on 7.3.1.0 but was never prompted to upgrade to latest version 7.3.2.0. Anyway, I have tried your recommendation and upgraded to 7.4.0.0. To do this, you need to download the relevant .gbl file and choose the Change Firmware option in that Web UI. |
I'm also on 7.3.1. Will give 7.4.0 a shot |
I'm on 7.3.2.0. For you guys updating to the 7.4.0.0b please come back with feedback here if you can |
I'm on 7.4.0.0 skyconnect firmware (zigbee only, no multiprotocol) and am facing this same issue. It even happens when controlling a single ZHA group manually from the iOS app with no automations involved. I suspect that the hass UI isn't doing any sort of debouncing so if your finger moves 10 pixels on the color chooser it seems like it spams the network with 20+ broadcasts and locks things up. My network is solid, no device is more than 10ft from a repeater and I've made sure my channel is clear and not conflicting with my wifi. I'll have to give plain groups a try. This issue is quite frustrating! It's very easy to trigger and usually results in lights being in all sorts of inconsistent states. I've avoided wifi devices because I thought zigbee would be more reliable but I've had nothing but trouble with reliability. I've tried zha and z2m with skyconnect, sonoff-p, and sonoff-e. Different unreliabilities with all of them. None of them have worked reliably like zwave does. Might be time to give up on zigbee and set up a dedicated ssid on an isolated subnet to make wifi devices secure. |
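For readers wondering what the debouncing mentioned in this thread would look like, here is a minimal asyncio sketch. It is not how Home Assistant, the HomeKit bridge or ZHA handle slider events; `Debouncer` and `fake_group_command` are hypothetical names, and the fake command only records values instead of sending anything over Zigbee.

```python
import asyncio


class Debouncer:
    """Forward only the latest value once input has been quiet for `delay` seconds."""

    def __init__(self, delay, send):
        self._delay = delay
        self._send = send  # async callable that performs the real command
        self._task = None

    def submit(self, value):
        # A newer value replaces the pending one, so intermediate values are dropped.
        # Note: a submit() arriving while a send is in flight cancels that send too.
        if self._task is not None:
            self._task.cancel()
        self._task = asyncio.ensure_future(self._fire(value))

    async def _fire(self, value):
        await asyncio.sleep(self._delay)
        await self._send(value)

    async def flush(self):
        """Wait for the pending value, if any, to be sent."""
        task = self._task
        if task is not None:
            try:
                await task
            except asyncio.CancelledError:
                pass


async def demo():
    sent = []

    async def fake_group_command(brightness):
        sent.append(brightness)  # stands in for one group request

    slider = Debouncer(0.05, fake_group_command)
    for brightness in range(20):  # 20 slider events in quick succession
        slider.submit(brightness)
        await asyncio.sleep(0)
    await slider.flush()
    return sent


print(asyncio.run(demo()))  # [19]
```

Twenty slider events collapse into a single command carrying the final value, so only one request would count against the broadcast limit instead of twenty.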
The problem
After migrating to the Home Assistant Skyconnect USB dongle I've been running into network busy errors.
I currently have the dongle connected to a USB extension cable connected to a Raspberry Pi 4.
What version of Home Assistant Core has the issue?
Home Assistant 2023.1.6
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Automation
Link to integration documentation on our website
https://www.home-assistant.io/docs/automation/
Diagnostics information
No response
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
No response