Automations issue/ ZHA Network busy errors after migrating to Skyconnect dongle #86411

Open
jason1980p opened this issue Jan 23, 2023 · 69 comments

@jason1980p

The problem

After migrating to the Home Assistant SkyConnect USB dongle I've been running into network busy errors.
I currently have the dongle connected to a USB extension cable, which is connected to a Raspberry Pi 4.

What version of Home Assistant Core has the issue?

Home Assistant 2023.1.6

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Automation

Link to integration documentation on our website

https://www.home-assistant.io/docs/automation/

Diagnostics information

No response

Example YAML snippet

alias: "Pico: Master Bathroom remote"
description: ""
use_blueprint:
  path: stephack/core-pico.yaml
  input:
    pico_remote: a58ddd4ab05559d05de8267f82dd7c49
    top_on:
      - service: light.turn_on
        data:
          brightness_step_pct: 100
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    bottom_off_release:
      - service: light.turn_off
        data: {}
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006
    up_raise:
      - service: light.turn_on
        data:
          brightness_step_pct: 20
        target:
          entity_id:
            - light.light_unknown_master_bathroom_lights_zha_group_0x0006
    down_lower:
      - service: light.turn_on
        data:
          brightness_step_pct: -20
        target:
          entity_id: light.light_unknown_master_bathroom_lights_zha_group_0x0006

Anything in the logs that might be useful for us?

Logger: homeassistant.components.automation.pico_master_bedroom_remote
Source: components/zha/light.py:292
Integration: Automation (documentation, issues)
First occurred: January 21, 2023 at 8:37:18 PM (7 occurrences)
Last logged: 7:21:23 PM

Pico: Master Bathroom remote: Choose at step 1: choice 1: Choose at step 1: choice 1: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1755, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1792, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 958, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 570, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 978, in async_turn_on
    await super().async_turn_on(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 292, in async_turn_on
    result = await self._level_channel.move_to_level_with_on_off(
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request
    return await self._endpoint.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/group.py", line 57, in request
    await self.application.send_packet(
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 782, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

Additional information

No response

@home-assistant

Hey there @dmulcahey, @Adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!

@puddly
Contributor

puddly commented Jan 23, 2023

The error means:

A message cannot be sent because the network is currently overloaded.

Your automation is referencing ZHA group entities. Are you rapidly sending messages to ZHA groups with it? If so, that's the cause of the error message. How fast are you sending them?

@jason1980p
Author

Yes, I use ZHA groups. For example, I have 4 lamps in my living room, each with a Zigbee bulb, and they are all in one ZHA group.
I control the group's lights via an automation using the Lutron Pico remote.
When I press the on button, it sends an on request to the group instead of to each individual light bulb.
The same happens when turning the lights off.

@fakethinkpad85

fakethinkpad85 commented Jan 27, 2023

Having the same issue, also using ZHA groups to control 2-4 lights at once depending on the group. And they are not being called often: in the example below I call the ZHA groups once (turn off all lights). It's maybe 10 groups with 2-4 lights in each group, but the automation only calls each light group once.

The automation actually turns off all lights (light service) by Area, meaning the Area could potentially contain both the ZHA group and the individual lights in that ZHA group. I usually hide all the individual light entities, but they are still in the same Area as the group.

image

image

Watch - Away: Choose at step 1: choice 1: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
Watch - Away: Error executing script. Unexpected error for choose at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
While executing automation automation.watch_away
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1755, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1792, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 958, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 581, in async_handle_light_off_service
    await light.async_turn_off(**filter_turn_off_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 986, in async_turn_off
    await super().async_turn_off(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 417, in async_turn_off
    result = await self._on_off_channel.off()
  File "/usr/local/lib/python3.10/site-packages/zigpy/zcl/__init__.py", line 324, in request
    return await self._endpoint.request(
  File "/usr/local/lib/python3.10/site-packages/zigpy/group.py", line 57, in request
    await self.application.send_packet(
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 782, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>


Another set of errors that might be related,

`Logger: homeassistant.components.zha.core.channels.base
Source: components/zha/core/channels/base.py:486
Integration: Zigbee Home Automation (documentation, issues)
First occurred: January 26, 2023 at 16:49:07 (32 occurrences)
Last logged: January 26, 2023 at 16:49:10

[0xF4E3:1:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xB7E9:1:0x0300]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xC402:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0x7588:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]
[0xC402:1:0x0008]: async_initialize: all attempts have failed: [DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>'), DeliveryError('Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>')]`

Edit: actually forgot to include the logs..

@SneakieGargamel

SneakieGargamel commented Jan 28, 2023

I had the same issue and found this post. I am now using helpers to create groups instead of ZHA groups and it seems to work a lot better.

@fakethinkpad85

Thanks for the reply @SneakieGargamel I personally might give that a try, have heard from multiple people that this might solve the problem. But still, if it's a generally known issue it should be solved as it does make sense to try and keep the ZHA id requests at a minimum and they obviously created ZHA groups for a reason.

@dmulcahey
Contributor

If you are consistently trying to control several groups at once you should create an additional group with the members of the other groups.

Multicast and broadcast traffic is rate-limited on EZSP. See the broadcast section here: https://community.silabs.com/s/article/guidelines-for-large-dense-networks-with-emberznet-pro?language=en_US

(The most relevant statement is the limit of 8 broadcasts in a 9 s window.)

Keep in mind that this counts all broadcasts, including ones initiated by the stack itself, not just your group messages.
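
To make the "8 in a 9 s window" figure concrete, here is a minimal Python sketch of a sliding-window budget. It is only a model under stated assumptions: the names `BroadcastBudget`, `try_send` and `simulate` are made up for illustration, the cutoff behaviour is a guess, and the real firmware logic (which also counts broadcasts started by the stack itself) is not shown anywhere in this thread.

```python
from collections import deque


class BroadcastBudget:
    """Toy sliding-window model: at most `limit` broadcasts per `window` seconds.

    The defaults (8 per 9 s) are the figures quoted in this thread; everything
    else here is an illustrative assumption, not firmware behaviour.
    """

    def __init__(self, limit=8, window=9.0):
        self.limit = limit
        self.window = window
        self._sent = deque()  # timestamps of broadcasts that were accepted

    def try_send(self, now):
        # Forget accepted broadcasts that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False  # modelled as a NETWORK_BUSY rejection
        self._sent.append(now)
        return True


def simulate(times, limit=8, window=9.0):
    """Return one accepted/rejected flag per send time (seconds, ascending)."""
    budget = BroadcastBudget(limit, window)
    return [budget.try_send(t) for t in times]
```

In this model, ten sends spaced half a second apart exhaust the budget after the eighth, while the same ten sends spaced two seconds apart are all accepted.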

@fakethinkpad85

@dmulcahey that is kind of interesting. Do I understand correctly that using the individual light IDs will only worsen the result, as it will probably be more than 8 requests (~50 in my case)? Shouldn't creating a helper group containing the same IDs cause the same issue?

@dmulcahey
Contributor

Helper groups send individual device messages, so they are no different from addressing the lights individually. You can prove this by enabling debug logs and watching the zigpy / bellows logs. Zigpy groups are meant to cut down Zigbee traffic: they send a single message to the group ID. Again, this can be seen in the NWK traffic. What you should not do is address several Zigbee groups at the same time, because this will flood the network. If that is something you do consistently, you should create an additional Zigbee group with the members of the individual groups and use that.
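
The difference can be sketched by counting the messages the coordinator itself originates for one command. This is a rough illustration only: the function names and the example group layout are hypothetical, and relays by routers are deliberately not counted.

```python
def coordinator_transmissions(groups, use_zigbee_groups):
    """Count messages the coordinator originates for one command.

    `groups` maps a group name to its member lights (hypothetical layout).
    Router relays and retries are not modelled.
    """
    if use_zigbee_groups:
        # One multicast per Zigbee group, whatever its size.
        return len(groups)
    # A helper group addresses every member with its own unicast.
    return sum(len(members) for members in groups.values())


def merge_groups(groups):
    """Members of one combined group covering all the given groups."""
    return sorted(set().union(*groups.values()))


# Hypothetical example layout
groups = {
    "living_room": ["lamp_1", "lamp_2", "lamp_3", "lamp_4"],
    "hallway": ["hall_1", "hall_2"],
}
combined = {"all_lights": merge_groups(groups)}
```

With this layout, addressing both Zigbee groups costs two multicasts, the helper-group route costs six unicasts, and the combined group costs a single multicast, which is why the combined group is the one that stays clear of the broadcast limit.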

That article / the section I mentioned is specifically talking about broadcast messaging not about messaging individual devices.

@fakethinkpad85

@dmulcahey Ah, I see, I appreciate the explanation. Sorry to bother you, but as a bit of a nerd I guess I need to know how some things function. You mentioned that messages to a zigpy group are meant to reduce traffic, but doesn't the controller need to relay that message to each device ID in the group, causing the same amount of end-to-end traffic as sending it to each device to begin with?

Or are the group IDs somehow stored on the end device? Surely they can't be?

@dmulcahey
Contributor

The coordinator sends a multicast message to the group id so only 1 message is sent by the coordinator. Enable debug logs and you can follow along

@dmulcahey
Contributor

Also, you can read the group cluster stuff in here if you want: https://zigbeealliance.org/wp-content/uploads/2019/12/07-5123-06-zigbee-cluster-library-specification.pdf#page126
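
As a rough picture of what the Groups cluster provides: each device keeps its own table of group IDs, and a multicast is acted on only by devices whose table contains the addressed group. The sketch below is a simplified model with hypothetical names (`Bulb`, `receive_multicast`, `multicast`); it does not reproduce the cluster's actual commands or frame formats.

```python
class Bulb:
    """Simplified device that stores its own group memberships."""

    def __init__(self, name):
        self.name = name
        self.group_table = set()  # group IDs stored on the device itself
        self.is_on = False

    def add_group(self, group_id):
        self.group_table.add(group_id)

    def receive_multicast(self, group_id, command):
        # Every bulb in range hears the frame; only members act on it.
        if group_id not in self.group_table:
            return False
        self.is_on = command == "on"
        return True


def multicast(bulbs, group_id, command):
    """Deliver one frame to all bulbs and return the names that acted on it."""
    return [b.name for b in bulbs if b.receive_multicast(group_id, command)]


# Hypothetical setup: two bulbs in group 0x0006, one bulb in no group
desk_a, desk_b, ceiling = Bulb("desk_a"), Bulb("desk_b"), Bulb("ceiling")
desk_a.add_group(0x0006)
desk_b.add_group(0x0006)
acted = multicast([desk_a, desk_b, ceiling], 0x0006, "on")
```

The coordinator never needs to know which bulbs are members when it sends the command; the filtering happens on the receiving side.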

@MattWestb
Contributor

The limit of 8 broadcasts per 9 seconds is not an EZSP limit. It is a limit in the underlying 802.15.4 network, and it should be the same for every Zigbee 3 coordinator that forms a Zigbee 3 network.
A TI CC-2531 with HA 1.x firmware is a different matter.
Some TI Zigbee 3 coordinator firmware has been patched with a very high broadcast limit, but that is useless because every Zigbee-certified router silently drops the 9th packet in a 9-second window.
The limit exists to stop broadcast storms from blocking the network, and the same applies to other protocols such as Thread that run their own stack on top of 802.15.4.

@TheAlphaLaw

I have the same issues with ZHA groups, with SkyConnect. Deconz with the Conbee II works 100% better, regretting the migration of 100+ devices without testing.

@smartqasa

After trying virtually everything, I finally gave up on ZHA and migrated to Zigbee2MQTT (Z2M). Migrating is painful and Z2M is not perfect but it performs much better than ZHA in environments that include 40+ lighting devices such as mine.

@atr00

atr00 commented Mar 8, 2023

Same issue with ZHA and SkyConnect. My group of 2 lights triggers this problem, but neither a helper group nor the individual lights will provoke it.
Z2M groups (using a Sonoff Dongle-E, so also EZSP) work flawlessly as well.

@dmulcahey
Contributor

@TheAlphaLaw @atr00 can you enable debug mode, run the actions that cause the issue and then attach the full logs please?

@atr00

atr00 commented Mar 8, 2023

@dmulcahey
There you go: zha_logs.txt
There are both successful and failed actions in the log.

Note that even when the action is successful, it is very slow: it takes 2 to 3 seconds for the lights to change their color, whereas individually or in a helper group it is almost instantaneous.

@dmulcahey
Contributor

Thanks. This is really helpful.

@atr00

atr00 commented Mar 8, 2023

Happy to help! If you need more tests, just ask me.

@dmulcahey
Contributor

dmulcahey commented Mar 8, 2023

With debug mode on you should also get periodic counter dumps in the logs. Can you attach that too?

Your log cuts off right when the command to read them is sent.

@atr00

atr00 commented Mar 8, 2023

I am not sure what they are. Let me know if what you need is in here:
home-assistant_2023-03-08T12-53-49.370Z.log

@puddly
Contributor

puddly commented Mar 8, 2023

@atr00 I suspect you have an automation that needs debouncing, as you're sending a lot of group requests at once:

# To turn on a single light, you'd need to send at most one `on()` command, and maybe one `move_to_color()`
2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.460 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:48.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:49.176 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:49.433 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.017 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:50.194 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:50.948 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:51.623 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

# `NETWORK_BUSY` immediately after sending the 10th request in 5 seconds
2023-03-08 20:52:52.559 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:52:52.574 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 118]

2023-03-08 20:52:53.023 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.034 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 125]

2023-03-08 20:52:55.077 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:52:53.088 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendMulticast: [<EmberStatus.NETWORK_BUSY: 161>, 126]

...

2023-03-08 20:53:04.598 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:05.604 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=11272, color_y=48954, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:06.809 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:07.841 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=8912, color_y=2621, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:07.884 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.664 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:08.836 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=16318, color_y=6029, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.631 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=25230, color_y=10157, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:09.968 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:10.904 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0300] Sending request: move_to_color(color_x=29097, color_y=33881, transition_time=0, options_mask=None, options_override=None)
2023-03-08 20:53:11.292 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:11.963 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()
2023-03-08 20:53:13.916 DEBUG (MainThread) [zigpy.zcl] [Bedroom Desk Bulbs:None:0x0006] Sending request: on()

Group requests are network-wide broadcasts that are bounced back and forth between all devices to make sure every possible group member has a chance to "hear" the message. If you send this many at once, the network will become too busy, as the error indicates. The limit on the number of group requests/broadcasts is hard-coded in the firmware and can't be raised.

You may find it better to switch to a normal light group where each bulb is individually contacted concurrently, as this won't have the same limitation.
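
For anyone who wants to check their own debug log the same way, here is a small Python sketch that counts zigpy request lines inside a sliding time window. It assumes the line format shown in the excerpt above; the names `REQUEST`, `request_times` and `busiest_window` are hypothetical helpers, not part of zigpy or bellows, and the count includes every `Sending request:` line, not only group requests.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2023-03-08 20:52:47.617 DEBUG (MainThread) [zigpy.zcl] [...] Sending request: on()
REQUEST = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) DEBUG .*"
    r"\[zigpy\.zcl\] .*Sending request:"
)


def request_times(log_text):
    """Extract the timestamps of all zigpy 'Sending request' lines."""
    times = []
    for line in log_text.splitlines():
        match = REQUEST.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return times


def busiest_window(times, window_seconds=9.0):
    """Largest number of requests that fall inside any window of that length."""
    times = sorted(times)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Advance the window start until it is within `window_seconds` of `current`.
        while (current - times[start]).total_seconds() >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best
```

Fed the ten request lines quoted above, from 20:52:47.617 to 20:52:52.559, `busiest_window` counts all ten inside a single 9-second window.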

@MattWestb
Contributor

The underlying 802.15.4 network layer has broadcast storm protection, so all routers handle only 8 broadcasts in a 9-second window and ignore any beyond that.
Zigbee groups can be used for individual lights, but they are best used with medium and large groups of lights, where they cause fewer problems if the network has routing issues.

@atr00

atr00 commented Mar 9, 2023

@puddly I'm actually not even using any automation here. I'm going to the coordinator device, where I can see my 2-bulb Zigbee light group, and I change the color from there. I'm not sure the issue is only related to automations. And when I don't purposely jam the network by sending fast consecutive requests, the response is very slow (it takes about 3 seconds between the click and the actual color change). When I change individual lights or a helper group I am not facing any issue. When I use Z2M with a dongle that has the same chip (EFR32), it works flawlessly, whether on individual lights or on a Z2M Zigbee group.

@MattWestb if you see my answer here, this is not happening with Z2M, it's specific with ZHA and SkyConnect. My group has 2 bulbs in it. There's slowness as well on a single broadcast command.

@TheJulianJES
Contributor

TheJulianJES commented Mar 9, 2023

the response is very slow (it takes about 3 seconds between the click and the actual color change)

So this is only when using ZHA groups, right?
I also remember a "lag" when changing colors on ZHA groups (using the UI picker) and EZSP coordinators in my testing.
There are two packets sent: OnOff -> on and Color -> set_xy_color even when just changing a light color that is already on (this is somewhat because ZHA can't be 100% sure that the light is actually on and the service is called light.turn_on, so there's always either some on message or move_to_level_with_on_off being sent with "just a color change". This also follows the behavior of other integrations).
However, TI coordinators do not have this lag. They basically send both packets at almost the exact same time.
EZSP coordinators (when used with bellows) have a delay of about a second or so for me (until the color change shows up).

There were some changes to this with zigpy/bellows#402, but there's still a difference in regards to how fast ZHA light groups react with TI coordinators (zigpy-znp) vs with EZSP coordinators (bellows).

@jclendineng

Watching this, as I just moved all my ZHA devices over and set up lighting automations, and I am seeing the exact same thing with ~50 lights and ~12 ZHA groups created in the Zigbee config.

@jclendineng

I wanted to update with my findings. I fixed this by doing a couple of things, and I'm not sure which one fixed it. I was using the multiprotocol firmware; I flashed stock and then flashed the latest firmware. I then switched from channel 15 to 20 and re-paired all devices. I have 0 issues like this now and everything is snappy. I'm not sure whether it was the channel change or whether there are some issues in multiprotocol, but it's all good now.

@atr00

atr00 commented Mar 27, 2023

@jclendineng which firmware did you flash it with?

@jclendineng

jclendineng commented Mar 29, 2023

flash

I flashed the latest 7.2.2.2 for Yellow, but you would flash the one for SkyConnect. It's in the beta folder. I think that may have fixed it for me.

Edit: The default SkyConnect or Yellow firmware is 6.x and very old, so the latest firmware should have a lot of quality-of-life and bug fixes, and you should definitely try it. It was very easy. It took me a few minutes to figure out how to flash, but it's not too bad.

@RoyHP

RoyHP commented Apr 23, 2023

I'm having this same issue. Attached logs. SkyConnect with a ZHA group addressing 9 Hue bulbs. I use the HASS HomeKit integration, and as soon as I drag the dimmer slider in iOS for the ZHA group for the lights, it dims briefly and then it becomes unresponsive and the only way to fix it is to delete the homekit bridge and create a new one. I believe the slider events are not debounced at all in the pipeline of HASS homekit bridge/ZHA/SkyConnect and immediately makes them unresponsive. Asking siri to turn them on and off works fine but that's a single action as opposed to dragging the dimmer slider.
home-assistant_homekit_2023-04-23T23-09-00.031Z.log
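
For readers unfamiliar with the term, debouncing a slider means forwarding only the value it settles on rather than every intermediate step. The asyncio sketch below shows the general idea; the `Debouncer` class and the `demo` coroutine are hypothetical and are not how the HomeKit bridge, ZHA, or bellows handle events.

```python
import asyncio


class Debouncer:
    """Trailing-edge debounce: forward only the last value seen within `delay`."""

    def __init__(self, delay, send):
        self.delay = delay
        self.send = send        # async callable that receives the settled value
        self._handle = None     # pending timer, if any
        self._pending = None    # keeps a reference to the running send task

    def update(self, value):
        # A newer value cancels the pending timer and starts a fresh one.
        if self._handle is not None:
            self._handle.cancel()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_later(self.delay, self._fire, value)

    def _fire(self, value):
        self._handle = None
        self._pending = asyncio.ensure_future(self.send(value))


async def demo():
    """Simulate a slider drag: four quick updates, then silence."""
    sent = []

    async def send(level):
        sent.append(level)  # stand-in for a single group command

    debouncer = Debouncer(0.2, send)
    for level in (10, 20, 30, 40):
        debouncer.update(level)
        await asyncio.sleep(0.01)
    await asyncio.sleep(0.5)  # give the last timer time to fire
    return sent
```

In the demo, the four rapid updates collapse into one forwarded value, so a drag would cost one group command instead of one per slider step.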

@dieneuser

Hello,

I also have the problem of network-busy-errors with ZHA groups. I would flash the Yellow with the latest beta NabuCasa_Yellow_EZSP_v7.3.0.0 as @jclendineng suggests. Is there anywhere I can read the actual firmware version of the Silicon Labs EZSP module in the Yellow? I can't find any info in the ZHA integration.

@smartmatic

smartmatic commented Jun 25, 2023

I have the same issue. I have one group with 3 GU10 spots. I use the group to change the color with one message.
The group is triggered via a Friends of Hue Zigbee Green Power Switch.

Logger: homeassistant.components.automation.wc_foh_spots
Source: components/zha/light.py:336
Integration: Automation (documentation, issues)
First occurred: 11:17:32 (3 occurrences)
Last logged: 11:17:32

WC: FOH Spots: Choose at step 1: choice 4: Error executing script. Unexpected error for call_service at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
WC: FOH Spots: Error executing script. Unexpected error for choose at pos 1: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
While executing automation automation.wc_foh_spots
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 452, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 685, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1910, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1950, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 226, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 811, in entity_service_call
    future.result()  # pop exception if have
    ^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1034, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 851, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 582, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 1197, in async_turn_on
    await super().async_turn_on(**kwargs)
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 336, in async_turn_on
    result = await self._level_cluster_handler.move_to_level_with_on_off(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/zcl/__init__.py", line 331, in request
    return await self._endpoint.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/group.py", line 57, in request
    await self.application.send_packet(
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 812, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>

@cityeyes

Not sure if it's with the multiprotocol plugin, or just the radio, but I'm using a Sonoff ZBDongle-E, which is also based on the EFR32MG21 Multiprotocol SoC and am also running into this issue regularly after switching over from the original Sonoff Dongle.

It all tracks, though, that the problems would start when I moved to multiprotocol, given what others are saying about the way it behaves compared to the original one.

For single-item control I have no issues, but when controlling a number of items in a group a few times in quick succession, I get the message in the same way I did before when the items weren't in a Zigbee group.

I've already tried changing channels a few times, but it's currently definitely on the best for my network. Just flashed the newest firmware on my device today and I was able to get it to trigger by changing the color of a group of bulbs a few times in a ten second period, so it's still going on.

@puddly
Contributor

puddly commented Aug 16, 2023

Group commands aren't appropriate if you're going to be sending them frequently like this. With a clear Zigbee channel, you can easily control many dozens of bulbs nearly instantaneously with just a normal light group, which won't suffer from this same "too many requests at once" problem.

This isn't something that can really be fixed either because it's an inherent limitation of the protocol: the network is busy and the firmware is preventing you from transmitting more data until the existing group commands finish bouncing around. That being said, there are some things you can tweak to reduce the various broadcast radiuses, which will potentially make it into this upcoming beta. The original problem will still be there, however: groups aren't made to be used frequently enough to hit these limitations.

@cityeyes

cityeyes commented Aug 16, 2023

normal light group

I was always under the impression that Zigbee groups were the way to go, but it looks like this worked, at least as far as I can tell. Previously, I'd been having this same error pop up when I controlled the lights individually and it was only solved when I changed to a Zigbee group, so I wasn't aware this could be a solution.

Unfortunately, this means I'm going to have to re-create all of my Zigbee groups in HA standard. Small price to pay to get everything working properly again. Thanks for cluing me in on the difference between standard groups and Zigbee groups.

Edit: Okay, "solved" is an overstatement, unfortunately. Zigbee Groups solved a lot of problems before I switched to multiprotocol, in that the changes would be simultaneous across devices.

Now that I've got everything on HA groups instead of Zigbee groups, I'm not getting the network busy errors, but full-home control takes a long time. I'll tell HA to turn all of my lights on, or change them to a specific color, and it'll go around my house, one by one, changing and turning on lights, and sometimes it'll end before they all change and the light will throw back an error until I turn them off and on again.

ZHA groups seem to be the best way to deal with fixtures with more than one light and entire-home control, and I'm not entirely sure why it worked before I switched to multiprotocol and why it's having so many issues now.

@ABEIDO

ABEIDO commented Nov 21, 2023

Hi, I've been looking into a lot of posts regarding this.

I had a Conbee II which worked fine, and I wanted to test the HA SkyConnect, to add Matter and support HA at the same time. I had no real issues with the Conbee.

After a full reset of my ZHA (no backup at all) I redid my network manually: 46 devices, of which 30+ are IKEA lights/outlets and the rest are sensors from Hue, Aqara, IKEA and Frient. Now I'm getting Network Busy 161 errors from time to time from different automations that I've had since before the SkyConnect.

For example, I have an automation that dims lights, triggered by the play status of a media player:

Actions

service: light.turn_on
data:
  brightness_pct: 10
  transition: 3
target:
  entity_id:
    - light.livingroom
    - light.hallway_table

light.livingroom = ZHA group with 8 devices
rest = single devices

And this causes a Network Busy 161 error. Am I exceeding the limit with these actions (I thought the limit was 10 devices)?

If so, does the ConBee II handle this better? The automations are the same, and it never happened before the SkyConnect dongle.

@ChristophHoltmann

ChristophHoltmann commented Nov 23, 2023

@ABEIDO
I had the same idea and also switched from Conbee II to ZHA and Skyconnect some time ago. I had the same problems and after much back and forth I followed @cityeyes suggestion (just don't use ZHA groups, only HA groups). Maybe not the desired solution, but since then I have no more "Network Busy 161" errors. I have about 40 devices (mostly lights) and the delay is acceptable most of the time, even if I turn on several lights at the same time (without using ZHA groups).

@ABEIDO

ABEIDO commented Nov 23, 2023

> @ABEIDO I had the same idea and also switched from Conbee II to ZHA and Skyconnect some time ago. I had the same problems and after much back and forth I followed @cityeyes suggestion (just don't use ZHA groups, only HA groups). Maybe not the desired solution, but since then I have no more "Network Busy 161" errors. I have about 40 devices (mostly lights) and the delay is acceptable most of the time, even if I turn on several lights at the same time (without using ZHA groups).

Yes, I've been thinking about it, but as I see it I have 3 options, and I'm mostly considering option 2:

  1. Switch to HA Groups - Feels like it's a step back.
  2. Go back to Conbee II - Feels like it's a step back too, especially as SkyConnect is newer. And a massive hassle to redo the network.
  3. Hope for a fix

@cityeyes

> Yes, I've been thinking about it, but as I see it I have 3 options, and I'm mostly considering option 2:
>
> 1. Switch to HA Groups - Feels like it's a step back.
> 2. Go back to Conbee II - Feels like it's a step back too, especially as SkyConnect is newer. And a massive hassle to redo the network.
> 3. Hope for a fix

Yep, I'm still in the same boat. Things work, but not as well as they did on my earlier device. I really do miss the instant-on that I'd get with my earlier Zigbee device when I was able to use ZHA groups properly, and even though things work, more or less, with standard HA groups, reaction times are slower and I do still get errors occasionally when controlling the entire house (not as often as with the new dongle + ZHA groups, but still more often than when I controlled all of my lights in ZHA groups on the older dongle).

I don't want to give up the additional features that the new dongle brings, but it's still a bit of a bummer when stuff doesn't work perfectly/instantly when controlling large groups of lights.

@ABEIDO

ABEIDO commented Nov 28, 2023

> > Yes, I've been thinking about it, but as I see it I have 3 options, and I'm mostly considering option 2:
> >
> > 1. Switch to HA Groups - Feels like it's a step back.
> > 2. Go back to Conbee II - Feels like it's a step back too, especially as SkyConnect is newer. And a massive hassle to redo the network.
> > 3. Hope for a fix
>
> Yep, I'm still in the same boat. Things work, but not as well as they did on my earlier device. I really do miss the instant-on that I'd get with my earlier Zigbee device when I was able to use ZHA groups properly, and even though things work, more or less, with standard HA groups, reaction times are slower and I do still get errors occasionally when controlling the entire house (not as often as with the new dongle + ZHA groups, but still more often than when I controlled all of my lights in ZHA groups on the older dongle).
>
> I don't want to give up the additional features that the new dongle brings, but it's still a bit of a bummer when stuff doesn't work perfectly/instantly when controlling large groups of lights.

For now I've switched to HA groups, and yeah, it's a bit slower. As many say, ZHA groups are the way to go, but as it is, with the SkyConnect they're not behaving optimally.

I can't really find any difference in the setup except the dongle itself. I have the same extension cord, channel, device location and so on. It's just that ZHA groups act up with the SkyConnect (no multiprotocol) and not with the ConBee II. And as my automation showed, I only turn on one ZHA group together with an individual light once and I get the error (not really spamming in my book).

I'm about to migrate my HA from an RPi to a micro PC, so during that I'll most likely go back to the ConBee II, at least to continue troubleshooting.

@sjors-lemniscap

sjors-lemniscap commented Dec 12, 2023

> The underlying network layer, 802.15.4, has broadcast storm protection, so all routers handle only 9 broadcasts in 8 seconds and ignore any beyond that.

I found this the best explanation, from @MattWestb. I'm running the SkyConnect 7.3.2.0 beta firmware and created a simple script where 2 lights, which I created a ZHA group for, blink on and off 5 times each, with a duration of 1 second. This multicasts 5x off and 5x on to the ZHA group.

At the 9th time I receive the network busy error, meaning that the broadcast storm protection kicked in. As stated before, Conbee II and other EmberZNet Serial Protocol (EZSP) controllers might not stick to the protocol specification. The SkyConnect does, however, and I can understand if they decide not to lift the broadcast storm protection, as this might overload the Zigbee network.
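
To make the quoted numbers concrete, here is a rough Python model of that kind of limit: a table with room for 9 broadcasts, each entry held for 8 seconds. The class name `BroadcastTable`, the `simulate` helper, the 0.5 s command spacing and the exact expiry rule are all assumptions made for illustration. The real firmware logic isn't shown in this thread and may well differ.

```python
from collections import deque


class BroadcastTable:
    """Toy model: at most `capacity` broadcasts may be pending per `window` seconds."""

    def __init__(self, capacity: int = 9, window: float = 8.0) -> None:
        self.capacity = capacity
        self.window = window
        self._sent = deque()  # timestamps of broadcasts still occupying the table

    def try_send(self, now: float) -> bool:
        # Drop entries that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.capacity:
            return False  # comparable to a "network busy" rejection
        self._sent.append(now)
        return True


def simulate(times, capacity=9, window=8.0):
    """Return one accepted/rejected flag per send time (in seconds)."""
    table = BroadcastTable(capacity, window)
    return [table.try_send(t) for t in times]


# Twelve group commands, one every 0.5 s (hypothetical timing).
burst = [i * 0.5 for i in range(12)]
results = simulate(burst)
print(results.count(True), "accepted,", results.count(False), "rejected")
```

In this model the first nine commands go out and the rest are rejected until older entries age out. It doesn't try to reproduce the exact point at which the error appeared in the test above, since other traffic on a real network also uses broadcasts.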

Decided for now to create smaller ZHA groups, and where needed a HA group to "bypass" the multicast storm protection since HA groups are sending commands to each individual light respectively.

@MattWestb
Contributor

All EZSP firmware has the same broadcast setting, since it is locked in the GSDK. It can be changed, but that needs a special patch from Silabs, and then the firmware can't be Zigbee certified because it is out of standard.
The TI coordinator firmware is patched by Z2M and goes outside the standard, but that causes routing problems: broadcasts stop working and you get "no route" errors for devices that communicate with unicast.

@wernerhp

wernerhp commented Jan 20, 2024

Getting EmberStatus.NETWORK_BUSY: 161 when calling a HA Helper Group that contains two Zigbee Groups (Kitchen: 6 downlights; Scullery: 2 downlights). Running SkyConnect on standard firmware on a Home Assistant Blue.

I also have an 8 downlight group in the Living Room and two Sonoff Basic ZBR3s in different locations to aid in routing, but devices drop off very frequently. Such a pain. Any advice?

I also get EmberStatus.DELIVERY_FAILED: 102 when controlling some of the lights individually. According to ZHA's Network view, the device is offline, but it's on the same circuit where other lights work, so what gives?

@codyc1515
Contributor

I also faced this issue trying to set the colour then turn off two lights in the same group.

@dmulcahey
Contributor

> Getting EmberStatus.NETWORK_BUSY: 161 when calling a HA Helper Group that contains two Zigbee Groups (Kitchen: 6 downlights; Scullery: 2 downlights). Running SkyConnect on standard firmware on a Home Assistant Blue.
>
> I also have an 8 downlight group in the Living Room and two Sonoff Basic ZBR3s in different locations to aid in routing, but devices drop off very frequently. Such a pain. Any advice?
>
> I also get EmberStatus.DELIVERY_FAILED: 102 when controlling some of the lights individually. According to ZHA's Network view, the device is offline, but it's on the same circuit where other lights work, so what gives?

Don’t add groups to groups like this. Create an additional zigbee group and add all devices to it for this sort of thing.

@dmulcahey
Contributor

> I also faced this issue trying to set the colour then turn off two lights in the same group.

Put ZHA in debug mode and make the error happen. Then disable debug mode and attach the downloaded log file here.

@MattWestb
Contributor

Something spooky has happened with broadcasts in the latest bug-fix release. I don't use groups much, since IKEA has removed them from all first-gen controllers, but the system still does a lot of broadcasting for route discovery and so on.
Now I'm getting this in the log several times a day in my production system:

Logger: zigpy.topology
Source: /usr/local/lib/python3.11/site-packages/zigpy/topology.py:84
First occurred: 10:12:37 (3 occurrences)
Last logged: 22:36:49

Topology scan failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 78, in _scan_loop
    await self.scan()
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 96, in scan
    await self._scan_task
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 221, in _scan
    await self._find_unknown_devices(neighbors=self.neighbors, routes=self.routes)
  File "/usr/local/lib/python3.11/site-packages/zigpy/topology.py", line 253, in _find_unknown_devices
    await self._app._discover_unknown_device(nwk)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 945, in _discover_unknown_device
    return await zigpy.zdo.broadcast(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/device.py", line 623, in broadcast
    return await app.broadcast(
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 921, in broadcast
    await self.send_packet(
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 912, in send_packet
    raise zigpy.exceptions.DeliveryError(
zigpy.exceptions.DeliveryError: Failed to enqueue message after 3 attempts: <EmberStatus.NETWORK_BUSY: 161>
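
The "after 3 attempts" wording suggests the send is retried a few times before the error is raised. A generic retry-on-busy loop looks roughly like the sketch below. It is not the bellows implementation: `NetworkBusy`, `send_with_retry`, `flaky_send` and the delay values are made up for illustration.

```python
import asyncio


class NetworkBusy(Exception):
    """Hypothetical stand-in for a 'network busy' status from the radio."""


async def send_with_retry(send, attempts: int = 3, delay: float = 0.05):
    """Call `send()` up to `attempts` times, waiting a little longer each time."""
    for attempt in range(1, attempts + 1):
        try:
            return await send()
        except NetworkBusy:
            if attempt == attempts:
                raise RuntimeError(
                    f"Failed to enqueue message after {attempts} attempts"
                )
            await asyncio.sleep(delay * attempt)  # simple linear backoff


async def demo():
    calls = 0

    async def flaky_send():
        # Busy on the first two tries, then succeeds.
        nonlocal calls
        calls += 1
        if calls < 3:
            raise NetworkBusy()
        return "sent"

    result = await send_with_retry(flaky_send)
    return result, calls


result, calls = asyncio.run(demo())
print(result, calls)
```

Retrying only helps when the busy condition clears within the retry window. If broadcasts keep arriving faster than they age out, every attempt fails and the error above is the result.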

@HenrikBClausen

I had the same issue with the zha and the network becoming congested - most often without an obvious cause. Following some of the advice above, the only thing I have done is to flash the SkyConnect to the latest beta version v7.4.0.0.

The way to get there is not trivial, but following the instructions for running the add-on for ssh access and flashing the SkyConnect has made a huge difference to the stability of my Zigbee network.

@ABEIDO

ABEIDO commented Feb 11, 2024

> I had the same issue with the zha and the network becoming congested - most often without an obvious cause. Following some of the advice above, the only thing I have done is to flash the SkyConnect to the latest beta version v7.4.0.0.
>
> The way to get there is not trivial, but following the instructions for running the add-on for ssh access and flashing the SkyConnect has made a huge difference to the stability of my Zigbee network.

Any reason for not doing it the easy way via the web flasher?
https://skyconnect.home-assistant.io/firmware-update/

What version did you upgrade from?

@wernerhp

I disabled Multiprotocol support and it seems to have made a difference.

@codyc1515
Contributor

> > I had the same issue with the zha and the network becoming congested - most often without an obvious cause. Following some of the advice above, the only thing I have done is to flash the SkyConnect to the latest beta version v7.4.0.0.
> >
> > The way to get there is not trivial, but following the instructions for running the add-on for ssh access and flashing the SkyConnect has made a huge difference to the stability of my Zigbee network.
>
> Any reason for not doing it the easy way via the web flasher? https://skyconnect.home-assistant.io/firmware-update/
>
> What version did you upgrade from?

I was on 7.3.1.0 but was never prompted to upgrade to the latest version, 7.3.2.0. Anyway, I have tried your recommendation and upgraded to 7.4.0.0. To do this, you need to download the relevant .gbl file and choose the Change Firmware option in that Web UI.

@wernerhp

I'm also on 7.3.1. Will give 7.4.0 a shot

@ABEIDO

ABEIDO commented Feb 11, 2024

I'm on 7.3.2.0. Those of you updating to the 7.4.0.0 beta, please come back with feedback here if you can.

@evelant

evelant commented Apr 25, 2024

I'm on 7.4.0.0 skyconnect firmware (zigbee only, no multiprotocol) and am facing this same issue. It even happens when controlling a single ZHA group manually from the iOS app with no automations involved. I suspect that the hass UI isn't doing any sort of debouncing so if your finger moves 10 pixels on the color chooser it seems like it spams the network with 20+ broadcasts and locks things up. My network is solid, no device is more than 10ft from a repeater and I've made sure my channel is clear and not conflicting with my wifi. I'll have to give plain groups a try.
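
The debouncing idea can be sketched in a few lines of Python. This is only an illustration of the technique, not how the Home Assistant frontend or ZHA actually behaves. `Debouncer`, its `delay` parameter and the `send` callback are hypothetical names, and the timings are arbitrary.

```python
import asyncio


class Debouncer:
    """Forward only the latest value once input has been quiet for `delay` seconds."""

    def __init__(self, send, delay: float = 0.3) -> None:
        self._send = send  # hypothetical callback, e.g. one group command
        self._delay = delay
        self._task = None

    def update(self, value) -> None:
        # A newer value replaces any pending one that has not been sent yet.
        if self._task is not None:
            self._task.cancel()
        self._task = asyncio.create_task(self._fire(value))

    async def _fire(self, value) -> None:
        await asyncio.sleep(self._delay)
        self._send(value)


async def demo():
    sent = []
    debouncer = Debouncer(sent.append, delay=0.1)
    # Simulate a finger dragging across a color picker: 20 rapid updates.
    for hue in range(20):
        debouncer.update(hue)
        await asyncio.sleep(0.005)
    await asyncio.sleep(0.3)  # let the final value go out
    return sent


sent = asyncio.run(demo())
print(sent)
```

With these timings the twenty updates should collapse into a single send of the last value, so a drag on the color picker would cost one group command instead of twenty.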

This issue is quite frustrating! It's very easy to trigger and usually results in lights being in all sorts of inconsistent states. I've avoided wifi devices because I thought zigbee would be more reliable but I've had nothing but trouble with reliability. I've tried zha and z2m with skyconnect, sonoff-p, and sonoff-e. Different unreliabilities with all of them. None of them have worked reliably like zwave does. Might be time to give up on zigbee and set up a dedicated ssid on an isolated subnet to make wifi devices secure.
