load_network_info → xncp_get_route_table_entry burst during periodic backup causes ASH ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT / NcpFailure loop (bellows 0.49.1, Python 3.14, ZBT-2) #724

@mc-k

Summary

During every periodic zigpy network backup, load_network_info walks the
coordinator's route table via repeated xncp_get_route_table_entry calls. This XNCP
burst exceeds the ASH ACK-timeout budget (ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT),
drops the NCP connection, and ZHA reinitialises into a crash loop (~8 s cycles) until
zigpy's backup retry backs off. Setting backup_enabled: false stops it completely.

Related prior report (different trigger, same error code): #617

Versions

  • bellows 0.49.1
  • zigpy 1.4.1
  • zha 1.3.1 / zhaquirks 1.2.0
  • Home Assistant Core 2026.5.4, Home Assistant OS 17.x
  • Python 3.14

Hardware / firmware

  • Home Assistant Connect ZBT-2 (EFR32MG24), EmberZNet 7.5.1.0, USB-attached
  • ~25 Zigbee devices, channel 15
  • Also reproduced identically on the HA Yellow's internal EFR32MG21 radio (different
    silicon, different transport — internal UART) before migrating to the ZBT-2. The
    fault is host-side: same failure on two different chips and two different transports.

Feature flag context

The route-table walk only runs when FirmwareFeatures.RESTORE_ROUTE_TABLE is present
in self._ezsp._xncp_features. On ZBT-2 / EmberZNet 7.5.1.0 this feature IS
advertised, so the code path runs. EmberZNet's source-route table holds up to 200
entries, so the walk can issue ~200 XNCP customFrame calls in rapid succession —
which is what overwhelms the ASH link.
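One possible mitigation is to pace the walk rather than issue every request back to back. The following is a minimal sketch only, not the bellows implementation: `walk_route_table`, `get_entry`, `pause_s`, and the "`None` means no more entries" convention are all hypothetical names and assumptions chosen for illustration, and a stub stands in for the real XNCP call.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def walk_route_table(
    get_entry: Callable[[int], Awaitable[Optional[dict]]],
    table_size: int,
    pause_s: float = 0.01,
) -> list[dict]:
    """Read route entries one at a time, pausing between requests.

    `get_entry` is a hypothetical stand-in for a per-index XNCP request.
    Returning None is an assumed "no more entries" signal, not documented
    firmware behaviour.
    """
    entries: list[dict] = []
    for index in range(table_size):
        entry = await get_entry(index)
        if entry is None:
            break  # stop early instead of polling every remaining slot
        entries.append(entry)
        # Yield to the event loop so other traffic can be serviced
        # between consecutive requests.
        await asyncio.sleep(pause_s)
    return entries


async def fake_get_entry(index: int) -> Optional[dict]:
    # Stub radio: pretend only the first five slots are populated.
    return {"index": index} if index < 5 else None


entries = asyncio.run(walk_route_table(fake_get_entry, table_size=200, pause_s=0))
print(len(entries))
```

Whether pacing alone keeps the ASH link within its ACK-timeout budget is untested here; the sketch only shows where a pause and an early exit would sit in such a loop.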

Problem description

ZHA starts and runs normally. When the periodic backup fires, create_backup calls
load_network_info(load_devices=True), which walks the route table via
xncp_get_route_table_entry(index=index), and the ASH link exceeds its maximum
ACK-timeout count:

Backup-path traceback:
WARNING [zigpy.backups] Failed to create a network backup
zigpy/backups.py _backup_loop → create_backup
zigpy/backups.py create_backup → load_network_info(load_devices=...)
bellows/zigbee/application.py:438 → ezsp.xncp_get_route_table_entry(index=index)
bellows/ezsp/__init__.py:832 → send_xncp_frame(GetRouteTableEntryReq(index=index))
bellows/ezsp/__init__.py:749 → customFrame(...)
bellows/ezsp/protocol.py:124 → send_data
bellows/uart.py:26 → send_data
bellows/ash.py:683 _send_data_frame → await ack_future
bellows.ash.NcpFailure

Companion failure during reinit (pre_permit / setPolicy), same ASH layer:
ERROR [homeassistant] Error doing job: Task exception was never retrieved
bellows/ezsp/v8/__init__.py:50 pre_permit → setPolicy
bellows/ezsp/protocol.py:124 → send_data
bellows/uart.py:26 → send_data
bellows/ash.py:735 send_data → asyncio.shield(...)
bellows/ash.py:660 _send_data_frame → raise NcpFailure(ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT)
bellows.ash.NcpFailure: NcpResetCode.ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT

Steps to reproduce

  1. ZBT-2 on EmberZNet 7.5.1.0 with FirmwareFeatures.RESTORE_ROUTE_TABLE advertised.
  2. ZHA running normally; devices communicate without issue.
  3. Wait for the periodic network backup to fire (first occurrence ~76 s after startup).
  4. Backup fails as above; NCP connection drops; ZHA reinitialises and re-fires the
    backup → repeating ~8 s crash loop, CPU 60–90%.

Expected vs actual

Expected: route-table walk completes or fails gracefully without losing the NCP
connection.

Actual: ASH ACK-timeout NcpFailure, connection drop, reinitialisation loop.
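To make "fails gracefully" concrete, here is a rough sketch of a best-effort walk that treats the route table as optional data: on a transport error it stops and returns what it has, so a caller could still finish the backup without routes. `LinkFailure`, `read_routes_best_effort`, and `flaky_get_entry` are hypothetical names; `LinkFailure` merely stands in for an error such as bellows.ash.NcpFailure. This assumes the failure can be caught before the connection is torn down, which may not hold for the real ASH layer.

```python
import asyncio
from typing import Awaitable, Callable


class LinkFailure(Exception):
    """Stand-in for a transport-level error (hypothetical, for illustration)."""


async def read_routes_best_effort(
    get_entry: Callable[[int], Awaitable[int]],
    table_size: int,
) -> tuple[list[int], bool]:
    """Return (routes_read_so_far, completed) instead of propagating the error."""
    routes: list[int] = []
    for index in range(table_size):
        try:
            routes.append(await get_entry(index))
        except LinkFailure:
            # Abandon the walk; the caller decides whether a partial
            # route table is acceptable for the backup.
            return routes, False
    return routes, True


async def flaky_get_entry(index: int) -> int:
    # Stub radio: the fourth request (index 3) fails.
    if index == 3:
        raise LinkFailure("simulated ACK timeout")
    return index


routes, completed = asyncio.run(read_routes_best_effort(flaky_get_entry, 10))
print(routes, completed)
```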

What I've already ruled out

  • Not a duplicate coordinator (a prior ROUTE_ERROR_ADDRESS_CONFLICT storm from a
    second radio on the same PAN was a separate fault, now resolved).
  • Not database corruption: PRAGMA integrity_check on zigbee.db returns ok.
  • Not host storage/recorder: DB on SSD, CPU is low except during this loop.
  • Single accessor: one ZHA instance, USB-attached ZBT-2, no Z2M or CLI tool on the
    port.
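For anyone repeating the database check, it can be run from Python's standard sqlite3 module. This is a small sketch; `integrity_ok` is a hypothetical helper name, and the demonstration uses an in-memory database rather than a real zigbee.db. Note that sqlite3.connect creates the file if it does not exist, so point it at a copy of the database taken while ZHA is stopped.

```python
import sqlite3


def integrity_ok(db_path: str) -> bool:
    """Return True if PRAGMA integrity_check reports a single 'ok' row."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    # A healthy database yields exactly one row containing the string "ok";
    # a damaged one yields one row per problem found.
    return rows == [("ok",)]


# Demonstration on an empty in-memory database.
print(integrity_ok(":memory:"))
```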

Workaround

zha:
  zigpy_config:
    backup_enabled: false

This prevents load_network_info from being called by the backup loop and stops the
crash entirely.
