Z-Stack_3.x.0 coordinator 20230507 feedback #439

Closed
Koenkk opened this issue Apr 1, 2023 · 340 comments

Comments

@Koenkk
Owner

Koenkk commented Apr 1, 2023

Please provide your feedback to the Z-Stack_3.x.0 coordinator 20230507 firmware here.

I hope this solves the NWK_TABLE_FULL many people were experiencing.

Changelog

20230507

  • Enable child aging to fix issues like #13478 (but not for older Xiaomi devices as they do not implement child aging correctly which gets them kicked out of the network)
  • Increase message timeout from 7 to 8 seconds to increase message delivery success rate for devices using a 7.5 seconds poll interval (#13478)
  • Improve performance with larger networks
    • Optimize table sizes
    • Increase stack_size from 1024 to 8192
  • Add firmware for CC1352P7
  • SimpleLink SDK 7.10.00.98

Download

@guillaume042

OK, thanks!
Do you want any specific tests? Do I need to re-pair, or just flash and see?

@lux73

lux73 commented Apr 1, 2023

If you use a plain Z2M install rather than the ZHA add-on, you can flash and test without issue.

@petergebruers

Upgrade from CC2652R_coordinator_20220219.hex to 20230401 without issues.

End devices: 28
Router: 9

Uptime is only 30 minutes, but I checked that all sensors and actuators are OK, and they are doing just fine.

@lux73

lux73 commented Apr 1, 2023

just updated from 20221226 to 20230401

End devices: 35
Router: 17

The update went smoothly with the cc2538-bsl.py script.

@3v1n0

3v1n0 commented Apr 1, 2023

I'm not sure if it's related, and I'm using ZHA, but since I upgraded my slaesh dongle to the previous version, the Ikea button devices aren't firing events anymore.

I can pair them and they immediately disappear (they are marked as unavailable) and no event is fired.

@guillaume042

Upgrade done 6 hours ago.
At first the network was a little 'laggy'.
But so far so good; it seems better now. Devices stay online and there are no more table full messages.
I'll keep an eye on it.

Regards

@guillaume042

During the evening, I got some table full messages and one router went offline.
I also notice latency in the network.
Could it be because all my lights are motion-triggered when the luminosity is low, so there are more messages on the network?
Thank you

@guillaume042

guillaume042 commented Apr 1, 2023

Something is happening.
3 routers down.
(In the log, ignore jardin_arrière_lampe; it is a device I have physically disconnected.)

``
zigbee2mqtt | Zigbee2MQTT:error 2023-04-01 20:25:13: Publish 'set' 'state' to 'salon-salleamanger-inter-plafonnier' failed: 'Error: Command 0x5c0272fffe16b298/2 genOnOff.on({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":3077,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'
zigbee2mqtt | Zigbee2MQTT:error 2023-04-01 20:26:03: Publish 'set' 'state' to 'salon-salleamanger-inter-plafonnier' failed: 'Error: Command 0x5c0272fffe16b298/2 genOnOff.on({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":3077,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'
zigbee2mqtt | Zigbee2MQTT:error 2023-04-01 20:26:43: Publish 'set' 'color_temp' to 'salon-ampoule-plafond' failed: 'Error: Command 0xa4c13819c3ce7ca2/1 lightingColorCtrl.moveToColorTemp({"colortemp":350,"transtime":10}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":7587,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:26:54: Failed to ping 'jardin_arrière_lampe' (attempt 1/2, Read 0x00124b0023441baa/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC no ack' (233)))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:27:16: Failed to ping 'jardin_arrière_lampe' (attempt 2/2, Read 0x00124b0023441baa/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'MAC no ack' (233)))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:27:33: Failed to ping 'wc_haut_inter' (attempt 1/2, Read 0x60a423fffeaae633/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 62666 - 1 - 8 - 0 - 1 after 10000ms))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:27:46: Failed to ping 'wc_haut_inter' (attempt 2/2, Read 0x60a423fffeaae633/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":62666,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)')))
zigbee2mqtt | Zigbee2MQTT:error 2023-04-01 20:32:03: Publish 'set' 'state' to 'salon-salleamanger-inter-plafonnier' failed: 'Error: Command 0x5c0272fffe16b298/2 genOnOff.on({}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (SREQ '--> ZDO - extRouteDisc - {"dstAddr":3077,"options":0,"radius":30}' failed with status '(0xc7: NWK_TABLE_FULL)' (expected '(0x00: SUCCESS)'))'
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:33:11: Failed to ping 'sdb_bas_inter' (attempt 1/2, Read 0x60a423fffeaaf15a/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:33:38: Failed to ping 'sdb_bas_inter' (attempt 2/2, Read 0x60a423fffeaaf15a/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:35:04: Failed to ping 'parents_inter_plafonniers' (attempt 1/2, Read 0x60a423fffee2a71b/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":true,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))
zigbee2mqtt | Zigbee2MQTT:warn 2023-04-01 20:35:35: Failed to ping 'parents_inter_plafonniers' (attempt 2/2, Read 0x60a423fffee2a71b/1 genBasic(["zclVersion"], {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Data request failed with error: 'No network route' (205)))

``

@sjorge
Sponsor Contributor

sjorge commented Apr 1, 2023

Been running for a few hours now, no offline/dropped devices yet, however it does feel slow to react at times.
This seems to usually coincide with all my lights reporting their color change for my adaptive lighting group. It seems to recover a bit afterwards; not a big issue though.

Total: 92
Router: 50
End devices: 42

@guillaume042

> Been running for a few hours now, no offline/dropped devices yet, however it does feel slow to react at times. This seems to usually coincide with all my lights reporting their color change for my adaptive lighting group. It seems to recover a bit afterwards; not a big issue though.
>
> Total: 92
> Router: 50
> End devices: 42

I got the same feeling that issues appear when traffic is high.

@sjorge
Sponsor Contributor

sjorge commented Apr 1, 2023

> I got the same feeling that issues appear when traffic is high.

In this case there are 27 lights in the group that post a colorTemperature attributeReport. I guess some memory table fills up with those device entries, which slowly age out, making things faster again until the next report batch.

I'm guessing that if a lot of end devices send an update at the same time, the effect will be similar. It's not really an issue (yet), just noticeable if you are paying attention to any funkiness/change.

@guillaume042

Little update.
The situation is better but not perfect:
2 routers offline. One of them came back online after clicking on reconfigure; the other did not.
Some end devices (online) don't send information (presence detectors), but some are OK.
@Koenkk could you do a little more magic? :)

@lux73

lux73 commented Apr 2, 2023

For me nothing changed with this FW update: all my devices stay online and reliable, and everything works as nicely as it did with FW 20221226.

Coordinator CC2652RB (slaesh)

@Koenkk
Owner Author

Koenkk commented Apr 2, 2023

@guillaume042 let's wait for some feedback from more people first.

@rolf-tx

rolf-tx commented Apr 2, 2023

Sonoff Zigbee 3.0 Plus
I had two eWeLink MS01 motion sensors stop sending events.
Restarting HA or rebooting the Yellow did not reactivate the motion sensors.
Reinstalled 20220219 and one of them came back. The last one required a reset and re-add to get it working.

@guillaume042

> @guillaume042 let's wait for some feedback from more people first.

No problem. I still experience NWK_TABLE_FULL errors, router disconnections, and 'muted' end devices, but it is better than before.
I'll wait for the next step, and I can give you whatever logs you need.

@johnlento

johnlento commented Apr 3, 2023

I am getting a lot of NWK_TABLE_FULL entries in my logs, 154 today. I never really looked into it before this firmware update, but they are there. I am also still getting a lot of coordinator disconnects: 2,601.

2023-04-03 13:23:43.727 DEBUG (MainThread) [zigpy_znp.api] Received command: ZDO.ExtRouteDisc.Rsp(Status=<Status.NWK_TABLE_FULL: 199>)

2023-04-03 03:20:28.332 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0x9B47:1:0x0702]: async_initialize: all attempts have failed: [DeliveryError('Coordinator is disconnected, cannot send request'), DeliveryError('Coordinator is disconnected, cannot send request'), DeliveryError('Coordinator is disconnected, cannot send request'), DeliveryError('Coordinator is disconnected, cannot send request')]

Let me know if I can provide anything more. I use ZHA on Home Assistant and have a fair number of devices.
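
For anyone who wants to produce comparable counts from their own logs, here is a minimal sketch. It only counts lines containing the two marker strings that appear in the excerpts above; the function name `count_markers` and the log file name in the usage comment are illustrative assumptions, not part of ZHA or Zigbee2MQTT.

```python
from collections import Counter

# Substrings copied from the log excerpts quoted in this thread.
MARKERS = ("NWK_TABLE_FULL", "Coordinator is disconnected")


def count_markers(lines, markers=MARKERS):
    """Count how many log lines contain each marker substring.

    A line is counted once per marker, even if the marker
    appears several times on that line.
    """
    counts = Counter({marker: 0 for marker in markers})
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Hypothetical usage (the file name is an assumption):
# with open("home-assistant.log", encoding="utf-8", errors="replace") as fh:
#     print(count_markers(fh))
```

Because a line is counted only once per marker, a single ZHA warning that lists several `DeliveryError` entries still counts as one disconnect event.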

@Venom84

Venom84 commented Apr 4, 2023

I tried flashing with Flash Programmer 2 ver. 1.8.0 and I'm getting the following errors, and nothing flashes:

Overlapping flash area in page: 0, offset address 0x0000
Reset target ...
Reset of target successful.

not sure why this is happening :-(

@nepozs

nepozs commented Apr 4, 2023

> not sure why this is happening :-(

I think the firmware file is bigger than the flash: for me it is impossible to flash a CC2652P (Egony v4) with CC1352P2_CC2652P_other_coordinator_20230401.hex
CC2652P_flash_size_2023-04-04_03-18

@guillaume042

> @guillaume042 let's wait for some feedback from more people first.

> No problem. I still experiment NWK full, router disconnection and end device 'muted' but it is better than previously. Waiting for next step and i can give you whatever log you need.

I don't know if it is relevant, but it is not always the same 2/3 routers that are offline. Sometimes one comes back while another one drops off. The same goes for end devices: it is not always the same 5/6 that stop sending info.

@Venom84

Venom84 commented Apr 5, 2023

> I think firmware file is bigger than flash - for me it is impossible to flash CC2652P (Egony v4) with CC1352P2_CC2652P_other_coordinator_20230401.hex

Ah right, yeah, this new FW is about 100 kB bigger than the 20221226 hex file.
Aww man :-( So does this mean we can never get anything newer than coordinator FW 20221226?

I'm using the zzh (CC2652R stick). What's the best Zigbee stick to use going forward?

I'm going to try flashing with cc2538-bsl.py and see if that makes any difference.

@nepozs

nepozs commented Apr 5, 2023

@Venom84
It doesn't matter how you flash: the file can't be bigger than the flash size (and even allowing for protected areas it could not be that big).
There are some TI chipsets with more flash, e.g. CC1352P7 (but I haven't seen any Zigbee dongle based on this chipset).

CC2652R, CC2652P, CC1352P2 have 352 kB available for the user program.

BTW, the publication date 20230401 looks like April Fools' Day.

@Koenkk
Owner Author

Koenkk commented Apr 5, 2023

@Venom84 can you check if 20230405 fixes this?

@nepozs

nepozs commented Apr 5, 2023

@Koenkk
For me (CC2652P Egony v4), flashing the 20230405 firmware is still impossible.
The file size is 493 KB (505 284 bytes), exactly the same as the 20230401 version.
Compared to the latest stable (and working) version, it is considerably bigger:

The last stable, 20221226, is 400 KB (409 630 B); 20221220 (an old dev version that works for me) is the same size.
The older stable 20220219 is 388 KB (397 940 B).

TI Flash Programmer log (I renamed the hex file, as all my archived hex files are in the same folder):

>Initiate access to target: COM4 using 2-pin cJTAG.
>Reading file: D:/zigbee/egony_v4_ebyte_koenkk/CC1352P2_CC2652P_other_coordinator_20230405alfa2.hex.
>Unknown record type: 3.
>Reset target ...
>Reset of target successful.

BTW
The log from successfully flashing 20221226 suggests that in reality there is plenty of free space in flash (I checked "Verify by readback"), so maybe the file format isn't as simple as I thought.

>Reading file: D:/zigbee/egony_v4_ebyte_koenkk/CC1352P2_CC2652P_other_coordinator_20221226.hex.
>Start flash erase ...
>Erase finished successfully.
>Start flash programming ...
>Programming finished successfully.
>Start flash verify ...
>Page: 0 verified OK.
>Page: 1 verified OK.
>Page: 2 verified OK.
>Page: 3 verified OK.
>Page: 4 verified OK.
>Page: 5 verified OK.
>Page: 6 verified OK.
>Page: 7 verified OK.
>Page: 8 verified OK.
>Page: 9 verified OK.
>Page: 10 verified OK.
>Page: 11 verified OK.
>Page: 12 verified OK.
>Page: 13 verified OK.
>Page: 14 verified OK.
>Page: 15 verified OK.
>Page: 16 verified OK.
>Page: 17 verified OK.
>Page: 18 verified OK.
>Page: 19 verified OK.
>Page: 20 verified OK.
>Page: 21 verified OK.
>Skip verification of unassigned page: 22.
>Skip verification of unassigned page: 23.
>Skip verification of unassigned page: 24.
>Skip verification of unassigned page: 25.
>Skip verification of unassigned page: 26.
>Skip verification of unassigned page: 27.
>Skip verification of unassigned page: 28.
>Skip verification of unassigned page: 29.
>Skip verification of unassigned page: 30.
>Skip verification of unassigned page: 31.
>Skip verification of unassigned page: 32.
>Skip verification of unassigned page: 33.
>Skip verification of unassigned page: 34.
>Skip verification of unassigned page: 35.
>Skip verification of unassigned page: 36.
>Skip verification of unassigned page: 37.
>Skip verification of unassigned page: 38.
>Skip verification of unassigned page: 39.
>Skip verification of unassigned page: 40.
>Skip verification of unassigned page: 41.
>Skip verification of unassigned page: 42.
>Page: 43 verified OK.
>Verification finished successfully.
>Reset target ...
>Reset of target successful.
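
On the file-size question: a .hex file is Intel HEX text, in which every data byte is written as two ASCII hex digits plus per-record overhead, so the size on disk is considerably larger than the number of bytes that end up in flash. The sketch below sums the actual data bytes, reports the address range, and lists the record types found. Record type 03 (Start Segment Address) is part of the Intel HEX format and is the type the programmer log above calls "Unknown record type: 3"; whether that record alone explains the failed flash is not confirmed here. The function name `hex_payload_stats` is a hypothetical helper, not part of any TI tool.

```python
def hex_payload_stats(lines):
    """Summarise Intel HEX text.

    Returns (data_bytes, lowest_addr, highest_addr, record_types).
    """
    base = 0          # upper address bits, set by type 02/04 records
    data_bytes = 0
    lo = hi = None
    types = set()
    for raw in lines:
        line = raw.strip()
        if not line.startswith(":"):
            continue  # skip blank or non-record lines
        rec = bytes.fromhex(line[1:])
        length = rec[0]
        offset = int.from_bytes(rec[1:3], "big")
        rtype = rec[3]
        payload = rec[4:4 + length]
        # All bytes of a record, checksum included, sum to 0 modulo 256.
        if sum(rec) & 0xFF != 0:
            raise ValueError(f"bad checksum in record: {line}")
        types.add(rtype)
        if rtype == 0x00:    # data record
            start = base + offset
            end = start + length - 1
            data_bytes += length
            lo = start if lo is None else min(lo, start)
            hi = end if hi is None else max(hi, end)
        elif rtype == 0x02:  # extended segment address
            base = int.from_bytes(payload, "big") << 4
        elif rtype == 0x04:  # extended linear address
            base = int.from_bytes(payload, "big") << 16
        elif rtype == 0x01:  # end of file
            break
    return data_bytes, lo, hi, types


# Hypothetical usage (the file name is an assumption):
# with open("coordinator.hex") as fh:
#     print(hex_payload_stats(fh))
```

Comparing `data_bytes` and the highest address against the chip's flash size is a fairer test than comparing the .hex file size.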

@Ricox1975

Same problem here with a CC2652P Ebyte E72-2G4M20S1E (20 dBm).
I'm back on CC1352P2_CC2652P_other_coordinator_20221226.hex

Since the disconnections are only about every 2 weeks.
Bildschirmfoto 2023-08-22 um 12 05 12

@azsystem

azsystem commented Aug 24, 2023

Zigbee2MQTT version
1.32.2 commit: 1ec1e572

Coordinator type
zStack3x0

Coordinator revision
20230716

Ran flawlessly for approx. a week, now lag and drops.
After restarting Zigbee2MQTT the lag is gone (using the button in the frontend).

Using sonoff stick.

[edit]

If restarting the Zigbee2MQTT software fixes the problem temporarily, is it legitimate to conclude that the issue lies not in the stick's firmware but in the Zigbee2MQTT software?

@epower53

Link: coordinator_20230716.zip

I was experiencing mass dropouts by part of my network ~ 1x / wk while running 0507... every single Inovelli VZM31-SN would suddenly show as offline in HA. Since updating to 0716 two weeks ago, this hasn't happened.

Edit: 1 week later and still fine running 0716.

Now at over 1 month on 20230716 and everything is still rock solid - faster than 20221226 and far more reliable than 20230507. Mid-sized network, so I'm not running into NWK_TABLE_FULL errors like some... very happy with this build for my not-so-large network.

image

@ahd71

ahd71 commented Aug 25, 2023

Firmware 20230716 and tonight's latest dev release gave the error below on startup. Is it because I have more than 200 devices in my network?

Update 1: after reflashing and deleting the device entries in coordinator_backup (but keeping the top part), it seems to start again. It looks like I had 202 active devices (but more configured), so I'm not sure what happened (or whether it will happen again if I restart). I will keep trying and see what breaks and what works :-)

debug 2023-08-25 21:16:15: Loaded state from file /app/data/state.json
info 2023-08-25 21:16:15: Logging to console and directory: '/app/data/log/2023-08-25.21-16-15' filename: log.txt
debug 2023-08-25 21:16:15: Removing old log directory '/app/data/log/2023-08-25.21-12-37'
info 2023-08-25 21:16:15: Starting Zigbee2MQTT version 1.32.2-dev (commit #1439f4c)
info 2023-08-25 21:16:15: Starting zigbee-herdsman (0.18.4)
debug 2023-08-25 21:16:15: Using zigbee-herdsman with settings: '{"adapter":{"concurrent":null,"delay":null,"disableLED":true},"backupPath":"/app/data/coordinator_backup.json","databaseBackupPath":"/app/data/database.db.backup","databasePath":"/app/data/database.db","network":{"channelList":[11],"extendedPanID":[221,221,221,221,221,221,221,221],"networkKey":"HIDDEN","panID":12495},"serialPort":{"adapter":"zstack","path":"/dev/ttyUSB0","rtscts":false}}'
error 2023-08-25 21:16:36: Error while starting zigbee-herdsman
error 2023-08-25 21:16:36: Failed to start zigbee
error 2023-08-25 21:16:36: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
error 2023-08-25 21:16:36: Exiting...
error 2023-08-25 21:16:36: Error: target adapter tclk table size insufficient (size=200)
    at AdapterBackup.restoreBackup (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/adapter-backup.ts:309:35)
    at ZnpAdapterManager.beginRestore (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/manager.ts:299:9)
    at ZnpAdapterManager.start (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/manager.ts:80:17)
    at Controller.start (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:132:29)
    at Zigbee.start (/app/lib/zigbee.ts:60:27)
    at Controller.start (/app/lib/controller.ts:101:27)
    at start (/app/index.js:107:5)

@MattWestb

Does anyone know how many TC link keys the firmware can store in NVM?
With the 2023 Zigbee 3 fix for end devices, more devices are updating their TC link key. Before, they did not do so when they were direct children of the coordinator and had joined using Zigbee HA 1.x pairing (no TC link key used).
If 200 is the maximum, it becomes a problem when the 201st device joins the network and wants to (or must) update its TC link key.

@ahd71

ahd71 commented Aug 25, 2023

"Error: target adapter tclk table size insufficient (size=200)" indicate it is 200 - but if that is a hard or soft limit I don't know.
I read somewhere that 200 is the max number of zigbee 3 devices but others could be a lot more.

@pannal

pannal commented Aug 25, 2023

After trying out older versions I've finally settled on 20220219 as by far the most stable firmware for the zzh!. No dropouts, no delay, stable for three weeks.

@MattWestb

@ahd71 It is a hard limit on how many keys the NCP can store in NVM, and once it is reached that's it.
If you have devices that do not request a TC link key, you can have 0xffff (65535) devices, as in Zigbee HA 1.x.
EZSP is a little different: in Z2M and ZHA it uses a hashed TC link key (IEEE + hash key = TC link key; only the hash key is stored for the network) and does not use per-device key storage, so in theory it can have a maximum of 0xffff real Zigbee 3 devices in a secured network (R19 and newer devices).

Make a backup of the coordinator and look in the file to see how many IEEE TC link keys are stored in it; then you will know.
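
The check suggested above can be scripted. This is a minimal sketch that assumes the backup is the JSON file written by Zigbee2MQTT (the startup log earlier in this thread shows `/app/data/coordinator_backup.json`) and that it has a top-level `devices` list in which entries with a stored key carry a `link_key` object. Those field names are assumptions about the file layout and should be checked against your own backup; `load_backup` and `count_link_keys` are hypothetical helper names.

```python
import json


def load_backup(path):
    """Read a coordinator backup JSON file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_link_keys(backup):
    """Return (total_devices, devices_with_link_key).

    Assumes each entry in backup["devices"] is a dict and that a
    stored TC link key shows up as a non-empty "link_key" value.
    """
    devices = backup.get("devices", [])
    with_key = sum(1 for device in devices if device.get("link_key"))
    return len(devices), with_key


# Hypothetical usage (path taken from the log in this thread):
# total, keyed = count_link_keys(load_backup("/app/data/coordinator_backup.json"))
# print(f"{keyed} of {total} devices have a stored link key")
```

If the second number is near 200, that would be consistent with the `tclk table size insufficient (size=200)` error reported above.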

@adekloet

> @adekloet please try https://github.com/Koenkk/Z-Stack-firmware/files/12063461/coordinator_20230716.zip

Started running the 20230716 today. Do we get a new Feedback Z-Stack_3.x.0 topic for this one?

Thanks,
Alex

Zigbee2MQTT version
1.32.2 commit: unknown

Coordinator type
zStack3x0

Coordinator revision
20230716

Coordinator IEEE Address
0x00124b002298904d

Frontend version
0.6.133

Stats

    Total 96
    By device type
    Router: 64
    End devices: 32
    By power source
    Mains (single phase): 64
    Battery: 30
    undefined: 2

@popy2k14

@Koenkk sadly I also experienced slowdowns with 20230716. I didn't investigate further.

@Ricox1975 since you also have a rather large network: which version do you use that has no slowdowns and no issues with dropping devices?

thx

@cloudbr34k84

For me this has been the best firmware! I had one drop out but for the last few weeks it's been really good.
123 devices

@Matthy1810

For me version 20220219 is also the most stable one.

@snippem

snippem commented Aug 31, 2023

Hello all

Not the most useful post here, but I have been running the 20230507 build since it was released.
What I have noticed is that it runs stably, but after, let's say, 3 to 4 days the network slows down.
It does not become unresponsive, but when a button is pushed to activate a light, the light turns on after a second, sometimes more. Compared to older versions, though, the build has some advantages: for instance, during firmware updates the images get loaded and the network stays responsive "enough". What I noticed over the past few days, after I moved all of my plugs to ZHA and reduced my network from 62 routers to 34 routers, is that the whole network feels much more stable (108 devices in total now on my Zigbee2MQTT). I have now installed the latest firmware, 20230716, and with the latest router changes I hope the network stays responsive for longer.

@sjorge
Sponsor Contributor

sjorge commented Aug 31, 2023

Had a 3rd crash on 20230716 :( It's certainly less stable than the one I was using before, which had the occasional slowdown.

Nothing interesting in the logs, it seems it just stopped responding at some point:

debug 2023-08-31 08:54:16: Received MQTT message on 'zigbee2mqtt/adaptive_ligthing/set' with data '{"color_temp":379,"transition":3}'
debug 2023-08-31 08:54:16: Publishing 'set' 'color_temp' to 'adaptive_ligthing'
error 2023-08-31 08:54:22: Publish 'set' 'color_temp' to 'adaptive_ligthing' failed: 'Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":379,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)'
debug 2023-08-31 08:54:22: Error: Command 10100 lightingColorCtrl.moveToColorTemp({"colortemp":379,"transtime":30}) failed (SRSP - AF - dataRequestExt after 6000ms)
    at Timeout._onTimeout (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/utils/waitress.ts:64:35)
    at listOnTimeout (node:internal/timers:569:17)
    at processTimers (node:internal/timers:512:7)

Briefly before that it still received some data:

debug 2023-08-31 08:53:55: Received Zigbee message from 'light/livingroom/main/bulb1', type 'commandQueryNextImageRequest', cluster 'genOta', data '{"fieldControl":0,"fileVersion":16785162,"imageType":276,"manufacturerCode":4107}' from endpoint 11 with groupID 0

I wonder if somehow the fw is just crashing/running out of memory.

Restarting z2m doesn't help; unplugging and replugging the USB does. All devices got pinged and came back except one:

0xcc86ecfffee6193e, which is a Niko Single connectable switch,10A

Seems there are no missing devices in the NVM of the coordinator:

{
  "data": {
    "missing_routers": []
  },
  "status": "ok"
}

@BradleyFord

BradleyFord commented Aug 31, 2023 via email

@pdecat

pdecat commented Aug 31, 2023

Unplugging and re-plugging is what I resorted to for a month or so, until I managed to properly downgrade. I even installed a remotely controllable USB hub to automate this.

Traced everything in #439 (comment) FWIW.

@Ricox1975

@popy2k14
currently 20221226, problem-free for almost two weeks.
I also reduced my network to 170 actors

@popy2k14

popy2k14 commented Sep 1, 2023

@pdecat thx a lot for the detailed tracing in your comment above. Sadly I lost track :-)
Do you also use 20221226 now?
Is the NVRAM backup/restore necessary?

@Ricox1975 thx for the info.
Did you also use NVRAM backup/restore on downgrade?

My plan is to try (from top to bottom):

  • 20221226 (NVRAM Backup/Restore)?
  • 20220219

@okastl

okastl commented Sep 1, 2023

> Restarting z2m doesn't help, unplug and replugging the USB does.

Which USB stick do you use as coordinator? I have solved this plug-problem by writing a TCP/Serial converter in Python, which holds RTS (causing a reset with most sticks) as soon as z2m drops the tcp connection. This may not work with all sticks, but it works fine with Slaesh and Sonoff-P.
Maybe @Koenkk can add this "hard reset" as an option to herdsman.
I have attached my modified serial server in case you want to try it.
EDIT: I have attached the wrong file.
elamatcp.zip
For a Sonoff-P I use this command line
elamatcp.py --rtsreset --dtr 0 -I 127.0.0.1 <serial_port>

z2m configuration looks like this:

serial:
  port: "tcp://127.0.0.1:7777"
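
The core of the reset idea described above can be sketched in a few lines. This is not the code from elamatcp.py, whose internals are not shown in this thread; `pulse_rts` is a hypothetical helper that works on any object with a writable boolean `rts` attribute (a pyserial `serial.Serial` instance exposes one). Which RTS level actually triggers a reset, and for how long it must be held, depends on the stick's wiring, so the polarity and the 0.5 s default are assumptions.

```python
import time


def pulse_rts(port, hold_seconds=0.5, sleep=time.sleep):
    """Assert RTS on `port` for `hold_seconds`, then release it.

    `port` is any object with a writable boolean `rts` attribute.
    `sleep` is injectable so the function can be exercised
    without real hardware or real delays.
    """
    port.rts = True
    try:
        sleep(hold_seconds)
    finally:
        # Always release the line, even if the sleep is interrupted.
        port.rts = False


# Hypothetical usage with pyserial (device path is an assumption):
# import serial
# with serial.Serial("/dev/ttyUSB0", 115200) as port:
#     pulse_rts(port)
```

A TCP-to-serial bridge would call something like this when it notices that the client side of the TCP connection has closed.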

@pdecat

pdecat commented Sep 1, 2023

> Do you also use 20221226 now?

Yes.

> Is the NVRAM backup/restore necessary?

I think so, but it was not enough in my case to get my network stable again, the only working way was ZHA migration, not sure why...

@Ricox1975

@popy2k14
No, I use this option:
Bildschirmfoto 2023-09-01 um 13 45 50

@popy2k14

popy2k14 commented Sep 1, 2023

@pdecat @Ricox1975 thx. Downgraded to 20221226 without the NVM backup/restore.
Will see how it goes.

thx

@SargonofAssyria

SargonofAssyria commented Sep 1, 2023

Just my 2 cents. I have been running 20230716 on a Texas Instruments CC1352P2 CC2652P launchpad coordinator for a month now and have found no problems at all. No device has dropped off the mesh and the network is fast.
Stats
Total 91
By device type
End devices: 49
Router: 42
By power source
Battery: 47
Mains (single phase): 43
DC Source: 1

@Koenkk
Owner Author

Koenkk commented Sep 2, 2023

I've compiled 2 new test firmwares, let's continue in #474

@Koenkk Koenkk closed this as completed Sep 2, 2023
@Koenkk Koenkk unpinned this issue Sep 2, 2023
Repository owner locked as resolved and limited conversation to collaborators Sep 2, 2023