Sonoff devices reboot every ~20 minutes with "Exception":29 caused by NTP multicast/manycast #16061
Comments
It looks like you're running out of heap space. A value of 14 is low and the exception seems to occur during heap allocation for a flash memcpy_P action.
What do you mean by this? As it only takes 20 minutes I suggest you enable heap logging (command |
I've migrated from the old DNS to a new one and changed the DHCP settings accordingly, but that should not affect anything. This is the console log from tdmgr during reboot
|
Wow |
Find the cause of executing command status 8 every 5 seconds. The command is executed from MQTT. Remove this and see if stability returns. |
Removed Status call but this did not help
|
Thx. So we still see heap degrading, probably due to a memory leak. For now I have no idea why, considering you use the standard development release with a simple template. I'll dive into the stack trace again tomorrow to see if it contains further hints. NOTE: I'm not sure whether you use the standard tasmota.bin development binary or your own compiled binary with additional features. Pls explain. |
i'm using binary from http://ota.tasmota.com/tasmota/release/tasmota.bin.gz |
Alas, I cannot find any Tasmota related cause for this. The stacktrace is long and only contains SDK network calls:
It's a WiFi network issue eating the heap but I would expect a RSSI of 78 should still be enough for a stable connection. Perhaps the wifi hardware is broken leading to bad connection attempts. |
Actually, all the devices with issues use 3 Mikrotik APs. On another site I have APs of the same model but with newer firmware, and I have no issues there. Will try to upgrade the WiFi firmware to a newer version and report back |
I updated the Mikrotik APs to 6.49.6, connected an additional AP from ASUS, and installed a fresh Sonoff Mini tasmotized to 12.0.2, but the Sonoffs continue to reboot about every 20 minutes because the heap is being eaten. Other network devices like cameras work normally. What additional info do I need to provide to continue investigating the network part? |
First remove the ASUS AP. It is worst thing you can use in combination with espressif devices. Endless story.... |
Sorry, accidentally closed, internet connection problems.. |
Looks like I found the issue. During reconfiguration I added an NTP server that sends multicast and manycast packets (https://wiki.mikrotik.com/wiki/Manual:System/Time). After I disabled it, heap usage stopped growing. Looks like there is some issue in the network stack when handling multicast/manycast. I will investigate further during the weekend and report back. |
This is great news! I looked for this issue in the old issue list and came across mikrotik and ntp server misconfigurations. As I rewrote the ntp client part since then I couldn't link it to your issue. Pls find time to investigate and let me know what to change to finally solve this. |
Did you also experience heap degrading on ESP32 based devices? I searched for UDP multicast packet handling on ESP8266 and found this: esp8266/Arduino#7907 (comment). Obviously the dropping of overflow packets doesn't seem to work within the current core code. |
I noticed for ESP8266 we use our own UDP listener driver. It might well be this driver doesn't test for overflow packets. I will dive in. |
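To illustrate what "testing for overflow packets" means in the comment above, here is a minimal Python sketch of a bounded receive queue that drops new packets once it is full instead of allocating more memory. It is not Tasmota's UDP listener; the class name `BoundedPacketQueue`, its methods, and the payloads are hypothetical and only show the principle.

```python
from collections import deque


class BoundedPacketQueue:
    """Hypothetical receive buffer that drops packets once it is full."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self.packets = deque()
        self.dropped = 0

    def on_receive(self, payload):
        """Store a packet if there is room; otherwise count it as dropped."""
        if len(self.packets) >= self.max_packets:
            self.dropped += 1
            return False
        self.packets.append(payload)
        return True

    def read(self):
        """Return the oldest queued packet, or None when the queue is empty."""
        return self.packets.popleft() if self.packets else None


# Five packets arrive before the application reads any of them.
queue = BoundedPacketQueue(max_packets=3)
accepted = [queue.on_receive(b"ntp-%d" % i) for i in range(5)]
```

With a limit like this, memory use stays bounded no matter how many multicast packets arrive between reads; without it, every unread packet keeps consuming heap.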
I would like to report that I have got the same issues. Our policy doesn't allow using public NTP servers for time resolving, so our DHCP scope assigns internal NTP servers. These are Server 2019 boxes and may not be working as expected. For testing, I'll open port 123 for the Tasmota devices and see if the problem disappears when running stock firmware. Niels |
Well, removing the NTP settings from the DHCP pool did not work as expected. I fetched the list of NTP servers from another site with Tasmotas and injected the NTP config (backlog NtpServer1 pool.ntp.org; NtpServer2 nl.pool.ntp.org; NtpServer3 0.nl.pool.ntp.org). However, after 21 minutes the devices rebooted. It is happening across multiple devices running several different 11.x and 12.x Tasmota releases, and I'm still not able to predict why this is happening. Niels |
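For anyone who wants to push the same NTP settings to several devices, the sketch below builds the Tasmota HTTP command URL for the Backlog quoted above. It only constructs the URL and sends nothing. The device address `192.0.2.10` is a placeholder, the helper name `backlog_url` is hypothetical, and a device with a web password would need extra parameters that are not shown here.

```python
from urllib.parse import quote


def backlog_url(device_ip, commands):
    """Build a Tasmota HTTP command URL that runs the commands as one Backlog."""
    backlog = "Backlog " + "; ".join(commands)
    # Percent-encode everything, including spaces and semicolons.
    return "http://" + device_ip + "/cm?cmnd=" + quote(backlog, safe="")


# 192.0.2.10 is a documentation-range placeholder, not a real device.
url = backlog_url("192.0.2.10", [
    "NtpServer1 pool.ntp.org",
    "NtpServer2 nl.pool.ntp.org",
    "NtpServer3 0.nl.pool.ntp.org",
])
print(url)
```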
Now what is reported in this Issue is a problem with some NTP servers sending multicast/manycast UDP packets. |
Hi @barbudor, I enabled multicast/manycast on my NTP server and the Tasmota devices started to reboot. When I disabled multicast/manycast on the NTP server, the reboots stopped. The issue is that some device in the network (in my case the NTP server) sends lots of multicast/manycast packets, causing excessive heap usage in Tasmota. When free heap goes to ~0, Tasmota reboots. |
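A steady heap drain like the one described here can be turned into a rough time-to-reboot estimate. The sketch below fits a straight line through free-heap readings and extrapolates to zero. The readings are made-up values chosen for illustration, not data from this thread, and the function name `minutes_until_exhausted` is hypothetical.

```python
def minutes_until_exhausted(samples):
    """Estimate minutes until free heap hits zero from (minute, free_kb) samples.

    Fits a least-squares line through the samples and returns the time left
    after the last sample, or None when the heap is not shrinking.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (h - mean_h) for t, h in samples) / var_t
    if slope >= 0:
        return None
    intercept = mean_h - slope * mean_t
    zero_at = -intercept / slope
    return zero_at - samples[-1][0]


# Hypothetical readings: free heap in KB at minutes 0, 5 and 10.
samples = [(0, 24.0), (5, 18.0), (10, 12.0)]
remaining = minutes_until_exhausted(samples)
```

If the estimate stays close to the observed reboot interval across several runs, that supports a constant-rate leak rather than a sudden allocation failure.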
I'm OoO. When I'm back I'll dive into the UDP multicast problem. Tasmota uses different UDP receive drivers so first issue is finding out which one is the culprit. I should be able to debug with either a mikrotik virtual router or a cheap hardware one. |
@barbudor, unfortunately moving to the "pool.ntp.org" servers and disabling the NTP server option on the DHCP server did not resolve the issue. The devices still restart spot-on every 20 minutes. Strangely, it doesn't seem to affect all boxes; for now it is pinned to that network. I've got two other networks running Tasmota devices (also with Windows DHCP servers) that don't show the issue. For testing purposes I am going to do the following. See if that does something. Niels |
For testing purposes, I am going to do the following. See if that does something. Not ideal, but for me, this works as a workaround for now. |
@darkdragondraco I know, @NielsPiersma |
Sorry
Clarification
Problem is gone (for all tasmos) when using fixed IP addresses
Problem surfaces on 'some' tasmos when set to DHCP.
Have not figured out the root cause.
Will update all tasmos to development version. So they all have the same firmware
Then I will enable DHCP again and see what is happening.
There must be some config causing this to happen.
I've over 100 tasmos running on various hardware but only at this site am I having this issue. It must be related to DHCP.
Also got some new d1 that work fine, so likely will move over the sensors and reprovision them.
Niels
On Wednesday, August 3, 2022 at 5:30:48 PM, Barbudor wrote:
@darkdragondraco I know,
my message was addressed to @NielsPiersma, who reported a similar problem to yours.
So the question "Does your NTP need multicast/manycast?" is exactly what I asked @NielsPiersma.
@NielsPiersma
So that's interesting: you found a configuration where the heap doesn't drop anymore and one where it does.
Would that mean the problem is not actually related, or not fully related, to multicast/manycast ....
|
After a few hours the problem reappeared, so I moved all Tasmotas to another VLAN dedicated to them. Continuing to monitor the situation |
My Tasmotas are in a dedicated VLAN by design and have been for over 3 years. The VLAN only houses ESP8266 based devices running Tasmota, 1 Raspberry Pi, and the router/gateway (this being a Cisco 3750 switch). Still, I have the same issues when the Tasmotas are on DHCP. The reboot is predictable, every 20 minutes. No NTP server is offered by DHCP. I am thinking about capturing all data from the VLAN for 25 minutes; that may shed some light on what is going on. Niels |
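Once such a capture exists, one way to check the multicast theory is to count which hosts send NTP traffic to multicast or broadcast addresses. The sketch below assumes the capture has already been exported to (source IP, destination IP, destination UDP port) tuples. That layout, the sample rows, and the function name `multicast_ntp_senders` are assumptions for illustration. 224.0.1.1 is the IANA-assigned NTP multicast group.

```python
import ipaddress
from collections import Counter

NTP_PORT = 123


def multicast_ntp_senders(packets):
    """Count NTP packets per source that went to a multicast or broadcast address.

    `packets` is an iterable of (src_ip, dst_ip, dst_port) tuples; this
    layout is an assumption about how the capture was exported.
    """
    counts = Counter()
    for src, dst, dport in packets:
        if dport != NTP_PORT:
            continue
        addr = ipaddress.ip_address(dst)
        if addr.is_multicast or dst == "255.255.255.255":
            counts[src] += 1
    return counts


# Hypothetical capture rows using documentation-range addresses.
rows = [
    ("192.0.2.1", "224.0.1.1", 123),        # NTP multicast
    ("192.0.2.1", "224.0.1.1", 123),        # NTP multicast
    ("192.0.2.50", "192.0.2.1", 123),       # ordinary unicast NTP request
    ("192.0.2.7", "239.255.255.250", 1900),  # multicast, but not NTP
]
senders = multicast_ntp_senders(rows)
```

A source with a high count here would be the first candidate to disable and retest, as was done with the NTP server earlier in the thread.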
@NielsPiersma |
A Pi in the network does no harm in general. I have 3 Pis running together in the same network with all my Tasmota devices. |
The Pi is running flightaware, so just port 80 is open and it is posting to flightradar.
I am not expecting this to be an issue. Just want to be as complete as possible
On Sunday, August 7, 2022 at 4:15:59 PM, Georgiy Brodskiy wrote:
@NielsPiersma
What are the services hosted on the Pi? I also have several Pis in the network. Moving to another LAN did not completely help; some devices still reboot. Disabled the NTP server. Monitoring for now.
|
I'd like to jump into this thread as I experience the same problems. Got a couple of Sonoff devices but also Shellys. The devices connect to Unifi APs. There was a major change in my VLAN, and I'm guessing that this triggered the situation: I moved the router IP from a Juniper switch to a VyOS router. Still troubleshooting. |
Unifi APs here as well.
However, one site is experiencing this issue. The other Unifi AP site is not. I will compare settings.
Niels
On Tuesday, August 16, 2022 at 4:37:33 PM, Silvan Raijer wrote:
I'd like to jump into this thread as I experience the same problems. Got a couple of Sonoff devices but also Shellys.
They all reboot every 20 minutes. Devices were running version 8.x - 10.x. I upgraded them all to version 12.0.2 yesterday but that didn't solve the problem for me.
The devices connect to Unifi APs.
There was a major change in my VLAN, and I'm guessing that this triggered the situation: I moved the router IP from a Juniper switch to a VyOS router.
Still troubleshooting.
|
Update,
Besides the SSID and the WPA password, the site settings are identical.
(IGMP, etc)
Still didn't find time to do a full Wireshark capture. Hopefully will get
it sorted out later this week.
Niels
|
I'm positive that I found the issue on my network that was causing all devices to reboot every ~20 minutes. I removed the ntp-server option and all devices have now been running for longer than 2 hours. They are allowed to query the time on the internet. This thread helped me to find the issue. Thanks |
@rayzilt could you do a test by changing the below code in file
to
to see if it stops heap degrading. |
I'll test later this day when I get home from work |
@arendst An update on your patch request. Before trying your patch I wanted to reproduce the same situation. Unfortunately I'm unable to do so by restoring the old settings that I had when all devices were running out of memory. I've had it running for a day now. The settings that I restored are:
I'm quite puzzled. I'll leave this faulty configuration running for the time being. |
This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions. |
Keep active |
This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions. |
That is what I am seeing as well. Not all devices in my network are affected. Some work as expected.
Niels
On Thursday, August 18, 2022 at 7:54:46 PM, Silvan Raijer wrote:
@arendst An update on your patch request.
Before trying your patch I wanted to reproduce the same situation. Unfortunately I'm unable to do so by restoring the old settings that I had when all devices were running out of memory. I've had it running for a day now.
The settings that I restored are:
* DHCP ntp-server set to an internal IP that is not reachable;
* DNS server set to a different IP that is not reachable.
I'm quite puzzled. I'll leave this faulty configuration running for the time being.
|
This issue has been automatically marked as stale because it hasn't any activity in last few weeks. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue was automatically closed because of being stale. Feel free to open a new one if you still experience this problem. |
Hello, I have moved all my ESP devices from Tuya to Tasmota. At this moment I have 5 devices running and all of them are restarting every 20min due to the free memory being exhausted throwing Exception 29. I was following this thread, couple weeks ago realized I had a container with NTP running (in a different subnet), disabled and removed the container and the devices stopped restarting. Unfortunately after 10, maybe 14 days the issue re-appeared (after a power outage) and now I can't seem to figure out what the root cause is. My wifi is provided by multiple Mikrotik APs. Analyzing network traffic I can see the Tasmota devices are connecting to the NTP server specified in DHCP option 42... Kindly requesting help. Mike |
Mike,
Just to eliminate the obvious, can you pls configure a static IP address, configure the NTP server address manually, and retest?
If the issue persists, it is not related to the DHCP issue that was resolved last year.
Kind regards
Niels
On Wednesday, January 10, 2024 at 10:42:15 AM, kristoficko wrote:
Hello,
I have moved all my ESP devices from Tuya to Tasmota. At this moment I have 5 devices running and all of them are restarting every 20min due to the free memory being exhausted throwing Exception 29.
I was following this thread, couple weeks ago realized I had a container with NTP running, disabled and removed the container and the devices stopped restarting.
Unfortunately after 10 days or so the issue re-appeared and I can't seem to figure out what the root cause is.
My wifi is provided by multiple Mikrotik APs.
Analyzing network traffic I can see the Tasmota devices are connecting to the NTP server specified in DHCP option 42...
Kindly requesting help.
Mike
|
Hi Niels, many thanks for the suggestion. I set up a static address on all 5 devices. Although it's a bit early to tell, I can already see the following improvement:
It doesn't seem to make a difference whether I set up my own internal NTP server or leave the 3 default ones. One more observation, which may be relevant to this issue: there seems to be a massive difference in the frequency of NTP requests:
Kind regards Mike |
DHCP option 42 (time server) has not been used in Tasmota for a few major versions. |
PROBLEM DESCRIPTION
Sonoff devices after upgrade to Tasmota 12.0.2 reboot every ~20 minutes with StatusSTK":{"Exception":29,"Reason":"Exception","EPC":["40252ca8",...
REQUESTED INFORMATION
Make sure you have performed every step and checked the applicable boxes before submitting your issue. Thank you!
Backlog Template; Module; GPIO 255
Backlog Rule1; Rule2; Rule3
Status 0
Set weblog to 4 and then, when you experience your issue, provide the output of the Console log:
TO REPRODUCE
Steps to reproduce the behavior:
Reboots every ~20 minutes
EXPECTED BEHAVIOUR
A clear and concise description of what you expected to happen.
Does not reboot without clear reason
SCREENSHOTS
If applicable, add screenshots to help explain your problem.
ADDITIONAL CONTEXT
Add any other context about the problem here.
Started a few days ago and only in one location. In the other location, Sonoff devices with the same firmware work OK. The problem devices use different power lines; some of them are on a UPS. Tried to downgrade to 11.1 without clearing the config but it did not help.
The only things that changed in the environment were the DNS and DHCP server reconfiguration.
(Please, remember to close the issue when the problem has been addressed)