strange reset on Feather M0 after random minutes #9

Closed
no-go opened this issue Mar 25, 2018 · 32 comments

@no-go

no-go commented Mar 25, 2018

Hi Adafruit.
I use your lib on an Adafruit Feather M0 and am getting random resets after many minutes. The idea was to use:
Watchdog.sleep(250)
instead of
delay(250)
to save power. It works and the device runs 3 times longer, but it resets after a random number of minutes.

For debugging I copied your code into my project: no-go/featherM0_ssd1331_watch@e403386

It worked, but the device was still resetting after some minutes :-(

I compared my ATtiny WDT sleep code with your code and decided to add #include <wdt.h> and rename the ISR:
void WDT_Handler(void) {
to
ISR(WDT_vect) {

but it does not fix the bug. I am sorry, but I have added a lot of stuff to my Feather M0 and I do not have a second, fresh one for testing with simple code. With delay(250) instead of Watchdog.sleep(250) and without your lib, I did not have a reset issue. I think running the WDT in sleep mode for more than 1 h causes a reset on a Feather M0 - maybe a device bug or a bug in your lib.

I hope you are able to reproduce my issue and can fix it.

:-D

@akubaa

akubaa commented Apr 10, 2018

I also have strange behavior while using Watchdog.sleep() with my Adafruit Feather M0.

I'm logging two serial interfaces to two different files on an SD card using "threads", but after I included all the libraries necessary for SleepyDog, the two serial interfaces started getting mixed up between the threads, dumping the interface1 buffer into the interface2 file and so on.

It feels like a lot of strange issues happen while using this library with the M0.

@no-go
Author

no-go commented Apr 10, 2018

I built a very simple sketch with an LED "blink" in setup and a sleep in the main loop. Since the LED only blinks in setup, every blink marks a reset. It blinked after, for example, 70 min, 12 min, 20 min, 5 min, 3 min ... The code was similar to this:

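#include <Adafruit_SleepyDog.h>
void setup() {
  // Blink once at startup; any later blink means the board has reset.
  pinMode(13, OUTPUT);
  digitalWrite(13, HIGH);
  delay(500);
  digitalWrite(13, LOW);
}
void loop() {
  // Low-power sleep instead of delay(250).
  Watchdog.sleep(250);
}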

@daterdots

daterdots commented Jan 15, 2019

I'll add that our group is also seeing random resets on the Adafruit Adalogger M0. We sleep with Watchdog.sleep(5000), and we see resets 0-6 times per day. There does not seem to be any clear pattern of resets in terms of time interval or duration after power up.

Also, we think but are not sure that these resets are associated with some strange data coming out of the Adalogger's serial TX pin. We have our Adalogger M0 hooked up to a device that receives commands over serial, and sometimes when the M0 resets our device will receive garbage commands. Strangely, if we intentionally use the watchdog to reset every 5000ms using Watchdog.enable(5000), we do not see any junk data come out of the serial port. The strange serial data issue is similar to the issue that @akubaa describes #9 (comment)

@daterdots

As a sanity check, we tried only replacing Watchdog.sleep(5000) with delay(5000), and we have not seen a reset in about 6 observed days.

@phec

phec commented Jan 29, 2019

I also get random resets and note that the frequency of the resets varies between 2 different M0 LoRa Feathers. One resets infrequently, after 50000 sleeps or so, while the other resets every 8000 sleeps or so. As noted by daterdots, if I replace sleep(1000) with delay(1000) the problem goes away.
Both Feathers are battery powered and are reporting a few parameters by LoRa radio. They maintain a count of the number of messages sent, which is how I spot the resets. The message counter is declared as volatile and updated after a noInterrupts() call.
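
Alongside a message counter, the SAMD21 itself records why it last reset. The sketch below is an illustration only and was not run by anyone in this thread: it decodes a raw reset-cause byte using the bit positions of the SAMD21 PM->RCAUSE register as listed in the SAM D21 datasheet. The names ResetCause, classifyReset and resetCauseName are made up for this example.

```cpp
#include <cstdint>

// Possible reset sources, named for this example only.
enum class ResetCause : uint8_t {
  PowerOn, BrownOut12, BrownOut33, External, Watchdog, System, Unknown
};

// Classify a raw reset-cause byte. Bit positions assumed from the SAMD21
// PM->RCAUSE register: POR=0, BOD12=1, BOD33=2, EXT=4, WDT=5, SYST=6.
// On the board the argument would be PM->RCAUSE.reg.
constexpr ResetCause classifyReset(uint8_t rcause) {
  return (rcause & (1u << 5)) ? ResetCause::Watchdog
       : (rcause & (1u << 6)) ? ResetCause::System
       : (rcause & (1u << 4)) ? ResetCause::External
       : (rcause & (1u << 2)) ? ResetCause::BrownOut33
       : (rcause & (1u << 1)) ? ResetCause::BrownOut12
       : (rcause & (1u << 0)) ? ResetCause::PowerOn
       : ResetCause::Unknown;
}

// Human-readable label, e.g. for logging over Serial or to an SD card.
inline const char *resetCauseName(ResetCause cause) {
  switch (cause) {
    case ResetCause::PowerOn:    return "power-on";
    case ResetCause::BrownOut12: return "brown-out (1.2 V)";
    case ResetCause::BrownOut33: return "brown-out (3.3 V)";
    case ResetCause::External:   return "external reset pin";
    case ResetCause::Watchdog:   return "watchdog";
    case ResetCause::System:     return "system reset request";
    case ResetCause::Unknown:    break;
  }
  return "unknown";
}
```

Printing resetCauseName(classifyReset(PM->RCAUSE.reg)) once in setup() would show whether a given restart came from the watchdog, the reset pin, or a supply dip.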

@daterdots

We dug into the library a bit, and we modified the library so that Watchdog.sleep() does not use the window mode. However, with everything stripped down, the problem still persists. Has anyone heard of a general problem with these ATMEL M0s having reset issues? I am starting to think this is an M0 issue more than a library issue.

@phec

phec commented Feb 1, 2019

I wonder whether it is a hardware problem (given the variation between Feathers) - maybe a poor connection between the reset pullup resistor and the reset line. I have tied the reset pin high and have had no spurious resets in 3 days. There again given the low frequency of the intermittent fault I could be kidding myself.

@phec

phec commented Feb 4, 2019

Sadly, tying the reset line high didn't prevent a reset. Having had none for 3 days I had two in the last four hours.
Strangely, another Feather M0 LoRa bought at the same time, so presumably one of the same batch, has been running the identical software for 15 days now without a reset.

@daterdots

We also thought it might be a hardware problem, but we have no issues if we replace Watchdog.sleep() with delay(), so I don't think it's a hardware issue.

@killer4king

killer4king commented Mar 3, 2019

I'm also seeing this issue on a Feather M0 RF96 (868 radio),
so I wrote this sketch to see if I could spot a pattern in the number of SleepyDog calls before a reset. At this moment (only a few runs so far)
they seem to be random
(currently 65926, 17458 and 6548 cycles).

my code:
#include <SPI.h>
#include <Wire.h>
#include <Adafruit_SleepyDog.h>
#include <Arduino.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>

int wait;
int sleepMS;
long counter;

// Declaration for an SSD1306 display connected to I2C (SDA, SCL pins)
#define OLED_RESET 4 // Reset pin # (or -1 if sharing Arduino reset pin)
#define SCREEN_WIDTH 128 // OLED display width, in pixels
#define SCREEN_HEIGHT 64 // OLED display height, in pixels (32 or 64)

Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);

void setup() {
  Serial.begin(115200);
  while (!Serial && (millis() < 5000)); // wait up to 5 s for the serial monitor
  if (Serial) Serial.println(F("started the setup"));
  counter = 0;
  pinMode(10, INPUT);
  pinMode(11, INPUT);
  while (digitalRead(10)); // press button to start, so we can see what is still on screen at moment of restart
  if (!display.begin(SSD1306_SWITCHCAPVCC, 0x3C)) { // Address 0x3C for 128x32
    Serial.println(F("SSD1306 allocation failed"));
    for (;;); // Don't proceed, loop forever
  }
  display.clearDisplay();
  display.setTextSize(1); // Normal 1:1 pixel scale
  display.setTextColor(WHITE); // Draw white text
  Serial.println(F("going for it!"));
  wait = 50;
}

void loop() {
  counter += 1;
  display.setCursor(0, 0);
  display.print("counter: ");
  display.println(counter);
  display.print("sleep: ");
  display.println(sleepMS);
  display.print("time passed: ");
  display.println(counter * wait);
  display.display();
  display.clearDisplay();
  sleepMS = Watchdog.sleep(wait);
}
I will update when I have more values...

So I've been running this one a few times. The counter values at reset were:
65926
17458
6548
3150
19923
2445
9952
4907
28978
4161
12978
940285
11634
20551
3408

It seems there is no constant pattern in there.
Could it be that some discharge is triggering an input?! I only have the I2C bus and 2 inputs in use (buttons with pull-up resistors, and an LED, wired like the Blink Plug from JeeLabs).

@killer4king

PS: I also don't have the issue when I replace the watchdog sleep with delay(); then the original code runs for months (indoor air monitor).

@killer4king

Is there anybody who has the SleepyDog lib up and running without any issue on the M0?! Are we missing something in setup?!

@phec

phec commented Mar 21, 2019

Still no joy with SleepyDog, but I'm trying the Arduino Zero RTC library and so far it hasn't caused the Feather to reset. It provides alarm (and implicitly timer) functions but goes nowhere near watchdog timing.
I have modified the RTC library to use the internal 32k clock rather than the external one to avoid a reported interaction with the Dallas library.
ForceTronics provided the library mod:
https://www.youtube.com/watch?v=wmWqkJ97Zsc
In brief, find the
RTCZero-master/src/RTCZero.cpp
file in your library directory and make two changes.
First, comment out the line:
//config32kOSC();
Then find the line:
GCLK->GENCTRL.reg = (GCLK_GENCTRL_GENEN | GCLK_GENCTRL_SRC_XOSC32K | GCLK_GENCTRL_ID(2) | GCLK_GENCTRL_DIVSEL );
and change to:
GCLK->GENCTRL.reg = (GCLK_GENCTRL_GENEN | GCLK_GENCTRL_SRC_OSCULP32K | GCLK_GENCTRL_ID(2) | GCLK_GENCTRL_DIVSEL );
These two changes swap the RTC library over to using the internal, less accurate but lower power timer.
Fingers crossed that this will do the job - so far so good.

@killer4king

Hi Phec, I found that video also, but did not dare to use it yet. Now that you have, I will also start trying it.
Accuracy is not my biggest problem, since I also have a DS3231 module connected and sync once a day.
I will let you know my experience with the RTC Zero lib and sleep mode.

@mattvenn

I'm wondering if this has something to do with serial comms. I've found that I can get the watchdog to trigger with serial writes/reads. Increasing the timeout to something much larger (like 250 ms) helped, but the problem has still not gone away completely.

@S2Doc

S2Doc commented May 23, 2019

@killer4king I have used the SleepyDog library for almost a year now on multiple Feather M0s and it has behaved fine. I am only using it as a watchdog, however. I am not using the sleep functionality.

@ravelab

ravelab commented Aug 20, 2019

This might be caused by this issue:
https://www.avrfreaks.net/forum/samd21-samd21e16b-sporadically-locks-and-does-not-wake-standby-sleep-mode
The SysTick interrupt needs to be disabled in sleep():

// Disable systick interrupt
SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;
SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk;
__DSB();
__WFI();
// Enable systick interrupt
SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;

Disabling it fixed my issue.

@bt20304

bt20304 commented Feb 7, 2020

I found the issue with the M0 random resets: it's in how the library sets up sleep mode.

In /utility/WatchdogSAMD.cpp, lines 119-124:
WDT->INTENSET.bit.EW = 1; // Enable early warning interrupt
WDT->CONFIG.bit.PER = 0xB; // Period = max
WDT->CONFIG.bit.WINDOW = bits; // Set time of interrupt
WDT->CTRL.bit.WEN = 1; // Enable window mode
while(WDT->STATUS.bit.SYNCBUSY); // Sync CTRL write
should have been
WDT->INTENSET.bit.EW = 1; // Enable early warning interrupt
WDT->CONFIG.bit.PER = 0xB; // Period = max
WDT->EWCTRL.bit.EWOFFSET = bits; // Set time of interrupt
WDT->CTRL.bit.WEN = 0; // Disable window mode
while(WDT->STATUS.bit.SYNCBUSY); // Sync CTRL write

The M4 version above it also needs to be checked.

Also, users should note:
you cannot use sleep mode for 16 seconds; the early warning is usable only up to 8 seconds for sleep.
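
The 16 s and 8 s figures follow from how the WDT period fields encode time. As a rough sketch, assuming each field value n selects 8 << n clock cycles and assuming the library clocks the WDT at 1024 Hz (both are assumptions here, not taken from the code quoted above), the conversion looks like this. The function names wdtCycles and wdtMillis are made up for illustration.

```cpp
#include <cstdint>

// Clock cycles selected by a PER / EWOFFSET field value (0x0 to 0xB),
// assuming the encoding 8, 16, 32, ... 16384 cycles.
constexpr uint32_t wdtCycles(uint8_t bits) { return 8u << bits; }

// Convert a field value to milliseconds for a given WDT clock.
// The 1024 Hz default is an assumption about how the library clocks the WDT.
constexpr uint32_t wdtMillis(uint8_t bits, uint32_t clockHz = 1024u) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(wdtCycles(bits)) * 1000u / clockHz);
}
```

Under these assumptions wdtMillis(0xB) is 16000 ms and wdtMillis(0xA) is 8000 ms. If the early-warning offset has to be shorter than the period for the interrupt to fire before the reset, then with PER = 0xB the longest usable early warning is 0xA, which gives the 8 second limit.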

@daterdots

@bt20304 we tried disabling window mode, and it did not help us #9 (comment):

We dug into the library a bit, and we modified the library so that Watchdog.sleep() does not use the window mode. However, with everything stripped down, the problem still persists. Has anyone heard of a general problem with these ATMEL M0s having reset issues? I am starting to think this is an M0 issue more than a library issue.

@kjloope

kjloope commented May 20, 2020

I'm also having this issue, using an ItsyBitsy M0 Express, logging analogRead data to the built-in SPI flash, sleeping for 500 ms in between reads. It typically runs for about an hour, but then resets. Did you guys ever figure out a solution @daterdots? Thanks!

@daterdots

Unfortunately not, @kjloope - sorry for the bad news!

@NAPtime2

NAPtime2 commented Jun 9, 2020

Actually, I think ravelab has the answer. I set up my M0 Adalogger to sleep for 100 ms, then flash the built-in LED on wakeup, so about 10 sleep cycles per second. Using the SleepyDog library as is, the board would reset in about 30 minutes.

To check ravelab's answer, I went into the WatchdogSAMD.cpp file and added
SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;
after this statement on line 188:
int actualPeriodMS = enable(maxPeriodMS, true); // true = for sleep

and
SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;
after the large comment section on line 214.

I've tried the same code after changing this, and it has run fine for the last 20 hours.

@daterdots

interesting @NAPtime2 and @ravelab! maybe someone at Adafruit can check this out?

@daterdots

@NAPtime2 do you want to make the PR, or should I?

@ladyada
Member

ladyada commented Jun 9, 2020

PR's welcome :) thanx!

@NAPtime2

NAPtime2 commented Jun 9, 2020

Go for it! (not even sure what you mean by PR tbh)

@daterdots

@skot and @underpickled maybe you can help me with this PR when we have a sec?

@underpickled

@daterdots sure, I'll take a look tomorrow

@ghost

ghost commented Jun 17, 2020

Disabling the systick timer interrupt during sleep has previously been discussed on the Microchip/Atmel community forum: https://community.atmel.com/comment/2625116#comment-2625116.

After an enquiry on the forum about the WDT issue, Microchip/Atmel responded:

We have identified the issue with WDT reset. It happens due to SysTick timer.

Issue is that WDT initiates the wake up, but then SysTick interrupt starts to get handled first before system is actually ready. Basically, what happens is that SysTick interrupt does not wait for the RAM to properly wake up from sleep.

So if you wake up from WDT, the system will wait for the RAM, but the core clock will actually be running, so a SysTick interrupt may happen too. The SysTick interrupt does not wait on the RAM, so the core attempts to run the SysTick handler and fails, since RAM is not ready. This causes a Hard Fault (in our testing SRAM is so slow to wake up that even the Hard Fault handler fails).

This means that the device wakes up, gets into the Hard Fault, and stays there until the WDT fully expires.

You can reproduce the issue quicker by running the SysTick timer faster and making the WDT wake-ups more frequent.

The solution for the customer is to disable SysTick interrupt before going to sleep and enable it back after the sleep.


// Disable systick interrupt
SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;

// Deep sleep
sleepmgr_sleep(SLEEPMGR_STANDBY);

// Enable systick interrupt
SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;
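
The same workaround can be sketched in a form that is easy to check off-target. This is a minimal sketch, not the code from the library or from Microchip: the helper names (tickIntMasked, tickIntRestored, sleepWithSysTickMasked) are made up for illustration, and unlike the snippet above it puts TICKINT back to whatever it was before sleeping instead of setting it unconditionally.

```cpp
#include <cstdint>

// TICKINT is bit 1 of the Cortex-M SysTick CTRL register
// (the CMSIS name for this mask is SysTick_CTRL_TICKINT_Msk).
constexpr uint32_t kTickInt = 1u << 1;

// CTRL value with the SysTick interrupt masked, other bits untouched.
constexpr uint32_t tickIntMasked(uint32_t ctrl) { return ctrl & ~kTickInt; }

// CTRL value with TICKINT copied back from the value saved before sleep.
constexpr uint32_t tickIntRestored(uint32_t ctrl, uint32_t saved) {
  return (ctrl & ~kTickInt) | (saved & kTickInt);
}

// Run enterSleep() with the SysTick interrupt masked, then restore it.
// ctrl is a reference to the register; on hardware this would be SysTick->CTRL.
template <typename SleepFn>
void sleepWithSysTickMasked(volatile uint32_t &ctrl, SleepFn enterSleep) {
  const uint32_t saved = ctrl;
  ctrl = tickIntMasked(saved);
  enterSleep();  // e.g. set SLEEPDEEP, then __DSB() and __WFI()
  ctrl = tickIntRestored(ctrl, saved);
}
```

On a SAMD21 a call might look like sleepWithSysTickMasked(SysTick->CTRL, [] { SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk; __DSB(); __WFI(); });. Taking the register by reference keeps the masking logic testable on a PC with a plain variable standing in for SysTick->CTRL.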

@ghost

ghost commented Jun 18, 2020

I've issued a pull request here: #24.

It addresses the issue by disabling the SysTick timer interrupts before putting the SAMD21 to sleep, then enabling them again after waking up.

A second issue is that the SAMD20 and SAMD21 definitions don't exist, meaning that the line of code that prevents the flash from powering down never gets compiled in:

NVMCTRL->CTRLB.bit.SLEEPPRM = NVMCTRL_CTRLB_SLEEPPRM_DISABLED_Val;

I've replaced the SAMD20 and SAMD21 definitions with SAMD20_SERIES and SAMD21_SERIES; these cover all chip variants.
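
Why a missing definition fails silently can be shown off-target. In a preprocessor #if, an identifier that is not defined evaluates to 0, so the guarded code is skipped without any error. The sketch below assumes the guard has the form #if (SAMD20 || SAMD21); the #define at the top is only a stand-in for what the SAMD core headers would provide, and the constant names are made up for illustration.

```cpp
// Stand-in for the definition the SAMD core headers would provide for a
// SAMD21 part. Defined here only so the example compiles off-target;
// do not add this to a real sketch.
#define SAMD21_SERIES 1

// Guard in its assumed old form: SAMD20 and SAMD21 are not defined,
// so the preprocessor treats them as 0 and skips the branch.
#if (SAMD20 || SAMD21)
constexpr bool kOldGuardTaken = true;
#else
constexpr bool kOldGuardTaken = false;
#endif

// Guard using the *_SERIES names: taken when either series is defined.
#if (SAMD20_SERIES || SAMD21_SERIES)
constexpr bool kNewGuardTaken = true;
#else
constexpr bool kNewGuardTaken = false;
#endif
```

Here kOldGuardTaken ends up false and kNewGuardTaken true, which is the difference between the SLEEPPRM line being left out and being compiled in.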

@Jrwise

Jrwise commented May 18, 2022

Martin

This fixed it for me. I don't understand why this hasn't been merged yet.

THANKS!

@PaintYourDragon
Contributor

Thanks for the discussion and PR! This has been merged, and version # bumped to 1.6.1; should be present in Arduino Library manager after that percolates for an hour or two.
