M Hightower edited this page May 18, 2023 · 9 revisions

AbendInfo Wiki

AbendInfo - Abnormal End Information

Objective/Goal: To get a better understanding of ESP8266 crashes, in particular the frustrating, seemingly inexplicable Hardware WDT Reset and Software WDT Reset.

  1. Collect information about the crash not present in the stack trace.
  2. Convert some crash conditions that would result in a Hardware WDT Reset into an ill (exccause 0) exception event.
  3. Recover the source of some Software WDT Resets.
  4. Buffer/capture the last ets_printf output just before items 2 and 3.
  5. Convert unhandled breakpoints into an ill (exccause 0) exception event.
  6. Replace unattended Boot ROM exceptions with the SDK's general exception handler.
  7. Without items 5 and 6, the default handling of these events would lead to HWDT Resets.
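Item 4 is easier to picture with a small sketch. The C code below is only an illustration of the idea of keeping the last printed line around, and is not taken from AbendInfo. The names last_gasp_putc, last_gasp_get, and LAST_GASP_SIZE are hypothetical. It assumes a putc-style hook that receives every character headed for the console, and it ignores the question of where the buffer must live to survive a reset.

```c
#include <stddef.h>

#define LAST_GASP_SIZE 64  /* hypothetical capacity, chosen for illustration */

static char last_gasp[LAST_GASP_SIZE];  /* most recent line, NUL-terminated */
static size_t last_gasp_len = 0;
static int line_done = 0;               /* set once a line ending was seen */

/* Hypothetical putc-style hook: remembers only the most recent line. */
static void last_gasp_putc(char c)
{
    if (c == '\n' || c == '\r') {
        line_done = 1;          /* keep the finished line until new text arrives */
        return;
    }
    if (line_done) {            /* first character of a new line: start over */
        last_gasp_len = 0;
        line_done = 0;
    }
    if (last_gasp_len < LAST_GASP_SIZE - 1) {   /* silently truncate long lines */
        last_gasp[last_gasp_len++] = c;
        last_gasp[last_gasp_len] = '\0';
    }
}

/* Returns the last captured line (empty string if nothing was printed). */
static const char *last_gasp_get(void)
{
    return last_gasp;
}
```

Feeding the hook "first message\nsecond message\n" one character at a time leaves "second message" in the buffer, which is the text you would want attached to the crash report.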

Generally speaking, Hardware WDT Reset and Software WDT Reset crashes are often the result of an Infinite Loop. While a sketch that loops and fails to let the system run is often the assumed culprit, that is not always the case. Once you know you don't have any Infinite Loops in your code, what is the next move? This library may help there.

Other sources of HWDT Resets

  1. Some are the result of unhandled breakpoints in the SDK or Boot ROM, and others result from unhandled exceptions that fall into the unhandled breakpoint path. These breakpoints can usually be caught with gdb. However, nobody runs their sketch with gdb all the time, and some failures are so infrequent that gdb is not a viable monitoring tool for the crash.
  2. Another group of causes for WDT Reset crashes is Deliberate Infinite Loops. In C code, these are often written as while(true){/* empty */}. In asm, this compiles down to loop: j loop, which has the byte pattern 0x06, 0xff, 0xff. If you search through SDK v3.0.5, you can find about 95 of these. 94 of them are preceded by an ets_printf call, which prints nothing unless you have a console connected and debug printing enabled. When you think about it, a Deliberate Infinite Loop is not an unreasonable way to handle an unrecoverable event; however, not providing a clue for the WDT reset is intolerable. And even if you do see the "last gasp" error message, these messages are not documented. When interrupts are enabled, these Infinite Loops present as Software WDT Resets, which can give you some clue about the location of the crash; however, when interrupts are disabled, you get zero details, only a Hardware WDT Reset.
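To show where the 0x06, 0xff, 0xff pattern comes from and how a search for it might look, here is a minimal C sketch. The helper names encode_j and count_self_jumps are hypothetical and are not part of AbendInfo or the SDK. It assumes the Xtensa J encoding (op0 = 6 in the low four bits, a signed 18-bit offset in bits 6..23, target = pc + 4 + offset) and a raw image already loaded into memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode an Xtensa J instruction located at `pc` that jumps to `target`.
 * 24-bit little-endian layout: op0 = 0x6 in bits 0..3, n = 0 in bits 4..5,
 * signed 18-bit offset in bits 6..23, with target = pc + 4 + offset. */
static void encode_j(uint32_t pc, uint32_t target, uint8_t out[3])
{
    uint32_t offset = (target - pc - 4u) & 0x3FFFFu;  /* 18-bit two's complement */
    uint32_t word = 0x6u | (offset << 6);
    out[0] = (uint8_t)(word & 0xFFu);
    out[1] = (uint8_t)((word >> 8) & 0xFFu);
    out[2] = (uint8_t)((word >> 16) & 0xFFu);
}

/* Count occurrences of the jump-to-self byte pattern in a raw image.
 * This is a plain byte match, so data bytes or the tail of another
 * instruction can produce false positives. */
static size_t count_self_jumps(const uint8_t *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i + 3 <= len; i++) {
        if (buf[i] == 0x06 && buf[i + 1] == 0xFF && buf[i + 2] == 0xFF) {
            count++;
        }
    }
    return count;
}
```

A jump to itself needs an offset of -4, which fills all 18 offset bits except the lowest two, so encode_j with pc equal to target yields 0x06, 0xff, 0xff regardless of the address. Because the match is purely on bytes, any hit from count_self_jumps is a candidate to confirm in a disassembly, not proof of a loop.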

I suspect the root cause or stimulus for these is low heap space with OOM events.

To be continued ...