Crash when using interrupt on pins + webserver #3337

Closed
tuxedo0801 opened this issue Jun 9, 2017 · 16 comments

@tuxedo0801
Contributor

Basic Infos

Hardware

Hardware: Sonoff Pow - https://www.itead.cc/sonoff-pow.html (looks ESP12 compatible)
Core Version: via Platform.IO - Espressif 8266 1.3.0

Description

I'm using the HLW8012 library from https://bitbucket.org/xoseperez/hlw8012.
The sensor takes measurements (voltage, current, power, ...) and provides the results to the ESP as square-wave signals on two pins.
Both pins are read via interrupts. The frequency ranges from just below 1 Hz up to a few hundred Hz.

The measurement is presented to the user via the webserver and a simple webpage.
If I load or refresh the page repeatedly (no matter whether every 5 s or every 30 s), the ESP sooner or later crashes (see decoded stack).

The issue happens with almost 0Hz as well as with >1Hz on the pins.

Issue #1020 looks like the same problem. I tried the suggested solution (disabling interrupts before handling the interrupt and re-enabling them after the handler is done), but this did not solve it.

Settings in IDE

Module: ESP12
Flash Size: 4MB
CPU Frequency: 80MHz
Flash Mode: qio
Flash Frequency: 40MHz
Upload Using: SERIAL
Reset Method: ck

Sketch

The sketch is shortened to show just what is done. It is not compilable or copy-and-paste usable.

#include <ESP8266WiFi.h>
#include "HLW8012.h"
#include <DNSServer.h>
#include <ESP8266WebServer.h>

// HLW lib
HLW8012 _hlw8012;

ESP8266WebServer webserver(80);

void setup() {
  // Initialize HLW8012
  _hlw8012.begin(GPIO_HLW8012_CF, GPIO_HLW8012_CF1, GPIO_HLW8012_SEL, CURRENT_MODE, true);
  _hlw8012.setResistors(CURRENT_RESISTOR, VOLTAGE_RESISTOR_UPSTREAM, VOLTAGE_RESISTOR_DOWNSTREAM);

  // attach interrupts for HLW library
  attachInterrupt(GPIO_HLW8012_CF1, hlw8012_cf1_interrupt, CHANGE);
  attachInterrupt(GPIO_HLW8012_CF, hlw8012_cf_interrupt, CHANGE);


  webserver.on("/", handleRoot);
  webserver.onNotFound(handleNotFound);

  webserver.begin();
}

int _activePower = 0;
int _voltage = 0;
double _current = 0;
int _apparentPower = 0;
int _reactivePower = 0;
int _powerFactor = 0;

void loop() {
  static unsigned long last = millis();

  if ((millis() - last) > UPDATE_TIME) {

    last = millis();

    // get measured values, store them in variables
    _activePower = _hlw8012.getActivePower();
    _voltage = _hlw8012.getVoltage();
    _current = _hlw8012.getCurrent();
    _apparentPower = _hlw8012.getApparentPower();
    _reactivePower = _hlw8012.getReactivePower();
    _powerFactor = (int) (100 * _hlw8012.getPowerFactor());

    Serial.print("[HLW] Active Power (W)    : "); Serial.println(_activePower);
    Serial.print("[HLW] Voltage (V)         : "); Serial.println(_voltage);
    Serial.print("[HLW] Current (A)         : "); Serial.println(_current);
    Serial.print("[HLW] Apparent Power (VA) : "); Serial.println(_apparentPower);
    Serial.print("[HLW] Reactive Power (var): "); Serial.println(_reactivePower);
    Serial.print("[HLW] Power Factor (%)    : "); Serial.println(_powerFactor);
    Serial.println();
  }

  // provide variables in webserver to user
  webserver.handleClient();
}

// redirect interrupt to HLW library
void hlw8012_cf1_interrupt() {
  _hlw8012.cf1_interrupt();
}
void hlw8012_cf_interrupt() {
  _hlw8012.cf_interrupt();
}

For reference: the methods in the HLW lib handling the interrupt (nothing spectacular ...):

void HLW8012::cf_interrupt() {
    unsigned long now = micros();
    _power_pulse_width = now - _last_cf_interrupt;
    _last_cf_interrupt = now;
}

void HLW8012::cf1_interrupt() {

    unsigned long now = micros();
    unsigned long pulse_width;

    if ((now - _first_cf1_interrupt) > _pulse_timeout) {

        if (_last_cf1_interrupt == _first_cf1_interrupt) {
            pulse_width = 0;
        } else {
            pulse_width = now - _last_cf1_interrupt;
        }

        if (_mode == _current_mode) {
            _current_pulse_width = pulse_width;
        } else {
            _voltage_pulse_width = pulse_width;
        }

        _mode = 1 - _mode;
        digitalWrite(_sel_pin, _mode);
        _first_cf1_interrupt = now;

    }

    _last_cf1_interrupt = now;

}
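The handlers above only record the time between two edges. As a rough illustration of how such a pulse width could be turned into a frequency, here is a minimal host-side sketch. The function names `pulse_width_us` and `frequency_hz` are hypothetical and not part of the HLW8012 library. It assumes a 32-bit microsecond counter (as `micros()` provides on the ESP8266), interrupts on both edges (`CHANGE`), and a roughly symmetric square wave, so that one interval is half a period.

```cpp
#include <cassert>
#include <cstdint>

// Time between two edges in microseconds. Unsigned subtraction stays correct
// even if the 32-bit counter wrapped around between the two readings.
uint32_t pulse_width_us(uint32_t now, uint32_t last) {
    return now - last;
}

// Frequency in Hz for a square wave sampled on both edges: one interval is
// assumed to be half a period. A width of 0 is treated as "no pulse seen".
double frequency_hz(uint32_t half_period_us) {
    if (half_period_us == 0) {
        return 0.0;
    }
    return 1e6 / (2.0 * half_period_us);
}
```

For example, two edges 500 µs apart would correspond to a 1 kHz signal under these assumptions.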

Decoded Stack Messages

0x40107314: interrupt_handler at ?? line ?
0x4010078f: ppProcessTxQ at ?? line ?
0x4020521e: HLW8012::cf1_interrupt() at ?? line ?
0x4010078f: ppProcessTxQ at ?? line ?
0x401072dc: interrupt_handler at ?? line ?
0x4020856c: hlw8012_cf1_interrupt() at ?? line ?
0x40107378: interrupt_handler at ?? line ?
0x40107362: interrupt_handler at ?? line ?
0x401048f9: ets_timer_disarm at ?? line ?
0x401072dc: interrupt_handler at ?? line ?
0x402101f0: pm_get_sleep_type at ?? line ?
0x40104e5e: spi_flash_read at ?? line ?
0x401077b4: pvPortZalloc at ?? line ?
0x4020fcb5: pm_set_sleep_time at ?? line ?
0x40210156: pm_get_sleep_type at ?? line ?
0x4021c7c8: tcpip_tcp_timer at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/core/timers.c line 81
0x40210203: pm_get_sleep_type at ?? line ?
0x40212edd: ets_timer_handler_isr at ?? line ?
0x40212f22: ets_timer_handler_isr at ?? line ?
@Quiqui64

Quiqui64 commented Jun 9, 2017

I solved the issues I had by placing my ISR in IRAM.

#1388

@tuxedo0801
Contributor Author

Does not solve my issue.

I added the IRAM attribute to the interrupt handler methods within the power-measurement library I use:

void ICACHE_RAM_ATTR HLW8012::cf_interrupt() {
    unsigned long now = micros();
    _power_pulse_width = now - _last_cf_interrupt;
    _last_cf_interrupt = now;
}

void ICACHE_RAM_ATTR HLW8012::cf1_interrupt() {

    unsigned long now = micros();
    unsigned long pulse_width;

    if ((now - _first_cf1_interrupt) > _pulse_timeout) {

        if (_last_cf1_interrupt == _first_cf1_interrupt) {
            pulse_width = 0;
        } else {
            pulse_width = now - _last_cf1_interrupt;
        }

        if (_mode == _current_mode) {
            _current_pulse_width = pulse_width;
        } else {
            _voltage_pulse_width = pulse_width;
        }

        _mode = 1 - _mode;
        digitalWrite(_sel_pin, _mode);
        _first_cf1_interrupt = now;

    }

    _last_cf1_interrupt = now;

}

and called those methods as before from my sketch:

void setup() {
...
  attachInterrupt(GPIO_HLW8012_CF1, hlw8012_cf1_interrupt, CHANGE);
  attachInterrupt(GPIO_HLW8012_CF, hlw8012_cf_interrupt, CHANGE);
...
}

// redirect interrupt to HLW library
void hlw8012_cf1_interrupt() {
  _hlw8012.cf1_interrupt();
}
void hlw8012_cf_interrupt() {
  _hlw8012.cf_interrupt();
}

It still crashes after reloading the webpage a few times.
I also tried moving the attribute to the handler functions in my sketch. That does not make any difference; it still crashes.

What I found out so far:
The measurement library requires two interrupts. But only one of them (cf1_interrupt()) is causing the crash. I can run "forever" with just the cf_interrupt() handler. But as soon as the other one is active (no matter if it's the only handler or if there are two), it crashes...

:-( Any further ideas?

@igrr
Member

igrr commented Jun 12, 2017

The first few lines of the crash dump should contain the PC (program counter) at the location of the exception. Using the PC value, you can identify the exact line of code where the exception happens:

xtensa-lx106-elf-addr2line -pfia <your_sketch_file_name.elf> <PC value>

@tuxedo0801
Contributor Author

tuxedo0801 commented Jun 12, 2017

I tried this:

C:\Users\d463\AppData\Local\Arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\1.20.0-26-gb404fb9-2\bin\xtensa-lx106-elf-addr2line.exe -pfia c:\dev\SmartPlugFW\.pioenvs\esp12e\firmware.elf 0x40107314

but got:

C:\Users\d463\AppData\Local\Arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\1.20.0-26-gb404fb9-2\bin\xtensa-lx106-elf-addr2line.exe: 'a.out': No such file

No idea which a.out file is missing. I get the same output even if I remove all the arguments and just run the .exe ...

@igrr
Member

igrr commented Jun 12, 2017

Sorry, there should also be a -e flag before the filename. a.out is the default elf file name, if -e flag is not provided.

@igrr
Member

igrr commented Jun 12, 2017

Can you also post the actual line with the exception you get? Based on PC value, it is in IRAM, so it's likely that the exception is not related to interrupt code being in Flash.
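To illustrate how a PC value can be judged "in IRAM" or "in flash" at a glance, here is a minimal host-side sketch. The address ranges are assumptions taken from the commonly documented ESP8266 memory map (mask ROM at 0x40000000, IRAM at 0x40100000, memory-mapped flash at 0x40200000); the `Region` enum and `classify` function are hypothetical helpers, not part of any toolchain.

```cpp
#include <cassert>
#include <cstdint>

enum class Region { Rom, Iram, Flash, Other };

// Classify a code address. The boundaries below are assumptions based on the
// commonly documented ESP8266 memory map and may differ for other setups.
Region classify(uint32_t addr) {
    if (addr >= 0x40000000u && addr < 0x40010000u) return Region::Rom;    // mask ROM
    if (addr >= 0x40100000u && addr < 0x40108000u) return Region::Iram;   // instruction RAM
    if (addr >= 0x40200000u && addr < 0x40300000u) return Region::Flash;  // cached flash
    return Region::Other;
}
```

Under these assumptions, 0x40107314 from the decoded stack falls into IRAM, while addresses of the form 0x402xxxxx are served from flash.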

@tuxedo0801
Contributor Author

tuxedo0801 commented Jun 12, 2017

C:\Users\d463\AppData\Local\Arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\1.20.0-26-gb404fb9-2\bin\xtensa-lx106-elf-addr2line.exe -pfia -e c:\dev\SmartPlugFW\.pioenvs\esp12e\firmware.elf 0x40107314
0x40107314: interrupt_handler at ??:?

So it's giving me exactly the same data as the decoded stack message I posted in my first post... No line information :-(

@tuxedo0801
Contributor Author

Complete crash information:

Exception (0):
epc1=0x40208550 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys
sp: 3ffffc30 end: 3fffffb0 offset: 01a0


>>>stack>>>
3ffffdd0:  40107314 80be5999 00000000 00000002
3ffffde0:  ffffffff 00000020 00000001 00000000
3ffffdf0:  00000000 4010078f 00000000 00000022
3ffffe00:  3fffc200 401072dc 3fffc258 4000050c
3ffffe10:  40004376 00000030 00000016 ffffffff
3ffffe20:  60000200 00000008 70017000 80000000
3ffffe30:  20000000 3fff5478 80000000 203fc160
3ffffe40:  00000000 3fffc6fc 3fff011c 3fff547c
3ffffe50:  00000194 003fc160 60000600 00000030
3ffffe60:  00000135 4010307d 00000000 40208544
3ffffe70:  40107378 00080000 00000000 40107362
3ffffe80:  ffffffff 00000020 00000000 4000050c
3ffffe90:  00000000 00000000 0000001f 401048f9
3ffffea0:  4000050c 401072dc 3fffc258 4000050c
3ffffeb0:  40000f68 00000030 0000001c ffffffff
3ffffec0:  40000f58 00000000 00000020 00000000
3ffffed0:  00000013 402101c8 3fff011c 00000001
3ffffee0:  ffffffff 3ffebb34 3fff011c 3fffdab0
3ffffef0:  00000000 3fffdcb0 3fff0150 00000030
3fffff00:  00000000 400042db 0000007d 60000600
3fffff10:  40004b31 3fff531c 000002f4 003fc000
3fffff20:  40104e5e 3fff0140 3ffeffa0 401077b4
3fffff30:  4020fc8d 3ffeffa0 3fff0140 0b34e0fa
3fffff40:  3fff531c 00001000 4021012e 00000008
3fffff50:  4021c7a0 00000000 402101db 3fff0054
3fffff60:  3fff0140 02e24023 3fff0140 60000600
3fffff70:  40212eb5 3fff0054 3fff0140 0b34da23
3fffff80:  40212efa 3fffdab0 00000000 3fffdcb0
3fffff90:  3fff0158 00000000 40000f65 3fffdab0
3fffffa0:  40000f49 00016990 3fffdab0 40000f49
<<<stack<<<

 ets Jan  8 2013,rst cause:1, boot mode:(1,7)

@igrr
Member

igrr commented Jun 12, 2017

Could you please put 0x40208550 into xtensa-lx106-elf-addr2line? That's the address where exception happens.
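The address in question is the `epc1` field of the "Exception (0):" header in the crash output. As a small illustration, the following host-side sketch extracts that value from a header line so it can be passed to addr2line; `parse_epc1` is a hypothetical helper written for this example and assumes the line format shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Returns true and stores the value of epc1 if the line contains "epc1=0x...".
// The expected line format is an assumption based on the dumps in this thread.
bool parse_epc1(const char* line, uint32_t* epc1) {
    const char* field = std::strstr(line, "epc1=0x");
    if (field == nullptr) {
        return false;
    }
    unsigned int value = 0;
    if (std::sscanf(field, "epc1=0x%x", &value) != 1) {
        return false;
    }
    *epc1 = value;
    return true;
}
```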

@tuxedo0801
Contributor Author

Exception (0):
epc1=0x40208560 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000
 
ctx: sys
sp: 3ffffc30 end: 3fffffb0 offset: 01a0
 
>>>stack>>>
3ffffdd0:  40107314 40100ad9 00000002 4021d5a0
3ffffde0:  ffffffff 00000020 7fffffff 00000000
3ffffdf0:  0000011e 4010078f 00000002 00000022
3ffffe00:  3fffc200 401072dc 3fffc258 4000050c
3ffffe10:  40004378 00000030 00000016 ffffffff

3ffffe20:  60000200 00000008 00021100 80000000
3ffffe30:  20000000 3fff5060 80000000 203fc220
3ffffe40:  80000000 3fffc6fc 3fff011c 3fff5064
3ffffe50:  000000d4 003fc220 60000600 00000030
3ffffe60:  3fffc200 0000000d 00000000 4020856c
3ffffe70:  40107378 00000030 0000001b 40107362
3ffffe80:  ffffffff 00000020 00000000 3feffe00
3ffffe90:  00000000 00000000 0000001f 401048f9
3ffffea0:  4000050c 401072dc 3fffc258 4000050c
3ffffeb0:  40000f68 00000030 0000001c ffffffff
3ffffec0:  40000f58 00000000 00000020 00000000
3ffffed0:  00000013 402101f0 3fff011c 00000000
3ffffee0:  ffffffff 3fff3000 3fff011c 3fffdab0
3ffffef0:  00000000 3fffdcb0 3fff0168 00000030
3fffff00:  00000000 400042db 00000064 60000600
3fffff10:  40004b31 3fff4e44 000002f4 003fc000
3fffff20:  40104e5e 3fff0140 3ffeffa0 401077b4
3fffff30:  4020fcb5 3ffeffa0 3fff0140 02096203
3fffff40:  3fff4e44 00001000 40210156 00000008
3fffff50:  4010561c 00000000 40210203 3fff0054
3fffff60:  3fff0140 00a2c7f0 3fff0140 60000600
3fffff70:  40212edd 3fff0054 3fff0140 02093fcd
3fffff80:  40212f22 3fffdab0 00000000 3fffdcb0
3fffff90:  3fff0150 00000000 40000f65 3fffdab0
3fffffa0:  40000f49 000315e0 3fffdab0 40000f49
<<<stack<<<

 ets Jan  8 2013,rst cause:1, boot mode:(3,4)
 
load 0x4010f000, len 1384, room 16
tail 8

chksum 0x2d
csum 0x2d
v09f0c112
~ld

Result with exception decoder:

0x40107314: interrupt_handler at ?? line ?
0x40100ad9: ppEnqueueRxq at ?? line ?
0x4021d5a0: etharp_output at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/netif/etharp.c line 995
0x4010078f: ppProcessTxQ at ?? line ?
0x401072dc: interrupt_handler at ?? line ?
0x4020856c: hlw8012_cf1_interrupt() at ?? line ?
0x40107378: interrupt_handler at ?? line ?
0x40107362: interrupt_handler at ?? line ?
0x401048f9: ets_timer_disarm at ?? line ?
0x401072dc: interrupt_handler at ?? line ?
0x402101f0: pm_get_sleep_type at ?? line ?
0x40104e5e: spi_flash_read at ?? line ?
0x401077b4: pvPortZalloc at ?? line ?
0x4020fcb5: pm_set_sleep_time at ?? line ?
0x40210156: pm_get_sleep_type at ?? line ?
0x4010561c: igmp_timer at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/core/timers.c line 217
0x40210203: pm_get_sleep_type at ?? line ?
0x40212edd: ets_timer_handler_isr at ?? line ?
0x40212f22: ets_timer_handler_isr at ?? line ?

xtensa-lx106-elf-addr2line on the exception address:

xtensa-lx106-elf-addr2line.exe -pfia -e c:\dev\SmartPlugFW\.pioenvs\esp12e\firmware.elf 0x40208560
0x40208560: _Z21hlw8012_cf1_interruptv at ??:?

@igrr
Member

igrr commented Jun 12, 2017

It means that hlw8012_cf1_interrupt is not placed into IRAM. You have placed HLW8012::cf1_interrupt into IRAM, but not its wrapper, hlw8012_cf1_interrupt.

@tuxedo0801
Contributor Author

With the latest stack/exception, nothing was placed into IRAM. I had removed the IRAM attribute because it did not solve my issue.

So I'm a bit confused about your finding?!

Should I place both in IRAM to make it work?

@igrr
Member

igrr commented Jun 12, 2017

I wrote "You have placed HLW8012::cf1_interrupt into IRAM" based on the source code you had posted above. I didn't know you had later removed the IRAM attribute from that function.

In any case, yes, you should place both into IRAM to make it work. If it still doesn't work, please have a look at the "Exception (?): epc1=..." line again. If the exception number is 0, put the value of epc1 into addr2line; this will give you the location of the function that is not yet placed into IRAM.
If the exception is something different (not 0), post the exception info along with the decoded stack dump here please.
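A minimal sketch of what "both in IRAM" could look like is shown below. `PulseSensor` and `sensor_cf1_isr` are hypothetical stand-ins for the HLW8012 class and the sketch's wrapper, and the empty fallback definition of `ICACHE_RAM_ATTR` is only there so the snippet also compiles off-target; on the ESP8266 Arduino core the macro is already provided.

```cpp
#include <cassert>
#include <cstdint>

// Provided by the ESP8266 Arduino core; the empty fallback is an assumption
// made here so the example also builds on a host machine.
#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR
#endif

// Hypothetical stand-in for the library class that owns the real handler.
class PulseSensor {
public:
    void cf1_interrupt();
    uint32_t edges() const { return _edges; }
private:
    volatile uint32_t _edges = 0;  // written in the ISR, read elsewhere
};

// The library-side handler carries the attribute...
void ICACHE_RAM_ATTR PulseSensor::cf1_interrupt() {
    _edges = _edges + 1;
}

PulseSensor sensor;

// ...and so does the wrapper that is actually registered as the ISR.
void ICACHE_RAM_ATTR sensor_cf1_isr() {
    sensor.cf1_interrupt();
}

// On the device, setup() would then register the wrapper, for example:
//   attachInterrupt(GPIO_HLW8012_CF1, sensor_cf1_isr, CHANGE);
```

The point of the sketch is that every function on the call path from the registered ISR needs the attribute, not only the innermost one.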

@tuxedo0801
Contributor Author

Now both functions, the wrapper and the final handler, have the IRAM attribute. It looks stable so far. I'll observe it for the next hour and report back ...

Is there any documentation about that mysterious IRAM, so that I can read up on what I have now set up?

@igrr
Member

igrr commented Jun 12, 2017

There isn't any for Arduino, however the requirement for the ISR handlers to be placed into IRAM is mentioned in the ESP8266 non-OS SDK programming guide.
The issue about missing documentation in Arduino is here: #1388.

@devyte
Collaborator

devyte commented Sep 6, 2017

Considering that this specific issue is fixed, and that the underlying cause is user error due to lack of documentation, and that the request for documentation is covered elsewhere, I'm closing.
