PubSubClient on ESP32 overflows the stack #9

Closed
prototypicalpro opened this issue Apr 28, 2020 · 28 comments
Labels
bug Something isn't working

Comments

@prototypicalpro
Member

Working with @jhnwmr on #8, I discovered that using PubSubClient with SSLClient causes a stack overflow on the ESP32:

Error Log
t.cpX:337] _eventCallback(): Event: 2 - STA_START
..[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:353] _eventCallback(): Reason: 2 - AUTH_EXPIRE
....[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:353] _eventCallback(): Reason: 201 - NO_AP_FOUND
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 7 - STA_GOT_IP
[D][WiFiGeneric.cpp:381] _eventCallback(): STA IP: 192.168.137.94, MASK: 255.255.255.0, GW: 192.168.137.1
.Attempting MQTT connection...(SSLClient)(SSL_WARN)(connect): Using a raw IP Address for an SSL connection bypasses some important verification steps. You should use a domain name (www.google.com) whenever possible.
connected

Backtrace: 0x4c103f95:0x3ffbe160 0x229cfdfd:0x3ffbe180 0x40089037:0x3ffbe1a0 0x4008b74d:0x3ffbe1c0 0x40084b46:0x3ffbe1d0 0x4014a2df:0x3ffbc200 0x400e321b:0x3ffbc220 0x4008a72d:0x3ffbc240 0x40088f49:0x3ffbc260


Backtrace: 0x7dc000e6:0x7dc000e6

Rebooting...
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 0 - WIFI_READY
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 2 - STA_START
.[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 7 - STA_GOT_IP
[D][WiFiGeneric.cpp:381] _eventCallback(): STA IP: 192.168.137.94, MASK: 255.255.255.0, GW: 192.168.137.1
.Attempting MQTT connection...(SSLClient)(SSL_WARN)(connect): Using a raw IP Address for an SSL connection bypasses some important verification steps. You should use a domain name (www.google.com) whenever possible.
connected

Backtrace: 0x78f076a5:0x3ffbe160 0x08a73f7d:0x3ffbe180 0x40089037:0x3ffbe1a0 0x4008b74d:0x3ffbe1c0 0x40084b46:0x3ffbe1d0 0x4014a2df:0x3ffbc200 0x400e321b:0x3ffbc220 0x4008a72d:0x3ffbc240 0x40088f49:0x3ffbc260


Backtrace: 0x4008c777:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0 0x4008c8f5:0x3ffbdfe0 0x4008c982:0x3ffbe060 0x4008cc75:0x3ffbe080 0x400848be:0x3ffbe0a0 0x78f076a2:0x3ffbe160


Backtrace: 0x4008c777:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20 0x4008c8f5:0x3ffbde40 0x4008c982:0x3ffbdec0 0x4008cc75:0x3ffbdee0 0x400848be:0x3ffbdf00 0x4008c774:0x3ffbdfc0


Backtrace: 0x4008c777:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80 0x4008c8f5:0x3ffbdca0 0x4008c982:0x3ffbdd20 0x4008cc75:0x3ffbdd40 0x400848be:0x3ffbdd60 0x4008c774:0x3ffbde20


Backtrace: 0x4008c777:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0 0x4008c8f5:0x3ffbdb00 0x4008c982:0x3ffbdb80 0x4008cc75:0x3ffbdba0 0x400848be:0x3ffbdbc0 0x4008c774:0x3ffbdc80


Backtrace: 0x4008c777:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940 0x4008c8f5:0x3ffbd960 0x4008c982:0x3ffbd9e0 0x4008cc75:0x3ffbda00 0x400848be:0x3ffbda20 0x4008c774:0x3ffbdae0


Backtrace: 0x4008c777:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0 0x4008c8f5:0x3ffbd7c0 0x4008c982:0x3ffbd840 0x4008cc75:0x3ffbd860 0x400848be:0x3ffbd880 0x4008c774:0x3ffbd940


Backtrace: 0x4008c777:0x3ffbc760 0x4008c8f5:0x3ffbc780 0x4008c982:0x3ffbc800 0x4008cc75:0x3ffbc820 0x400848be:0x3ffbc840 0x4008c774:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600 0x4008c8f5:0x3ffbd620 0x4008c982:0x3ffbd6a0 0x4008cc75:0x3ffbd6c0 0x400848be:0x3ffbd6e0 0x4008c774:0x3ffbd7a0


Backtrace: 0x4008c777:0x3ffbc5c0 0x4008c8f5:0x3ffbc5e0 0x4008c982:0x3ffbc660 0x4008cc75:0x3ffbc680 0x400848be:0x3ffbc6a0 0x4008c774:0x3ffbc760 0x4008c8f5:0x3ffbc780 0x4008c982:0x3ffbc800 0x4008cc75:0x3ffbc820 0x400848be:0x3ffbc840 0x4008c774:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460 0x4008c8f5:0x3ffbd480 0x4008c982:0x3ffbd500 0x4008cc75:0x3ffbd520 0x400848be:0x3ffbd540 0x4008c774:0x3ffbd600


Backtrace: 0x4008c777:0x3ffbc420 0x4008c8f5:0x3ffbc440 0x4008c982:0x3ffbc4c0 0x4008cc75:0x3ffbc4e0 0x400848be:0x3ffbc500 0x4008c774:0x3ffbc5c0 0x4008c8f5:0x3ffbc5e0 0x4008c982:0x3ffbc660 0x4008cc75:0x3ffbc680 0x400848be:0x3ffbc6a0 0x4008c774:0x3ffbc760 0x4008c8f5:0x3ffbc780 0x4008c982:0x3ffbc800 0x4008cc75:0x3ffbc820 0x400848be:0x3ffbc840 0x4008c774:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0 0x4008c8f5:0x3ffbd2e0 0x4008c982:0x3ffbd360 0x4008cc75:0x3ffbd380 0x400848be:0x3ffbd3a0 0x4008c774:0x3ffbd460


Backtrace: 0x4008c777:0x3ffbc280 0x4008c8f5:0x3ffbc2a0 0x4008c982:0x3ffbc320 0x4008cc75:0x3ffbc340 0x400848be:0x3ffbc360 0x4008c774:0x3ffbc420 0x4008c8f5:0x3ffbc440 0x4008c982:0x3ffbc4c0 0x4008cc75:0x3ffbc4e0 0x400848be:0x3ffbc500 0x4008c774:0x3ffbc5c0 0x4008c8f5:0x3ffbc5e0 0x4008c982:0x3ffbc660 0x4008cc75:0x3ffbc680 0x400848be:0x3ffbc6a0 0x4008c774:0x3ffbc760 0x4008c8f5:0x3ffbc780 0x4008c982:0x3ffbc800 0x4008cc75:0x3ffbc820 0x400848be:0x3ffbc840 0x4008c774:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120 0x4008c8f5:0x3ffbd140 0x4008c982:0x3ffbd1c0 0x4008cc75:0x3ffbd1e0 0x400848be:0x3ffbd200 0x4008c774:0x3ffbd2c0


Backtrace: 0x4008c777:0x3ffbc0e0 0x4008c8f5:0x3ffbc100 0x4008c982:0x3ffbc180 0x4008cc75:0x3ffbc1a0 0x400848be:0x3ffbc1c0 0x4008c774:0x3ffbc280 0x4008c8f5:0x3ffbc2a0 0x4008c982:0x3ffbc320 0x4008cc75:0x3ffbc340 0x400848be:0x3ffbc360 0x4008c774:0x3ffbc420 0x4008c8f5:0x3ffbc440 0x4008c982:0x3ffbc4c0 0x4008cc75:0x3ffbc4e0 0x400848be:0x3ffbc500 0x4008c774:0x3ffbc5c0 0x4008c8f5:0x3ffbc5e0 0x4008c982:0x3ffbc660 0x4008cc75:0x3ffbc680 0x400848be:0x3ffbc6a0 0x4008c774:0x3ffbc760 0x4008c8f5:0x3ffbc780 0x4008c982:0x3ffbc800 0x4008cc75:0x3ffbc820 0x400848be:0x3ffbc840 0x4008c774:0x3ffbc900 0x4008c8f5:0x3ffbc920 0x4008c982:0x3ffbc9a0 0x4008cc75:0x3ffbc9c0 0x400848be:0x3ffbc9e0 0x4008c774:0x3ffbcaa0 0x4008c8f5:0x3ffbcac0 0x4008c982:0x3ffbcb40 0x4008cc75:0x3ffbcb60 0x400848be:0x3ffbcb80 0x4008c774:0x3ffbcc40 0x4008c8f5:0x3ffbcc60 0x4008c982:0x3ffbcce0 0x4008cc75:0x3ffbcd00 0x400848be:0x3ffbcd20 0x4008c774:0x3ffbcde0 0x4008c8f5:0x3ffbce00 0x4008c982:0x3ffbce80 0x4008cc75:0x3ffbcea0 0x400848be:0x3ffbcec0 0x4008c774:0x3ffbcf80 0x4008c8f5:0x3ffbcfa0 0x4008c982:0x3ffbd020 0x4008cc75:0x3ffbd040 0x400848be:0x3ffbd060 0x4008c774:0x3ffbd120

Guru Meditation Error: Core  0 panic'ed (Unhandled debug exception)
Debug exception reason: Stack canary watchpoint triggered (8) 

This error persists despite increasing the stack size to more than 16 KB, suggesting that it is not simply due to a shortage of memory. My best guess is a bug in the BearSSL implementation of the ChaCha/Poly cipher suite; however, it is too early to say for sure.

This error can temporarily be worked around by flushing SSLClient's buffer using SSLClient::flush after every write to the network. I have updated the examples to include this workaround; however, it would definitely be best to address this issue with a more permanent fix in the future.
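A minimal sketch of that workaround, assuming a PubSubClient-style object with `publish(topic, payload)` and an SSLClient-style object with `flush()`. The helper name `publishAndFlush` and the two stand-in types are hypothetical and exist only so the sketch compiles off-device; on real hardware you would pass the actual client objects instead.

```cpp
#include <cassert>

// Hypothetical helper, not part of SSLClient or PubSubClient: publish one
// message, then immediately flush the TLS write buffer.
template <typename MqttClient, typename TlsClient>
bool publishAndFlush(MqttClient& mqtt, TlsClient& tls,
                     const char* topic, const char* payload) {
    assert(topic != nullptr && payload != nullptr);
    const bool queued = mqtt.publish(topic, payload);
    tls.flush();  // push any buffered TLS data out to the network now
    return queued;
}

// Desktop stand-ins (assumptions) so the sketch builds without Arduino headers.
struct FakeMqtt {
    int published = 0;
    bool publish(const char*, const char*) { ++published; return true; }
};

struct FakeTls {
    int flushes = 0;
    void flush() { ++flushes; }
};
```

The helper flushes even when `publish` reports failure, on the assumption that leaving partially buffered data behind is worse than an extra flush.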

@prototypicalpro prototypicalpro added the bug Something isn't working label Apr 28, 2020
@bleckers
Contributor

I am getting the same thing with the latest PubSubClient. However, when I put in the workaround to flush after each publish, it locks up for 20 seconds and then reconnects:

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_WARN)(m_run_until): Terminating because the ssl engine closed

(SSLClient)(SSL_ERROR)(flush): Could not flush write buffer!

Attempting MQTT connection...(SSLClient)(SSL_INFO)(connect): Base client connected!

(SSLClient)(SSL_INFO)(m_get_session_index): xxxxxx.amazonaws.com

(SSLClient)(SSL_INFO)(getSession): Using session index: 

(SSLClient)(SSL_INFO)(getSession): 0

(SSLClient)(SSL_INFO)(m_start_ssl): Set SSL session!

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_start_ssl): Connection successful!

connected

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVAPP

@bleckers
Contributor

bleckers commented May 22, 2020

So it seems (though I could be wrong) that after a publish the engine doesn't enter the BR_SSL_RECVAPP state (a flush runs the engine until it reaches the BR_SSL_SENDAPP target), which causes the operation to time out. I will continue to investigate, but maybe forcing the flush straight away instead might work?

Note this isn't looking to fix the issue at hand, just the workaround.
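To make the state names in these logs easier to read: BearSSL reports its engine state as a bit mask, so several states can be set at once, which is why `RECVREC` and `SENDAPP` appear together. The sketch below decodes such a mask. The numeric values are what I recall from `bearssl_ssl.h` and should be treated as an assumption; `describeState` and `canReadAppData` are hypothetical helpers, not SSLClient functions.

```cpp
#include <cassert>
#include <string>

// Engine state bits as recalled from bearssl_ssl.h (assumption); real code
// should use the BR_SSL_* macros rather than these copies.
constexpr unsigned kClosed  = 0x0001;  // BR_SSL_CLOSED
constexpr unsigned kSendRec = 0x0002;  // BR_SSL_SENDREC
constexpr unsigned kRecvRec = 0x0004;  // BR_SSL_RECVREC
constexpr unsigned kSendApp = 0x0008;  // BR_SSL_SENDAPP
constexpr unsigned kRecvApp = 0x0010;  // BR_SSL_RECVAPP

// Turn a state mask into a space-separated list of names, lowest bit first.
std::string describeState(unsigned state) {
    std::string out;
    auto add = [&](unsigned bit, const char* name) {
        if (state & bit) {
            if (!out.empty()) out += ' ';
            out += name;
        }
    };
    add(kClosed, "CLOSED");
    add(kSendRec, "SENDREC");
    add(kRecvRec, "RECVREC");
    add(kSendApp, "SENDAPP");
    add(kRecvApp, "RECVAPP");
    if (out.empty()) out = "NONE";
    return out;
}

// True only when decrypted application data is waiting to be read.
bool canReadAppData(unsigned state) {
    return (state & kRecvApp) != 0;
}

// Example use: the mask logged after a publish has no RECVAPP bit.
void demo() {
    assert(describeState(kRecvRec | kSendApp) == "RECVREC SENDAPP");
    assert(!canReadAppData(kRecvRec | kSendApp));
}
```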

@bleckers
Contributor

bleckers commented May 22, 2020

Ah, running client.loop() after a client.publish seems to solve this problem.

Edit: spoke too soon, this still causes the stack overflow issues.

Edit2: doing the following with pubsub in the loop without the flushes (set to run every 30 seconds) causes it to crash the second time around the loop:

      client.publish("topic.out", out);

      client.subscribe("topic.in");

My MQTT_MAX_PACKET_SIZE is set to 1024
So this might be a way to speed up debugging.

@bleckers
Contributor

bleckers commented May 22, 2020

OH WOW.

I set #define MQTT_MAX_TRANSFER_SIZE 80 in PubSubClient.h (note I always had ETHERNET_LARGE_BUFFERS set in Ethernet.h) and this fixed the above Edit2. I'm going to leave it running without the subscribe function (and without flush) in there and see if the original problem goes away.

Edit: Looking good so far. Will leave things overnight.

@prototypicalpro
Member Author

I added some more error logging to SSLClient::flush on the master branch, which should give us more diagnostic information. Try reproducing the issue with the latest master and see if it helps.

@bleckers
Contributor

bleckers commented May 23, 2020

It crashed about 8 hours later without flushes and with MQTT_MAX_TRANSFER_SIZE 80 in PubSubClient.h set. Normally it would run for about 5-30 minutes. With flushes it actually runs just fine, it just has the long timeout issue.

Trying with the latest and flushes re-enabled:

###Startup###
Attempting MQTT connection...(SSLClient)(SSL_INFO)(connect): Base client connected!

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_start_ssl): Connection successful!

connected

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVAPP

###MQTT Publish and Flush####

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5                                     <<< Pauses here

(SSLClient)(SSL_WARN)(m_run_until): Terminating because the ssl engine closed

(SSLClient)(SSL_ERROR)(flush): Could not flush write buffer!

Attempting MQTT connection...(SSLClient)(SSL_INFO)(connect): Base client connected!

(SSLClient)(SSL_INFO)(m_get_session_index): xxxx.amazonaws.com

(SSLClient)(SSL_INFO)(getSession): Using session index: 

(SSLClient)(SSL_INFO)(getSession): 0

(SSLClient)(SSL_INFO)(m_start_ssl): Set SSL session!

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_start_ssl): Connection successful!

connected

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVREC

   SENDAPP

(SSLClient)(SSL_INFO)(m_run_until): Expected bytes count: 

(SSLClient)(SSL_INFO)(m_run_until): 5

(SSLClient)(SSL_INFO)(m_run_until): m_run changed state:

(SSLClient)(SSL_INFO)(m_run_until): State: 

   RECVAPP


@bleckers
Contributor

I tried running the flush operation (without MQTT_MAX_TRANSFER_SIZE set) after an .available() call and wasn't getting pauses (I noticed that available() will force a flush if the engine is stuck in SENDAPP):

ethClientSSL.available();
ethClientSSL.flush();

However, this still causes the overflow to occur shortly afterwards.

@bleckers
Contributor

bleckers commented May 23, 2020

Edit: Actually disregard below, client.available gets called in the PubSub loop method. Calling client.loop(); after each network operation would have the same effect.


One thing I noticed in PubSubClient.cpp was that there isn't an available() call in the write function to the client, which goes against this library's recommendations concerning the write buffering. This is leading you to call the flush process manually as a fix for this.

boolean PubSubClient::write(uint8_t header, uint8_t* buf, uint16_t length) {
    uint16_t rc;
    uint8_t hlen = buildHeader(header, buf, length);

#ifdef MQTT_MAX_TRANSFER_SIZE
    uint8_t* writeBuf = buf+(MQTT_MAX_HEADER_SIZE-hlen);
    uint16_t bytesRemaining = length+hlen;  //Match the length type
    uint8_t bytesToWrite;
    boolean result = true;
    while((bytesRemaining > 0) && result) {
        bytesToWrite = (bytesRemaining > MQTT_MAX_TRANSFER_SIZE)?MQTT_MAX_TRANSFER_SIZE:bytesRemaining;
        rc = _client->write(writeBuf,bytesToWrite);
        result = (rc == bytesToWrite);
        bytesRemaining -= rc;
        writeBuf += rc;
    }
    return result;
#else
    rc = _client->write(buf+(MQTT_MAX_HEADER_SIZE-hlen),length+hlen);
    lastOutActivity = millis();
    return (rc == hlen+length);
#endif
}

I have placed "while (!_client->available()) {}" at the end of this function and will see how things go with the original issue at hand.

This is weird though, because originally even though this available() function isn't being called, data is still being sent to the server (even without flush()). The publish call is the only network operation I have going.
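For reference, the chunked branch of the write function quoted above boils down to the pattern below. This is a standalone sketch, not PubSubClient code: `writeInChunks` and `RecordingSink` are hypothetical names, and the sink only needs a `write(buffer, length)` method that returns the number of bytes accepted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Send `length` bytes in pieces of at most `maxChunk` bytes, stopping at the
// first short write. The chunk length is a size_t here; the quoted code keeps
// it in a uint8_t, which cannot hold values above 255.
template <typename Sink>
bool writeInChunks(Sink& sink, const std::uint8_t* data,
                   std::size_t length, std::size_t maxChunk) {
    assert(data != nullptr || length == 0);
    if (maxChunk == 0) return false;  // a zero-sized chunk never makes progress
    while (length > 0) {
        const std::size_t want = std::min(length, maxChunk);
        const std::size_t sent = sink.write(data, want);
        if (sent != want) return false;  // short write: report failure
        data += sent;
        length -= sent;
    }
    return true;
}

// Desktop stand-in (assumption) that records the size of every chunk.
struct RecordingSink {
    std::vector<std::size_t> chunks;
    std::size_t write(const std::uint8_t*, std::size_t n) {
        chunks.push_back(n);
        return n;
    }
};
```

With a 180-byte payload and a limit of 80, the sink sees three writes of 80, 80 and 20 bytes.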

@bleckers
Contributor

bleckers commented May 23, 2020

Aha! If I run a client.loop() before an ethClientSSL.flush(), then the flush works correctly without delay.

Edit: it still caused crashes, but if I run client.loop as well before the publish, all is well!

Basically doing this fixes everything:

      client.loop();
      client.publish("update.topic", out);
      client.loop();
      ethClientSSL.flush();
      client.loop();

Edit2: Looking into this further, it seems it might be due to the way buffers have been implemented around the place - esp8266/Arduino#3002

Not sure if it's a similar problem here (but it cropped up when the contributor was developing SSL in PubSubClient using WifiSecureClient - knolleary/pubsubclient#251), but definitely something to look at.

@prototypicalpro
Member Author

That's interesting, I'm glad you found a fix. I suspect that one of the root causes is that BearSSL crashes/malfunctions when attempting to encrypt a 100% full buffer, but I will need to investigate further.

@bleckers
Contributor

bleckers commented May 24, 2020

Unfortunately spoke too soon. It does crash after several hours, so it's back around what it was like when I had MQTT_MAX_TRANSFER_SIZE set, except flushes now don't pause.

I guess it's worth mentioning that I'm using the W6100, which has a double sized buffer (32k rather than 16k on the w5100). I have it set to 2 sockets, and the tx/rx buffers get dynamically set to maximise space when ETHERNET_LARGE_BUFFERS is set (i.e. each rx/tx buffer is 8k from a total of 32k). I am sending 180 bytes and my MQTT message size is 1024.
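The buffer sizes in that description can be checked with a little arithmetic. This sketch assumes the memory is split evenly, with one RX and one TX buffer per socket; `bytesPerBuffer` is a hypothetical helper, not an Ethernet library function.

```cpp
// Hypothetical helper: divide the chip's total buffer memory evenly across
// sockets, giving each socket one RX and one TX buffer of equal size.
constexpr unsigned bytesPerBuffer(unsigned totalBytes, unsigned sockets) {
    return totalBytes / (sockets * 2u);
}

// Figures from the comment above: 32 KB total, 2 sockets.
static_assert(bytesPerBuffer(32u * 1024u, 2u) == 8u * 1024u,
              "each RX/TX buffer should come out at 8 KB");

// The 1024-byte MQTT packet limit, and so the 180-byte payload, fits in one buffer.
static_assert(1024u < bytesPerBuffer(32u * 1024u, 2u),
              "a full MQTT packet should fit in a single socket buffer");
```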

I have a watchdog in there to reset as a workaround, but it's not the most ideal solution.

Edit: I have had this running all day, for 10 hours, with MQTT_MAX_TRANSFER_SIZE set to 256 as well as the .loop() call additions above, and it's been rock solid so far (I have an uptime counter keeping track). Will update tomorrow.

Edit2: still running with both these changes 19 hours later.

@bleckers
Contributor

bleckers commented May 25, 2020

Uptime: Days: 1, Hours: :02, Minutes: :26, Seconds: :19

Still going strong!

Edit: crashed at:

Uptime: Days: 1, Hours: :12, Minutes: :23, Seconds: :19

Much better than before though.

@bleckers
Contributor

bleckers commented Jun 5, 2020

A bit of an update after some further dev and discovery.

So it seems that running client.loop() (note "client" refers to the defines in the MQTT example) as many times as possible without delay prevents the crashing I was having. Looking at the example, I noticed you don't have a delay in your loop, whereas I had a delay(1) in there (which I have since replaced with sleep code and interrupts on Ethernet), so that would be why your flushes worked originally whereas mine didn't.

I have a check in the code that tests ethClientSSL.available(); when data is available it sets a flag to run without delay/sleep until nothing more is available. Things have been rock solid for a good week now.
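The flag described above might look roughly like the following. This is a sketch rather than the code actually running on the device: `LoopPacer` and its methods are hypothetical names, and the argument is assumed to be whatever `ethClientSSL.available()` returns on each pass through the main loop.

```cpp
// Hypothetical pacing helper: remembers whether the last check saw unread bytes.
class LoopPacer {
public:
    // Call once per loop iteration with the number of unread bytes
    // (on the device: the result of ethClientSSL.available()).
    // Returns true when the loop may delay or sleep, false while data is pending.
    bool delayAllowed(int availableBytes) {
        runWithoutDelay_ = availableBytes > 0;
        return !runWithoutDelay_;
    }

    // True while the loop should keep spinning without delay/sleep.
    bool runWithoutDelay() const { return runWithoutDelay_; }

private:
    bool runWithoutDelay_ = false;
};
```

In the main loop this would gate the sleep call, for example `if (pacer.delayAllowed(ethClientSSL.available())) { /* sleep or delay here */ }`, so that client.loop() keeps running back to back while bytes are waiting.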

So the flushes definitely prevent the aforementioned stack overflows, provided you run the client loop without delay (me big dumb dumb). Is it possible the original issue is related to frame corruption/data loss when not processing fast enough or if it's interrupted, which wreaks havoc with the SSL libraries?

@prototypicalpro
Member Author

Maybe, but I don't think so. BearSSL's data should be completely isolated from the rest of the code, and SSLClient will not fill the buffer until BearSSL is finished processing. I think the most likely suspect is a buffer overflow happening somewhere in BearSSL or the Ethernet library.

@bleckers
Contributor

bleckers commented Jun 11, 2020

I know it's ESP8266 related, but I'm guessing the API implementation might be similar. They are possibly having the same issues over there - esp8266/Arduino#6143

Edit: ah nevermind, you mentioned you increased the stack size.

@bleckers
Contributor

bleckers commented Jun 15, 2020

Trying to debug why the flushes work and why we get crashes without them. It seems the crash occurs very reliably when the MQTT client loop() writes an MQTT keepAlive ping while it hasn't flushed the previous publish down the client stack yet (denoted* by size 2 here):

publish: {"deviceID":"000102030405","pa":101972,"temp":22,"lux":7}
write()
size:93 cur_idx:0
publish: {"deviceID":"000102030405","pa":101973,"temp":22,"lux":7}
write()
size:93 cur_idx:0
write()
size:2 cur_idx:0
##Crash##

So it seems that if there is an output (or possibly input) buffer that hasn't been processed by BearSSL yet when publishing, it affects the way BearSSL works. I've never seen the crash occur during a regular publish that doesn't coincide with a ping (which is why it seems random).

This would be why we're seeing this with PubSub.

*Those calls are in SSLClient::write():

    while (cur_idx < size) {
        //Serial.print(size);
        //Serial.print(" ");
        //Serial.print(cur_idx);
        //Serial.flush();

@bleckers
Contributor

bleckers commented Jun 16, 2020

I've been experimenting with running:
if (sslClient->available()) { sslClient->flush(); }
instead of just sslClient->flush();
With this, the crashes don't happen either, so it seems it's something to do with the incoming bytes?

@prototypicalpro
Member Author

@bleckers are you changing the buffer size in PubSubClient? It's possible that this issue is related to knolleary/pubsubclient#764.

@bleckers
Contributor

bleckers commented Oct 5, 2020

No, not changing at runtime. Currently set to #define MQTT_MAX_PACKET_SIZE 1024

The workarounds have been rock solid for a good few months now without fail, provided you run flush() often enough (those latest comments were more to see why it might be happening without the flush()).

Also of note for those who might be using the Ethernet libraries: there are some issues with them that will cause lockups under certain network situations.

arduino-libraries/Ethernet#78

arduino-libraries/Ethernet#137

There's also a fix for those using SSLClient that sorts out a few issues there too - #13

@1simone0

1simone0 commented Feb 4, 2021

Hello,
I'm not really an expert in TLS and Ethernet, but I'm trying to connect to AWS using the EthernetAWSIoT example.
I'm not using ethernet.h but ethernetlarge.h instead, and PubSubClient for MQTT (so I should not need to modify the libraries... is that right?).
MQTT_PACKET_SIZE = 1024
Board: ESP32 with an Ethernet module (I don't know the name, but I tested it without SSL and it works).

I got this error on the serial output:

Initialize Ethernet with DHCP:
My IP address: 192.168.1.109
Attempting MQTT connection...(SSLClient)(SSL_ERROR)(m_run_until): SSL internals timed out! This could be an internal error, bad data sent from the server, or data being discarded due to a buffer overflow. If you are using Ethernet, did you modify the library properly (see README)?
(SSLClient)(SSL_ERROR)(connected): Not connected because write error is set
(SSLClient)(SSL_ERROR)(m_print_ssl_error): SSL_BR_WRITE_ERROR
(SSLClient)(SSL_ERROR)(m_start_ssl): Failed to initlalize the SSL layer
(SSLClient)(SSL_ERROR)(m_print_br_error): Unknown error code: 0
failed, rc=-2 try again in 5 seconds
(SSLClient)(SSL_ERROR)(connected): Not connected because write error is set
(SSLClient)(SSL_ERROR)(m_print_ssl_error): SSL_BR_WRITE_ERROR
Attempting MQTT connection...(SSLClient)(SSL_ERROR)(connected): Not connected because write error is set
(SSLClient)(SSL_ERROR)(m_print_ssl_error): SSL_BR_WRITE_ERROR
(SSLClient)(SSL_ERROR)(connected): Not connected because write error is set
(SSLClient)(SSL_ERROR)(m_print_ssl_error): SSL_BR_WRITE_ERROR
(SSLClient)(SSL_ERROR)(m_run_until): SSL internals timed out! This could be an internal error, bad data sent from the server, or data being discarded due to a buffer overflow. If you are using Ethernet, did you modify the library properly (see README)?
(SSLClient)(SSL_ERROR)(connected): Not connected because write error is set
(SSLClient)(SSL_ERROR)(m_print_ssl_error): SSL_BR_WRITE_ERROR
(SSLClient)(SSL_ERROR)(m_start_ssl): Failed to initlalize the SSL layer
(SSLClient)(SSL_ERROR)(m_print_br_error): Unknown error code: 0
failed, rc=-2 try again in 5 seconds


Do you believe this is related to the known issue, or am I missing something?

Also, I'm not sure why the sketch makes a connection call with "arduinoClient" in the reconnect() function.
Is that something standard, or should I replace it with a string related to my AWS account/thing?
Any help is appreciated!

BR,
Simone

@prototypicalpro
Member Author

prototypicalpro commented Mar 22, 2021

@bleckers It's been a while!

I may have fixed the issue you were seeing. In v1.6.11 I fixed a bug where SSLClient would attempt to send zero bytes, causing a stack overflow crash with BearSSL (see #30). This problem would happen seemingly at random, since a very precise condition had to be met before it was triggered, and I strongly suspect that it was the reason for the crashes. If you're still working on your project, would you give it a shot?
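To illustrate the kind of guard involved, the sketch below never hands a zero-length chunk to the lower layer. It is not the actual change made in v1.6.11 (see #30 for that): `writeChunked`, the `ChunkSink` callback and the `capacity` parameter are hypothetical stand-ins for the step that copies data into the SSL engine's buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical sink: receives one chunk and stands in for the encrypt-and-send step.
using ChunkSink = std::function<void(const std::uint8_t*, std::size_t)>;

// Writes `size` bytes in chunks of at most `capacity` bytes.
// Returns the number of bytes handed to the sink.
// The sink is never called with a length of zero.
std::size_t writeChunked(const std::uint8_t* buf, std::size_t size,
                         std::size_t capacity, const ChunkSink& sink) {
    assert(buf != nullptr || size == 0);
    if (size == 0 || capacity == 0) {
        return 0;  // nothing to send, or no room: do not touch the lower layer
    }
    std::size_t cur_idx = 0;
    while (cur_idx < size) {
        const std::size_t len = std::min(capacity, size - cur_idx);
        sink(buf + cur_idx, len);  // len is at least 1 here
        cur_idx += len;
    }
    return cur_idx;
}
```

The early return is the point of the example: a caller that ends up with an empty payload simply gets 0 back instead of reaching the layer that misbehaves on zero-length input.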

@prototypicalpro
Member Author

prototypicalpro commented Mar 22, 2021

@1simone0 It looks like that issue is with your network module. Does MQTT work without SSLClient, or does the EthernetHTTPS example work? If not, would you mind opening a new issue so discussion can continue in that thread?

@bleckers
Contributor

More than happy to give the latest a shot later in the week. I'll retry without the mitigations in place and see what happens.

@genotix

genotix commented Mar 22, 2021

> @bleckers It's been awhile!
>
> I may have fixed the issue you were seeing. In v1.6.11 I fixed a bug where SSLClient would attempt to send zero bytes, causing a stack overflow crash with BearSSL (see in #30). This problem would happen seemingly randomly since there was a very precise condition that had to be met before it was triggered, and I strongly suspect that it was the reason for the crashes. If you're still working on your project, would you give it a shot?

Testing right now with the client.flush() removed.
I'm at 3000 messages now and haven't had a single disconnect of the TLS connection, and no stack overflow so far! (I am a very happy man right now!)

Thank you!

@genotix

genotix commented Mar 24, 2021

At 40k messages now; I'd say consider this issue fixed!

@michaelhwhitten

I completely agree. I've now tested this for tens of thousands of publishes on M5Stack Core 1s, M5Stack Atom Lites, and Adafruit ESP32-based Feathers, at two different sites on different networks. Not. one. hiccup. Great job y'all!

@bleckers
Contributor

All seems perfectly fine with these changes. I reckon you can close this one.

@prototypicalpro
Member Author

Awesome! Took a while, but I'm glad it's fixed now 🎉.
