HEAP CORRUPT #1101

Closed
smacyas opened this issue Feb 12, 2018 · 22 comments
Labels
Status: Stale Issue is stale stage (outdated/stuck)

Comments

@smacyas

smacyas commented Feb 12, 2018

I use the Arduino IDE and the ESPAsyncWebServer library.

Periodically, when I try to load a page, the ESP32 restarts with:
"CORRUPT HEAP: Bad head at 0x3ffe2ec0. Expected 0xabba1234 got 0x3ffe33b0
.assertion "head != NULL" failed: file "/Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/heap/./multi_heap_poisoning.c", line 199, function: multi_heap_free
.abort() was called"

What does it mean?

Decoding 11 results
0x40087f48: invoke_abort at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/./panic.c line 578
0x40088047: abort at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/./panic.c line 578
0x400f2e9b: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c line 63 (discriminator 8)
0x40087c59: multi_heap_free at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/heap/./multi_heap_poisoning.c line 284
0x40083d6e: heap_caps_free at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/heap/./heap_caps.c line 136
0x400842f9: _free_r at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/newlib/./syscalls.c line 42
0x40112bed: tcp_close_shutdown at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/lwip/core/tcp.c line 225
0x40112caf: tcp_close at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/lwip/core/tcp.c line 305
0x400e5031: _tcp_close_api(tcpip_api_call*) at C:\Users\matsys_s\Documents\Arduino\hardware\espressif\esp32\libraries\AsyncTCP\src/AsyncTCP.cpp line 704
0x4010e6a1: tcpip_thread at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/lwip/api/tcpip.c line 474

@stickbreaker
Contributor

@smacyas The error is telling you that somewhere in your code you are overwriting a memory buffer.

"CORRUPT HEAP: Bad head at 0x3ffe2ec0. Expected 0xabba1234 got 0x3ffe33b0
.assertion "head != NULL" failed:

This is telling you that the heap control structure (the heap head) has been overwritten. The heap expects a marker (a 32-bit value of 0xabba1234) at memory address 0x3ffe2ec0, but instead it found the value 0x3ffe33b0 stored there.

Somewhere in your program you are writing data outside of a memory buffer.
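
To make the marker check concrete, here is a minimal host-side C++ sketch of heap poisoning. The two marker values are the ones printed in the crash logs in this thread; the block layout and the names `PoisonedBlock`, `makeBlock`, `headOk`, and `tailOk` are illustrative assumptions, not the ESP-IDF implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Marker values as printed in the "CORRUPT HEAP" messages in this thread.
constexpr uint32_t kHeadCanary = 0xABBA1234u;
constexpr uint32_t kTailCanary = 0xBAAD5678u;

// Hypothetical layout: [head marker][payload bytes][tail marker].
struct PoisonedBlock {
    std::vector<uint8_t> raw;
    std::size_t payloadSize;

    uint8_t* payload() { return raw.data() + sizeof(uint32_t); }
};

// Allocates a block and writes both markers around the payload.
PoisonedBlock makeBlock(std::size_t payloadSize) {
    assert(payloadSize > 0);
    PoisonedBlock b{std::vector<uint8_t>(payloadSize + 2 * sizeof(uint32_t), 0), payloadSize};
    std::memcpy(b.raw.data(), &kHeadCanary, sizeof kHeadCanary);
    std::memcpy(b.raw.data() + sizeof(uint32_t) + payloadSize, &kTailCanary, sizeof kTailCanary);
    return b;
}

// Reads a 32-bit word at a byte offset without assuming alignment.
uint32_t readWord(const PoisonedBlock& b, std::size_t offset) {
    uint32_t v;
    std::memcpy(&v, b.raw.data() + offset, sizeof v);
    return v;
}

// The checks a poisoning allocator would run when the block is freed.
bool headOk(const PoisonedBlock& b) { return readWord(b, 0) == kHeadCanary; }
bool tailOk(const PoisonedBlock& b) {
    return readWord(b, sizeof(uint32_t) + b.payloadSize) == kTailCanary;
}
```

In this model, a write that runs past the end of the payload damages the tail marker, and a write that lands before the start of the payload damages the head marker. Either one is only noticed later, when the block is checked, which is why the crash shows up in `free` rather than at the faulty write.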

Chuck.

@XYZVector

I am seeing the same issue and the same results. I have tried to detect nulls in the PCB and the message. Still no dice. This error is frustrating...

@smacyas
Author

smacyas commented Mar 7, 2018

Yes, this error often occurs when you frequently (or from multiple PCs) refresh a web page that is loaded from an ESP32.

@XYZVector

I suspect the lwip library is not thread safe... I slow down the update rate, and my crashes go away.

@bbasil2012

I have exactly the same issue with this library (ESPAsyncWebServer).
It falls even without any calls to the server.

@lefedor

lefedor commented Mar 8, 2018

I can confirm the issue.

@jamesbaber

Is the only workaround to reduce the number of requests/s?

@djmcmath

@XYZVector -- any details on slowing down the update rate? Can you clarify where you're inserting delays to slow things down, and how slow it has to be?

I'm updating a client web page every second or so, a few hundred bytes. I'm also using WiFiClient to update a remote database, just passing a single short URL (something like 200 bytes), every 3 minutes. Last night, it crashed dozens of times, often lasting little more than a few minutes between crashes.

Is there any other workaround? Anything that anyone's done to make this behave more reliably?

@renatocarusos

HEAP CORRUPT too, using a lora library ...

@XYZVector

@djmcmath, the way my system works is that it generates a JSON packet to send out to its clients on an interval of around 5 to 10 seconds. I used to have it on 1-second intervals. On a 5 to 10 second interval it still crashes, but it takes so long to crash that it does not really interfere with the operation of my device. A reboot is not that costly and doesn't affect my application much anyway. On 1-second intervals, however, the crash would happen every 100-600 updates. That is around 1 to 6 minutes. Also, in my application the clients disconnect and are only connected for 1 to 10 minutes at a time.

Digging down with a debugger, I have found that the multithreading model is not quite supported correctly, and one of the buffers doesn't get freed correctly. The issue is below the AsyncSocketServer class: the error occurs deep in the bowels of the TCP/IP library, which, on further research, was not designed to be thread safe. me-no-dev will have his hands full trying to fix this, as it is not just an issue with the AsyncSocketServer class but with how you have to handle the TCP/IP library of the ESP dev kit. I am not trying to gloss over the issue here, but it is not a simple bug fix. It would require a complete rewrite, and a lot of the library would have to change to make it work properly. You would have to write thread-safe handlers, among other details.
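
One common way to get thread-safe handlers of the kind described above is to funnel every call into the non-thread-safe code through a single owner thread. The sketch below shows only that general pattern in portable C++. `SerialExecutor`, `post`, and `drain` are hypothetical names, and this is not a description of how AsyncTCP or lwIP is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

class SerialExecutor {
public:
    // Safe to call from any thread: only the queue is touched, under a lock.
    void post(std::function<void()> job) {
        assert(job != nullptr);  // an empty std::function cannot be run later
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push_back(std::move(job));
    }

    // Called only by the owner thread; runs queued jobs in FIFO order.
    // Returns the number of jobs executed.
    std::size_t drain() {
        std::size_t ran = 0;
        for (;;) {
            std::function<void()> job;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (jobs_.empty()) break;
                job = std::move(jobs_.front());
                jobs_.pop_front();
            }
            job();  // run outside the lock so a job may call post() itself
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> jobs_;
};
```

The design choice here is that other threads never touch the protected state directly. They only enqueue work, and the one thread that owns the state is the only one that ever runs it.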

So you can mitigate this error by slowing down the transactions you send back and forth, or you can look elsewhere for another socket library. However, this is still the fastest library on the ESP32, and it does work most of the time.
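
The slow-down mitigation amounts to enforcing a minimum interval between updates. Here is a minimal sketch, assuming the caller passes in a millisecond tick such as the value returned by Arduino's `millis()`. `UpdateThrottle` and its members are hypothetical names, not part of any library mentioned in this thread.

```cpp
#include <cassert>
#include <cstdint>

class UpdateThrottle {
public:
    explicit UpdateThrottle(uint32_t minIntervalMs) : minIntervalMs_(minIntervalMs) {
        assert(minIntervalMs > 0);
    }

    // Returns true when an update may be sent at time nowMs, and records it.
    // Unsigned subtraction keeps the comparison correct across tick wraparound.
    bool shouldSend(uint32_t nowMs) {
        if (sentOnce_ && (nowMs - lastSendMs_) < minIntervalMs_) return false;
        lastSendMs_ = nowMs;
        sentOnce_ = true;
        return true;
    }

private:
    uint32_t minIntervalMs_;
    uint32_t lastSendMs_ = 0;
    bool sentOnce_ = false;
};
```

In a sketch this could be used as `if (throttle.shouldSend(millis())) { /* build and send the JSON packet */ }` inside `loop()`, with the throttle constructed with, say, 5000 for a 5-second interval. Going by the reports in this thread, this only makes the crash rarer; it does not remove the underlying cause.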

-XYZVector

@djmcmath

djmcmath commented Aug 2, 2018

XYZVector, thanks for the details.

I'm at a point where I pretty much need to do what I can to not crash ever. Granted, a reboot almost always comes back up cleanly 30 seconds later, but crashes are pretty embarrassing, even if infrequent. I may try to slow the update rate (though the users are particularly fond of the high-speed updates), or attempt to integrate a different library for sockets.

Any recommendations for an alternate socket library that integrates cleanly with this server?

Thanks,
Dan

@XYZVector

XYZVector commented Aug 2, 2018 via email

@djmcmath

djmcmath commented Aug 2, 2018

That's what I was afraid of. I spent a bunch of last Saturday looking for something that works a bit better, but couldn't find anything that's as clean and elegant. (sigh) Oh, well. Thanks for the help.

@XYZVector

XYZVector commented Aug 3, 2018 via email

@me21

me21 commented Aug 12, 2018

So the lwIP library should be rewritten to fix this?

@stale

stale bot commented Aug 1, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Aug 1, 2019
@stale

stale bot commented Aug 15, 2019

This stale issue has been automatically closed. Thank you for your contributions.

@stale stale bot closed this as completed Aug 15, 2019
@ab1nash

ab1nash commented Nov 2, 2019

I was facing this problem while using the HTTPClient and ArduinoJson libraries, but when I returned from my createCI function, e.g. to a string, it worked.

@woodlist

Using the following libraries:
#include <Arduino.h>
#include <WiFi.h>
#include <AsyncTCP.h>
#include <SPIFFS.h>
#include <ESPAsyncWebServer.h>
#include <HTTPClient.h>
after several minutes of working well at a 300-millisecond asynchronous refresh rate, I got this message:
17:58:05.361 -> CORRUPT HEAP: Bad tail at 0x3ffd68d8. Expected 0xbaad5678 got 0xbaad5600
17:58:05.361 -> assertion "head != NULL" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/heap/multi_heap_poisoning.c", line 214, function: multi_heap_free
17:58:05.361 -> abort() was called at PC 0x4012c3e7 on core 1
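
The expected and actual tail values in this log differ only in the lowest byte (0x78 became 0x00). The ESP32 is little-endian, so that is the lowest-addressed byte of the marker. This would be consistent with a one-byte overrun, such as a string terminator written one position past the end of a buffer, though that reading is an inference from the log and not a confirmed diagnosis. A minimal sketch of the comparison, using hypothetical helpers `byteAt` and `corruptedByteOffsets`:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns byte `index` of a 32-bit word, where index 0 is the least
// significant byte (the lowest-addressed one on a little-endian target).
uint8_t byteAt(uint32_t word, int index) {
    assert(index >= 0 && index < 4);
    return static_cast<uint8_t>((word >> (8 * index)) & 0xFFu);
}

// Lists the byte indices at which `got` differs from `expected`.
std::vector<int> corruptedByteOffsets(uint32_t expected, uint32_t got) {
    std::vector<int> offsets;
    for (int i = 0; i < 4; ++i) {
        if (byteAt(expected, i) != byteAt(got, i)) offsets.push_back(i);
    }
    return offsets;
}
```

Calling `corruptedByteOffsets(0xbaad5678, 0xbaad5600)` reports only index 0, whereas the head values from the original report (0xabba1234 versus 0x3ffe33b0) differ in all four bytes.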

@woodlist

woodlist commented Apr 26, 2020

Another crash, on the same sketch:
18:16:07.486 -> Guru Meditation Error: Core 1 panic'ed (LoadStoreAlignment). Exception was unhandled.
18:16:07.521 -> Core 1 register dump:
18:16:07.521 -> PC : 0x4016b6bb PS : 0x00060830 A0 : 0x800d3499 A1 : 0x3ffd1460
18:16:07.521 -> A2 : 0x5678dead A3 : 0x3f402e22 A4 : 0x00000000 A5 : 0x0000ff00
18:16:07.521 -> A6 : 0x00ff0000 A7 : 0xff000000 A8 : 0x00000000 A9 : 0x00000000
18:16:07.521 -> A10 : 0xffffffff A11 : 0x3f402e22 A12 : 0x00000001 A13 : 0x3ffd14c2
18:16:07.521 -> A14 : 0x00000036 A15 : 0x00000000 SAR : 0x0000000a EXCCAUSE: 0x00000009
18:16:07.555 -> EXCVADDR: 0x5678deb1 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0xffffffff

@atanisoft
Collaborator

@woodlist open a new issue rather than commenting on a long-closed issue. However, from what little you have posted, you are possibly running low on memory, or you have written past some buffer somewhere in your code.
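
For the "written past some buffer" case, the usual defence is to make every copy respect the size of the destination. A minimal sketch in standard C++ follows. `copyBounded` is a hypothetical helper, not part of the Arduino core or of any library named in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies src into dst, always NUL-terminating and never writing more than
// dstSize bytes. Returns true if the whole string fit, false if truncated.
bool copyBounded(char* dst, std::size_t dstSize, const char* src) {
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    std::size_t srcLen = std::strlen(src);
    std::size_t n = (srcLen < dstSize) ? srcLen : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return srcLen < dstSize;
}
```

Checking the return value lets the caller decide whether a truncated string is acceptable or should be treated as an error, instead of silently overwriting whatever follows the buffer.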

@woodlist

Not using the Serial Monitor and switching the browser from Chrome to the device's native browser helped a lot.

vortigont added a commit to vortigont/espem that referenced this issue Apr 24, 2021
Wrapper class for PZEM004T/PZEM004Tv30 libs, controlled with USE_PZEMv3 definition

Note: Async Server has lots of issues under esp32

me-no-dev/ESPAsyncWebServer#876
me-no-dev/ESPAsyncWebServer#900
espressif/arduino-esp32#1101
me-no-dev/ESPAsyncWebServer#324
me-no-dev/ESPAsyncWebServer#932

Signed-off-by: Emil Muratov <gpm@hotplug.ru>