Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in AsyncWebSocket::textAll(String const&) #900

Closed
0xFEEDC0DE64 opened this issue Dec 12, 2020 · 22 comments
Closed

Crash in AsyncWebSocket::textAll(String const&) #900

0xFEEDC0DE64 opened this issue Dec 12, 2020 · 22 comments
Labels

Comments

@0xFEEDC0DE64
Copy link

After a few hours of uptime and lots of calls to textAll, it sometimes crashes with the following stack:

0x401125f4 is in AsyncWebSocket::_cleanBuffers() (.pio/libdeps/homeplus_ota/ESP Async WebServer/src/AsyncWebSocket.cpp:1235).
1235      for(AsyncWebSocketMessageBuffer * c: _buffers){
0x40112eef is in AsyncWebSocket::textAll(AsyncWebSocketMessageBuffer*) (.pio/libdeps/homeplus_ota/ESP Async WebServer/src/AsyncWebSocket.cpp:968).
968      _cleanBuffers(); 
0x40112f07 is in AsyncWebSocket::textAll(char const*, unsigned int) (.pio/libdeps/homeplus_ota/ESP Async WebServer/src/AsyncWebSocket.cpp:974).
974        textAll(WSBuffer); 
0x40112f2d is in AsyncWebSocket::textAll(String const&) (/home/feedc0de/.platformio/packages/framework-arduinoespressif32/cores/esp32/WString.h:308).
308            inline const char *buffer() const { return (const char *)(isSSO() ? sso.buff : ptr.buff); }
0x400fcd70 is in goe::http_broadcast(String const&) (src/http_websocket.cpp:145).
145        ws.textAll(msg);

Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled.

@zekageri
Copy link

Iam experiencing the same issue. Sending char with text all every sec results in this exact same error and restart

@0xFEEDC0DE64
Copy link
Author

I'm starting to remove all ESPAsyncWebServer code from my project to replace it with espressif provided APIs, I see no future with this webserver if it keeps crashing all over the place after long runtimes, I need rock solid 24/7 ontime, seems like most arduino related library suffer of poor coding quality and bad memory management.

In my investigations on this webserver library here I saw handwritten linked lists in a very strange coding style with callbacks and so on instead of using perfectly fine C++ provided containers. WTF. I am wondering how it kept working so good for so long in my development cycle so far

@lorol
Copy link

lorol commented Dec 16, 2020

Good for you!
Write your own then.
I believe you should first get even a fraction of the popularity of this code before saying "WTF".

@0xFEEDC0DE64
Copy link
Author

I'm just saying the problems could all have been avoided by not writing another container implementation, chances are very low to write a bug free version, c++ std library does a very good job at it with big comittee behind it, porting arduino containers over to std containers would fix crashes for everyone, im under time pressure to finish the project and therefore cannot do it right now

@0xFEEDC0DE64
Copy link
Author

My long journey of this crash finally found an end, I got rid of all the crashes 👍
First of all, I switched from platformio to esp-idf, which freed around 40kb of heap memory (which already took days to port all code over, but I'm still using arduino and ESPAsyncWebserver).
Then I started replacing all the linked lists with C++ provided ones, committed my hacks to a branch, and used it as a esp-idf component.
I don't know if I broke any of the features I don't use of the webserver, but the websocket and SPIFFS provided html page are rock solid now and never crashed again in the last few days.
If anyone is interested in the branch of the webserver, i pushed it to github and might do further cleanups, but I guess a pull-request will never accepted, so I won't really bother creating one in the first place. (please forgive me the branch name, but i was in a little rage when I saw the original implementations):

https://github.com/me-no-dev/ESPAsyncWebServer/compare/master...0xFEEDC0DE64:dumbfixes?expand=1

@sergeymaysak
Copy link

sergeymaysak commented Dec 22, 2020

Hey Daniel,
First I've been very skeptic on your mood regarding c++ stl and usage of custom linked lists and array in this server. It is widely used opinion (proven by practice indeed) that stl is not optimized for embedded platform due to its dynamic allocation nature. Also usually usage of stl containers significantly increases size of binary (even on desktops) - so usage of linked lists makes a lot of sense in context of esp32 for example.
Also I feel your pain for 200% as already spent year ago say 1 month of bumping into crashes with web socket impl you depicted and have to rollback to blocking sync version of WebSockets for Arduino...

And you know - I tried your version and wow - it actually decreases size of my binary on 9% (170k). Well it took me 30 minutes to switch from default gnu++11 to gnu++17 and still _queue = {} way to clear the queue does not work for me and was replaced with explicit constructors of empty queue, but... it works! I continue to test web server I have and so far it seems no features lost... I would need some time to switch back to async websocket again ) Also want to say that I verified if actual switch to c++17 helped with binary size - and no - it does not help...

Thank you for efforts and decision to write here with your results. Also imo your memory management and threading improvements should be merged from your pull request for everybody for sure...

@proddy
Copy link

proddy commented Dec 23, 2020

Interesting thread. I may try out @0xFEEDC0DE64's port too.

In defence to the custom linked-list argument, don't forget how much available DRAM is available on an ESP8266 so any short-cut helps in that area. I recently replaced a few of of the Arduino STL containers I was using in one of my projects with homegrown ones. I wrote a small sketch that loaded up 200 elements into a std::list/std::queue/std::vector and measured the heap memory it used on an ESP8266 as well as the fragmentation. The overhead is huge and by replacing with a light-weight alternative classes I was able to reduce the heap overhead by 15-20% in most cases. I also love using std::function and std::bind and lambda's but choose to replace those with ugly old-fashioned C-style function pointers which free'd up another 5%.

Also I used this port of AsyncWebServer https://github.com/sascha432/ESPAsyncWebServer which saves an additional 2K by moving all strings to Flash/PROGMEM.

For ESP32 only check out https://github.com/fhessel/esp32_https_server

@avillacis
Copy link

I am very interested in any fixes to the stability of this webserver. So, in order to do some code review/cleanup, and also to make the code compatible with the default C++11 setup in Arduino/ESP32, I went through every patch in the dumbfixes branch of @0xFEEDC0DE64 . The result is here: yuboxfixes-0xFEEDC0DE64-cleanup. Besides making the patches C++11 compatible, I wanted to lay out every single fix and understand why each one of them was necessary. I am performing a torture test right now with the yuboxfixes-0xFEEDC0DE64-cleanup branch, and it has not crashed once yet in 3 hours so far (previous attempts crashed with various memory corruption backtraces).

@proddy
Copy link

proddy commented Jan 4, 2021

@avillacis thanks for doing that! Have you noticed any differences other than the web-socket stability, such as changes in the memory heap used or firmware/sketch size?

@avillacis
Copy link

@avillacis thanks for doing that! Have you noticed any differences other than the web-socket stability, such as changes in the memory heap used or firmware/sketch size?

I have not checked RAM usage, but the firmware file is 1025344 bytes before the changes by @0xFEEDC0DE64 , and 1026064 after.

@justoke
Copy link

justoke commented Jan 15, 2021

My long journey of this crash finally found an end, I got rid of all the crashes 👍
First of all, I switched from platformio to esp-idf, which freed around 40kb of heap memory (which already took days to port all code over, but I'm still using arduino and ESPAsyncWebserver).
Then I started replacing all the linked lists with C++ provided ones, committed my hacks to a branch, and used it as a esp-idf component.
I don't know if I broke any of the features I don't use of the webserver, but the websocket and SPIFFS provided html page are rock solid now and never crashed again in the last few days.
If anyone is interested in the branch of the webserver, i pushed it to github and might do further cleanups, but I guess a pull-request will never accepted, so I won't really bother creating one in the first place. (please forgive me the branch name, but i was in a little rage when I saw the original implementations):

https://github.com/me-no-dev/ESPAsyncWebServer/compare/master...0xFEEDC0DE64:dumbfixes?expand=1

I'm not sure if I missed something obvious, but I can't compile using the commit referenced here. Please could you outline any changes that are needed to use this branch. Here are some of the issues I ran into - there were a lot of compile errors, so I assume something else needs doing and not just adding the git url to my platformio.ini file. h2zero/NimBLE-Arduino#167 (comment)

@garageeks
Copy link

garageeks commented Jan 19, 2021

Kudos to @0xFEEDC0DE64 for the patch and @avillacis for the code cleanup!
I was plagued by random crashes right from the start of using this library to something very simple, obtain a JSON from a POST request and deserialize it through ArduinoJSON. It doesn't use websockets.
With the latest AsyncWebServer build in PlatformIO I had a crash between 10-120 mins.
After replacing it with the patched version, no crashes whatsoever for 24 hours.

Here's the test code:

//
// A simple server implementation showing how to:
//  * serve static messages
//  * read GET and POST parameters
//  * handle missing pages / 404s
//

#include <Arduino.h>
#include <WiFi.h>
#include "src/ESPAsyncWebServer/AsyncTCP.h"
#include "src/ESPAsyncWebServer/ESPAsyncWebServer.h"
#include "src/ESPAsyncWebServer/AsyncJson.h"
#include <ArduinoJson.h>


AsyncWebServer server(80);

// Set your Static IP address
IPAddress local_IP(192, 168, 1, 2);  
// Set your Gateway IP address
IPAddress gateway(192, 168, 1, 1);

IPAddress subnet(255, 255, 255, 0);
IPAddress primaryDNS(8, 8, 8, 8);     //optional
IPAddress secondaryDNS(8, 8, 4, 4);   //optional

const char* ssid = "xxx";
const char* password = "yyy";

const char* PARAM_MESSAGE = "message";

const int stationID = 98;


long tableUptime[3];
uint16_t tableError[3];
byte tableUsbCharging[3];
byte tableUsbCounter[3];
long tableUsbTime[3];
byte tableQiCharging[3];
byte tableQiCounter[3];
long tableQiTime[3];
float tableInputVoltage[3];
float tableUsb1Voltage[3];
float tableUsb2Voltage[3];
float tableQiVoltage[3];
float tableUsbPower[3];
float tableQiPower[3];

bool tableUsbStatus[3];
bool tableQiStatus[3];
uint16_t tableMinWiFiDelay;
uint16_t tableMaxWiFiDelay;
uint16_t tableSleepDelay;

long previousMillis;

void notFound(AsyncWebServerRequest *request) {
    request->send(404, "text/plain", "Not found");
}

void setup() {

  Serial.begin(115200);
  Serial.print("Setting AP (Access Point)…");
  // Remove the password parameter, if you want the AP (Access Point) to be open

  // Configures static IP address
  if (!WiFi.config(local_IP, gateway, subnet, primaryDNS, secondaryDNS)) {
    Serial.println("STA Failed to configure");
  }

  Serial.print("Connecting to AP...");

  int wait_time = 0;

  

  WiFi.mode(WIFI_STA);

    WiFi.begin(ssid, password);
    delay(150);
    char devicename[20]="Station-";
      char tempValue[5];
      itoa(stationID,tempValue,10);
      strcat(devicename, tempValue);
    WiFi.setHostname(devicename);

  delay(500);

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
    if(wait_time > 10) break;
    wait_time++;
    feedLoopWDT();
  }

  if(WiFi.status() == WL_CONNECTED) {     //WiFi AP is available, go ahead
    Serial.println("succeeded");

  } else {                                //WiFi not available, increment failed counter
    Serial.println("failed");
    WiFi.mode(WIFI_OFF);            //switch it off since it didn't connect. Will try again later   
  }

  webServer();
}

void loop() {

  long curMillis = millis();

  if(curMillis - previousMillis > 10000) {
    previousMillis = curMillis;
    Serial.print("Uptime: ");
    Serial.println(curMillis);
  }
}


void webServer() {

  // Add a handler to process the JSON contained in POST request table-update
  AsyncCallbackJsonWebHandler *handler = new AsyncCallbackJsonWebHandler("/table-new", [](AsyncWebServerRequest *request, JsonVariant &json) {
    StaticJsonDocument<1024> receivedDoc;
    //DynamicJsonDocument receivedDoc(1024);
    if (json.is<JsonArray>()) {
      receivedDoc = json.as<JsonArray>();

    } else if (json.is<JsonObject>()) {
      receivedDoc = json.as<JsonObject>();
    }

    int tableID = receivedDoc["ID"];
    if(tableID > 0 && tableID < 4) {        //tables are numbered 1,2,3
      tableID=tableID-1;    //subtract 1 as we store data in arrays
      //we aren't doing data validation as ArduinoJson returns zero if the variable is not big enough to hold the information
      tableUptime[tableID] = receivedDoc["uptime"];     
      tableError[tableID] = receivedDoc["error"];             //when it is zero there are no errors
      tableUsbCharging[tableID] = receivedDoc["usbCharging"];
      tableUsbCounter[tableID] = receivedDoc["usbCounter"];
      tableUsbTime[tableID] = receivedDoc["usbTime"];
      tableQiCharging[tableID] = receivedDoc["qiCharging"];
      tableQiCounter[tableID] = receivedDoc["qiCounter"];
      tableQiTime[tableID] = receivedDoc["qiTime"];
      tableInputVoltage[tableID] = receivedDoc["inputVoltage"];
      tableUsb1Voltage[tableID] = receivedDoc["usb1Voltage"];
      tableUsb2Voltage[tableID] = receivedDoc["usb2Voltage"];
      tableQiVoltage[tableID] = receivedDoc["qiVoltage"];
      tableUsbPower[tableID] = receivedDoc["usbPower"];
      tableQiPower[tableID] = receivedDoc["qiPower"];
      request->send(200);     //reply with a HTTP 200 OK message
    } else {
            Serial.println("[HTTP] Error: missing table ID in table-update request");
      request->send(400);     //reply with a HTTP 400 Bad Request message
        }


    });

    server.addHandler(handler);

  // Reply to table-command GET request with a JSON containing the table commands
    server.on("/table-send", HTTP_GET, [] (AsyncWebServerRequest *request) {
        if (request->hasParam("id")) {
      int tableID = request->getParam("id")->value().toInt();     //we need to convert a String value into an integer
      Serial.print("[HTTP] sending commands to table ID ");
      Serial.println(tableID);
      tableID=tableID-1;    //subtract 1 as we store data in arrays
      //DynamicJsonDocument sentDoc(512);
      StaticJsonDocument<512> sentDoc;
      if(tableUsbStatus[tableID]) {
        sentDoc["usb"] = "on";
      } else {
        sentDoc["usb"] = "off";
      }
      if(tableQiStatus[tableID]) {
        sentDoc["qi"] = "on";
      } else {
        sentDoc["qi"] = "off";
      }   
      sentDoc["minWiFiDelay"] = tableMinWiFiDelay;
      sentDoc["maxWiFiDelay"] = tableMaxWiFiDelay;
      sentDoc["sleepDelay"] = tableSleepDelay;

      char response[512];
      serializeJson(sentDoc, response);
      request->send(200, "application/json", response); //reply with a HTTP 200 OK message
      Serial.println(response);
  
        } else {
            Serial.println("[HTTP] Error: missing table ID in request");
      request->send(400);     //reply with a HTTP 400 Bad Request message
        }
    });

    server.onNotFound(notFound);

    server.begin();
}

@wachidsusilo
Copy link

i have the same issue here. The crash is random and unpredictable.

@oaklandGman
Copy link

Any progress on this issue? I am finding it easy to reproduce using textAll to send small websocket messages (approx 35-40 bytes) to a client, at approximately 5hz (200ms delay between messages), after about five minutes the ESP32 will crash with a corrupt heap error.

@0xFEEDC0DE64
Copy link
Author

I completely ditched the ESPAsyncWebserver from my project and started using the espressif provided httpd, which also forced to ditch platformio for the esp-idf, and since them I have never encountered any bugs or any problems anymore. The refactoring was huge to switch to the C apis from espressif but was worth all the efforts :)

@gnalbandian
Copy link

gnalbandian commented Apr 9, 2021

@0xFEEDC0DE64 and @avillacis, thanks for the reworked library. It's a pity the official repo is outdated and not taking pull request anymore.

@avillacis, would you mind enabling issues in your repo, so as to keep the library evolving somewhere?
If you are willing to:
Go to the Settings page of your fork.
Check the box next to Issues

I have tried to test the library but it seems that the class AsyncWebSocketMessageBuffer has been removed.
Any ideas why? For sure for a good reason, but, could you please explain how to circumvent the use of it so I can test the library? Thanks!

vortigont added a commit to vortigont/espem that referenced this issue Apr 24, 2021
Wrapper class for PZEM004T/PZEM004Tv30 libs, controlled with USE_PZEMv3 definition

Note: Async Server has lot's of issues under esp32

me-no-dev/ESPAsyncWebServer#876
me-no-dev/ESPAsyncWebServer#900
espressif/arduino-esp32#1101
me-no-dev/ESPAsyncWebServer#324
me-no-dev/ESPAsyncWebServer#932

Signed-off-by: Emil Muratov <gpm@hotplug.ru>
@stale
Copy link

stale bot commented Jun 9, 2021

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 9, 2021
@avillacis
Copy link

@0xFEEDC0DE64 and @avillacis, thanks for the reworked library. It's a pity the official repo is outdated and not taking pull request anymore.

@avillacis, would you mind enabling issues in your repo, so as to keep the library evolving somewhere?
If you are willing to:
Go to the Settings page of your fork.
Check the box next to Issues

Sorry for the late reply. I have been busy with work. Issues on the fork are now enabled.

I have tried to test the library but it seems that the class AsyncWebSocketMessageBuffer has been removed.
Any ideas why? For sure for a good reason, but, could you please explain how to circumvent the use of it so I can test the library? Thanks!

As commented in branch yuboxfixes-0xFEEDC0DE64-cleanup , commit yubox-node-org@4963ce9 removes AsyncWebSocketMessageBuffer in order to use instead a std::shared_buffer implementation. This achieves a significant code simplification as well as removing a homebrewed implementation of a smart pointer to a buffer. I am not sure what you mean by "circumvent". What scenario or code path do you intend to exercise?

@stale
Copy link

stale bot commented Jun 9, 2021

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it openin the future.

@stale stale bot removed the stale label Jun 9, 2021
@jjwbruijn
Copy link

@0xFEEDC0DE64 and @avillacis (and others) many thanks for this improved fork! I've found it to be a tremendous improvement in stability with websockets. I hope other people will find this post earlier than I did.

@stale
Copy link

stale bot commented Sep 3, 2021

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 3, 2021
@stale
Copy link

stale bot commented Sep 19, 2021

[STALE_DEL] This stale issue has been automatically closed. Thank you for your contributions.

@stale stale bot closed this as completed Sep 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests