Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory exhaustion due to erroneous read size #26

Closed
Jonty opened this issue Aug 3, 2019 · 6 comments
Closed

Memory exhaustion due to erroneous read size #26

Jonty opened this issue Aug 3, 2019 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@Jonty
Copy link

Jonty commented Aug 3, 2019

I've found yet another bug while working on receiving extremely large websocket messages, however I'm not entirely convinced it's size related.

The same test program as from the previous large message issue should reproduce the problem:

#include <ArduinoWebsockets.h>
#include <WiFi.h>

const char* ssid = "SSID";
const char* password = "PASSWORD";

const char* websockets_connection_string = "wss://ws.blockchain.info/inv";

using namespace websockets;

void onMessageCallback(WebsocketsMessage message) {
    Serial.print("Got Message: ");
    Serial.println(message.data());
}

WebsocketsClient client;

void onEventsCallback(WebsocketsEvent event, String data) {
    if(event == WebsocketsEvent::ConnectionOpened) {
        Serial.println("ConnectionOpened");
    }
}

void setup() {
    Serial.begin(115200);

    WiFi.begin(ssid, password);
    for(int i = 0; i < 10 && WiFi.status() != WL_CONNECTED; i++) {
        Serial.print(".");
        delay(1000);
    }

    client.onMessage(onMessageCallback);
    client.onEvent(onEventsCallback);

    client.setCACert(nullptr);
    client.connect(websockets_connection_string);

    // Results in total memory exhaustion, regardless of the amount of free memory available
    // ESP Panics
    client.send("{\"op\":\"ping_block\"}");
}

void loop() {
    client.poll();
}

I obtained an ESP32 WROVER module with 8mb of ram, which I know all the messages should easily fit into. At first I thought the allocator was failing to use the external memory as it carried on crashing, but after some digging it's revealed an odd bug.

It seems that the following segment of code in websockets_endpoint.cpp is the cause of the memory exhaustion:

uint32_t numReceived = socket.read(buffer, to_read);
for (uint64_t i = 0; i < numReceived; i++) {
data[done_reading + i] = static_cast<char>(buffer[i]);
}

Inserting a print statement to show the size of numReceived reveals normal behaviour right up until the crash, at which point the value of numReceived is set to the maximum value for uint32_t (4294967295) and the loop reads garbage data until the device runs out of memory.

Changing uint32_t to uint64_t for the return of client.read() resulted in the max value for uint64_t being returned instead.

This is a log & backtrace of the above program running, with websockets_endpoint.cpp patched to print out the size of numReceived:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:852
load:0x40078000,len:8424
load:0x40080400,len:5880
entry 0x4008069c
.[E][WiFiClient.cpp:309] setOption(): fail on fd -1, errno: 9, "Bad file number"
ConnectionOpened
size: 512
size: 512
size: 512
size: 512
size: 512
size: 512
size: 512
size: 498
size: 512
size: 512
size: 512
size: 512
size: 512
size: 512
size: 512
size: 498
size: 512
size: 512
size: 512
size: 512
size: 326
size: 512
size: 512
size: 512
size: 172
size: 512
size: 512
size: 373
size: 4294967295
Guru Meditation Error: Core  1 panic'ed (LoadStoreError). Exception was unhandled.
Core 1 register dump:
PC      : 0x401421ca  PS      : 0x00060c30  A0      : 0x8014228c  A1      : 0x3ffb1bf0
0x401421ca: websockets::internals::readData[abi:cxx11](websockets::network::TcpClient&, unsigned long long) at /Users/jonty/code/esp/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/bits/shared_ptr_base.h:659

A2      : 0x00000575  A3      : 0x3ffbb9b8  A4      : 0x00000ff2  A5      : 0xffffffff
A6      : 0x3ffb1e44  A7      : 0x00000000  A8      : 0x0001f627  A9      : 0x40000000
A10     : 0x00000000  A11     : 0x000000fe  A12     : 0x3ffb1c0c  A13     : 0x00000000
A14     : 0x00000ff2  A15     : 0x3ffe636c  SAR     : 0x00000004  EXCCAUSE: 0x00000003
EXCVADDR: 0x40000000  LBEG    : 0x40098705  LEND    : 0x40098715  LCOUNT  : 0xffffffff
0x40098705: strlen at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/machine/xtensa/../../../../.././newlib/libc/machine/xtensa/strlen.S:84

0x40098715: strlen at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/machine/xtensa/../../../../.././newlib/libc/machine/xtensa/strlen.S:96


Backtrace: 0x401421ca:0x3ffb1bf0 0x40142289:0x3ffb1e30 0x401433fd:0x3ffb1e80 0x4013f7e3:0x3ffb1ef0 0x4013bdfa:0x3ffb1f90 0x401207e1:0x3ffb1fb0 0x4008b799:0x3ffb1fd0
0x401421ca: websockets::internals::readData[abi:cxx11](websockets::network::TcpClient&, unsigned long long) at /Users/jonty/code/esp/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/bits/shared_ptr_base.h:659

0x40142289: websockets::internals::WebsocketsEndpoint::_recv() at /Users/jonty/code/esp/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/bits/shared_ptr_base.h:659

0x401433fd: websockets::internals::WebsocketsEndpoint::recv() at /Users/jonty/code/chain/components/arduino/libraries/ArduinoWebsockets/src/websockets_endpoint.cpp:312

0x4013f7e3: websockets::WebsocketsClient::poll() at /Users/jonty/code/chain/components/arduino/libraries/ArduinoWebsockets/src/websockets_client.cpp:499

0x4013bdfa: loop() at /Users/jonty/code/chain/main/main.cpp:93

0x401207e1: loopTask(void*) at /Users/jonty/code/chain/components/arduino/cores/esp32/main.cpp:19

0x4008b799: vPortTaskWrapper at /Users/jonty/code/esp/esp-idf/components/freertos/port.c:355 (discriminator 1)

to_read is always a sane value, so I also tried adding an extra check to the read loop to ensure that i <= to_read and while this prevented the memory exhaustion it resulted in being disconnected from the server shortly afterward, so clearly the session state is messed up.

This should be easily reproducible on an ESP32 with limited memory as the exhaustion trigger happens before any significant memory usage, although I've not tested this.

Let me know if you need any more information, or want me to test things.

@gilmaimon
Copy link
Owner

Interesting.

The MAX value you are seeing is actually -1 but is shown as max value because the variable is signed. -1 indicates an error in receiving data from the socket. I will look into it now.

Also, I've added a c++ tag to your code snipped (edited your post) so it's easier to read, I hope you don't mind.

I will look at it and will keep you updated.

Gil.

@Jonty
Copy link
Author

Jonty commented Aug 3, 2019

I thought that might be the case but didn't want to presume! Let me know if I can help with the debugging.

@gilmaimon
Copy link
Owner

Sure,
Few experiments you could make that will help pin down the problem:

  • Run the exact same thing with the highest debug settings. Meaning set esp32 logs to Verbose. This should tell us a lot about what's going in inside the esp32 libraries.
  • Run the code with the latest Arduino-ESP32 libraries. I've found out recently that the Espressif team working on that is doing a lot of changes without pushing them as stable releases. Can you try cloning the master branch and run the same code?
  • Last thing, which will help if this is actually a memory issue. Is to add this line before connect:
client.setFragmentsPolicy(FragmentsPolicy_Notify);

Websocket servers often use fragmented messages to send large amounts of data. I aggregate those fragments inside the library's memory and then pass it as a complete message. Adding that line will cause each fragment to be passed to the callback, thus (hopefully) saving plenty of memory. If the esp32 is running out of space, this might help.

I had to reset my PC this weekend, so I'm still setting everything up.

Thanks,
Gil.

@gilmaimon gilmaimon added the bug Something isn't working label Aug 3, 2019
@gilmaimon gilmaimon self-assigned this Aug 8, 2019
@gilmaimon
Copy link
Owner

I've worked on your issue, I believe it is fixed now. Please look at the branch bugfix/unchecked_reads_issue_26

It should really improve error handling and memory management.

Please let me know if it's working for you (with your new board) before I merge it.

Gil.

@Jonty
Copy link
Author

Jonty commented Aug 9, 2019

This works beautifully! Apologies I didn't have time to help you debug it.

Thank you so much for all your help. If you have a wishlist/patreon/similar, please let me know.

@gilmaimon
Copy link
Owner

gilmaimon commented Aug 9, 2019

Thank you! Iv'e released version 0.4.10 with the patch.

Your time invested in opening your issues and helping the library evolve is all I need! (You could give the library a star, it might help others find it 😅)

Have an awesome day,
Gil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants