Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler/linker should do a better job handling global arrays instead of crashing at runtime before setup() runs #2567

Open
marcmerlin opened this Issue Mar 11, 2019 · 9 comments

Comments

Projects
None yet
2 participants
@marcmerlin
Copy link

marcmerlin commented Mar 11, 2019

Hardware:

Board: NODEMCU 32S ESP32
Core Installation version: commit fff1783 Tue Aug 14 13:01:07 2018
IDE name: Arduino IDE
Flash Frequency: 80
PSRAM enabled: no
Upload Speed: 115200
Computer OS: linux

So, this is not a new problem, where there are 3 issues actually:

  1. if I allocate a 12KB array on the stack (inside the function), the program just dies with no warning or traceback because the stack gets smashed. The compiler/linker should be able to detect that the array is too big for whatever stack is going to be configured and give some kind of error when the final binary is generated
    I totally understand that if the stack happens to be 10KB and I allocate 9KB, depending on what's in it, I may still smash the stack without the linker being easily able to know it, but if it's compiling code where I try to store 11KB in a 10KB stack, it should immediately be able to tell that it's not going to work.

  2. If I allocate a global array that is too big, the "right" thing happens
    xtensa-esp32-elf/bin/ld: region `dram0_0_seg’ overflowed by 12040 bytes
    This is both good and bad. Good because it's a warning that gives me some clue and doesn't create a binary that will crash
    Bad, because ultimately I was not creating that many arrays (less than 50KB more or less), which totally fits in RAM (and works if I convert to malloc), but somehow the compiler/linker fails to generate code that just works (it works on teensy). I had to spend non trivial time converting a library from arrays to mallocs before it worked. This is apparently a known problem on ESP32 as per
    https://community.pixelmatix.com/t/esp32-runs-out-of-some-ram-when-using-64x96/394/3?u=marcmerlin

  3. if my arrays aren't too big, then I get a binary that crashes immediately at start, which is bad.
    This was fixed by changing all arrays to mallocs
    marcmerlin/AnimatedGIFs@c2dc4c7

Compile output:
Global variables use 110860 bytes (33%) of dynamic memory, leaving 216820 bytes for local variables. Maximum is 327680 bytes.

crash before changing to malloc:
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
0x40088960: invoke_abort at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/panic.c line 140
0x40088b63: abort at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/panic.c line 149
0x40082b72: start_cpu0_default at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/cpu_start.c line 381
0x40082d04: call_start_cpu0 at /Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/esp32/cpu_start.c line 213

Long story short, all this code worked great on teensy and ESP32's inability to use arrays that aren't huge, when stocked on heap (global vars) is both unexpected, and not handled by the linker as an error condition, when arguably it should be.

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 11, 2019

further discussion https://gitter.im/espressif/arduino-esp32?at=5c8557a7d1e7281f09087d91
If these problems really require standalone code to diagnose, I can try to make shorter code for this bug, but from what I've heard, these are well known problems.

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 11, 2019

I think this bug is tangentially related although not a dupe #1297

@projectgus

This comment has been minimized.

Copy link
Member

projectgus commented Mar 14, 2019

Hi @marcmerlin ,

Most of this behaviour is explained by a restriction in ESP-IDF on static memory size. Unfortunately, the maximum static binary RAM usage is quite a bit less than the runtime available RAM. I replied on the forum with more details, here:
https://esp32.com/viewtopic.php?f=19&t=9612&p=40109#p40109

if I allocate a 12KB array on the stack (inside the function), the program just dies with no warning or traceback because the stack gets smashed. The compiler/linker should be able to detect that the array is too big for whatever stack is going to be configured and give some kind of error when the final binary is generated
I totally understand that if the stack happens to be 10KB and I allocate 9KB, depending on what's in it, I may still smash the stack without the linker being easily able to know it, but if it's compiling code where I try to store 11KB in a 10KB stack, it should immediately be able to tell that it's not going to work.

There's two possible ways for stack to overflow here:

  • Stack smashing. ie you allocate an array of size 1KB on the stack but write to 2KB of it. This is kind-of-detectable, usually at runtime only, with a flag like -fstack-protector. We offer this option in ESP-IDF, not sure if it's enabled in the Arduino config, but it's not the problem described here.

  • Simple stack overflow - you allocate an array of size 12KB on the stack but your task stack size is only 8KB. It's not possible for a C compiler/linker to know much memory is allocated to the stack of the FreeRTOS task that the code happens to be running in. We do some tricks to try and detect stack overflows at runtime during context switches, but this code also has to run in the microcontroller - so if the stack overflow has corrupted something else important, it can get stuck before this check can run.[*] I'm guessing this is what happened here.

[*] I think on "big" OSes you can use -fstack-limit to detect this at runtime more reliably than the checking we do now. I'll look into whether we can support this in ESP-IDF.

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 14, 2019

Hi @projectgus , nice to hear from you.
So, in my previous array in the stack issue, I was unclear. I have gotten an ESP32 to crash by simply allocating a 12KB array in the stack. My code didn't overrun the array, it just blew up because the stack couldn't contain said array. After countless hours lost over many days, someone told me the stack was likely too small and I moved the array as a global, and everything worked.
As for the single stack overflow, the compiler/linker comes in a suite of files from esp32 git. Why does it not know what the stack is going to be at compile time and tell the linker not to allow allocations that won't fit? Or are you saying FreeRTOS chooses how big the stack is at runtime and there is no way to know at compile time what it'll be?

Then, you addressed 2) in my original bug which basically seems to be a FreeRTOS shortcoming that it can't allocate bigger static arrays even though there is many times that amount of RAM still free. It was not clear from your reply whether this can or will be fixed?
As a dev, it's quite a pain to have to modify existing libraries in to find all the arrays and move them to mallocs.

But for 3) espressif/esp-idf#1934 (comment) documents what the limitations seem to be at this time like you said (thanks) but is there an existing bug to handle the (apparently new) case of arrays fitting at compile time but taking enough space that FreeRTOS crashes at startup? (assert on malloc fail as you pointed out).
If not, should this bug be renamed to become the bug to track this one issue?
Thanks, Marc

@projectgus

This comment has been minimized.

Copy link
Member

projectgus commented Mar 14, 2019

Hi @marcmerlin ,

So, in my previous array in the stack issue, I was unclear. I have gotten an ESP32 to crash by simply allocating a 12KB array in the stack. My code didn't overrun the array, it just blew up because the stack couldn't contain said array. After countless hours lost over many days, someone told me the stack was likely too small and I moved the array as a global, and everything worked.

I understand. Of the two kinds of stack bugs I mentioned, this is the second type. I just mentioned the first type for clarity, because people often confuse "stack smashing" (first type) with "valid C code uses too much stack memory for the runtime environment" (second type).

As for the single stack overflow, the compiler/linker comes in a suite of files from esp32 git. Why does it not know what the stack is going to be at compile time and tell the linker not to allow allocations that won't fit? Or are you saying FreeRTOS chooses how big the stack is at runtime and there is no way to know at compile time what it'll be?

Yes, and yes.

C, as a language, only thinks about stack as a fairly abstract concept. You can make gcc warn if a function uses more than a fixed X bytes of stack, but how does the compiler know how many bytes of stack are available when that function is called?

FreeRTOS is a multitasking RTOS, so every task has its own stack with an individual stack size. By default, the stack for each task is allocated by malloc() from the heap. For Arduino, the task that runs loop() is allocated 8KB of stack: https://github.com/espressif/arduino-esp32/blob/master/cores/esp32/main.cpp#L33

Possibly an advanced static analyzer could look at the possible tasks which could run which possible functions in the call-graph at link time and come up with a foolproof stack usage threshold for every task. Such a static analyzer does not exist for ESP32, and I'm not aware of something like this on other platforms either. :(

Then, you addressed 2) in my original bug which basically seems to be a FreeRTOS shortcoming that it can't allocate bigger static arrays even though there is many times that amount of RAM still free. It was not clear from your reply whether this can or will be fixed?
As a dev, it's quite a pain to have to modify existing libraries in to find all the arrays and move them to mallocs.

The shortcoming is on the ESP32 side not the FreeRTOS side, it has to do with the chip's RAM layout (static RAM has to be contiguous for GNU ld to link into it, free DRAM at ESP32 reset time is not all contiguous). The limit is on the total size of static .data+.bss, not the maximum size of any one array.

Agree this is a pain for developers who want to use static allocation. It's currently unclear if it can be fixed, I have a branch where it's mostly fixed but there are still a couple of "gotchas" to work through, which may be intractable.

But for 3) espressif/esp-idf#1934 (comment) documents what the limitations seem to be at this time like you said (thanks) but is there an existing bug to handle the (apparently new) case of arrays fitting at compile time but taking enough space that FreeRTOS crashes at startup? (assert on malloc fail as you pointed out).
If not, should this bug be renamed to become the bug to track this one issue?

Good point. I'd rather not reuse this issue because of the number of other things covered here. Can you please open a new issue on the esp-idf project that describes this single case? Feel free to link back here for context.

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 20, 2019

Thanks @projectgus for the comments and explanations.
From what you said though, is it correct that there is an upper bound to the amount of stack that can be?
I realize that it can be smaller than expected at runtime, but if we know it can't be more than 8KB or somesuch, can the compiler at least detect any allocation that will just not fit in the stack?
It won't catch a 4KB allocation in a stack that could have been 8KB ends up only being 3KB at runtime, but catching some is better than nothing?
As for the problem of static vs dynamic allocations, I guess it is what it is.

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 20, 2019

@projectgus opened this new bug as requested espressif/esp-idf#3211
You can then decide whether anything can be improved for detecting arrays on the stack being way too big, being detectable at runtime, or not.

@projectgus

This comment has been minimized.

Copy link
Member

projectgus commented Mar 20, 2019

Hi @marcmerlin ,

Thanks for opening the linked issue. Will get back to you ASAP on this.

can the compiler at least detect any allocation that will just not fit in the stack?
It won't catch a 4KB allocation in a stack that could have been 8KB ends up only being 3KB at runtime, but catching some is better than nothing?

It's possible to pass the -Wstack-usage=len option to gcc which warns if a function may use more than LEN bytes of stack. Finding a useful value for this is hard, there's unlikely to be a practical one-size-fits-all value (task stacks could be 8KB like they are for loop(), but you can equally create a task with a 100KB stack if you have enough free heap at runtime.) So it might be something we can turn on as an option in ESP-IDF, but I don't know about Arduino (where configuration like this has to be fixed).

@marcmerlin

This comment has been minimized.

Copy link
Author

marcmerlin commented Mar 20, 2019

@projectgus thanks for your answer.
Obviously I don't know much about the gory details of what happens behind the scenes, so my wishful thinking may not match reality :)
@me-no-dev or you can close this if it's not reasonably further actionable in the arduino layer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.