Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird IllegalInstruction errors (need to add -Werror=return-type) (IDFGH-6602) #8244

Closed
eldendiss opened this issue Jan 16, 2022 · 11 comments
Closed
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally

Comments

@eldendiss
Copy link

Environment

  • Development Kit: [none]
  • Module or chip used: [ESP32-WROOM-32D]
  • IDF version: v4.3.2
  • Build System: [idf.py]
  • Compiler version: 8.4.0
  • Operating System: [Windows]
  • (Windows only) environment type: [ESP Command Prompt].
  • Using an IDE?: [Yes - vscode]
  • Power Supply: [USB]

Problem Description

My application is calling function declared and defined in separate component. This should take care of i2c communication with GPIO expanders. However i always get some weird errors when calling two specific functions.

I tried:

  • expanding main task size
  • commenting out lines of code, up to the point when the function does nothing
  • measuring 3.3V rail with oscilloscope, to rule out PSU problems
  • checking that i dont use the pins dedicated for onboard SPI flash
  • Another board

Expected Behavior

Call the function and run the code obviously.

Actual Behavior

Calling function from main task results in Core panic Illegalnstruction with nothing insightful in register dump (last address executed is the address of called function - exception decoder points this to the line of code where function is defined. When i comment out everything in function, so it just returns without doing anything, i get LoadProhibited error, again with nothing insightful being reported by register dump - it reports last executed command as a class initializer, which is not even called in that function.
If I comment out that initializer, it again proceeds with error IllegalInstruction

Code to reproduce this issue

#include "adc-api.h"
extern "C" void app_main(void)
{
    while(1){
        vTaskDelay(1000/portTICK_PERIOD_MS);
        setHeatRange(R33,CHAN2);
    }
}

adc-api.h

enum heat_range_e {
    R33,
    R100,
    R330,
    R10
};

enum channels_e {
    CHAN1,
    CHAN2,
    CHAN3,
    CHAN4,
    CHAN_MAX
};

extern esp_err_t setHeatRange(heat_range_e resi, channels_e channel);

adc-api.cpp

esp_err_t setHeatRange(heat_range_e resi, channels_e channel)
{
    return ESP_OK;
}

Debug Logs

Guru Meditation Error: Core  0 panic'ed (IllegalInstruction). Exception was unhandled.
Memory dump at 0x400e6210: 00004136 00004136 f01d0070
0x400e6210: setHeatRange(heat_range_e, channels_e) at D:\STU\repos\model-gas\build/../components/adc-api/adc-api.cpp:122

Core  0 register dump:
PC      : 0x400e6216  PS      : 0x00060630  A0      : 0x800d6b30  A1      : 0x3ffc69c0
0x400e6216: esp_pm_impl_waiti at C:/Users/Dendo/esp/esp-idf/components/esp_pm/pm_impl.c:814

A2      : 0x00000000  A3      : 0x020035ac  A4      : 0x3f4056f0  A5      : 0x000004c2
A6      : 0x3f402810  A7      : 0x3ffe3b80  A8      : 0x8008713e  A9      : 0x3ffc69a0
A10     : 0x00000000  A11     : 0x000000be  A12     : 0x3ffb47c4  A13     : 0x3ffb3f60  
A14     : 0x00000005  A15     : 0x3ffb3f8c  SAR     : 0x00000004  EXCCAUSE: 0x00000000
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff

Backtrace:0x400e6213:0x3ffc69c0 0x400d6b2d:0x3ffc69e0 0x400e69c8:0x3ffc6a00 0x400887e1:0x3ffc6a20
0x400e6213: setHeatRange(heat_range_e, channels_e) at ??:?

0x400d6b2d: app_main at D:\STU\repos\model-gas\build/../main/main.cpp:57 (discriminator 1)

0x400e69c8: main_task at C:/Users/Dendo/esp/esp-idf/components/freertos/port/port_common.c:133 (discriminator 2)

0x400887e1: vPortTaskWrapper at C:/Users/Dendo/esp/esp-idf/components/freertos/port/xtensa/port.c:168

Other items if possible

sdkconfig-elf-coredump.zip

@espressif-bot espressif-bot added the Status: Opened Issue is new label Jan 16, 2022
@github-actions github-actions bot changed the title Weird IllegalInstruction errors Weird IllegalInstruction errors (IDFGH-6602) Jan 16, 2022
@eldendiss
Copy link
Author

Update: changing optimization level in sdkconfig to debug without optimization -O0 fixes the issue.

@AxelLin
Copy link
Contributor

AxelLin commented Jan 17, 2022

I do believe there are some problems with -O0 in esp-idf. #7835

@eldendiss
Copy link
Author

I haven't stumbled upon problems with -O0 yet, but I'm not using nvs for now. But I will watch out for that issue, and report it, if I will be able to reproduce it.

@igrr
Copy link
Member

igrr commented Jan 17, 2022

Hi @eldendiss, thank you for reporting this issue!
Looking at the .elf file you have shared, the code generated for setHeatRange(heat_range_e, channels_e) function definitely doesn't look correct.
Could you please share your project (or part of it sufficient to compile this file) with Espressif to help us reproduce and debug this issue? My email is ivan at espressif.com.

(I wasn't able to reproduce the issue based on the snippets in the issue description, unfortunately. I get the following compiled code after trying the code snippets in the issue description:

Dump of assembler code for function setHeatRange(heat_range_e, channels_e):
   0x400e7778 <+0>:	entry	a1, 48
   0x400e777b <+3>:	mov.n	a7, a1
   0x400e777d <+5>:	s32i.n	a2, a7, 0
   0x400e777f <+7>:	s32i.n	a3, a7, 4
   0x400e7781 <+9>:	movi.n	a2, 0
   0x400e7783 <+11>:	retw.n

while in the ELF file you have attached it looks like this:

400e6210 <_Z12setHeatRange12heat_range_e10channels_e>:
400e6210:       004136          entry   a1, 32
        ...

400e6214 <esp_pm_impl_waiti>:
400e6214:       004136          entry   a1, 32
400e6217:       007000          waiti   0
400e621a:       f01d            retw.n

)

@eldendiss
Copy link
Author

eldendiss commented Jan 17, 2022

Hi, yes I'm working on it right now
@igrr this should be compilable, issue is still present when optimization is enabled and disappears when -O0 flag is used.
I attached compiled elfs for both cases, as well as logs and coredump during crash.

IDFGH-6602.zip
.

@igrr
Copy link
Member

igrr commented Jan 17, 2022

Thank you very much @eldendiss, I was able to reproduce this. I found that your function is declared as returning esp_err_t, but in fact there is no return statement in this function.

This behavior of GCC was noted some time ago in esp8266/Arduino project, see esp8266/Arduino#8165.

According to GCC manual, in C++ code omitting the return statement is undefined behavior:

https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#:~:text=Unlike%20in%20C%2C%20in%20C%2B%2B%2C%20flowing%20off%20the%20end%20of%20a%20non%2Dvoid%20function%20other%20than%20main%20results%20in%20undefined%20behavior%20even%20when%20the%20value%20of%20the%20function%20is%20not%20used

Unlike in C, in C++, flowing off the end of a non-void function other than main results in undefined behavior even when the value of the function is not used.

GCC manual also notes:

This warning is enabled by default in C++ and by -Wall otherwise.

So you would get a warning from GCC about this, if it was not for this line in your project CMakeLists.txt line:

idf_build_set_property(COMPILE_OPTIONS "-w" APPEND)

which disables all the compiler warnings :(

So the moral of this issue is to not disable the compiler warnings, since they can save quite some time debugging weird issues!

Please note that this behavior is not unique to Xtensa architecture or to Espressif's cross-compiler toolchain. Here is an example of the same thing happening with GCC for ARM: https://godbolt.org/z/j5fvx9n1a
There are two functions in the example, func and bad_func. The former returns correctly, and you see that in the assembly output it returns to the caller correctly (pop {r3, pc}). The latter doesn't have any instructions which implement return from the function and will crash in a way similar to what you have observed.

@eldendiss
Copy link
Author

Yes, sorry I totally missed that :) I also missed -w compiler flag was set. Mb, but maybe it will help someone with similar issue in the future.
Thank you for your time, I will be closing this issue, as it now works perfectly and it is more of a GCC issue, though maybe there is a reason for it to throw a warning instead of an error (even when you expect, that syntax resulting in undefined behaviour should be treated as error).

Anyway thank you again for your help.

@igrr
Copy link
Member

igrr commented Jan 17, 2022

though maybe there is a reason for it to throw a warning instead of an error (even when you expect, that syntax resulting in undefined behaviour should be treated as error).

This is a good suggestion. I agree we should add -Werror=return-type by default. We probably can't do this for the existing release branches, but we will change this for the master branch.

I'll keep the issue open to track this.

@igrr igrr reopened this Jan 17, 2022
@igrr igrr changed the title Weird IllegalInstruction errors (IDFGH-6602) Weird IllegalInstruction errors (need to add -Werror=return-type) (IDFGH-6602) Jan 17, 2022
@igrr
Copy link
Member

igrr commented Jan 17, 2022

Okay, I should have read more attentively.

This warning is enabled by default in C++ and by -Wall otherwise.

ESP-IDF already compiles all the code with -Wall -Werror=all by default. So this warning already produces an error, if detected.

The issue in this case was that -w disables all the warnings, including those which would be otherwise treated as errors. So there is not much to do here other than to produce a big scary message at the end of the build if -w is detected in the compiler flags.

@igrr igrr closed this as completed Jan 17, 2022
@espressif-bot espressif-bot added Resolution: Won't Do This will not be worked on Status: Done Issue is done internally and removed Status: Opened Issue is new labels Jan 17, 2022
@eldendiss
Copy link
Author

eldendiss commented Jan 17, 2022

Yes sure, it is indeed an user error. I just disabled warnings globally because when I was debugging I had full log of missing initializer warnings.
My point was more like that it could be treated as implicit error, like when u miss a bracket, or declare function with different return type than it's definition.
And as I said this is GCC issue not an esp-idf one, so nothing can be done from your side.
I also doesn't see much into how GCC syntax checking work and if it has a reason to be implemented the way it is.

Maybe just a suggestion, to add this somewhere in docs, so users should look for problems with their code in addition to HW problems (when I searched for IllegalInstruction it pointed me in the direction of HW problem with SPI, or PSU voltage drops). When I found out that HW is not the reason for crashing, I began to suspect an idf/compiler bug. Fixing it with disabled optimization made me just more sure of it being a bug.
And as it turns out it was indeed a bug, but a user code one :).

@igrr
Copy link
Member

igrr commented Jan 17, 2022

We have this section in the docs about troubleshooting IllegalInstruction exception, I will add a paragraph there to mention the situation you have encountered.
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/fatal-errors.html#illegal-instr-msg

espressif-bot pushed a commit that referenced this issue Jan 18, 2022
In C++ code it is considered to be undefined behavior to exit a
non-void function without returning a value. Normally this is
detected by the compiler, but users could disable relevant warnings.
Add a note about this possibility.

See #8244 for context.
dskulina pushed a commit to playable-tech/esp-idf that referenced this issue Feb 4, 2022
In C++ code it is considered to be undefined behavior to exit a
non-void function without returning a value. Normally this is
detected by the compiler, but users could disable relevant warnings.
Add a note about this possibility.

See espressif#8244 for context.
dskulina pushed a commit to playable-tech/esp-idf that referenced this issue Feb 5, 2022
In C++ code it is considered to be undefined behavior to exit a
non-void function without returning a value. Normally this is
detected by the compiler, but users could disable relevant warnings.
Add a note about this possibility.

See espressif#8244 for context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

4 participants