Cache-disabled-panic in xTaskGenericNotifyFromISR while all in IRAM (IDFGH-5039) #6825

Closed
boborjan2 opened this issue Apr 5, 2021 · 5 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@boborjan2

Environment

  • Module or chip used: [ESP32-WROOM-32E]
  • IDF version (run git describe --tags to find it): v4.3-dev-3175-g9a2d25191
  • Build System: [Make]
  • Compiler version (run xtensa-esp32-elf-gcc --version to find it): (crosstool-NG esp-2020r3) 8.4.0
  • Operating System: [Linux]
  • Using an IDE?: [No]
  • Power Supply: [USB]

Problem Description

I get the following panic:

Re-enable cpu cache.

Guru Meditation Error: Core 0 panic'ed (Cache disabled but cached memory region accessed).

Core 0 register dump:
PC : 0x4008a871 PS : 0x00060034 A0 : 0x800838e1 A1 : 0x3ffb0a00
A2 : 0x3ffbd814 A3 : 0x00100000 A4 : 0xbad00bad A5 : 0x00000002
A6 : 0x3ffb0a20 A7 : 0x3ffb0498 A8 : 0x00000002 A9 : 0x3f400364
A10 : 0x3ffb49f0 A11 : 0x3ffb33b0 A12 : 0x0000153e A13 : 0x3f400394
A14 : 0x3ffb3418 A15 : 0x3ffc3450 SAR : 0x0000001b EXCCAUSE: 0x00000007

EXCVADDR: 0x00000000 LBEG : 0x4008192d LEND : 0x40081935 LCOUNT : 0x00000027
Backtrace:0x4008a86e:0x3ffb0a00 0x400838de:0x3ffb0a20 0x40083964:0x3ffb0a50 0x400871b1:0x3ffb0a70 0x400871da:0x3ffb0a90 0x400820c1:0x3ffb0ab0 0x4008a0ed:0x3ffc3430 0x4008281d:0x3ffc3450 0x40082912:0x3ffc3470 0x40087831:0x3ffc3490 0x40083371:0x3ffc34b0 0x400d776c:0x3ffc34f0 0x40143e32:0x3ffc3520 0x40142d45:0x3ffc3540 0x4014352b:0x3ffc3560 0x40142392:0x3ffc35f0 0x40142647:0x3ffc3670 0x40144057:0x3ffc36e0 0x40141d67:0x3ffc3700 0x40112cd5:0x3ffc3730 0x40111158:0x3ffc3750 0x40120879:0x3ffc3770 0x4011c1bf:0x3ffc3820 0x401655fb:0x3ffc3840 0x4016bd24:0x3ffc3860 0x4016c6d2:0x3ffc3880 0x4016c928:0x3ffc38a0 0x4008fca1:0x3ffc38c0 0x4008f7e1:0x3ffc38f0 0x4008fa85:0x3ffc3920 0x401228d3:0x3ffc3960 0x401230cc:0x3ffc3990 0x40091096:0x3ffc39d0

ELF file SHA256: db62524c156d5e64

Rebooting...

The backtrace is:
0x4008a830 T xTaskGenericNotifyFromISR
0x400838cc T mainNotifyFromISR
0x40083914 t cf_code_in_gpio_isr
0x40087178 t gpio_isr_loop
0x400871b8 t gpio_intr_service
0x40082034 t _xt_medint3
0x4008a0d4 T xTaskGetCurrentTaskHandleForCPU
0x40082814 t cache_enable
0x4008290c t spi1_end
0x40087824 t spiflash_end_default
0x400832a0 T esp_flash_read
0x400d76fc T _fseeko_r
0x40143e1c T dac_output_disable
0x40142d3c T _ZN3nvs8HashListD2Ev
0x4014338c T rtc_sleep_init
0x4014236c T _ZN3nvs4Page8readItemEhNS_8ItemTypeEPKcPvjhNS_9VerOffsetE
0x40142484 T _ZN3nvs4Page15mLoadEntryTableEv
0x40143f80 T rmt_tx_start
0x40141d30 T _ZN3nvs4Page9copyItemsERS0_
0x40112c40 T ieee80211_tx_mgt_cb
0x40110fc4 T ieee80211_ioctl
0x40120860 T ic_disable_crypto
0x4011c18c T chm_start_op
0x4016536c T wps_enrollee_get_msg
0x4016bce4 T wps_process_authenticator
0x4016c544 T wps_parse_msg
0x4016c920 t wps_process_serial_number$isra$6
0x4008fc34 T ppProcTxCallback
0x4008f7d8 T lmacTxDone
0x4008f898 T lmacEndFrameExchangeSequence
0x4012288c T pm_go_to_wake
0x401230a4 T pm_tx_null_data_done_process
0x40090f48 T ppTask

It happens in a GPIO interrupt handler (placed in IRAM). Reproducible with v4.3; never seen under heavy testing with v4.2.
Map file shows: (I can upload if needed)
0x0000000040080000 _iram_start = ABSOLUTE (.)
...
0x000000004009538c _iram_end = ABSOLUTE (.)

This is the disassembly of the code around PC: (in function xTaskGenericNotifyFromISR)

5458 pxTCB = xTaskToNotify;
5459
5460 taskENTER_CRITICAL_ISR(&xTaskQueueMutex);
0x4008a83c <+12>: 71 cc d8 l32r a7, 0x40080b6c
0x4008a83f <+15>: ad 07 mov.n a10, a7
0x4008a841 <+17>: 65 99 01 call8 0x4008c1d8

5461 {
5462 if( pulPreviousNotificationValue != NULL )
0x4008a844 <+20>: 16 85 00 beqz a5, 0x4008a850 <xTaskGenericNotifyFromISR+32>

5463 {
5464 *pulPreviousNotificationValue = pxTCB->ulNotifiedValue;
0x4008a847 <+23>: c0 20 00 memw
0x4008a84a <+26>: 82 22 57 l32i a8, a2, 0x15c
0x4008a84d <+29>: 82 65 00 s32i a8, a5, 0

5465 }
5466
5467 ucOriginalNotifyState = pxTCB->ucNotifyState;
0x4008a850 <+32>: 92 d2 01 addmi a9, a2, 0x100
0x4008a853 <+35>: c0 20 00 memw
0x4008a856 <+38>: 82 09 60 l8ui a8, a9, 96
0x4008a859 <+41>: 0c 25 movi.n a5, 2
0x4008a85b <+43>: c0 20 00 memw
0x4008a85e <+46>: 52 49 60 s8i a5, a9, 96
0x4008a861 <+49>: 80 80 74 extui a8, a8, 0, 8

5468 pxTCB->ucNotifyState = taskNOTIFICATION_RECEIVED;
5469
5470 switch( eAction )
0x4008a864 <+52>: f6 54 39 bgeui a4, 5, 0x4008a8a1 <xTaskGenericNotifyFromISR+113>
0x4008a867 <+55>: 91 09 d9 l32r a9, 0x40080c8c
0x4008a86a <+58>: e0 44 11 slli a4, a4, 2
0x4008a86d <+61>: 4a 49 add.n a4, a9, a4
0x4008a86f <+63>: 48 04 l32i.n a4, a4, 0
0x4008a871 <+65>: a0 04 00 jx a4

5471 {
5472 case eSetBits :
5473 pxTCB->ulNotifiedValue |= ulValue;
0x4008a874 <+68>: c0 20 00 memw
0x4008a877 <+71>: 42 22 57 l32i a4, a2, 0x15c
0x4008a87a <+74>: 30 34 20 or a3, a4, a3
0x4008a87d <+77>: c0 20 00 memw
0x4008a880 <+80>: 32 62 57 s32i a3, a2, 0x15c

The panic occurs at the switch: the jx a4 instruction at 0x4008a871 matches the PC in the register dump. Everything seems to be OK otherwise.
The FreeRTOS functions are in IRAM.
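
A quick way to sanity-check addresses from the register dump against the map-file bounds quoted earlier is sketched below. The helper names are hypothetical. The flash-mapped data window (0x3F400000 up to 0x3F800000) is taken from the ESP32 memory map as an assumption, not from this report. With these ranges, PC (0x4008a871) lies inside IRAM, while A9 (0x3f400364) lies inside the flash-mapped window. In the disassembly, A9 appears to be the base address that the l32i.n load reads from just before jx a4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IRAM bounds copied from the map file excerpt in the report. */
#define IRAM_START 0x40080000u /* _iram_start */
#define IRAM_END   0x4009538cu /* _iram_end, treated as exclusive */

/* Assumption: ESP32 data-bus window that maps external flash via the cache. */
#define FLASH_DATA_START 0x3f400000u
#define FLASH_DATA_END   0x3f800000u /* exclusive */

/* Half-open range check: lo <= addr < hi. */
static bool in_range(uint32_t addr, uint32_t lo, uint32_t hi)
{
    assert(lo < hi);
    return addr >= lo && addr < hi;
}

/* True if the address lies inside the IRAM section reported by the map file. */
bool addr_in_iram(uint32_t addr)
{
    return in_range(addr, IRAM_START, IRAM_END);
}

/* True if the address is read through the flash cache (unsafe while cache is off). */
bool addr_in_flash_data(uint32_t addr)
{
    return in_range(addr, FLASH_DATA_START, FLASH_DATA_END);
}
```

Calling addr_in_iram(0x4008a871u) returns true and addr_in_flash_data(0x3f400364u) returns true, which would be consistent with code in IRAM reading data that lives in flash.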

Expected Behavior

No crash.

Any help is welcome. No issues in idf v4.2.
Thanks
Viktor

@espressif-bot espressif-bot added the Status: Opened Issue is new label Apr 5, 2021
@github-actions github-actions bot changed the title Cache-disabled-panic in xTaskGenericNotifyFromISR while all in IRAM Cache-disabled-panic in xTaskGenericNotifyFromISR while all in IRAM (IDFGH-5039) Apr 5, 2021
@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Apr 9, 2021
@ESP-Marius
Collaborator

One possibility is that the switch-case gets optimized to a jump table, which is placed in flash.

Could you try adding tasks.o: CFLAGS += -fno-jump-tables -fno-tree-switch-conversion to the end of components/freertos/component.mk and see if that helps?
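
For illustration, here is a minimal sketch of the kind of construct in question: a dense switch over five consecutive values, similar in shape to the one in the disassembly above. The type and function names are hypothetical stand-ins, not the FreeRTOS source. With optimization enabled, GCC may lower such a switch to an indirect jump through a table of code addresses, and that table is read-only data that the linker can place separately from the function itself. With -fno-jump-tables the compiler emits compare-and-branch sequences instead, so no table lookup is needed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the five notify actions handled by the switch. */
typedef enum {
    ACT_NONE = 0,
    ACT_SET_BITS,
    ACT_INCREMENT,
    ACT_SET_WITH_OVERWRITE,
    ACT_SET_WITHOUT_OVERWRITE
} action_t;

/*
 * Dense switch over consecutive case values. This is a typical candidate
 * for jump-table lowering; the behavior is the same whether the compiler
 * uses a table or a chain of branches.
 */
uint32_t apply_action(uint32_t current, action_t action,
                      uint32_t value, int already_pending)
{
    switch (action) {
    case ACT_SET_BITS:
        return current | value;
    case ACT_INCREMENT:
        return current + 1u;
    case ACT_SET_WITH_OVERWRITE:
        return value;
    case ACT_SET_WITHOUT_OVERWRITE:
        /* Keep the old value if a notification is already pending. */
        return already_pending ? current : value;
    case ACT_NONE:
    default:
        return current;
    }
}
```

Comparing the disassembly of such a function with and without the two flags is one way to see whether an indirect jump (jx on Xtensa) is generated.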

@boborjan2
Author

According to the disassembly, the jump table should be in IRAM as well. I will give it a try nevertheless.

@boborjan2
Author

boborjan2 commented Apr 9, 2021

I confirm it is the switch jump table. With the compiler flags modified, the issue is gone (0 crashes in 120 tries, versus about 1 in every 5 before).
This is really bad, though: any file may contain IRAM code with switches (or other constructs that produce a jump table). Have the compiler flags changed since 4.2? Something seems to have gone wrong with the makefiles. I understand CMake is the default build system now, but the legacy codebase is IMHO too large to leave Make unsupported.

Yet another IRAM issue while we are at it (though this one is surely not build system related): #6824

Thanks for the help,
Viktor

@ESP-Marius
Collaborator

As far as I know, nothing has changed with the compile flags, but maybe the new toolchain is more aggressive with this optimization? We are tracking this issue internally to resolve it more globally; like you say, there are lots of potential switch-case issues like this.

We definitely plan to support makefiles for ESP32 in 4.3, and I'm sorry for the regressions. Unfortunately, most people have switched to the new CMake build system, which makes discovering errors in the Make system more difficult. As for this issue, I don't think it's a Make vs. CMake issue; I suspect the same problem could occur when using CMake.

@ESP-Marius
Collaborator

After ee2f8b1 these optimizations are disabled by default to prevent issues like this. They can be enabled selectively for files where they actually provide a noticeable performance benefit.

Closing the issue.

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: In Progress Work is in progress labels Jul 19, 2021

3 participants