-
Notifications
You must be signed in to change notification settings - Fork 7.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TW#16112] nvs_get_str() seams to causes lock (reproducable on arduino-esp32) #1157
Comments
I think this was fixed in b013f5d, but I am not sure which IDF commit you are using. Can you please mention the commit ID? |
One additional piece of Information: after several crashes it appears that the ESP32 does not boot. Only the following can than be seen in the log-output:
|
Could you please try the attached patch? Based on my testing, this solves the issue you have found. There are still periods of time when loop_task doesn't run because coreTask is taking all the CPU time (for a few seconds), but there is no more deadlock, eventually TLS connection fails in coreTask, and loop_task runs again. From 0e65e67480b5f5a6f452aa785a6ec64764a78866 Mon Sep 17 00:00:00 2001
From: Ivan Grokhotkov <ivan@espressif.com>
Date: Mon, 30 Oct 2017 18:53:39 +0800
Subject: [PATCH] spi_flash: raise priority of the task performing spi_flash
operation
Fixes https://github.com/espressif/arduino-esp32/issues/740
---
components/spi_flash/cache_utils.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/components/spi_flash/cache_utils.c b/components/spi_flash/cache_utils.c
index 53a0caf8e..cc1f56929 100644
--- a/components/spi_flash/cache_utils.c
+++ b/components/spi_flash/cache_utils.c
@@ -110,6 +110,11 @@ void IRAM_ATTR spi_flash_disable_interrupts_caches_and_other_cpu()
assert(other_cpuid == 1);
spi_flash_disable_cache(other_cpuid, &s_flash_op_cache_state[other_cpuid]);
} else {
+ // Temporarily raise current task priority to prevent a deadlock while
+ // waiting for IPC task to start on the other CPU
+ TaskHandle_t self = xTaskGetCurrentTaskHandle();
+ int old_prio = uxTaskPriorityGet(self);
+ vTaskPrioritySet(self, configMAX_PRIORITIES - 1);
// Signal to the spi_flash_op_block_task on the other CPU that we need it to
// disable cache there and block other tasks from executing.
s_flash_op_can_start = false;
@@ -121,6 +126,8 @@ void IRAM_ATTR spi_flash_disable_interrupts_caches_and_other_cpu()
}
// Disable scheduler on the current CPU
vTaskSuspendAll();
+ // Can now set the priority back to the normal one
+ vTaskPrioritySet(self, old_prio);
// This is guaranteed to run on CPU <cpuid> because the other CPU is now
// occupied by highest priority task
assert(xPortGetCoreID() == cpuid);
--
2.13.5 (Apple Git-94)
|
@igrr after this patch the board no longer freezes. But it seams that the HTTPS-Task eventually ceases to run, I have yet to find out where. EDIT: Disabling the scheduler and changing the priority and back seams a little bit harsh/complicated. Wouldn't some kind of lock or a critical section also work here? |
The scheduler needs to be disabled to prevent switching to other tasks during flash operations. Other tasks may be (and probably are) running code from cache, but the cache is disabled during flash operation. So a task switch during flash operation would likely cause an immediate crash. Effectively, we want the task doing the flash operation to be temporary elevated to highest priority. There is no "normal" way to do this in FreeRTOS, using locks and critical sections only. |
Thanks for the insights Ivan!
That patch you've posted workes fine now in production for weeks now. From
my side it's fine. If you have no objections I'd love to see that fix
upstream!
Am 08.12.2017 11:28 schrieb "Ivan Grokhotkov" <notifications@github.com>:
The scheduler needs to be disabled to prevent switching to other tasks
during flash operations. Other tasks may be (and probably are) running code
from cache, but the cache is disabled during flash operation. So a task
switch during flash operation would likely cause an immediate crash.
Effectively, we want the task doing the flash operation to be temporary
elevated to highest priority. There is no "normal" way to do this in
FreeRTOS, using locks and critical sections only.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#1157 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AC_DS2DLMFULncwoE04x82Op7WowGjnhks5s-Q9CgaJpZM4QDWH7>
.
|
@igrr While comparing my branch with master: this still does not seam to be upstream right? |
Sorry for the delay @Curclamas, I've pushed this patch for review now. Should be in IDF master branch soon. |
Thank you very much!
…On Mon, Jun 25, 2018 at 11:11 AM Ivan Grokhotkov ***@***.***> wrote:
Sorry for the delay @Curclamas <https://github.com/Curclamas>, I've
pushed this patch for review now. Should be in IDF master branch soon.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1157 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AC_DS6YcBBUFUPPQwmYAOA5urazEv9Juks5uAKk_gaJpZM4QDWH7>
.
|
I know I am resurrecting this, but I don't see this change propageted into esp-idf and I am experiencing the same issue with the above fix resolving the issue. |
The fix is for the situation when cache disabling mechanism causes a deadlock with user tasks. Situation is as follows: 1. spi_flash operation is started from low-priority task on CPU0 2. It uses IPC to wake up high-priority IPC1 task on CPU1, preventing all other tasks on CPU1 from running. This is needed to safely disable the cache. 3. While the task which started spi_flash operation is waiting for IPC1 task to acknowledge that CPU1 is not using cache anymore, it is preempted by a higher priority application task ("app0"). 4. Task app0 busy-waits for some operation on CPU1 to complete. But since application tasks are blocked out by IPC1 task, this never happens. Since app0 is busy-waiting, the task doing spi flash operation never runs. The more or less logical soltion to the problem would be to also do cache disabling on CPU0 and the SPI flash operation itself from IPC0 task. However IPC0 task stack would need to be increased to allow doing SPI flash operation (and IPC1 stack as well). This would waste some memory. An alternative approach adopted in this fix is to call FreeRTOS functions to temporary increase the priority of SPI flash operation task to the same level as the IPC task. Fixes espressif/arduino-esp32#740 Fixes espressif/esp-idf#1157
Description
This error is reproducible on custom boards as well as ESP32-DevBoard:
When nvs_get_str() is called and there is another task that runs on higher priority, the ESP32 can inadvertently lock up.
Reproduction
The error is described and reproduced in arduino-esp32-issue: espressif/arduino-esp32#740
It seams to be rooted in IDF though.
The text was updated successfully, but these errors were encountered: