MbedOS Error Status: 0x80FF013D Code: 317 Module: 255 with wait(), always occurs after programming or reset, never occurs after power cycle #10339

Hoel · 2019-04-08T08:21:37Z

I am encountering a fault exception / Mbed error caused by the wait() function. For some reason the error only occurs when the project is compiled and run from SW4STM32; the same project compiled from the CLI does not show the error. It also never occurs again once the target has been power cycled.

Here is the full error:

++ MbedOS Fault Handler ++

FaultType: HardFault

Context:
R0 : 00001000
R1 : 00000001
R2 : E000ED00
R3 : 00000000
R4 : 00000000
R5 : 00000000
R6 : 00000000
R7 : 00000000
R8 : 00000000
R9 : 00000000
R10 : 00000000
R11 : 00000000
R12 : 00000000
SP : 20001708
LR : 08006F2F
PC : 08006F30
xPSR : 61000000
PSP : 200016E8
MSP : 20007FC0
CPUID: 412FC230
HFSR : 40000000
MMFSR: 00000000
BFSR : 00000000
UFSR : 00000008
DFSR : 00000008
AFSR : 00000000
Mode : Thread
Priv : Privileged
Stack: PSP

-- MbedOS Fault Handler --
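The non-zero status registers in the dump above can be decoded by bit position. The sketch below is a host-side helper, not Mbed OS code, and every name in it is made up for illustration. It uses the bit assignments from the ARMv7-M architecture and assumes the dump prints the UFSR as a 16-bit field on its own (so that bit 3 is NOCP). It only names the bits that are set; it does not by itself explain why they are set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// One named bit of a fault status register.
struct FaultBit {
    unsigned bit;
    const char *name;
};

// Bit positions as defined by the ARMv7-M architecture (Cortex-M3/M4).
static const FaultBit kHfsrBits[] = {{1, "VECTTBL"}, {30, "FORCED"}, {31, "DEBUGEVT"}};
static const FaultBit kUfsrBits[] = {{0, "UNDEFINSTR"}, {1, "INVSTATE"}, {2, "INVPC"},
                                     {3, "NOCP"}, {8, "UNALIGNED"}, {9, "DIVBYZERO"}};
static const FaultBit kDfsrBits[] = {{0, "HALTED"}, {1, "BKPT"}, {2, "DWTTRAP"},
                                     {3, "VCATCH"}, {4, "EXTERNAL"}};

// Returns the names of all set bits joined by '|', or "none" if no known bit is set.
template <std::size_t N>
std::string decode_bits(std::uint32_t value, const FaultBit (&table)[N])
{
    std::string out;
    for (const FaultBit &entry : table) {
        assert(entry.bit < 32);
        if (value & (std::uint32_t{1} << entry.bit)) {
            if (!out.empty()) {
                out += '|';
            }
            out += entry.name;
        }
    }
    if (out.empty()) {
        return "none";
    }
    return out;
}
```

With the values from the dump, `decode_bits(0x40000000, kHfsrBits)` gives `FORCED`, `decode_bits(0x00000008, kUfsrBits)` gives `NOCP`, and `decode_bits(0x00000008, kDfsrBits)` gives `VCATCH`. MMFSR and BFSR are zero, so no memory-management or bus fault bit is reported.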

++ MbedOS Error Info ++
Error Status: 0x80FF013D Code: 317 Module: 255
Error Message: Fault exception
Location: 0x8000651
Error Value: 0x8006F30
Current Thread: rtx_idle Id: 0x200011F0 Entry: 0x8003669 StackSize: 0x200 StackMem: 0x20001538 SP: 0x20007F48
Next:
rtx_idle State: 0x2 Entry: 0x08003669 Stack Size: 0x00000200 Mem: 0x20001538 SP: 0x200016F8
Ready:
Wait:
rtx_timer State: 0x83 Entry: 0x080053B1 Stack Size: 0x00000300 Mem: 0x20001238 SP: 0x200014D0
Delay:
main State: 0x13 Entry: 0x080035AD Stack Size: 0x00001000 Mem: 0x20001790 SP: 0x200026F0
rtx_idle State: 0x2 Entry: 0x08003669 Stack Size: 0x00000200 Mem: 0x20001538 SP: 0x200016F8
For more info, visit: https://mbed.com/s/error?error=0x80FF013D&tgt=L80AB_L151CC
-- MbedOS Error Info --
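The status word in the dump packs the code and the module into fixed bit fields. A minimal sketch, assuming the low 16 bits hold the code and bits 16 to 23 hold the module; this assumption matches both status values printed in this thread (0x80FF013D here and 0x80010133 further down). The helper names are illustrative and not part of Mbed OS.

```cpp
#include <cstdint>

// Assumed layout: bits 0..15 = error code, bits 16..23 = module.
constexpr unsigned error_code(std::uint32_t status)
{
    return static_cast<unsigned>(status & 0xFFFFu);
}

constexpr unsigned error_module(std::uint32_t status)
{
    return static_cast<unsigned>((status >> 16) & 0xFFu);
}

// Checked at compile time against the values printed in this thread.
static_assert(error_code(0x80FF013Du) == 317, "code field of the fault status");
static_assert(error_module(0x80FF013Du) == 255, "module field of the fault status");
static_assert(error_code(0x80010133u) == 307, "code field of the mutex status");
static_assert(error_module(0x80010133u) == 1, "module field of the mutex status");
```

Decoding the word this way makes it easy to check that two reports refer to the same error even when only the hexadecimal status is quoted.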

Mbed is at the very latest revision; I created the project directly from the CLI yesterday.
The main.cpp only blinks an LED and prints a message on the UART; there is nothing else in it.
When the project is compiled from the CLI and uploaded with st-flash there is no error at all.
When the project is exported to SW4STM32 (CLI export with the -z option) and built / run from there, the hard fault always occurs after programming, even after a manual reset (reset tact switch). However, if I power cycle the target there is no subsequent error at all (even after multiple hardware resets). This behavior has been reproduced repeatedly and is 100% consistent.

To make sure it was not a problem with GCC, I set up the SW4STM32 PATH to point to the Mbed CLI compiler, so the exact same compiler (from theotherjimmy mbed-cli-osx-installer https://github.com/ARMmbed/mbed-cli-osx-installer/releases/tag/v0.0.10 ) is used in both cases.

I also know that the fault comes from wait(): if I remove this statement, there is no error when the project is built and run from SW4STM32.

Once again, when the exact same project is built from CLI (and run with st-flash) there is no fault at all.

The current target is an XDOT_L151CC that has been modified with an 8 MHz crystal. The set_sysclock function was generated directly by STM32CubeMX; other than that, no changes have been made to the original target files.

The issue is 100% reproducible, and I believe it can be reproduced on other targets as well.
Here is how to reproduce it:
- create a new project from the CLI
- add a blinking LED with a wait() statement in the main loop
- export the project for SW4STM32 with "CLI export" and the "-z" option
- open the resulting project in SW4STM32

then

- build the project from the CLI and upload to the target with st-flash => no fault
- build the project from SW4STM32 and upload to the target (run button) => fault exception
  * hardware reset the target => still fault exception
  * power cycle the target => no fault
  * subsequent hardware reset => no fault

Of course I can also provide the two projects if needed.

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug


0xc0170 · 2019-04-08T08:24:13Z

How do the project settings compare (CLI vs exporter)? Are they 100% the same?

cc @ARMmbed/team-st-mcd

Hoel · 2019-04-08T08:26:13Z

Yes, they are exactly the same. The SW4STM32 project was exported directly from the CLI project and no changes have been made to the SW4STM32 project settings.

Hoel · 2019-04-08T08:59:34Z

Side note, if I use:
wait_ms(3) => no fault
wait_ms(300) => fault
wait(0.03) => fault
wait(0.003) => fault

Hoel · 2019-04-08T09:26:06Z

OK, I found the problem: the issue was caused by the SysTick setup that was still in the set_sysclock function taken from STM32CubeMX. It should have been removed; I overlooked that.

Culprit:
HAL_SYSTICK_Config(HAL_RCC_GetHCLKFreq()/1000);
HAL_SYSTICK_CLKSourceConfig(SYSTICK_CLKSOURCE_HCLK);
HAL_NVIC_SetPriority(SysTick_IRQn, 0, 0);

Hoel · 2019-04-08T09:29:12Z

Unfortunately, the issue is only partially solved: if I set wait(1.0) it works fine, but if I set wait(0.3) the fault is back, so something else is wrong.

Hoel · 2019-04-08T11:09:58Z

After hundreds of debug steps, the error seems to originate after osMutexAcquire().

ciarmcom · 2019-04-08T12:01:39Z

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-1129

Hoel · 2019-04-08T12:20:20Z

For verification I tested with wait(0.3) on the CLI project: no error. So the problem definitely occurs only with SW4STM32 and only if the wait() value is under 1000 ms.

jeromecoutant · 2019-04-11T15:11:12Z

@Hoel - Please could you check if #10367 impacts your result?
Thx

jeromecoutant · 2019-04-11T15:11:43Z

@deepikabhavnani

Hoel · 2019-04-11T15:29:42Z

@jeromecoutant OK, I will check that.

Hoel · 2019-04-11T15:38:44Z

@jeromecoutant Tickless is not enabled in my case; does it matter? I did more debugging over the last few days and I feel the problem is somehow related to the stdio retarget and possibly delay(). I also noticed very weird behaviors, such as a hard fault occurring when I only add a second delay() statement to the main loop (which only blinks an LED), or stdio::printf mysteriously ceasing to work (so there is no more output when the hard fault occurs, except the error LED blink) while serial::printf continues to work normally.

jeromecoutant · 2019-04-11T15:49:38Z

No, #10367 increases the idle thread size when no compiler optimization option is in use.

See #9106 (comment)

Hoel · 2019-04-11T16:00:28Z

OK, I tried it and it did not work: hard fault right after programming. BTW, you can see how it did not print the printf statement (HEL version) but correctly printed the Mbed error, which uses stdio too; that is very weird. I set it to 512.

Hoel · 2019-04-11T16:05:50Z

@jeromecoutant Oh, I disabled all optimisations and it no longer hard faults after programming or after a power cycle. The binary size is considerably increased, so it is probably not a long-term option, but at least for now it seems to work. I will run more tests to see if it is consistent.

Hoel · 2019-04-11T16:19:19Z

@jeromecoutant
It is working consistently with the various delay() / LED blink tests that failed previously; I did not get any further hard fault after a soft reset or a power cycle. That said, major problems still persist: the stdio retarget is not working consistently (it does not work) and, more importantly, the SX1280 radio (SPI) cannot initialize properly, while it works fine when the same project is compiled from the CLI.

Hoel · 2019-04-11T17:06:51Z

@jeromecoutant
Here I reproduced the radio init that fails. First, there is an Mbed error on the last __disable_irq( ) statement; I cannot immediately see a good reason for this.

Then, if I remove the last __disable_irq( ) statement, it hangs forever on Wait4Busy(). I have not checked the SPI transaction with an analyzer yet, but I checked the state of the BSY GPIO and it is high, which means the radio is not initialized properly, so most likely something is wrong with the SPI communication. Again, the exact same code built from the CLI works fine; that said, I also have to remove the last __disable_irq( ) there, otherwise I get an Mbed error.

EDIT:
I went ahead and extracted the functions that read a register and get the firmware revision from the library, and commented out the __disable_irq() statements. That way it works: the radio is initialized properly and returns the correct revision value. There is definitely something very wrong here, since if I comment out the __disable_irq() statements in the library itself it still fails and hangs forever on Wait4Busy().

Hoel · 2019-04-12T11:45:50Z

I made a minimal test code to reproduce the mutex error directly. It occurs in osMutexAcquire when the SPI is locked after __disable_irq().
By the way, the error is not displayed in the console; I only get the LED error pattern and nothing is sent to the UART. The error message should be sent, since when I step through in the debugger I reach mbed_error_puts.

EDIT

If I add a printf statement at the beginning of the test, it is not sent to the UART, but afterwards the Mbed error is printed correctly. All that is not very reassuring.

jeromecoutant · 2019-04-12T12:42:30Z

Hi
To be honest, a text copy-paste is better than a picture...

Hoel · 2019-04-12T12:55:52Z

@jeromecoutant
Well, right, I will paste it, but then the formatting gets messed up.

```cpp
int main() {

    uart3.printf("[MBED] init ok\r");
    printf("test");

    RadioSpi = new SPI( PA_7, PA_6, PA_5 );

    __disable_irq( );
    RadioSpi->lock();
    RadioSpi->unlock();
    __enable_irq( );

    uart3.printf("[MBED] test finished\r");

    while (true) {
        led1 = 0;
        wait(0.005);
        led1 = 1;
        wait(2);
    }

}
```

```
[MBED] init ok

++ MbedOS Error Info ++
Error Status: 0x80010133 Code: 307 Module: 1
Error Message: Mutex: 0x20002C2C, Not allowed in ISR context
Location: 0x8004B13
File: mbed_rtx_handlers.c+132
Error Value: 0x20002C2C
Current Thread: main Id: 0x200012D8 Entry: 0x8004783 StackSize: 0x1000 StackMem: 0x20001C28 SP: 0x20002B0C
Next:
main State: 0x2 Entry: 0x08004783 Stack Size: 0x00001000 Mem: 0x20001C28 SP: 0x20002BE8
Ready:
rtx_idle State: 0x1 Entry: 0x080048F1 Stack Size: 0x00000400 Mem: 0x20001448 SP: 0x20001808
Wait:
rtx_timer State: 0x83 Entry: 0x08007ABD Stack Size: 0x00000300 Mem: 0x20001848 SP: 0x20001AB8
Delay:
For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=L80AB_L151CC
-- MbedOS Error Info --
```

jeromecoutant · 2019-05-28T15:16:59Z

You should use the core_util_critical_section_enter() and core_util_critical_section_exit() functions instead of __disable_irq and __enable_irq.

@kjbracey-arm

kjbracey · 2019-05-28T15:24:49Z

Like most HAL APIs other than really low-level ones like DigitalIn/Out, SPI is made thread-safe by a mutex. So you can't use it with interrupts disabled.

I don't know why you're disabling interrupts here - you have no interrupt handlers.

If you really need to you can inherit from SPI or other similar classes to override the virtual lock and unlock methods to stop it using the mutex, but I doubt that's the answer here.

If you do have some other code not shown here which does have an interrupt handler, and that needs its interrupts disabled, then rather than globally disabling all interrupts, temporarily remove your specific interrupt handler during the reset, by attaching NULL to your InterruptIn. (Or use InterruptIn::disable_irq()).
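The constraint described above can be modelled on a host machine. The sketch below is not Mbed OS code: `ModelMutex`, `ModelBus`, `UnlockedBus` and `g_irq_disabled` are hypothetical stand-ins, and the exception replaces the "Not allowed in ISR context" error from the dump. It shows why a mutex-guarded bus fails once interrupts are masked, and how overriding the virtual lock and unlock methods takes the mutex out of the path.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for the global interrupt mask (hypothetical; not an Mbed OS symbol).
bool g_irq_disabled = false;

// Stand-in for an RTOS mutex that refuses to be taken while interrupts are masked.
class ModelMutex {
public:
    void lock()
    {
        if (g_irq_disabled) {
            throw std::runtime_error("Mutex: not allowed with interrupts masked");
        }
        ++depth_;
    }
    void unlock() { --depth_; }

private:
    int depth_ = 0;
};

// Bus whose transfers are made thread-safe by virtual lock()/unlock() hooks.
class ModelBus {
public:
    virtual ~ModelBus() {}
    virtual void lock() { mutex_.lock(); }
    virtual void unlock() { mutex_.unlock(); }

    // Loopback "transfer": returns the value it was given.
    int write(int value)
    {
        lock();
        int reply = value;
        unlock();
        return reply;
    }

private:
    ModelMutex mutex_;
};

// Variant that overrides the hooks so that no mutex is taken at all.
class UnlockedBus : public ModelBus {
public:
    void lock() override {}
    void unlock() override {}
};
```

In this model, `ModelBus::write` throws while `g_irq_disabled` is true, whereas `UnlockedBus::write` still returns its argument. Dropping the mutex also drops the thread safety it provided, which fits the doubt expressed above that this is the right answer here.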

kjbracey · 2019-05-28T15:38:30Z

And yes, the enter/exit_critical section is preferred because it handles the case where there's another OS layer underneath Mbed OS, or something that needs super-fast IRQ response - it's an abstraction that can leave some interrupts enabled, rather than disabling all core IRQs. On most devices it is the same thing though.
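One further practical difference between a raw disable/enable pair and a paired enter/exit call is nesting. The host-side model below is a sketch under assumptions, not the Mbed OS implementation: `g_irq_enabled`, `raw_disable_irq`, `raw_enable_irq`, `model_critical_enter` and `model_critical_exit` are hypothetical names. It contrasts an unconditional pair with a counted pair that restores the interrupt state seen at the outermost entry.

```cpp
#include <cassert>

// Stand-in for the core interrupt mask; true means interrupts are enabled.
bool g_irq_enabled = true;

// Raw pair: unconditional, so the first enable ends every enclosing region.
void raw_disable_irq() { g_irq_enabled = false; }
void raw_enable_irq() { g_irq_enabled = true; }

// Counted pair: remembers the state seen at the outermost entry.
unsigned g_depth = 0;
bool g_enabled_at_entry = false;

void model_critical_enter()
{
    if (g_depth == 0) {
        g_enabled_at_entry = g_irq_enabled;
    }
    g_irq_enabled = false;
    ++g_depth;
}

void model_critical_exit()
{
    assert(g_depth > 0);
    if (--g_depth == 0) {
        // Only the outermost exit restores the saved state.
        g_irq_enabled = g_enabled_at_entry;
    }
}
```

With the counted pair, an inner exit leaves interrupts masked until the outermost exit runs; with the raw pair, the first `raw_enable_irq()` re-enables interrupts even if an outer region still expects them to be masked. Neither form makes it legal to take a mutex inside the masked region.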

jeromecoutant · 2019-06-17T10:00:00Z

Could we close this issue ?

0xc0170 · 2020-02-20T10:10:35Z

Could we close this issue ?

We will close this issue, as there has not been any update for more than half a year. You can reopen it with an update if this issue still needs fixing.

Hoel closed this as completed Apr 8, 2019

Hoel reopened this Apr 8, 2019

ciarmcom added type: bug Jira status: OPEN labels Apr 8, 2019

ciarmcom added the mirrored label Apr 8, 2019

Hoel mentioned this issue Apr 8, 2019

CLI export function (for SW4STM32 and others IDE) with either full sources, reduced source (as with online compiler) or compiled library ARMmbed/mbed-cli#882

Closed

linlingao added the devices: st label Apr 9, 2019

0xc0170 closed this as completed Feb 20, 2020

ciarmcom added Jira status: CLOSED and removed Jira status: OPEN labels Feb 20, 2020

Copper-Bot mentioned this issue Feb 3, 2021

STSTM32 platform is using wrong version of toolchain-gccarmnoneeabi package platformio/platform-ststm32#492

Closed
