ARM Cortex-M4 stack offset when not using Floating point register sharing #22977

Closed
PierreAronnax opened this issue Feb 20, 2020 · 5 comments · Fixed by #23008
Assignees
Labels
area: ARM ARM (32-bit) Architecture bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug

Comments

@PierreAronnax
Contributor

PierreAronnax commented Feb 20, 2020

Describe the bug
While developing a project based on the EFM32GG11B420F2048 using Zephyr RTOS, we frequently noticed instability in our applications, usually resulting in 'usage faults'. The instability, however, only occurred when booting the application via the built-in UART bootloader. Running the same binary from Eclipse using a debugger did not show the problem.

After investigation we discovered that the stack pointer of every thread was outside the allocated stack area immediately on entering the thread.
We have found the cause of this issue: the floating-point context is active and Zephyr does not disable it.

Our project does not have CONFIG_FLOAT set, and therefore Zephyr does not initialize the Floating Point Status and Control Register. But it also doesn’t de-initialize these registers, leaving them unchanged. The built-in UART bootloader seems to initialize these registers before Zephyr is running.

After booting without the bootloader the CONTROL register has a value of 2, while after booting from the bootloader the CONTROL register has a value of 6.
In the latter case, bit 2 (FPCA) of the CONTROL register is set. Clearing this bit fixes the issue.

CONTROL register

[2] FPCA
When floating-point is implemented, this bit indicates whether a floating-point context is currently active:
0 = no floating-point context active
1 = floating-point context active.
The Cortex-M4 uses this bit to determine whether to preserve floating-point state when processing an exception.
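
As a quick check of the two values reported above, the following host-side sketch decodes the low bits of CONTROL. The macro and function names are local to this example (CMSIS provides equivalents such as CONTROL_FPCA_Msk), and nothing here touches real hardware; it only shows that 2 and 6 differ in the FPCA bit alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the CONTROL register on a Cortex-M4 with FPU.
 * Local names for this sketch only. */
#define CTRL_NPRIV (1u << 0) /* 1 = thread mode is unprivileged */
#define CTRL_SPSEL (1u << 1) /* 1 = thread mode uses the process stack (PSP) */
#define CTRL_FPCA  (1u << 2) /* 1 = floating-point context active */

/* Returns true when the given CONTROL value has FPCA set. */
bool control_fpca_set(uint32_t control)
{
	return (control & CTRL_FPCA) != 0u;
}

/* Returns true when the given CONTROL value selects PSP in thread mode. */
bool control_uses_psp(uint32_t control)
{
	return (control & CTRL_SPSEL) != 0u;
}

/* Usage example with the two values from this report. */
void control_example(void)
{
	assert(!control_fpca_set(2u)); /* booted without the bootloader */
	assert(control_fpca_set(6u));  /* booted via the bootloader */
	assert(control_uses_psp(2u) && control_uses_psp(6u));
}
```

Both values have SPSEL set, so the only difference between the two boot paths is FPCA.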

To Reproduce
main.c

#include <zephyr.h>
#include <sys/printk.h>
#include <string.h>	/* memset() */

#define MY_STACK_SIZE 512
#define MY_PRIORITY 2

K_THREAD_STACK_DEFINE(my_stack_area, MY_STACK_SIZE);

struct k_thread my_thread_data;

static void my_entry_point( void *arg1, void *unused1, void *unused2 )
{
	volatile u32_t i = 0xABCDEF01;

	printk("Total stack size:    %u\n", MY_STACK_SIZE);
	printk("Claimed stack size:  %u\n", K_THREAD_STACK_SIZEOF(my_stack_area) / 2);
	printk("Variable 'i' offset: %u\n", (u32_t)&i - (u32_t)&my_stack_area);

	while(1)
	{
		;
	}
}

void main(void)
{
	memset(my_stack_area, 0x00, MY_STACK_SIZE);
	printk("Thread example application\n");

	k_tid_t my_tid = k_thread_create(&my_thread_data, my_stack_area,
                                 K_THREAD_STACK_SIZEOF(my_stack_area) / 2,
                                 my_entry_point,
                                 NULL, NULL, NULL,
                                 K_PRIO_COOP(MY_PRIORITY), 0, K_NO_WAIT);

	while(1)
	{
		k_sleep(K_FOREVER);
	}
}

prj.conf

# CONFIG_FLOAT is not set

west build -b efm32gg_stk3701a application/name
west flash

If you don't have this board or this bootloader, you can simulate the bug by changing arch/arm/core/prep_c.c line 140 to

static inline void enable_floating_point(void)
{
	__set_CONTROL(__get_CONTROL() | (CONTROL_FPCA_Msk));
}

Console output:

*** Booting Zephyr OS build zephyr-v2.1.0-42-g9a8f359928cb  ***
Thread example application
Total stack size:    512
Claimed stack size:  256
Variable 'i' offset: 308

Observe that offset 308 lies outside the 256 bytes claimed for the thread's stack.

Expected behavior
Console output:

*** Booting Zephyr OS build zephyr-v2.1.0-42-g9a8f359928cb  ***
Thread example application
Total stack size:    512
Claimed stack size:  256
Variable 'i' offset: 236

Observe that offset 236 lies inside the 256 bytes claimed for the thread's stack.
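
The gap between the observed offset (308) and the expected one (236) is 72 bytes. That matches the size difference between the extended (floating-point) exception frame and the basic frame on a Cortex-M4, which suggests, though this report does not trace it step by step, that the core unstacks an extended frame where only a basic one was prepared. The sketch below is plain host-side arithmetic; the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BYTES        4u
/* Basic frame: R0-R3, R12, LR, PC, xPSR. */
#define BASIC_FRAME_WORDS 8u
/* Extended frame adds S0-S15, FPSCR and one reserved word. */
#define FP_EXTRA_WORDS    18u

/* Size in bytes of the exception frame the core stacks or unstacks. */
uint32_t exception_frame_bytes(int fp_context_active)
{
	uint32_t words = BASIC_FRAME_WORDS;

	if (fp_context_active) {
		words += FP_EXTRA_WORDS;
	}
	return words * WORD_BYTES;
}

/* Usage example: compare the frame sizes with the offsets in this report. */
void frame_size_example(void)
{
	uint32_t extra = exception_frame_bytes(1) - exception_frame_bytes(0);

	assert(exception_frame_bytes(0) == 32u);
	assert(exception_frame_bytes(1) == 104u);
	assert(extra == 308u - 236u); /* 72 bytes */
}
```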

Impact
Showstopper: every application crashes unless we erase the bootloader or run from the debugger.

Environment

  • EFM32GG11B420F2048
  • Zephyr v2.1.0

Additional context
I believe this issue is not limited to my processor but affects every ARM core with an FPU.

As a workaround, use the following prj.conf

CONFIG_FLOAT=y
CONFIG_FP_SHARING=y

This works because it clears the flag in arch_switch_to_main_thread.

Alternatively, change arch/arm/core/prep_c.c line 140 to

static inline void enable_floating_point(void)
{
	__set_CONTROL(__get_CONTROL() & (~(CONTROL_FPCA_Msk)));
}

But that probably needs more processor checks and register de-initialization.

@PierreAronnax PierreAronnax added the bug The issue is a bug, or the PR is fixing a bug label Feb 20, 2020
@PierreAronnax PierreAronnax changed the title ARM Coretex-M4 stack offset when not using Floating point register sharing ARM Cortex-M4 stack offset when not using Floating point register sharing Feb 20, 2020
@stephanosio
Member

@PierreAronnax Thanks for providing a detailed analysis on the problem.

Since you have already provided a workable solution, would you be willing to open a pull request for the fix? Or would you prefer that the maintainers do it?

static inline void enable_floating_point(void)
{
#ifdef CONFIG_CPU_HAS_FPU
	__set_CONTROL(__get_CONTROL() & (~(CONTROL_FPCA_Msk)));
#endif
}
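
The read-modify-write above can be exercised off-target with a stand-in for the register. This is only a host-side model: fake_get_control and fake_set_control are hypothetical replacements for the CMSIS accessors __get_CONTROL and __set_CONTROL, and FPCA_MASK is a local name for the same bit as CONTROL_FPCA_Msk.

```c
#include <assert.h>
#include <stdint.h>

#define FPCA_MASK (1u << 2) /* CONTROL.FPCA, same bit as CONTROL_FPCA_Msk */

/* Host-side stand-in for the CONTROL register (hypothetical, test only). */
static uint32_t fake_control;

uint32_t fake_get_control(void)
{
	return fake_control;
}

void fake_set_control(uint32_t value)
{
	fake_control = value;
}

/* Clear FPCA and leave every other CONTROL bit untouched.
 * On real hardware an ISB normally follows a write to CONTROL so that
 * the change takes effect before the next instruction executes. */
void clear_fpca(void)
{
	fake_set_control(fake_get_control() & ~FPCA_MASK);
}

/* Usage example: start from the value seen after the bootloader. */
void clear_fpca_example(void)
{
	fake_set_control(6u);
	clear_fpca();
	assert(fake_get_control() == 2u);

	clear_fpca(); /* clearing again changes nothing */
	assert(fake_get_control() == 2u);
}
```

Masking instead of writing a fixed value keeps SPSEL and nPRIV as the earlier boot stage left them, which is the same choice the proposed fix makes.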

cc @ioannisg

@stephanosio stephanosio added the area: ARM ARM (32-bit) Architecture label Feb 21, 2020
@ioannisg
Member

Thanks for the bug report, @PierreAronnax, I'll take a look as soon as possible. Yep, feel free to open a bug-fix PR as well.

@PierreAronnax
Contributor Author

@stephanosio
I don't think this is the complete solution. For instance, when CONFIG_LOG is set but CONFIG_FP_SHARING isn't, the bit should be cleared too. Also, CPACR, FPCCR and FPSCR should be initialized. I will work out a PR.

@ioannisg
Thanks, I will create a PR soon.

@ioannisg
Member

Hi @PierreAronnax, first, thanks again for the investigation here.
I confirm that the issue you have spotted is a valid one. Allow me, however, to stress that this is just one aspect of a more generic problem: Zephyr does not fully initialize the processor state at reset. As you mentioned, this becomes an issue when the Zephyr image is loaded by another image that configures the Cortex-M core and does not restore its state. See, for example, a similar issue about the MSP initialization, #7404, which I raised a long time ago.

I believe we eventually need to make sure that the Zephyr boot sequence initializes all core Cortex-M components; we should perhaps open an issue for that. In the meantime, I am going to take a look at your PR and see if we can fix (at least) the issue you report here for the v2.2.0 release.

@ioannisg ioannisg self-assigned this Feb 24, 2020
@jhedberg jhedberg added the priority: medium Medium impact/importance bug label Feb 25, 2020
@ioannisg
Member

BTW, this applies not only to the Cortex-M4 but to every Cortex-M with the FP extension.

ioannisg pushed a commit to PierreAronnax/zephyr that referenced this issue Feb 27, 2020
Upon reset, the CONTROL.FPCA bit is, normally, cleared. However,
it might be left un-cleared by firmware running before Zephyr boot,
for example when Zephyr image is loaded by another image.
We must clear this bit to prevent errors in exception unstacking.
This caused stack offset when booting from a build-in EFM32GG bootloader

Fixes zephyrproject-rtos#22977

Signed-off-by: Luuk Bosma <l.bosma@interay.com>
jhedberg pushed a commit that referenced this issue Feb 27, 2020
Upon reset, the CONTROL.FPCA bit is, normally, cleared. However,
it might be left un-cleared by firmware running before Zephyr boot,
for example when Zephyr image is loaded by another image.
We must clear this bit to prevent errors in exception unstacking.
This caused stack offset when booting from a build-in EFM32GG bootloader

Fixes #22977

Signed-off-by: Luuk Bosma <l.bosma@interay.com>
hakehuang pushed a commit to hakehuang/zephyr that referenced this issue Mar 18, 2020
Upon reset, the CONTROL.FPCA bit is, normally, cleared. However,
it might be left un-cleared by firmware running before Zephyr boot,
for example when Zephyr image is loaded by another image.
We must clear this bit to prevent errors in exception unstacking.
This caused stack offset when booting from a build-in EFM32GG bootloader

Fixes zephyrproject-rtos#22977

Signed-off-by: Luuk Bosma <l.bosma@interay.com>