arch: arm: Introduce SMP support for Cortex-A/R #61206
Conversation
I added
You are not on the reviewer list, but I think you may be interested in it. Please have a look when you have time. @povergoing, @ithinuel
Thank you!
@ibirnbaum FYI
Some nitpicks and kibitzing on the switch implementation. I haven't had time to look at the full PR. On the whole it seems pretty solid though.
arch/arm/core/cortex_a_r/thread.c (Outdated)
#if defined(CONFIG_USE_SWITCH)
extern void z_arm_exit_exc(void);

thread->switch_handle = thread;
thread->callee_saved.lr = (uint32_t)z_arm_exc_exit;
Naming: calling that value "LR" is a minor abuse; that register is for the return address. As seen here it's actually being faked as the entry (!) address of the exception return path. Which is fine as far as implementation goes, but IMHO it would be clearer to call this "PC", since it's the address at which execution will resume. Calling it "LR" is just an implementation detail of how you implemented the cooperative switch case.
Calling it "LR" tells the user that we save the LR register into it; I'm worried that the user may get confused if we call it "PC".
static ALWAYS_INLINE void
arch_thread_return_value_set(struct k_thread *thread, unsigned int value)
{
	thread->arch.swap_return_value = value;
}

#else

static inline void arch_switch(void *switch_to, void **switched_from)
You want ALWAYS_INLINE here. This gets expanded in just a few hot paths in the kernel; we know we want it inlined, don't give the toolchain the option to mess it up.
Done
/* save old thread into switch handle which is required by
 * z_sched_switch_spin()
 */
str r1, [r1, #___thread_t_switch_handle_OFFSET]
FWIW: this must be the very last thing done with the old thread context. Since someone will be coming back here to add FPU support later, it might be nice for the comment to warn about that and/or mention the docs in arch_interface.h that explain the requirement.
Also, at least on SMP Cortex-A you need a write barrier immediately before this store, so that no other CPU can possibly see the earlier stores cross with it and miss some of the saved data.
Done
ldr r2, =_thread_offset_to_callee_saved
add r2, r1, r2

stm r2, {r4-r11, sp, lr}
This is pleasingly simple, but FWIW making it a function call is an extra set of instructions that you really don't need in the cooperative switch case. It wouldn't be much more complicated to make arch_switch() an inline assembly expansion that does mostly the same stuff, adds a clobber for r0-r3 and r12 [1], and stores the address of the following instruction into the thread context via a simple assembly symbol.
[1] Probably. I'm winging my knowledge of the ARM ABI based on general intuition and a copy of the AAPCS spec I just googled in another window. Apply salt.
I'm reconsidering this now.
Force-pushed from 325f358 to e80aee1
The MMU or MPU unit needs to be initialized by its own CPU: the primary core initializes the MMU or MPU in z_arm_prep_c, and each secondary core initializes it in z_arm_secondary_start. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
Switch the TLS base address pointer from TPIDRURO to TPIDRURW. The difference between them is that TPIDRURO is read-only in user mode while TPIDRURW is not, which makes TPIDRURO better suited for storing the address of _kernel.CPU[n]. For this reason, this commit changes the base pointer of the TLS area. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
Store the current CPU's struct _cpu instance in TPIDRURO so that each core can retrieve its own struct _cpu by reading TPIDRURO. This is useful in an SMP system. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
This commit introduces the 'USE_SWITCH' feature to the Cortex-A/R (AArch32) architecture. To support USE_SWITCH, exception entry and exit are unified via z_arm_cortex_ar_enter_exc and z_arm_cortex_ar_exit_exc; all exceptions, including ISRs, enter and exit their handlers this way. Exception depth and interrupt depth are tracked separately: a context switch is allowed when the exception depth is greater than 1, but not when the interrupt depth is greater than 1. Currently, USE_SWITCH does not support FPU_SHARING or USERSPACE. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
This commit introduces SMP support to the Cortex-A/R AArch32 architecture. For now, it only supports all cores starting together: a single CPU initializes the system as the primary core, while the others loop at startup as secondary cores and wait to be woken. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
This commit adds a new board named 'fvp_baser_aemv8r_aarch32_smp'. Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
Force-pushed from e80aee1 to c9f2ef9
Now,
LGTM; there are still a lot of things that need to be refined, but those definitely belong in another PR.
This is a fairly important change. Can you add a note to the 3.6 release notes for it? There's a boilerplate release notes file for 3.6 in tree already. Thanks!
Can't see if I ever dropped a +1 on this or not, but doing so now. All my concerns seem addressed.
@SgrrZhf Why don't we do a new documentation PR with the release notes update there instead, and maybe see if anything in
Ok, I'll update the release notes in another PR later.
This PR mainly does the following:
- Introduce USE_SWITCH_SUPPORT for Cortex-A/R (AArch32).
- Unify exception entry and exit via z_arm_cortex_ar_enter_exc, z_arm_cortex_ar_exit_exc, and the isr_wrapper code. Don't allow a context switch when the interrupt depth is greater than 1.
- Add the fvp_baser_aemv8r_aarch32_smp platform.
Note: the fvp_baser_aemv8r_aarch32_smp platform enables USE_SWITCH_SUPPORT.