arch: arm: Add code for swapping threads between secure and non-secure #26414

oyvindronningstad · 2020-06-24T12:35:27Z

This adds code to swap_helper.S which does special handling of LR when
the interrupt came from secure. The LR value is stored to memory, and
put back into LR when swapping back to the relevant thread.

Signed-off-by: Øyvind Rønningstad oyvind.ronningstad@nordicsemi.no

Longer explanation:

When a pendsv interrupt happens (to switch threads), LR contains a very special value (EXC_RETURN) describing the thread it came from (https://developer.arm.com/docs/100235/latest/the-cortexm33-processor/exception-model/exception-entry-and-return). The current implementation of swap_helper.S (which contains the pendsv interrupt handler) assumes a number of things about EXC_RETURN that do not hold if the CPU is executing a call into secure mode. This leads to undefined behavior if this happens, e.g. if a delayed work hits its timeout, while the CPU is in secure mode.

The current implementation leaves EXC_RETURN unchanged when switching between threads. The exception to this is when FPU is enabled. In this case, the FType bit can have different values based on whether FPU is used by the thread. This individual bit is handled correctly by swap_helper.S.

The pendsv handler normally doesn't touch the EXC_RETURN, except to set or clear a single bit to do with floating point registers. With TrustZone, there are one or two more bits that might be different between threads, the S bit (secure/nonsecure stack) and possibly the DCRS bit.

This PR introduces storing of the entire EXC_RETURN on entering the pendsv handler, and restoring the EXC_RETURN of the new thread before exiting the pendsv handler. We also set a default EXC_RETURN value when creating the thread.

This should also work for secure threads that are switched out while calling a secure function.

Why does this work?
As long as LR contains the same EXC_RETURN when a thread is switched in as when it was switched out, all the registers should be correctly popped from the relevant stack. The EXC_RETURN bits collectively specify which/how many registers are pushed, and to which stack.

How to choose the default EXC_RETURN value?
Because of the standardized way threads are initialized, there are only two valid values: 0xFFFFFFFD for secure/regular code, and 0xFFFFFFBC for non-secure code.

What are the limitations of this approach?
If one thread is executing in secure mode, no other threads can enter secure mode safely. It might work if the second thread is allowed to exit before the first thread is rescheduled, but since the NS kernel can do nothing about the S stacks, the two secure calls will be sharing stacks so the stack might overflow. As a consequence, all calls to secure mode should be prefaced with grabbing a global mutex. It might be possible to do something clever in swap_helper to not swap in a secure thread when another is running, but that is out of scope here.

ioannisg · 2020-06-24T15:59:20Z

@oyvindronningstad I would appreciate if you could elaborate here, on which exact problem you are trying to address. Then we can look at the PR together.

oyvindronningstad · 2020-06-25T13:08:36Z

@oyvindronningstad I would appreciate if you could elaborate here, on which exact problem you are trying to address. Then we can look at the PR together.

@ioannisg I updated the PR description now. Please tell me your thoughts.

ioannisg · 2020-06-25T16:50:14Z

@oyvindronningstad thanks for the longer explanation. That's interesting and inline with our earlier discussion in the downstream ticket.

I am basically going to comment on the idea, for now - not the specific implementation details.

So, as far as I understand you are trying to solve the problem of a NSPE thread properly continuing a Secure call while being context-switched out and in again, is that correct? Then, the solution you propose could theoretically work; the ARCHitecture allows you to manipulate the EXC_RETURN and return from the exception while transiting to the alternative security state. However, we must be very careful, because during the exception return, the ARCHitecture performs a series of integrity checks that need to pass, and in this case, I am not sure all these checks will always pass. For example: if the PendSV interrupt preempts a Secure interrupt then the unstacked XPSR won't match the execution mode the PE is supposed to return to. (This is of course a corner case, only valid when the Secure domain has a low priority interrupt, lower than the Non-Secure PendSV, but it is a scenario you cannot handle with this idea).

Anyway, if the integrity checks pass (we must check this), then your idea will work, because a NS thread's PSP (PSP_NS) remains un-touched when doing a secure call, and Zephyr Kernel ensures it is properly saved/restored. So when a thread switches back -in, to finish the secure call, when it returns from the call, the correct PSP_NS will be loaded.

How to choose the default EXC_RETURN value?

This is indeed a "slightly" weak side of the idea, because we are constructing an EXC_RETURN value based on assumptions on how Zephyr ARCH code works for ARM. For Non-Secure PendSV we need to assume that:

return mode is always THREAD (never HANDLER)
Zephyr threads use PSP (never MSP)
[the FType flag is anyways manipulated]

These are all reasonable assumptions to make.

If one thread is executing in secure mode, no other threads can enter secure mode safely AFAIK

Yes, this is correct. We cannot read/modify the Secure stack pointer :)

As a consequence, all calls to secure mode should be prefaced with grabbing a global mutex.

Yes, so, then, I need to question the real value of this idea. Generally, we expect NSPE to be a multi-threaded RTOS (Zephyr) so we should be locking the scheduler, anyways, before doing a secure call. Then there is no need for all this infrastructure. In summary, the idea has value only if there is a single NSPE execution context that interacts with the SPE. How probable is this scenario? Do we have such cases in our downstream project?

Side note:
The equivalent case with respect to floating-point is with FPU and FPU_SHARING:
FPU: un-shared register mode, a single thread can use the FP
FP_SHARING: many threads can use the FP

In the downstream projects I have not seen examples that enable FPU but not FP_SHARING. Because developers usually have multiple contexts that want to use the FP.

oyvindronningstad · 2020-06-26T06:09:41Z

Thanks a lot for the answer.

For Non-Secure PendSV we need to assume that:

return mode is always THREAD (never HANDLER)

Zephyr threads use PSP (never MSP)

If it's a problem, can we infer these from CONFIG_s and set the default EXC_RETURN accordingly?

Generally, we expect NSPE to be a multi-threaded RTOS (Zephyr) so we should be locking the scheduler, anyways, before doing a secure call.

With this patch, we shouldn't need to lock the scheduler. If we still need to lock the scheduler for some other reason, I agree with you that it has limited utility. If we don't need to lock the scheduler (even if we do need a global mutex), time-sensitive NS threads can be executed while a long running secure service is executed (RNG for example takes ~50-100 ms), making the system more responsive.

joh06937 · 2020-07-09T13:50:11Z

Sorry to randomly jump in on this, but for the last week or two I've been going down the route of trying to have Zephyr and our primary application running entirely out of Secure flash/RAM on the nRF9160 with a dedicated thread that calls into a user-written application in Non Secure flash/RAM. This PR was just pointed out to me.

The intent is that we can have our code running on our part and use TrustZone as a nice hardware protection mechanism (that requires very little intricate configuration) to allow our code to be shipped to a customer and lets them flash their own restricted application onto our device, which we would run and manage on their behalf, all without risking our IP/keys/sensitive info from being accessible in any way by them (and while providing Secure APIs for various functionality).

Unfortunately, I have run into a few roadblocks (not the least of which is the manual boundary management found in this PR and the concern over Non Secure-side interrupt management, as mentioned above). I've also had some trouble with the stacking of the floating point registers causing a fault on my part, even with a bit of extra configuration with CP11/10 in SCB->NSACR.

Anyway, my point here is that there is at least one person out there who's interested in a robust way to solve this issue :-)

github-actions · 2020-09-08T00:59:32Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

oyvindronningstad · 2021-03-24T12:04:54Z

Rebased on top of #33510

ioannisg · 2021-05-03T15:29:42Z

@microbuilder @galak @stephanosio @carlocaione if any of you could review this patch before the 2.6 freeze :)

arch/arm/core/aarch32/swap_helper.S

modules/trusted-firmware-m/interface/interface.c

arch/arm/core/offsets/offsets_aarch32.c

arch/arm/core/aarch32/Kconfig

Introduce a Kconfig option to allow Secure function calls to be pre-empted. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no> Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

This adds code to swap_helper.S which does special handling of LR when the interrupt came from secure. The LR value is stored to memory, and put back into LR when swapping back to the relevant thread. Also, add special handling of FP state when switching from secure to non-secure, since we don't know whether the original non-secure thread (which called a secure service) was using FP registers, so we always store them, just in case. Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

When ARM_NONSECURE_PREEMPTIBLE_SECURE_CALLS is enabled, if FPU is being used (CONTROL.FPCA == 1), store all FP registers before entering the secure function, and restore them afterwards. This is needed if any NS thread or ISR that interrupts the secure function uses FP registers. If they do, a secure UsageFault occurs unless this change is applied. This allows k_sched_lock() and k_sched_unlock() to be dropped when ARM_NONSECURE_PREEMPTIBLE_SECURE_CALLS is enabled. Enable ARM_NONSECURE_PREEMPTIBLE_SECURE_CALLS by default when building TF-M. Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

For testing secure->non-secure thread swapping. This also tests that the FP context is correctly preserved when calling a secure function. Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

Add an entry in the release notes regarding the support for non-blocking secure calls in Zephyr. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

microbuilder

Nice work on this @oyvindronningstad

ioannisg · 2021-05-04T18:11:53Z

I'll add a patch to fix the test failure for the STM boards - apparently the overlay is not correct. To make this easy to track, I will not force-push, so just apply the patch on top of the PR. Please don't merge.

oyvindronningstad · 2021-05-04T19:15:33Z

I'll add a patch to fix the test failure for the STM boards - apparently the overlay is not correct. To make this easy to track, I will not force-push, so just apply the patch on top of the PR. Please don't merge.

Which test failure? Looks like the CI is green.

ioannisg · 2021-05-04T19:24:57Z

I'll add a patch to fix the test failure for the STM boards - apparently the overlay is not correct. To make this easy to track, I will not force-push, so just apply the patch on top of the PR. Please don't merge.

Which test failure? Looks like the CI is green.

The overlay is not correct, and the NS test does not boot. After changing the overlay the test is able to run, however it fails. Attaching log

*** Booting Zephyr OS build zephyr-v2.5.0-3129-gb84e7d4cdcb2  ***
Running test suite test_thread_swap
===================================================================
START - test_thread_swap
    Assertion failed at ../s
    Assertion failed at ../src/main.c:41: work_func: (main_thread->arch.mode_exc_return & EXC_RETURN_S is false)
EXC_RETURN not secure: 0xac
 FAIL - test_thread_swap in 0.19 seconds
===================================================================
Test suite test_thread_swap failed.
===================================================================
PROJECT EXECUTION FAILED

oyvindronningstad · 2021-05-04T19:39:56Z

After changing the overlay the test is able to run, however it fails.

That error means there's probably some timing issue in the test. The work function didn't manage to interrupt the secure service. Likely the crypto is faster and finishes before the work fires.

ioannisg · 2021-05-04T19:40:51Z

After changing the overlay the test is able to run, however it fails.

That error means there's probably some timing issue in the test. The work function didn't manage to interrupt the secure service. Likely the crypto is faster and finishes before the work fires.

I understand that. What is your proposal to fix this? Or should we remove the ST platforms from the allow-list?

oyvindronningstad · 2021-05-04T19:55:42Z

Try decreasing the delay on the work

Decrease the submitted work delay, to ensure that the PSA crypto operation will be preempted when the work fires. The modification is required for devices with fast crypto operations. Also minor corrections to the test ase name, so it is not the same with the other arm_thread_swap test. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

Correct thee dt overlays for the STM32 boards, so the Zephyr image starting address is in accordance with what TF-m expects it to be. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

ioannisg · 2021-05-05T08:38:50Z

Test is passing now on

nrf5340
nrf9160
nucleo_l552ze_q
stm32l562e_q
mps2_an521

microbuilder · 2021-05-05T12:27:41Z

A bit late to the party, but tests also passing on lpcxpresso55s69_ns

MZachmann · 2021-05-08T20:49:20Z

on Windows I'm getting a warning with gnu arm (q4 2020) that __packed in the struct near line 114 is being ignored and it needs to be after the struct keyword. It's fine with struct __packed.

ioannisg · 2021-05-09T21:15:45Z

etting a warning with gnu arm (q4 2020) that __packed in the stru

@oyvindronningstad you might want to take a look?

oyvindronningstad · 2021-05-10T07:18:36Z

on Windows I'm getting a warning with gnu arm (q4 2020) that __packed in the struct near line 114 is being ignored and it needs to be after the struct keyword. It's fine with struct __packed.

@MZachmann Thanks, do you mind checking that this removes your warning? #35031

oyvindronningstad requested a review from ioannisg June 24, 2020 12:35

zephyrbot added area: ARM ARM (32-bit) Architecture area: API Changes to public APIs labels Jun 24, 2020

oyvindronningstad force-pushed the swap-secure branch from 5ed9b38 to 4620e17 Compare June 24, 2020 13:06

oyvindronningstad mentioned this pull request Jun 24, 2020

secure_services: Lock scheduler before calling secure services. nrfconnect/sdk-nrf#2519

Merged

oyvindronningstad force-pushed the swap-secure branch 2 times, most recently from 27ca9c9 to 8516ad2 Compare June 24, 2020 14:33

github-actions bot added the Stale label Sep 8, 2020

github-actions bot closed this Sep 22, 2020

oyvindronningstad reopened this Mar 24, 2021

oyvindronningstad force-pushed the swap-secure branch 2 times, most recently from 8e7e9f0 to 4828524 Compare March 24, 2021 12:04

github-actions bot added area: Kernel area: Modules area: Samples Samples area: Tests Issues related to a particular existing or missing test labels Mar 24, 2021

oyvindronningstad marked this pull request as ready for review March 24, 2021 12:04

oyvindronningstad requested review from andyross, dcpleung, galak, MaureenHelm, microbuilder and nashif as code owners March 24, 2021 12:04

ioannisg reviewed May 3, 2021

View reviewed changes

arch/arm/core/aarch32/swap_helper.S Show resolved Hide resolved

ioannisg reviewed May 3, 2021

View reviewed changes

modules/trusted-firmware-m/interface/interface.c Show resolved Hide resolved

carlocaione reviewed May 3, 2021

View reviewed changes

arch/arm/core/offsets/offsets_aarch32.c Outdated Show resolved Hide resolved

github-actions bot added area: Documentation Release Notes To be mentioned in the release notes labels May 4, 2021

ioannisg reviewed May 4, 2021

View reviewed changes

arch/arm/core/aarch32/Kconfig Outdated Show resolved Hide resolved

ioannisg reviewed May 4, 2021

View reviewed changes

arch/arm/core/aarch32/Kconfig Outdated Show resolved Hide resolved

ioannisg and others added 5 commits May 4, 2021 15:38

arch: arm: Add config for non-blocking secure calls

bf223ff

Introduce a Kconfig option to allow Secure function calls to be pre-empted. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no> Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

tests: arm: Add arm_thread_swap_tz test

0c29fc5

For testing secure->non-secure thread swapping. This also tests that the FP context is correctly preserved when calling a secure function. Signed-off-by: Øyvind Rønningstad <oyvind.ronningstad@nordicsemi.no>

doc: release notes: release notes entry for non-blocking secure calls

e8a3996

Add an entry in the release notes regarding the support for non-blocking secure calls in Zephyr. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

oyvindronningstad force-pushed the swap-secure branch from 11136e4 to e8a3996 Compare May 4, 2021 13:41

microbuilder approved these changes May 4, 2021

View reviewed changes

ioannisg added 2 commits May 5, 2021 10:37

tests: arm_thread_swap_tz: correct dt overlays for stm32 boards

bda970a

Correct thee dt overlays for the STM32 boards, so the Zephyr image starting address is in accordance with what TF-m expects it to be. Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

ioannisg merged commit 0eb3b3c into zephyrproject-rtos:master May 5, 2021

oyvindronningstad deleted the swap-secure branch May 10, 2021 07:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

arch: arm: Add code for swapping threads between secure and non-secure #26414

arch: arm: Add code for swapping threads between secure and non-secure #26414

oyvindronningstad commented Jun 24, 2020 •

edited

ioannisg commented Jun 24, 2020

oyvindronningstad commented Jun 25, 2020 •

edited

ioannisg commented Jun 25, 2020

oyvindronningstad commented Jun 26, 2020

joh06937 commented Jul 9, 2020

github-actions bot commented Sep 8, 2020

oyvindronningstad commented Mar 24, 2021

ioannisg commented May 3, 2021

microbuilder left a comment

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 5, 2021

microbuilder commented May 5, 2021

MZachmann commented May 8, 2021

ioannisg commented May 9, 2021

oyvindronningstad commented May 10, 2021

arch: arm: Add code for swapping threads between secure and non-secure #26414

arch: arm: Add code for swapping threads between secure and non-secure #26414

Conversation

oyvindronningstad commented Jun 24, 2020 • edited

ioannisg commented Jun 24, 2020

oyvindronningstad commented Jun 25, 2020 • edited

ioannisg commented Jun 25, 2020

oyvindronningstad commented Jun 26, 2020

joh06937 commented Jul 9, 2020

github-actions bot commented Sep 8, 2020

oyvindronningstad commented Mar 24, 2021

ioannisg commented May 3, 2021

microbuilder left a comment

Choose a reason for hiding this comment

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 4, 2021

oyvindronningstad commented May 4, 2021

ioannisg commented May 5, 2021

microbuilder commented May 5, 2021

MZachmann commented May 8, 2021

ioannisg commented May 9, 2021

oyvindronningstad commented May 10, 2021

oyvindronningstad commented Jun 24, 2020 •

edited

oyvindronningstad commented Jun 25, 2020 •

edited