Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provision TEE threads for system invocations #5789

Closed
wants to merge 7 commits into from

Conversation

etienne-lms
Copy link
Contributor

@etienne-lms etienne-lms commented Jan 24, 2023

Proposal to add an SMC function ID to OP-TEE entry for system operations invocation using provisioned thread contexts.
Related to linaro-swg/linux#108 Linux [PATCH 1/2] and [PATCH 2/2].

(edited: updated Linux counterpart refs)

core/include/kernel/thread.h Outdated Show resolved Hide resolved
mk/config.mk Outdated Show resolved Hide resolved
core/arch/arm/kernel/thread.c Outdated Show resolved Hide resolved
core/arch/arm/include/sm/optee_smc.h Outdated Show resolved Hide resolved
@etienne-lms
Copy link
Contributor Author

Comment addressed (in the right way, i hope).
I've added a tee_caps.c driver i tried to make generic. Tell me if it makes sense.

mk/config.mk Outdated Show resolved Hide resolved
core/include/tee/tee_caps.h Outdated Show resolved Hide resolved
@jenswi-linaro
Copy link
Contributor

I've added a tee_caps.c driver i tried to make generic. Tell me if it makes sense.

I don't think it's needed. In principle, all the capabilities we announce to the normal world are new features that it doesn't matter if the normal world uses or not. I mean, if normal world doesn't use one particular feature there's nothing OP-TEE can do about that. We should try to keep things as simple as possible.

@etienne-lms
Copy link
Contributor Author

I've added a tee_caps.c driver i tried to make generic. Tell me if it makes sense.

I don't think it's needed. In principle, all the capabilities we announce to the normal world are new features that it doesn't matter if the normal world uses or not. I mean, if normal world doesn't use one particular feature there's nothing OP-TEE can do about that. We should try to keep things as simple as possible.

I agree with making things simple.... hmmm, okay. I'll drop that. That means updating device images with this feature needs linux kernel image to be updated first then TEE image to be updated unless what the provisioned TEE system threads will not be used by the old Linux which may affect the platform.

@etienne-lms
Copy link
Contributor Author

Seen that the series makes changes that are reverted afterward, are you okay i squash fixup commits for the next proposal?

@jenswi-linaro
Copy link
Contributor

Seen that the series makes changes that are reverted afterward, are you okay i squash fixup commits for the next proposal?

Please do.

@etienne-lms etienne-lms force-pushed the system-thread branch 2 times, most recently from 12b6db1 to 0b0b9b1 Compare January 27, 2023 11:20
@etienne-lms
Copy link
Contributor Author

Udpdated.

@jenswi-linaro
Copy link
Contributor

With OPTEE_SMC_FUNCID_CALL_SYSTEM_WITH_REGD_ARG updated:
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>

Please wait with merging this until we know how the counterpart in the Linux kernel driver is received.

@jenswi-linaro
Copy link
Contributor

Feel free to squash and apply the tag.

Adds provisioned system threads among the OP-TEE provisioned threads.
New config switch CFG_NUM_SYSTEM_THREADS defines a number of thread
contexts reserved for system function invocations. The feature is
reported by TEE during capabilities exchange.

This is needed for platforms where specific OP-TEE yielded services
are dedicated to system management that can be indirectly invoked from
an RPC sequence and hence cannot wait clients complete their invocation
to release their TEE thread context unless what system can deadlock.
SCMI services for clocks, regulators and more, exposed by the SCMI PTA,
are examples of such system services.

Adds a new SMC function ID OPTEE_SMC_CALL_SYSTEM_WITH_REGD_ARG, defined
for yielded system calls expecting using system provisioned resources.

The implementation uses a best effort strategy when allocating a system
thread. If all system thread contexts are already in use, allocate from
the common pool.

Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Provision a system thread for SCMI support on platform STM32MP1

Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
@etienne-lms
Copy link
Contributor Author

Tags applied (@jenswi-linaro, i changed the plat-stm32mp1 R-b into an A-b)

@etienne-lms etienne-lms changed the title [RFC] Provision TEE threads for system invocation Provision TEE threads for system invocations Feb 1, 2023
From Linux optee driver discussion, REE should call OP-TEE to reserve
system contexts when needed. This change modifies how threads are
allocated (check counters instead of finding a free cell).

Adds 2 fastcall SMC functions for REE to reserve/release system contexts
that is thread_counts.sys_reserved. The policy allows context reservation
as far as there remains at least 1 generic purpose thread context.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Enable CFG_RESERVED_SYSTEM_THREAD.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
@etienne-lms
Copy link
Contributor Author

etienne-lms commented Feb 14, 2023

Updated to address comments at upstream.
Linux driver patches v2 can be found here
Linux driver patches v4.

Copy link
Contributor

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also responding in the kernel thread for the kernel side of things.

*/
#define OPTEE_SMC_FUNCID_CALL_RESERVE_SYS_THREAD U(21)
#define OPTEE_SMC_CALL_RESERVE_SYS_THREAD \
OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_RESERVE_SYS_THREAD)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a fast call, same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I messed this.

@@ -35,6 +35,22 @@
#include <trace.h>
#include <util.h>

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more complicated than expected. I was expecting one new global variable which would replace the old CFG_NUM_SYSTEM_THREADS, leaving __thread_alloc_and_run() unchanged apart from that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new scheme, we expect REE to call optee to reserve system threads. This could be done there are already TEE threads assigned, so we can have the thread context pool split in 2 parts (upper indices for system invocation, lower part for the rest). We must count how many threads are free hence to logic proposed here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why you need to count. If threads are already in use it's likely they are staying off the higher ranges. Are you making all this trouble to guard against the pathological case where all threads have been used and the last of these happens to be one that runs forever?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the implementation proposed here, there is no specific pre-provisioning of thread contextes on OP-TEE side. It's the REE that requests some system contexts reservation. With this scheme, when REE requests reservation of a system thread context, OP-TEE has to deal with already allocated threads. A pros is that running OP-TEE with a kernel not supporting TEE system context, will be able to use all available TEE thread context.
I'll try to simplify but I fear cases where OP-TEE accept to reserve a system context but fails to let REE use it because, as you said, the upper thread(s) are already in use.

args->a0 = OPTEE_SMC_RETURN_OK;

} else {
args->a0 = OPTEE_SMC_RETURN_OK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right!

@@ -239,5 +239,10 @@ uint32_t thread_handle_std_smc(uint32_t a0, uint32_t a1, uint32_t a2,
void thread_scall_handler(struct thread_scall_regs *regs);

void thread_spmc_register_secondary_ep(vaddr_t ep);

/* Reservation of system thread context */
TEE_Result reserve_sys_thread(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a thread_ prefix.

@@ -203,6 +209,8 @@
OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_WITH_RPC_ARG)
#define OPTEE_SMC_CALL_WITH_REGD_ARG \
OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_WITH_REGD_ARG)
#define OPTEE_SMC_CALL_SYSTEM_WITH_REGD_ARG \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sumit's suggestion with a bit to indicate if it's a system request made a lot of sense. That way we don't add arbitrary limitations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where to store that bit. ABI for OPTEE_SMC_CALL_WITH*_ARG leave no room for an extra bit, we need new funcIDs, reversing an ABI register to carry that information (e.g. bit0 of input a4 argument).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sumit suggested BIT(15) in the SMC ID if I remember correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would define few new funcIDs, from SMCCC view.
Using existing funcIDs and considering bit15 would be quite easy in Linux and OP-TEE impelmentation, but the description of this specific bit would look a bit hacky, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't address this comment yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we start calling top 4 bits for funcIDs as flags in OP-TEE SMC ABI? Since they are already zeros and would give us enough flexibility to add features like system threads etc. Otherwise the function IDs list will keep on increasing given all the combinatorial involved and hard to keep track off. Also, still there will be 12 bits for real funcIDs which allows that list to go upto value: 8191.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to have both, combinations explicitly defined so:

#define OPTEE_SMC_CALL_SYSTEM_THREAD_FLAG BIT(15)
#define OPTEE_SMC_CALL_SYSTEM_WITH_ARG \
        OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_WITH_ARG | \
                               OPTEE_SMC_CALL_SYSTEM_THREAD_FLAG)
#define OPTEE_SMC_CALL_SYSTEM_WITH_RPC_ARG \
        OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_WITH_RPC_ARG | \
                               OPTEE_SMC_CALL_SYSTEM_THREAD_FLAG)
#define OPTEE_SMC_CALL_SYSTEM_WITH_REGD_ARG \
        OPTEE_SMC_STD_CALL_VAL(OPTEE_SMC_FUNCID_CALL_WITH_REGD_ARG | \
                               OPTEE_SMC_CALL_SYSTEM_THREAD_FLAG)

The OPTEE_SMC_CALL_SYSTEM_THREAD_FLAG doesn't make sense for all SMCs so we should still define all SMCs, but with the bit defined there's still some structure in how an SMC is constructed so masking, etc can still be done where that's convenient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that's fine with me.

Fixes SMC funcIDs RESERVE_SYS_THREAD/UNRESERVE_SYS_THREAD: fastcall IDs
not standard + minor renaing (drop CALL_ in macros name).

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
System invocation contexts are reserved from context pool high indices,
simplifies lookup for a free thread.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
request_system_thread_context() and release_system_thread_context() to
return an error code when CFG_RESERVED_SYSTEM_THREAD is disabled.

Rename [un]eserve_sys_thread() to thread_[un]reserve_sys_ctx().

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
@jenswi-linaro
Copy link
Contributor

At some point, you mentioned that you only wanted to reserve this thread if it was going to be used by normal world. And the reason for that is that you can't afford to waste an unused thread. With paging enabled an unused thread uses ~700 bytes if I remember correctly. Of these ~700 bytes are ~500 bytes VFP state which in theory could be saved on the stack instead. So we're basically doing this to save 200-700 bytes of nonpaged memory?

@etienne-lms
Copy link
Contributor Author

Let me check my numbers 😕, i agree it's maybe not worth it.

@etienne-lms
Copy link
Contributor Author

etienne-lms commented Feb 17, 2023

A thread costs 912 bytes of bss (ctx's and pgt entries) and 4kB of nozi (mmu tbl) on 32b.
On 64b, it costs 768 bytes of bss and 80B of nozi (1 mmu ul1 table).
Numbers from current post 3.20.0 revision.

@etienne-lms
Copy link
Contributor Author

sorry, I messed up retrieving the numbers... I'll update.

@etienne-lms
Copy link
Contributor Author

etienne-lms commented Feb 17, 2023

Cost of a thread (CFG_NUM_THREADS) in static mapping. Numbers from current post 3.20.0 revision.
64b.LPAE: 1872 bytes of bss + 4kbyte of nozi
32b/LPAE: 912 bytes of bss + 4kbyte of nozi
32b/32b-mmu: 768 bytes of bss + (1kB + 128 bytes) of nozi
bss stores ctx's and pgt entries. nozi store mmu tables.

@jenswi-linaro
Copy link
Contributor

Thanks for the detailed figures. Are you using 32b/32b-mmu in your use case?

@etienne-lms
Copy link
Contributor Author

No, the platform is using LPAE. But could be an argument for using 32b-mmu on that platform.

@etienne-lms
Copy link
Contributor Author

From recent discussions, Linux should have all info needed to monitor TEE thread entry and provision system entry contexts for dedicated services. I've created linaro-swg/linux#109 on Linux kernel where no change is needed in OP-TEE OS.

@etienne-lms
Copy link
Contributor Author

Closing this P-R in favor to a REE only system thread management.

@etienne-lms etienne-lms deleted the system-thread branch May 24, 2023 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants