[RFC] Dynamic shared memory implementation. #1232

lorc · 2016-12-12T19:01:48Z

Hello,

I want to propose new approach for shared memory. Idea is not to use carveout in memory, but to use memory provided by Normal World. Normal World allocates memory at its side and provided list of pages to OP-TEE, so OP-TEE kernel can map them into its virtual space.

This is not the final version, because it should be merged with @jenswi-linaro MOBJs framework. But anyway, it was tested on simple tests. With this approach I was able to number allocate large shared buffers (e.g. 2M each).

There are number of patches in this PR. First patches are quite generic and I can create separate PRs to merge them if maintainers wish. Most important are two last patches: one of them adds shared memory subsystem and second one puts it into use.

There are one problem that is not resolved at this time - we need to list of shared pages to TA memory manager. I hope MOBJs will help with this.

There are corresponding kernel and client changes based on @jenswi-linaro patches for registering shared memory. I'll present them later.

lorc · 2016-12-12T20:56:36Z

Checkpatch have failed two times. But one error looks like a bug in checkpatch (or I don't understand what it wants from me). Another message is about long line. But there are string constants, which should not be divided into two parts (because string constant division also triggers checkpatch error).

prime-zeng · 2016-12-13T02:08:15Z

core/arch/arm/include/mm/tee_mmu_defs.h

@@ -69,7 +69,7 @@
 #define TEE_MMU_L1_ALIGNMENT		TEE_MMU_L1_SIZE

 #define TEE_MMU_L2_NUM_ENTRIES		(TEE_MMU_L2_SIZE / 4)
-#define TEE_MMU_L2_SIZE			(1 << 10)
+#define TEE_MMU_L2_SIZE			(1 << 13)


RES_VASPACE_SIZE is 10MB, so the L2 entries is not enough here?

this is the size of a nonlpae L2 table. should remain 1KByte.

prime-zeng · 2016-12-13T02:30:56Z

core/arch/arm/mm/shmem.c

+		unsigned int idx;
+
+		while (true) {
+			if (!core_mmu_find_table(vaddr, 99, &tbl_info)) {


Use UINT_MAX instead of 99.

prime-zeng · 2016-12-13T03:14:45Z

core/arch/arm/mm/shmem.c

+ *   TEE_ERROR_XXX - on error
+ */
+
+int shmem_map_buffer(paddr_t pa, size_t size, paddr_t *pages,


The page list dynamic mapping is a common function, can we introduce a new function like map_page_list, and the shmem_map_buffer is a wrapper function of the map_page_list here?

jenswi-linaro · 2016-12-13T10:18:00Z

Please create a new PR for the first 6 patches. They look quite easy and shouldn't take much time to review.

etienne-lms

some comment.
This changes enables registered shm for the core, but not yet the TAs, right?

etienne-lms · 2017-01-11T14:23:31Z

core/arch/arm/mm/shmem.c

+bool shmem_contains_region_pa(paddr_t pa, size_t size)
+{
+	bool result = false;
+	struct shmem_mapping *area = NULL;


minor: no need to init area

etienne-lms · 2017-01-11T14:25:28Z

core/arch/arm/mm/shmem.c

+{
+	while (size) {
+		if (shmem_contains_region_pa(pa, (size < SMALL_PAGE_SIZE) ? size
+					     : SMALL_PAGE_SIZE))


minor style?: : before newline.

etienne-lms · 2017-01-11T14:36:01Z

core/arch/arm/mm/core_mmu.c

+		va = map_pa2va(find_map_by_type_and_pa(m, pa), pa);
+		if (!va)
+			va = shmem_pa2va(pa);
+		break;


do you really need this ? not simply case MEM_AREA_SHM_VASPACE: va = shmem_pa2va(pa); break;

etienne-lms · 2017-01-11T14:36:48Z

core/arch/arm/mm/shmem.c

+
+/**
+ * struct shmem_mapping - informaion about shared buffer
+ * @next:	next elemein in linked list


typo: s/elemein/element/

etienne-lms · 2017-01-11T14:39:46Z

core/arch/arm/mm/shmem.c

+ * @pages_cnt:	Number of mapped pages
+ * @pages:	Addresses of pages that hold this buffer. Actually we are not
+ *		obligued to store page addresses there, because we can get them
+ *		from MMU. But this aproach simplifies code a lot.


typo: s/obligued/obliged/ + s/aproach/approach/

etienne-lms · 2017-01-11T15:36:59Z

core/arch/arm/mm/shmem.c

+		/* Special case for last page */
+		if (area->pages_cnt > 1 &&
+		    pa >= area->pages[area->pages_cnt-1] &&
+		    pa < (area->pages[area->pages_cnt-1] +


minor: spaces around - in the 2 lines above.

etienne-lms · 2017-01-11T15:37:48Z

core/arch/arm/mm/shmem.c

+			return area;
+
+		/* All other cases are trivial */
+		for (int i = 1; i < area->pages_cnt-1; i++) {


style: move 'int i' definition to block top, spaces around -

etienne-lms · 2017-01-11T15:46:59Z

core/arch/arm/mm/shmem.c

+		core_mmu_set_entry(&tbl_info, idx, 0, TEE_MATTR_HIDDEN_BLOCK);
+		vaddr += SMALL_PAGE_SIZE;
+	}
+


mapping table allocated at map stage are never freed here.
there sould rather be a generic core_mmu.c routine for runtime mapping+table(s) alloc and unmapping+table(s) release.

etienne-lms · 2017-01-11T15:49:16Z

core/include/mm/tee_mm.h

@@ -62,6 +62,9 @@ extern tee_mm_pool_t tee_mm_sec_ddr;
 /* Virtual eSRAM pool */
 extern tee_mm_pool_t tee_mm_vcore;

+/* Shared memory pool */
+extern tee_mm_pool_t tee_mm_shm __early_bss;


minor: __early_bss not needed

etienne-lms · 2017-01-11T16:07:29Z

core/arch/arm/include/mm/tee_mmu_defs.h

@@ -69,7 +69,7 @@
 #define TEE_MMU_L1_ALIGNMENT		TEE_MMU_L1_SIZE

 #define TEE_MMU_L2_NUM_ENTRIES		(TEE_MMU_L2_SIZE / 4)
-#define TEE_MMU_L2_SIZE			(1 << 10)
+#define TEE_MMU_L2_SIZE			(1 << 13)


this is the size of a nonlpae L2 table. should remain 1KByte.

jenswi-linaro · 2017-01-23T11:56:35Z

Please rebase this now that all prerequisites are in place.

lorc · 2017-01-31T12:26:06Z

Patches was rebased on master branch.

jenswi-linaro

Overall this looks good. I suspect that it depends on CFG_SMALL_PAGE_USER_TA=y for mapping of TA parameters.

It looks like struct shmem_mapping and struct mobj_reg_shm overlaps a bit. What if all shared memory dealings where based on MOBJS? We could have yet another MOBJ class adding what's missing from struct shmem_mapping to struct mobj_reg_shm or some other base class.

jenswi-linaro · 2017-02-01T07:26:29Z

core/arch/arm/kernel/thread.c

 	}

+	if (!check_alloced_shm(pa, size, sizeof(uint64_t)))


This function will return false if core_pbuf_is(CORE_MEM_NSEC_SHM, pa, size) returns false.

jenswi-linaro · 2017-02-01T07:49:52Z

core/arch/arm/mm/core_mmu.c

+		core_mmu_find_table(vaddr, UINT_MAX, &tbl_info);
+		idx = core_mmu_va2idx(&tbl_info, vaddr);
+		core_mmu_set_entry(&tbl_info, idx, 0,
+				   TEE_MATTR_HIDDEN_BLOCK);


0 would be more appropriate than TEE_MATTR_HIDDEN_BLOCK. Hidden block is something used by the pager.

jenswi-linaro · 2017-02-01T08:10:27Z

core/arch/arm/mm/core_mmu.c

+		va = map_pa2va(find_map_by_type_and_pa(m, pa), pa);
+		if (!va)
+			va = shmem_pa2va(pa);
+		break;


jenswi-linaro · 2017-02-01T08:13:40Z

core/arch/arm/tee/entry_std.c

+	if (arg == NULL)
+		goto err;
+
+	num_params = arg->num_params;


num_params will be read once again in tee_entry_std() below after this function has returned. What if normal world succeeds in using this window to increase num_params?

jenswi-linaro · 2017-02-01T08:22:18Z

core/include/mm/tee_mm.h

@@ -62,6 +62,9 @@ extern tee_mm_pool_t tee_mm_sec_ddr;
 /* Virtual eSRAM pool */
 extern tee_mm_pool_t tee_mm_vcore;

+/* Shared memory pool */
+extern tee_mm_pool_t tee_mm_shm __early_bss;


jenswi-linaro · 2017-02-01T08:26:27Z

core/include/optee_msg.h

+ * Special parameter type denoting that command buffer continues on
+ *  specified page
+ */
+#define OPTEE_MSG_ATTR_TYPE_NEXT_FRAGMENT	0xc


This together with OPTEE_MSG_ATTR_NESTED is quite complicated. To make it easier to understand please describe how they can be used in one or two example use cases.

etienne-lms

This provides ability to map discontiguous SHM (registered or rpc-allocated) into OP-TEE core but not into TAs. Do you plan to support such SHM in TA ? or I am wrong?

etienne-lms · 2017-02-01T11:05:45Z

core/arch/arm/kernel/thread.c

-		thread_rpc_free_arg(co);
-		pa = 0;
-		co = 0;
+


If needing to call a RPC to allocate discontiguous buffer, I think RPC should rather have NonSecure allocating the buffer, registering it to OP-TEE core and then returning to RPC allocation caller with the registered buffer ID. For the RPC basic communication, default SHM is ok, no real need to complexify with pure "registered" SHM.

This would be the use case where a TA is loaded.

For normal world to call back and register an SHM, it would need yet another thread in secure world as we don't support calling back from an RPC on the same thread.

ok, understood.

lorc · 2017-02-03T19:04:37Z

@jenswi-linaro , I tried to dump all shmem.c code and switch to mobjs. And now I'm thinking that code in thread.c should be mobj-aware. E.g. thread_rpc_alloc_payload() should return mobj and so on. What do you think about this?

Also, I don't know what to do with checks like core_pbuf_is. Are they necessary in mobj-aware code? Because mobjs provide own security checks.

etienne-lms · 2017-02-06T09:44:02Z

@lorc & @jenswi-linaro, regarding core_pbuf_is, i think it is very important that these macros are used when creating a mobj based on a physical location provided by nonsecure world. Once a mobj is safely created, we can rely on the mobj native checks (based on their identifier/property flags rather than physical location).

For example, at core entry, nonsecure provides a buffer location (eventually a list of physical chunks) to be handled as a shm buffer. The core must verify each physical location against authorized SHM locations before creating a "mobj" for the buffer.

jenswi-linaro · 2017-02-06T09:45:39Z

I agree, thread.c should be mobj-aware.

lorc · 2017-02-10T21:30:04Z

Sorry for late answer, I was on sick leave. @etienne-lms, my idea was to put core_pbuf_is into mobj allocation code. In this way you will have only valid mobj or does not have it at all. I think, in this case extra core_pbuf_is check is somewhat redundant. What do you think?

I'm asking because I had removed shmem code, so check like this (https://github.com/OP-TEE/optee_os/pull/1232/files#diff-21c5a372f38c092a429fd2b7e8f71ef3R543) is no longer possible.

Anyway, code in thread.c and entry_std.c looks much cleaner with mobjs used.

etienne-lms · 2017-02-13T09:40:01Z

hi @lorc , hope you feel better.

my idea was to put core_pbuf_is into mobj allocation code. In this way you will have only valid mobj or does not have it at all.

I agree, This is the best way. Get a mobj and once ok only rely on the mobj API.

With your proposal, you plan to create a mobj for each of the physically contiguous chunks of the registered shm ? I rather like the proposal from Jens to extend mobj to support physically discontiguous memory and hide this complexity behind inside the mobj layer (easy to say but I'm not sure it is less complex to implement/maintain :).

lorc · 2017-03-10T11:44:41Z

Hi all,

There is a new version of registered SHM. I get rid of shmem.c and moved all code to mobjs, including code in thread.c, FS, sockets and so on.

jenswi-linaro · 2017-03-13T11:57:35Z

core/arch/arm/plat-vexpress/platform_config.h

@@ -221,7 +221,7 @@
 #endif

 #define DRAM0_BASE		UINTPTR_C(0x40000000)
-#define DRAM0_SIZE		(UINTPTR_C(0x40000000) - CFG_SHMEM_SIZE)
+#define DRAM0_SIZE		(UINTPTR_C(0x42100000) - CFG_SHMEM_SIZE)


I'd rather update build/qemu_v8.mk.
Long term we could read (or supplement) the DRAM config from DT on the QEMU platforms.

jenswi-linaro · 2017-03-13T11:59:10Z

core/arch/arm/include/sm/optee_smc.h

+ * Privileged call flag. Used by supplicant. Asks OP-TEE to use reserved
+ * thread if all regular threads are used.
+ */
+#define OPTEE_SMC_FLAGS_PRIVILEGED	BIT(0)


We can't waste that much memory for this. There has to be some other way.

You mean that you don't want to allocate whole register for flags? Possibly we can pass flag in r0 then.

No, I'm concerned about the stack.

Ah, I see. We need 3 pages per thread for stack, right? This feature will require one extra thread, so additional 12Kb of memory will be used. Is this too much?

I can imagine another approach: save context (and stack) of a suspended thread to dynamically allocated buffer and use that thread for a privileged call. But what if we couldn't allocate buffer for context? There will be deadlock.

Another approach is to try to squeeze privileged thread stack into one of the suspended thread's stack. But this is also prone to problems with the stack size.

Probably we can forbid STD SMC calls from a RPC call. Then there will be no need to a privileged thread. But in that case we will need another call path to register SHM in a RPC request. I had that solution (maybe you remember OPTEE_MSG_ATTR_TYPE_NEXT_NESTED) in earlier versions of the pacthes. But there was the extra param type (NESTED one), additional logic, etc, etc. I didn't liked it, so I choose simpler approach with existing shared memory calls.

Maybe you have ideas on how to avoid deadlock?

One extra privileged thread isn't enough, one extra privileged thread per normal thread is actually needed to avoid deadlocks. That also lead me into the idea that if during RPC registering of SHM is needed it can be done using the suspended thread currently doing the RPC. If we only allow registration of SHM in this case I don't think we'll have a problem with stack overruns.

However, it will result in some extra ping-pong and it may be a bit complicated to find out which thread did the RPC and do a call inside that.

I'm afraid I've forgotten how OPTEE_MSG_ATTR_TYPE_NEXT_NESTED worked, but I think the best solution would be to let the return of OPTEE_MSG_RPC_CMD_SHM_ALLOC to carry the information needed to register the shared memory. Do you really needed OPTEE_MSG_ATTR_TYPE_NEXT_NESTED for that?

I suppose it's possible with careful programming to avoid needing more than one reserved thread, but that means constantly being aware of this restriction and thus accumulating technical debt. Since this will affect the ABI it's important to get it right from the start.

But regardless wasting 8k for this is too much, especially since there's other options.

As I can see it, one thread will be enough in all cases except case when privileged thread will issue own RPC. That is the biggest problem in my opinion.

Can we allocate stack dynamically with malloc()?

I agree that it's a big problem as it can occur when taking a mutex. :-)
To allocate the stack with malloc() is likely to fail due to fragmented heap (8k is a significant part of ~32k).

Okay, point taken. Generally speaking, RPC handling code should not issue OP-TEE calls because this calls can lead to more RPCs.

But in this particular situation with SHM we can:

a) make exception for special register_shm call
b) rework the way how results are returned from RPC

Variant a) leads to alterations to tasker/scheduler code. Variant b) leads to more complex ABI between OP-TEE, kernel driver and supplicant. I prefer b) because I did it once.

What do you think?

I prefer b) too.

lorc · 2017-04-05T17:26:33Z

@jenswi-linaro I had reworked this in a way you asked. I have added a new parameter type and corresponding logic in thread.c
Probably, it is more clear from kernel size: linaro-swg/linux#29 (check out rpc.c).

jenswi-linaro · 2017-04-07T08:34:35Z

core/arch/arm/include/kernel/misc.h

 size_t get_core_pos(void);

 uint32_t read_mode_sp(int cpu_mode);
 uint32_t read_mode_lr(int cpu_mode);

+/**


These functions should go into a .h file with a better name, they should also have a proper prefix.

Could you please suggest file name and prefix? They used in both std_entry.c and thread.c

msg_param could be used as both prefix and basename of the file.
extract_pages_from_params() would then become msg_param_extract_pages() and so on.

jenswi-linaro · 2017-04-07T09:17:24Z

core/arch/arm/kernel/misc.c

@@ -0,0 +1,242 @@
+/*


Please choose a better file name for these functions.

jenswi-linaro · 2017-04-07T09:20:50Z

core/arch/arm/kernel/thread.c

 			args->a0 = OPTEE_SMC_RETURN_ENOMEM;
 			return;
 		}

+		arg = mobj_get_va(mobj, 0);
+		if (!arg || !mobj_is_nonsec(mobj))


It should be enough to assert this.

jenswi-linaro · 2017-04-07T09:22:59Z

core/arch/arm/kernel/thread.c

-		if (!parg || !ALIGNMENT_IS_OK(parg, struct optee_msg_arg) ||
-		    !(arg = phys_to_virt(parg, MEM_AREA_NSEC_SHM))) {
-			thread_rpc_free_arg(carg);
+			&carg);


Perhaps the cookie should be stored inside the mobj, but that can be done later.

Yes, I thought about that. But I don't want to make this change in current PR.

jenswi-linaro · 2017-04-07T09:44:29Z

core/arch/arm/kernel/thread.c

+			goto free_first;
+		}
+		num_pages = (nested_params[0].u.tmem.size + page_offset - 1)
+			/ SMALL_PAGE_SIZE + 1;


Please keep the / on the line above and indent this line properly.

jenswi-linaro · 2017-04-07T10:56:29Z

core/arch/arm/tee/entry_std.c

 		return;
 	}

+	arg = mobj_get_va(mobj, 0);
+	if (!arg || !mobj_is_nonsec(mobj))


Wouldn't an assert be enough?

jenswi-linaro · 2017-04-07T11:32:41Z

core/arch/arm/kernel/misc.c

+	if (!mobj)
+		goto err;
+
+	params = (struct optee_msg_param *)


Cast not needed.

jenswi-linaro · 2017-04-07T11:52:34Z

core/arch/arm/kernel/misc.c

+	params = (struct optee_msg_param *)
+		mobj_get_va(mobj, pa_params & SMALL_PAGE_MASK);
+	pages_cnt = 1;
+	for (uint32_t i = 0; i < num_params; i++) {


Please declare local variables at the beginning of a block.

jenswi-linaro · 2017-04-07T11:52:44Z

core/arch/arm/kernel/misc.c

+				goto err;
+
+			pages_cnt++;
+			params = (struct optee_msg_param *)mobj_get_va(mobj, 0);


Cast not needed

jenswi-linaro · 2017-04-07T11:56:08Z

core/arch/arm/kernel/misc.c

+			params = (struct optee_msg_param *)mobj_get_va(mobj, 0);
+		} else {
+			params++;
+			if (((vaddr_t)params & SMALL_PAGE_MASK) == 0)


It wouldn't hurt with a comment stating that we're checking that we're not overflowing into the next page.

Preferred style is if (!((vaddr_t)params & SMALL_PAGE_MASK))

I'm agree about comment, but disagree about style in this particular place.
I want to emphasis that we are getting 0 after applying the mask. I think, it is more obvious with == 0

I disagree, but it's no big deal. Keep it as is if you think it's important.

jenswi-linaro · 2017-04-07T11:59:58Z

Please create a new PR for "spinlock: add cpu_spin_lock_xsave()/xrestore() functions" and
"tee_mm: add locking while working with pool->entry" to get them out of the way of this review.

lorc · 2017-04-07T12:05:09Z

@jenswi-linaro Thank you for review.

Regarding that two patches, there are PR for them: #1356
But looks like it is stalled....

etienne-lms · 2017-05-17T07:02:32Z

Wouldn't it be more easy if registering SHM reference was using a buffer from the default physically contiguous SHM negotiated at OP-TEE init ?

jenswi-linaro · 2017-05-17T08:21:44Z

I have another idea. As you pointed out we can't use the user space pages to carry the physical pointers. We scrap OPTEE_MSG_ATTR_FRAGMENT and add a OPTEE_MSG_ATTR_NONCONTIG or similar.

What this OPTEE_MSG_ATTR_NONCONTIG means is that the physical address in struct optee_msg_param_tmem:but_ptr points to a page holding a list of physical addresses, each address is one "user space" page. The last address in this page holds the address of next page of addresses if needed. struct optee_msg_param_tmem:size is still the size in bytes (normally page aligned, except in the corner case below) of the shared memory, from this it's easy to determine how many pages are used to hold the pointers etc. When OPTEE_MSG_ATTR_NONCONTIG is used then the shared memory is always page aligned, eventual offsets into the first page etc has to be taken care of in an outer layer.

This model isn't too complicated it's the same as what's in use today except that with OPTEE_MSG_ATTR_NONCONTIG in combination with OPTEE_MSG_ATTR_TYPE_TMEM_* what's described above is activated. This should work both for RPC allocations (OPTEE_MSG_RPC_CMD_SHM_ALLOC) and OPTEE_MSG_CMD_REGISTER_SHM.

I think it's important (especially for the virtualization use case) to be able to get rid of the shared memory pool we're using today completely. We can still support the shared memory pool, but it's not a requirement.

So how can we deal with a normal invoke with OPTEE_MSG_ATTR_TYPE_TMEM_* in combination with OPTEE_MSG_ATTR_NONCONTIG where the shared memory buffer isn't page aligned? One way could be to let the buf_ptr & SMALL_PAGE_MASK bits signify the offset into the shared memory buffer while buf_ptr & ~SMALL_PAGE_MASK is the pointer to the physical page holding the list of physical addresses.

Does it make sense? Can it be simplified further? Am I missing any corner case?

lorc · 2017-05-17T19:10:52Z

@jenswi-linaro, thanks for the idea. I'm currently implementing it. Do deal with non-page aligned shared buffer I prefer to store beginning of the shared buffer as the first element of the array. So pages_array[0] & ~SMALL_PAGE_MASK will point to the firts page and pages_array[0] will point to the beginning of the buffer.

But I see another problem there: RPC calls. It is easy to allocate buffer for pages array in OPTEE_MSG_CMD_REGISTER_SHM because this is synchronous call and I can free this buffer right after SMC invocation.
In RPC case, I will allocate buffer for user pages array in response to OPTEE_MSG_RPC_CMD_SHM_ALLOC. But when I should free it? It should be deferred until normal return from SMC. My previous implementation used allocated buffer to store pages, because in this case buffer was provided either by driver itself or by supplicant. And it was okey, because RPC code didn't allocated any new buffer. What do you think? Should I use the same approach, or it is better to develop some generic mechanism for deferred cleanup?

jenswi-linaro · 2017-05-18T06:09:47Z

It's not so important where the offset is stored, but I think at least one special case can be removed if it's coded into the lower bits of buf_ptr. I'd rather have registered shared memory page aligned as that's how it's mapped nonetheless. Then the offset isn't part of the shared memory object, instead it's part of the parameter. We should be careful to avoid adding features that aren't strictly needed since those usually brings special cases that has to be dealt with.

Regarding the buffer for the pages array for RPC:
We should allocate this buffer just as for the invoke cases, we just need to figure how when we can free it.
We could store it somewhere in optee_do_call_with_arg(), but as we can't know for sure when the information has been used we can't free it until the function returns (or the shared memory object is freed). Another option is to store it in struct tee_shm, but then we'll have the problem storing this information in a generic way, we can't have OP-TEE specific stuff in that struct.

lorc · 2017-05-18T20:03:14Z

Okay, so I decided to do as you say and store offset in buf_ptr. But to do this, I had to ensure that buf_ptr always points to a page boundary. I made necessary adjustments on the kernel side and now it is fine (at least xtest is happy :) )

So, there are v5 delta. To produce #1527 I had to squash v1-v4 deltas and rebase to the master branch.
I don't ask you to review whole PR, because it will be rebased again after #1527 merge. But can you please take a look at v5 delta (5681beb) to confirm new ABI implementation?

Patch for the kernel is on the way.

jenswi-linaro · 2017-05-19T06:52:44Z

Looks good. :-) I've added a few comments in the "v5 delta".

Only misunderstanding is that I'd like OPTEE_MSG_ATTR_TYPE_NONCONTIG to be a separate bit that is used together with OPTEE_MSG_ATTR_TYPE_TMEM_* and only OPTEE_MSG_ATTR_TYPE_TMEM_*. Without OPTEE_MSG_ATTR_TYPE_TMEM_* OPTEE_MSG_ATTR_TYPE_NONCONTIG doesn't mean anything.

copy_in_params() needs to deal with the OPTEE_MSG_ATTR_TYPE_NONCONTIG bit also where memory hasn't been registered in advance. We could of course wait with this until it's a problem that needs to be optimized, but I think this is such a small change compared to everything that it's better to do it now than having to add yet another capability later when it's introduced.

lorc · 2017-05-19T16:54:24Z

@jenswi-linaro, thanks for the review. I had pushed v6 delta.

I want to clarify copy_in_params() with OPTEE_MSG_ATTR_TYPE_TMEM_* use case. You want to pass TMEM_* buffers without prior registration, right? That would save use two OP-TEE calls, but will require some rework on NW side. Anyways, I'll try to implement this.

jenswi-linaro · 2017-05-22T07:21:43Z

Yes, copy_in_params() should be able to work with OPTEE_MSG_ATTR_TYPE_TMEM_* where the OPTEE_MSG_ATTR_NONCONTIG bit can be both set or clear. The latter case to preserve the ABI and be backwards compatible. The former case as an optimization to avoid three calls just to make one invoke with unregistered physically non-contiguous memory.

lorc · 2017-05-22T21:19:06Z

@jenswi-linaro, thanks for the clarification.
During implementing and testing this feature I encountered two issues:

First one is backward parameters passing (when returning results from TA back to client app). Without OPTEE_MSG_ATTR_NONCONTIG we can easily calculate buffer offset: p->u.memref.shm_offs = mp->u.tmem.buf_ptr - shm_pa.
But what to do with OPTEE_MSG_ATTR_NONCONTIG ? mp->u.tmem.buf_ptr holds pointer to the page list. Probably we can calculate offset on OP-TEE side and put it into mp->u.tmem.buf_ptr. But this will be semantically incorrect. Probably we can put real PA into mp->u.tmem.buf_ptr on OP-TEE side. But then kernel driver should perform PA->VA translation for userspace code.
Another issue is with userspace mapping. As you remember, you have used get_user_pages_fast() to get client pages. Issue is in with this line in regression_1000.c:145:
static const uint8_t sha256_in[] = { 'a', 'b', 'c' };
As you can see, this is very short static const array. Compiler have put it in .text section. Problem is that get_user_pages() can't get reference to pages in this section, because technically it is not a part of application's memory, it is mmaped from executable file.
I don't know if memory reference to .text section can be considered as a programmer error, because this memory chunk serves as input parameter, so it can reside in RO memory. What do you thing about this issue?
(BTW, output buffer is also declared as static const and this is clearly programmer error).

jenswi-linaro · 2017-05-23T07:14:10Z

Issue 1:
The pointer value isn't supposed to be updated so there should be any need to update that on the way back.

Issue 2:
Hmm, we'll need to do some magic. On error perhaps we can let the the client lib check the arguments and use temporary buffers for stuff that's in read-only pages. How the lib reliably can tell if a page is read-only mapped or not I don't know though.

lorc · 2017-05-25T12:38:17Z

Hi @jenswi-linaro,
Sorry for delay. It took some time to test this feature. So there are v7 delta patch that implements it. I think, now it is quite fine.
It will require some rework in ioctl interface, but let's discuss that in driver PR.

As for that two issues:

Issue 1: I also think that pointer value should not be updated, but optee driver does this for some reason.
Issue 2: I think that simple fallback to shadow buffer will be enough.

jenswi-linaro · 2017-05-25T14:21:13Z

Thanks @lorc
Issue 1: I don't know why either, shouldn't be needed though. Note that params_to_user() in drivers/tee/tee_core.c doesn't.

Issue 2: sounds good.

I'll not be able to do much reviewing this week, but next week I'll prioritize this.

This patch adds various helper functions to manipulate with parameters passed to/from normal world. Also it introduces new optee_param type which is used to pass long lists of parameters. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

This is quite large patch, that enables registered shared memory support. Main idea is that normal world can provide own pages for shared memory. So there are two new calls: OPTEE_MSG_CMD_REGISTER_SHM and OPTEE_MSG_CMD_UNREGISTER_SHM. Also, OPTEE_MSG_RPC_CMD_SHM_ALLOC can return registered shared buffer. To ease up use of such buffers all code that uses them is moved to mobjs. That means that TA loader, fs_rpc, sockets, etc all use mobjs to represent shared buffers instead of simple paddr_t. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

reworked ABI for registered SHM and RPC calls. Removed OPTEE_MSG_ATTR_TYPE_NESTED and OPTEE_MSG_ATTR_FRAGMENT, added OPTEE_MSG_ATTR_TYPE_NONCONTIG. Now "register SHM" and "alloc mem" RPC are handled in the same way. List of pages is stored as plain array of 64 bit values. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

- OPTEE_MSG_ATTR_TYPE_NONCONTIG converted to OPTEE_MSG_ATTR_NONCONTIG: This is flag now, rather than a type. - removed OPTEE_MSG_ATTR_FRAGMENT (forgot to remove in prev commit). Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

- fixed dump mistake with (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT | OPTEE_MSG_ATTR_TYPE_MASK) instead of (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT | OPTEE_MSG_ATTR_NONCONTIG) in thread.c - added support for non-contig TMEM references - removed duplicate code that constructed mobjs from non-contig buffers: added msg_param_mobj_from_noncontig_param() function that does the job. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

mobj_shm represents buffer in predefined SHM region. It is used as fallback, when dynamic SHM is disabled. also removed "#ifdef CFG_SMALL_PAGE_USER_TA" check Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

There was bug in offset calculation: invalid variable was used. Also division remainder was replaced with bitwise and to speed up things a bit. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

lorc · 2017-06-02T22:02:37Z

Hello. So, after #1527 was merged, I rebased this PR onto master branch.
Also I added two new patches:
5e668b5 adds new mobj type instead of mobj_phys which I used earlier.
3d4b5d7 fixes another dumb bug in newly added mobj_reg_shm

jenswi-linaro

Looks good. It would be very good if you could break out the part

Also, OPTEE_MSG_RPC_CMD_SHM_ALLOC can return registered shared
buffer. To ease up use of such buffers all code that uses them
is moved to mobjs. That means that TA loader, fs_rpc, sockets, etc
all use mobjs to represent shared buffers instead of simple paddr_t.

And take that in a separate pull request, only the core part of the non-contiguous shared memory would remain after that.

jenswi-linaro · 2017-06-05T09:02:21Z

core/arch/arm/tee/entry_std.c

+	}
+
+	/* We don't need mobj there, we only care if it was created */
+	if (!msg_param_mobj_from_noncontig_param(arg->params, false))


Please update the comment with why this doesn't cause mobj leakage.

jenswi-linaro · 2017-06-05T09:23:19Z

core/arch/arm/tee/entry_std.c

+
+	arg = (struct optee_msg_arg *)mobj_get_va(mobj, 0);
+	if (arg == NULL) {
+		mobj_free(mobj);


Please use a cleanup label instead, as:

if (arg) goto error; . . . error: mobj_free(mobj); return NULL;

jenswi-linaro · 2017-06-05T10:54:39Z

core/include/optee_msg.h

@@ -140,15 +144,29 @@ struct optee_msg_param_value {
 };

 /**
+ * struct optee_msg_param_nested - nested params


The nested stuff can be removed

lorc · 2017-06-06T19:52:23Z

Okay, I created #1582. It includes three patches that I took from this PR.

lorc · 2017-06-15T22:06:28Z

During review this PR evolved into something completely different, multiple PRs were detached from it. Current code base does not resemble what was pushed originally. So I decided to close it and push new patches for registered shm in new PRs.
Thank you for your time, guys.

prime-zeng reviewed Dec 13, 2016

View reviewed changes

This was referenced Dec 13, 2016

New SHM design. Easy part. #1236

Merged

tee_client_api: register user memory OP-TEE/optee_client#72

Merged

New OP-TEE SHM design. linaro-swg/linux#29

Merged

etienne-lms reviewed Jan 11, 2017

View reviewed changes

lorc force-pushed the new_shm_rel branch 2 times, most recently from e1117ee to 623fb56 Compare January 24, 2017 18:54

jenswi-linaro reviewed Feb 1, 2017

View reviewed changes

etienne-lms reviewed Feb 1, 2017

View reviewed changes

lorc force-pushed the new_shm_rel branch from 623fb56 to c02682f Compare March 10, 2017 11:42

etienne-lms mentioned this pull request Mar 13, 2017

Is there a way to pass fixed physical memory region to TA as a parameter (shm_mem)? #1409

Closed

jenswi-linaro reviewed Mar 13, 2017

View reviewed changes

lorc force-pushed the new_shm_rel branch from c02682f to 44f5bca Compare April 5, 2017 17:24

jenswi-linaro reviewed Apr 7, 2017

View reviewed changes

lorc force-pushed the new_shm_rel branch from 47604e3 to 5681beb Compare May 18, 2017 19:53

lorc added 7 commits June 2, 2017 19:40

shmem: v6 delta

43b8c25

- OPTEE_MSG_ATTR_TYPE_NONCONTIG converted to OPTEE_MSG_ATTR_NONCONTIG: This is flag now, rather than a type. - removed OPTEE_MSG_ATTR_FRAGMENT (forgot to remove in prev commit). Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

reg_shm: delta v8: added new mobj_shm

5e668b5

mobj_shm represents buffer in predefined SHM region. It is used as fallback, when dynamic SHM is disabled. also removed "#ifdef CFG_SMALL_PAGE_USER_TA" check Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

mobj: mobj_reg_shm: fix offset calculation

3d4b5d7

There was bug in offset calculation: invalid variable was used. Also division remainder was replaced with bitwise and to speed up things a bit. Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>

lorc force-pushed the new_shm_rel branch from 6d9c9a2 to 3d4b5d7 Compare June 2, 2017 21:58

jenswi-linaro reviewed Jun 5, 2017

View reviewed changes

lorc mentioned this pull request Jun 6, 2017

Use mobjs to pass shared buffers #1582

Merged

lorc closed this Jun 15, 2017

lorc deleted the new_shm_rel branch June 15, 2017 22:38

lorc restored the new_shm_rel branch June 15, 2017 22:39

jenswi-linaro mentioned this pull request Jun 19, 2017

Assign non-sec DDR configuration from DT #1623

Merged

[RFC] Dynamic shared memory implementation. #1232

[RFC] Dynamic shared memory implementation. #1232

Conversation

lorc commented Dec 12, 2016

lorc commented Dec 12, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jenswi-linaro commented Dec 13, 2016

etienne-lms left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jenswi-linaro commented Jan 23, 2017

lorc commented Jan 31, 2017

jenswi-linaro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

etienne-lms left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lorc commented Feb 3, 2017

etienne-lms commented Feb 6, 2017

jenswi-linaro commented Feb 6, 2017

lorc commented Feb 10, 2017

etienne-lms commented Feb 13, 2017

lorc commented Mar 10, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jenswi-linaro Mar 14, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lorc commented Apr 5, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jenswi-linaro commented Apr 7, 2017

lorc commented Apr 7, 2017

etienne-lms commented May 17, 2017

jenswi-linaro commented May 17, 2017 • edited Loading

jenswi-linaro Mar 14, 2017 •

edited

Loading

jenswi-linaro commented May 17, 2017 •

edited

Loading

lorc commented May 18, 2017 •

edited

Loading

jenswi-linaro commented May 22, 2017 •

edited

Loading