Add a Stack Allocation API #26999

cfriedt · 2020-07-20T21:43:53Z

Relates to #24714

We want an API such that we can pass in a desired stack size, and a suitably aligned pointer to a stack object will be returned, with all reserved space, alignment, and permissions considerations taken care of.

andrewboie · 2020-07-20T22:40:56Z

With the updates to sys_heap this shouldn't be hard, one of my PRs needs to go in #24714 and then it should be easy.

andrewboie · 2020-08-13T17:23:28Z

For cases where user mode compatibility is not needed, for a stack with requested buffer size N just allocate a blob with size Z_KERNEL_STACK_SIZE_ADJUST(N) and aligned to Z_KERNEL_STACK_OBJ_ALIGN.

To get this completely working with user mode the following steps need to be taken:

At creation time it must be decided whether the created stack will have the capability to host a user thread, or only a supervisor thread.
To create a stack with requested size N, requires reserving a memory region of size X with alignment Y, where both X and Y are functions of N that vary per-arch. We now have macros to help with this:
- For size, X = Z_THREAD_STACK_SIZE_ADJUST(N) or X = Z_KERNEL_STACK_SIZE_ADJUST(N), for user and supervisor-only stacks respectively
- For alignment, Y = Z_THREAD_STACK_OBJ_ALIGN(N) or Y = Z_KERNEL_STACK_OBJ_ALIGN, for user and supervisor-only stacks respectively. In the latter case, the alignment is not a function of N.
If the stack can host user thread, kernel object metadata must be set up properly
- A struct z_object metadata must be created for the stack object and placed within the dynamic tables, this is handled by z_dynamic_object_create()
- If CONFIG_GEN_PRIV_STACKS=y, then the privilege mode stack memory must also be allocated, this is a fixed-size kernel-only stack. A struct z_stack_data must also be allocated and properly configured.

cfriedt · 2020-10-09T12:43:23Z

Looking at this again, and we were just chatting about this on Slack. Some additional notes:

#29064 - @andrewboie provides the struct k_thread * as an argument to fn_abort()
^^ this will also be necessary for #25973 as we need to free the stack from a safe context (either another thread or the idle thread). I would for sure use this function pointer in pthread.c .

I thought about adding another flag K_STACK_ON_HEAP and oring that into user_options. That would be easiest, as long as the original void * from k_malloc() can be recreated from the members of struct _thread_stack_info, because that same void * will need to be passed back to k_free().

andrewboie · 2020-10-09T20:50:14Z

I thought about adding another flag K_STACK_ON_HEAP and oring that into user_options. That would be easiest, as long as the original void * from k_malloc() can be recreated from the members of struct _thread_stack_info, because that same void * will need to be passed back to k_free().

I don't think you should be touching anything inside the k_thread object, that is reserved for the kernel.
Do this accounting at the pthread level.

cfriedt · 2021-02-03T20:51:57Z

For cases where user mode compatibility is not needed, for a stack with requested buffer size N just allocate a blob with size Z_KERNEL_STACK_SIZE_ADJUST(N) and aligned to Z_KERNEL_STACK_OBJ_ALIGN.

That was easy - 15/20 platforms are passing as-is right now with that approach. I think that the stack size might be too small on some 64-bit architectures, so I'll have to increase the default.

To get this completely working with user mode the following steps need to be taken:

At creation time it must be decided whether the created stack will have the capability to host a user thread, or only a supervisor thread.

I have not done that yet. Currently, I've mainly considered kernel mode threads.

To create a stack with requested size N, requires reserving a memory region of size X with alignment Y, where both X and Y are functions of N that vary per-arch. We now have macros to help with this:

For size, X = Z_THREAD_STACK_SIZE_ADJUST(N) or X = Z_KERNEL_STACK_SIZE_ADJUST(N), for user and supervisor-only stacks respectively

For alignment, Y = Z_THREAD_STACK_OBJ_ALIGN(N) or Y = Z_KERNEL_STACK_OBJ_ALIGN, for user and supervisor-only stacks respectively. In the latter case, the alignment is not a function of N.

If the stack can host user thread, kernel object metadata must be set up properly

A struct z_object metadata must be created for the stack object and placed within the dynamic tables, this is handled by z_dynamic_object_create()

If CONFIG_GEN_PRIV_STACKS=y, then the privilege mode stack memory must also be allocated, this is a fixed-size kernel-only stack. A struct z_stack_data must also be allocated and properly configured.

Today I took a stab at the creating a k_thread_stack_alloc() function that would create a dynamic k_object under the assumption that it was a compile-time issue whether to use kernel or user mode threads, but that was a bad assumption. It would be nice to have a consistent API for that that would work for both user and kernel threads, such that whether it was user or kernel mode would be decided by the mode of the calling thread.

For kernel threads, it should be fine to deallocate the stack with a call to abort_fn(), but for user mode threads, it looks as though the stack could be automatically cleaned up when the number of references to the stack drop to zero. Is that true?

@nashif - I'm hoping to finish this up in the near future. I'll need to come up with a better system than the K_STACK_ON_HEAP flag since @andrewboie does not want me to touch the k_thread object. Probably the next-best thing I could do would be to keep a linked-list of pointers to stacks that have been dynamically allocated, and then k_free() if necessary in the abort_fn() callback.

cfriedt · 2021-02-27T15:54:26Z

z_dynamic_object_aligned_create() aligns the dynamic object, not the data section of the dynamic object, so it does not, by itself, appear to be able to be able to allocate a suitably aligned stack area. It would be ideal to be able to re-use k_aligned_alloc() to allocate the stack area, then to allocate a dynamic object, and then to embed a pointer to the stack area inside of it, but I'm not sure if that's exactly how kernel objects work in terms of MPU / MMU protection.

This change adds the syscall k_alloc_thread_stack() which dynamically allocates suitably aligned and padded thread stacks. Fixes zephyrproject-rtos#26999 Signed-off-by: Christopher Friedt <chrisfriedt@gmail.com>

This change adds tests for dynamically allocated thread stacks. Fixes zephyrproject-rtos#26999 Signed-off-by: Christopher Friedt <chrisfriedt@gmail.com>

joelguittet · 2023-04-01T20:25:10Z

Hello @cfriedt

I would like to know the status and goals on this feature, I see you have also #44379 that is a draft PR.

My understanding is actually what you have develop permits to reserve an area (looks it is a good idea !) to declare dynamically some stacks, that can be passed to k_thread_create. It seems then you need to k_thread_join waiting the termination of the task to free the stack itself when it's no more used. Is it correct ?

The last point is annoying in my opinion, because this means using dynamic stack we need to implement a kind of controller that will check for tasks created with dynamic stack, verify if they are terminated and then release the corresponding stacks.

If I refer to FreeRTOS, this is what is done: when a task is deleted, it is added to a list for termination. Then the idle task periodically check this list and free memory.

So my question will be more exactly and more directly: why not having the same behavior in Zephyr, idle task checking terminated tasks to free the dynamically created stacks that are no more used ?

Thanks,
Joel

cfriedt · 2023-04-01T20:42:54Z

Hi @joelguittet - sorry this issue has not moved very much, but I really don't get very many opportunities to work on this feature these days. Most of this is done in my "spare time".

The goal is to be able to dynamically allocate a stack. There have been many attempts with PRs that try and do this sort of thing using a pool of statically allocated stacks. That is technically quite easy.

What is not as easy is using arbitrary heap memory for a thread stack. For that, we need a corresponding kernel object as well for when CONFIG_USERSPACE is enabled. Additionally, on systems with an MMU, we want to be able to set the correct MMU flags for the allocated stack regions before the thread begins to run, and clear the flags after the thread has joined.

All of that is done in #44379 except for the MMU bits. I have not really had an opportunity to figure those out. Also, from what I understand, it may be a different procedure for different architectures (but I'm not certain of that).

There is additional complexity for systems that have an MPU or PMP, because those units can sometimes have insane memory alignment requirements for thread stacks.

So yes, there is the typical requirements for stack setup / tear-down, but also additional complexities.

I have no ETA for when this will be finished, as it is quite low on my list of priorities at the moment.

joelguittet · 2023-04-01T21:35:05Z

@cfriedt
Thanks for the prompt answer! You made a big picture which is very interesting, looks like the topic is more expanded that I was thinking.
Regarding ETA, no worries, I understand perfectly, myself I'm also developing in my free time :-)
I will try to have a specific implementation for my use case.
Joel

cfriedt assigned andrewboie Jul 20, 2020

andrewboie added this to the v2.4.0 milestone Jul 20, 2020

andrewboie added Enhancement Changes/Updates/Additions to existing features area: Kernel labels Jul 20, 2020

nashif added the priority: medium Medium impact/importance bug label Aug 13, 2020

cfriedt mentioned this issue Oct 9, 2020

lib: posix: dynamic stack support for pthread_create() #25973

Closed

cfriedt pinned this issue Feb 3, 2021

nashif unpinned this issue Feb 3, 2021

cfriedt mentioned this issue Mar 3, 2021

build: remove dynamic-pthread-stacks patch cfriedt/greybus-for-zephyr#10

Open

nashif unassigned andrewboie Apr 23, 2021

cfriedt mentioned this issue Aug 13, 2021

add an eBPF interpreter and JIT interface #37671

Open

cfriedt mentioned this issue Feb 1, 2022

The pthreads are not working on user space. ARM64 cortex_a53 but generic requirement. #42279

Closed

This was referenced Mar 13, 2022

Support for std::thread and std::this_thread #25569

Open

kernel: add a stack allocation api #43730

Closed

cfriedt modified the milestones: v2.4.0, v3.3.0 Oct 21, 2022

stephanosio modified the milestones: v3.3.0, v3.4.0 Jan 26, 2023

joelguittet mentioned this issue Feb 11, 2023

platform : zephyr mender_rtos_task_create does not use stack_size joelguittet/mender-mcu-client#4

Closed

nashif removed this from the v3.4.0 milestone Jun 1, 2023

cfriedt mentioned this issue Jun 30, 2023

kernel: support dynamic thread stack allocation #44379

Merged

cfriedt closed this as completed in #44379 Jul 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a Stack Allocation API #26999

Add a Stack Allocation API #26999

cfriedt commented Jul 20, 2020 •

edited

Loading

andrewboie commented Jul 20, 2020

andrewboie commented Aug 13, 2020

cfriedt commented Oct 9, 2020 •

edited

Loading

andrewboie commented Oct 9, 2020

cfriedt commented Feb 3, 2021 •

edited

Loading

cfriedt commented Feb 27, 2021

joelguittet commented Apr 1, 2023

cfriedt commented Apr 1, 2023

joelguittet commented Apr 1, 2023

Add a Stack Allocation API #26999

Add a Stack Allocation API #26999

Comments

cfriedt commented Jul 20, 2020 • edited Loading

andrewboie commented Jul 20, 2020

andrewboie commented Aug 13, 2020

cfriedt commented Oct 9, 2020 • edited Loading

andrewboie commented Oct 9, 2020

cfriedt commented Feb 3, 2021 • edited Loading

cfriedt commented Feb 27, 2021

joelguittet commented Apr 1, 2023

cfriedt commented Apr 1, 2023

joelguittet commented Apr 1, 2023

cfriedt commented Jul 20, 2020 •

edited

Loading

cfriedt commented Oct 9, 2020 •

edited

Loading

cfriedt commented Feb 3, 2021 •

edited

Loading