kernel: support memory mapped stacks #70810
Conversation
Pylint complains so we fix. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
x86_64 does not currently support demand paging so don't advertise it. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Most places use CONFIG_X86_STACK_PROTECTION, but there are some places using CONFIG_HW_STACK_PROTECTION. So synchronize all to use CONFIG_X86_STACK_PROTECTION instead. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Previous commit changed the privileged stack size to be using kconfig CONFIG_PRIVILEGED_STACK_SIZE instead of simply CONFIG_MMU_PAGE_SIZE. However, the stack bound check function was still using the MMU page size, so fix that. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds the mechanism to do cleanup after k_thread_abort() is called with the current thread. This is mainly used for cleaning up things when the thread cannot be running, e.g., cleaning up the thread stack. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is similar to k_mem_map()/_unmap(), but instead of using anonymous memory, the provided physical region is mapped into the virtual address space. In addition to simply mapping physical to virtual addresses, the mapping also adds two guard pages, before and after the virtual region, to catch buffer under-/over-flow. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This introduces support for memory mapped thread stacks, where each thread stack is mapped into virtual memory address space with two guard pages to catch under-/over-flowing the stack. This is just on the kernel side. Additional architecture code is required to fully support this feature. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This extends the test for memory mapped stacks, as the address of the memory mapped stack object differs from that of the actual stack object. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This extends the thread abort hook to support memory mapped stacks. The calculation to find out which thread pool instance the outgoing thread belongs to requires a physical address, so find the physical address via the memory mapped stack. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This extends the mem_protect/stackprot tests to support testing with memory mapped stack. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The current memory mapped stack code requires actual physical addresses for stacks, and cannot deal with stacks already using virtual addresses. So disable mapped stack via defconfig. Note that this is done before enabling memory mapped stacks on x86 so test won't fail when the support in the architecture code is introduced. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds the necessary bits to enable memory mapping thread stacks on both x86 and x86_64. Note that currently these do not support multi level mappings (e.g. demand paging and running in virtual address space: qemu_x86/atom/virt board) as the mapped stacks require actual physical addresses. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Well done! It's not every day a 2xxxx series issue gets some attention 🪁
	 */
	thread->stack_info.mapped.addr = NULL;
	thread->stack_info.mapped.sz = 0;
#endif /* CONFIG_THREAD_STACK_MEM_MAPPED */
How does this work with SMP? What happens if two threads running in parallel do z_thread_abort(self)?
The spinlock in k_thread_abort_cleanup() will serialize them. Thread 1 will set up the data for deferred unmapping. Thread 2 will see that, do the unmapping for thread 1, and then set up its own deferred unmapping data.
aaah ok, there is a cleanup first for any pending thread.
yay! overflow detection!
I'm gonna be that guy again. This is a huge patch and a ton of complexity... do we really want to do it this way? The feature has value, but is this the right API? I mean, look at Linux. Linux has all this stuff too, and it doesn't have any support in the kernel to do it. You create a new thread stack with mmap() and just point clone() to it, and the kernel uses it right where you told it to. Can't we do this with a library layer where you say "give me a mmap'd thread stack", you get one, and then pass that to k_thread_create()? Why bother making a complicated thread creation API even more complicated?
This could also be used for some other interesting things aside from thread stacks 😏
 *
 * @param thread Pointer to thread to be cleaned up.
 */
void k_thread_abort_cleanup(struct k_thread *thread);
Good example here: if your stack creation is a library level thing, you can just provide a top level wrapper or whatever and clean it up on exit. No need to muck with thread lifecycle stuff in the kernel
This could also be used for some other interesting things aside from thread stacks 😏
In fact pthreads would be the obvious choice for the "wrapper" API that does the appropriate stack allocation/mapping magic on behalf of the app, because that's where users expect that abstraction to live as that's where they find it on Linux.
	 * cleared below.
	 */
	void *stack_mapped = k_mem_phys_map((uintptr_t)stack, stack_obj_size,
					    K_MEM_PERM_RW | K_MEM_CACHE_WB | K_MEM_MAP_UNINIT);
Also just to be pedantic, this isn't an optional thing in this scheme: every stack in every thread gets remapped automatically whether it wants it or not. But control over stack memory is something a lot of apps want. Consider a runtime for a Go-like language with lots of I/O threads flying around using tiny bounded stacks, etc...
Obviously most apps will just want defaults, but at the same time most apps would be just as happy with a recommendation like "use pthreads to get automatic MMU stack bounds, k_thread_create() is the low level API".
This is done this way to be as transparent to the API user as possible. As long as the kconfig is enabled, the feature is enabled without any code changes in the application. With your idea of having a library call to produce memory mapped stacks (whether coming from malloc or statically declared stacks), the user would need to change application code to use it. If you can come up with another solution that does not require application code changes, please feel free to submit it. I am very interested in seeing such a solution.
But... why do we not want application code changes, though? New features get new APIs, that's not so weird to me? And I repeat, it's not as simple as "just turn it on": some apps won't want this, or will want it for only certain threads. k_thread_create() is an extremely complicated API already.

The philosophy here is basically that toolkits are better than frameworks. There's no Right Answer for how to lay out memory spaces; it's just a complicated area. Every runtime on Linux tends to have its own ideas about how to do this, for example (Python threads don't get initialized like bare pthreads, which are different from Rust, etc...), and it won't be any different for us. If we land this API like it is, we're stuck with it forever, and I really don't think this is likely to evolve cleanly.
Just a suggestion: couldn't mapping be a flag added to k_thread_stack_alloc()? I assume it should be possible to detect whether a stack is mapped in k_thread_stack_free().
If an app doesn't want it, they don't have to turn on the kconfig. I can add a thread creation flag to use the stack as-is. The fundamental question is whether we want to be safe by default, or only opt into safety if needed (or rather, if someone remembers to be safe). I would argue that in most cases, safe by default should be the choice, and that being unsafe needs to be a conscious choice (or rather, MUST be a conscious choice). Here, having guards around the stack is important to prevent corrupting data when running in kernel mode, whether with kernel-mode-only threads or when serving syscalls with the privileged stack. Overflowing a thread stack in user mode will touch the privileged stack in the current arch MMU code and generate a page fault, so that is covered. However, kernel-mode-only threads and syscall handling run in kernel mode, and most stacks are declared statically, so without guard pages an overflow there would silently corrupt whatever data is adjacent.
This only works if stacks are allocated at runtime. Statically declared stacks (which the majority are) would still need to be pre-processed first if mapping is not handled automatically.
This adds support to memory map thread stacks during thread creation, and subsequent (delayed) un-mapping when the thread is terminated. With memory mapped stacks, guard pages can be created enclosing the stacks to catch stack overflow and underflow. This requires MMU.
(See individual commits for details.)
Relates: #28899