kernel: configure MPU only if switching to a different process #1822
Welcome! And thanks for the PR!
I don't think there are any docs or tests that need updating for this. Did your testing include running multiple apps as well?
Generally, I think this is a good change. However, I'm concerned there may be some extra considerations where we may need to identify the MPU configuration as "tainted". Specifically:
- if there are additional grant allocations, then the deny region may need to grow (though, this is likely not an immediate issue as the allowed region is generally minimal)
- if there was a change in permitted region, i.e. the application called `brk` or an IPC `allow`*, then these need to be recalculated
- Maybe others I'm not thinking of?
Basically, I think this is a good common-case optimization, but I think we also need a way for something to "poison" the MPU configuration to force it to be recalculated.
*though that second case may actually be fine, since in the current architecture it's always a different process that is allowing a region, so we'll go through a regular process switch
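To make the "poison" idea concrete, here is a minimal sketch, assuming a hypothetical `MpuConfig` type with a single `app_break` field standing in for the real region state (neither is Tock's actual structure). Every mutating path sets a dirty flag, and whoever applies the configuration clears it:

```rust
use core::cell::Cell;

/// Hypothetical per-process MPU configuration (not Tock's actual type).
pub struct MpuConfig {
    /// End of the app-accessible region; stands in for real region state.
    app_break: Cell<usize>,
    /// Set whenever the configuration changes after it was last applied.
    dirty: Cell<bool>,
}

impl MpuConfig {
    /// A fresh configuration has never been applied, so it starts dirty.
    pub fn new(app_break: usize) -> Self {
        MpuConfig {
            app_break: Cell::new(app_break),
            dirty: Cell::new(true),
        }
    }

    /// Every mutating path (e.g. a `brk` call) taints the configuration.
    pub fn set_app_break(&self, new_break: usize) {
        self.app_break.set(new_break);
        self.dirty.set(true);
    }

    pub fn app_break(&self) -> usize {
        self.app_break.get()
    }

    /// Returns true if the hardware must be rewritten, and clears the flag.
    pub fn take_dirty(&self) -> bool {
        self.dirty.replace(false)
    }
}
```

With something like this, the fast path can skip the MPU write only when `take_dirty()` returns false.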
We would need to update the kernel/src/platform/mpu.rs documentation, particularly to clarify that the MPU implementation must be set up to enforce only at the userspace privilege level.
Yes, I also tried this with multiple apps and didn't notice any issues.
Thank you for catching this! I actually looked at various places that can change the MPU config but missed at least one case where MPU config can be modified and not applied immediately, e.g. |
Yeah, I think that's a great idea. I believe the mpu configuration structure is already well-equipped for this as all of its state is private and can only be modified by getters and setters; hooking the latter to mark the MPU configuration as dirty, and then perhaps updating In that case, I actually imagine we could remove the |
1826: kernel: use MapCell for Process::stored_state r=bradjc a=alphan

### Pull Request Overview

Currently, `Process::stored_state` is defined as a `Cell`. This change aims to improve syscall latency by using `MapCell` instead. See also: #1730, #178.

### Testing Strategy

I ran the same `libtock-rs` app used to test #1822 with and without this change on opentitan on verilator. This change saves ~1200 cycles for a `command` syscall.

### TODO or Help Wanted

The main goal of `MapCell` is to avoid panics at runtime, but I'm not sure if it is OK to silently continue in cases where `Process::stored_state` can _possibly_ be empty, which should never happen. The current implementation exclusively uses `Cell::get`, but a `take` could always leave a `default` `StoredState` behind so this change doesn't seem to be worse in that regard. I tried to preserve the current behavior but please let me know if I missed anything.

### Documentation Updated

- [X] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [x] Ran `make formatall`.

Co-authored-by: Alphan Ulusoy <alphan@google.com>
Currently, the scheduler configures the MPU every time it switches to a process. This change aims to improve syscall latency by caching the last running process and configuring the MPU only when switching to a different process. See also: tock#1730.
This matches the new `contains` method of Option: rust-lang/rust#62358
This looks good to me!
I added the implementation for ARM as well. It's a bit lighter weight there, as ARM already largely did this by caching the region calculation result. However, this is still a case where we can trade 8 memory writes for 1 memory compare on the hot path, so it's likely worth it.
Re #1654: This adds a feature from the upcoming Option API. It's trivial to re-implement ourselves if this manages to become the last blocking feature, so I think there's little harm in using the official API for now.
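For reference, a re-implementation along those lines could look like the sketch below. The trait name `OptionContains` and the method name `contains_value` are made up for illustration (chosen so they cannot clash with the official method); this is not code from the PR:

```rust
/// Hypothetical stand-in for the `contains` method from the upcoming Option API.
pub trait OptionContains<T> {
    /// Returns true if the option is `Some` and holds a value equal to `x`.
    fn contains_value(&self, x: &T) -> bool;
}

impl<T: PartialEq> OptionContains<T> for Option<T> {
    fn contains_value(&self, x: &T) -> bool {
        match self {
            Some(y) => y == x,
            None => false,
        }
    }
}
```

For example, `Some(3).contains_value(&3)` is true, while `None::<usize>.contains_value(&3)` is false.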
@ppannuto, thank you for reviewing this change and your comments. I have a question regarding the changes in |
Hmm, you're right: we need some form of structure that is global to the MPU hardware. The MPU configuration is stored per process (here), so doing dirty tracking in the I don't think the |
The |
Not really sure what I was thinking the first time, but this now correctly tracks the hardware configuration in the hardware structure and the per-process MPU configuration in the process config structure, and will update the hardware if either doesn't match. This also adds some documentation to these structures to save me from myself in the future.
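A rough sketch of that split, under assumed names: `Mpu`, `MpuConfig`, the `id` field, and the `writes` counter are all hypothetical and exist only to make the behavior observable. The hardware structure remembers which configuration it last loaded, the per-process structure remembers whether it changed, and a write happens if either check fails:

```rust
use core::cell::Cell;

/// Hypothetical per-process configuration: knows whether it has changed.
pub struct MpuConfig {
    /// Assumed to be unique per configuration.
    id: usize,
    is_dirty: Cell<bool>,
}

impl MpuConfig {
    pub fn new(id: usize) -> Self {
        MpuConfig { id, is_dirty: Cell::new(true) }
    }

    /// Called by anything that modifies the regions.
    pub fn mark_dirty(&self) {
        self.is_dirty.set(true);
    }
}

/// Hypothetical hardware-side state: knows which configuration is loaded.
pub struct Mpu {
    last_configured: Cell<Option<usize>>,
    /// Counts simulated register updates, for illustration only.
    writes: Cell<usize>,
}

impl Mpu {
    pub fn new() -> Self {
        Mpu { last_configured: Cell::new(None), writes: Cell::new(0) }
    }

    /// Rewrites the hardware only if it holds a different configuration
    /// or the requested configuration changed since it was last applied.
    pub fn configure(&self, config: &MpuConfig) {
        let loaded = self.last_configured.get() == Some(config.id);
        if !loaded || config.is_dirty.get() {
            // A real implementation would write the region registers here.
            self.writes.set(self.writes.get() + 1);
            config.is_dirty.set(false);
            self.last_configured.set(Some(config.id));
        }
    }

    pub fn writes(&self) -> usize {
        self.writes.get()
    }
}
```

Keeping the "what is loaded" state on the hardware side means a clean configuration is still rewritten after another process has run in between.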
This all looks good to me; looking forward to the rest of the #1860 series.
Reverted changes in |
Thank you for reviewing this @bradjc and @silvestrst! Added comments to address @silvestrst's comments. |
bors r+ |
big queue i guess? |
Pull Request Overview
Currently, the scheduler configures the MPU every time it switches
to a process. This change aims to improve syscall latency by caching the
last running process and configuring the MPU only when switching to a
different process.
See also: #1730.
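The caching described in the overview can be illustrated roughly as follows. This is a simplified sketch, not the kernel code from this PR: `Scheduler`, `ProcessId`, `switch_to`, and the counter are hypothetical names used only to show the comparison on the hot path.

```rust
use core::cell::Cell;

/// Hypothetical process identifier.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct ProcessId(pub usize);

/// Hypothetical scheduler state that caches the last running process.
pub struct Scheduler {
    last_running: Cell<Option<ProcessId>>,
    /// Counts simulated MPU configurations, for illustration only.
    mpu_configurations: Cell<usize>,
}

impl Scheduler {
    pub fn new() -> Self {
        Scheduler {
            last_running: Cell::new(None),
            mpu_configurations: Cell::new(0),
        }
    }

    /// Configures the MPU only when switching to a different process.
    pub fn switch_to(&self, next: ProcessId) {
        if self.last_running.get() != Some(next) {
            // A real kernel would program the MPU for `next` here.
            self.mpu_configurations.set(self.mpu_configurations.get() + 1);
            self.last_running.set(Some(next));
        }
        // The process would then run until it yields or makes a syscall.
    }

    pub fn mpu_configurations(&self) -> usize {
        self.mpu_configurations.get()
    }
}
```

In this sketch, repeated syscalls from the same process cost one comparison each instead of a full MPU reconfiguration.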
Testing Strategy
I ran a simple `libtock-rs` application for measuring syscall latency with
and without this change on opentitan on verilator. This change reduces the
latency of a command syscall by 27511711 cycles (the exact number of cycles
saved obviously depends on the actual implementation, e.g. would probably
be less after #1821).
Also ran `make allcheck`.
TODO or Help Wanted
I couldn't find any doc files or tests that should be updated. Since
this is my first real Tock PR, it would be great if you could point me
in the right direction if I missed anything.
Documentation Updated
Updated the relevant files in `/docs`, or no updates are required.
Formatting
Ran `make formatall`.