Implement stack switching on system call entries and on ARM traps/IRQ's #1174
Nice work! Besides the inline comments, would it be okay to open a separate PR for the Arm part? I would like to give a bit more thought to the implementation of the IRQ / trap stack without blocking the x86_64 part. I am planning to provide a review asap, so that we get it merged by early next week. Thanks a lot 🙏🏼
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
Add a macro-definition for the x86 `KERNEL_GS_BASE` MSR that is usually used in conjunction with the `swapgs` instruction for an efficient swap between the `gs_base` register placed in `KERNEL_GS_BASE` and `GS_BASE` and also add a macro-definition for the `MSR_GS_BASE` MSR to access the `gs_base` register in 64-bit mode. Co-authored-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
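As a rough illustration of the relationship between the two MSRs and `swapgs`, here is a minimal user-space model. The MSR indices are the architectural ones, but the macro names, `struct gs_model` and `model_swapgs()` are hypothetical and only stand in for the privileged register state; this is not the code added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR indices. The macro names are illustrative and may
 * differ from the ones introduced by the commit.
 */
#define MODEL_MSR_GS_BASE        0xc0000101U
#define MODEL_MSR_KERNEL_GS_BASE 0xc0000102U

/* Hypothetical software model of the two registers involved. */
struct gs_model {
	uint64_t gs_base;        /* base used for %gs-relative accesses */
	uint64_t kernel_gs_base; /* shadow value held in KERNEL_GS_BASE */
};

/* Model of `swapgs`: exchange GS_BASE with KERNEL_GS_BASE. */
static inline void model_swapgs(struct gs_model *m)
{
	uint64_t tmp;

	assert(m);
	tmp = m->gs_base;
	m->gs_base = m->kernel_gs_base;
	m->kernel_gs_base = tmp;
}
```

Executing the model twice restores the initial state, which mirrors how a `swapgs` on entry is paired with a second `swapgs` on exit.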
Implement basic methods to be able to read/write from/to the GS_BASE and KERNEL_GS_BASE MSR's, as well as from an offset relative to the former's value. Co-authored-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Align the `lcpus` array with other per-cpu variables by defining it through the `UKPLAT_PER_LCPU_DEFINE` macro. Furthermore, add a `TODO:` for the `lcpus` used in `lcpu_start64`. This direct usage leaks the implementation of `UKPLAT_PER_LCPU_DEFINE`. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Every LCPU shall have their `GS_BASE` and `KERNEL_GS_BASE` registers assigned to their own `struct lcpu` element of the `lcpus` array. Co-authored-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Every LCPU shall have their `TPIDR_EL1` system register assigned the value of the address of their own `struct lcpu` element in the global `lcpus` array. Co-authored-by: Michalis Pappas <michalis@unikraft.io> Signed-off-by: Michalis Pappas <michalis@unikraft.io> Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Implement an architecture-specific fast access to the currently executing CPU's index in the array of statically declared CPUs. On x86_64, this is done by dereferencing the `struct lcpu` pointer of the currently executing CPU previously stored inside the `GS_BASE` MSR. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Implement an architecture specific fast access to the currently executing CPU's index in the array of the statically declared CPU's. On ARM64 this is done by dereferencing from the `struct lcpu` we know is stored inside `tpidr_el1`. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Now that we have a fast access to the currently executing CPU's `struct lcpu` structure inside the array of `struct lcpu`'s, make use of `lcpu_arch_idx()` in `ukplat_lcpu_idx`. With this, we now no longer need to employ the logic implied by `CONFIG_UKPLAT_LCPU_IDISIDX`. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
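The idea behind the last few commits can be sketched in plain C: once a per-CPU register holds a pointer to the CPU's own descriptor, both the index and the hardware ID are a single dereference away. Everything below is a simplified stand-in: `struct demo_lcpu`, `demo_percpu_reg` and the `demo_*` functions are hypothetical names, and a global variable plays the role of `GS_BASE` / `TPIDR_EL1`.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_LCPUS 4 /* hypothetical; stands in for the configured CPU count */

/* Reduced stand-in for the real `struct lcpu`; only the fields needed here. */
struct demo_lcpu {
	uint32_t idx; /* position in the demo_lcpus array */
	uint64_t id;  /* hardware ID of the CPU */
};

static struct demo_lcpu demo_lcpus[DEMO_MAX_LCPUS];

/* Stand-in for the per-CPU system register (GS_BASE / TPIDR_EL1). */
static struct demo_lcpu *demo_percpu_reg;

/* Model of early per-CPU init: record idx/id and publish the pointer. */
static void demo_lcpu_init(uint32_t idx, uint64_t hw_id)
{
	assert(idx < DEMO_MAX_LCPUS);
	demo_lcpus[idx].idx = idx;
	demo_lcpus[idx].id = hw_id;
	demo_percpu_reg = &demo_lcpus[idx];
}

/* Fast path: one dereference, no search of the array by hardware ID. */
static uint32_t demo_lcpu_idx(void)
{
	assert(demo_percpu_reg);
	return demo_percpu_reg->idx;
}

/* Same idea for the ID: read the cached value instead of querying hardware. */
static uint64_t demo_lcpu_id(void)
{
	assert(demo_percpu_reg);
	return demo_percpu_reg->id;
}
```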
The `lcpu_arch_id()` method can be slow depending on the architecture. Instead, rely on the ID currently stored inside the currently executing CPU's `struct lcpu`. Furthermore, since on non-SMP builds we make `lcpu_arch_id()` always return 0, make `ukplat_lcpu_id()` also return 0 by default anyway. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Therefore, move the guarding `#ifdef` below the register definitions so that assembly files may be able to include this header and benefit from the aforementioned macros. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Although the vector table has `EL0` exception related entries, we never execute from `EL0`, so they will never be taken. Furthermore, `SP_EL0` is a perfect candidate for an emergency scratch register in an `EL1`-only OS like ours. Thus, remove all of the `EL0` logic, while maintaining the same alignments and semantics. Lastly, allow catching mis-sized layouts of `struct __regs` and `struct __callee_saved_regs` at build time by using `UK_CTASSERT` on them with `__REGS_SIZEOF`/`__REGS_PAD_SIZE` macros similar to those defined by x86. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
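The build-time size check can be illustrated with C11's `static_assert`, used here as a stand-in for `UK_CTASSERT`. The frame layout and the `DEMO_REGS_*` values are assumptions made up for this sketch and do not reproduce the real `struct __regs`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced register frame; not the real `struct __regs`. */
struct demo_regs {
	uint64_t x[31];    /* x0..x30 */
	uint64_t sp;
	uint64_t elr_el1;
	uint64_t spsr_el1;
	uint64_t esr_el1;
	uint64_t pad;      /* keeps the frame a multiple of 16 bytes */
};

/* Sizes that assembly code would hard-code; values belong to this sketch. */
#define DEMO_REGS_PAD_SIZE 8
#define DEMO_REGS_SIZEOF   (35 * 8 + DEMO_REGS_PAD_SIZE)

/* Fail the build if the C layout and the assembly constants diverge. */
static_assert(sizeof(struct demo_regs) == DEMO_REGS_SIZEOF,
	      "struct demo_regs does not match DEMO_REGS_SIZEOF");
static_assert(DEMO_REGS_SIZEOF % 16 == 0,
	      "frame size must keep the stack 16-byte aligned");
```

If a field is later added to the structure without updating the macro, compilation stops instead of the assembly entry code silently saving registers at wrong offsets.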
Define a per-cpu buffer whose size is twice that of the configured stacks and whose layout may be represented through the following diagram:

```
      STACK_SIZE             STACK_SIZE
<---------------------><--------------------->
|============================================|
|                     |                      |
|     trap stack      |      IRQ stack       |
|                     |                      |
|============================================|
                      ^
                   SP_EL0
```

The middle address of this buffer shall be assigned to `SP_EL0`, a great candidate for a register that is otherwise unused in an `EL1`-only unikernel. Now, depending on whether an exception is an IRQ or a trap, the early assembly entry will switch to either the IRQ stack or the trap stack, by simply making use of the `SP_EL0` system register. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
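The address arithmetic implied by the layout above might look like the following sketch. It assumes that stacks grow downward, so the trap stack would start at the middle address and the IRQ stack at the end of the buffer; that choice, `DEMO_STACK_SIZE` and all `demo_*` names are assumptions of this example rather than the commit's actual entry code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STACK_SIZE 4096 /* hypothetical; the real size comes from config */

static_assert(DEMO_STACK_SIZE % 16 == 0, "stack size must be 16-byte aligned");

/* One buffer per CPU, twice the stack size: [trap stack | IRQ stack]. */
static _Alignas(16) unsigned char demo_excp_stacks[2 * DEMO_STACK_SIZE];

/* Value that would be written to SP_EL0: the middle of the buffer. */
static uintptr_t demo_sp_el0(void)
{
	return (uintptr_t)demo_excp_stacks + DEMO_STACK_SIZE;
}

/* Initial stack pointer chosen at exception entry (assumed, see lead-in):
 * traps start at the middle and grow down into the lower half, IRQs start
 * at the end of the buffer and grow down into the upper half.
 */
static uintptr_t demo_entry_sp(int is_irq)
{
	uintptr_t mid = demo_sp_el0();

	return is_irq ? mid + DEMO_STACK_SIZE : mid;
}
```

With a single register holding the middle address, both stack tops are reachable with at most one addition, which is presumably why one value is enough for two stacks.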
Define two macros to represent the maximum known possible alignment we can have for the ECTX we are aware of and its size respectively. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Define two macros to represent the maximum known possible alignment we can have for the ECTX we are aware of and its size respectively. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Simple macros that do not involve C code may as well be visible to code written in assembly. This may come in handy when not wanting to use magic values. Therefore, make it so that architecture specific macros from context related headers are also visible to assembly written code by having them defined outside the `!__ASSEMBLY` guards. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Add a method that transparently allocates and returns an auxiliary stack according to CONFIG_UKPLAT_AUXSP_PAGE_ORDER. Note that if `libukvmem` is enabled, this stack shall be mapped through VMA stack to ensure that it is backed by physical memory early on and, furthermore, ends in a guard page to let us know of potential stack overflows. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
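To make the role of the page order concrete, the sketch below derives the stack length as `page size << order` and returns the highest address of the allocation. It uses `aligned_alloc()` from the C standard library instead of a Unikraft allocator, does not model the `libukvmem` mapping or the guard page, and assumes that the returned pointer is the top of the stack; `DEMO_AUXSP_PAGE_ORDER` and the `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE        4096UL
#define DEMO_AUXSP_PAGE_ORDER 4 /* stand-in for CONFIG_UKPLAT_AUXSP_PAGE_ORDER */
#define DEMO_AUXSP_LEN        (DEMO_PAGE_SIZE << DEMO_AUXSP_PAGE_ORDER)

static_assert(DEMO_AUXSP_LEN % DEMO_PAGE_SIZE == 0,
	      "stack length must be a whole number of pages");

/* Allocate 2^order pages and return the highest address, i.e. the initial
 * stack pointer of a downward-growing stack. Returns 0 on failure.
 */
static uintptr_t demo_auxsp_alloc(void)
{
	void *base = aligned_alloc(DEMO_PAGE_SIZE, DEMO_AUXSP_LEN);

	if (!base)
		return 0;

	return (uintptr_t)base + DEMO_AUXSP_LEN;
}

/* Release a stack previously returned by demo_auxsp_alloc(). */
static void demo_auxsp_free(uintptr_t auxsp)
{
	if (auxsp)
		free((void *)(auxsp - DEMO_AUXSP_LEN));
}
```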
`struct uk_thread` contains a field called `auxsp` which is meant to represent an auxiliary stack that may be used when switching stacks during syscall entries or when simply wanting to have a scratch space to use during a very fragile state of the system, such as when handling an exception. Give access to this field through `struct lcpu` as well. This field shall represent the auxiliary stack pointer of the thread currently executing on this LCPU. If uksched/ukthread is disabled, this can come in handy as simply a pointer to a scratch space that we can contaminate with whatever we may want during a more fragile execution context. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Add a new field to `struct uk_thread` that can represent a secondary stack that can be used as a backup stack. This can become very useful in cases such as when wanting to defer exception handling without creating another thread. For example, we want to return from an exception into a function inside the same thread to be able to do deferred I/O outside exception context without contaminating the original stack that was present before the trap. We can avoid polluting the original stack by using this auxiliary stack instead. Update each thread creation/deletion method's signature and implementation accordingly. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
If scheduling/multithreading is not enabled, allocate an auxiliary stack for the bootstrap LCPU. Otherwise, the LCPU will have the same auxiliary stack as the thread that runs on it and this allocation is not needed. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Implement basic getter/setter methods for the `auxsp` field that exists in `struct lcpu`. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Make sure to fill in the value of current lcpu's (bootstrap lcpu in our case) auxiliary stack pointer with that of the current thread's (main thread) when initializing scheduling. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
If uksched/ukthread is enabled, make sure that the current LCPU's auxiliary stack pointer is updated to always point to the currently executing thread's auxiliary stack pointer by setting it to that of the thread it is about to switch to during context switching. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
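The bookkeeping described here boils down to one extra store during a thread switch. A minimal model, with hypothetical `struct sw_thread`, `struct sw_lcpu` and `sw_thread_switch()` standing in for the real scheduler types and functions:

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-ins for `struct uk_thread` and `struct lcpu`. */
struct sw_thread {
	uintptr_t auxsp; /* top of this thread's auxiliary stack */
};

struct sw_lcpu {
	uintptr_t auxsp; /* auxiliary stack of the thread running on this CPU */
};

static struct sw_lcpu sw_this_lcpu;
static struct sw_thread *sw_current;

/* Model of a context switch: publish the next thread's auxiliary stack in
 * the per-CPU structure before the thread starts running, so that entry
 * code always finds the right stack there.
 */
static void sw_thread_switch(struct sw_thread *next)
{
	assert(next);
	sw_this_lcpu.auxsp = next->auxsp;
	sw_current = next;
	/* A real implementation would now switch registers and stacks. */
}
```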
Deprecate locally defined `ENTRY` macro in favor of the more generally available, equivalent macro defined in `uk/asm.h`. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
We would usually use the application's stack to execute the issued system call. However, this presents a couple of problems. If the application's stacks are too small (as is the case for Go's goroutines), this will end up generating an unhandled page fault or, even worse, corrupting other memory areas beyond the respective stack. Another problem is that we will end up overwriting the Red Zone for those applications that employ it. To fix this, switch to the per-thread auxiliary stack on system call entry and use that instead. The switch shall be done with the help of the `swapgs` instruction, which swaps the `gs_base` registers, as our own `KERNEL_GS_BASE` contains a pointer to the current LCPU's `struct lcpu`, which has an up-to-date pointer to the current thread's auxiliary stack. Co-authored-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Marco Schlumpp <marco@unikraft.io> Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
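After `swapgs`, a `%gs`-relative load at a fixed offset is enough to fetch the auxiliary stack pointer. The following C model mimics that load with `offsetof()`; `struct gs_lcpu`, its field order, `DEMO_AUXSP_OFFSET` and `demo_load_auxsp()` are hypothetical and do not reflect the real `struct lcpu` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced stand-in for `struct lcpu`; field order is an assumption. */
struct gs_lcpu {
	uint32_t idx;
	uint64_t id;
	uintptr_t auxsp; /* top of the current thread's auxiliary stack */
};

/* Constant that assembly code would use as the %gs-relative displacement. */
#define DEMO_AUXSP_OFFSET offsetof(struct gs_lcpu, auxsp)

/* Model of a load from `%gs:DEMO_AUXSP_OFFSET`: read a pointer-sized value
 * at gs_base + offset, where gs_base is the address of the per-CPU struct.
 */
static uintptr_t demo_load_auxsp(uintptr_t gs_base)
{
	uintptr_t sp;

	assert(gs_base);
	memcpy(&sp, (const void *)(gs_base + DEMO_AUXSP_OFFSET), sizeof(sp));
	return sp;
}
```

Because the offset is a compile-time constant, the entry path needs no scratch register to locate the stack, which matters at a point where no register may be clobbered yet.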
Previously, we would save zeroes on the syscall entry frame instead of using the actual values for the registers that would normally be popped off the stack since they would not be used anyway by a syscall. For completeness's sake, do still save them, to offer those that may view the structure a clear view of what the syscall will return to. Furthermore, we no longer need to clear the stack before syscall exit (the previous `addq $(6 * 8), %rsp`) because the auxiliary stack is used instead, so do not do this operation anymore. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
We would usually use the trap stack to execute the issued system call. However, this may be troublesome if, for example, the syscall generates page faults, which would end up reusing the same stack. To fix this, switch to the per-thread auxiliary stack on system call entry and use that instead. The switch shall be done with the help of the `tpidr_el1` register, as it is guaranteed to contain a pointer to the `struct lcpu` that has an up-to-date pointer to the current thread's auxiliary stack. Signed-off-by: Sergiu Moga <sergiu@unikraft.io> Reviewed-by: Michalis Pappas <michalis@unikraft.io> Approved-by: Razvan Deaconescu <razvand@unikraft.io> GitHub-Closes: #1174
Prerequisite checklist
`checkpatch.uk` on your commit series before opening this PR
Base target
x86_64, arm64
kvm
Additional configuration
Description of changes
- Assign `struct lcpu *` to a system register (`gs_base`/`k_gs_base` on `x86_64`, `tpidr_el1` on `ARM64`).
- Use `sp_el0` on `ARM64` as the register to hold the middle address of a double-sized buffer meant to represent the two stacks to switch to in case of an IRQ/trap.
- Add `auxsp`, which represents the auxiliary stack that both architectures can switch to in case of a system call.
- On `x86_64`, the switch is achieved with the help of `swapgs` and `%gs:<AUXSP_OFFSET>`, while on `ARM64` it is achieved by using `tpidrro_el0` as a scratch register. The latter is assumed to be fine because it will always hold a value the application can't modify, and we will always be able to restore it to its desired known value anytime we want.

Depends on #1173
NOTE: This should not be tested on its own. Use #1175 instead as it contains everything.