Implement stack switching on system call entries and on ARM traps/IRQ's #1174

Merged
merged 26 commits into from Dec 23, 2023

@mogasergiu mogasergiu commented Nov 24, 2023

Prerequisite checklist

  • Read the contribution guidelines regarding submitting new changes to the project;
  • Tested your changes against relevant architectures and platforms;
  • Ran the checkpatch.uk on your commit series before opening this PR;
  • Updated relevant documentation.

Base target

  • Architecture(s): x86_64, arm64
  • Platform(s): kvm
  • Application(s): N/A

Additional configuration

Description of changes

  • assign the current LCPU's struct lcpu * to a system register (gs_base/k_gs_base on x86_64, tpidr_el1 on ARM64)
  • use sp_el0 on ARM64 as the register to hold the middle address of a double-sized buffer meant to represent the two stacks to switch to in case of an IRQ/trap:
          STACK_SIZE               STACK_SIZE
     <---------------------><--------------------->
     |============================================|
     |                     |                      |
     |       trap stack    |        IRQ stack     |
     |                     |                      |
     |============================================|
                           ^
                         SP_EL0
  • define a per-thread and per-LCPU field auxsp which represents the auxiliary stack that both architectures can switch to in case of a system call
  • the stack switch on x86_64 is achieved with the help of swapgs and %gs:<AUXSP_OFFSET>, while on ARM64 it is achieved by using tpidrro_el0 as a scratch register. Using tpidrro_el0 this way is assumed to be safe because it always holds a value the application cannot modify, and we can restore it to its known value whenever needed

Depends on #1173

NOTE: This should not be tested on its own. Use #1175 instead as it contains everything.

@mogasergiu mogasergiu requested review from a team as code owners November 24, 2023 17:44
@mogasergiu mogasergiu requested review from michpappas, mschlumpp and skuenzer and removed request for a team November 24, 2023 17:49
@mogasergiu mogasergiu force-pushed the smoga/stack_switch branch 2 times, most recently from 049a2f9 to 38f97d5 Compare November 25, 2023 11:07
mogasergiu added a commit to mogasergiu/unikraft that referenced this pull request Nov 25, 2023
mogasergiu added a commit to mogasergiu/unikraft that referenced this pull request Nov 26, 2023
@michpappas michpappas left a comment

Nice work! Besides the inline comments, would it be okay to open a separate PR for the Arm part, as I would like to give a bit more thought on the implementation of the IRQ / trap stack without blocking the x86_64 part? I am planning to provide a review asap, so that we get it merged by early next week. Thanks a lot 🙏🏼

Inline comments on: plat/common/include/x86/cpu_defs.h, plat/common/include/x86/gsbase.h, plat/common/x86/lcpu.c, include/uk/plat/memory.h, plat/common/x86/syscall.S, plat/common/lcpu.c, include/uk/plat/lcpu.h, plat/Config.uk
@razvand razvand added this to the v0.16.0 (Telesto) milestone Dec 6, 2023
mogasergiu added a commit to mogasergiu/unikraft that referenced this pull request Dec 8, 2023
@razvand razvand left a comment

Approved-by: Razvan Deaconescu razvand@unikraft.io

@razvand razvand changed the base branch from staging to staging-pr-1174 December 23, 2023 12:41
@razvand razvand merged commit c8c97ac into unikraft:staging-pr-1174 Dec 23, 2023
10 checks passed
razvand pushed a commit that referenced this pull request Dec 23, 2023
Add a macro-definition for the x86 `KERNEL_GS_BASE` MSR that is usually
used in conjunction with the `swapgs` instruction for an efficient swap
between the `gs_base` register placed in `KERNEL_GS_BASE` and `GS_BASE`
and also add a macro-definition for the `MSR_GS_BASE` MSR to access the
`gs_base` register in 64-bit mode.

Co-authored-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
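
As a rough illustration of the two MSRs mentioned above, the following user-space model shows what `swapgs` does to them. The MSR numbers are the architectural ones (`0xC0000101` for `GS_BASE`, `0xC0000102` for `KERNEL_GS_BASE`); the macro, struct and function names are hypothetical and are not taken from the commit.

```c
#include <stdint.h>

/* Architectural MSR numbers. The macro names are hypothetical and may
 * differ from the ones the commit adds. */
#define EXAMPLE_MSR_GS_BASE        0xC0000101u
#define EXAMPLE_MSR_KERNEL_GS_BASE 0xC0000102u

/* User-space model of the two MSRs of one logical CPU. */
struct example_gs_state {
    uint64_t gs_base;        /* base used for %gs-relative accesses */
    uint64_t kernel_gs_base; /* value parked until the next swapgs */
};

/* Model of `swapgs`: exchange the two values and do nothing else. */
static inline void example_swapgs(struct example_gs_state *s)
{
    uint64_t tmp = s->gs_base;

    s->gs_base = s->kernel_gs_base;
    s->kernel_gs_base = tmp;
}
```

Executing the swap twice restores the original state, which is why an entry path and its matching exit path can each issue one `swapgs`.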
razvand pushed a commit that referenced this pull request Dec 23, 2023
Implement basic methods to be able to read/write from/to the
GS_BASE and KERNEL_GS_BASE MSR's, as well as from an offset relative
to the former's value.

Co-authored-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
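
To make the "offset relative to `GS_BASE`" access more concrete, here is a minimal user-space sketch of what a `%gs:<offset>` load amounts to: an 8-byte read at `gs_base + offset`. The structure layout and the helper name are assumptions made for illustration only; the real helpers would use inline assembly rather than `memcpy`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-CPU structure; the real `struct lcpu` differs. */
struct demo_lcpu {
    uint32_t idx;
    uint64_t auxsp;
};

/* Model of `movq %gs:offset, %reg`: load 8 bytes at gs_base + offset. */
static inline uint64_t demo_gs_readq(uintptr_t gs_base, size_t offset)
{
    uint64_t val;

    memcpy(&val, (const unsigned char *)gs_base + offset, sizeof(val));
    return val;
}
```

With `GS_BASE` pointing at the current CPU's structure, `demo_gs_readq(base, offsetof(struct demo_lcpu, auxsp))` yields that CPU's auxiliary stack pointer without needing any other register.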
razvand pushed a commit that referenced this pull request Dec 23, 2023
Align the `lcpus` array with other per-cpu variables by defining
it through the `UKPLAT_PER_LCPU_DEFINE` macro.

Furthermore, add a `TODO:` for the `lcpus` used in `lcpu_start64`.
This direct usage leaks the implementation of `UKPLAT_PER_LCPU_DEFINE`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Every LCPU shall have their `GS_BASE` and `KERNEL_GS_BASE` registers
assigned to their own `struct lcpu` element of the `lcpus` array.

Co-authored-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Every LCPU shall have their `TPIDR_EL1` system register assigned
the value of the address of their own `struct lcpu` element in the
global `lcpus` array.

Co-authored-by: Michalis Pappas <michalis@unikraft.io>
Signed-off-by: Michalis Pappas <michalis@unikraft.io>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Implement an architecture specific fast access to the currently
executing CPU's index in the array of the statically declared CPU's.

On x86_64, this is done by dereferencing the `struct lcpu` pointer of the
currently executing CPU previously stored inside the `GS_BASE` MSR.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Implement an architecture specific fast access to the currently
executing CPU's index in the array of the statically declared CPU's.

On ARM64, this is done by dereferencing the `struct lcpu` pointer we know
is stored inside `tpidr_el1`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
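
The idea shared by the two commits above can be sketched in portable C: once a per-CPU register holds a pointer to the CPU's own structure, the index and ID lookups become a single dereference. In this model a plain global variable stands in for `GS_BASE` / `tpidr_el1`, and all names and fields are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAX_LCPUS 4

/* Hypothetical subset of the per-CPU structure. */
struct sketch_lcpu {
    uint32_t idx; /* position inside sketch_lcpus[] */
    uint64_t id;  /* hardware ID of the CPU */
};

static struct sketch_lcpu sketch_lcpus[SKETCH_MAX_LCPUS];

/* Stand-in for the per-CPU system register (GS_BASE or tpidr_el1). */
static struct sketch_lcpu *sketch_percpu_reg;

/* Done once per CPU during bring-up. */
static void sketch_lcpu_init(uint32_t idx, uint64_t hw_id)
{
    sketch_lcpus[idx].idx = idx;
    sketch_lcpus[idx].id = hw_id;
    sketch_percpu_reg = &sketch_lcpus[idx];
}

/* Fast path: one dereference, no search through the array. */
static uint32_t sketch_lcpu_idx(void)
{
    return sketch_percpu_reg->idx;
}

static uint64_t sketch_lcpu_id(void)
{
    return sketch_percpu_reg->id;
}
```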
razvand pushed a commit that referenced this pull request Dec 23, 2023
Now that we have a fast access to the currently executing CPU's
`struct lcpu` structure inside the array of `struct lcpu`'s,
make use of `lcpu_arch_idx()` in `ukplat_lcpu_idx`.

With this, we now no longer need to employ the logic implied by
`CONFIG_UKPLAT_LCPU_IDISIDX`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
The `lcpu_arch_id()` method can be slow depending on the architecture.
Instead, rely on the ID currently stored inside the currently executing
CPU's `struct lcpu`.

Furthermore, since on non-SMP builds we make `lcpu_arch_id()` always
return 0, make `ukplat_lcpu_id()` also return 0 by default anyway.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Therefore, move the guarding `#ifdef` below the register definitions
so that assembly files may be able to include this header and benefit
from the aforementioned macros.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Although the Vector table has `EL0` exceptions related entries,
we never execute from `EL0` and they will never happen.

Furthermore, `SP_EL0` is a perfect candidate for an emergency
scratch register in an `EL1` only OS like us.

Thus, remove all of the `EL0` logic, while maintaining the same
alignments and semantics.

Lastly, allow catching mis-sized layouts of `struct __regs` and
`struct __callee_saved_regs` at build time by applying `UK_CTASSERT` to
them, using `__REGS_SIZEOF`/`__REGS_PAD_SIZE` macros similar to those
defined by x86.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
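
A build-time size check of this kind can be sketched with C11 `_Static_assert`, used here as a stand-in for `UK_CTASSERT`. The register frame below is a made-up layout, not the real `struct __regs`, and the macro names are hypothetical.

```c
#include <stdint.h>

/* Made-up trap frame used only for illustration. */
struct example_regs {
    uint64_t x[31]; /* general purpose registers x0..x30 */
    uint64_t sp;
    uint64_t elr;
    uint64_t spsr;
    uint64_t esr;
    uint64_t pad;   /* keeps the frame a multiple of 16 bytes */
};

#define EXAMPLE_REGS_PAD_SIZE 8
#define EXAMPLE_REGS_SIZEOF   (35 * 8 + EXAMPLE_REGS_PAD_SIZE)

/* The build fails if the C layout and the constants used by the
 * assembly code ever drift apart. */
_Static_assert(sizeof(struct example_regs) == EXAMPLE_REGS_SIZEOF,
               "struct example_regs does not match EXAMPLE_REGS_SIZEOF");
_Static_assert(EXAMPLE_REGS_SIZEOF % 16 == 0,
               "frame size must keep the stack 16-byte aligned");
```

The benefit is that assembly code can use the size macro directly while the compiler guarantees it matches the structure seen by C code.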
razvand pushed a commit that referenced this pull request Dec 23, 2023
Define a per-cpu buffer whose size is twice that of the configured
stacks and whose definition may be represented through the following
diagram:
```
      STACK_SIZE               STACK_SIZE
 <---------------------><--------------------->
 |============================================|
 |                     |                      |
 |       trap stack    |        IRQ stack     |
 |                     |                      |
 |============================================|
                       ^
                     SP_EL0
```

The middle address of this buffer shall be assigned to `SP_EL0`,
a great candidate for a free register in an `EL1`-only unikernel.

Now, depending on whether an exception is an IRQ or a trap, the early
assembly entry will switch to either the IRQ stack or the trap stack,
by simply making use of the `SP_EL0` system register.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
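
The layout in the diagram above can be expressed as a short C sketch. It assumes that stacks grow downward, so the trap stack (lower half) would start at the middle address and the IRQ stack (upper half) at the end of the buffer; the commit message does not spell this out, so treat it as one plausible reading. The stack size and all names are hypothetical.

```c
#include <stdint.h>

#define EXAMPLE_STACK_SIZE 0x4000u /* hypothetical configured stack size */

/* Per-CPU buffer of twice the stack size (a single CPU is modelled). */
static _Alignas(16) unsigned char example_excp_stacks[2 * EXAMPLE_STACK_SIZE];

/* Value that would be written to SP_EL0: the middle of the buffer. */
static inline uintptr_t example_sp_el0(void)
{
    return (uintptr_t)example_excp_stacks + EXAMPLE_STACK_SIZE;
}

/* Assumption: with downward-growing stacks, a trap starts at the
 * middle address and an IRQ starts one stack size above it. */
static inline uintptr_t example_initial_sp(uintptr_t sp_el0, int is_irq)
{
    return is_irq ? sp_el0 + EXAMPLE_STACK_SIZE : sp_el0;
}
```

Keeping only the middle address in a register means either stack can be reached with at most one addition in the early entry code.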
razvand pushed a commit that referenced this pull request Dec 23, 2023
Define two macros to represent the maximum known possible alignment
we can have for the ECTX we are aware of and its size respectively.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Define two macros to represent the maximum known possible alignment
we can have for the ECTX we are aware of and its size respectively.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Simple macros that do not involve C code may as well be visible to code
written in assembly. This may come in handy when not wanting to use
magic values.

Therefore, make it so that architecture specific macros from context
related headers are also visible to assembly written code by having
them defined outside the `!__ASSEMBLY` guards.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Add a method that transparently allocates and returns an auxiliary
stack according to CONFIG_UKPLAT_AUXSP_PAGE_ORDER.

Note that if `libukvmem` is enabled, this stack shall be mapped
through VMA stack to ensure that it is backed by physical memory early
on and, furthermore, ends in a guard page to let us know of potential
stack overflows.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
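
A minimal user-space sketch of allocating a stack from a page order might look as follows. It uses C11 `aligned_alloc` instead of the Unikraft allocator, omits the guard page described above, and assumes that the returned value is the initial stack pointer (one past the highest byte). The function names and the page size are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_PAGE_SIZE 4096u /* hypothetical page size */

/* Allocate (EXAMPLE_PAGE_SIZE << page_order) bytes and return the
 * address just past the end of the region, or 0 on failure. */
static uintptr_t example_auxsp_alloc(unsigned int page_order)
{
    size_t len = (size_t)EXAMPLE_PAGE_SIZE << page_order;
    void *base = aligned_alloc(EXAMPLE_PAGE_SIZE, len);

    if (!base)
        return 0;

    return (uintptr_t)base + len;
}

/* Release a stack obtained from example_auxsp_alloc(). */
static void example_auxsp_free(uintptr_t auxsp, unsigned int page_order)
{
    size_t len = (size_t)EXAMPLE_PAGE_SIZE << page_order;

    free((void *)(auxsp - len));
}
```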
razvand pushed a commit that referenced this pull request Dec 23, 2023
`struct uk_thread` contains a field called `auxsp` which is meant to
represent an auxiliary stack that may be used when switching stacks
during syscall entries or when simply wanting to have a scratch space
to use during a very fragile state of the system, such as when handling
an exception.

Give access to this field through `struct lcpu` as well. This field
shall represent the auxiliary stack pointer of the thread currently
executing on this LCPU.

If uksched/ukthread is disabled, this can come in handy as simply a
pointer to a scratch space that we can contaminate with whatever we
may want during a more fragile execution context.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Add a new field to `struct uk_thread` that can represent a secondary
stack that can be used as a backup stack.

This can become very useful in cases such as when wanting to defer
exception handling without creating another thread. For example,
we want to return from an exception into a function inside the
same thread to be able to do deferred I/O outside exception
context without contaminating the original stack that was present
before the trap. We can avoid polluting the original stack by using
this auxiliary stack instead.

Update each thread creation/deletion method's signature and
implementation accordingly.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
If scheduling/multithreading is not enabled, allocate an auxiliary
stack for the bootstrap LCPU. Otherwise, the LCPU will have the same
auxiliary stack as the thread that runs on it and this allocation is
not needed.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>

Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Implement basic getter/setter methods for the `auxsp` field that
exists in `struct lcpu`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
Make sure to fill in the value of current lcpu's (bootstrap lcpu in
our case) auxiliary stack pointer with that of the current thread's
(main thread) when initializing scheduling.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
If uksched/ukthread is enabled, make sure that the current LCPU's
auxiliary stack pointer is updated to always point to the currently
executing thread's auxiliary stack pointer by setting it to that
of the thread it is about to switch to during context switching.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
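
The bookkeeping described in the last few commits (getter/setter for the LCPU's `auxsp` and the update on context switch) can be modelled as below. This is a simplified sketch with hypothetical types and names; a real context switch would also save and restore registers and stacks.

```c
#include <stdint.h>

/* Hypothetical, reduced versions of the thread and per-CPU structures. */
struct example_thread {
    uintptr_t sp;    /* regular stack pointer */
    uintptr_t auxsp; /* auxiliary stack pointer */
};

struct example_cpu {
    uintptr_t auxsp;                /* auxsp of the running thread */
    struct example_thread *current; /* thread running on this CPU */
};

static inline uintptr_t example_cpu_get_auxsp(const struct example_cpu *cpu)
{
    return cpu->auxsp;
}

static inline void example_cpu_set_auxsp(struct example_cpu *cpu,
                                         uintptr_t auxsp)
{
    cpu->auxsp = auxsp;
}

/* Publish the next thread's auxsp before it starts running, so that an
 * entry path reading the per-CPU field always sees the right stack. */
static void example_thread_switch(struct example_cpu *cpu,
                                  struct example_thread *next)
{
    example_cpu_set_auxsp(cpu, next->auxsp);
    cpu->current = next;
    /* A real scheduler would switch register state here. */
}
```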
razvand pushed a commit that referenced this pull request Dec 23, 2023
Deprecate locally defined `ENTRY` macro in favor of the more generally
available, equivalent macro defined in `uk/asm.h`.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
We would usually use the application's stack to execute the issued
system call. However, this presents a couple of problems. If the
application's stacks are too small (as is the case for Go's
goroutines), this will end up generating an unhandled page fault or,
even worse, corrupting other memory areas beyond the respective
stack. Another problem is that we will end up overwriting data in
the Red Zone for those applications that employ it.

To fix this, switch to the per-thread auxiliary stack on system call
entry and use that instead. The switch shall be done with the help of
the `swapgs` instruction to help us swap the `gs_base` registers, as
our own `KERNEL_GS_BASE` contains a pointer to the current LCPU's
`struct lcpu` that has an up to date pointer to the current thread's
auxiliary stack.

Co-authored-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Marco Schlumpp <marco@unikraft.io>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
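
The Red Zone problem mentioned above can be demonstrated with a small user-space model. Under the System V x86-64 ABI, a leaf function may keep data in the 128 bytes below `%rsp` without moving `%rsp`; anything pushed onto the same stack by a syscall entry path then lands on top of that data. The byte arrays below stand in for the two stacks, and the frame size and function names are hypothetical.

```c
#include <string.h>

#define EXAMPLE_FRAME 64 /* hypothetical syscall frame size in bytes */

/* "Push" a frame below sp by filling it with a marker byte. */
static void example_push_frame(unsigned char *sp)
{
    sp -= EXAMPLE_FRAME;
    memset(sp, 0xAA, EXAMPLE_FRAME);
}

/* Returns 1 if a value the application kept in its red zone survives
 * the syscall entry, 0 if the entry frame overwrote it. */
static int example_red_zone_survives(int use_aux_stack)
{
    unsigned char app_stack[512] = { 0 };
    unsigned char aux_stack[512] = { 0 };
    unsigned char *app_sp = app_stack + 256;
    unsigned char *aux_sp = aux_stack + sizeof(aux_stack);

    /* Leaf function stores a local 8 bytes below the stack pointer,
     * well inside the 128-byte red zone, without adjusting it. */
    app_sp[-8] = 0x42;

    /* Syscall entry builds its frame on the chosen stack. */
    example_push_frame(use_aux_stack ? aux_sp : app_sp);

    return app_sp[-8] == 0x42;
}
```

In this model the value is clobbered when the frame is built on the application stack and preserved when the auxiliary stack is used, which is the behavior the commit aims for.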
razvand pushed a commit that referenced this pull request Dec 23, 2023
Previously, we would save zeroes on the syscall entry frame instead
of using the actual values for the registers that would normally be
popped off the stack since they would not be used anyway by a syscall.

For completeness's sake, do still save them, to offer those that may
view the structure a clear view of what the syscall will return to.

Furthermore, we no longer need to clear the stack before syscall exit
(the previous `addq $(6 * 8), %rsp`) because the auxiliary stack is
used instead, so do not do this operation anymore.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
razvand pushed a commit that referenced this pull request Dec 23, 2023
We would usually use the trap stack to execute the issued
system call. However, this may be troublesome if, for example, the
syscall generates page faults, which would end up reusing the same
stack.

To fix this, switch to the per-thread auxiliary stack on system call
entry and use that instead. The switch shall be done with the help of
the `tpidr_el1` register, as it is guaranteed to contain a pointer to
the `struct lcpu` that has an up to date pointer to the current thread's
auxiliary stack.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Reviewed-by: Michalis Pappas <michalis@unikraft.io>
Approved-by: Razvan Deaconescu <razvand@unikraft.io>
GitHub-Closes: #1174
Labels
arch/arm, arch/x86_64, lang/c, plat/kvm, topic/syscall
Projects
Status: Done!
4 participants