Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove -m-no-red-zone on AMD64 #292

Open
mato opened this issue Nov 22, 2018 · 4 comments
Open

Remove -m-no-red-zone on AMD64 #292

mato opened this issue Nov 22, 2018 · 4 comments

Comments

@mato
Copy link
Member

mato commented Nov 22, 2018

We have been carrying -m-no-red-zone for AMD64 for all code (bindings themselves and "user" code) since the dawn of time. Since then, the CPU interrupt handling and trap handling code has been refactored to use ISTn, i.e. separate stacks for interrupt and trap handlers.

We should consider removing -m-no-red-zone. There are two options:

  1. Remove it from "user" code only, i.e. for the MirageOS case, do not pass it down to OPAM via pkg-config.
  2. Remove it from the bindings as well.

The first option seems entirely safe -- "user" code will never run in an interrupt context, and any trap/interrupt handlers in bindings currently run on a separate stack via ISTn.

The second option should be possible with a minor tweak to our usage of ISTn to accomodate virtio (hvt does not use hardware interrupts). Currently, we have (from bindings/cpu_x86_64.c):

IST1 (cpu_intr_stack): All hardware interrupts except #NMI (virtio only).
IST2 (cpu_trap_stack): All traps except #DF.
IST3 (cpu_nmi_stack): #NMI, #DF.

Given that on virtio it is theoretically possible for IRQ0 (timer) to nest with IRQx (virtio_net), I think we want something like this:

IST1 (cpu_intr1_stack): IRQ0 (timer, virtio only)
IST2 (cpu_intr2_stack): All other hardware interrupts (i.e. only virtio-net).
IST3 ... IST5 (unused, reserved for future expansion)
IST6 (cpu_trap_stack): All traps except #DF.
IST7 (cpu_nmi_stack): #NMI, #DF.

References: System V Application Binary Interface x86-64 Architecture Processor Supplement, section 3.2.2, also this stackoverflow answer.

Any opinions? @djwillia @ricarkol?

@mato mato self-assigned this Nov 22, 2018
@djwillia
Copy link
Member

Interesting, the red zone business seems very subtle, which is why I guess most kernels just avoid it. But you are right, we certainly don't need it in the hvt case. I would lean towards removing it entirely (option 2 above). It seems the only issue is the virtio case.

A couple of questions: in virtio, does linking code that has been built with and without red zone (option 1) cause any issues, in other words would you need the IST partitioning scheme even if you only changed flags for "user" code? And second, would it be easy to see if there are issues with the IST partitioning scheme? It seems pretty racy and I'm not sure how often you get an interrupt clobber when not being careful about the red zone.

@mato
Copy link
Member Author

mato commented Nov 29, 2018

@djwillia I think I explained the situation on our call today, let me know if you need any recap here?

@mato
Copy link
Member Author

mato commented Dec 11, 2018

See also https://www.kernel.org/doc/Documentation/x86/kernel-stacks, which gives a good explanation of the IST mechanism on x86_64.

Notably, what I did not realise but makes sense in hindsight, IST-using interrupt or trap handlers cannot nest safely. This needs some more thought about the implications for virtio; it's possible we have a race condition between virtio-net and timer interrupts even in the current setup, although it must be extremely rare since no one has reported it. In order to be 100% safe, it would be best for virtio to always use separate ISTs for the timer and virtio-net.

mato added a commit to mato/solo5 that referenced this issue Mar 27, 2019
This is a large change, and the majority of it is effectively a full
re-write of the build system. The actual removal of hvt compile-time
specialization is fairly straightforward.

User-visible changes:

- 'configure.sh' now needs to be run manually before running 'make',
  this is consistent with POLA.
- conversely, 'make clean' no longer cleans Makeconf. Use 'distclean' or
  'clobber' for that.
- 'configure.sh' will now print the targets that can (will) be built on
  this system. The strategy is still "build everything we can", however
  I have disabled Genode on all systems except Linux due to toolchain
  issues.
- You can now build a subset of targets from the top-level 'make', by
  specifying 'CONFIG_XXX=' (disable) or 'CONFIG_XXX=1' (enable) either
  on the command line, or editing the generated Makeconf.
- Makefiles use silent rules by default. To get the old verbose ones
  back, use 'make V=1'.
- The 'solo5-hvt' tender is no longer "specialized" to the unikernel.
  We build two tenders, 'solo5-hvt' with all non-debug modules
  configured and 'solo5-hvt-debug' with additional debug modules (gdb,
  dumpcore where available).
- 'solo5-hvt-configure' is kept around for now for backward
  compatibility with OPAM/MirageOS but is essentially a NOP.

Developer-visible changes:

- The build system now has proper support for auto-generation of
  dependencies. This means you can safely edit source files, run make
  and be sure you will get a complete incremental build.
- Makefiles have been refactored to use common best practices, remove
  repetition, consistent variable names and clear interfaces between
  configure.sh/Makeconf/Makefiles, all the while keeping them simple
  enough to understand for me on a Monday morning before coffee. I.e.
  limit use of macros, eval, etc.
- hvt tender modules are no longer defined by compile-time flags,
  instead a dynamic array is placed into a special ELF section
  (.modules).  This means that a hvt tender binary can be combined from
  an arbitrary set of hvt_module_XXX object files, which is the right
  way to do things going forward and also simplifies the build system
  (not needing to build multiple targets from the same set of sources).

Shortcomings / TODOs:

- Dependency files (*.d) are stored in-tree. I spent several days on
  trying to figure out how to get them to work out of tree, but in
  combination with the non-recursive use of subdirectories in 'bindings'
  I could not figure out the required Makefile magic.
- HVT_DROP_PRIVILEGES=0 is non-functional with the new modules
  arrangement, but needs a re-design anyway.

Other changes included as part of this PR:

- Revert privilege dropping on FreeBSD (see discussion in Solo5#282).
- The build system changes effectively implement option 1 in Solo5#292, i.e.
  on x86_64 -m no-red-zone is only used for bindings, not for
  application code.
- tests/tests.bats has been refactored for DRY as it was getting totally
  unmaintainable.
@mato
Copy link
Member Author

mato commented May 20, 2019

#345 rationalized the various build flags, and in the process implements option 1 here, so -mno-red-zone is no longer passed down to OPAM consumers. I'm keeping this open for now, but don't plan to do any further red-zone/IST related changes in the short term.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants