Add Muen port #190

Merged
merged 15 commits into from Jun 1, 2017

Kensan
Contributor

@Kensan Kensan commented Apr 11, 2017

These changes port the Solo5 kernel to run on top of the Muen Separation Kernel.

All necessary hardware interfaces are implemented as PV drivers using facilities provided by the Muen platform with the exception of the block interface.

To test this with MirageOS the following pins are needed:

opam pin add solo5-kernel-muen git://github.com/codelabs-ch/solo5#muen
opam pin add ocaml-freestanding git://github.com/codelabs-ch/ocaml-freestanding#muen
opam pin add mirage git://github.com/codelabs-ch/mirage#muen

The Mirage changes add "muen" as a new platform. Thus unikernels can be configured as follows:

mirage configure -t muen

After building the unikernel it must be copied over to the Muen policy object directory for inclusion in the final system image:

objcopy -Obinary tests/test_hello/test_hello.muen <muen_workdir>/policy/obj/unikernel

On the Muen side, the devel-mirage branch has to be used.

Simple unikernels that do not require network access, e.g. test_hello, can be tested using Bochs and the Mirage/Solo5 Muen system policy:

make SYSTEM=xml/mirage-solo5.xml emulate COMPONENTS="libmutime libdebuglog sm time dbgserver"

More involved scenarios that require network access can be run on real hardware, in this example a Lenovo T430s:

make SYSTEM=xml/mirage-solo5-net.xml iso

The Solo5 kernel port was tested by running all Solo5 test programs on Bochs and on real hardware. Additionally, the static_website_tls unikernel from mirage-skeleton was built using Mirage with the above-mentioned pins, deployed to hardware, and used to locally serve the Muen project website (see here).

@mato
Member

mato commented Apr 12, 2017

@Kensan A first pass of reading through your changes looks good to me. I'm now going to deliberate on how this affects the multi-arch refactoring on the guest side and will get back to you. I expect there'll need to be some tweaks, but not much.

@mato
Member

mato commented Apr 24, 2017

@Kensan The kernel-side multi-arch refactoring is now up for comment in #192. As expected this does not affect the Muen port much, so you should be able to easily rebase atop #192 once it is merged. I'll take another pass at reviewing your changes as a whole shortly.

@Kensan
Contributor Author

Kensan commented Apr 24, 2017

@mato Great, thanks for the heads up. I rebased the patches and there were only minor fixups necessary. I pushed the result to a separate branch for now.

@mato
Member

mato commented Apr 25, 2017

@Kensan Why is the LDMXCSR in fpu_init() required? Given that CR4.OSXMMEXCPT should be set by the monitor (or guest in the virtio case) there should be no need to mask those exceptions, or am I missing something?

@Kensan
Contributor Author

Kensan commented Apr 25, 2017

@mato Setting CR4.OSXMMEXCPT enables the SIMD floating-point exceptions. Thus, unmasked SSE exceptions will be delivered, which means #XM must be handled by the OS, see Intel SDM Vol. 3A, section 2.5. When the SIMD exceptions are masked in MXCSR, the hardware takes a default action and continues the calculation. Without masking, Solo5 would need to handle them explicitly by taking the appropriate action on #XM.

Note that apparently OS X Hypervisor.Framework also needs explicit MXCSR initialization, see this comment. The initial value of MXCSR can differ depending on the hypervisor due to deviations in how the FPU state storage area is initialized. As it is not automatically managed by VMX (it is not part of the VMCS), each hypervisor must do it "manually".
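
To make the masking argument concrete, here is a minimal C sketch that decodes the exception mask bits of an MXCSR value. The bit positions follow the MXCSR layout documented in the Intel SDM; the helper name `mxcsr_all_masked` and the macro names are illustrative assumptions and are not taken from the Solo5 sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* MXCSR exception mask bits occupy bits 7..12 (IM, DM, ZM, OM, UM, PM). */
#define MXCSR_MASK_BITS UINT32_C(0x1f80)
/* Architectural default: all six SIMD exceptions masked, status flags clear. */
#define MXCSR_DEFAULT   UINT32_C(0x1f80)

/* Hypothetical helper: returns true when every SIMD floating-point
 * exception is masked, i.e. the hardware applies its default response
 * instead of raising #XM. */
static bool mxcsr_all_masked(uint32_t mxcsr)
{
    return (mxcsr & MXCSR_MASK_BITS) == MXCSR_MASK_BITS;
}
```

For a zero-initialised MXCSR, `mxcsr_all_masked(0)` is false, so any SSE exception condition would be delivered as #XM once CR4.OSXMMEXCPT is set. For `MXCSR_DEFAULT` it is true. On real hardware the chosen value would be loaded with the LDMXCSR instruction, which is left out here so the bit logic can be checked on its own.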

@mato
Member

mato commented Apr 25, 2017

@Kensan Having now read through the Intel SDM sections relating to FPU/SSE and other processor extensions, I think the intent is that it is the guest side's responsibility to initialise the FPU and SSE control registers (CR0.MP, CR0.EM, CR0.NE, CR4.OSFXSR, CR4.OSXMMEXCPT and MXCSR) to whichever state the guest code expects/supports.

This means that (I think) we're doing it wrong and should move all FPU/SSE-related initialisation code into kernel/cpu_x86_64.c. This includes:

  • the existing code (partly) enabling the FPU/SSE through the initial VCPU state defined in ukvm/ukvm_cpu_x86_64.h (for the ukvm target)
  • kernel/virtio/boot.S (for the virtio target)
  • adding code to explicitly set MXCSR to 0x1f80 for all targets (ie. what you did only for muen)

The intent here is to set the relevant state in one place, and one place only. This should not be the hypervisor/monitor since that does not know what the guest supports (or not).
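
As a rough illustration of what "one place only" could look like, the following C sketch computes the CR0 and CR4 values a guest-side fpu_init() might write. The bit positions are the architectural ones from the Intel SDM. The helper names `fpu_cr0` and `fpu_cr4`, and the particular choice of bits to set, are assumptions made for illustration; the thread does not spell out the final values.

```c
#include <stdint.h>

/* Control register bits relevant to x87/SSE (Intel SDM Vol. 3A, section 2.5). */
#define CR0_MP         (UINT64_C(1) << 1)  /* monitor coprocessor */
#define CR0_EM         (UINT64_C(1) << 2)  /* x87 emulation; must be clear for SSE */
#define CR0_NE         (UINT64_C(1) << 5)  /* native x87 error reporting */
#define CR4_OSFXSR     (UINT64_C(1) << 9)  /* OS supports FXSAVE/FXRSTOR */
#define CR4_OSXMMEXCPT (UINT64_C(1) << 10) /* OS handles unmasked SIMD FP exceptions */

/* Hypothetical helpers: derive the values to write back from the current
 * register contents. Reading and writing CR0/CR4 requires ring 0 and is
 * omitted so that the bit manipulation can be checked in isolation. */
static uint64_t fpu_cr0(uint64_t cr0)
{
    return (cr0 | CR0_MP | CR0_NE) & ~CR0_EM;
}

static uint64_t fpu_cr4(uint64_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}
```

Keeping the computation separate from the privileged register access is only a convenience for this sketch; in a kernel both would normally live together in the architecture-specific CPU setup code.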

What do you think? Also @djwillia?

@Kensan
Contributor Author

Kensan commented Apr 25, 2017

@mato I agree that it is more robust to initialize the FPU uniformly across all supported hypervisors, so your proposed solution sounds good to me.

Regarding the placement of FPU initialization: I do not know ARM FPU features well enough but would it make sense to declare fpu_init() on the same layer/level as cpu_init() in kernel.h?

@mato
Member

mato commented Apr 25, 2017

@Kensan I think CPU and FPU (and traps/exceptions) are fairly tightly coupled, so if we have a fpu_init() it should be called from (and defined in) cpu_<arch>.c, not at the top-level scope.

@Kensan
Contributor Author

Kensan commented Apr 25, 2017

@mato Ok, sounds good. Thanks for the explanation.

@djwillia
Member

I still don't really understand why it can be the guest's full responsibility to set up the MXCSR; doesn't the hypervisor need to be able to set this state to switch between guests that set it differently?

I had moved it out to see if it was possible to eliminate all CPU-specific bits from the guest and push them to ukvm in the Hypervisor.framework branch, but since it seems necessary to have some of the FPU init in the guest, I agree it's better to put it all together.

@mato
Member

mato commented Apr 25, 2017 via email

@Kensan
Contributor Author

Kensan commented Apr 25, 2017

FPU state is not automatically managed by VMX. Thus each hypervisor must handle it explicitly, e.g. using XSAVE/XRSTOR with a storage area per guest. Since MXCSR is saved to and restored from this memory region, its initial value depends on how the XSAVE storage area is set up initially.
If, for example, the FPU area is simply zeroed (as is done on Muen and apparently Hypervisor.Framework), then all bits in MXCSR, including the exception mask bits, will be clear and SSE exceptions will be delivered to the guest. On the other hand, if the hypervisor explicitly sets up specific fields like MXCSR in the storage area, then the guest FPU execution environment will look different.
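
The dependency on the storage area can be sketched in C as follows. MXCSR occupies bytes 24 to 27 of the legacy FXSAVE region, which also forms the first 512 bytes of an XSAVE area. The function and macro names are hypothetical, and how Muen or any other hypervisor actually manages its per-guest area is not shown in this thread.

```c
#include <stdint.h>
#include <string.h>

#define FXSAVE_AREA_SIZE 512 /* legacy region of the FXSAVE/XSAVE area */
#define FXSAVE_MXCSR_OFF 24  /* byte offset of the 32-bit MXCSR field */

/* Read the MXCSR value a guest would see once the area is restored. */
static uint32_t fxsave_get_mxcsr(const uint8_t *area)
{
    uint32_t mxcsr;
    memcpy(&mxcsr, area + FXSAVE_MXCSR_OFF, sizeof mxcsr);
    return mxcsr;
}

/* Store an MXCSR value into the area, as a hypervisor could do up front. */
static void fxsave_set_mxcsr(uint8_t *area, uint32_t mxcsr)
{
    memcpy(area + FXSAVE_MXCSR_OFF, &mxcsr, sizeof mxcsr);
}
```

A zeroed area yields an MXCSR of 0 (all exceptions unmasked), whereas an area prepared with `fxsave_set_mxcsr(area, 0x1f80)` gives the guest the masked default. Setting MXCSR in the guest removes the dependency on either behaviour.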

@djwillia
Member

thanks @Kensan, that clears things up. So, it looks like, if we wanted the monitor to "do it for the guest" in the ukvm case, we could do the entire setup from the monitor via KVM_SET_FPU for Linux/KVM and hv_vcpu_write_fpstate for Hypervisor.framework for a consistent initial state; then the guest would not even have to do anything to MXCSR.

But it's probably a good idea to have the virtio backend set up the floating point (including MXCSR) to a known state. I think it's fine for the ukvm backend to do the same, since we have other reasons for not being able to remove all HW-ness from the guest (like not being able to trap processor exceptions through KVM).

This simplifies re-use of UKVM kernel code for other upcoming platforms.

This simplifies re-use of UKVM kernel code for other upcoming platforms
which may provide different network and block implementations.
@Kensan
Contributor Author

Kensan commented May 20, 2017

I tested the rebased changes on Muen and everything works as expected. From my side, the updated changes are ready for review.

@Kensan
Contributor Author

Kensan commented May 21, 2017

Fixed an incorrect indentation (tab vs. spaces).

Member

@mato mato left a comment

Generally looks good to me. I like the refactoring of console/net/block/poll out into separate modules.

Some comments:

  1. I presume you are the copyright holder of all the Muen-specific code added in this PR? (Aside: We should probably implement some kind of Signed-Off-By: policy, annoying as it is).
  2. How can I test this? Could we add some kind of support to tests/run-tests.sh that would be able to run the tests on Muen itself? What would be required to do this?

@@ -29,13 +29,16 @@ void _start(void *arg)
platform_init(arg);
cmdline = cmdline_parse(platform_cmdline());

console_init();
Member

I wonder if this call should be moved to before cpu_init()? That would in theory allow for console output if there's an early trap, assertion or other error in cpu_init() or platform_init().

I haven't read through the muen channels and console init code in detail, would it work to call the initialisation this early? Ukvm should be fine as it's just a simple hypercall which will always work.

Contributor Author

Yes, this would work, since the Muen console driver has no initialization dependencies on other parts of the system. I will move the call above cpu_init().
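
The proposed ordering can be sketched with stub functions that record when they run. The function names console_init(), cpu_init() and platform_init() come from the review; the recording scaffold, the `early_init` wrapper and the relative order of cpu_init() and platform_init() are assumptions made for this illustration, and the real functions take different arguments.

```c
#include <stddef.h>

enum init_step { STEP_CONSOLE, STEP_CPU, STEP_PLATFORM };

/* Log of the steps in the order they were executed. */
static enum init_step init_log[3];
static size_t init_count;

static void record(enum init_step step)
{
    if (init_count < sizeof init_log / sizeof init_log[0])
        init_log[init_count++] = step;
}

/* Stubs standing in for the real initialisation routines. */
static void console_init(void)  { record(STEP_CONSOLE); }
static void cpu_init(void)      { record(STEP_CPU); }
static void platform_init(void) { record(STEP_PLATFORM); }

/* Proposed order: bring the console up first so that a trap or failed
 * assertion in the later steps can still produce output. */
static void early_init(void)
{
    console_init();
    cpu_init();
    platform_init();
}
```

This only works if the console driver has no dependencies on the later steps, which is what the reply above confirms for Muen.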

@Kensan
Contributor Author

Kensan commented May 23, 2017

@mato Glad you like the changes, especially that you consider the refactoring a benefit beyond the Muen port.

  1. I presume you are the copyright holder of all the Muen-specific code added in this PR?
    (Aside: We should probably implement some kind of Signed-Off-By: policy, annoying as it is).

Yes, that is the case. I can add a Signed-off-by line to all commits if you prefer.

How can I test this? Could we add some kind of support to tests/run-tests.sh that would be able to run the tests on Muen itself?

Apart from test_blk and test_ping_server, all tests can be executed on Muen running on Bochs. We have been able to boot Muen under QEMU using nested virtualization, but unfortunately it is very unstable and not usable at the moment. Running it on real hardware is a bit challenging because it is highly platform-specific: one needs to provide a precise hardware description to the Muen build process.

What would be required to do this?

To run the tests under Bochs one basically needs to install the dependencies listed on the project website plus a fairly recent Bochs version (>=2.6.5). The SPARK tools are only required if one wants to perform the formal proofs.

Is there a specific setup you have in mind that you would like to implement so the Muen-specific code is covered by tests?

This enables initialization of console and network drivers, if required
by a given platform.

This simplifies re-use of UKVM kernel code for other upcoming platforms
which provide a custom poll implementation.

This simplifies re-use of UKVM kernel code for other upcoming platforms
which implement some additional initialization steps.

The subject info driver provides an API to query information about the
execution environment when running as a subject on top of the Muen SK.

The clock uses the Muen subject info and time memory regions to provide
monotonic and wallclock time.

Muen shared memory channels are an implementation of the SHMStream
Version 2 IPC protocol (shmstream).

The console writes output using a shared memory stream channel.

The network backend sends/receives data by using a shared memory stream
channel for each direction. Configuration of the hardware MAC address is
currently not supported and a pseudo-random address is generated on init.

For now there is no support for block devices on the Muen platform.

Initialize FPU MXCSR to the default value. Full SSE support is required
by the OCaml runtime.

The variable specifies common object files for UKVM platforms.
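
The network backend description above mentions that a pseudo-random MAC address is generated on init but does not say how. A common convention, shown here purely as an assumption and not as the Muen driver's actual method, is to take six random bytes and fix up the two low bits of the first octet so the result is a locally administered unicast address. The function name `mac_from_random` is hypothetical.

```c
#include <stdint.h>

#define MAC_LEN 6

/* Turn six arbitrary bytes into a usable station address: clear the
 * multicast (I/G) bit and set the locally administered (U/L) bit of the
 * first octet, leaving the remaining 46 bits as supplied. */
static void mac_from_random(uint8_t mac[MAC_LEN], const uint8_t rnd[MAC_LEN])
{
    for (int i = 0; i < MAC_LEN; i++)
        mac[i] = rnd[i];
    mac[0] &= (uint8_t)~0x01u; /* unicast */
    mac[0] |= 0x02u;           /* locally administered */
}
```

The source of the random bytes is deliberately left to the caller, since it depends on what the platform offers.
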
@mato
Member

mato commented May 23, 2017 via email

@mato
Member

mato commented Jun 1, 2017

With a bunch of help from @Kensan, I've finally managed to test these changes and things work as advertised following @Kensan's instructions in the PR.

Next steps for integrating Muen support in Mirage:

  1. PR for the changes to mirage/ocaml-freestanding (@Kensan)
  2. An initial version of solo5-kernel-muen published to OPAM (@mato)
  3. A new version of ocaml-freestanding with the changes in (1) published to OPAM, adding | solo5-kernel-muen as a dependency (@mato)
  4. PR for the changes to mirage/mirage (@Kensan)
  5. (at some point after that) "Formal" releases of Solo5, Mirage with documentation for the Muen port.

Merging this now so that we can proceed with the above and also to get the refactoring in this PR into master.

/cc @avsm @djwillia

@ansiwen
Contributor

ansiwen commented Jul 28, 2017

What an awesome PR! Thanks a lot from the NetHSM project!

Kensan added a commit to codelabs-ch/muen that referenced this pull request Jan 25, 2018
Summary:
These changes enable the execution of MirageOS [1] unikernels built with
Solo5 [2] as subjects on top of Muen. MirageOS is a library operating
system written in OCaml.

Upstream Solo5 [3] support for Muen includes an implementation of
debuglog and muennet. Using the network support it is possible to deploy
advanced unikernels [4] which serve static websites over TLS.

The new Mugenukvm tool generates Solo5/UKVM boot info structures which
are processed by Solo5 during bootup. The boot info is similar to Linux
ZP but trivial in comparison.

Multiple Mirage/Solo5 system policies are provided for running
simple unikernels that require no hardware in Bochs and more involved
examples on real hardware.

[1] - https://mirage.io
[2] - https://github.com/Solo5/solo5
[3] - Solo5/solo5#190
[4] - https://github.com/mirage/mirage-skeleton

Test Plan:
Built and ran all test cases from the Solo5 repos and deployed
static_website_tls [4] on the Lenovo T430s to successfully serve
websites using routed and bridged networking.

Reviewers: reet

Reviewed By: reet

Differential Revision: https://dev.codelabs.ch/D664