Mirage 4.0, dune and cross-compilation #1195

Closed
TheLortex opened this issue Oct 28, 2020 · 11 comments

@TheLortex
Member

Hi, this issue introduces the final changes we need to have dune build unikernels in mirage.
It is a continuation of #969. These changes have been partially implemented for testing purposes; this issue exists to make sure everyone is aware of the update plan.
Feel free to comment on this issue if you need clarification on a specific point!

The changes

The unikernel build is now done in two steps:

  • install a cross-compiler for the desired target via opam;
  • build the unikernel using duniverse and dune.

Step 1: cross-compilers

As most mirage targets build for the same architecture as the host, we don't strictly need cross-compilation to build unikernels. However, cross-compilation gives a simpler mental model: code that needs to run on the host system is built with the host compiler, and code that needs to run in the unikernel is built with the target compiler (the cross-compiler). First-class cross-compilation support also easily enables new targets such as esp32 or risc-v. OCaml cross-compilation is not a nice story: in 4.11.1 it is still hard to build a cross-compiler, and a lot of tweaks are needed to build one correctly. Therefore the biggest changes are in ocaml-freestanding: it is split in two, a libc and a cross-compiler, so that the cross-compiler's configuration step can use the installation path of the libc.

This is the summary of changes:

  • The mirage tool is updated: unikernels are either built for unix or cross-compiled. Cross-compiled backends may require installing toolchains with opam (ocaml-freestanding, solo5-bindings-<x>).
  • Introduction of a new, headers/tools-only solo5-headers opam package.
  • Introduction of a solo5-libc package containing nolibc and openlibm.
  • ocaml-freestanding becomes a cross-compiler based on the solo5 headers and libc. It is installed in <opam-switch-root>/freestanding-sysroot/ and can be referred to using ocamlfind -toolchain freestanding <command> (see the sketch below).
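
As a concrete illustration, installing and invoking the toolchain could look roughly like the following sketch; the hvt target and the single-module compile are illustrative, not the actual commands emitted by mirage:

```sh
# Install the cross toolchain with the host opam (package names as described above).
opam install ocaml-freestanding solo5-bindings-hvt

# The cross-compiler is then selected through the findlib toolchain mechanism:
ocamlfind -toolchain freestanding ocamlopt -c main.ml
```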

Step 2: duniverse

The opam tooling is not ready for cross-compilation, so we need to rely exclusively on dune to perform unikernel builds. This means that a tool was needed to fetch all the required dependencies of the unikernel: this is the role of duniverse.
duniverse, now known as the opam plugin opam monorepo, uses opam to parse the dependency file and locate all the required sources, before downloading them into a ./duniverse folder.
Once the sources are fetched, dune is configured with one build context per mirage target, each set up to use the correct compiler. A single dune build command then builds all the required dependencies and produces the requested unikernels.
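
For illustration, a generated dune-workspace could look roughly like this sketch; the dune language version and the context name are placeholders, only the toolchain field reflects the mechanism described above:

```
(lang dune 2.7)

;; Host context: build-time tools are compiled here.
(context (default))

;; One cross context per mirage target, using the findlib toolchain
;; installed by ocaml-freestanding ("mirage-hvt" is an illustrative name).
(context
 (default
  (name mirage-hvt)
  (toolchain freestanding)))
```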

This is the summary of changes:

  • mirage configure -t <target> generates:

    • dune: build rules
    • dune-project: general dune configuration
    • dune-workspace: definition of build contexts; this is how installed cross-compilers are detected by dune
    • <unikernel>-hvt.opam: opam file declaring the dependencies (runtime, build and toolchain). Toolchain dependencies are guarded by a new build-context variable and are meant to be installed by opam.
    • mirage.context: command line arguments for mirage configure
    • Makefile
  • Duniverse is now an opam plugin, called opam monorepo. It parses an opam file to compute the transitive closure of a project's dependencies and fetches them locally. It uses the opam resolver, so it supports opam pins and repositories.

  • opam monorepo lock: solves the dependencies and finds where to fetch each project, generating a <unikernel>.opam.lock file.

  • opam monorepo pull: downloads the dependencies according to the lockfile.

Workflows

Build a unikernel from scratch

  • Install OCaml and opam
  • opam install mirage
  • mirage configure -t <target>: generates build / install files (see earlier)
  • make depend or mirage depend: install dependencies:
    • toolchains using opam (env OPAMVAR_build_context=1 opam install . --deps-only)
    • library sources using duniverse (equivalent to opam monorepo lock && opam monorepo pull)
  • dune build: build the unikernel for all requested targets; the output is located in _build/mirage-<target>/ (a full session sketch follows this list).
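
Put together, a from-scratch session could look like this sketch (hvt is an example target; the generated file names follow the description above):

```sh
opam install mirage        # the configuration tool itself
mirage configure -t hvt    # generates dune, dune-project, dune-workspace, the opam file and a Makefile
make depend                # installs toolchains via opam, then fetches sources with opam monorepo
dune build                 # cross-compiles everything; output lands in _build/mirage-hvt/
```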

Then, it's possible to:

  • change unikernel.ml to modify the app: dune build
  • change config.ml to change the configuration:
    • If there's some dependency change: mirage configure -t <target> && make depends
    • dune build
  • install the unikernel binary: dune install.

Updating libraries

  • A new package is merged into opam-repository: run opam update to get the latest packages in the current switch.
  • To update the duniverse/ directory: mirage configure -t <target> && make depends.
  • dune build to rebuild the unikernel (see the sketch below).
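
A sketch of that refresh loop, with hvt as an illustrative target:

```sh
opam update                               # pick up newly released packages in the current switch
mirage configure -t hvt && make depends   # re-resolve and re-fetch the duniverse/ directory
dune build                                # rebuild the unikernel against the updated sources
```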

Locally editing a package in the dependency tree

  • cd duniverse && rm -rf <package> && opam source <package> (or do a local clone of the corresponding git repository)
  • hack on it (dune build && dune test) at the repository root
  • push your changes upstream and modify config.ml to add a new pin-depends, to share the changes with others (see the config.ml sketch below).
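
As an illustration of that last step, a hypothetical config.ml fragment might look as follows; the repository URL and package name are placeholders, and the exact pinning API should be checked against the mirage release you are using:

```ocaml
(* Hypothetical sketch: pin a dependency to a fork so that duniverse fetches it
   from there instead of from the released tarball. *)
open Mirage

let packages =
  [ package ~pin:"git+https://github.com/yourname/yourlib.git#your-fix" "yourlib" ]

let main = foreign ~packages "Unikernel.Main" (console @-> job)
let () = register "example" [ main $ default_console ]
```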

Use a custom opam repository

  • all packages must use dune in order to be built into a unikernel
  • add the repository to the current opam switch: opam repo add <name> <url>
  • make depends: duniverse will pick up the newly configured opam repository.

Specific points

Testing packages / CI ?

One problem is that packages intended to be built only with duniverse (such as mirage-solo5) will still be published on opam, even though installing them directly no longer has any meaning. Publishing them on opam does mean they can still be tested by the opam CI. Packages that aim to be cross-compiled (and thus installed by duniverse) must not have any non-dune dependency, so to keep these packages testable, their non-dune dependencies (such as ocaml-freestanding) need to be declared as {with-test} dependencies.
I'm currently experimenting with an ocurrent pipeline that uses mirage-dev and mirage-skeleton to test unikernel builds: https://github.com/TheLortex/mirage-ci.
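
For example, a duniverse-only package could guard its non-dune toolchain dependency roughly like this (a sketch, not the actual mirage-solo5 opam file; the version bound is illustrative):

```
depends: [
  "dune" {>= "2.7"}
  "ocaml-freestanding" {with-test}  # only needed so the opam CI can exercise the package
]
```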

Dune

A big change in the mirage workflow is that the whole unikernel dependency tree must be built with dune. As a consequence, non-dune dependencies have been forked, and these forks are picked up by duniverse through an overlay repository: https://github.com/dune-universe/opam-overlays.
For now we have to maintain these forks, but other solutions are to be discussed:

  • perform the switch to dune in the upstream repository.
  • sandbox the build process using dune features.

For a general-purpose library to be mirage-compatible, there are several requirements:

  • It must not depend (runtime) on the unix library or target-specific mirage libraries such as mirage-solo5, mirage-xen.
  • It must not depend (either runtime or {build}) on packages that are not built by dune.
  • C stubs need to be standalone and must not use glibc, to ease cross-compilation with mirage targets (i.e. be platform-independent); see the sketch after this list.
  • test dependencies are not constrained, because they are installed by opam.
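
A minimal sketch of a dune stanza for such standalone stubs (library and file names are placeholders):

```
(library
 (name mylib)
 ;; The stubs use only the C they strictly need and avoid glibc-specific calls,
 ;; so the same sources compile under the freestanding toolchain.
 (foreign_stubs (language c) (names mylib_stubs)))
```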

For target-specific mirage libraries, the requirements are the following:

  • It must not depend (either runtime or {build}) on packages that are not built by dune.
  • It may depend on same-target dune-built packages.
  • It may have arbitrary test dependencies.
@avsm
Member

avsm commented Oct 28, 2020

This scheme looks good to me. I'd just note that @dra27 is also working on integration of cross-compilation directly upstream in OCaml, so the situation with the dune forks should hopefully be constrained to a few releases of upstream OCaml before we can use the native support directly.

@hannesm
Member

hannesm commented Oct 29, 2020

Thanks for your high-level overview. Are there any further details available? Especially taking mirage/mirage-xen#23 into account: how are CFLAGS being passed around (at the moment via pkg-config, will the new workflow be different?)? If I understand correctly, solo5 and ocaml-freestanding will still be built using "the host opam"? More specifically, what are the pain libraries when it comes to cross-compilation? I imagine gmp-freestanding & zarith(-freestanding), which do not use dune (yet?); it'd be nice to lay out the strategy for them more clearly.

Will the opam file generated by mirage configure be sufficient to ship (and run opam install . on a different system to compile the preconfigured unikernel)? Or are the other generated artifacts (dune, dune-project, ...) required?

I guess it boils down to: considering all the dependencies are using dune, is opam monorepo needed at all?

@TheLortex
Member Author

how are CFLAGS being passed around (at the moment via pkg-config, will the new workflow be different?)?

CFLAGS that are common to all Solo5 targets are configured in the ocaml-freestanding cross-compiler.
Target-specific CFLAGS may be passed via pkg-config (in the solo5-bindings-<...> packages, for example) and set up in the dune-workspace file (see the sketch below).
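
So a target context in the dune-workspace file could carry target-specific C flags along the lines of this sketch (the flag value is illustrative, e.g. what pkg-config --cflags would report for the bindings package):

```
(context
 (default
  (name mirage-hvt)
  (toolchain freestanding)
  ;; Target-specific C flags injected into the cross context.
  (env
   (_ (c_flags (:standard -I/usr/local/include/solo5))))))
```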

If I understand correctly, solo5 and ocaml-freestanding will still be built using "the host opam"?

Yes, and installed as usual using make depends. By the way, a nice change is that all solo5 targets become co-installable.

More specifically, what are the pain libraries when it comes to cross-compilation? I imagine gmp-freestanding & zarith(-freestanding), which do not use dune (yet?); it'd be nice to lay out the strategy for them more clearly.

Indeed, these are the pain libraries. The goal is to have single packages (gmp and zarith) that sandbox the build process in dune, making sure that the dune-workspace flags and compilers are used.

Will the opam file generated by mirage configure be sufficient to ship (and run opam install . on a different system to compile the preconfigured unikernel)? Or are the other generated artifacts (dune, dune-project, ...) required?
I guess it boils down to: considering all the dependencies are using dune, is opam monorepo needed at all?

The goal is to keep that workflow possible, or at least very similar. For now, all generated artifacts are required to build the unikernel, including the dependencies locally fetched by opam monorepo. This also means that once the dependencies are fetched, opam monorepo is no longer needed to deploy the unikernel. As I don't have much experience in unikernel deployment, I'll take the time to learn how it's done so as not to break anything.

@EduardoRFS

GMP and Zarith

For zarith cross-compilation, I have a patch on top of the zarith used for duniverse:

https://github.com/EduardoRFS/reason-mobile/tree/master/patches/_opam_zarith

The same goes for gmp, but gmp is quite straightforward to cross-compile:

https://github.com/EduardoRFS/reason-mobile/blob/master/patches/esy_gmp/package.json

Cross Compile

Currently, unix unikernels can easily be cross-compiled with these patches, as you can see at esy-mirage-kernel, which is building Mirage for a couple of platforms: Android, FreeBSD, iOS and Linux.

@TheLortex
Member Author

TheLortex commented Nov 12, 2020

Two weeks have passed, and after some testing and implementation work I feel confident pushing forward and starting to submit the PRs. I just want to enumerate the issues I encountered and make sure everyone agrees with how the changes will be performed.

Specific issues

Mirage-skeleton/testing

To test the mirage 4 tool, I focused on building the unikernels from mirage-skeleton for the unix, xen and hvt targets. They all work, except for ping6 and dns, which have some dependency problems. If you think I haven't covered a use case that would fail with the new tool, please comment on this issue.

GMP/Zarith

GMP/Zarith are the pain libraries of this update because of a complex configure script and the quantity of C stubs. Because the unikernel dependency tree has to be built with dune, I had to wrap the gmp build process in a sandbox, exporting libgmp.a and gmp.h to be used by zarith. I haven't found a way to switch between the system gmp and the self-built gmp, so for now gmp is always self-built when building a unikernel that needs it. This may take some time depending on the configuration, so I advise users to enable the dune cache so that this build only has to be done once. Documentation on the build cache is available at https://dune.readthedocs.io/en/stable/caching.html.
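
Enabling the cache is a one-liner; a sketch (check the linked documentation for the configuration options of your dune version):

```sh
export DUNE_CACHE=enabled   # or put (cache enabled) in ~/.config/dune/config
dune build                  # subsequent builds of gmp are served from the cache
```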

Mirage CLI

Tool-specific changes will be discussed in the appropriate PR. In summary:

  • mirage configure generates files and dune rules to build the unikernel according to the target.
  • additional files may be generated in a dedicated directory (design decision to be taken).
  • build artifacts may be promoted into the source tree (design decision to be taken).
  • an opam file is generated: what are the build rules when opam install is performed? If someone uses it, I would like a description of the constraints for that opam install workflow.

Dune/cross-compilation

A lot of the changes stem from this issue: ocaml/dune#3917.
Basically, the problem is that executable dependencies (such as configure scripts) are compiled in the target workspace when they are declared explicitly, which leads to compilation errors because the target workspace doesn't have the unix package, for example.
Documentation should be included to explain how to package OCaml code so that it is cross-compilation-friendly.
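
To illustrate the problem described above, a rule like the following sketch breaks in a cross context, because the explicitly-declared gen.exe dependency is compiled with the target toolchain, where unix is unavailable (names are illustrative):

```
(executable
 (name gen)
 (libraries unix))   ; a build-time code generator that links against unix

(rule
 (targets generated.ml)
 (deps gen.exe)                             ; explicit executable dependency
 (action (run ./gen.exe -o generated.ml)))  ; built in the target context when cross-compiling
```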

Release plan

The main changes get a major or minor version bump. Packages affected by the ocaml/dune#3917 issue may get a patch version bump. Packages that have C stubs (and thus -freestanding / -unix variants) will be broken on solo5 targets with mirage 4, so they should get a major version bump.
With regards to the release order, should everything be done at the same time?
Minor releases can be done separately; I'll start by issuing PRs on these packages to clear things up.

Packages

Main changes

| repository | packages | reason | fork | upgrade |
| --- | --- | --- | --- | --- |
| mirage/Solo5 | solo5-headers, solo5-bindings-* | Separate headers and bindings, allow bindings to be co-installed. | TheLortex/solo5#mirage-4 | 0.6.0 -> 0.7.0 |
| N/A | solo5-libc | Moved from ocaml-freestanding to correctly set up compilation flags. | TheLortex/solo5-libc#mirage-4 | N/A -> 0.7.0 |
| mirage/ocaml-freestanding | ocaml-freestanding | Turn ocaml-freestanding into a cross-compiler; move the libc into a separate package. | TheLortex/ocaml-freestanding#mirage-4 | 0.6.0 -> 0.7.0 |
| mirage/mirage | mirage, mirage-runtime, functoria, functoria-runtime | Build unikernels with dune. | TheLortex/mirage#mirage-4 | 3.9.0 -> 4.0.0 |
| mirage/mirage-unix | mirage-unix | Don't use io-page-unix. | TheLortex/mirage-unix#mirage-4 | 4.0.0 -> 4.0.1 |
| mirage/mirage-solo5 | mirage-solo5 | Rely on cross-compiler and workspace flags. | TheLortex/mirage-solo5#mirage-4 | 0.6.4 -> 0.7.0 |
| mirage/mirage-xen | mirage-xen | Rely on cross-compiler and workspace flags. | TheLortex/mirage-xen#mirage-4 | 6.0.0 -> 7.0.0 |

Updates

| repository | packages | reason | fork | upgrade | PR |
| --- | --- | --- | --- | --- | --- |
| janestreet/base | base | ocaml/dune#3917 | TheLortex/base#fix-dep | v0.14.0 -> v0.14.1 | janestreet/base#100 |
| mirage/ocaml-base64 | base64 | ocaml/dune#3917 | master branch | 3.4.1 -> 3.4.2 | |
| mirage/cohttp | cohttp-* | ocaml/dune#3917 | TheLortex/ocaml-cohttp#v2.5.4-fix-dep, TheLortex/ocaml-cohttp#fix-dep | 2.5.4 -> 2.5.5 or 3.0.1 -> 3.0.2 | mirage/ocaml-cohttp#735 |
| ocaml/dune | dune | Waiting for dune 1.8 | | | |
| N/A | gmp | Build GMP using dune, ensuring the correct flags are used. | TheLortex/ocaml-gmp | New package: 6.2.0 | New repo: https://github.com/mirage/ocaml-gmp |
| mirage/ocaml-magic-mime | magic-mime | ocaml/dune#3917 | TheLortex/ocaml-magic-mime#fix-dep | 1.1.2 -> 1.1.3 | mirage/ocaml-magic-mime#19 |
| mirage/mirage-block-unix | mirage-block-unix | Don't use io-page-unix | samoht/mirage-block-unix#dune | 2.12.1 -> 2.12.2 | |
| mirage/mirage-clock | mirage-clock, mirage-clock-unix, mirage-clock-freestanding | ocaml/dune#3917 | TheLortex/mirage-clock#fix-dep | 3.0.1 -> 3.1.0 | OK |
| mirage/mirage-crypto | mirage-crypto-pk | Depend only on zarith (not zarith-freestanding) | N/A | 0.8.6 -> 0.8.7 | |
| mirage/mirage-*-solo5 | mirage-*-solo5 | Higher upper bound for the mirage-solo5 constraint (<= "0.8.0") | N/A | 0.6.1 -> 0.6.2 | |
| ocaml/num (forked in dune-universe/num) | num | ocaml/dune#3917 | TheLortex/num#fix-dep | 1.3 -> 1.4 | dune-universe/num#1 |
| janestreet/parsexp | parsexp | ocaml/dune#3917 | TheLortex/parsexp#fix-dep | 0.14.0 -> 0.14.1 | janestreet/parsexp#6 |
| mirage/mirage-tcpip | tcpip | ocaml/dune#3917 | TheLortex/mirage-tcpip#fix-dep | 5.0.1 -> 5.0.2 | |
| mirage/ocaml-uri | uri, uri-re, uri-sexp | ocaml/dune#3917 | TheLortex/ocaml-uri#fix-dep | 4.0.0 -> 4.0.1 | mirage/ocaml-uri#151 |
| ocaml/Zarith (forked in dune-universe/Zarith) | zarith | Use dune and self-built gmp. | TheLortex/Zarith#mirage-4 | 1.11 -> 1.12 | |

Testing

The work can be tested by using the following opam repositories and packages:

  • opam repo add dune-universe git+https://github.com/dune-universe/opam-overlays.git
  • opam repo add mirage-dev git+https://github.com/TheLortex/mirage-dev.git#mirage-4
  • opam pin add git+https://github.com/ocamllabs/duniverse.git
  • opam install mirage

I'm asking for feedback, especially on the mirage tool changes, because that's what most directly impacts the user workflow.

@avsm
Member

avsm commented Nov 13, 2020

@TheLortex could you open a PR for the Base deps change? That will have some lead time to get into a release, so it would be good to get it done early.

@hannesm
Member

hannesm commented Nov 14, 2020

First of all, thanks for your amazing work on this topic.

With regards to the release order, should everything be done at the same time ?

The most convenient approach is to release bottom-up to the opam repository, i.e. starting with the packages that work fine with the current opam repository. See #1159 (comment) for how we did the last release.

an opam file is generated: what are the build rules when opam install is performed? If someone uses it, I would like a description of the constraints for that opam install workflow.

I use the opam file generated by mirage configure in the following way:

  • I execute mirage configure -t hvt once for the unikernel I'd like to deploy.
  • I store the generated mirage-unikernel-YYY-hvt.opam in a custom opam repository.
  • On a daily basis, I run opam install mirage-unikernel-YYY-hvt in a fresh switch and preserve (a) the used opam packages (opam switch export) and (b) the generated binary.
  • This allows me to track changes to the opam repository, spot incompatibilities, and use orb to reproduce the binary unikernel bit by bit (the rebuild subcommand takes the above switch export as input and produces the same output).

With your work, I'm slightly scared that "duniverse" is a dependency which (a) is unreleased, (b) uses some "opam overlays" (with an unclear maintenance status, and it is unclear how and whether the changes will be upstreamed (e.g. num is still at 1.3, zarith at 1.9)), and (c) still has a rather unclear purpose from my point of view (similarly to opam, it downloads packages).

See my earlier attempt at using the "MirageOS+dune+duniverse" at #1153 (comment) -- are the issues fixed now, should I test it again? What is the story about how sources are gathered? I prefer - similar to what we have now - a way to "download sources" and a separate step to build and install the unikernel -- i.e. no network access during the build & install phase.

Maybe the way to move forward is having mirage emit a duniverse build dependency into the opam file, and having duniverse init as part of the build instructions? Or should it all work differently now, with mirage/duniverse outputting a shell script that can be used to generate the very same (bit-by-bit identical) unikernel image without using duniverse and opam at all...

@TheLortex
Member Author

@avsm

@TheLortex could you open a PR for the Base deps change? That will have some lead time to get into a release, so it would be good to get it done early.

If you're talking about janestreet/base#100, it's already open! I'm planning to continue opening PRs this week.

@hannesm

With your work, I'm slightly scared that "duniverse" is a dependency which (a) is unreleased, (b) uses some "opam overlays" (with an unclear maintenance status, and it is unclear how and whether the changes will be upstreamed (e.g. num is still at 1.3, zarith at 1.9)), and (c) still has a rather unclear purpose from my point of view (similarly to opam, it downloads packages).

(a) I agree that duniverse needs to be released before releasing mirage 4.
(b) opam-overlays is an opam repository of packages that have been ported to dune but where the changes are not upstreamed. It is used by duniverse in the resolution process (opam monorepo init), because all unikernel dependencies need to be built with dune. With regards to maintenance and upstreaming status, maybe @avsm can tell us more about that -- I think some packages will stay as forks and thus require some maintenance.

See my earlier attempt at using the "MirageOS+dune+duniverse" at #1153 (comment) -- are the issues fixed now, should I test it again? What is the story about how sources are gathered? I prefer - similar to what we have now - a way to "download sources" and a separate step to build and install the unikernel -- i.e. no network access during the build & install phase.

I think you can test it, but the "deploy" part that you described is still missing, so you might prefer to wait a little. Most of the issues are fixed. To answer your questions, the steps are:

  • mirage configure: generates the build rules and an opam file describing both the unikernel dependencies (to be locally fetched by duniverse) and the build dependencies (to be installed by opam).
  • make depends: download sources
    • OPAMVAR_build_context=1 opam install . --deps-only: install build dependencies in the opam switch (such as ocaml-freestanding).
    • opam monorepo init: resolves the unikernel library dependencies using the current opam switch configuration (so, for now, your opam switch needs to include opam-overlays and mirage-dev for the resolution to succeed) and writes a <unikernel>.opam.locked file containing the dependency resolutions.
    • opam monorepo pull: fetches all dependencies into a duniverse/ subfolder.
  • dune build: builds the unikernel and promotes it into a dist/ directory.

Maybe the way to move forward is having mirage emit a duniverse build dependency into the opam file, and having duniverse init as part of the build instructions? Or should it all work differently now, with mirage/duniverse outputting a shell script that can be used to generate the very same (bit-by-bit identical) unikernel image without using duniverse and opam at all...

It's not possible to use duniverse in opam build instructions, because opam uses a sandbox that blocks all network calls. That's where the story needs to be figured out: one solution is to commit the duniverse/ folder and the generated files, so that only the dune build instruction remains in the opam file. Maybe the mirage tool can take care of that, for example in a separate branch. In any case, the workflow cannot be opam-less because of the build dependencies (dune, ocaml-freestanding, solo5-bindings-*).

Now that the "deployment" workflow is clearer to me, I'll try to come up with a solution!

@hannesm
Member

hannesm commented Dec 2, 2020

FWIW, with tcpip 6.0.0 (and #1204 being merged) there's no need for a customized tcpip anymore: the checksum stubs are now provided by mirage-solo5 / mirage-xen, and tcpip's build system was simplified.

@samoht
Member

samoht commented Oct 19, 2021

There are three sets of packages which are needed for the release:

  1. the packages needed to configure a unikernel (e.g. code generation + linking against config.ml). This requires a release of the mirage tool and the mirage libs (mirage, mirage-runtime, functoria, functoria-runtime and opam-monorepo).
  2. the packages needed to install a cross-compiler in the current opam switch. This requires a release of solo5 and freestanding.
  3. the packages used by unikernels which need cross-compilation fixes (currently base, mirage-solo5, mirage-xen, parsexp, zarith).

At the moment, all of these packages live in https://github.com/mirage/mirage-dev/tree/master/packages ; my short-term plan is to:

  • move the packages from set 3 into https://github.com/mirage/opam-overlays, where they will be picked up by opam monorepo when building the cross-compilation workspace.
  • cut releases of the packages from sets 1 and 2 into opam.

@samoht
Member

samoht commented Jan 7, 2022

I'm closing this issue now, the remaining bits of the release are tracked on #1261
