diff --git a/src/data.ml b/src/data.ml
index 6c1ded5a5..b3f9ac271 100644
--- a/src/data.ml
+++ b/src/data.ml
@@ -154,6 +154,12 @@ module Blog = struct
     let open Cowabloga.Date in
     let open Cowabloga.Blog.Entry in
     [
+      { updated = date (2017, 02, 27, 14, 00);
+        authors = [hannes];
+        subject = "Size matters: how Mirage got smaller and less magical";
+        body = "mirage-3-smaller.md";
+        permalink = "mirage-3-smaller";
+      };
       { updated = date (2017, 02, 23, 17, 00);
         authors = [yomimono];
         subject = "Announcing MirageOS 3.0.0";
diff --git a/tmpl/blog/mirage-3-smaller.md b/tmpl/blog/mirage-3-smaller.md
new file mode 100644
index 000000000..4fdd6ee02
--- /dev/null
+++ b/tmpl/blog/mirage-3-smaller.md
@@ -0,0 +1,209 @@
+# Size matters: how Mirage got smaller and less magical
+
+This article gives some technical background and empirical evidence on how we
+reduced the lines of code in Mirage3, which has about 25% fewer lines of code
+than Mirage2 while providing [more features](https://mirage.io/blog/announcing-mirage-30-release).
+
+Mirage has done a fair amount of code generation since its initial release, to
+extend target-agnostic unikernels to target-specific virtual machine images (or
+Unix binaries).
+Until Mirage 2.7, string concatenation [was used
+heavily](https://github.com/mirage/mirage/blob/v2.6.1/lib/mirage.ml). Since the
+Mirage 2.7.0 release (February 2016), code generation is based on
+[functoria](https://mirage.io/blog/introducing-functoria), "a DSL to describe a
+set of modules and functors, their types and how to apply them in order to
+produce a complete application".
+The code generated by Mirage3 is less complex than that generated by Mirage2 and
+contains up to 45% fewer lines of code.
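+
+To make the quoted description a bit more concrete, here is a toy model of such
+a DSL in plain OCaml. It is only a sketch under our own assumptions, not
+functoria's actual definitions: the `typ` and `impl` types, the `emit` function,
+and the module names are all made up for illustration.
+
+```ocaml
+(* Toy model of a typed description of modules and functors.
+   Every name here is illustrative, none of this is functoria's API. *)
+
+(* Phantom types standing for module signatures. *)
+type console
+type job
+
+(* A module type: a named signature, or a functor from one type to another. *)
+type _ typ =
+  | Type : string -> 'a typ
+  | Function : 'a typ * 'b typ -> ('a -> 'b) typ
+
+(* An implementation: a named module, or a functor applied to an argument. *)
+type _ impl =
+  | Impl : string * 'a typ -> 'a impl
+  | App : ('a -> 'b) impl * 'a impl -> 'b impl
+
+let ( @-> ) a b = Function (a, b)
+let ( $ ) f x = App (f, x)
+
+let console : console typ = Type "CONSOLE"
+let job : job typ = Type "JOB"
+
+(* A hypothetical console device and a unikernel functor expecting one. *)
+let default_console = Impl ("Console_unix", console)
+let main = Impl ("Unikernel.Main", console @-> job)
+
+(* Render a description as OCaml functor application syntax. *)
+let rec emit : type a. a impl -> string = function
+  | Impl (name, _) -> name
+  | App (f, x) -> emit f ^ "(" ^ emit x ^ ")"
+
+let () = print_endline (emit (main $ default_console))
+```
+
+In this sketch, applying `main` to anything that is not a `console impl` is
+rejected by the OCaml type checker, so a malformed functor application is caught
+before any code is emitted, which plain string concatenation cannot offer.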
+
+## Generating code considered harmful
+
+Code generated by a program with intricate control flow and automatically
+generated identifier names is difficult for a human to understand when the
+generated code is incorrect and needs to be debugged (or when the compiler chokes
+on it with an error message pointing into the middle of intricate generated code).
+It is also a burden on the developer: generated code should not be part of the
+version control system, so the build system needs to include another step.
+If the code generator is buggy, or not easily extensible with new features,
+developers may want to manually modify the generated code - which then turns
+into a release nightmare, since you need to maintain a set of patches on top of
+generated code while the code generator is developed alongside. Generating
+code is best avoided - maybe there is a feature in the programming language to
+solve the boilerplate without code generators.
+
+Having said this, there's nothing wrong with LISP macros or MetaOCaml.
+
+Mirage uses code generation to complete backend-agnostic unikernels with the
+boilerplate required to compile for a specific backend - by selecting the
+network device driver, the console, the network stack, and other devices -
+taking user-supplied configuration arguments into account. In Mirage, the OCaml
+TCP/IP stack accepts any network device which implements the
+[`Mirage_net.S`](http://docs.mirage.io/mirage-net/Mirage_net/module-type-S/index.html)
+module type.
+
+At the end of the day, some mechanism needs to be in place which links the
+[mirage-net-solo5](https://github.com/mirage/mirage-net-solo5) library if
+compiling for Solo5 (or
+[mirage-net-xen](https://github.com/mirage/mirage-net-xen) if compiling for Xen,
+or [mirage-net-unix](https://github.com/mirage/mirage-net-unix) for Unix, or
+[mirage-net-macosx](https://github.com/mirage/mirage-net-macosx) for MacOSX).
+This could be left to each unikernel developer, but that would mean repeating the
+same boilerplate code all over, and updating every copy whenever a new backend
+becomes available (Mirage2 knew about Xen, Unix, and MacOSX; Mirage3 extends
+this with Solo5 and Qubes). Instead, the mirage tool generates this boilerplate
+by knowing about all supported devices, and which library a unikernel has to
+link for a device depending on the target and command line arguments.
+That's not exactly the ideal solution, but it works well enough for us right
+now ([more or less](https://github.com/mirage/mirage/pull/750)). A single place - the mirage tool - needs to be extended whenever a new backend becomes
+available.
+
+## Device initialisation - `connect`
+
+Devices may depend on each other, e.g. a TCP stack requires a monotonic clock and a
+random number generator, which influences the initialisation order. Mirage
+generates the device initialisation startup code based on the configuration and
+data dependencies (which hopefully form an acyclic graph). Mirage2 allowed
+initialisation errors to be handled (the type of `connect` used to be ``unit -> [ `Ok of t | `Error of error ] io``), but calls to `connect` were automatically
+generated, and the error handler always printed an error message and exited.
+Because the `error` type was generic, Mirage2 didn't know how to print it properly,
+and instead failed with some incomprehensible error message. Pretty printing
+errors is solved in Mirage3 by our [re-work of errors](https://github.com/mirage/mirage/pull/743), which now use the `result`
+type, are extensible, and can be pretty printed. Calls to `connect` are still
+automatically generated, and handling errors gracefully is out of scope for a
+unikernel -- where should it get the other two network devices promised at
+configuration time from, if they're not present on the (virtual) PCI bus?
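+
+The difference between an opaque error and a printable one can be sketched in
+plain OCaml. This is a simplified model, not the real Mirage signatures: the
+`io` (Lwt) wrapper is left out, and the module types `DEVICE_V2` and
+`DEVICE_V3`, the functors, and both failing devices are hypothetical names
+invented for this example.
+
+```ocaml
+(* Simplified model: no Lwt, and all module names are made up. *)
+
+(* Mirage2 style: the error type is abstract and comes without a printer. *)
+module type DEVICE_V2 = sig
+  type t
+  type error
+  val connect : unit -> [ `Ok of t | `Error of error ]
+end
+
+(* Generated startup code can only report that something went wrong. *)
+module Start_v2 (D : DEVICE_V2) = struct
+  let start () =
+    match D.connect () with
+    | `Ok _ -> "connected"
+    | `Error _ -> "connect failed"
+end
+
+(* Mirage3 style: errors travel in [result] and come with [pp_error]. *)
+module type DEVICE_V3 = sig
+  type t
+  type error
+  val pp_error : Format.formatter -> error -> unit
+  val write : t -> string -> (unit, error) result
+end
+
+(* Generic code can now print the cause without knowing the error type. *)
+module Report_v3 (D : DEVICE_V3) = struct
+  let write t buf =
+    match D.write t buf with
+    | Ok () -> "written"
+    | Error e -> Format.asprintf "write failed: %a" D.pp_error e
+end
+
+(* Two hypothetical devices that always fail, to exercise both paths. *)
+module Net_v2 = struct
+  type t = unit
+  type error = [ `Unknown of string ]
+  let connect () : [ `Ok of t | `Error of error ] = `Error (`Unknown "tap1")
+end
+
+module Net_v3 = struct
+  type t = unit
+  type error = [ `Frame_too_big of int ]
+  let pp_error ppf (`Frame_too_big n) =
+    Format.fprintf ppf "frame of %d bytes exceeds the MTU" n
+  let write () buf : (unit, error) result =
+    Error (`Frame_too_big (String.length buf))
+end
+
+module S2 = Start_v2 (Net_v2)
+module R3 = Report_v3 (Net_v3)
+
+let () =
+  print_endline (S2.start ());
+  print_endline (R3.write () "hello")
+```
+
+In the first functor the cause of the failure is lost, because nothing is known
+about `error`. In the second, the device supplies `pp_error`, so code that is
+generic over the device can still produce a readable message.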
+
+The solution we [discussed](https://lists.xenproject.org/archives/html/mirageos-devel/2016-09/msg00050.html)
+and [implemented](https://github.com/mirage/mirage/pull/602) (also in [functoria](https://github.com/mirage/functoria/pull/71)) was to always fail hard (i.e. crash) in `connect : unit -> t`. This led to a series of patches for all implementors of `connect`,
+many of which removed control flow complexity (and simplified test
+cases, see e.g.
+[mirage-net-unix](https://github.com/mirage/mirage-net-unix/pull/27/files), or
+[tcpip](https://github.com/mirage/mirage-tcpip/pull/251/files)). Lots of common
+boilerplate (like `or_error`, which raises an exception if `connect` returned an error)
+could be removed.
+
+Comparing the generated `main.ml` between Mirage 2.9.1 and 3.0.0 for various
+unikernels on both Unix and Xen shows code reductions of up to 45% ([diffs are
+here](http://www.cl.cam.ac.uk/~hm519/mirage-2.9.1-3.0.0-diffs/)):
+
+- console (device-usage) xen: +35 -41 (now 81) unix: +32 -39 (now 80)
+- block (device-usage) xen: +36 -45 (now 87) unix: +34 -44 (now 86)
+- kv_ro (device-usage) xen: +34 -59 (now 75) unix: +39 -51 (now 86)
+- network (device-usage) xen: +82 -134 (now 178) unix: +79 -133 (now 177)
+- conduit_server (device-usage) xen: +86 -152 (now 200) unix: +84 -213 (now 199)
+- dhcp (applications) xen: +44 -51 (now 93) unix: +41 -49 (now 92)
+- dns (applications) xen: +86 -143 (now 190) unix: +83 -141 (now 189)
+- static_website_tls (applications) xen: +97 -176 (now 230) unix: +108 -168 (now 237)
+- nqsb.io xen: +122 -171 (now 223) unix: +65 -85 (now 133)
+- btc-pinata xen: +119 -155 (now 217) unix: +64 -73 (now 127)
+- canopy xen: +106 -180 (now 245) unix: +61 -106 (now 159)
+
+## Workflow, phase separation, versioned opam dependencies
+
+The workflow to build a unikernel used to be `mirage configure` followed by
+`make`. During the configure phase, a `Makefile` was generated with the right
+build and link commands (depending on configuration target and other
+parameters). Mirage2 installed opam packages and system packages as a side
+effect during configuration. This led to several headaches: you needed to have the
+target-specific libraries installed while you were configuring (you couldn't
+even test the configuration for Xen if you didn't have Xen headers and support
+libraries installed). Reconfiguration spawned yet another `opam` process (which,
+even if it installs nothing because everything required is already
+installed, takes some time since the solver has to evaluate the universe) -
+unless the `--no-opam` option was passed to `mirage configure`.
+
+A second issue with the Mirage2 approach was that dependent packages were listed
+in the unikernel `config.ml`, and passed as strings to opam. When version
+constraints were included, this caused either the shell (calling out to `opam`) or make
+(embedding the packages in the Makefile) or both to choke. Being able to
+express version constraints for dependencies in `config.ml` was one of the most
+wanted features for Mirage3. It is crucial for further development (to continue
+allowing API breakage and removing legacy): a unikernel author, and the mirage
+tool, can now express versioned dependencies on device interfaces. Instead of a
+garbled error message from mirage trying to compile a unikernel where the
+libraries don't fit the generated code, opam will report which updates are
+necessary.
+
+In a [first rampage](https://github.com/mirage/mirage/pull/691) ([functoria](https://github.com/mirage/functoria/pull/82)), instead of
+manual executions of `opam` processes, an opam package file was generated by
+mirage at configuration time for the given target. This made it possible to express
+version constraints in each `config.ml` file (via the `package` function). This
+change also separated the configuration phase, the dependency installation
+phase, and the build phase - which included delayed invocations of `pkg-config`
+to pass parameters to `ld`. A mess, especially if your goal is to generate
+Makefiles which run on both GNU make and BSD make.
+
+A [second approach](https://github.com/mirage/mirage/pull/703) ([functoria](https://github.com/mirage/functoria/pull/84)) dug a bit
+deeper down the rabbit hole, and removed complex selection and adjustment of
+strings to output the Makefile, by implementing this logic in OCaml (and calling
+out to `ocamlbuild` and `ld`). With an unneeded layer of code generation removed,
+the code is easier to read and understand, shorter, and gives stronger guarantees.
+More potential errors are caught at compile time, instead of generating
+(possibly ill-formed) Makefiles. [Bos](http://erratique.ch/software/bos) is a
+concise library for interacting with basic operating system services, and solves
+once and for all common issues in that area, such as proper escaping of
+arguments.
+
+Instead of a single `configure_makefile` function which generated the entire
+makefile, Mirage3 separates the build and link logic into functions, and
+generates only a simplistic makefile which invokes `mirage build` to build the
+unikernel, and expects all dependent libraries to be installed (e.g. using
+`make depend`, which invokes `opam`) -- no need for delaying `pkg-config` calls
+anymore.
+
+This solution certainly has less complex string concatenation, and mirage now
+has a clearer phase distinction - configure, depend, compile & link. (This
+workflow (still) [lacks a provisioning
+step](https://github.com/mirage/mirage/issues/694) (e.g. private key material,
+if provided as a static binary blob, needs to be present during compilation at
+the moment), but one can easily be added later.) There are drawbacks: the mirage
+utility is now needed during compilation and linking, and needs to preserve
+command line arguments between the configuration and build phases. Maybe the
+build step should be in the opam file; then we would need to ensure unique opam
+package names, and we would need to communicate to the user where the binary
+got built and installed.
+
+## Other functionality removed or replaced
+
+The first commit to mirage is from 2004; back then, opam was an infant. Mirage2
+ensured that a [not-too-ancient version of
+OCaml](https://github.com/mirage/mirage/blob/v2.9.1/lib/mirage.ml#L1462-L1487)
+was installed ([functoria contained a similar piece of
+code](https://github.com/mirage/functoria/blob/1.1.1/lib/functoria_misc.ml#L298-L309)).
+Mirage3 relies on opam to require a certain OCaml version (at the moment 4.03).
+
+Mirage and functoria were developed while support libraries were not yet
+available - notably [bos](http://erratique.ch/software/bos) (mentioned
+above), [fpath](http://erratique.ch/software/fpath),
+[logs](http://erratique.ch/software/logs), and
+[astring](http://erratique.ch/software/astring). Parts of those libraries were
+embedded in functoria, and are now replaced by the libraries. (See
+[mirage#703](https://github.com/mirage/mirage/pull/703) and
+[functoria#84](https://github.com/mirage/functoria/pull/84) in case you want to
+know the details.)
+
+Functoria support for OCaml `<4.02` has been
+[dropped](https://github.com/mirage/functoria/pull/75), and
+[astring](https://github.com/mirage/functoria/pull/77) is now in use.
+Mirage support for OCaml `<4.01` has been
+[dropped](https://github.com/mirage/mirage/blob/v2.9.1/lib/mirage.ml#L1318-L1355)
+as well.
+
+Some C bits and pieces, namely `str`, `bignum`, and `libgcc.a`, are no longer linked into,
+and part of, every unikernel. This is documented in
+[mirage#544](https://github.com/mirage/mirage/pull/544) and
+[mirage#663](https://github.com/mirage/mirage/issues/663).
+
+## Conclusion
+
+The overall statistics of Mirage3 look promising: more libraries, more
+contributors, less code, uniform error treatment, unified logging support. Individual unikernels
+contain slightly less boilerplate code (as shown
+[by these unified diffs](http://www.cl.cam.ac.uk/~hm519/mirage-2.9.1-3.0.0-diffs/)).
+
+The binary sizes of the above-mentioned examples (mirage-skeleton, nqsb, Canopy,
+pinata) differ only slightly between Mirage2 and Mirage3 on both Unix and Xen
+(in the range of kilobytes). We are working on a [performance harness](https://github.com/mirage/mirage/issues/685)
+to evaluate the performance of the
+[flambda](https://blogs.janestreet.com/flambda/) intermediate language in OCaml
+and [dead code elimination](https://github.com/ocaml/ocaml/pull/608). These should
+decrease the binary size and improve the performance.