Merge pull request mirage#533 from yomimono/smaller-post

add size matters blog post
samoht · Feb 27, 2017 · 4e29127
2 parents 1ede254 + adf3f8e
commit 4e29127

Showing 2 changed files with 215 additions and 0 deletions.
diff --git a/src/data.ml b/src/data.ml
@@ -154,6 +154,12 @@ module Blog = struct
     let open Cowabloga.Date in
     let open Cowabloga.Blog.Entry in
     [
+      { updated    = date (2017, 02, 27, 14, 00);
+        authors    = [hannes];
+        subject    = "Size matters: how Mirage got smaller and less magical";
+        body       = "mirage-3-smaller.md";
+        permalink  = "mirage-3-smaller";
+      };
       { updated    = date (2017, 02, 23, 17, 00);
         authors    = [yomimono];
         subject    = "Announcing MirageOS 3.0.0";

diff --git a/tmpl/blog/mirage-3-smaller.md b/tmpl/blog/mirage-3-smaller.md
@@ -0,0 +1,209 @@
+# Size matters: how Mirage got smaller and less magical
+
+This article gives some technical background and empirical evidence on how we
+reduced the amount of code in Mirage3, which has about 25% fewer lines of code
+than Mirage2 while providing [more features](https://mirage.io/blog/announcing-mirage-30-release).
+
+Since its initial release, Mirage has done a fair amount of code generation to
+extend target-agnostic unikernels to target-specific virtual machine images (or
+Unix binaries).
+Until Mirage 2.7, string concatenation [was used
+heavily](https://github.com/mirage/mirage/blob/v2.6.1/lib/mirage.ml).  Since the
+Mirage 2.7.0 release (February 2016), it is based on
+[functoria](https://mirage.io/blog/introducing-functoria), "a DSL to describe a
+set of modules and functors, their types and how to apply them in order to
+produce a complete application".
+The code generated by Mirage3 is less complex than that generated by Mirage2 and contains up to 45% fewer
+lines of code.
+
+## Generating code considered harmful
+
+Code generated by a program with intricate control flow and automatically
+generated identifier names is difficult for a human to understand.  This matters
+when the generated code is incorrect and needs to be debugged, or when the
+compiler chokes on it with an error message pointing into the middle of
+intricate generated code.
+It is also a burden on the developer: generated code should not be part of
+the version control system, so the build system needs to include another step.
+If the code generator is buggy, or not easily extensible for new features,
+developers may want to manually modify the generated code - which then turns
+into a release nightmare, since you need to maintain a set of patches on top of
+generated code while the code generator is developed alongside.  Generating
+code is best avoided - maybe there is a feature in the programming language to
+solve the boilerplate without code generators.
+
+Having said this, there's nothing wrong with LISP macros or MetaOCaml.
+
+Mirage uses code generation to complete backend-agnostic unikernels with the
+required boilerplate to compile for a specific backend - by selecting the
+network device driver, the console, the network stack, and other devices -
+taking user-supplied configuration arguments into account.  In Mirage, the OCaml
+TCP/IP stack requires any network device which implements the
+[`Mirage_net.S`](http://docs.mirage.io/mirage-net/Mirage_net/module-type-S/index.html)
+module type.
+
+At the end of the day, some mechanism needs to be in place which links the
+[mirage-net-solo5](https://github.com/mirage/mirage-net-solo5) library if
+compiling for Solo5 (or
+[mirage-net-xen](https://github.com/mirage/mirage-net-xen) if compiling for Xen,
+or [mirage-net-unix](https://github.com/mirage/mirage-net-unix) for Unix, or
+[mirage-net-macosx](https://github.com/mirage/mirage-net-macosx) for MacOSX).
+This could be left to each unikernel developer, but that would mean repeating the
+same boilerplate code in every unikernel and updating it whenever a new backend
+becomes available (Mirage2 knew about Xen, Unix, and MacOSX; Mirage3 extends
+this with Solo5 and Qubes).  Instead, the mirage tool generates this boilerplate
+by knowing about all supported devices, and which library a unikernel has to
+link for a device depending on the target and command line arguments.
+That's not exactly the ideal solution, but it works well enough for us right
+now ([more or less](https://github.com/mirage/mirage/pull/750)).  A single
+place - the mirage tool - needs to be extended whenever a new backend becomes
+available.
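+
+The selection step described above can be pictured as a simple mapping from
+target to library.  This is only a minimal sketch: the `target` type and the
+`net_library` function are hypothetical names for illustration, not part of
+the mirage tool; only the library names come from the text above.
+
+```ocaml
+(* Hypothetical sketch: [target] and [net_library] are illustrative names,
+   not taken from the mirage tool. *)
+type target = Solo5 | Xen | Unix | MacOSX
+
+(* Map a target to the network device library named in the text. *)
+let net_library = function
+  | Solo5 -> "mirage-net-solo5"
+  | Xen -> "mirage-net-xen"
+  | Unix -> "mirage-net-unix"
+  | MacOSX -> "mirage-net-macosx"
+
+(* Print the library chosen for each target. *)
+let () =
+  List.iter
+    (fun t -> print_endline (net_library t))
+    [ Solo5; Xen; Unix; MacOSX ]
+```
+
+Keeping such a table in one place means that a new backend needs one more
+case here, rather than a change in every unikernel.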
+
+## Device initialisation - `connect`
+
+Devices may depend on each other, e.g. a TCP stack requires a monotonic clock and a
+random number generator, which influences the initialisation order.  Mirage
+generates the device initialisation startup code based on the configuration and
+data dependencies (which hopefully form an acyclic graph).  Mirage2 allowed
+initialisation errors to be handled (the type of `connect` used to be
+``unit -> [ `Ok of t | `Error of error ] io``), but calls to `connect` were automatically
+generated, and the error handler always spat out an error message and exited.
+Because the `error` was generic, Mirage2 didn't know how to properly print it,
+and instead failed with some incomprehensible error message.  Pretty printing
+errors is solved in Mirage3 by our [re-work of errors](https://github.com/mirage/mirage/pull/743), which now use the `result`
+type, are extensible, and can be pretty printed.  Calls to `connect` are
+automatically generated, and handling errors gracefully is out of scope for a
+unikernel -- where should it get the other 2 network devices promised at
+configuration time from, if they're not present on the (virtual) PCI bus?
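+
+How data dependencies determine the initialisation order can be sketched as a
+depth-first traversal of the dependency graph.  This is a minimal illustration
+and not the functoria implementation: the device names, the `deps` table, and
+`init_order` are all hypothetical.
+
+```ocaml
+(* Hypothetical dependency table: device names are made up for
+   illustration. *)
+let deps = function
+  | "tcp" -> [ "mclock"; "random" ]
+  | "stack" -> [ "net"; "tcp" ]
+  | _ -> []
+
+(* Depth-first post-order traversal: each device is emitted after all of
+   its dependencies.  Assumes the graph is acyclic; a cycle would make the
+   traversal loop forever. *)
+let init_order roots =
+  let rec visit visited dev =
+    if List.mem dev visited then visited
+    else dev :: List.fold_left visit visited (deps dev)
+  in
+  List.rev (List.fold_left visit [] roots)
+
+(* Print the devices in the order they would be connected. *)
+let () = List.iter print_endline (init_order [ "stack" ])
+```
+
+With this table, the clock and the random number generator come before the
+TCP device, which in turn comes before the stack that uses it.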
+
+The solution we [discussed](https://lists.xenproject.org/archives/html/mirageos-devel/2016-09/msg00050.html)
+and [implemented](https://github.com/mirage/mirage/pull/602) (also in [functoria](https://github.com/mirage/functoria/pull/71)) was to always fail hard (i.e. crash) in `connect : unit -> t`.  This led to a series of patches for all implementors of `connect`,
+many of which removed control flow complexity (and resulted in less complex test
+cases, see e.g.
+[mirage-net-unix](https://github.com/mirage/mirage-net-unix/pull/27/files), or
+[tcpip](https://github.com/mirage/mirage-tcpip/pull/251/files)).  Lots of common
+boilerplate (like `or_error`, which throws an exception if `connect` errored)
+could be removed.
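+
+The difference between the two `connect` styles can be sketched as follows.
+This is a simplified illustration under stated assumptions: the device type
+`t`, the `present` flag, and both `connect_*` functions are hypothetical, the
+error is a plain string, and the `io` (Lwt) wrapper is left out.
+
+```ocaml
+(* Hypothetical device used only for illustration; real Mirage devices
+   return values in an [io] monad, which is omitted here. *)
+type t = { name : string }
+
+(* Mirage2 style: connect returns a polymorphic variant ... *)
+let connect_v2 present : [ `Ok of t | `Error of string ] =
+  if present then `Ok { name = "net0" } else `Error "device not found"
+
+(* ... and a helper in the spirit of [or_error] turns errors into
+   exceptions. *)
+let or_error what = function
+  | `Ok x -> x
+  | `Error e -> failwith (what ^ ": " ^ e)
+
+(* Mirage3 style: connect fails hard, so no helper is needed. *)
+let connect_v3 present : t =
+  if present then { name = "net0" } else failwith "net0: device not found"
+
+let () =
+  let old_style = or_error "net0" (connect_v2 true) in
+  let new_style = connect_v3 true in
+  print_endline (old_style.name ^ " " ^ new_style.name)
+```
+
+In the second style the caller receives the device directly, so the generated
+startup code needs no error branch at all.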
+
+Comparing the generated `main.ml` between Mirage 2.9.1 and 3.0.0 for various
+unikernels on both Unix and Xen shows code reductions of up to 45% ([diffs are
+here](http://www.cl.cam.ac.uk/~hm519/mirage-2.9.1-3.0.0-diffs/)):
+
+- console (device-usage) xen: +35 -41 (now 81) unix: +32 -39 (now 80)
+- block (device-usage) xen: +36 -45 (now 87) unix: +34 -44 (now 86)
+- kv_ro (device-usage) xen: +34 -59 (now 75) unix: +39 -51 (now 86)
+- network (device-usage) xen: +82 -134 (now 178) unix: +79 -133 (now 177)
+- conduit_server (device-usage) xen: +86 -152 (now 200) unix: +84 -213 (now 199)
+- dhcp (applications) xen: +44 -51 (now 93) unix: +41 -49 (now 92)
+- dns (applications) xen: +86 -143 (now 190) unix: +83 -141 (now 189)
+- static_website_tls (applications) xen: +97 -176 (now 230) unix: +108 -168 (now 237)
+- nqsb.io xen: +122 -171 (now 223) unix: +65 -85 (now 133)
+- btc-pinata xen: +119 -155 (now 217) unix: +64 -73 (now 127)
+- canopy xen: +106 -180 (now 245) unix: +61 -106 (now 159)
+
+## Workflow, phase separation, versioned opam dependencies
+
+The workflow to build a unikernel used to be `mirage configure` followed by
+`make`.  During the configure phase, a `Makefile` was generated with the right
+build and link commands (depending on configuration target and other
+parameters).  Mirage2 installed opam packages and system packages as a side
+effect during configuration.  This led to several headaches: you needed to have the
+target-specific libraries installed while you were configuring (you couldn't
+even test the configuration for Xen if you didn't have Xen headers and support
+libraries installed).  Reconfiguration spawned yet another `opam` process (which,
+even if it does not install anything because everything required is already
+installed, takes some time since the solver has to evaluate the universe) -
+unless the `--no-opam` option was passed to `mirage configure`.
+
+A second issue with the Mirage2 approach was that dependent packages were listed
+in the unikernel `config.ml`, and passed as strings to opam.  When version
+constraints were included, this caused either the shell (calling out to `opam`) or make
+(embedding the packages in the Makefile) or both to choke.  Being able to
+express version constraints for dependencies in `config.ml` was one of the most
+wanted features for Mirage3.  It is crucial for further development (to continue
+allowing API breakage and removing legacy): a unikernel author, and the mirage
+tool, can now attach versioned dependencies to device interfaces.  Instead of a
+garbled error message from mirage trying to compile a unikernel where the
+libraries don't fit the generated code, opam will report which updates are
+necessary.
+
+In a [first rampage](https://github.com/mirage/mirage/pull/691) ([functoria](https://github.com/mirage/functoria/pull/82)) instead of
+manual executions of `opam` processes, an opam package file was generated by
+mirage at configuration time for the given target.  This made it possible to express
+version constraints in each `config.ml` file (via the `package` function).  This
+change also separated the configuration phase, the dependency installation
+phase, and the build phase - which included delayed invocations of `pkg-config`
+to pass parameters to `ld`.  A mess, especially if your goal is to generate
+Makefiles which run both on GNU make and BSD make.
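+
+The idea of turning versioned dependencies into opam metadata can be sketched
+with a stand-alone model.  This is not the functoria API: the record type, the
+`?min`/`?max` arguments of this `package` function, `to_opam`, and the version
+numbers are assumptions made for illustration.
+
+```ocaml
+(* Hypothetical model of a versioned dependency; names and fields are
+   illustrative only. *)
+type package = { name : string; min : string option; max : string option }
+
+let package ?min ?max name = { name; min; max }
+
+(* Render one entry of an opam [depends] field, e.g.
+   "tcpip" {>= "3.0.0"}.  Bounds that are absent are left out. *)
+let to_opam p =
+  let bounds =
+    List.filter_map
+      (fun x -> x)
+      [ Option.map (Printf.sprintf ">= %S") p.min;
+        Option.map (Printf.sprintf "< %S") p.max ]
+  in
+  match bounds with
+  | [] -> Printf.sprintf "%S" p.name
+  | bs -> Printf.sprintf "%S {%s}" p.name (String.concat " & " bs)
+
+let () =
+  List.iter
+    (fun p -> print_endline (to_opam p))
+    [ package "logs"; package ~min:"3.0.0" "tcpip" ]
+```
+
+Because the constraint ends up in an opam file rather than on a shell or make
+command line, characters such as `>` and `"` no longer need escaping.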
+
+A [second approach](https://github.com/mirage/mirage/pull/703) ([functoria](https://github.com/mirage/functoria/pull/84)) dug a bit
+deeper down the rabbit hole, and removed complex selection and adjustment of
+strings to output the Makefile, by implementing this logic in OCaml (and calling
+out to `ocamlbuild` and `ld`).  With an unneeded layer of code generation
+removed, the result is easier to read and understand, has less code, and gives
+stronger guarantees.
+More potential errors are caught at compile time, instead of generating
+(possibly ill-formed) Makefiles.  [Bos](http://erratique.ch/software/bos) is a
+concise library for interacting with basic operating system services; it solves,
+once and for all, common issues in that area, such as proper escaping of
+arguments.
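+
+Why argument escaping matters can be shown with a small sketch.  It uses
+`Filename.quote` from the OCaml standard library rather than Bos, and the
+`naive` and `quoted` helpers and the file name are hypothetical.
+
+```ocaml
+(* Naive string concatenation: a shell would split an argument that
+   contains a space into two words. *)
+let naive args = String.concat " " args
+
+(* Quoting each argument individually keeps it intact when the resulting
+   string is passed to a shell. *)
+let quoted args = String.concat " " (List.map Filename.quote args)
+
+let () =
+  let args = [ "ld"; "-o"; "my unikernel.o" ] in
+  print_endline (naive args);
+  print_endline (quoted args)
+```
+
+A library that represents a command as a list of arguments, instead of one
+string, removes the need for every caller to remember this step.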
+
+Instead of a single `configure_makefile` function which generated the entire
+makefile, Mirage3 separates the build and link logic into functions.  Only a
+simplistic makefile is generated, which invokes `mirage build` to build the
+unikernel and expects all dependent libraries to be
+installed (e.g. using `make depend`, which invokes `opam`) -- no need for
+delaying `pkg-config` calls anymore.
+
+This solution certainly has less complex string concatenation, and mirage now
+has a clearer phase distinction - configure, depend, compile & link.  (This
+workflow (still) [lacks a provisioning
+step](https://github.com/mirage/mirage/issues/694) (e.g. private key material,
+if provided as static binary blob, needs to be present during compilation at the moment),
+but one can easily be added later.)  There are drawbacks: the mirage utility is now
+needed during compilation and linking, and needs to preserve command line
+arguments between configuration and build phase.  Maybe the build step should be
+in the opam file, then we would need to ensure unique opam package names and we
+would need to communicate to the user where the binary got built and installed.
+
+## Other functionality removed or replaced
+
+The first commit to mirage is from 2004; back then opam was an infant.  Mirage2
+ensured that a [not-too-ancient version of
+OCaml](https://github.com/mirage/mirage/blob/v2.9.1/lib/mirage.ml#L1462-L1487)
+is installed ([functoria contained a similar piece of
+code](https://github.com/mirage/functoria/blob/1.1.1/lib/functoria_misc.ml#L298-L309)).
+Mirage3 relies on opam to require a certain OCaml version (at the moment 4.03).
+
+Mirage and functoria were developed while support libraries were not yet
+available - worth mentioning [bos](http://erratique.ch/software/bos) (mentioned
+above), [fpath](http://erratique.ch/software/fpath),
+[logs](http://erratique.ch/software/logs), and
+[astring](http://erratique.ch/software/astring).  Parts of those libraries were
+embedded in functoria, and are now replaced by the libraries. (See
+[mirage#703](https://github.com/mirage/mirage/pull/703) and
+[functoria#84](https://github.com/mirage/functoria/pull/84) in case you want to
+know the details.)
+
+Functoria support for OCaml `<4.02` has been
+[dropped](https://github.com/mirage/functoria/pull/75), and
+[astring](https://github.com/mirage/functoria/pull/77) is now in use.
+Support for OCaml `<4.01` has been
+[dropped](https://github.com/mirage/mirage/blob/v2.9.1/lib/mirage.ml#L1318-L1355)
+from Mirage.
+
+Some C bits and pieces, namely `str`, `bignum`, and `libgcc.a`, are no longer linked into,
+and thus no longer part of, every unikernel.  This is documented in
+[mirage#544](https://github.com/mirage/mirage/pull/544) and
+[mirage#663](https://github.com/mirage/mirage/issues/663).
+
+## Conclusion
+
+The overall statistics of Mirage3 look promising: more libraries, more
+contributors, less code, uniform error treatment, unified logging support.  Individual unikernels
+contain slightly less boilerplate code (as shown
+[by these unified diffs](http://www.cl.cam.ac.uk/~hm519/mirage-2.9.1-3.0.0-diffs/)).
+
+The binary sizes of the above-mentioned examples (mirage-skeleton, nqsb, Canopy,
+pinata) differ only slightly between Mirage2 and Mirage3 on both Unix and Xen
+(in the range of kilobytes).  We are working on a [performance harness](https://github.com/mirage/mirage/issues/685)
+to evaluate the effect of the
+[flambda](https://blogs.janestreet.com/flambda/) intermediate language in OCaml
+and of [dead code elimination](https://github.com/ocaml/ocaml/pull/608).  These should
+decrease the binary size and improve the performance.