Pioneer Projects

Hannes Mehnert edited this page Oct 20, 2016 · 71 revisions

NEWS: this page has moved to http://canopy.mirage.io/Projects (see there for where help is needed, or subscribe to the canopy atom feed).

This is an area to gather suggestions for projects involving MirageOS and OCaml, suitable for different skill levels. The intent is that each project has well-defined boundaries and a mentor, and is something someone could contribute to without getting overwhelmed by dependencies or wider systems.

Drop a line to mirageos-devel@lists.xenproject.org if you're interested in starting on one of these.

Web stack testing

MirageOS has an emerging web toolstack that's broken up as a series of libraries -- for example, Cohttp, Uri, Cow, Ipaddr, RSS and Cowabloga. This project will get you familiar with them by building a protocol testing framework that can generate traffic using off-the-shelf tools such as httperf, and evaluate the results vs applications such as Apache or Nginx. Outcomes would be (1) a test harness for HTTP and (2) some results of the evaluation using the test harness.

Mentor: Anil Madhavapeddy

Difficulty: ★★★☆☆

Fuzz testing Xen with Mirage

We would like to use the Mirage/Xen libraries to fuzz test all levels of a typical cloud toolstack. Mirage has low-level bindings for Xen hypercalls, mid-level bindings for domain management, and high-level bindings to XCP for cluster management. This project would build a QuickCheck-style fuzzing mechanism that would perform millions of random operations against a real cluster, and identify bugs with useful backtraces. The first task would be to become familiar with a specification-based testing tool like Kaputt (see http://kaputt.x9c.fr/). The second task would be to choose an interface for testing; perhaps one of the hypercall ones. Outcomes would be (1) a repo containing a fuzz testing tool and (2) some unexpected behaviour with a backtrace (NB it's not required that we find a critical bug, we just need to show the approach works).
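The QuickCheck-style approach described above can be sketched in a few lines using only the OCaml standard library. This is a minimal sketch under stated assumptions, not the project's tool: it uses `Random` rather than Kaputt, and the `Ring` module is a hypothetical stand-in for a real interface such as a hypercall binding, with a bug planted deliberately. The harness generates random operation sequences, compares the system under test against a simple reference model, and shrinks any failing sequence to a smaller one.

```ocaml
type op = Push of int | Pop

let capacity = 4

(* Hypothetical system under test: a fixed-capacity ring buffer. *)
module Ring = struct
  type t = { data : int array; mutable head : int; mutable len : int }

  let create () = { data = Array.make capacity 0; head = 0; len = 0 }

  let push t x =
    if t.len < capacity then begin
      t.data.((t.head + t.len) mod capacity) <- x;
      t.len <- t.len + 1
    end

  let pop t =
    if t.len = 0 then None
    else begin
      let x = t.data.(t.head) in
      t.head <- t.head + 1; (* deliberate bug: missing [mod capacity] *)
      t.len <- t.len - 1;
      Some x
    end
end

(* Reference model: a plain list, oldest element first. *)
let model_step q = function
  | Push x -> ((if List.length q < capacity then q @ [ x ] else q), None)
  | Pop -> (match q with [] -> ([], None) | x :: tl -> (tl, Some x))

let sut_step r = function
  | Push x -> Ring.push r x; None
  | Pop -> Ring.pop r

(* True if the model and the system under test agree on every step.
   An exception raised by the system under test counts as a failure. *)
let agrees ops =
  let r = Ring.create () in
  let rec go q = function
    | [] -> true
    | op :: rest ->
      let q', expected = model_step q op in
      (match sut_step r op with
       | actual -> actual = expected && go q' rest
       | exception _ -> false)
  in
  go [] ops

let gen_op rng =
  if Random.State.bool rng then Push (Random.State.int rng 100) else Pop

(* Try [trials] random sequences; return the first one that fails. *)
let find_counterexample ~seed ~trials ~max_len =
  let rng = Random.State.make [| seed |] in
  let rec loop i =
    if i >= trials then None
    else
      let n = 1 + Random.State.int rng max_len in
      let ops = List.init n (fun _ -> gen_op rng) in
      if agrees ops then loop (i + 1) else Some ops
  in
  loop 0

let rec remove_nth i = function
  | [] -> []
  | x :: tl -> if i = 0 then tl else x :: remove_nth (i - 1) tl

(* Greedy shrinking: drop single operations while the failure persists. *)
let rec shrink ops =
  let n = List.length ops in
  let rec try_drop i =
    if i >= n then ops
    else
      let candidate = remove_nth i ops in
      if not (agrees candidate) then shrink candidate else try_drop (i + 1)
  in
  try_drop 0

let string_of_op = function Push x -> Printf.sprintf "Push %d" x | Pop -> "Pop"

let () =
  match find_counterexample ~seed:42 ~trials:1000 ~max_len:40 with
  | None -> print_endline "no counterexample found"
  | Some ops ->
    let small = shrink ops in
    Printf.printf "failing sequence (%d ops, shrunk from %d): %s\n"
      (List.length small) (List.length ops)
      (String.concat "; " (List.map string_of_op small))
```

Comparing against a model keeps the properties implicit: any divergence, including an exception, is reported together with the operation sequence that triggered it.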

See also: http://wiki.xenproject.org/wiki/GSoC_2013#fuzz-testing-mirage

Mentor: Anil Madhavapeddy

Difficulty: ★★★☆☆

Documentation and Outreach

Screencasts

As we produce more libraries we also try to produce material around them to ease the process of trying things out and getting more involved. Currently we do this through blog posts, examples on mirage-skeleton and links to implementations. It would be really helpful to add screencasts to this list of resources and we've made early steps already. This is a slightly unorthodox project as it's not about code but it would have a substantial and positive impact on the project. It's likely to be an ongoing project as screencasts about anything MirageOS-related are fair game! If you need some ideas, then things that would be useful are: installing the dev environment on various architectures, walk-throughs of the existing tutorials and demos, and examples of deployment steps (and many more). I'm offering support/guidance to anyone who'd like to have a go at this.

Mentor: Amir Chaudhry

Difficulty: ★☆☆☆☆


Projects in progress

Bigarray parser generator

FastParsers is a Scala parser library which uses macros to transform easy-to-write parser combinators into efficient recursive-descent backtracking parsers. The generated parsers are about 20x faster than Scala's parser combinator library even though the interface stays about the same.

An OCaml equivalent that uses Cstruct under the hood to do zero-copy parsing would permit a big speed boost in Mirage's protocol stacks.
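For readers unfamiliar with the combinator style, here is a minimal sketch of the kind of interface such a library would expose. It is an illustration only: it uses a plain `string` plus an offset as a stand-in for a Cstruct view (so advancing the input never copies the buffer), all names are hypothetical, and it shows the naive interpreted combinators rather than the macro-generated code that gives FastParsers its speed.

```ocaml
(* Input is a shared buffer plus an offset; advancing only bumps the offset. *)
type input = { buf : string; off : int }

(* A parser either fails or returns a value and the remaining input. *)
type 'a parser = input -> ('a * input) option

let return x : 'a parser = fun i -> Some (x, i)

(* Sequencing: run [p], then feed its result to [f]. *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
 fun i -> match p i with None -> None | Some (x, i') -> f x i'

(* Backtracking choice: if [p] fails, retry [q] from the same offset. *)
let ( <|> ) (p : 'a parser) (q : 'a parser) : 'a parser =
 fun i -> match p i with Some _ as r -> r | None -> q i

let satisfy pred : char parser =
 fun i ->
  if i.off < String.length i.buf && pred i.buf.[i.off] then
    Some (i.buf.[i.off], { i with off = i.off + 1 })
  else None

let char c = satisfy (fun x -> x = c)

(* Zero or more repetitions of [p]. *)
let rec many (p : 'a parser) : 'a list parser =
 fun i ->
  match p i with
  | None -> Some ([], i)
  | Some (x, i') ->
    (match many p i' with
     | Some (xs, i'') -> Some (x :: xs, i'')
     | None -> None)

let is_digit c = c >= '0' && c <= '9'

(* One or more decimal digits, converted to an int. *)
let number : int parser =
  satisfy is_digit >>= fun d ->
  many (satisfy is_digit) >>= fun ds ->
  return
    (List.fold_left
       (fun acc c -> (acc * 10) + Char.code c - Char.code '0')
       0 (d :: ds))

(* Four dot-separated numbers; range checks are omitted for brevity. *)
let ipv4 : (int * int * int * int) parser =
  number >>= fun a -> char '.' >>= fun _ ->
  number >>= fun b -> char '.' >>= fun _ ->
  number >>= fun c -> char '.' >>= fun _ ->
  number >>= fun d -> return (a, b, c, d)

(* Run a parser and require that it consumes the whole input. *)
let parse p s =
  match p { buf = s; off = 0 } with
  | Some (x, i) when i.off = String.length s -> Some x
  | _ -> None

let () =
  match parse ipv4 "192.168.0.1" with
  | Some (a, b, c, d) -> Printf.printf "parsed %d.%d.%d.%d\n" a b c d
  | None -> print_endline "parse error"
```

Every combinator here allocates closures and options at run time; the point of the project is to keep this interface while generating direct recursive-descent code at compile time.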

Mentors: Jeremy Yallop, Anil Madhavapeddy, and Rudi Grinberg

Mentee: Runhang Li

Difficulty: ★★★★☆

Status: work in progress

Macros for OCaml

There are currently two ways of generating OCaml code from within OCaml programs: camlp4 (and its successor, ppx), which produces untyped syntax, and MetaOCaml, which produces typed code.

We have a design for an OCaml extension which combines the advantages of the two approaches. The system will allow users to write typed MetaOCaml-style code generators that both interact cleanly with language abstractions like modules and run entirely during compilation. There are various applications within Mirage and more widely, including generic programming, HTML templates, foreign function interface generation and embedded DSLs.

There's an abstract with further details about the design, which was presented at OCaml 2015.

Mentors: Jeremy Yallop and Leo White

Mentee: Olivier Nicole

Difficulty: ★★★★☆

Status: work in progress

Completed projects!

Below is a list of projects that we've marked as 'completed'. However, as these are driven by real and ongoing needs there is likely to be scope to continue the work (and we simply haven't gotten around to defining a follow-on project). If you're interested in working on the continuation of any of these, please send a note to mirageos-devel@lists.xenproject.org.

OCaml Implementation of libmacaroons

libmacaroons is an implementation of macaroons, a cryptographic bearer-token construct (like cookies) that can be attenuated by third parties. Macaroons provide a decentralized authorization framework for access control that excels at delegation. The system is based on the libsodium library, which wraps djb's NaCl.

Macaroons was initially implemented in C and projects are underway to implement it in Java, JavaScript, Go, and Python. We would like an OCaml implementation of macaroons based on ocaml-sodium to use in Mirage.

Mentor: David Sheets

Difficulty: ★★★★☆

Pull request for macaroons 0.1.0

Irmin inside the browser

Irmin is a library for creating Git-like stores. It is written in pure OCaml and it should be possible to compile it to JavaScript to run in a browser (modulo implementing in JavaScript the few missing external symbols). But someone has to try it and fix the inevitable glitches that will happen. The ultimate goal is to have a version-controlled local storage, with asynchronous synchronisation between the browser and the server via Git over websockets.

Related issue: mirage/irmin#96

Mentor: Thomas Gazagnaire

Difficulty: ★★☆☆☆

See Thomas Leonard's email to the list

Encryption layer for Irmin

Irmin is a library for creating Git-like stores. It provides a nice abstraction on top of various lower-level backends (such as the Git format) and it is (relatively) easy to add new ones. It would be nice to design a new backend to support encryption.

Related issue: mirage/irmin#96, work done

Mentor: Thomas Gazagnaire

Difficulty: ★★★☆☆

Semantics of mergeable data-structures

See online

Mentor: Thomas Gazagnaire

Difficulty: ★★★★☆

Fix warnings in Xen C code

mirage-platform contains C code that produces various warnings when compiled (make xen-build). An easy but useful way to increase our confidence in the code would be to go through these and fix them.

Done by Len Maxwell; see https://github.com/mirage/mirage-platform/pull/141

Mentor: Thomas Leonard

Difficulty: ★☆☆☆☆

DHCP Server

DHCP is a common protocol for automatically discovering and managing network settings. Mirage already includes a minimal DHCP client in the mirage-tcpip repository (for configuring network settings on unikernels on networks that have a working DHCP server), but currently there is no implementation which allows Mirage to serve and manage DHCP leases for other hosts on a network. Even a minimal IPv4 implementation would be helpful for demonstration purposes.

Done by Christiano Haesbaert, https://github.com/haesbaert/charrua-core and related; and by Alistair Fisher, https://github.com/alistairfisher/irmin-dhcp.

Mentor: Richard Mortier

Difficulty: ★★☆☆☆

Deflate (zlib) in pure OCaml

Multiple pure-OCaml implementations of the zlib inflate algorithm already exist: for instance, in extlib or here. A pure Haskell implementation of both inflate and deflate also exists and is available here. What is missing is a non-blocking, streaming implementation of both inflate and deflate in pure OCaml, in a style similar to jsonm and ocaml-imap. This is an independent project which will be useful on its own -- it will also make it possible to use lzip in the browser with js_of_ocaml.
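To illustrate what "non-blocking, streaming" means in practice, here is a minimal sketch of a decoder that never reads input itself: it returns `` `Await `` when it runs out of bytes and the caller supplies the next chunk. The shape loosely follows the manual-source style used by jsonm, but every name below is hypothetical, and the format is a toy run-length encoding of (count, byte) pairs chosen for brevity, not inflate.

```ocaml
(* Decoder state survives across chunk boundaries. *)
type decoder = {
  mutable src : string;          (* current input chunk *)
  mutable pos : int;             (* next unread position in [src] *)
  mutable pending : int option;  (* count read, value byte not yet seen *)
  mutable closed : bool;         (* no further input will be supplied *)
}

let decoder () = { src = ""; pos = 0; pending = None; closed = false }

(* Supply the next chunk after an [`Await]; an empty chunk signals end of
   input. *)
let src d chunk =
  if chunk = "" then d.closed <- true
  else begin
    d.src <- chunk;
    d.pos <- 0
  end

(* Produce the next decoded run, or report why none is available. *)
let rec decode d =
  if d.pos < String.length d.src then begin
    let c = d.src.[d.pos] in
    d.pos <- d.pos + 1;
    match d.pending with
    | None -> d.pending <- Some (Char.code c); decode d
    | Some n -> d.pending <- None; `Run (String.make n c)
  end
  else if not d.closed then `Await
  else if d.pending = None then `End
  else `Error "truncated run"

(* Example driver: feeds a list of non-empty chunks and collects the output.
   In a unikernel the chunks would arrive from the network instead. *)
let decode_chunks chunks =
  let d = decoder () in
  let out = Buffer.create 16 in
  let rec loop chunks =
    match decode d with
    | `Run s -> Buffer.add_string out s; loop chunks
    | `Await ->
      (match chunks with
       | [] -> src d ""; loop []
       | c :: rest -> src d c; loop rest)
    | `End -> Ok (Buffer.contents out)
    | `Error e -> Error e
  in
  loop chunks

let () =
  match decode_chunks [ "\003"; "a\002"; "b" ] with
  | Ok s -> print_endline s
  | Error e -> prerr_endline e
```

Because the decoder keeps its own state and never blocks, the same code can be driven by Lwt, by a plain loop, or by js_of_ocaml callbacks.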

Mentor: Thomas Gazagnaire

Status: https://github.com/oklm-wsh/Decompress

Difficulty: ★★☆☆☆

SSL Command-line utilities

Essentially all the utilities known from openssl (such as s_client, s_server, asn1parse, dgst, enc, and verify) should be implemented using the mirleft libraries (tls, asn.1, nocrypto, x.509) as standalone Unix executables (using cmdliner). Plus points for drop-in replacements (full argument compatibility where applicable). The same applies to other, non-security libraries (e.g. Syndic). Please contact the mailing list if you'd like to work on any of these.

Mentor: Hannes Mehnert (and others)

Work which has been done: rand, certify, tlsclient, tlstunnel

Difficulty: ★☆☆☆☆

Password based key derivation

How can you encrypt private data using a password that a human can enter and remember? The answer is to derive a key from the password rather than use it directly (because passwords are biased towards printable alphanumerical ASCII characters). This also increases the computational power required to brute-force passwords, because the derivation function is computationally expensive. More details are available about PBKDF2, which is specified in RFC 2898; bcrypt and scrypt are also of interest (best would be to implement all of them :).
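The structure of PBKDF2 is small enough to sketch with the OCaml standard library alone. This is an illustration of the construction in RFC 2898, not the completed project's code, and it rests on one loud assumption: it uses HMAC-MD5 as the pseudorandom function purely because `Digest` (MD5) is the only hash in the standard library. A real implementation would use HMAC-SHA1 or HMAC-SHA256 from a cryptography library.

```ocaml
let block_size = 64 (* MD5 block size in bytes *)
let h_len = 16      (* MD5 output size in bytes *)

(* Byte-wise XOR of two strings of equal length. *)
let xor_strings a b =
  String.init (String.length a) (fun i ->
      Char.chr (Char.code a.[i] lxor Char.code b.[i]))

(* HMAC (RFC 2104) instantiated with MD5. ASSUMPTION: MD5 is used only
   because it ships with the standard library; do not use it for real keys. *)
let hmac_md5 ~key msg =
  let key = if String.length key > block_size then Digest.string key else key in
  let key = key ^ String.make (block_size - String.length key) '\000' in
  let ipad = xor_strings key (String.make block_size '\x36') in
  let opad = xor_strings key (String.make block_size '\x5c') in
  Digest.string (opad ^ Digest.string (ipad ^ msg))

(* 32-bit big-endian encoding of the block index. *)
let be32 i = String.init 4 (fun k -> Char.chr ((i lsr (8 * (3 - k))) land 0xff))

(* PBKDF2: each output block is U_1 xor U_2 xor ... xor U_c, where
   U_1 = PRF(password, salt || INT(i)) and U_j = PRF(password, U_{j-1}). *)
let pbkdf2 ~password ~salt ~iterations ~dk_len =
  if iterations < 1 || dk_len < 1 then invalid_arg "pbkdf2";
  let blocks = (dk_len + h_len - 1) / h_len in
  let block i =
    let u1 = hmac_md5 ~key:password (salt ^ be32 i) in
    let rec loop j u acc =
      if j >= iterations then acc
      else
        let u' = hmac_md5 ~key:password u in
        loop (j + 1) u' (xor_strings acc u')
    in
    loop 1 u1 u1
  in
  let dk = String.concat "" (List.init blocks (fun k -> block (k + 1))) in
  String.sub dk 0 dk_len

let () =
  let dk = pbkdf2 ~password:"correct horse" ~salt:"NaCl" ~iterations:1000 ~dk_len:32 in
  print_endline (Digest.to_hex dk)
```

The iteration count is the tunable cost: each guess an attacker makes requires the same number of HMAC evaluations as the legitimate derivation.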

Mentor: Hannes Mehnert

Work which has been done: PBKDF, SCRYPT

Difficulty: ★★☆☆☆

ANSI Terminal emulation

Currently, a Xen console is available. For interactive unikernels (for example, network configuration through a terminal shell), terminal emulation is useful. Once this is finished, a telnet server is straightforward to implement, and we can interactively explore unikernels! A recent mail thread already contains some advice.

Status: notty (a declarative terminal graphics library); telnet

Mentor: Hannes Mehnert (and others)

Difficulty: ★★☆☆☆