Encore is a small library that provides an interface to generate an Angstrom decoder and a Condorcet encoder from a single shared description. It was written specifically for ocaml-git, to ensure an isomorphism between decoding and encoding a Git object, so that a round trip keeps the same hash/identifier.
A good example can be found in the test/ directory. It provides a description of a
Git object and, from that description, builds an Angstrom decoder and a Condorcet encoder.
We then run both against the Encore Git repository itself to check integrity after
serialization and deserialization.
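The round-trip property can be illustrated without Encore itself. The following is a minimal sketch using only the OCaml standard library; the `iso` type, the `header` description and the `round_trip` helper are hypothetical names chosen for illustration and are not Encore's API. It shows one description carrying both directions, and a check that decoding then encoding gives back the original bytes.

```ocaml
(* Hypothetical sketch: this is NOT Encore's API. *)

(* One description holds both directions of the conversion. *)
type ('a, 'b) iso = { fwd : 'a -> 'b option; bwd : 'b -> 'a option }

(* A toy, Git-like header of the form "<kind> <length>", e.g. "blob 12". *)
type header = { kind : string; length : int }

let header : (string, header) iso =
  let fwd s =
    match String.split_on_char ' ' s with
    | [ kind; len ] -> (
        match int_of_string_opt len with
        (* Reject non-canonical lengths such as "012" or "0x10": they would
           not survive a round trip. *)
        | Some length when length >= 0 && string_of_int length = len ->
            Some { kind; length }
        | _ -> None)
    | _ -> None
  in
  let bwd { kind; length } =
    if String.contains kind ' ' || length < 0 then None
    else Some (kind ^ " " ^ string_of_int length)
  in
  { fwd; bwd }

(* The decoder and the encoder are both derived from the same description. *)
let decode d s = d.fwd s
let encode d v = d.bwd v

(* True when decoding and re-encoding [s] gives back exactly [s]. *)
let round_trip d s =
  match decode d s with
  | None -> false
  | Some v -> encode d v = Some s
```

Because the decoder refuses any input that the encoder could not reproduce byte for byte, a hash computed over the encoded form stays stable across a round trip.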
Encore adds a small overhead compared with a decoder and an encoder written by
hand. We include a benchmark that compares a specific version of ocaml-git
(the encore branch) with the decoder/encoder produced by Encore. You can run
this benchmark locally with
jbuilder build @runbench
but you first need to pin ocaml-git to that branch:
$ opam pin add git https://github.com/dinosaure/ocaml-git.git#encore
$ opam pin add git-http https://github.com/dinosaure/ocaml-git.git#encore
$ opam pin add git-unix https://github.com/dinosaure/ocaml-git.git#encore
Then, on my computer (ThinkPad X1 Carbon, Intel i7-7500U CPU @ 2.70 GHz - 2.90 GHz), I get this result:
┌────────┬──────────┬─────────┬──────────┬──────────┬────────────┐
│ Name │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├────────┼──────────┼─────────┼──────────┼──────────┼────────────┤
│ encore │ 37.24ms │ 3.45Mw │ 194.32kw │ 18.09kw │ 100.00% │
│ git │ 32.84ms │ 3.52Mw │ 229.67kw │ 13.92kw │ 88.16% │
└────────┴──────────┴─────────┴──────────┴──────────┴────────────┘
So we can observe a small overhead (about 13% more time per run in the table above), but the guarantees provided by Encore matter more here than a faster decoder/encoder.
Condorcet is a small encoder that takes care of memory consumption when you serialize an OCaml value according to a description. It uses a bounded bigarray and, when that buffer is full, explicitly asks the user to flush it.
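The "bounded buffer, ask the user to flush" idea can be sketched as follows. This is a hypothetical illustration, not Condorcet's API: the `state` type and the `write_string` and `drain` functions are invented here, and a `Bytes.t` from the standard library stands in for the bigarray.

```ocaml
(* Hypothetical sketch: this is NOT Condorcet's API. *)

type state =
  | Flush of string * (unit -> state)
      (* The buffer is full: the caller consumes the chunk, then resumes. *)
  | Done of string
      (* Encoding is finished: these are the remaining buffered bytes. *)

let write_string ~capacity (s : string) : state =
  if capacity <= 0 then invalid_arg "write_string: capacity must be positive";
  (* Fixed-size buffer: memory use never grows beyond [capacity] bytes. *)
  let buf = Bytes.create capacity in
  let rec go pos len =
    if pos = String.length s then Done (Bytes.sub_string buf 0 len)
    else if len = capacity then
      (* Hand control back to the caller instead of growing the buffer. *)
      Flush (Bytes.sub_string buf 0 len, fun () -> go pos 0)
    else begin
      Bytes.set buf len s.[pos];
      go (pos + 1) (len + 1)
    end
  in
  go 0 0

(* A caller that collects every flushed chunk, in order. *)
let rec drain acc = function
  | Done last -> List.rev (last :: acc)
  | Flush (chunk, k) -> drain (chunk :: acc) (k ())
```

A real caller would write each chunk to a file descriptor or a socket instead of accumulating it in a list; the point is only that the encoder never holds more than `capacity` bytes at once.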
Condorcet was built in a CPS style, like Angstrom, and uses only purely functional data structures. This is a big difference from Faraday. As a consequence, Condorcet is slower than Faraday (about 3 times), but we cannot use Faraday in this context, precisely because it relies on mutation.
In fact, when a Condorcet encoder fails, we raise an exception to short-cut to the other branch. With a mutable structure, it is hard to roll back to the previous state of the encoder and retry the other branch. With Condorcet, no trick is needed to roll back because every step produces a new pure state.
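A minimal sketch of why a pure state makes this rollback free, assuming a toy encoder whose state is an immutable list of emitted chunks. All names (`Bad`, `encoder`, `<|>`, `decimal`, `fallback`) are hypothetical and do not come from Condorcet.

```ocaml
(* Hypothetical sketch: this is NOT Condorcet's API. *)

exception Bad

(* The encoder state is an immutable list of emitted chunks,
   most recent first. *)
type state = string list
type 'a encoder = 'a -> state -> state

(* Try [p]; if it raises [Bad], retry with [q] from the SAME state [st].
   Nothing has to be undone: [st] was never modified by [p]. *)
let ( <|> ) (p : 'a encoder) (q : 'a encoder) : 'a encoder =
 fun v st -> try p v st with Bad -> q v st

(* Emits a prefix first and only then fails on a negative input, so a
   mutable buffer would be left holding a stray "dec:". *)
let decimal : int encoder =
 fun n st ->
  let st = "dec:" :: st in
  if n < 0 then raise Bad else string_of_int n :: st

(* Accepts any integer. *)
let fallback : int encoder = fun n st -> string_of_int n :: "raw:" :: st

let int_ : int encoder = decimal <|> fallback

let to_string (st : state) = String.concat "" (List.rev st)
```

For example, `to_string (int_ (-7) [ "id=" ])` evaluates to `"id=raw:-7"`: the `"dec:"` prefix emitted by the failing branch lived only in a state that was discarded when the exception was raised.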
This project is inspired by the finale project, which ultimately focuses on producing a pretty-printer. Encore is closer to a low-level encoder like Faraday than to a pretty-printer generator.