doc: more on wrapping
pveber committed Nov 14, 2017
1 parent 8a49bbc commit 12413bd
Showing 1 changed file with 66 additions and 3 deletions.
69 changes: 66 additions & 3 deletions doc/wrapping.rst
@@ -48,8 +48,10 @@ Let's describe what we wrote:

Basically defining a workflow amounts to providing a list of commands
that are expected to produce a result at the location represented by
the token ``dest``. **Note that a workflow that doesn't use**
``dest`` **is necessarily incorrect**, since it has no means to produce
its output at the expected location. The value ``touch`` we have
defined has type ``'a workflow``, and represents a recipe (admittedly a
very simple one) to
produce a result file. This type is too general and we'd have to
restrict it to prevent run-time errors, but we'll see that later. Let's
now see how to make a pipeline depend on some parameter.
@@ -121,4 +123,65 @@ argument is a workflow. If you ask OCaml, it will say that ``sort``
has type ``'a workflow -> 'b workflow``. That is, given a first
workflow, this function is able to build a new one. This new workflow
will call ``sort`` redirecting the standard output to the expected
destination and giving it ``text_file`` as an argument. More
precisely, ``bistro`` will inject the location it chose for the
output of workflow ``text_file`` into the command invoking
``sort``. By combining ``dep`` and ``dest``, you can write
entire collections of interdependent scripts without ever caring about
where the generated files are stored.
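
The substitution of ``dep`` and ``dest`` can be pictured with a small toy model. This is only a sketch and not how ``bistro`` is implemented: the types ``workflow`` and ``token``, the functions ``path_of`` and ``render``, and the ``_cache/`` prefix are all hypothetical names chosen for illustration.

```ocaml
(* Toy model, not bistro's implementation: every name here is hypothetical. *)

(* A workflow is a description plus a command made of tokens. *)
type workflow = { descr : string; command : token list }

and token =
  | S of string      (* literal text *)
  | Dest             (* where this workflow must write its result *)
  | Dep of workflow  (* where another workflow wrote its result *)

(* The engine, not the user, chooses storage locations. Here a location
   is simply derived from the description; "_cache/" is arbitrary. *)
let path_of (w : workflow) : string = "_cache/" ^ w.descr

(* Replace each token by concrete text to obtain a shell command. *)
let render (w : workflow) : string =
  let text = function
    | S s -> s
    | Dest -> path_of w
    | Dep d -> path_of d
  in
  String.concat " " (List.map text w.command)

let text_file = { descr = "text-file"; command = [ S "touch"; Dest ] }

let sorted =
  { descr = "sort"; command = [ S "sort"; Dep text_file; S ">"; Dest ] }

(* Prints: sort _cache/text-file > _cache/sort *)
let () = print_endline (render sorted)
```

In this model the script author never writes a path: changing ``path_of`` relocates every file consistently, which is the property the paragraph above describes.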


Typing workflows
================

We have seen that the ``workflow`` function from ``Bistro.EDSL`` can
be used to make new workflows that call external programs. This
function has of course no means to know what the format of the result
file or directory will be. For this reason, it outputs a value of type
``'a workflow``, which means a result whose format is compatible with
any other. This is obviously wrong in the general case, and could lead
to run-time errors by feeding a tool with inputs of an unsupported
format. In order to prevent such run-time errors, we can provide more
precise types to our functions producing workflows, when we have more
information. Let's see this with an example. FASTA files have the
property that when you concatenate several of them, the result is
still a FASTA file (this is of course not true of file formats in
general). We are now going to write a workflow that concatenates
several FASTA files, and make sure its type reflects this property.
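
The idea of a format parameter that exists only at the type level can be tried without ``bistro`` at all. The following is a minimal sketch under stated assumptions: the module ``W``, its functions ``make`` and ``descr``, and the ``bam`` tag are hypothetical stand-ins, not part of the ``bistro`` API.

```ocaml
(* Toy model of phantom typing; names are hypothetical, not bistro's API. *)

(* The parameter 'a is a phantom: it appears in the type but not in the
   stored data. Keeping the type abstract in the signature prevents the
   compiler from equating workflows that carry different format tags. *)
module W : sig
  type 'a workflow
  val make : string -> 'a workflow  (* too general, like an untyped wrapper *)
  val descr : 'a workflow -> string
end = struct
  type 'a workflow = string
  let make d = d
  let descr d = d
end

(* Format tags: abstract types with no values. *)
type fasta
type bam

(* Annotations narrow the overly general result of [W.make]. *)
let genome : fasta W.workflow = W.make "genome.fa"
let reads : bam W.workflow = W.make "reads.bam"

let fasta_concat (x : fasta W.workflow) (y : fasta W.workflow)
  : fasta W.workflow =
  W.make (W.descr x ^ "+" ^ W.descr y)

let ok = fasta_concat genome genome

(* Rejected at compile time if uncommented, because [reads] is tagged bam:
   let bad = fasta_concat genome reads *)

let () = print_endline (W.descr ok)
```

The annotated function accepts two FASTA-tagged values and rejects any other tag before anything is run, which is the kind of run-time error the more precise type is meant to rule out.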

Both ``Bistro.Std`` and ``Bistro_bioinfo.Std`` provide a few type
definitions for annotating workflows. In particular we'll use
``Bistro_bioinfo.Std.fasta`` for our example. Here's how it looks:

.. code-block:: ocaml

   open Bistro.Std
   open Bistro.EDSL
   open Bistro_bioinfo.Std

   let fasta_concat (x : fasta workflow) (y : fasta workflow) : fasta workflow =
     workflow ~descr:"fasta-concat" [
       cmd "cat" ~stdout:dest [ dep x ; dep y ] ;
     ]

Alternatively, you can define your workflow in a ``.ml`` file:

.. code-block:: ocaml

   open Bistro.EDSL

   let fasta_concat x y =
     workflow ~descr:"fasta-concat" [
       cmd "cat" ~stdout:dest [ dep x ; dep y ] ;
     ]

and constrain its type in the corresponding ``.mli`` file:

.. code-block:: ocaml

   open Bistro.Std
   open Bistro_bioinfo.Std

   val fasta_concat : fasta workflow -> fasta workflow -> fasta workflow
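
The effect of splitting the definition between an implementation and an interface can be reproduced in a single file by sealing a module with a signature. This is a sketch with toy stand-ins: the ``workflow`` and ``fasta`` types below and the module name ``Fasta_tools`` are assumptions made for the example, not the definitions from ``Bistro.Std`` or ``Bistro_bioinfo.Std``.

```ocaml
(* Toy stand-ins, not bistro's definitions. *)
type 'a workflow = Workflow of string  (* 'a is a phantom parameter *)
type fasta                             (* format tag with no values *)

module Fasta_tools : sig
  (* Plays the role of the .mli file: the only type clients can see. *)
  val fasta_concat : fasta workflow -> fasta workflow -> fasta workflow
end = struct
  (* Plays the role of the .ml file. Without the signature, OCaml would
     infer the more general 'a workflow -> 'b workflow -> 'c workflow. *)
  let fasta_concat (Workflow x) (Workflow y) =
    Workflow ("cat " ^ x ^ " " ^ y)
end

let a : fasta workflow = Workflow "a.fa"
let b : fasta workflow = Workflow "b.fa"

let (Workflow cmd) = Fasta_tools.fasta_concat a b

(* Prints: cat a.fa b.fa *)
let () = print_endline cmd
```

Keeping the restriction in the interface leaves the implementation free of annotations while callers still only see the precise type.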
