Skip to content

Commit

Permalink
improved docs
Browse files Browse the repository at this point in the history
  • Loading branch information
Mark Fiers committed May 30, 2012
1 parent bde341f commit 2ddc0bb
Show file tree
Hide file tree
Showing 3 changed files with 77 additions and 0 deletions.
1 change: 1 addition & 0 deletions sphinx/api/plugin/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.rst
75 changes: 75 additions & 0 deletions sphinx/sync.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
Synchronizing jobs
==================

It is quite often usefull to repeat a jobs on a number of different
input files. For simple operations, one liners, this ca be
accomplished using `moa map`. More complex operations, or those
requiring a template other than `map` can be replicated using job
syncronization. Assume you have a set of fastq libraries, each in it's
own directory::

./fq/set1/set1_1.fq
./fq/set1/set1_2.fq
./fq/set2/set2_1.fq
./fq/set2/set2_2.fq
./fq/set3/set3_1.fq
./fq/set3/set3_2.fq

And you want to run a bowtie alignment for each separately. The
approach to take is to create a directory containing all alignments::

mkdir bowtie
cd bowtie

and, in that directory, create one job running bowtie, in a directory
named **exactly** as the input directories::

mkdir set1
cd set1
moa new bowtie -t 'run bowtie for {{_}}'

Note the magic variable `{{_}}`. This variable is replaced by the name
of the current directory. So when running `moa show`, the title would
show up as "run bowtie for set1". This magic variable can be used in
all variables, and we'll use it here to set this job up in such a way
that it can be reused for the other datasets::

set moa fq_forward_input='../../fq/{{_}}/*_1.fq'
# .. configure the remaining variables

Now - we replicate this directory in the following manner. We'll move
one directory up, to the `bowtie` directory, and create a `sync` job::
cd ..
moa new sync -t 'run bowtie for all fq datasets'
moa set source=../fq/

The sync template keeps directories synchronized, based on the
`source` directory. If you now run `moa run` in the `bowtie`
directory, two more directories will be created: `set2` and `set3`,
each containing a verbatim copy of the original bowtie job created.

If, at a certain moment you obtain more fastq datasets::

./fq/set4/set4_1.fq
./fq/set4/set4_2.fq

you can repeat `moa run` in the `./bowtie` sync directory, and a new
directory will be created. Note that the `sync` template will not
remove directories. Also if you want to update the configuration of
the syncronized bowtie jobs, you only need to change the configuration
in one directory, run `moa run` again in the `./bowtie` directory and
the configuration is synchronized across all jobs.












1 change: 1 addition & 0 deletions sphinx/templates/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.rst

0 comments on commit 2ddc0bb

Please sign in to comment.