Merge branch 'moa.0.11' of ssh.github.com:mfiers/Moa into moa.0.11

mfiers · Nov 9, 2012 · 87fc517
2 parents 1684295 + 828441b
commit 87fc517
Showing 5 changed files with 99 additions and 11 deletions.
diff --git a/sphinx/commands/index.rst b/sphinx/commands/index.rst
@@ -465,14 +465,16 @@ Create a new job.
 
 This command creates a new job with the specified template in the
 current directory. If the directory already contains a job it
-needs to be forced using '-f'. It is possible to define arguments
-for the job on the commandline using KEY=VALUE after the
-template. Note: do not use spaces around the '=' sign. Use quotes
+needs to be forced using '-f'. It is possible to redefine arguments
+for the job on the command line using KEY=VALUE pairs after a
+template has been created. 
+
+Note: Do not use spaces around the '=' sign. Use quotes
 if you need spaces in variables (KEY='two values')
 
 positional arguments:
   template              name of the template to use for this moa job 
-  parameter             arguments for this job, specifyas KEY=VALUE without spaces (default: None)
+  parameter             arguments for this job, specified as KEY=VALUE without spaces (default: None)
 
 optional arguments:
   -h, --help            show this help message and exit

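A note on the KEY=VALUE rule in the hunk above: it follows from ordinary shell word splitting, which happens before any program sees its arguments. The sketch below is only an illustration and does not call Moa. `show_args` is a hypothetical helper, not a Moa command, and `title` is just an example key.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Moa): prints each argument it receives
# on its own line, so the shell's word splitting becomes visible.
show_args() {
    local arg
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# No spaces around '=': the shell passes a single KEY=VALUE word.
show_args title=demo

# Spaces around '=': the shell passes three separate words.
show_args title = demo

# Quoted value containing a space: still a single word.
show_args title='two values'
```

Only the first and third forms reach a program as one KEY=VALUE argument, which is presumably why the help text asks for no spaces around '=' and for quotes around values that contain spaces.
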
diff --git a/sphinx/configuration.rst b/sphinx/configuration.rst
@@ -1,7 +1,7 @@
 Configuring Moa
 ===============
 
-Moa is configured using the command line tool `Moa`. For example, you
+Moa is configured using the command line tool `moa`. For example, you
 are creating a simple job somewhere::
 
     $ moa simple -t 'test job' -- echo "Hello"

diff --git a/sphinx/examples/blastExample b/sphinx/examples/blastExample
@@ -0,0 +1,88 @@
+#!/usr/bin/env bash
+
+## Mark: The input data for this script needs to be modified to conform to your original example
+## which included a bash post command doing 'grep dicer ...'
+
+mkdir test.project && cd test.project
+mkdir 00.blastdb && cd 00.blastdb
+
+## copy or create symbolic links to protein sequences in 00.blastdb (an example file is created here)
+cat - > ex.fasta << 'EOF'
+>gi|6850311|gb|AAF29388.1|AC009999_8 Contains similarity to a vacuolar sorting receptor homolog from Arabidopsis thaliana gb|U79959 [Arabidopsis thaliana]
+MSLPPFTCRLLAAAAALYLIGLLCVGADTKDVTAPKIPGCSNEFQMVKVENWVNGENGETFTAMTAQFGT
+MLPSDKDKAVKLPVALTTPLDSCSNLTSKLSWSIALSVRGECAFTVKAQVAQAGGAAALVLINDKEELDE
+MVCGEKDTSLNVSIPILMITTSSGDALKKSIMQNKKVELLLYAPKSPIVDDSCSNLSVGTVFVASVW
+SHVTSPKKNDEQYDELSPKKSSNVDATKGGAEEETLDISAMGAVIFVISASTFLVLLFFFMSSWFILILT
+IFFVIGGMQGMHNINVTLITRRCSKCGQKNLKLPLLGNTSILSLVVLLFCFVVAILWFMNRKTSHAWAGQ
+DIFGICMMINVLQVARLPNIRVATILLCCAFFYDIFWVFISPLIFKQSVMIAVARGSKDTGESIPMLLRI
+PRLSDPWGGYNMIGFGDILFPGLLICFIFRFDKENNKGVSNGYFPWLMFGYGLGLFLTYLGLYVMNGHGQ
+PALLYLVPCTLGITVILGLVRKELRDLWNYGTQQPSAADVNPSPEA
+
+>gi|4850398|gb|AAD31068.1|AC007357_17 Strong similarity to gi|3313615 F21J9.9 from Arabidopsis thaliana and is a member of the PF|00067 Cytochrome P450 family [Arabidopsis thaliana]
+MSEISSSMPLTERVYNHLCLSDVSLALLGLFVFCCVREKVTKKLGPTIWPVFGITPEFFFHRNDVYGWAT
+RCLKKCRGTFLYNGIWLGGSYGAVTCVPANVEYMLKTNFKNFPKGAFFKERFNDLLEDGIFNADAESWKE
+QRRIIITEMHSTRFVEHSFQTTQDLVRKKLLKVMESFARSQEAFDLQDVLLRLTFDNICIAGLGDDPGTL
+DSDLPLVPFAQAFEEATESTMFRFMIPPFIWKPLKFFDIGYEKGLRKAVDVSMSLSTRWLWIVSASSKKK
+EQSHKTTDEKDPSTIKFFRQFCTSFILAGRDTSSVALTWFFWVIQKHPEVENKIIREISEILRQRGDSPT
+SKNESLFTVKELNDMVYLQAALSETMRLYPPIPMEMKQAIEDDVFPDGTFIRKGSRVYFATYAMGRMESI
+WGKDCESFKPERWIQSGNFVNDDQFKYVVFNAGPRLCLGKTFAYLQMKTIAASVLSRYSIKVAKDHVVVP
+RVTTTLYMRHGLKVTISSKSLEEKIHVQD
+
+>gi|4850394|gb|AAD31064.1|AC007357_13 Identical to gb|X97864 cytochrome P450 from Arabidopsis thaliana and is a member of the PF|00067 Cytochrome P450 family. ESTs gb|T44875, gb|T04814, gb|R65111, gb|T44310 and gb|T04541 come from this gene [Arabidopsis thaliana]
+MSILLCFLCLLPVFLVSLSILSKRLKPSKWKLPPGPKTLPIIGNLHNLTGLPHTCFRNLSQKFGPVMLLH
+FGFVPVVVISSKEGAEEALKTQDLECCSRPETVATRMISYNFKDIGFAPYGEEWKALRKLVVMELLNTKK
+FQSFRYIREEENDLLIKKLTESALKKSPVNLKKTLFTLVASIVCRLAFGVNIHKCEFVDEDNVADLVNKF
+EMLVAGVAFTDFFPGVGWLVDRISGQNKTLNNVFSELDTFFQNVLDDHIKPGRQVSENPDVVDVMLDLMK
+KQEKDGESFKLTTDHLKGIISDIFLAGVNTSAVTLNWAMAELIRNPRVMKKVQDEIRTTLGDKKQRITEQ
+DLSQVHYFKLVVKEIFRLHPAAPLLLPRETMSHVKIQGYDIPVKTQMMINIYSIARDPKLWTNPDEFNPD
+RFLDSSIDYRGLNFELLPFGSGRRICPGMTLGITTVELGLLNLLYFFDWVVPVGKNVKDINLEETGSIII
+SKKTTLELVPLVHH
+EOF
+
+## Create blast db 
+formatdb -i ex.fasta -p T 
+
+mkdir ../10.fasta && cd ../10.fasta
+
+## Generate query files
+cat - > query1.fasta << 'EOF'
+>query1
+LLAAAAALYLIGLLCVGADKDVTAPKIPGCSNEFQMVKVEWVNGENGETFTAMTAQFGT
+MLPSDKDKAVKLPVALTTLDSCSNLTSKLSWSIALSVRGECAFTVKAQVAQAGG
+EOF
+
+cat - > query2.fasta << 'EOF'
+>query2
+KCRGTFLYNGIWLGGSYGAVTCVPANVEYMLTSSVALTWFFWVQKHPPVEVENKIIREISEILRQRGDS
+EOF
+
+## Using formatdb from older blast version
+## makeblastdb -in refseq_protein -input_type blastdb -dbtype prot
+
+mkdir ../20.blast && cd ../20.blast
+
+## Create new moa blast template
+moa new blast -t "demo run"
+
+## Define blast database location
+moa set db=../00.blastdb/ex.fasta
+
+## Define input sequence(s) location
+moa set 'input=../10.fasta/*.fasta'
+
+## Define blast application to use
+moa set program=blastp
+
+
+## List template variables
+moa show
+
+## Execute the moa job 
+moa run
+
+## Manually check output descriptions for the pattern 'dicer'
+grep -i dicer blast_report
+
+## Set a moa post command (Check: can this be configured without STDIN intervention?)
+## moa set postcommand
+## > grep -i dicer blast_report > dicer.out
diff --git a/sphinx/filesets.rst b/sphinx/filesets.rst
@@ -1,5 +1,5 @@
 Filesets
 ========
 
-Filesets are an important part of Moa - they are used to define in-
-and output files of Moa jobs
+Filesets are an important part of Moa - they are used to define input
+and output files for Moa jobs.
diff --git a/sphinx/intro.rst b/sphinx/intro.rst
@@ -27,10 +27,8 @@ The best way to understand how Moa can help you to achieve this is by an example
     $ mkdir test.project && cd test.project
     $ mkdir 00.proteins
     
-    ( copy or link some protein sequences into 00.proteins )
-   
-    $ mkdir 10.blast
-    $ cd 10.blast
+    ## copy or create symbolic links to some protein sequences in 00.proteins
+    $ mkdir 10.blast && cd 10.blast
 
 An important feature of Moa is that each separate analysis step is contained within a separate directory. Two Moa jobs never share a directory. This forces a Moa user to break a workflow down to atomic parts, which is typically beneficial to the organization and coherence of a workflow. The order of steps is easily ordered by prefixing directory names with a number. Note that these prefixes are not enforced by Moa; any alphabetical organization would work as well. Once a directory is created, a Moa job can be created::