update snakemake instructions

GavinHaLab · Aug 7, 2018 · 316292a
1 parent 3d557f8
commit 316292a
Showing 1 changed file with 32 additions and 13 deletions.
diff --git a/README.md b/README.md
@@ -51,27 +51,46 @@ pairings:
 
 
 ## snakefiles
-1. `moleculeCoverage.snakefile`
-2. `getPhasedAlleleCounts.snakefile`
-3. `TitanCNA.snakefile`
+1. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
+2. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile)
+3. [TitanCNA.snakefile](TitanCNA.snakefile)
 
-Invoking the full snakemake workflow for TITAN
+# Run the analysis
+
+## 1. Invoking the full snakemake workflow for TITAN
+This will also run both [moleculeCoverage.snakefile](moleculeCoverage.snakefile) and [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile) which generate the necessary inputs for [TitanCNA.snakefile](TitanCNA.snakefile).
 ```
 # show commands and workflow
 snakemake -s TitanCNA.snakefile -np
 # run the workflow locally using 5 cores
 snakemake -s TitanCNA.snakefile --cores 5
-# run the workflow on qsub using a maximum of 50 jobs. Broad UGER cluster parameters can be set directly in config/cluster.sh. 
-snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime}" -j 50 --jobscript config/cluster.sh
 ```
-This will also run both `moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile` which generate the necessary inputs for `TitanCNA.snakfile`.
-
-`moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile` can also be invoked separately. If only one but not both results are needed, then you can invoke the snakefiles independently.
+Users can launch the jobs on a cluster.  
+An implementation that works with Broad UGER (qsub) is provided. Parameters for memory, runtime, and parallel environment can be specified directly in the snakemake files; default values for each rule have already been set in `params` within the [config.yaml](config/config.yaml), so the command below can be used as-is. Other cluster parameters can be set directly in [cluster.sh](config/cluster.sh).  
+Note: users will need to adjust these parameters to match their cluster-specific settings.
+```
+snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
 ```
-snakemake -s moleculeCoverage.snakefile --cores 5
-# OR
-snakemake -s getPhasedAlleleCounts.snakefile --cores 5
-``` 
+
+
+## 2. Invoking individual steps in the workflow
+Users can run the snakemake files individually. This can be helpful for testing each step or if you only wish to generate results for a particular step. The snakefiles need to be run in this same order since input files are generated by the previous steps.
+  # a. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
+  This part of the workflow
+  ```
+  snakemake -s moleculeCoverage.snakefile -np
+  snakemake -s moleculeCoverage.snakefile --cores 5
+  # OR
+  snakemake -s moleculeCoverage.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
+  ```
+
+  # b. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile) 
+  ```
+  snakemake -s getPhasedAlleleCounts.snakefile -np
+  snakemake -s getPhasedAlleleCounts.snakefile --cores 5
+  # OR
+  snakemake -s getPhasedAlleleCounts.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
+  ```
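
The updated README says the individual snakefiles must be run in order because each step generates inputs for the next. As an illustration only (not part of this commit), here is a minimal shell sketch that does a dry run and then a local run for each step, stopping at the first failure. The `SNAKEMAKE` and `CORES` variables and the function names `run_step` and `run_all_steps` are hypothetical names chosen for this example. The snakemake flags are the ones already shown in the diff.

```shell
# Sketch: run the two prerequisite snakefiles in the documented order.
# SNAKEMAKE and CORES are hypothetical knobs for this example, not repo settings.
SNAKEMAKE="${SNAKEMAKE:-snakemake}"
CORES="${CORES:-5}"

run_step() {
  # $1 is the snakefile to run
  # -np: show commands and workflow without executing anything
  "$SNAKEMAKE" -s "$1" -np || return 1
  # run the step locally on the requested number of cores
  "$SNAKEMAKE" -s "$1" --cores "$CORES" || return 1
}

run_all_steps() {
  # Order matters: later steps read files produced by earlier ones.
  for snakefile in moleculeCoverage.snakefile getPhasedAlleleCounts.snakefile; do
    run_step "$snakefile" || { echo "step failed: $snakefile" >&2; return 1; }
  done
}
```

After sourcing this file, calling `run_all_steps` would run both steps in sequence. To submit to a cluster instead, the `--cores` invocation could be swapped for the `--cluster-sync` command shown in the README, adjusted to the local scheduler.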