From 316292af9257cb043b377553bc2436530136eb76 Mon Sep 17 00:00:00 2001
From: Gavin Ha <gavinha@broadinstitute.org>
Date: Tue, 7 Aug 2018 13:05:44 -0400
Subject: [PATCH] update snakemake instructions

---
 README.md | 45 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 86a1882..31a00ac 100644
--- a/README.md
+++ b/README.md
@@ -51,27 +51,46 @@ pairings:
 
 
 ## snakefiles
-1. `moleculeCoverage.snakefile`
-2. `getPhasedAlleleCounts.snakefile`
-3. `TitanCNA.snakefile`
+1. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
+2. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile)
+3. [TitanCNA.snakefile](TitanCNA.snakefile)
 
-Invoking the full snakemake workflow for TITAN
+# Run the analysis
+
+## 1. Invoking the full snakemake workflow for TITAN
+This will also run both [moleculeCoverage.snakefile](moleculeCoverage.snakefile) and [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile), which generate the necessary inputs for [TitanCNA.snakefile](TitanCNA.snakefile).
 ```
 # show commands and workflow
 snakemake -s TitanCNA.snakefile -np
 # run the workflow locally using 5 cores
 snakemake -s TitanCNA.snakefile --cores 5
-# run the workflow on qsub using a maximum of 50 jobs. Broad UGER cluster parameters can be set directly in config/cluster.sh. 
-snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime}" -j 50 --jobscript config/cluster.sh
 ```
-This will also run both `moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile` which generate the necessary inputs for `TitanCNA.snakfile`.
-
-`moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile` can also be invoked separately. If only one but not both results are needed, then you can invoke the snakefiles independently.
+Users can also launch the jobs on a cluster.  
+An implementation that works with Broad UGER (qsub) is provided. Parameters for memory, runtime, and parallel environment can be specified directly in the snakemake files; default values for each rule have already been set in `params` within the [config.yaml](config/config.yaml), so the command below can be used as-is. Other cluster parameters can be set directly in [cluster.sh](config/cluster.sh).  
+Note: users on other clusters will need to adjust these parameters to match their cluster-specific settings.
+```
+snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
 ```
-snakemake -s moleculeCoverage.snakefile --cores 5
-# OR
-snakemake -s getPhasedAlleleCounts.snakefile --cores 5
-``` 
+
+
+## 2. Invoking individual steps in the workflow
+Users can run the snakemake files individually. This can be helpful for testing each step or for generating results for only a particular step. The snakefiles need to be run in the order listed under `snakefiles` above, since each step's input files are generated by the previous steps.
+  ### a. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
+  This part of the workflow generates inputs needed by [TitanCNA.snakefile](TitanCNA.snakefile).
+  ```
+  snakemake -s moleculeCoverage.snakefile -np
+  snakemake -s moleculeCoverage.snakefile --cores 5
+  # OR
+  snakemake -s moleculeCoverage.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
+  ```
+  
+  ### b. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile)
+  ```
+  snakemake -s getPhasedAlleleCounts.snakefile -np
+  snakemake -s getPhasedAlleleCounts.snakefile --cores 5
+  # OR
+  snakemake -s getPhasedAlleleCounts.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
+  ```