
update snakemake instructions

gavinha committed Aug 7, 2018
1 parent 3d557f8 commit 316292af9257cb043b377553bc2436530136eb76
Showing with 32 additions and 13 deletions.
  1. +32 −13 README.md
@@ -51,27 +51,46 @@ pairings:
## snakefiles
1. `moleculeCoverage.snakefile`
2. `getPhasedAlleleCounts.snakefile`
3. `TitanCNA.snakefile`
1. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
2. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile)
3. [TitanCNA.snakefile](TitanCNA.snakefile)
Invoking the full snakemake workflow for TITAN
# Run the analysis
## 1. Invoking the full snakemake workflow for TITAN
This will also run both [moleculeCoverage.snakefile](moleculeCoverage.snakefile) and [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile) which generate the necessary inputs for [TitanCNA.snakefile](TitanCNA.snakefile).
```
# show commands and workflow
snakemake -s TitanCNA.snakefile -np
# run the workflow locally using 5 cores
snakemake -s TitanCNA.snakefile --cores 5
# run the workflow on qsub using a maximum of 50 jobs. Broad UGER cluster parameters can be set directly in config/cluster.sh.
snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime}" -j 50 --jobscript config/cluster.sh
```
This will also run both `moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile`, which generate the necessary inputs for `TitanCNA.snakefile`.
`moleculeCoverage.snakefile` and `getPhasedAlleleCounts.snakefile` can also be invoked separately. If only one but not both results are needed, then you can invoke the snakefiles independently.
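Since each snakefile accepts the same `-np` and `--cores` options, the dry run and the local run can be chained in a small wrapper. The following is a minimal sketch, not part of the repository: the function name `run_checked` and its arguments are hypothetical, and it assumes `snakemake` is on the `PATH`.

```shell
# Hypothetical helper (not provided by the repository).
# Usage: run_checked <snakefile> [cores]
run_checked() {
    snakefile="$1"
    cores="${2:-5}"   # default of 5 cores mirrors the examples in this README

    # Dry run first: print the commands and workflow without executing anything.
    if ! snakemake -s "$snakefile" -np; then
        echo "dry run failed for $snakefile" >&2
        return 1
    fi

    # Only start the real run once the dry run has succeeded.
    snakemake -s "$snakefile" --cores "$cores"
}

# Example (commented out so that sourcing this file does not start a run):
# run_checked moleculeCoverage.snakefile 5
```

The wrapper stops before launching any job if the dry run reports a problem, such as a missing input file or a configuration error.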
Users can launch the jobs on a cluster.
An implementation that works with Broad UGER (qsub) is provided. Parameters for memory, runtime, and parallel environment can be specified directly in the snakemake files; default values for each rule have already been set in `params` within the [config.yaml](config/config.yaml), and the command below can be used as-is. Other cluster parameters can be set directly in [cluster.sh](config/cluster.sh).
Note: users will need to adjust these for use with their cluster-specific settings
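One way to keep cluster-specific settings in one place is to assemble the submission command from variables and print it for review before running it. This is only a sketch: `QSUB_OPTS`, `MAX_JOBS`, and `cluster_cmd` are hypothetical names that do not exist in the repository, and the defaults simply reproduce the UGER command shown in this README.

```shell
# Hypothetical helper (not provided by the repository).
# Default resource request; snakemake fills the {params.*} placeholders per rule.
DEFAULT_QSUB_OPTS='-l h_vmem={params.mem},h_rt={params.runtime} {params.pe}'

# Usage: cluster_cmd <snakefile>
# Prints the snakemake command line; it does not execute it.
cluster_cmd() {
    snakefile="$1"
    opts="${QSUB_OPTS:-$DEFAULT_QSUB_OPTS}"   # override for a different scheduler setup
    jobs="${MAX_JOBS:-50}"                    # maximum number of concurrent jobs
    printf '%s -s %s --cluster-sync "qsub %s" -j %s --jobscript config/cluster.sh\n' \
        snakemake "$snakefile" "$opts" "$jobs"
}

# Print the command with the default settings.
cluster_cmd TitanCNA.snakefile
```

Setting `QSUB_OPTS` or `MAX_JOBS` in the environment changes the printed command without editing the helper, for example `MAX_JOBS=10 cluster_cmd moleculeCoverage.snakefile`.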
```
snakemake -s TitanCNA.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
```
```
snakemake -s moleculeCoverage.snakefile --cores 5
# OR
snakemake -s getPhasedAlleleCounts.snakefile --cores 5
```
## 2. Invoking individual steps in the workflow
Users can run the snakemake files individually. This can be helpful for testing each step or if you only wish to generate results for a particular step. The snakefiles need to be run in this same order since input files are generated by the previous steps.
# a. [moleculeCoverage.snakefile](moleculeCoverage.snakefile)
This part of the workflow
```
snakemake -s moleculeCoverage.snakefile -np
snakemake -s moleculeCoverage.snakefile --cores 5
# OR
snakemake -s moleculeCoverage.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
```
# b. [getPhasedAlleleCounts.snakefile](getPhasedAlleleCounts.snakefile)
```
snakemake -s getPhasedAlleleCounts.snakefile -np
snakemake -s getPhasedAlleleCounts.snakefile --cores 5
# OR
snakemake -s getPhasedAlleleCounts.snakefile --cluster-sync "qsub -l h_vmem={params.mem},h_rt={params.runtime} {params.pe}" -j 50 --jobscript config/cluster.sh
```
