Chore: Updating docs for WES pipeline options. (#18)
skchronicles committed Feb 2, 2024
1 parent 6988c86 commit 2f94263
Showing 3 changed files with 28 additions and 9 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -2,7 +2,7 @@

<h1>genome-seek 🔬</h1>

**_Whole Genome Clinical Sequencing Pipeline._**
**_Whole Genome and Exome Clinical Sequencing Pipeline._**

[![tests](https://github.com/OpenOmics/genome-seek/workflows/tests/badge.svg)](https://github.com/OpenOmics/genome-seek/actions/workflows/main.yaml) [![docs](https://github.com/OpenOmics/genome-seek/workflows/docs/badge.svg)](https://github.com/OpenOmics/genome-seek/actions/workflows/docs.yml) [![GitHub issues](https://img.shields.io/github/issues/OpenOmics/genome-seek?color=brightgreen)](https://github.com/OpenOmics/genome-seek/issues) [![GitHub license](https://img.shields.io/github/license/OpenOmics/genome-seek)](https://github.com/OpenOmics/genome-seek/blob/main/LICENSE)

@@ -20,7 +20,7 @@ The **`./genome-seek`** pipeline is composed of several interrelated sub-command
* [<code>genome-seek <b>unlock</b></code>](https://openomics.github.io/genome-seek/usage/unlock/): Unlocks a previous run's output directory.
* [<code>genome-seek <b>cache</b></code>](https://openomics.github.io/genome-seek/usage/cache/): Cache software containers locally.

**genome-seek** is a comprehensive clinical WGS pipeline that is focused on speed. Each tool in the pipeline was benchmarked and selected due to its low run times without sacrificing accuracy or precision. It relies on technologies like [Singularity<sup>1</sup>](https://singularity.lbl.gov/) to maintain the highest level of reproducibility. The pipeline consists of a series of data processing and quality-control steps orchestrated by [Snakemake<sup>2</sup>](https://snakemake.readthedocs.io/en/stable/), a flexible and scalable workflow management system, to submit jobs to a cluster.
**genome-seek** is a comprehensive clinical WGS and WES pipeline that is focused on speed. Each tool in the pipeline was benchmarked and selected due to its low run times without sacrificing accuracy or precision. It relies on technologies like [Singularity<sup>1</sup>](https://singularity.lbl.gov/) to maintain the highest level of reproducibility. The pipeline consists of a series of data processing and quality-control steps orchestrated by [Snakemake<sup>2</sup>](https://snakemake.readthedocs.io/en/stable/), a flexible and scalable workflow management system, to submit jobs to a cluster.

The pipeline is compatible with data generated from Illumina short-read sequencing technologies. As input, it accepts a set of FastQ files and can be run locally on a compute instance or on-premise using a cluster (recommended). A user can define the method or mode of execution. The pipeline can submit jobs to a cluster using a job scheduler like SLURM (more coming soon!). A hybrid approach ensures the pipeline is accessible to all users.

@@ -53,4 +53,4 @@ This site is a living document, created for and by members like you. genome-seek

## References
<sup>**1.** Kurtzer GM, Sochat V, Bauer MW (2017). Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.</sup>
<sup>**2.** Koster, J. and S. Rahmann (2018). "Snakemake-a scalable bioinformatics workflow engine." Bioinformatics 34(20): 3600.</sup>
<sup>**2.** Koster, J. and S. Rahmann (2018). "Snakemake-a scalable bioinformatics workflow engine." Bioinformatics 34(20): 3600.</sup>
6 changes: 3 additions & 3 deletions docs/index.md
@@ -2,7 +2,7 @@

<h1 style="font-size: 250%">genome-seek 🔬</h1>

<b><i>Whole Genome Clinical Sequencing Pipeline</i></b><br>
<b><i>Whole Genome and Exome Clinical Sequencing Pipeline</i></b><br>
<a href="https://github.com/OpenOmics/genome-seek/actions/workflows/main.yaml">
<img alt="tests" src="https://github.com/OpenOmics/genome-seek/workflows/tests/badge.svg">
</a>
@@ -24,7 +24,7 @@


## Overview
Welcome to genome-seek's documentation! This guide is the main source of documentation for users who are getting started with the OpenOmics [whole genome sequencing pipeline](https://github.com/OpenOmics/genome-seek/).
Welcome to genome-seek's documentation! This guide is the main source of documentation for users who are getting started with the OpenOmics [genome-seek pipeline](https://github.com/OpenOmics/genome-seek/).

The **`./genome-seek`** pipeline is composed of several interrelated sub-commands to set up and run the pipeline across different systems. Each of the available sub-commands performs different functions:

@@ -58,7 +58,7 @@ The **`./genome-seek`** pipeline is composed of several interrelated sub-command

</section>

**genome-seek** is a comprehensive clinical WGS pipeline that is focused on speed. Each tool in the pipeline was benchmarked and selected due to its low run times without sacrificing accuracy or precision. It relies on technologies like [Singularity<sup>1</sup>](https://singularity.lbl.gov/) to maintain the highest level of reproducibility. The pipeline consists of a series of data processing and quality-control steps orchestrated by [Snakemake<sup>2</sup>](https://snakemake.readthedocs.io/en/stable/), a flexible and scalable workflow management system, to submit jobs to a cluster.
**genome-seek** is a comprehensive clinical WGS and WES pipeline that is focused on speed. Each tool in the pipeline was benchmarked and selected due to its low run times without sacrificing accuracy or precision. It relies on technologies like [Singularity<sup>1</sup>](https://singularity.lbl.gov/) to maintain the highest level of reproducibility. The pipeline consists of a series of data processing and quality-control steps orchestrated by [Snakemake<sup>2</sup>](https://snakemake.readthedocs.io/en/stable/), a flexible and scalable workflow management system, to submit jobs to a cluster.

The pipeline is compatible with data generated from Illumina short-read sequencing technologies. As input, it accepts a set of FastQ files and can be run locally on a compute instance or on-premise using a cluster (recommended). A user can define the method or mode of execution. The pipeline can submit jobs to a cluster using a job scheduler like SLURM (more coming soon!). A hybrid approach ensures the pipeline is accessible to all users.

25 changes: 22 additions & 3 deletions docs/usage/run.md
@@ -11,9 +11,10 @@ Setting up the genome-seek pipeline is fast and easy! In its most basic form, <c
```text
$ genome-seek run [--help] \
[--mode {slurm,local}] [--job-name JOB_NAME] [--batch-id BATCH_ID] \
[--call-cnv] [--call-sv] [--call-hla] [--skip-qc] [--open-cravat] \
[--oc-annotators OC_ANNOTATORS] [--oc-modules OC_MODULES] \
[--tmp-dir TMP_DIR] [--silent] [--sif-cache SIF_CACHE] \
[--call-cnv] [--call-sv] [--call-hla] [--call-somatic] [--open-cravat] \
[--skip-qc] [--oc-annotators OC_ANNOTATORS] [--oc-modules OC_MODULES] \
[--pairs PAIRS] [--pon PANEL_OF_NORMALS] [--wes-mode] [--wes-bed WES_BED] \
[--tmp-dir TMP_DIR] [--silent] [--sif-cache SIF_CACHE] \
[--singularity-cache SINGULARITY_CACHE] \
[--dry-run] [--threads THREADS] \
--input INPUT [INPUT ...] \
@@ -96,6 +97,23 @@ Each of the following arguments are optional, and do not need to be provided.
>
> ***Example:*** `--skip-qc`
---
`--wes-mode`
> **Runs the whole exome pipeline.**
> *type: boolean flag*
>
> By default, the whole genome sequencing (WGS) pipeline is run. This option allows a user to process and analyze whole exome sequencing (WES) data. Please note that when this mode is enabled, only a subset of the WGS rules will run. Please see the option below for more information about providing a custom exome targets/capture-kit BED file.
>
> ***Example:*** `--wes-mode`
---
`--wes-bed WES_BED`
> **Path to exome targets/capture-kit BED file.**
> *type: BED file*
>
> This file can be obtained from the manufacturer of the target capture kit that was used. By default, a set of BED files generated from GENCODE's exon annotation for protein-coding genes is used. Please note: this BED file should contain at least 6 columns.
>
> ***Example:*** `--wes-bed Agilent_SS_AllExons_V7_Regions.bed`
---
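The 6-column requirement for `--wes-bed` can be checked before a run is launched. The following is a minimal sketch, not part of genome-seek itself: the function name `check_bed_columns` is hypothetical, and it assumes the BED file is tab-delimited and that lines starting with `#`, `track`, or `browser` are header lines rather than records.

```python
def check_bed_columns(bed_path, min_columns=6):
    """Return (line_number, column_count) for each record with too few columns.

    Assumption: records are tab-delimited; header-style lines are ignored.
    """
    problems = []
    with open(bed_path) as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            # Skip blank lines and header-style lines that are not records.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split("\t")
            if len(fields) < min_columns:
                problems.append((line_number, len(fields)))
    return problems
```

An empty list means every record has at least `min_columns` fields. Any tuples returned point to the lines that would likely need fixing before passing the file to `--wes-bed`.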
`--batch-id BATCH_ID`
@@ -107,6 +125,7 @@ Each of the following arguments are optional, and do not need to be provided.
>
> ***Example:*** `--batch-id WGS_2022-04-19`

### 2.3 Annotation options

Each of the following arguments are optional, and do not need to be provided.