Update the documentation (#357)
This update fixes the documentation to reflect recent changes. In particular, the quick start demo and the tutorial now refer to the Snakemake workflow instead of the (now obsolete) `kevlar simplex` command.
standage committed Feb 13, 2019
1 parent 0f1f366 commit 8985554
Showing 8 changed files with 155 additions and 140 deletions.
42 changes: 40 additions & 2 deletions docs/banding.rst
@@ -5,5 +5,43 @@ If memory is a limiting factor for *k*-mer counting, kevlar supports a scatter/g
In brief, kevlar can achieve an N-fold reduction in memory usage in exchange for counting *k*-mers in N batches.
For each batch, kevlar ignores all *k*-mers except those whose hash values fall within a specified numerical range (band), reducing the memory required to achieve accurate *k*-mer counts.
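
The banding idea described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: ``cksum`` (a 32-bit CRC) stands in for kevlar's actual *k*-mer hash function, the bands are assumed to be equal-width ranges of the hash space, and ``band_of`` and ``count_band`` are hypothetical helper names, not kevlar commands.

```shell
#!/usr/bin/env bash
# Illustrative only: cksum stands in for kevlar's real k-mer hash function,
# and equal-width bands are an assumption made for this sketch.

# Print the 1-based band a k-mer falls into when the 32-bit hash space
# is split into num_bands equal-width numerical ranges.
band_of() {
    local kmer=$1 num_bands=$2 hash
    hash=$(printf '%s' "$kmer" | cksum | cut -d ' ' -f 1)
    echo $(( hash * num_bands / 4294967296 + 1 ))
}

# Read one k-mer per line on stdin and count only those in the given band;
# k-mers belonging to any other band are ignored in this pass.
count_band() {
    local num_bands=$1 band=$2 kmer
    while read -r kmer; do
        if [ "$(band_of "$kmer" "$num_bands")" -eq "$band" ]; then
            echo "$kmer"
        fi
    done | sort | uniq -c
}
```

Each *k*-mer lands in exactly one band, so running ``count_band`` once per band visits every *k*-mer exactly once across all passes while each pass only has to hold counts for its own share of the hash space.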

The ``kevlar count``, ``kevlar effcount``, and ``kevlar novel`` commands support *k*-mer banding.
The output of multiple ``kevlar novel`` invocations can be combined using ``kevlar filter``.
The :ref:`kevlar_count_api` and :ref:`kevlar_novel_api` commands support *k*-mer banding, and the :ref:`kevlar_unband_api` command merges novel reads from multiple batches into a single read set suitable for downstream analysis.

The example below does not show the most efficient way to invoke the banding workflow; it simply illustrates how banding is intended to work.

.. code::

    # Count k-mers in 6 passes for a 6x reduction in memory
    for band in {1..6}
    do
        for indiv in mother father proband
        do
            kevlar count \
                --threads 16 --memory 8G --counter-size 8 --ksize 31 \
                --mask refr-univec-mask.nodetable \
                --num-bands 6 --band $band \
                ${indiv}.kmer-counts.band${band}.counttable ${indiv}.reads.fastq.gz
        done
    done

    # Find novel k-mers in 6 passes
    for band in {1..6}
    do
        kevlar novel \
            --case proband.reads.fastq.gz --case-counts proband.kmer-counts.band${band}.counttable \
            --control-counts mother.kmer-counts.band${band}.counttable father.kmer-counts.band${band}.counttable \
            --case-min 5 --ctrl-max 1 --ksize 31 \
            --num-bands 6 --band $band \
            --out proband-novel-${band}.augfastq.gz
    done

    # Combine the novel reads from all 6 passes
    kevlar unband --out proband-novel.augfastq.gz proband-novel-{1..6}.augfastq.gz

    # The rest of the kevlar simplex workflow:
    # - kevlar filter
    # - kevlar partition
    # - kevlar assemble
    # - kevlar localize
    # - kevlar call
    # - kevlar simlike
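
Because ``kevlar unband`` is given one file per band, it can be worth confirming that every batch produced output before merging. The following is a minimal sketch, not part of kevlar: ``missing_bands`` is a hypothetical helper, and the file naming simply follows the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kevlar): list per-band output files that
# are missing or empty, using the naming scheme from the example above.
missing_bands() {
    local prefix=$1 num_bands=$2 band file
    for (( band = 1; band <= num_bands; band++ )); do
        file="${prefix}-${band}.augfastq.gz"
        # -s is true only if the file exists and is not empty
        [ -s "$file" ] || echo "$file"
    done
}

# Possible usage: merge only when no band output is missing.
#   if [ -z "$(missing_bands proband-novel 6)" ]; then
#       kevlar unband --out proband-novel.augfastq.gz proband-novel-{1..6}.augfastq.gz
#   fi
```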
35 changes: 35 additions & 0 deletions docs/cli.rst
@@ -7,6 +7,9 @@ From this one command, a variety of tasks and procedures can be invoked using se
Once kevlar is installed, available subcommands can be listed by executing :code:`kevlar -h`.
To see instructions for running a specific subcommand, execute :code:`kevlar <subcommand> -h` (of course replacing :code:`subcommand` with the actual name of the subcommand).


.. _kevlar_count_api:

kevlar count
------------

@@ -17,6 +20,8 @@ kevlar count
:prog: kevlar
:path: count

.. _kevlar_novel_api:

kevlar novel
------------

@@ -27,6 +32,8 @@ kevlar novel
:prog: kevlar
:path: novel

.. _kevlar_filter_api:

kevlar filter
-------------

@@ -37,6 +44,20 @@ kevlar filter
:prog: kevlar
:path: filter

.. _kevlar_partition_api:

kevlar partition
----------------

.. argparse::
:module: kevlar.cli.__init__
:func: parser
:nodefault:
:prog: kevlar
:path: partition

.. _kevlar_assemble_api:

kevlar assemble
---------------

@@ -47,6 +68,8 @@ kevlar assemble
:prog: kevlar
:path: assemble

.. _kevlar_localize_api:

kevlar localize
---------------

@@ -57,6 +80,8 @@ kevlar localize
:prog: kevlar
:path: localize

.. _kevlar_call_api:

kevlar call
-----------

@@ -67,6 +92,8 @@ kevlar call
:prog: kevlar
:path: call

.. _kevlar_simlike_api:

kevlar simlike
--------------

@@ -77,6 +104,8 @@ kevlar simlike
:prog: kevlar
:path: simlike

.. _kevlar_alac_api:

kevlar alac
-----------

@@ -87,6 +116,8 @@ kevlar alac
:prog: kevlar
:path: alac

.. _kevlar_unband_api:

kevlar unband
-------------

@@ -97,6 +128,8 @@ kevlar unband
:prog: kevlar
:path: unband

.. _kevlar_augment_api:

kevlar augment
----------------

@@ -107,6 +140,8 @@ kevlar augment
:prog: kevlar
:path: augment

.. _kevlar_mutate_api:

kevlar mutate
-------------

1 change: 0 additions & 1 deletion docs/index.rst
@@ -25,7 +25,6 @@ Documentation for **kevlar**
terms
formats
banding
sim
cli
conduct

3 changes: 2 additions & 1 deletion docs/install.rst
@@ -9,7 +9,7 @@ Otherwise, we suggest reading through the entire installation instructions befor

.. code::
pip3 install pysam networkx pandas scipy git+https://github.com/dib-lab/khmer.git
pip3 install pysam networkx pandas scipy intervaltree git+https://github.com/dib-lab/khmer.git
pip3 install biokevlar
Virtual environment
@@ -33,6 +33,7 @@ The kevlar package **requires Python 3** and has several dependencies that are n
- the `pysam module <http://pysam.readthedocs.io/>`_
- the `pandas library <http://pandas.pydata.org/>`_
- the `scipy library <https://www.scipy.org/>`_
- the `intervaltree library <https://github.com/chaimleib/intervaltree>`_
- the `khmer package <http://khmer.readthedocs.io/>`_

Also, kevlar requires the `bwa <https://github.com/lh3/bwa>`_ command to be callable from your ``$PATH`` environment variable.
26 changes: 14 additions & 12 deletions docs/quick-start.rst
@@ -1,30 +1,32 @@
Quick start
===========

If you have not already done so, install kevlar using :doc:`the following instructions <install>`.
If you have not already done so, install kevlar using :doc:`the following instructions <install>`, along with Snakemake version 5.0 or greater.

This gives a crash course on running kevlar's simplex analysis workflow.
The ``kevlar simplex`` command should be able to run on a laptop in less than 5 minutes while consuming less than 200 Mb of RAM for this demo data set.
The results (``variant-calls.vcf``) should include 5 variant calls: a 300 bp insertion and 4 single-nucleotide variants.
The simplest way to execute kevlar's entire *de novo* variant discovery workflow is to use the provided Snakemake workflow.
Processing the example data set below should take less than 5 minutes on a laptop and consume less than 200 MB of RAM.
The results (``workdir/calls.scored.sorted.vcf.gz``) should include 5 variant calls: a 300 bp insertion and 4 single-nucleotide variants.

A :doc:`more detailed tutorial is available <tutorial>`, and a complete listing of all available configuration options for each script can be found in :doc:`the CLI documentation <cli>`, or by executing ``kevlar <subcommand> -h`` in the terminal.
A :doc:`more detailed tutorial is available <tutorial>`, and a complete listing of all available configuration options for each kevlar command can be found in :doc:`the CLI documentation <cli>`, or by executing ``kevlar <subcommand> -h`` in the terminal.

----------

.. highlight:: none

.. code::
# Download data
curl -L https://s3-us-west-1.amazonaws.com/noble-trios/helium-mother-reads.fq.gz -o mother.fq.gz
curl -L https://s3-us-west-1.amazonaws.com/noble-trios/helium-father-reads.fq.gz -o father.fq.gz
curl -L https://s3-us-west-1.amazonaws.com/noble-trios/helium-proband-reads.fq.gz -o proband.fq.gz
curl -L https://s3-us-west-1.amazonaws.com/noble-trios/helium-refr.fa.gz -o refr.fa.gz
bwa index refr.fa.gz
kevlar simplex \
--case proband.fq.gz --control mother.fq.gz --control father.fq.gz \
--novel-memory 50M --filter-memory 1M --filter-fpr 0.005 --mask-memory 5M \
--mask-files refr.fa.gz \
--threads 4 --ksize 31 \
--out variant-calls.vcf \
refr.fa.gz
# Download and format configuration file
curl -L https://s3-us-west-1.amazonaws.com/noble-trios/helium-config.json \
| sed "s:/home/user/Desktop:$(pwd):g" > helium-config.json
# Invoke the workflow
snakemake \
--snakefile kevlar/workflows/mark-I/Snakefile \
--configfile helium-config.json --cores 4 --directory workdir -p calls
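
The quick start assumes that ``kevlar``, ``bwa``, and Snakemake 5.0 or greater are all callable from the shell. A pre-flight check along the following lines could catch a missing tool before any data is downloaded. This is a sketch only: ``version_ge`` and ``check_prereqs`` are hypothetical helpers, not part of kevlar or Snakemake, and the version comparison assumes a ``sort`` that supports ``-V`` (as in GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (not part of kevlar or Snakemake).

# Succeed if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report any tool the quick start relies on that is not callable,
# and check the Snakemake version if Snakemake is present.
check_prereqs() {
    local tool status=0
    for tool in kevlar bwa snakemake; do
        if ! command -v "$tool" > /dev/null; then
            echo "missing: $tool" >&2
            status=1
        fi
    done
    if command -v snakemake > /dev/null \
            && ! version_ge "$(snakemake --version)" 5.0; then
        echo "snakemake 5.0 or greater is required" >&2
        status=1
    fi
    return "$status"
}
```

Running ``check_prereqs`` before the commands above returns a non-zero status and names the problem if anything is missing.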
25 changes: 0 additions & 25 deletions docs/sim.rst

This file was deleted.