Documentation update (#102)
* README update to include Baylor counts
* updated drop installation command
* update install command docs to include conda-forge plus better descriptions

Co-authored-by: Vicente <yepez@in.tum.de>
Co-authored-by: Michaela Müller <mi.mueller@tum.de>
Co-authored-by: Christian Mertes <mertes@in.tum.de>
4 people committed Aug 2, 2020
1 parent 8679afe commit 966da08
Showing 5 changed files with 99 additions and 77 deletions.
31 changes: 12 additions & 19 deletions README.md
@@ -8,40 +8,29 @@ The manuscript main file, supplementary figures and table can be found in the ma

<img src="drop_sticker.png" alt="drop logo" width="200" class="center"/>

## Installation
DROP is available on [bioconda](https://anaconda.org/bioconda/drop) for python 3.6 and above.
We recommend using a dedicated conda environment.

## Quickstart
DROP is available on [bioconda](https://anaconda.org/bioconda/drop).
We recommend using a dedicated conda environment. (installation time: ~ 10min)
```
# create environment
conda create -n drop_env python=3.6
conda activate drop_env
# install drop
conda install -c bioconda drop
conda install -c conda-forge -c bioconda drop
```
Installation time: ~ 10min
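For scripted setups, the environment creation can be made repeatable. The following is a minimal sketch, not part of DROP: the helper name `ensure_drop_env` is hypothetical, and it assumes that `conda` is on the `PATH` and that creating the environment and installing drop in a single `conda create` call is acceptable.

```bash
# Sketch: create the conda environment for drop only if it does not exist yet.
# ensure_drop_env is a hypothetical helper, not a DROP command.
ensure_drop_env() {
    local env_name="${1:-drop_env}"
    # `conda env list` prints one environment per line, name in the first column
    if conda env list | awk '{print $1}' | grep -Fqx "$env_name"; then
        echo "environment '$env_name' already exists"
    else
        # same channels as in the install command above
        conda create -y -n "$env_name" -c conda-forge -c bioconda drop
    fi
}
```

Calling `ensure_drop_env drop_env` twice should create the environment the first time and only report its existence the second time.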

Test whether the pipeline runs through by setting up the demo dataset in an empty directory (e.g. ``~/drop_demo``).

Test installation with demo project
```
mkdir ~/drop_demo
cd ~/drop_demo
# demo will download the necessary data and pipeline files
drop demo
```

The pipeline can be run using `snakemake` commands

The pipeline can be run using [snakemake](https://snakemake.readthedocs.io/) commands
```
snakemake -n # dryrun
snakemake
snakemake --cores 1
```

Expected runtime: 25 min

For more information on different installation options, check out the
For more information on different installation options, refer to the
[documentation](https://gagneurlab-drop.readthedocs.io/en/latest/installation.html)

## Set up a custom project
@@ -66,4 +55,8 @@ The following publicly-available datasets of gene counts can be used as controls

* 119 non-strand specific fibroblasts: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3887451.svg)](https://doi.org/10.5281/zenodo.3887451)

* 139 strand specific fibroblasts: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3963474.svg)](https://doi.org/10.5281/zenodo.3963474)

* 125 strand specific blood: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3963470.svg)](https://doi.org/10.5281/zenodo.3963470)

If you would like to contribute your own count matrices, please contact us: yepez at in.tum.de
23 changes: 14 additions & 9 deletions docs/source/index.rst
@@ -3,8 +3,8 @@ DROP - Detection of RNA Outliers Pipeline

DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
aberrant splicing and mono-allelic expression. It consists of three independent modules for each of those strategies.
After installing DROP, the user needs to fill in the config file and sample annotation table (:ref:`prepare`).
Then, DROP can be executed in multiple ways (:ref:`pipeline`).
After installing DROP, the user needs to fill in the config file and sample annotation table (:doc:`prepare`).
Then, DROP can be executed in multiple ways (:doc:`pipeline`).

.. toctree::
:maxdepth: 2
@@ -19,23 +19,28 @@ Then, DROP can be executed in multiple ways (:ref:`pipeline`).
Quickstart
-----------

DROP is available on `bioconda <https://anaconda.org/bioconda/drop>`_ for python 3.6 and above.
We recommend using a dedicated conda environment.
DROP is available on `bioconda <https://anaconda.org/bioconda/drop>`_.
We recommend using a dedicated conda environment. (installation time: ~ 10min)

.. code-block:: bash
conda install -c bioconda drop
conda install -c conda-forge -c bioconda drop
Initialize project
Test installation with demo project

.. code-block:: bash
cd <path-to-project>
mkdir ~/drop_demo
cd ~/drop_demo
drop demo
Call the pipeline
The pipeline can be run using `snakemake <https://snakemake.readthedocs.io/>`_ commands

.. code-block:: bash
snakemake
snakemake -n # dryrun
snakemake --cores 1
Expected runtime: 25 min

For more information on different installation options, refer to :doc:`installation`.
109 changes: 66 additions & 43 deletions docs/source/installation.rst
@@ -1,17 +1,18 @@
Installation
============

DROP is available on `bioconda <https://anaconda.org/bioconda/drop>`_ for python 3.6 and above.
We recommend using a dedicated conda environment.
DROP is available on `bioconda <https://anaconda.org/bioconda/drop>`_.
In case the conda channel priority is set to ``strict``, it should be reset to ``flexible``:

.. code-block:: bash
.. code-block::
conda config --set channel_priority flexible
# create environment
conda create -n drop_env python=3.6
conda activate drop_env
We recommend using a dedicated conda environment (here: ``drop_env``) for installing drop.

.. code-block:: bash
# install drop
conda install -c bioconda drop
conda create -n drop_env -c conda-forge -c bioconda drop
Installation time: ~ 10min

@@ -25,12 +26,12 @@ Test whether the pipeline runs through by setting up the demo dataset in an empt
# demo will download the necessary data and pipeline files
drop demo
The pipeline can be run using ``snakemake`` commands
The pipeline can be run using `snakemake <https://snakemake.readthedocs.io/>`_ commands

.. code-block:: bash
snakemake -n # dryrun
snakemake
snakemake --cores 1
Initialize a project
--------------------
@@ -39,12 +40,13 @@ Alternatively, a new DROP project can be set up using ``drop init``.

.. code-block:: bash
cd <path-to-project>
cd <path/to/project>
drop init
This will create an empty ``config.yaml`` file that needs to be filled according to the project data.
You also need to prepare a sample annotation file.
Go to :ref:`prepare` for more details.
Go to :doc:`prepare` for more details.


.. _otherversions:

@@ -53,62 +55,83 @@ Other DROP versions

The developer version of DROP can be found in the `repository <https://github.com/gagneurlab/drop>`_ under the branch
``dev``.
Make sure that the :any:`dependencies` are installed.
Make sure that the :any:`prerequisites` are installed, preferably in a conda environment.
Then install DROP from github using ``pip``.

.. code-block:: bash
# activate your python environment if you are using one, e.g. drop_env
conda activate drop_env
pip install git+https://github.com/gagneurlab/drop.git@dev
Then install DROP from github using ``pip``.
For this recursively clone the repository with all its submodules and then install from directory.
Alternatively, you can clone the desired branch of the repository and install from directory.

.. code-block:: bash
git clone -b dev https://github.com/gagneurlab/drop.git --recurse-submodules
git clone -b dev https://github.com/gagneurlab/drop.git
pip install ./drop
Alternatively, you can also install it directly without cloning
If the package needs to be updated frequently, it is more useful to use the ``-e`` option of ``pip``.
Any new update pulled from the repository will be available without reinstalling.
Note that this requires an explicit call to update any existing project (:any:`dropUpdate`).

.. code-block:: bash
.. code-block::
pip install git+https://github.com/gagneurlab/drop.git@dev
pip install -e ./drop
.. _dependencies:
# update project directory
cd <path/to/project>
drop update
Dependencies
------------
The easiest way to ensure that all dependencies are installed is to install the
`bioconda package <https://anaconda.org/bioconda/drop>`_ into a conda environment.
.. code-block:: bash
.. _prerequisites:

conda install -c bioconda drop
Prerequisites
-------------

Other versions of drop can be installed after the bioconda package has been installed.
The easiest way to ensure that all dependencies are installed is to install the bioconda package, as described above.
Once the environment is set up and installation was successful, other versions of drop can be installed with ``pip``,
overwriting the conda version of ``DROP`` (see :any:`otherversions`).


Installation without conda
++++++++++++++++++++++++++
Alternatively, DROP can be installed without ``conda``. In this case the following dependencies must be met:

* python >= 3.6
* pip >= 19.1
* `samtools <https://www.htslib.org/download/>`_ >= 1.7
* `bcftools <https://github.com/samtools/bcftools>`_ >= 1.7
* `tabix <https://www.htslib.org/download/>`_
* `GATK <https://software.broadinstitute.org/gatk/>`_
* `graphviz <https://www.graphviz.org/>`_
* `pandoc <https://pandoc.org/>`_
* `R <https://www.r-project.org/>`_ >= 3.5 and corresponding `bioconductor <https://bioconductor.org/install/>`_ version

If you are using an already existing R installation, make sure that the R and ``bioconductor`` versions match.
Otherwise, use the newest versions of R and bioconductor.
The necessary R packages will be installed with the first pipeline call.
* Programming languages:

* `python <https://www.python.org/>`_ >= 3.6 and `pip <https://pip.pypa.io/en/stable/installing/>`_ >= 19.1

* `R <https://www.r-project.org/>`_ >= 3.6 and corresponding `bioconductor <https://bioconductor.org/install/>`_ version

* Commandline tools:

* `GNU bc <https://www.gnu.org/software/bc/>`_

* `GNU wget <https://www.gnu.org/software/wget/>`_

* `tabix <https://www.htslib.org/download/>`_

* `samtools <https://www.htslib.org/download/>`_ >= 1.7

* `bcftools <https://github.com/samtools/bcftools>`_ >= 1.7

* `GATK <https://software.broadinstitute.org/gatk/>`_ >= 4.0.4

* `graphviz <https://www.graphviz.org/>`_

* `pandoc <https://pandoc.org/>`_
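Before running the pipeline without conda, it can help to check which of these tools are actually available. The following is a hedged sketch rather than anything shipped with DROP: ``missing_tools`` and ``version_ge`` are hypothetical helper names, and ``version_ge`` assumes a ``sort`` that supports ``-V`` (as in GNU coreutils).

```bash
# Sketch: report prerequisites that cannot be found on PATH.
# Both helpers are hypothetical and only illustrate the check.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# version_ge A B: succeeds if version A is at least version B.
# Relies on version sort (sort -V), which is an assumption about the platform.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, ``missing_tools bc wget tabix samtools bcftools pandoc`` prints one line per tool that is not installed, and ``version_ge "$found" 1.7`` can be used once a version string has been extracted from a tool's output. The executable names for GATK, graphviz and R are left out here because they depend on the installation.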


.. note::

If you are using an already existing R installation, make sure that the R and bioconductor versions match.
Otherwise, use the newest versions of R and bioconductor.

All necessary R packages will be installed with the first pipeline call.
As this is a lengthy process, it may be preferable to install them in advance if a local copy of the repository exists.

.. code-block:: bash
# optional
Rscript <path-to-drop-repo>/drop/installRPackages.R drop/requirementsR.txt
Rscript <path/to/drop/repo>/drop/installRPackages.R drop/requirementsR.txt
7 changes: 4 additions & 3 deletions docs/source/pipeline.rst
@@ -1,5 +1,3 @@
.. _pipeline:

Pipeline Commands
=================

@@ -81,10 +79,13 @@ While running, Snakemake *locks* the directory. If, for whatever reason, the p
to unlock it. This will call snakemake's ``unlock`` command for every module

.. _dropUpdate:

Updating DROP
+++++++++++++
Everytime a project is initialized, a temporary folder ``.drop`` will be created in the project folder. If a new version of drop is installed, the ``.drop`` folder has to be updated for each project that has been initialized using an older version.
Every time a project is initialized, a temporary folder ``.drop`` will be created in the project folder.
If a new version of drop is installed, the ``.drop`` folder has to be updated for each project that has been
initialized using an older version.
To do this run:

.. code-block:: bash
6 changes: 3 additions & 3 deletions docs/source/prepare.rst
@@ -1,5 +1,3 @@
.. _prepare:

Preparing the Input Data
========================

@@ -95,6 +93,7 @@ groups list Same as in aberrant expression.
minIds numeric Same as in aberrant expression. ``1``
recount boolean If true, it forces samples to be recounted. ``false``
longRead boolean Set to true only if counting Nanopore or PacBio long reads. ``false``
keepNonStandardChrs boolean Set to true if non standard chromosomes are to be kept for further analysis. ``true``
filter boolean If false, no filter is applied. We recommend filtering. ``true``
minExpressionInOneSample numeric The minimal read count in at least one sample required for an intron to pass the filter. ``20``
minDeltaPsi numeric The minimal variation (in delta psi) required for an intron to pass the filter. ``0.05``
@@ -118,6 +117,7 @@ padjCutoff numeric Same as in aberrant expression.
allelicRatioCutoff numeric A number between [0.5, 1) indicating the maximum allelic ratio allele1/(allele1+allele2) for the test to be significant. ``0.8``
addAF boolean Whether or not to add the allele frequencies from gnomAD ``true``
maxAF numeric Maximum allele frequency (of the minor allele) cut-off. Variants with AF equal or below this number are considered rare. ``0.001``
maxVarFreqCohort numeric Maximum variant frequency among the cohort. ``0.05``
qcVcf character Full path to the vcf file used for VCF-BAM matching ``/path/to/qc_vcf.vcf.gz``
qcGroups list Same as “groups”, but for the VCF-BAM matching ``# see aberrant expression example``
===================== ========= ======================================================================================================================== ======
@@ -172,7 +172,7 @@ Specifically, the number of threads allowed for a computational step can be modi

.. note::

DROP needs to be installed from a local directory :ref:`otherversions` using ``pip install -e <path-to-drop-repo>``
DROP needs to be installed from a local directory :ref:`otherversions` using ``pip install -e <path/to/drop-repo>``
so that any changes in the code will be available in the next pipeline run.
Any changes made to the R code need to be updated with ``drop update`` in the project directory.
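When several projects exist, ``drop update`` has to be run in each of them. The sketch below shows one possible way to do that, assuming every initialized project contains a ``.drop`` folder directly in its project directory; ``list_drop_projects`` and ``update_drop_projects`` are hypothetical helper names, not DROP commands.

```bash
# Sketch: find initialized projects below a root directory and update each one.
# list_drop_projects ROOT: print every directory below ROOT that contains a .drop folder
list_drop_projects() {
    find "${1:-.}" -type d -name .drop -prune -exec dirname {} \; | sort
}

# update_drop_projects ROOT: run `drop update` inside each project that was found
update_drop_projects() {
    local project
    list_drop_projects "$1" | while IFS= read -r project; do
        # a subshell keeps the caller's working directory unchanged
        (cd "$project" && drop update) || echo "update failed in $project" >&2
    done
}
```

Running ``list_drop_projects ~/projects`` first gives a chance to review the list before ``update_drop_projects ~/projects`` touches anything.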
