Commit

Merge pull request #163 from epigen/master
bring dev up to date with master docs
nsheff committed Aug 1, 2017
2 parents 7851c38 + 2e9e46c commit 9dcc26c
Showing 7 changed files with 55 additions and 27 deletions.
2 changes: 1 addition & 1 deletion doc/source/define-your-project.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
.. _project-config-file:

Defining a project
How to define a project
=============================================

To use ``looper`` with your project, you must define your project using Looper's standard project definition format. If you follow this format, then your project can be read not only by looper for submitting pipelines, but also for other tasks, like: summarizing pipeline output, analysis in R (using the ``project.init`` package), or building UCSC track hubs.
4 changes: 3 additions & 1 deletion doc/source/hello-world.rst
@@ -41,4 +41,6 @@ Now, to test looper, follow the commands in the `Hello, Looper! example reposito
.. HINT::

If the looper executable isn't in your path, add it with ``export PATH=~/.local/bin:$PATH`` -- check out the :doc:`FAQ <faq>`.
If the looper executable isn't in your path, add it with ``export PATH=~/.local/bin:$PATH`` -- check out the :doc:`FAQ <faq>`.

Now just read the explanation in the `Hello, Looper! example repository <https://github.com/databio/hello_looper>`_ to understand what you've accomplished.
2 changes: 1 addition & 1 deletion doc/source/index.rst
@@ -1,7 +1,7 @@
Welcome
^^^^^^^^

Deploying pipelines just got easier. Looper is a lightweight python toolkit that deploys your pipeline across samples with minimal effort. To get started, proceed with the :doc:`Introduction <intro>` or use the table of contents below to navigate the docs.
Deploying pipelines just got easier. Looper is a python application that deploys pipelines across samples with minimal effort. Looper is **not** a pipeline development framework; it does not help develop pipelines, but sits a layer above the pipeline to manage projects and samples for any type of pipeline. To get started, proceed with the :doc:`Introduction <intro>`.

Contents
^^^^^^^^
4 changes: 3 additions & 1 deletion doc/source/intro.rst
@@ -2,7 +2,9 @@
Introduction
=====================================

Looper is a job submitting engine. If you have a pipeline and a bunch of samples you want to run, looper can help you organize the inputs and outputs. It's scalable: by default, it runs your jobs sequentially on the local computer, but with a small configuration change, it will create and submit jobs to any cluster resource manager (like SLURM, SGE, or LSF).
Looper is a job submitting engine. Do not confuse it with a pipeline workflow engine, which is used to build pipelines. Looper assumes you already have pipelines built, and it helps you map samples to those pipelines. If you have a pipeline and a bunch of samples you want to run, looper can help you organize the inputs and outputs.

It's scalable: by default, it runs your jobs sequentially on the local computer, but with a small configuration change, it will create and submit jobs to any cluster resource manager (like SLURM, SGE, or LSF).

The basics: We provide a format specification (the :ref:`project config file <project-config-file>`), which you use to describe your project. You create this single configuration file (in `yaml format <http://www.yaml.org/>`_), pass this file as input to ``looper``, which parses it, reads your sample list, maps each sample to the appropriate pipeline, and creates and runs (or submits) job scripts. Easy.

29 changes: 25 additions & 4 deletions doc/source/pipeline-interface.rst
@@ -1,11 +1,32 @@
Linking the pipeline interface
How to link a pipeline to your project
=============================================

Looper links to pipelines through a file called the `pipeline_interface`. If you're using pre-made looper pipelines, you don't need to create a new interface; you just use the one that comes with the pipeline. If you need to link a new pipeline to looper, then you'll need to create a new pipeline interface file. The instructions below show you how to use pipelines in either category.


Linking a looper-compatible pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many projects will require only existing pipelines that are already looper-compatible. We maintain a (growing) list of known publicly available `looper-compatible pipelines <https://github.com/pepkit/hello_looper/blob/master/looper_pipelines.md>`_ that will give you a good place to start. This list includes pipelines for data types like RNA-seq, bisulfite sequencing, etc.

To use one of these pipelines, just clone the repository and then point your project to that pipeline's `pipeline_interface` file. You do this with the `pipeline_interfaces` attribute in the `metadata` section of your `project_config` file:

.. code-block:: yaml

    metadata:
      pipeline_interfaces: path/to/pipeline_interface.yaml

This value should be the absolute path to the pipeline interface file. After that, you just need to make sure your project definition provides all the necessary sample metadata that is required by the pipeline you want to use. For example, you will need to make sure your sample annotation sheet specifies the correct value under `protocol` that your linked pipeline understands. These details are specific to each pipeline and should be defined in the pipeline's README.


Linking a custom pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. HINT::

Pipeline users don't need to worry about this section. This is for those who develop pipelines, or those who want to use a currently defined looper project to submit to an existing pipeline that isn't already configured for looper.
If you're just a pipeline **user**, you don't need to worry about this section. This is for those who develop pipelines, or those who want to use a currently defined looper project to submit to an existing pipeline that isn't already configured for looper.

Looper can connect samples to any pipeline, as long as it runs on the command line and uses text command-line arguments. These pipelines could be simple shell scripts, python scripts, perl scripts, or even pipelines built using a framework. Typically, we use python pipelines built using the `pypiper <https://databio.org/pypiper>`_ package, which provides some additional power to looper, but this is optional.
Looper can connect samples to any pipeline, as long as it runs on the command line and uses text command-line arguments. These pipelines could be simple shell scripts, python scripts, perl scripts, or even pipelines built using a framework. Typically, we use python pipelines built using the `pypiper <http://pypiper.readthedocs.io>`_ package, which provides some additional power to looper, but this is optional.

Regardless of what pipelines you use, you will need to tell looper how to interface with your pipeline. You do that by specifying a **pipeline interface file**. The **pipeline interface** is a ``yaml`` file with two subsections:

@@ -29,7 +50,7 @@ Let's start with a very simple example. A basic ``pipeline_interface.yaml`` file
"--input": data_path
The first section specifies that samples of protocol ``RRBS`` will be mapped to the pipeline specified by key ``rrbs_pipeline``. The second section describes where the pipeline named ``rrbs_pipeline`` is located and what command-line arguments it requires. Pretty simple. Let's go through each of these sections in more detail:
The first section specifies that samples of protocol ``RRBS`` will be mapped to the pipeline specified by key ``rrbs_pipeline``. The second section describes where the pipeline with key ``rrbs_pipeline`` is located and what command-line arguments it requires. Pretty simple. Let's go through these 2 sections in more detail:
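
To make the two-step lookup concrete, here is a minimal Python sketch of the idea: a sample's protocol selects a pipeline key, and that pipeline's argument mapping is filled in from the sample's attributes. This is an illustration only, not looper's actual implementation. The dictionary key names (``protocol_mapping``, ``pipelines``, ``path``, ``arguments``), the ``build_command`` helper, the script path, and the sample values are all assumptions made for the example.

```python
import shlex

# Hypothetical in-memory stand-in for a parsed pipeline interface file.
# The key names are assumptions for illustration, not looper's schema.
interface = {
    "protocol_mapping": {"RRBS": "rrbs_pipeline"},
    "pipelines": {
        "rrbs_pipeline": {
            "path": "path/to/rrbs_pipeline.py",  # hypothetical script location
            "arguments": {"--input": "data_path"},
        }
    },
}


def build_command(sample, interface):
    """Return the command line for one sample, or None if no pipeline matches."""
    # Step 1: map the sample's protocol to a pipeline key.
    key = interface["protocol_mapping"].get(sample["protocol"])
    if key is None:
        return None
    # Step 2: look up the pipeline and fill each argument from a sample attribute.
    pipeline = interface["pipelines"][key]
    parts = [pipeline["path"]]
    for flag, attribute in pipeline["arguments"].items():
        parts.extend([flag, str(sample[attribute])])
    return " ".join(shlex.quote(part) for part in parts)


# Hypothetical sample, as it might be read from a sample annotation sheet.
sample = {"sample_name": "s1", "protocol": "RRBS", "data_path": "data/s1.fastq"}
command = build_command(sample, interface)
print(command)
```

A sample whose protocol has no entry in the mapping yields ``None`` here, so a caller could skip it rather than fail; how looper itself handles unmapped protocols is not covered by this sketch.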

.. include:: pipeline-interface-mapping.rst.inc

27 changes: 15 additions & 12 deletions doc/source/tutorials.rst
@@ -1,9 +1,9 @@
Extended tutorial
***************************************************

The best way to learn is by example, so here's a quick tutorial to get you started using looper to run pre-made pipelines on a pre-made project.
The best way to learn is by example, so here's an extended tutorial to get you started using looper to run pre-made pipelines on a pre-made project.

First, install looper and pypiper (since our tutorial uses pypiper pipelines):
First, install looper and pypiper. `Pypiper <https://pypiper.readthedocs.io>`_ is our pipeline development framework; it is not required to use looper, which can work with any command-line pipeline, but this tutorial uses pypiper pipelines so we must install it now:

.. code:: bash
@@ -29,29 +29,32 @@ Now you can run this project with looper! Just use ``looper run``:
.. HINT::

If the looper executable isn't in your path, add it with ``export PATH=~/.local/bin:$PATH`` -- check out the :doc:`FAQ <faq>`.
If the looper executable isn't in your path, add it with ``export PATH=~/.local/bin:$PATH``.

Pipeline outputs
^^^^^^^^^^^^^^^^^^^^^^^^^^
Outputs of pipeline runs will be under the directory specified in the ``output_dir`` variable under the ``paths`` section in the project config file (see :doc:`config-files`); this directory is usually named after the project being run.

Inside there will be two directories:

- ``results_pipeline`` [1]_ - a directory containing one directory with the output of the pipelines, for each sample.
- ``submissions`` [2]_ - which holds yaml representations of the samples and log files of the submitted jobs.
- ``results_pipeline`` - a directory containing one subdirectory per sample, each holding that sample's pipeline output.
- ``submissions`` - a directory holding yaml representations of the samples and log files of the submitted jobs.
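
As a quick way to check this layout after a run, the following Python sketch builds the two expected paths from an output directory and reports whether each exists. It assumes the default subdirectory names listed above; the ``expected_output_dirs`` helper and the ``/tmp/my_project`` path are hypothetical and are not part of looper.

```python
from pathlib import Path


def expected_output_dirs(output_dir):
    """Return the two subdirectories expected under a project's output_dir."""
    root = Path(output_dir)
    return {
        "results": root / "results_pipeline",   # per-sample pipeline output
        "submissions": root / "submissions",    # sample yaml files and job logs
    }


# Hypothetical output_dir; substitute the value from your project config file.
dirs = expected_output_dirs("/tmp/my_project")
for label, path in dirs.items():
    status = "exists" if path.is_dir() else "missing"
    print(f"{label}: {path} ({status})")
```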

In this example, we just ran one example sample (an amplicon sequencing library) through a pipeline that processes amplicon data (to determine percentage of indels in amplicon).

The sample-specific output of each pipeline type varies.
From here to running hundreds of samples of various sample types is virtually the same effort!

To use pre-made pipelines with your project, all you have to do is :doc:`define your project <define-your-project>` using looper's standard format. To link your own, custom built pipelines, you can :doc:`connect your pipeline to looper with a pipeline interface <pipeline-interface>`.
On your own
^^^^^^^^^^^^^^^^^^^^^^^^^^

In this example, we just ran one example sample (an amplicon sequencing library) through a pipeline that processes amplicon data (to determine percentage of indels in amplicon).
To use looper on your own, you will need to prepare 2 things: your project (what data do you want to process), and your pipelines (what do you want to do with that data). The next sections provide detailed instructions on how to tell looper about these 2 things:

1. **Project**. To link your project to looper, you will need to :doc:`define your project <define-your-project>` using looper's standard format.


2. **Pipelines**. You will want to either use pre-made looper-compatible pipelines, or link your own, custom built pipelines. Either way, the next section includes detailed instructions on how to :doc:`connect your pipeline to looper <pipeline-interface>`.

From here to running hundreds of samples of various sample types is virtually the same effort!



.. rubric:: Footnotes

.. [1] This variable can also be specified in the ``results_subdir`` variable under the ``paths`` section of the project config file
.. [2] This variable can also be specified in the ``submission_subdir`` variable under the ``paths`` section of the project config file
14 changes: 7 additions & 7 deletions doc/source/usage.rst
@@ -22,7 +22,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper [-h] [-V] {run,summarize,destroy,check,clean} ...
looper - Loop through samples and submit pipelines.
@@ -49,7 +49,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper run [-h] [-t TIME_DELAY] [--ignore-flags] [--compute COMPUTE]
[--env ENV] [--limit LIMIT] [--file-checks] [-d]
[--sp SUBPROJECT]
@@ -82,7 +82,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper summarize [-h] [--file-checks] [-d] [--sp SUBPROJECT]
config_file
@@ -103,7 +103,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper destroy [-h] [--file-checks] [-d] [--sp SUBPROJECT] config_file
Remove all files of the project.
@@ -123,7 +123,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper check [-h] [--file-checks] [-d] [--sp SUBPROJECT] config_file
Checks flag status of current runs.
@@ -143,7 +143,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper clean [-h] [--file-checks] [-d] [--sp SUBPROJECT] config_file
Runs clean scripts to remove intermediate files of already processed jobs.
@@ -163,7 +163,7 @@ Here you can see the command-line usage instructions for the main looper command

.. code-block:: none
version: 0.6.0-dev
version: 0.6.0
usage: looper [-h] [-V] [--logfile LOGFILE] [--verbosity {0,1,2,3,4}] [--dbg]
{run,summarize,destroy,check,clean} ...
