
Commit

Merge old docs update
nsheff committed Feb 17, 2017
1 parent 09a7195 commit e8f553b
Showing 9 changed files with 80 additions and 17 deletions.
16 changes: 10 additions & 6 deletions doc/source/advanced.rst
@@ -65,9 +65,15 @@ Check out the complete working example in the `microtest repository <https://git
Using cluster resource managers
****************************************

For each sample, ``looper`` creates one or more submission scripts. The ``compute`` settings specify how these scripts will be both produced and run. This makes it very portable and easy to change cluster management systems by just changing a few variables in a configuration file. By default, looper builds a shell script for each sample and runs them serially: the shell will block until each run is finished and control is returned to ``looper`` for the next iteration. Compute settings can be changed using an environment configuration file called ``looperenv``. Several common engines (SLURM and SGE) are supported by default, but the system gives you complete flexibility, so you can easily configure looper to work with your resource manager.
.. warning:: This is still in progress

For complete instructions on configuring your compute environment, see the looperenv repository at https://github.com/epigen/looperenv. Here's a brief overview, along with an example `looperenv` file:
Looper uses a template-based system for building scripts. By default, looper will just build shell scripts and run them serially. Compute settings can be changed using an environment script, which you point to with a shell environment variable called ``LOOPERENV``.

Complete instructions for configuring your compute environment are available in the looperenv repository at https://github.com/epigen/looperenv.

For each sample, `looper` will create one or more submission scripts. The `compute` settings specify how these scripts will be both produced and run. This makes it very portable and easy to change cluster management systems, or to just use local computing power like a laptop or standalone server, by changing the two variables in the `compute` section.

Example:

.. code-block:: yaml
@@ -81,15 +87,13 @@ For complete instructions on configuring your compute environment, see the loope
partition: queue_name
There are two sub-parameters in the compute section. First, ``submission_template`` is a (relative or absolute) path to the template submission script. Looper uses a template-based system for building scripts. This is a template with variables (encoded like ``{VARIABLE}``), which will be populated independently for each sample as defined in ``pipeline_interface.yaml``. The one variable ``{CODE}`` is a reserved variable that refers to the actual shell command that will run the pipeline. Otherwise, you can use any variables you define in your ``pipeline_interface.yaml``.
There are two sub-parameters in the compute section. First, `submission_template` is a (relative or absolute) path to the template submission script. This is a template with variables (encoded like `{VARIABLE}`), which will be populated independently for each sample as defined in `pipeline_interface.yaml`. The one variable ``{CODE}`` is a reserved variable that refers to the actual python command that will run the pipeline. Otherwise, you can use any variables you define in your `pipeline_interface.yaml`.
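To make the template mechanism concrete, here is a minimal Python sketch of how such a population step could work. This is an illustration only, not looper's actual implementation, and all placeholder names apart from the reserved ``{CODE}`` are hypothetical:

```python
# Illustrative sketch of template-based script building.
# A template contains {VARIABLE} placeholders; they are filled
# independently for each sample, with {CODE} reserved for the
# command that actually runs the pipeline.

template = """#!/bin/bash
#SBATCH --job-name={JOBNAME}
#SBATCH --partition={PARTITION}
{CODE}
"""

# Hypothetical per-sample values; in practice these would come from
# the sample annotation sheet and pipeline_interface.yaml.
sample_vars = {
    "JOBNAME": "frog_1_rnaseq",
    "PARTITION": "queue_name",
    "CODE": "python pipeline.py --input frog1.fq.gz",
}

script = template.format(**sample_vars)
print(script)
```

The populated ``script`` string is what would be written to disk and handed to the ``submission_command``.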

Second, the ``submission_command`` is the command-line command that ``looper`` will prepend to the path of the produced submission script to actually run it (``sbatch`` for SLURM, ``qsub`` for SGE, ``sh`` for localhost, etc.).
Second, the `submission_command` is the command-line command that `looper` will prepend to the path of the produced submission script to actually run it (`sbatch` for SLURM, `qsub` for SGE, `sh` for localhost, etc).

Example submission templates for `SLURM <https://github.com/epigen/looper/blob/master/templates/slurm_template.sub>`__, `SGE <https://github.com/epigen/looper/blob/master/templates/sge_template.sub>`__, and `local runs <https://github.com/epigen/looper/blob/master/templates/localhost_template.sub>`__ are available in the `Templates <https://github.com/epigen/looper/tree/master/templates>`__ directory.
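For orientation, a submission template is just a script with placeholders. A minimal SLURM-style sketch might look like the following (the ``{MEM}`` and ``{TIME}`` names here are illustrative assumptions; consult the linked repository templates for the real variable names):

```shell
#!/bin/bash
#SBATCH --job-name={JOBNAME}
#SBATCH --partition={PARTITION}
#SBATCH --mem={MEM}
#SBATCH --time={TIME}

# {CODE} is the reserved placeholder that is replaced with the
# actual command that runs the pipeline for this sample.
{CODE}
```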




Handling multiple input files with a merge table
****************************************

2 changes: 1 addition & 1 deletion doc/source/config-files.rst
Expand Up @@ -4,6 +4,7 @@ Configuration files

Looper uses `YAML <http://www.yaml.org/>`_ configuration files to describe a project. Looper is a very modular system, so there are a few different YAML files. Here's an explanation of each. Which ones you need to know about will depend on whether you're a pipeline user (running pipelines on your project) or a pipeline developer (building your own pipeline).


Pipeline users
*****************

@@ -28,4 +29,3 @@ If you want to add a new pipeline to looper, then there are two YAML files that
Finally, if you're using Pypiper to develop pipelines, it uses a pipeline-specific configuration file (detailed in the Pypiper documentation):

- `Pypiper pipeline config file <http://pypiper.readthedocs.io/en/latest/advanced.html#pipeline-config-files>`_: Each pipeline may have a configuration file describing where software is, and parameters to use for tasks within the pipeline
12 changes: 6 additions & 6 deletions doc/source/define-your-project.rst
Expand Up @@ -10,7 +10,7 @@ The format is simple and modular, so you only need to define the components you
1. **Project config file** - a ``yaml`` file describing input and output file paths and other (optional) project settings
2. **Sample annotation sheet** - a ``csv`` file with 1 row per sample

The first file (**project config**) is just a few lines of ``yaml`` in the simplest case. Here's a minimal example **project_config.yaml**:
In the simplest case, ``project_config.yaml`` is just a few lines of ``yaml``. Here's a minimal example **project_config.yaml**:


.. code-block:: yaml
@@ -21,9 +21,7 @@ The first file (**project config**) is just a few lines of ``yaml`` in the simpl
pipelines_dir: /path/to/pipelines/repository
The **output_dir** describes where you want to save pipeline results, and **pipelines_dir** describes where your pipeline code is stored.

The second file (**sample annotation sheet**) is where you list your samples. It is a comma-separated value (``csv``) file containing at least a few defined columns: a unique identifier column named ``sample_name``; a column named ``library`` describing the sample type (e.g. RNA-seq); and some way of specifying an input file. Here's a minimal example of **sample_annotation.csv**:
The **output_dir** describes where you want to save pipeline results, and **pipelines_dir** describes where your pipeline code is stored. You will also need a second file to describe samples, which is a comma-separated value (``csv``) file containing at least a unique identifier column named ``sample_name``, a column named ``library`` describing the sample type, and some way of specifying an input file. Here's a minimal example of **sample_annotation.csv**:


.. csv-table:: Minimal Sample Annotation Sheet
@@ -36,9 +34,11 @@ The second file (**sample annotation sheet**) is where you list your samples, wh
"frog_4", "RNA-seq", "frog4.fq.gz"


With those two simple files, you could run looper, and that's fine for just running a quick test on a few files. You just type: ``looper run path/to/project_config.yaml`` and it will run all your samples through the appropriate pipeline. In practice, you'll probably want to use some of the more advanced features of looper by adding additional information to your configuration ``yaml`` file and your sample annotation ``csv`` file. These advanced options are detailed below.
With those two simple files, you could run looper, and that's fine for just running a quick test on a few files. In practice, you'll probably want to use some of the more advanced features of looper by adding additional information to your configuration ``yaml`` file and your sample annotation ``csv`` file.

For example, by default, your jobs will run serially on your local computer, where you're running ``looper``. If you want to submit to a cluster resource manager (like SLURM or SGE), you just need to specify a ``compute`` section.
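A ``compute`` section along these lines might look like the following sketch (the template path is a placeholder; see the cluster documentation for details):

```yaml
compute:
  submission_template: templates/slurm_template.sub
  submission_command: sbatch
```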

Now, let's go through the more advanced details of both annotation sheets and project config files:
Let's go through the more advanced details of both annotation sheets and project config files:

.. include:: sample-annotation-sheet.rst

1 change: 1 addition & 0 deletions doc/source/faq.rst
@@ -6,3 +6,4 @@ FAQ
- How can I run my jobs on a cluster? See :ref:`cluster resource managers <cluster-resource-managers>`

- Which configuration file has which settings? Here's a list: :doc:`config files <config-files>`

1 change: 0 additions & 1 deletion doc/source/features.rst
Expand Up @@ -6,7 +6,6 @@ Simplicity for the beginning, power when you need to expand.

- **Flexible pipelines:** Use looper with any pipeline, any library, in any domain. We designed it to work with `Pypiper <http://pypiper.readthedocs.io/>`_, but looper has an infinitely flexible command-line argument system that will let you configure it to work with any script (pipeline) that accepts command-line arguments. You can also configure looper to submit multiple pipelines per sample.


- **Flexible compute:** If you don't change any settings, looper will simply run your jobs serially. But Looper includes a templating system that will let you process your pipelines on any cluster resource manager (SLURM, SGE, etc.). We include default templates for SLURM and SGE, but it's easy to add your own as well. Looper also gives you a way to determine which compute queue/partition to submit on-the-fly, by passing the ``--compute`` parameter to your call to ``looper run``, making it simple to use by default, but very flexible if you have complex resource needs.

- **Standardized project definition:** Looper defines a flexible standard format for describing projects, and there are other tools that can read these same formats. For example, we are working on an R package that will read the same project definition and provide all your sample metadata (and pipeline results) in an R analysis environment, with no additional effort. With a standardized project definition, the possibilities are endless.
1 change: 0 additions & 1 deletion doc/source/index.rst
@@ -38,7 +38,6 @@ Contents
faq.rst
changelog.rst


Indices and tables
==================

12 changes: 12 additions & 0 deletions doc/source/intro.rst
Expand Up @@ -2,6 +2,7 @@
Introduction
=====================================


Looper is a job submitting engine. If you have a pipeline and a bunch of samples you want to run through it, looper can help you organize the inputs and outputs. By default, it will just run your jobs sequentially on the local computer, but with a small configuration change, it will create and submit jobs to any cluster resource manager (like SLURM, SGE, or LSF).

Here's the idea: We essentially provide a format specification (the :ref:`project config file <project-config-file>`), which you use to describe your project. You create this single configuration file (in `yaml format <http://www.yaml.org/>`_), which includes:
@@ -18,6 +19,17 @@ Looper is modular and totally configurable, so it scales as your needs grow. We



Installing
******************************

You can install directly from GitHub using pip:

.. code-block:: bash

    pip install --user https://github.com/epigen/looper/zipball/master

Support
******************************
Please use the issue tracker at GitHub to file bug reports or feature requests: https://github.com/epigen/looper/issues.
51 changes: 49 additions & 2 deletions doc/source/tutorials.rst
@@ -14,6 +14,55 @@ First, install looper and pypiper (since our tutorial uses pypiper pipelines):
Now, you will need to grab a project to run, and some pipelines to run on it. We have a functional working project example and an open source pipeline repository on github.


.. code:: bash

    git clone https://github.com/epigen/microtest.git
    git clone https://github.com/epigen/open_pipelines.git

Now you can run this project with looper! Just use ``looper run``:

.. code:: bash

    looper run microtest/config/microtest_config.yaml

If the looper executable isn't in your path, check out the :doc:`FAQ <faq>`.
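One common cause: ``pip install --user`` places console scripts in ``~/.local/bin`` on Linux, which may not be on your ``PATH``. A quick fix is shown below (the exact location is an assumption; check ``python -m site --user-base`` on your system):

```shell
# Add the pip --user script directory to PATH for this session.
# Assumes the default Linux user-base location; adjust if yours differs.
export PATH="$HOME/.local/bin:$PATH"
```

To make this permanent, add the line to your ``~/.bashrc`` or equivalent shell startup file.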

Pipeline outputs
^^^^^^^^^^^^^^^^^^^^^^^^^^
Outputs of pipeline runs will be under the directory specified in the ``output_dir`` variable under the ``paths`` section in the project config file (see :doc:`config-files`); this is usually the name of the project being run.

Inside there will be two directories:

- ``results_pipeline`` [1]_ - a directory containing one subdirectory per sample, holding that sample's pipeline output.
- ``submissions`` [2]_ - which holds yaml representations of the samples and log files of the submitted jobs.


The sample-specific output of each pipeline type varies and is described in :doc:`pipelines`.

To use pre-made pipelines with your project, all you have to do is :doc:`define your project <define-your-project>` using looper's standard format. To link your own, custom built pipelines, you can :doc:`connect your pipeline to looper <connecting-pipelines>`.



.. rubric:: Footnotes

.. [1] This variable can also be specified in the ``results_subdir`` variable under the ``paths`` section of the project config file
.. [2] This variable can also be specified in the ``submission_subdir`` variable under the ``paths`` section of the project config file
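Taken together, the footnoted settings correspond to a ``paths`` section like this sketch (the values shown are placeholders, and the defaults listed are assumptions based on the directory names above):

```yaml
paths:
  output_dir: /path/to/project/output
  results_subdir: results_pipeline
  submission_subdir: submissions
```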
First, install looper and pypiper (since our tutorial uses pypiper pipelines):


.. code:: bash

    pip install --user https://github.com/epigen/looper/zipball/master
    pip install --user https://github.com/epigen/pypiper/zipball/master

Now, you will need to grab a project to run, and some pipelines to run on it. We have a functional working project example and an open source pipeline repository on github.


.. code:: bash

    git clone https://github.com/epigen/microtest.git
@@ -49,9 +98,7 @@ Inside there will be two directories:
To use pre-made pipelines with your project, all you have to do is :doc:`define your project <define-your-project>` using looper's standard format. To link your own, custom built pipelines, you can :doc:`connect your pipeline to looper <connecting-pipelines>`.



.. rubric:: Footnotes

.. [1] This variable can also be specified in the ``results_subdir`` variable under the ``metadata`` section of the project config file
.. [2] This variable can also be specified in the ``submission_subdir`` variable under the ``metadata`` section of the project config file
1 change: 1 addition & 0 deletions doc/source/usage-and-commands.rst
@@ -1,6 +1,7 @@
Usage and commands
******************************


Looper doesn't just run pipelines, it can also check and summarize the progress of your jobs, as well as remove all files created by them.

Each task is controlled by one of the four main commands ``run``, ``summarize``, ``destroy``, ``check``:
