docs: Fix problems with code blocks and broken internal link. (#1424)
Fix missing newlines between code:: console and code block (tutorial/setup.rst/snakefiles/deployment.rst).
Fix broken link to gitpod. (from tutorial/basics.rst to tutorials/setup.rst).
Remove some leading spaces (various places).
hwalinga committed Feb 24, 2022
1 parent 6ccb3d8 commit 5d4e7d8c4d7901c41bfb8f01c4b2c6551add59f7
Showing 3 changed files with 16 additions and 11 deletions.
@@ -36,7 +36,7 @@ following structure:
└── resources
In other words, the workflow code goes into a subfolder ``workflow``, while the configuration is stored in a subfolder ``config``.
Inside of the ``workflow`` subfolder, the central ``Snakefile`` marks the entrypoint of the workflow (it will be automatically discovered when running snakemake from the root of the above structure).
This main structure and the recommendations below are implemented in `this Snakemake workflow template <https://github.com/snakemake-workflows/snakemake-workflow-template>`_ that you can use to `create your own workflow repository with a single click on "Use this template" <https://github.com/snakemake-workflows/snakemake-workflow-template/generate>`_.
In addition to the central ``Snakefile``, rules can be stored in a modular way, using the optional subfolder ``workflow/rules``.
Such modules should end with ``.smk``, the recommended file extension of Snakemake.
@@ -172,7 +172,7 @@ We can extend above example in the following way:
rules.rna_seq_all.input,
default_target: True
Above, several things have changed.

* First, we have added another module ``rna_seq``.
* Second, we have added a prefix to all non-absolute input and output file names of both modules (``prefix: "dna-seq"`` and ``prefix: "rna-seq"``) in order to avoid file name clashes.
@@ -277,13 +277,13 @@ Instead of using a concrete path, it is also possible to provide a path containi

Note that conda environments are only used with ``shell``, ``script`` and the ``wrapper`` directive, not the ``run`` directive.
The reason is that the ``run`` directive has access to the rest of the Snakefile (e.g. globally defined variables) and therefore must be executed in the same process as Snakemake itself.

Further, note that search path modifying environment variables like ``R_LIBS`` and ``PYTHONPATH`` can interfere with your conda environments.
Therefore, Snakemake automatically deactivates them for a job when a conda environment definition is used.
If you know what you are doing, in order to deactivate this behavior, you can use the flag ``--conda-not-block-search-path-envvars``.
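
The blocking of search-path variables described above can be pictured with a small sketch. This is not Snakemake's implementation; the function name ``without_search_path_vars`` and the default tuple of blocked names are assumptions for illustration, and only ``R_LIBS`` and ``PYTHONPATH`` are taken from the text.

```python
def without_search_path_vars(env, blocked=("R_LIBS", "PYTHONPATH")):
    """Return a copy of ``env`` with search-path variables removed.

    Hypothetical helper: the list of blocked names is an assumption,
    only the two variables named in the documentation are included.
    """
    # Build a new dict so the caller's environment is left untouched.
    return {key: value for key, value in env.items() if key not in blocked}


job_env = without_search_path_vars(
    {"PATH": "/usr/bin", "PYTHONPATH": "/home/user/lib", "R_LIBS": "/home/user/R"}
)
print(sorted(job_env))
```

A job started with ``job_env`` would no longer see the user's ``PYTHONPATH`` or ``R_LIBS``, so packages are resolved from the conda environment only.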

Snakemake will store the environment persistently in ``.snakemake/conda/$hash`` with ``$hash`` being the MD5 hash of the environment definition file content. This way, updates to the environment definition are automatically detected.
Note that you need to clean up environments manually for now. However, in many cases they are lightweight and consist of symlinks to your central conda installation.
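
To illustrate why hashing the file content detects updates, here is a minimal sketch. It assumes that only the raw bytes of the environment definition are hashed; the helper name ``env_hash`` is hypothetical, and Snakemake's actual hash computation may take further inputs into account.

```python
import hashlib


def env_hash(definition: bytes) -> str:
    """MD5 hex digest of an environment definition (hypothetical helper)."""
    return hashlib.md5(definition).hexdigest()


old = env_hash(b"dependencies:\n  - samtools =1.9\n")
new = env_hash(b"dependencies:\n  - samtools =1.10\n")

# Any change to the definition yields a different hash, and therefore a
# different directory below .snakemake/conda/.
print(old != new)
```

Because the directory name is derived from the content, an edited definition file maps to a fresh environment directory, while the old one stays on disk until it is cleaned up manually.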

Conda deployment also works well for offline or air-gapped environments. Running ``snakemake --use-conda --conda-create-envs-only`` will only install the required conda environments without running the full workflow. Subsequent runs with ``--use-conda`` will make use of the local environments without requiring internet access.

@@ -301,6 +301,7 @@ Therefore, the approach using environment definition files described above is hi
Nevertheless, in case you are still sure that you want to use an existing named environment, it can simply be put into the conda directive, e.g.

.. code-block:: python

    rule NAME:
        input:
            "table.txt"
@@ -314,7 +315,7 @@ Nevertheless, in case you are still sure that you want to use an existing named
For such a rule, Snakemake will just activate the given environment, instead of automatically deploying anything.
Instead of using a concrete name, it is also possible to provide a name containing wildcards (which must also occur in the output files of the rule), analogous to the specification of input files.

Note that Snakemake distinguishes file-based environments from named ones as follows:
if the given specification ends in ``.yaml`` or ``.yml``, Snakemake assumes it to be a path to an environment definition file; otherwise, it assumes the given specification
to be the name of an existing environment.
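
The suffix rule can be sketched in a few lines. The function name ``is_env_file`` is an assumption for illustration and not part of Snakemake's API.

```python
def is_env_file(spec: str) -> bool:
    """Return True if ``spec`` looks like an environment definition file.

    Hypothetical helper mirroring the rule in the text: a specification
    ending in .yaml or .yml is a file path, anything else is a name.
    """
    return spec.endswith((".yaml", ".yml"))


for spec in ("envs/mapping.yaml", "envs/stats.yml", "snakemake-tutorial"):
    kind = "definition file" if is_env_file(spec) else "named environment"
    print(f"{spec}: {kind}")
```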

@@ -140,7 +140,7 @@ Nevertheless, we can **execute our workflow** with
$ snakemake --cores 1 mapped_reads/A.bam
Whenever executing a workflow, you need to specify the number of cores to use.
For this tutorial, we will use a single core for now.
Later you will see how parallelization works.
Note that, after completion of above command, Snakemake will not try to create ``mapped_reads/A.bam`` again, because it is already present in the file system.
Snakemake **only re-runs jobs if one of the input files is newer than one of the output files or one of the input files will be updated by another job**.
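
The re-run criterion above can be written down as a small predicate. This is a simplified sketch, not Snakemake's scheduler: the function name ``needs_rerun`` is hypothetical, modification times are plain numbers, and treating a missing output (``None``) as a reason to run is an assumption added for completeness.

```python
def needs_rerun(input_mtimes, output_mtimes, input_will_be_updated=False):
    """Decide whether a job has to run (hypothetical, simplified).

    input_mtimes: modification times of the input files.
    output_mtimes: modification times of the output files, None if missing.
    input_will_be_updated: True if another job is going to rewrite an input.
    """
    if input_will_be_updated:
        return True
    # Assumption: an output that does not exist yet always triggers the job.
    if any(mtime is None for mtime in output_mtimes):
        return True
    newest_input = max(input_mtimes, default=float("-inf"))
    oldest_output = min(output_mtimes, default=float("inf"))
    # Re-run if some input is newer than some output.
    return newest_input > oldest_output


print(needs_rerun([100.0], [200.0]))  # output is up to date
print(needs_rerun([300.0], [200.0]))  # input changed after the output was written
```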
@@ -232,7 +232,7 @@ We add the following rule beneath the ``bwa_map`` rule:
.. sidebar:: Note

In the shell command above we split the string into two lines, which are however automatically concatenated into one by Python.
This is a handy pattern to avoid overly long shell command lines. When using it, make sure that each line but the last ends with a trailing whitespace,
so that the arguments remain properly separated.
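
The effect of the trailing whitespace can be checked in plain Python, since adjacent string literals are concatenated without any separator. The command below is only an illustrative stand-in, not necessarily the one used in the rule.

```python
# Each line but the last ends with a space: arguments stay separated.
good = (
    "samtools sort -T sorted_reads/A "
    "-O bam mapped_reads/A.bam"
)

# Missing trailing space: "sorted_reads/A" and "-O" are fused together.
bad = (
    "samtools sort -T sorted_reads/A"
    "-O bam mapped_reads/A.bam"
)

print(good.split())
print(bad.split())
```

In the second variant the shell would receive ``sorted_reads/A-O`` as a single argument, which is the kind of silent error the trailing whitespace prevents.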

This rule will take the input file from the ``mapped_reads`` directory and store a sorted version in the ``sorted_reads`` directory.
@@ -283,7 +283,7 @@ By executing
.. sidebar:: Note

If you went with: :ref:`tutorial-free-on-gitpod`, you can easily view the resulting ``dag.svg`` by right-clicking on the file in the explorer panel on the left and selecting ``Open With -> Preview``.


we create a **visualization of the DAG** using the ``dot`` command provided by Graphviz_.
@@ -50,6 +50,8 @@ To go through this tutorial, you need the following software installed:

However, don't install any of these manually now; we guide you through better ways below.

.. _tutorial-free-on-gitpod:

Run tutorial for free in the cloud via Gitpod
:::::::::::::::::::::::::::::::::::::::::::::

@@ -69,7 +71,7 @@ Running the tutorial on your local machine

If you prefer to run the tutorial on your local machine, please follow the steps below.

The easiest way to set up these prerequisites is to use the Mambaforge_ Python 3 distribution
(Mambaforge_ is a Conda-based distribution like Miniconda_, which, however, uses Mamba_, a fast and more robust replacement for the Conda_ package manager).
The tutorial assumes that you are using either Linux or MacOS X.
Both Snakemake and Mambaforge_ work also under Windows, but the Windows shell is too different to be able to provide generic examples.
@@ -170,11 +172,13 @@ First, we download some example data on which the workflow shall be executed:
Next we extract the data. On Linux, run

.. code:: console

    $ tar --wildcards -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml"

On MacOS, run

.. code:: console

    $ tar -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml"

This will create a folder ``data`` and a file ``environment.yaml`` in the working directory.
@@ -194,7 +198,7 @@ The ``environment.yaml`` file that you have obtained with the previous step (Ste
$ mamba env create --name snakemake-tutorial --file environment.yaml
If you don't have the Mamba_ command because you used a different conda distribution than Mambaforge_, you can also first install Mamba_
(which is a faster and more robust replacement for Conda_) in your base environment with

.. code:: console
