.. _structure:
==================
Software Structure
==================
Overview
========
Sunbeam is a snakemake pipeline wrapped by a python library (``sunbeamlib``). Calling ``sunbeam run [args] [options]`` invokes this wrapper library, which in turn runs the necessary snakemake commands. The main Snakefile lives in the ``workflow/`` directory and uses rules from ``workflow/rules/`` and ``extensions/``, scripts from ``workflow/scripts/``, and environments from ``workflow/envs/``. Tests are run with pytest and live in the ``tests/`` directory. Documentation lives in ``docs/`` and is served by ReadTheDocs.
.. tip::
Some of these sections won't exist if you install via tar.
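The wrapper relationship described in the overview can be sketched roughly as follows. This is a minimal illustration, not sunbeamlib's actual code: ``build_snakemake_command`` and the ``/opt/sunbeam`` path are hypothetical, and the only assumption about snakemake itself is its standard ``--snakefile`` option.

```python
import shlex
from pathlib import Path


def build_snakemake_command(sunbeam_root, snakemake_args):
    """Return the argv list a wrapper might hand to snakemake.

    Hypothetical helper for illustration; not part of sunbeamlib's API.
    """
    snakefile = Path(sunbeam_root) / "workflow" / "Snakefile"
    # Extra arguments are passed through to snakemake unchanged.
    return ["snakemake", "--snakefile", str(snakefile), *snakemake_args]


# Example: the user's options end up appended to the snakemake call.
cmd = build_snakemake_command("/opt/sunbeam", ["--cores", "4"])
print(shlex.join(cmd))
```

Building the command as a list (rather than a single string) avoids shell quoting problems if it is later passed to ``subprocess.run``.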
Sections
========
sunbeam/ (root directory)
-------------------------
The root sunbeam directory holds a few important files, including ``environment.yml``, ``pyproject.toml``, ``Dockerfile``, and ``install.sh``. The environment file defines the dependencies required to run sunbeam and is used to create the main sunbeam environment. The pyproject file defines the structure and dependencies of sunbeamlib_ and makes it installable via pip. The Dockerfile defines the image containing all internal environments for containerized runs. The install script is used to install sunbeam and has its own page in the documentation.
.. tip::
``environment.yml`` defines the main sunbeam environment that you activate in order to run the pipeline. Internally, sunbeam then manages a number of other environments (defined in envs_) on a per-rule basis.
There is also ``.readthedocs.yaml``, which sets up the Sphinx build of the documentation so that it can import sunbeamlib, and ``MANIFEST.in``, which ensures the ``data/`` subdirectory is included when sunbeamlib is installed.
docs/
-----
Each page of the sunbeam documentation lives here as a ``.rst`` file. The remaining files are all involved in the setup and deployment of the docs to ReadTheDocs using Sphinx, and most of them are autogenerated by Sphinx. The one tricky part is importing the sunbeam version into the docs build. This is done in ``conf.py`` by adding the sunbeam root to ``sys.path`` and then importing ``sunbeamlib``, which stores the version tag in ``__version__``.
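The ``conf.py`` approach described above might look something like the sketch below. It is an assumption-laden illustration rather than the project's actual ``conf.py``: the helper names ``add_repo_root_to_path`` and ``read_version`` are hypothetical, and the fallback value is only there so the sketch runs even where sunbeamlib is not importable. ``release`` is the standard Sphinx configuration value for the full version string.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(conf_file):
    """Put the directory above docs/ on sys.path and return it."""
    repo_root = Path(conf_file).resolve().parent.parent
    if str(repo_root) not in sys.path:
        sys.path.insert(0, str(repo_root))
    return repo_root


def read_version(default="unknown"):
    """Import sunbeamlib and return its __version__, or a fallback string."""
    try:
        import sunbeamlib
    except ImportError:
        return default
    return getattr(sunbeamlib, "__version__", default)


# A real conf.py would pass __file__; a literal path keeps this sketch runnable anywhere.
root = add_repo_root_to_path("sunbeam/docs/conf.py")
release = read_version()
```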
.. _envs:
workflow/envs/
--------------
This directory contains ``.yml`` files defining environments that will be managed by snakemake as it runs. Anywhere that a rule is defined with ``conda: /path/to/ENV_NAME.yml``, when snakemake reaches that rule, that environment will be created if it doesn't exist already and then activated while running the rule. These environments are created in ``sunbeam/.snakemake/`` by default.
The accompanying files, named like ``ENV_NAME.ARCH.pin.txt``, are generated with ``snakedeploy``. They list every package and its exact version in a given environment (for the architecture they were generated on, e.g. linux-64) so that snakemake can first try to recreate that exact environment and, only if that fails, solve the ``.yml`` file itself.
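The naming convention for pin files can be illustrated with a small sketch. The helpers ``pin_file_for`` and ``choose_env_source`` are hypothetical, as is the ``example.yml`` file name, and this only models the file lookup; it does not reproduce how snakemake falls back when building from a pin file fails.

```python
from pathlib import Path


def pin_file_for(env_yml, arch="linux-64"):
    """Return the expected pin-file path that sits next to an environment file."""
    env_yml = Path(env_yml)
    return env_yml.with_name(f"{env_yml.stem}.{arch}.pin.txt")


def choose_env_source(env_yml, arch="linux-64"):
    """Prefer the pinned package list when it exists, else use the .yml file."""
    pin = pin_file_for(env_yml, arch)
    return pin if pin.exists() else Path(env_yml)


# The pin file shares the environment's base name and adds the architecture.
print(pin_file_for("workflow/envs/example.yml").name)
```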
extensions/
-----------
This directory contains any extensions you install with ``sunbeam extend`` or develop yourself, along with a ``.placeholder`` file that exists only to make sure the directory is always present. Each extension should live in its own directory whose name starts with ``sbx_``.
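As a rough illustration of the ``sbx_`` naming convention, the sketch below scans a directory and keeps only subdirectories with that prefix. ``list_extensions`` is a hypothetical helper, not a sunbeamlib function, and the directory names in the example are made up.

```python
import tempfile
from pathlib import Path


def list_extensions(extensions_dir):
    """Return sorted names of subdirectories that follow the sbx_ prefix convention."""
    return sorted(
        p.name
        for p in Path(extensions_dir).iterdir()
        if p.is_dir() and p.name.startswith("sbx_")
    )


# Build a throwaway directory that mimics extensions/ (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".placeholder").touch()
    (root / "sbx_example").mkdir()
    (root / "notes").mkdir()
    found = list_extensions(root)

print(found)  # ['sbx_example']
```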
workflow/rules/
---------------
This directory contains all of the snakemake rules that get imported by the main ``Snakefile``. The rules are organized into subdirectories by function, and each subdirectory has an associated environment in ``envs/`` in which its rules run.
workflow/scripts/
-----------------
This directory contains any python code that needs to be executed by snakemake rules. Each script is named after the rule that calls it.
.. _sunbeamlib:
src/sunbeamlib/
---------------
This directory contains the python library that acts as a runner and utility layer for the underlying snakemake pipeline. Many of the python files contain utility functions, while those prefixed by ``script_`` define the sunbeam commands. ``script_sunbeam.py`` takes in ``sunbeam [cmd]`` and routes it to the file matching the given command. The ``.yml/.yaml`` data files include the default config file and some sample config templates for running on a cluster. The directory also contains the default profile template and one for slurm.
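The routing step can be pictured as a simple lookup from subcommand name to handler. The sketch below is a generic dispatch pattern under that assumption, not the contents of ``script_sunbeam.py``; the handler functions and the ``COMMANDS`` table are hypothetical stand-ins for the per-command ``script_`` modules.

```python
def run_command(args):
    """Stand-in for the module that would handle ``sunbeam run``."""
    return f"run called with {args}"


def init_command(args):
    """Stand-in for the module that would handle ``sunbeam init``."""
    return f"init called with {args}"


# Maps each subcommand name to the function that handles it.
COMMANDS = {"run": run_command, "init": init_command}


def main(argv):
    """Route ``sunbeam [cmd] ...`` to the handler registered for ``cmd``."""
    if not argv or argv[0] not in COMMANDS:
        raise SystemExit(f"usage: sunbeam [{'|'.join(COMMANDS)}] ...")
    cmd, rest = argv[0], argv[1:]
    return COMMANDS[cmd](rest)


print(main(["run", "--cores", "4"]))  # run called with ['--cores', '4']
```

Keeping the table in one place means adding a command only requires a new handler and one new dictionary entry.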
tests/
------
This directory contains the tests for the core sunbeam pipeline. Under ``data/`` are raw, shortened bacterial genomes and host genomes used to generate the reads that serve as test input. ``e2e/`` contains end-to-end tests for each sunbeam program: config, extend, init, list_samples, and run. ``unit/`` contains unit tests split into two sections: ``rules/``, which tests each rule's logic individually, and ``sunbeamlib``, which tests functions within sunbeamlib.
Hidden Directories
------------------
.github/
********
This directory contains the ``PULL_REQUEST_TEMPLATE.md`` file, which defines a template for pull requests on the sunbeam repository, and ``ISSUE_TEMPLATE/``, which contains the repository's issue templates. It is also where the CI/CD job workflows live.
.snakemake/
***********
This directory is created the first time you run sunbeam. It will contain all the auxiliary environments created by snakemake (each environment will be named by a hash of the ``.yml`` file, so any changes to those files will result in a new environment being built). It also includes things like logs of previous runs and singularity images/builds if you use singularity.
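Why editing a ``.yml`` file leads to a fresh environment can be shown with a content hash. The sketch below is illustrative only: ``env_dir_name`` is a hypothetical function, the choice of SHA-256 and the 16-character truncation are assumptions, and snakemake's real naming scheme may hash additional inputs.

```python
import hashlib


def env_dir_name(yml_text):
    """Derive a directory name from an environment file's contents.

    Illustrative only; not snakemake's actual hashing scheme.
    """
    return hashlib.sha256(yml_text.encode("utf-8")).hexdigest()[:16]


# Made-up environment definitions that differ by a single version number.
original = "dependencies:\n  - python=3.12\n"
edited = "dependencies:\n  - python=3.13\n"

print(env_dir_name(original) == env_dir_name(original))  # True
print(env_dir_name(original) == env_dir_name(edited))    # False
```

Because the name depends on the file contents, an unchanged file maps to the existing environment, while any edit yields a new name and therefore a new build.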