Skip to content
Computational pipelines for processing of paired-end Illumina reads
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
docs Sample location (#31) Jul 15, 2019
enviroments
example
legacy
pipeline/viral_identification/read_based Smolak_fve (#18) Mar 16, 2019
rules
src
.gitignore
License.txt
README.rst
config.yaml
snakelines.snake Sample location (#31) Jul 15, 2019

README.rst

Welcome to SnakeLines

Motivation

With decreasing price of massive parallel sequencing technologies, more and more laboratories are utilizing resulting sequences of DNA fragments for genomic analysis. An substantial obstacle for interpretation is transforming of sequenced data into results interpretable by clinicians and researchers without computational background. Laboratories are generally using computational pipelines consisting of several bioinformatic tools.

We propose several computational pipelines for processing of paired-end Illumina reads; including mapping, assembly, variant calling, viral identification, RNA-seq and metagenomics analysis. All provided pipelines are embedded into virtual environments that ensures isolation of required resources from host operating system, rapid deployment and reproducibilty of results accross different platforms.

How to execute pipelines

Source code of the SnakeLines pipelines can be downloaded from the Github repository. The documentation is accessible from the ReadTheDocs. See the Quick start or the Running SnakeLines section of the documentation to see instructions for execution of pipelines.

Why use SnakeLines

Several workflow management systems have been proposed to date, most notably Galaxy and Snakemake. SnakeLines extends traditional SnakeMake rules with following benefits:

Ready-to-use pipelines

SnakeLines contains wide set of computational pipelines that are ready to use immediately after downloading SnakeLines sources. In addition to standard secondary analysis, such as mapping and assembly, SnakeMake facilitate:

  • identification of variation in mapped reads
  • metagenomics analysis
  • transcriptomics
  • methylation profiles

Reporting

Each step of analysis is supported by graphical or tabular reports. Essential output and report files are stored in separate, report directory. In addition to reports that are generated for each sample individually, SnakeLines store also aggregated reports to examine quality of several samples at once.

No need to install additional tools

SnakeLines requires only Snakemake and Miniconda to be installed. The rest of bioinformatic tools required for analysis are compiled automatically. Tools are installed into separated virtual environments. This way, they do not break any installation or dependencies on the operating system.

Easily configurable

Each pipeline is parametrized in a single, YAML based configuration file. Configuration is useful overview over analysis, since every step of the analysis is configured there in sequential order. You may easily swap tools or change their parameters by adjusting the configuration file.

Easily extensible

Structure of the SnakeLines sources allows to easily extend existing pipelines, or just replace method in some step with custom solution.

You can’t perform that action at this time.