Nextflow pipeline to run sourmash on input genomes

fmalmeida/sourmash-nf

Introduction

fmalmeida/sourmashnf is a small, straightforward bioinformatics pipeline that uses sourmash to compare genome sequences and plot the comparison, as in the sourmash tutorial: https://sourmash.readthedocs.io/en/latest/tutorial-basic.html#compare-many-signatures-and-build-a-tree.

Usage

Note: If you are new to Nextflow and nf-core, please refer to the nf-core documentation on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

The pipeline is very simple and has few optional parameters. The sourmash modules can be customized using the special ext.args directive, as explained here: https://nf-co.re/developers/modules#general
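
As a minimal sketch of the ext.args mechanism, the snippet below writes a custom config file that passes extra arguments to a single process. The file name custom.config and the process name SOURMASH_SKETCH are assumptions made for illustration (check the pipeline's modules for the actual process names), and the sourmash arguments are only an example.

```shell
# Write a hypothetical custom config that overrides the arguments of one process.
# ASSUMPTION: 'SOURMASH_SKETCH' is a placeholder; use the real process name from the pipeline.
cat > custom.config <<'EOF'
process {
    withName: 'SOURMASH_SKETCH' {
        // Example only: sketch DNA signatures with k=21 and scaled=1000
        ext.args = 'dna --param-string "scaled=1000,k=21"'
    }
}
EOF
```

Such a file would then be supplied with the Nextflow -c option, for example by appending -c custom.config to the usual command line.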

The usual command line looks like this:

nextflow run \
   fmalmeida/sourmash-nf \
      -profile <docker/singularity/conda> \
      --input <path to directory with input genomes> \
      --outdir <OUTDIR>

Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those passed with the Nextflow -c option, can be used to provide any configuration except for parameters; see the nf-core docs.
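
As an illustration of the -params-file route, the sketch below writes a small YAML file holding the two parameters shown in the usual command line. The file name params.yaml and the ./genomes input path are hypothetical placeholders.

```shell
# Write a hypothetical parameters file; keys match the --input and --outdir options.
# ASSUMPTION: ./genomes is a placeholder for your directory of input genomes.
cat > params.yaml <<'EOF'
input: "./genomes"
outdir: "./results"
EOF
```

The file could then replace the --input and --outdir flags, for example: nextflow run fmalmeida/sourmash-nf -profile docker -params-file params.yaml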

Command line help

To list the available parameters from the command line, run:

nextflow run fmalmeida/sourmash-nf --help

Pipeline output

To quickly generate and inspect example outputs, run the pipeline with the test profile:

nextflow run \
   fmalmeida/sourmash-nf \
      -profile docker,test \
      --outdir ./results

All outputs will be written to the "results" directory.

Credits

fmalmeida/sourmashnf was originally written by Felipe Marques de Almeida (@fmalmeida).

Contributions and Support

If you would like to contribute to this pipeline, please do so 😄.

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.