pATLASflow

A pipeline to run mapping, mash screen and assembly methods for pATLAS.

TL;DR

  1. Read files must be placed in the <current working dir>/reads/ folder.

  2. Fasta files must be placed in the <current working dir>/fasta/ folder.

  3. Run the pipeline with nextflow run tiagofilipe12/pATLASflow and the options you require:

    • Assembly: nextflow run tiagofilipe12/pATLASflow --assembly
    • Mapping: nextflow run tiagofilipe12/pATLASflow --mapping
    • Mash screen: nextflow run tiagofilipe12/pATLASflow --mash_screen

    Note: you can run all approaches at once with: nextflow run tiagofilipe12/pATLASflow --assembly --mapping --mash_screen
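
The steps above can be sketched as a small shell helper. This is only an illustration: patlasflow_cmd is a hypothetical function, not part of pATLASflow, and it assumes that --assembly reads from fasta/ while --mapping and --mash_screen read from reads/. It checks the folder layout and prints the command instead of running it.

```shell
# Hypothetical helper (not shipped with pATLASflow).
# Usage: patlasflow_cmd <working dir> <mode> [<mode> ...]
# Assumption: assembly needs fasta/, mapping and mash_screen need reads/.
patlasflow_cmd() {
    workdir="$1"; shift
    cmd="nextflow run tiagofilipe12/pATLASflow"
    for mode in "$@"; do
        case "$mode" in
            assembly)
                [ -d "$workdir/fasta" ] || { echo "missing $workdir/fasta/" >&2; return 1; }
                ;;
            mapping|mash_screen)
                [ -d "$workdir/reads" ] || { echo "missing $workdir/reads/" >&2; return 1; }
                ;;
            *)
                echo "unknown mode: $mode" >&2; return 1
                ;;
        esac
        # Each mode maps to the option of the same name
        cmd="$cmd --$mode"
    done
    echo "$cmd"
}

# Demo in a throwaway directory with the expected layout
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/reads" "$demo_dir/fasta"
patlasflow_cmd "$demo_dir" assembly mapping mash_screen
# prints: nextflow run tiagofilipe12/pATLASflow --assembly --mapping --mash_screen
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without Nextflow installed.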

Brief description

This Nextflow script is an implementation of mash-wrapper for the mash screen module. It outputs a JSON file that can be imported into pATLAS.

Requirements

pATLASflow is a Nextflow pipeline, so it does not require you to install third-party programs: they are provided through Docker containers, which can be run with either Docker or Singularity.

Conda recipe for nextflow

If you prefer, you can install Nextflow through its Bioconda recipe:

conda install nextflow
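
Before the first run, it may help to confirm which of the required programs are already on your PATH. The sketch below is illustrative only: missing_tools is a hypothetical helper, and the tool names passed to it assume the usual executable names (nextflow, docker, singularity).

```shell
# Hypothetical helper: prints each named tool that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# nextflow is needed in every case; only one of docker or singularity
# is needed, depending on the profile you plan to use.
missing_tools nextflow docker singularity
```

An empty output means all three were found; any name that is printed still needs to be installed if your chosen profile depends on it.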

Usage

Usage: nextflow run tiagofilipe12/pATLASflow [options] or nextflow run main.nf [options] or ./main.nf [options]

  Nextflow magic options:
       -profile    Forces nextflow to run with docker or singularity.   Default: standard     Choices: standard, singularity, slurm
   Main options:
       --help  Opens this help. It will open only when --help is provided. So, yes, this line is pretty useless since you already know that if you reached here.
       --version   Prints the version of the pipeline script.
       --mash_screen   Enables mash screen run.
       --assembly  Enables mash dist run to use fasta file against plasmid db
       --mapping   Enables mapping pipeline.
   Mash options:
       --kMer  The length of the k-mer to be used by mash.   Default: 21
       --pValue    The p-value cutoff. Default: 0.05
   Mash screen exclusive options:
       --identity  The minimum identity value between two sequences. Default: 0.9
       --noWinner  Disables the -w option of mash screen.  Default: false
   Mash dist exclusive options:
       --mash_distance     Provide the maximum distance between two plasmids to be reported.   Default: 0.1
   Reads options:
       --reads The path to the read files. Users may provide many samples in the same directory, but must make sure that the glob pattern matches each sample uniquely (e.g. 'path/to/*_{1,2}.fastq').
       --singleEnd Provide this option if you have single-end reads. By default the pipeline will assume that you provide paired-end reads.    Default: false
   Fasta options:
       --fasta     Provide fasta file pattern to be searched by nextflow.  Default: 'fasta/*.fas'
   Bowtie2 options:
       --trim5     Provide parameter -5 to bowtie2 allowing to trim 5' end.    Default: 0
       --cov_cutoff    Provide a cutoff value to filter results for coverage results.  Default: 0.60
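
To see what a pattern such as 'path/to/*_{1,2}.fastq' would pick up for --reads, a quick pre-flight check can be useful. The sketch below is an assumption-laden illustration: list_read_pairs is a hypothetical helper, and it assumes paired files share a sample prefix and end in _1.fastq and _2.fastq, as in the example pattern. It does not reproduce how the pipeline itself groups files.

```shell
# Hypothetical pre-flight check (not part of pATLASflow).
# For every *_1.fastq in a directory, print the sample name and whether
# the matching *_2.fastq is present.
list_read_pairs() {
    dir="$1"
    for r1 in "$dir"/*_1.fastq; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        sample="$(basename "$r1" _1.fastq)"
        if [ -e "$dir/${sample}_2.fastq" ]; then
            echo "$sample paired"
        else
            echo "$sample missing_mate"
        fi
    done
}

# Demo with empty placeholder files
demo_reads="$(mktemp -d)"
touch "$demo_reads/sampleA_1.fastq" "$demo_reads/sampleA_2.fastq" "$demo_reads/sampleB_1.fastq"
list_read_pairs "$demo_reads"
# prints:
#   sampleA paired
#   sampleB missing_mate

# Once every sample is paired, quote the pattern so the shell does not expand it:
#   nextflow run tiagofilipe12/pATLASflow --mapping --reads 'reads/*_{1,2}.fastq'
```

A sample reported as missing_mate either lacks its second file or is single-end data, in which case --singleEnd is the relevant option.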

Example run

nextflow run tiagofilipe12/pATLASflow --assembly

Slurm profile

One can run this pipeline using the slurm profile, which enables its use with shifter and slurm.

IMM-lobo users

To avoid using compute-1, uncomment line 62 in the nextflow.config file.

Results

Results will be placed in a folder named results within the current working directory.
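
To locate the JSON output for import into pATLAS after a run, something like the following may help. It is a sketch under assumptions: list_json_results is a hypothetical helper, and it assumes the JSON files carry a .json extension and may sit in subfolders of results.

```shell
# Hypothetical helper: lists every .json file under a results folder
# (defaults to ./results), sorted by path.
list_json_results() {
    dir="${1:-results}"
    [ -d "$dir" ] || { echo "no results folder at $dir" >&2; return 1; }
    find "$dir" -type f -name '*.json' | sort
}

# Usage, from the directory where the pipeline was launched:
#   list_json_results
```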