nf-core/smrnaseq is a bioinformatics best-practice analysis pipeline for small RNA sequencing data.
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, which make installation trivial and results highly reproducible.
- Install nextflow (>=20.04.0)
- Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs)
- Download the pipeline and test it on a minimal dataset with a single command:
    nextflow run nf-core/smrnaseq -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>
  Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use -profile <institute> in your command. This will enable either docker or singularity and set the appropriate execution settings for your local compute environment.
- Start running your own analysis!
    nextflow run nf-core/smrnaseq -profile <docker/singularity/podman/shifter/charliecloud/conda/institute> --input '*_R{1,2}.fastq.gz' --genome GRCh37
See usage docs for all of the available options when running the pipeline.
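Before launching the test run, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the pipeline: the function names `version_ge` and `find_profile` are hypothetical, the version comparison assumes GNU `sort -V` is available, and the mapping of the `charliecloud` profile to the `ch-run` executable is an assumption about how that tool is installed.

```shell
#!/usr/bin/env bash
# Sketch: check the Quick Start prerequisites before running the pipeline.
# The minimum version comes from the Quick Start list; everything else is illustrative.

MIN_NEXTFLOW="20.04.0"

# Succeeds if version $1 is greater than or equal to version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the first profile whose executable is found on PATH; returns 1 if none is found.
# Each entry is profile:executable (the ch-run mapping is an assumption).
find_profile() {
    local pair
    for pair in docker:docker singularity:singularity podman:podman shifter:shifter charliecloud:ch-run; do
        if command -v "${pair#*:}" >/dev/null 2>&1; then
            echo "${pair%%:*}"
            return 0
        fi
    done
    return 1
}

if command -v nextflow >/dev/null 2>&1; then
    # `nextflow -v` prints a one-line version string; keep the first X.Y.Z found in it.
    installed="$(nextflow -v 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
    if [ -n "$installed" ] && version_ge "$installed" "$MIN_NEXTFLOW"; then
        echo "Nextflow $installed is recent enough"
    else
        echo "Nextflow ${installed:-unknown} is older than $MIN_NEXTFLOW or could not be parsed"
    fi
else
    echo "nextflow not found on PATH"
fi

if profile="$(find_profile)"; then
    echo "Suggested profile: -profile test,$profile"
else
    echo "No container engine found; Conda is the last-resort option"
fi
```

The script only reports what it finds and never starts the pipeline, so it is safe to run on a login node before submitting anything.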
- Raw read QC (FastQC)
- Adapter trimming (Trim Galore!)
  - Insert Size calculation
  - Collapse reads (seqcluster)
- Alignment against miRBase mature miRNA (Bowtie1)
- Alignment against miRBase hairpin
- Post-alignment processing of miRBase hairpin
- Alignment against host reference genome (Bowtie1)
  - Post-alignment processing of alignment against host reference genome (SAMtools)
- Novel miRNAs and known miRNAs discovery (MiRDeep2)
  - Mapping against reference genome with the mapper module
  - Known and novel miRNA discovery with the mirdeep2 module
- miRNA quality control (mirtrace)
- Present QC for raw read, alignment, and expression results (MultiQC)
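To illustrate what the "Collapse reads" step in the list above means conceptually, the sketch below merges identical sequences from a FASTQ stream and reports each unique sequence with its count. It is an illustration using standard Unix tools only: it is not seqcluster's implementation, the function name `collapse_reads` and the tab-separated output are hypothetical, and it assumes every FASTQ record spans exactly four lines.

```shell
#!/usr/bin/env bash
# Sketch: collapse identical reads into unique sequences with counts.
# Illustration only; not seqcluster's algorithm or output format.

# Reads FASTQ on stdin and writes "sequence<TAB>count" lines, most abundant first.
collapse_reads() {
    # Line 2 of every 4-line FASTQ record holds the sequence.
    awk 'NR % 4 == 2' \
        | sort \
        | uniq -c \
        | sort -k1,1nr -k2,2 \
        | awk '{ print $2 "\t" $1 }'
}

# Toy input: three trimmed reads, two of which are identical.
printf '%s\n' \
    '@read1' 'TGAGGTAGTAGGTTGTATAGTT' '+' 'IIIIIIIIIIIIIIIIIIIIII' \
    '@read2' 'TGAGGTAGTAGGTTGTATAGTT' '+' 'IIIIIIIIIIIIIIIIIIIIII' \
    '@read3' 'TAGCTTATCAGACTGATGTTGA' '+' 'IIIIIIIIIIIIIIIIIIIIII' \
    | collapse_reads
```

Collapsing before alignment means each distinct sequence only needs to be aligned once, with its count carried along for quantification.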
The nf-core/smrnaseq pipeline comes with documentation about the pipeline: usage and output.
nf-core/smrnaseq was originally written for use at the National Genomics Infrastructure at SciLifeLab in Stockholm, Sweden, by Phil Ewels (@ewels), Chuan Wang (@chuan-wang) and Rickard Hammarén (@Hammarn). Updated by Lorena Pantano (@lpantano) from MIT.
If you would like to contribute to this pipeline, please see the contributing guidelines.
For further information or help, don't hesitate to get in touch on the Slack #smrnaseq channel (you can join with this invite).
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
In addition, references of tools and data used in this pipeline are as follows: