Nextflow Pipelines

Bulk RNA-seq

We recommend nf-core/rnaseq, a popular Nextflow pipeline for RNA sequencing analysis. It uses STAR, HISAT2 and Salmon, and produces gene counts and quality control reports.

Useful links:

Nextflow RNA-seq main: https://nf-co.re/rnaseq
Nextflow RNA-seq usage: https://nf-co.re/rnaseq/docs/usage#running-the-pipeline
Nextflow configuration: https://www.nextflow.io/docs/latest/config.html
Nextflow pipelines: https://nf-co.re/pipelines
Solution to SGE using the wrong shell: https://github.com/nextflow-io/nextflow/issues/21

Installation and test run

Log in to Wynton

ssh user@log2.wynton.ucsf.edu

ssh into a dev node

ssh dev2

From your home directory, download nextflow

curl -fsSL get.nextflow.io | bash

Make a bin directory (if you haven't already) and move nextflow there

mkdir -p ~/bin
mv nextflow ~/bin/
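
Moving nextflow into ~/bin only helps if that directory is on your PATH. Whether Wynton adds ~/bin to PATH by default is not stated here, so the following sketch, which assumes a bash-compatible shell, covers both cases:

```shell
# Add ~/bin to PATH for this session only if it is not already there.
case ":$PATH:" in
  *":$HOME/bin:"*)
    echo "$HOME/bin is already on PATH" ;;
  *)
    export PATH="$HOME/bin:$PATH"
    echo "added $HOME/bin to PATH for this session" ;;
esac
```

To make the change persistent, the export line could be appended to ~/.bashrc.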

Create a Nextflow configuration file at ~/.nextflow/config that selects the SGE executor, the smp parallel environment, and /bin/bash as the job shell

printf 'process.executor = "sge"\nprocess.penv = "smp"\nprocess.clusterOptions = "-S /bin/bash"' > .nextflow/config
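
The one-line printf is compact but hard to read. An equivalent sketch using a here-document writes the same three settings. It also creates ~/.nextflow first in case the directory does not exist yet, which is an added safeguard and not part of the original command:

```shell
# Write the same three SGE settings to ~/.nextflow/config.
mkdir -p "$HOME/.nextflow"
cat > "$HOME/.nextflow/config" <<'EOF'
process.executor = "sge"
process.penv = "smp"
process.clusterOptions = "-S /bin/bash"
EOF

# Show the result to confirm what Nextflow will read.
cat "$HOME/.nextflow/config"
```

Like the printf version, this overwrites any existing ~/.nextflow/config.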

Run the Nextflow test pipeline with the singularity profile. The console displays progress in real time. On the first run, a warning appears about the automatic creation of a Singularity cache directory.

nextflow run nf-core/rnaseq -profile test,singularity

The output will be in the results directory. Pipeline reports are in results/pipeline_info/. Note: if you get an error, try running it a second time.

ls results/
ls results/pipeline_info

Custom runs

Now you can set up and run the pipeline on your own data with steps like the following:

Copy your FASTQ files over to Wynton (see How to move data)
Specify max_memory, genome, reads and optional skip* arguments in the command (see the docs on reads, genome and the many other arguments that should be considered carefully). Flag names can differ between pipeline releases, so check the usage docs for the release you run.

nextflow run nf-core/rnaseq --max_memory '8.GB' --skipBiotypeQC --skipFastQC --skipTrimming --genome GRCh38 --reads '*_R{1,2}.fastq.gz' -profile singularity
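
Because the --reads pattern '*_R{1,2}.fastq.gz' expects every sample to have both an R1 and an R2 file, it can help to check the pairing before launching. This is a small sketch that assumes the files follow that naming. check_pairs is a hypothetical helper name, not part of Nextflow or the pipeline:

```shell
# check_pairs DIR: report R1 files in DIR that lack a matching R2 file.
# Returns 0 if every R1 has a mate, 1 otherwise (or if no R1 files exist).
check_pairs() {
  dir="${1:-.}"
  status=0
  found=0
  for r1 in "$dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    found=1
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz" # swap the R1 suffix for R2
    if [ ! -e "$r2" ]; then
      echo "missing mate for $r1"
      status=1
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no *_R1.fastq.gz files in $dir"
    status=1
  fi
  return "$status"
}

# Check the current directory before starting a run.
check_pairs . || echo "fix the inputs before running the pipeline"
```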

Pro-tips:

Review the execution_report.html to determine the necessary max_memory value for your analysis.
You may want to use screen or tmux to manage longer runs.
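
As an illustration of the tmux tip, a long run can be started inside a named, detached session so that it survives a dropped SSH connection. This is a sketch only: start_run is a hypothetical helper, the session name rnaseq is arbitrary, and it assumes tmux is installed on the dev node.

```shell
# start_run SESSION CMD: launch CMD in a detached tmux session named SESSION.
start_run() {
  if ! command -v tmux >/dev/null 2>&1; then
    echo "tmux not found; try screen instead" >&2
    return 1
  fi
  tmux new-session -d -s "$1" "$2"
}

# Example (not executed here):
#   start_run rnaseq "nextflow run nf-core/rnaseq -profile test,singularity"
#   tmux attach -t rnaseq     # reattach later; detach again with Ctrl-b d
```

The session lives on the node where it was started, so reattaching means logging back in to the same dev node first. The session closes when the command finishes, but the pipeline output remains in the results directory.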
