A DSL for data-driven computational pipelines


"Dataflow variables are spectacularly expressive in concurrent programming"
Henri E. Bal, Jennifer G. Steiner, Andrew S. Tanenbaum


Quick overview

Nextflow is a bioinformatics workflow manager that enables the development of portable and reproducible workflows. It supports deploying workflows on a variety of execution platforms including local, HPC schedulers, AWS Batch, Google Genomics Pipelines, and Kubernetes. Additionally, it helps you manage your workflow dependencies through built-in support for Conda, Docker, Singularity, and Modules.

Rationale

With the rise of big data, techniques to analyse and run experiments on large datasets are increasingly necessary.

Parallelization and distributed computing are the best ways to tackle this problem, but the tools commonly available to the bioinformatics community often lack good support for these techniques, or provide a model that fits the specific requirements of the bioinformatics domain poorly. Most of the time, they also require knowledge of complex tools or low-level APIs.

The Nextflow framework is based on the dataflow programming model, which greatly simplifies writing parallel and distributed pipelines without adding unnecessary complexity, letting you concentrate on the flow of data, i.e. the functional logic of the application or algorithm.

It doesn't aim to be yet another pipeline scripting language. Rather, it is built around the idea that the Linux platform is the lingua franca of data science, since it provides many simple command line and scripting tools, which by themselves are powerful, but when chained together facilitate complex data manipulations.

In practice, this means that a Nextflow script is defined by composing many different processes. Each process can execute a given bioinformatics tool or a script written in any scripting language, and Nextflow coordinates and synchronizes the execution of the processes based simply on the inputs and outputs they specify.

Quick start

Download the package

Nextflow does not require any installation procedure. Just download the distribution package by copying and pasting this command into your terminal:

curl -fsSL https://get.nextflow.io | bash

It creates the nextflow executable file in the current directory. You may want to move it to a folder accessible from your $PATH.
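
As a minimal sketch of that last step, the shell function below makes the downloaded launcher executable and moves it into a directory of your choice. The function name install_launcher and the "$HOME/bin" destination are illustrative assumptions, not part of Nextflow; any directory already listed in your $PATH works just as well.

```shell
# Hypothetical helper: make the downloaded launcher executable and move it
# into a directory that is (or will be) listed in your PATH.
install_launcher() {
  src="$1"        # path to the downloaded launcher, e.g. ./nextflow
  dest_dir="$2"   # target directory, e.g. "$HOME/bin" (an assumption)
  mkdir -p "$dest_dir" || return 1
  chmod +x "$src" || return 1
  mv "$src" "$dest_dir/"
}

# Example usage (uncomment after downloading the launcher):
# install_launcher ./nextflow "$HOME/bin"
# export PATH="$HOME/bin:$PATH"
```

If the destination is not yet in your $PATH, the export line only affects the current shell session; add it to your shell profile to make it permanent.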

Download from Conda

Nextflow can also be installed from Bioconda:

conda install -c bioconda nextflow 

Documentation

Nextflow documentation is available at http://docs.nextflow.io

HPC Schedulers

Nextflow supports common HPC schedulers, abstracting the submission of jobs from the user.

Currently the following clusters are supported:

For example, to submit the execution to an SGE cluster, create a file named nextflow.config in the directory where the pipeline is going to be launched, with the following content:

process {
  executor='sge'
  queue='<your execution queue>'
}

With this configuration, Nextflow executes the processes as SGE jobs using the qsub command. Your pipeline will behave like any other SGE job script, with the benefit that Nextflow automatically and transparently manages process synchronisation, file staging and un-staging, etc.
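
If you prefer to generate this file from the command line, the following sketch writes the same two settings into the launch directory. The helper name write_sge_config and the queue name used in the usage line are hypothetical; substitute the queue defined on your own cluster.

```shell
# Hypothetical helper: write a nextflow.config selecting the SGE executor
# into the directory from which the pipeline will be launched.
write_sge_config() {
  launch_dir="$1"   # directory where the pipeline is launched
  queue="$2"        # name of your SGE queue (site specific)
  cat > "$launch_dir/nextflow.config" <<EOF
process {
  executor='sge'
  queue='$queue'
}
EOF
}

# Example usage ("long" is a made-up queue name):
# write_sge_config . long
```

Note that this overwrites any nextflow.config already present in the target directory.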

Cloud support

Nextflow also supports running workflows across various clouds and cloud technologies. Nextflow can create AWS EC2 or Google GCE clusters and deploy your workflow. Managed solutions from both Amazon and Google are also supported through AWS Batch and Google Genomics Pipelines. Additionally, Nextflow can run workflows on either on-prem or managed cloud Kubernetes clusters.

Currently supported cloud platforms:

Tool management

Containers

Nextflow has first-class support for containerization. It supports both the Docker and Singularity container engines. Additionally, Nextflow can easily switch between container engines, enabling workflow portability.

process samtools {
  container 'biocontainers/samtools:1.3.1'

  """
  samtools --version 
  """

}

Conda environments

Conda environments provide another option for managing software packages in your workflow.

Environment Modules

Environment modules commonly found in HPC environments can also be used to manage the tools used in a Nextflow workflow.

Community

You can post questions or report problems using the Nextflow discussion forum or the Nextflow channel on Gitter.

Nextflow also hosts a yearly workshop showcasing researchers' workflows and advancements in the language. Talks from the past workshops are available on the Nextflow YouTube Channel.

The nf-core project is a community effort aggregating high-quality Nextflow workflows which can be used by the community.

Build from source

Required dependencies

  • Compiler: Java 8
  • Runtime: Java 8 or later

Build from source

Nextflow is written in Groovy (a scripting language for the JVM). A pre-compiled, ready-to-run package is available on the GitHub releases page, so it is not necessary to compile it in order to use it.

If you are interested in modifying the source code, or contributing to the project, it is worth knowing that the build process is based on the Gradle build automation system.

You can compile Nextflow by typing the following command in the project home directory on your computer:

make compile

The very first time you run it, it will automatically download all the libraries required by the build process. It may take some minutes to complete.

When complete, execute the program by using the launch.sh script in the project directory.

The self-contained runnable Nextflow packages can be created by using the following command:

make pack

To install the compiled packages, use the following command:

make install

Then you will be able to run nextflow using the nextflow launcher script in the project root folder.

Known compilation problems

Nextflow requires JDK 8 to be compiled. The Java compiler used by the build process can be chosen by setting the JAVA_HOME environment variable accordingly.
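
One way to do this for a single build, without changing your global environment, is sketched below. The helper name with_jdk and the JDK path in the usage line are assumptions for illustration; point it at wherever JDK 8 is installed on your system.

```shell
# Hypothetical helper: run a command with JAVA_HOME pointing at a chosen JDK,
# leaving the JAVA_HOME of the calling shell untouched.
with_jdk() {
  jdk_home="$1"
  shift
  if [ ! -d "$jdk_home" ]; then
    echo "JDK directory not found: $jdk_home" >&2
    return 1
  fi
  # env sets the variable only for the command that follows.
  env JAVA_HOME="$jdk_home" "$@"
}

# Example usage (the path is an assumption; use your own JDK 8 location):
# with_jdk /usr/lib/jvm/java-8-openjdk make compile
```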

If the compilation stops with the error java.lang.VerifyError: Bad <init> method call from inside of a branch, this is due to a bug affecting the following JDK releases:

  • 1.8.0 update 11
  • 1.8.0 update 20

Upgrade to a newer JDK to avoid this issue. Alternatively, a possible workaround is to define the following variable in your environment:

_JAVA_OPTIONS='-Xverify:none'

Read more at these links:

IntelliJ IDEA

Nextflow development with IntelliJ IDEA requires a recent version of the IDE (2019.1.2 or later).

If you have it installed on your computer, follow the steps below in order to use it with Nextflow:

  1. Clone the Nextflow repository to a directory on your computer.
  2. Open IntelliJ IDEA and choose "Import project" in the "File" menu.
  3. Select the Nextflow project root directory on your computer and click "OK".
  4. Then, choose the "Gradle" item in the "external module" list and click the "Next" button.
  5. Confirm the default import options and click "Finish" to finalize the project configuration.
  6. When the import process completes, select the "Project structure" command in the "File" menu.
  7. In the dialog that appears, click the "Project" item in the list on the left, and make sure that the "Project SDK" choice on the right contains Java 8.
  8. Set the code formatting options with the settings provided here.

Contributing

Project contributions are more than welcome. See the CONTRIBUTING file for details.

License

The Nextflow framework is released under the Apache 2.0 license.

Citations

If you use Nextflow in your research, please cite:

P. Di Tommaso, et al. Nextflow enables reproducible computational workflows. Nature Biotechnology 35, 316–319 (2017) doi:10.1038/nbt.3820

Credits

Nextflow is built on two great pieces of open source software, namely Groovy and Gpars.

YourKit is kindly supporting this open source project with its full-featured Java Profiler. Read more at http://www.yourkit.com
