Guidelines

Guidelines and requirements for nf-core pipelines.

Requirements for nf-core pipelines

If you're thinking of adding a new pipeline to nf-core, please read the documentation about adding a new pipeline.

Workflow size and specificity

We apply a "not too big, not too small" rule to nf-core pipelines. This is deliberately fuzzy, but as a rule of thumb a workflow should contain at least three processes and be simple enough that a new user can realistically run it after spending ten minutes reading the docs. Pipelines should be general enough to be of use to multiple groups and research projects, but comprehensive enough to cover most steps in a primary analysis.

Pipelines should not overlap one another too much. For example, alternative tools and parameters for the same task should be offered as options within a single pipeline. However, if the purpose of the tasks and the results they produce differ, that should be a separate pipeline.

The above instructions are subject to interpretation and specific scenarios. If in doubt, please ask the community for feedback on Slack.

Minimum requirements

All nf-core pipelines must adhere to the following:

  • Be built using Nextflow
  • Have an MIT licence
  • Have software bundled using Docker
  • Have continuous integration testing
  • Have stable release tags
  • Common pipeline structure and usage
    • Standard filenames as supplied in the template, such as main.nf and docs/
    • Use the same command line option names as other pipelines for comparable options, e.g. --reads and --genome
  • Run in a single command
    • i.e. not multiple separate workflows in a single repository
    • It is OK to have workflows that use the output of another nf-core pipeline as input
  • Excellent documentation and GitHub repository keywords
  • A responsible contact person / GitHub username
    • This will typically be the main person behind the pipeline development
    • This person should be responsible for basic maintenance and questions
  • The pipeline must not have any failures in the nf-core lint tests
    • These tests are run by the nf-core/tools package and validate the requirements listed on this page.
    • You can see the list of tests and how to pass them on the error codes page.
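
As a rough illustration of the "standard filenames" requirement above, the following Python sketch reports which expected paths are missing from a pipeline directory. It is not the nf-core lint implementation: the function name missing_template_paths and the EXPECTED_PATHS tuple are hypothetical, and only main.nf and docs/ are taken from this page.

```python
from pathlib import Path
from typing import List

# Only main.nf and docs/ are named in the guidelines. Assumption: a real
# check would cover more of the template; extend this tuple as needed.
EXPECTED_PATHS = ("main.nf", "docs")


def missing_template_paths(repo_dir: str) -> List[str]:
    """Return the expected paths that are absent from repo_dir."""
    root = Path(repo_dir)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

An empty list means both paths are present. For the authoritative checks, run the nf-core lint tests described above.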

Recommended features

If possible, it's great if pipelines can also have:

  • All software bundled using bioconda
    • Nearly all nf-core pipelines use a Conda environment file to list their software requirements. The pipeline Docker images are then built from this file, meaning that with a single file your pipeline can support Nextflow users running with Conda, Docker or Singularity.
    • The nf-core template comes with all required code to support this setup.
  • Optimised output file formats
    • Pipelines should generate CRAM alignment files by default, but have a --bam option to generate BAM outputs if required by the user.
  • Digital object identifiers (DOIs) for easy referencing in literature
    • Typically each release should have a DOI generated by Zenodo. This can be automated through linkage with the GitHub repository.
  • Explicit support for running in cloud environments
  • Benchmarks from running on cloud environments such as AWS

Workflow name

All nf-core pipeline names should be lower case and without punctuation. This is to maximise compatibility with other platforms such as Docker Hub, which enforce such rules. In documentation, please refer to your pipeline as nf-core/yourpipeline.
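
The naming rule can be sketched as a small check. This is an illustrative Python snippet, not part of nf-core/tools: the helper names are hypothetical, and it assumes that "lower case and without punctuation" means lowercase ASCII letters and digits only.

```python
import re

# Assumption: digits are allowed alongside lowercase letters; the
# guideline itself only says "lower case and without punctuation".
NAME_PATTERN = re.compile(r"[a-z0-9]+")


def is_valid_pipeline_name(name: str) -> bool:
    """True if the whole name consists of lowercase letters and digits."""
    return NAME_PATTERN.fullmatch(name) is not None


def display_name(name: str) -> str:
    """Return the name in the nf-core/yourpipeline form used in docs."""
    if not is_valid_pipeline_name(name):
        raise ValueError(f"invalid pipeline name: {name!r}")
    return f"nf-core/{name}"
```

Under these assumptions, a name such as "rnaseq" passes, while "RNA-seq" is rejected because of the capitals and the hyphen.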

Coding style

The nf-core style requirements are growing and maturing over time. Typically, as we agree on a new standard we try to build a test for it into the nf-core lint command. As such, to get a feel for what's expected, please read the lint test error codes.

However, in general, pipelines must:

  • Use config profiles to organise hardware-specific options
  • Run with as little input as possible
    • Metadata (e.g. be able to run with just FASTQ files, where possible)
    • Reference files (e.g. auto-generate missing reference files, where possible)
  • Keep only code for the latest stable release on the master branch.
    • The main development code should be kept in a branch called dev
  • Use GitHub releases and keep a detailed changelog file
  • Follow a versioning approach, e.g. Semantic Versioning, for your pipeline releases
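
If you adopt Semantic Versioning, release tags can be checked and compared mechanically. The sketch below is a simplified Python illustration: it handles only the MAJOR.MINOR.PATCH core and ignores pre-release and build metadata, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Simplified pattern: three dot-separated numbers without leading zeros.
# Assumption: tags carry no "v" prefix and no pre-release/build suffix.
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_release(tag: str) -> Tuple[int, ...]:
    """Split a MAJOR.MINOR.PATCH tag into a tuple of integers."""
    match = SEMVER_CORE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {tag!r}")
    return tuple(int(part) for part in match.groups())


def is_newer(candidate: str, current: str) -> bool:
    """True if candidate is a later release than current."""
    return parse_release(candidate) > parse_release(current)
```

Comparing the parsed tuples rather than the raw strings avoids ordering mistakes such as treating 1.9.3 as later than 1.10.0.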