
How to run YAMP on an HPC

Alessia Visconti edited this page Mar 23, 2021 · 4 revisions

This tutorial explains how to run YAMP on high-performance computing (HPC) facilities.

Nextflow executor

To run on different systems (HPC or not), YAMP takes advantage of the Nextflow framework architecture and, specifically, of its executors. Briefly, a Nextflow executor is the component that determines on which system YAMP runs and that orchestrates its execution.

For instance, in the rosalind profile (Rosalind is King's College London's HPC, more on profiles here), the executor is set to:

process.executor = 'slurm'

but Nextflow supports multiple executors, among others:

  • SGE: executor = 'sge'
  • LSF: executor = 'lsf'
  • PBS: executor = 'pbs'
  • SLURM: executor = 'slurm'

You can find more information on the supported schedulers in the Nextflow documentation.

To run YAMP locally, you should not specify any executor.
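
As a rough illustration of how an executor could be set for a system other than Rosalind, the sketch below defines a custom profile in a Nextflow configuration file. The profile name my_cluster is hypothetical, and the choice of SGE is only an example; pick the executor that matches your scheduler.

```groovy
// Minimal sketch of a custom profile (the name 'my_cluster' is hypothetical).
profiles {
    my_cluster {
        // Submit every process through the SGE scheduler.
        // Replace 'sge' with 'lsf', 'pbs', 'slurm', etc. as appropriate.
        process.executor = 'sge'
    }
}
```

A profile defined this way would be selected with Nextflow's -profile command-line option (here, -profile my_cluster). When no executor is set, Nextflow falls back to its default local executor.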

Nextflow queue

Usually, schedulers offer users multiple queues to which jobs can be submitted. For instance, there may be queues dedicated to short or long jobs, or to jobs that require low or high memory. YAMP uses the Nextflow queue directive to specify the HPC queue(s) to be used.

For instance, in the rosalind profile, the queue is set to:

process.queue = 'brc'

To run on your system, simply specify the name of your queue(s), for instance:

process.queue = 'highmem,long-highmem'

You can find more information on the queue directive in the Nextflow documentation.
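
If different steps need different queues, Nextflow's process selectors can override the default queue for individual processes. The following is a minimal sketch, not part of the YAMP configuration: the queue names are taken from the example above, and the process name some_long_process is a placeholder to be replaced with the name of an actual process in the pipeline.

```groovy
// Sketch: a default queue plus a per-process override.
process {
    // Default queue for all processes.
    queue = 'highmem'

    // 'some_long_process' is a hypothetical process name.
    withName: 'some_long_process' {
        queue = 'long-highmem'
    }
}
```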

Other Nextflow directives

Nextflow provides a number of other directives for allocating the correct resources on the local or remote system; they are explained in the Nextflow documentation.

Please note that the process specifications in the ./conf/base.config file (that is, time, CPU, and memory) have been optimised using our in-house metagenomic dataset, which is composed of about 2000 faecal samples with very different data quality and thus very different requirements. These values may require some tuning, but we are confident that they will cover most users' scenarios.
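
Should the defaults need tuning, one possible approach is to override them in a separate configuration file using the standard Nextflow cpus, memory, and time directives. The sketch below is illustrative only: the values are not YAMP's defaults, and the process name some_heavy_process is a placeholder for one of the pipeline's actual processes.

```groovy
// Sketch: overriding resource requests (all values are illustrative).
process {
    // Defaults applied to every process.
    cpus   = 4
    memory = '16 GB'
    time   = '6h'

    // 'some_heavy_process' is a hypothetical process name.
    withName: 'some_heavy_process' {
        cpus   = 8
        memory = '64 GB'
        time   = '24h'
    }
}
```

Such a file can be passed to Nextflow with the -c command-line option, which merges it with the existing configuration so that only the listed settings are replaced.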