# Blog: Tweak config examples #115

**Merged** · 3 commits · May 25, 2023 · Changes from all commits
@@ -39,7 +39,7 @@ Depending on the workloads a cluster is designed to support, compute hosts may b

HPC workload managers have been around for decades. Initial efforts date back to the original [Portable Batch System](https://www.chpc.utah.edu/documentation/software/pbs-scheduler.php) (PBS) developed for NASA in the early 1990s. While modern workload managers have become enormously sophisticated, many of their core principles remain unchanged.

Workload managers are designed to share resources efficiently between users and groups. Modern workload managers support many different scheduling policies and workload types — from parallel jobs to array jobs to interactive jobs to affinity/NUMA-aware scheduling. As a result, schedulers have many "knobs and dials" to support various applications and use cases. While complicated, all of this configurability makes them extremely powerful and flexible in the hands of a skilled cluster administrator.

### Some notes on terminology

@@ -63,7 +63,7 @@ To ensure that pipelines are portable across clouds and HPC clusters, Nextflow u

You can specify the executor to use in the [nextflow.config](https://nextflow.io/docs/latest/config.html?highlight=queuesize#configuration-file) file, inline in your pipeline code, or by setting the shell variable `NXF_EXECUTOR` before running a pipeline.

```groovy
process.executor = 'slurm'
```
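
As noted above, the executor can also be set inline in the pipeline code or through the `NXF_EXECUTOR` shell variable. A minimal sketch of the inline form, using a hypothetical `align_reads` process (the directive then applies only to that process):

```groovy
// Hypothetical process showing the executor set inline as a process directive;
// the same choice can also come from nextflow.config or the NXF_EXECUTOR variable.
process align_reads {
    executor 'slurm'

    """
    echo "task script goes here"
    """
}
```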

@@ -238,17 +238,12 @@ Most HPC workload managers support the notion of queues. In a small cluster with

Workload managers typically have default queues. For example, `normal` is the default queue in LSF, while `all.q` is the default queue in Grid Engine. Slurm supports the notion of partitions that are essentially the same as queues, so Slurm partitions are referred to as queues within Nextflow. You should ask your HPC cluster administrator what queue to use when submitting Nextflow jobs.

Like the executor, queues are part of the process scope. The queue to dispatch jobs to is usually defined once in the `nextflow.config` file and applied to all processes in the workflow as shown below, or it can be set per-process.

```groovy
process {
    queue = 'myqueue'
    executor = 'sge'
}
```
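
The per-process form mentioned above uses a process selector within the same `process` scope. A minimal sketch, assuming a hypothetical process named `big_sort` and queue names (`short`, `long`) that your cluster may or may not define:

```groovy
process {
    executor = 'sge'
    queue = 'short'            // default queue for every process

    // Only the (hypothetical) big_sort process is sent to the long queue.
    withName: big_sort {
        queue = 'long'
    }
}
```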

@@ -266,9 +261,10 @@ Depending on the executor, you can pass various resource requirements for each p

When writing pipelines, it is good practice to consolidate per-process resource requirements in the `nextflow.config` file and use process selectors to indicate which resource requirements apply to which process steps. In the example below, processes are dispatched to the Slurm cluster by default, and each process requests two cores, 4 GB of memory, and a runtime of no more than 10 minutes. For the `foo` and long-running `bar` jobs, process-specific selectors can override these default settings as shown below:

```groovy
process {
    executor = 'slurm'
    queue = 'general'
    cpus = 2
    memory = '4 GB'
    time = '10m'

    // ... (withName: foo settings collapsed in this diff)

    withName: bar {
        queue = 'long'
        cpus = 32
        memory = '8 GB'
        time = '1h 30m'
    }
}
```
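
Selectors can also match groups of processes rather than individual names: if processes in the pipeline carry a `label` directive, a `withLabel:` selector applies one set of requirements to all of them. A hedged sketch, assuming a hypothetical `big_mem` label and a `highmem` queue that may not exist on your cluster:

```groovy
process {
    executor = 'slurm'

    // Applies to every process declared with `label 'big_mem'` in the pipeline.
    withLabel: big_mem {
        queue  = 'highmem'
        cpus   = 16
        memory = '128 GB'
    }
}
```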

@@ -294,19 +291,16 @@ Sometimes, organizations may want to take advantage of syntax specific to a part

These scheduler-specific commands can get very detailed and granular. They can apply to all processes in a workflow or only to specific processes. As an LSF-specific example, suppose a deep learning model training workload is a step in a Nextflow pipeline. The deep learning framework used may be GPU-aware and have specific topology requirements.

In this example, we specify a job consisting of two tasks where each task runs on a separate host and requires exclusive use of two GPUs. We also impose a resource requirement that we want to schedule the CPU portion of each CUDA job in physical proximity to the GPU to improve performance (on a processor core close to the same PCIe or NVLink connection, for example).

```groovy
process {
    withName: dl_workload {
        executor = 'lsf'
        queue = 'gpu_hosts'
        memory = '16 GB'
        clusterOptions = '-gpu "num=2:mode=exclusive_process" -n2 -R "span[ptile=1] affinity[core(1)]"'
    }
}
```
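
`clusterOptions` is not LSF-specific; it passes native options through to whichever scheduler the executor targets. As a rough Slurm-flavoured sketch (the queue name and flags are illustrative assumptions, and the exact GPU syntax depends on how the cluster is configured):

```groovy
process {
    withName: dl_workload {
        executor = 'slurm'
        queue = 'gpu'
        memory = '16 GB'
        // Passed through to sbatch: request two GPUs and exclusive use of the node.
        clusterOptions = '--gres=gpu:2 --exclusive'
    }
}
```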

@@ -327,8 +321,7 @@

```bash
$ cat submit_pipeline.sh
#BSUB -o out.%J
#BSUB -e err.%J
#BSUB -J headjob
#BSUB -R "rusage[mem=16GB]"
nextflow run nextflow-io/hello -c my.config -ansi-log false

$ bsub < submit_pipeline.sh
```

> **Member Author** (on the removed `export NFX_OPTS="-Xms=512m -Xmx=8g"` line): Removed as this isn't explained until the following section. Alternatively could reorder the sections, or just leave it here without explanation is probably fine too.
@@ -344,8 +337,8 @@ Setting the JVM’s max heap size is another good practice when running on an HP

These can be specified using the `NXF_OPTS` environment variable.

```bash
export NXF_OPTS="-Xms512m -Xmx8g"
```

The `-Xms` flag specifies the minimum heap size, and `-Xmx` specifies the maximum heap size. In the example above, the minimum heap size is set to 512 MB, which can grow to a maximum of 8 GB. You will need to experiment with appropriate values for each pipeline to determine how many concurrent head jobs you can run on the same host.
@@ -358,15 +351,15 @@ Nextflow requires a shared file system path as a working directory to allow the

Nextflow implements this best practice, which can be enabled by adding the following setting to your `nextflow.config` file.

```groovy
process.scratch = true
```

By default, if you enable `process.scratch`, Nextflow will use the directory pointed to by `$TMPDIR` as a scratch directory on the execution host.

You can optionally specify a specific path for the scratch directory as shown:

```groovy
process.scratch = '/ssd_drive/scratch_dir'
```

@@ -385,8 +378,8 @@ To learn more about Nextflow and how it works with various storage architectures

If you are launching your pipeline from a login node or cluster head node, it is useful to run pipelines in the background without losing the execution output reported by Nextflow. You can accomplish this by using the `-bg` switch in Nextflow and redirecting *stdout* to a log file as shown:

```bash
nextflow run <pipeline> -bg > my-file.log
```

This frees up the interactive command line to run commands such as [squeue](https://slurm.schedmd.com/squeue.html) (Slurm) or [qstat](https://gridscheduler.sourceforge.net/htmlman/htmlman1/qstat.html) (Grid Engine) to monitor job execution on the cluster. It is also beneficial because it prevents network connection issues from interfering with pipeline execution.
@@ -399,22 +392,15 @@ Getting resource requirements such as cpu, memory, and time is often challenging

To address this problem, Nextflow provides a mechanism that lets you modify the requested compute resources on the fly when a process fails, and re-execute it with a higher limit. For example:

```groovy
process {
    withName: foo {
        memory = { 2.GB * task.attempt }
        time = { 1.hour * task.attempt }

        errorStrategy = { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
        maxRetries = 3
    }
}
```

@@ -463,7 +449,7 @@ There are several additional Nextflow configuration options that are important t

`submitRateLimit` – Depending on the scheduler, having many users simultaneously submitting large numbers of jobs to a cluster can overwhelm the scheduler on the head node and cause it to become unresponsive to commands. To mitigate this, if your pipeline submits a large number of jobs, it is good practice to throttle the rate at which jobs are dispatched from Nextflow. By default, the job submission rate is unlimited. If you want to allow no more than 50 jobs to be submitted every two minutes, set this parameter as shown:

```groovy
executor.submitRateLimit = '50/2min'
executor.queueSize = 50
```
@@ -472,7 +458,7 @@ executor.queueSize = 50

When using these tools, it is helpful to associate a meaningful name with each job. Remember, a job in the context of the workload manager maps to a process or task in Nextflow. Use the `jobName` property associated with the executor to give your job a name. You can construct these names dynamically as illustrated below so the job reported by the workload manager reflects the name of our Nextflow process step and its unique ID.

```groovy
executor.jobName = { "$task.name - $task.hash" }
```

Expand Down