## Nextflow with SLURM Tutorial

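Let's run a Nextflow pipeline (nf-core/rnaseq) on a SLURM cluster, using `pcluster-helper` to generate the Nextflow configuration.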

In [1]:
module load pcluster-helpers

In [2]:
pcluster-helper --help

 Usage: pcluster-helper [OPTIONS] COMMAND [ARGS]...

 Helper functions for aws parallelcluster.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help      Show this message and exit.                                      │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ gen-nxf-slurm-config  Generate a slurm.config for nextflow that is           │

### Generate a Nextflow slurm.config

We'll use `pcluster-helper gen-nxf-slurm-config` to generate a default SLURM configuration file for Nextflow.

In [3]:
pcluster-helper gen-nxf-slurm-config --help

 Usage: pcluster-helper gen-nxf-slurm-config [OPTIONS]

 Generate a slurm.config for nextflow that is compatible with your cluster.
 You will see a process label for each partition/node type.
 Use the configuration in your process by setting the label to match the label
 in the config.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --output     -o  TEXT  Output path                                           │
│ -

In [4]:
pcluster-helper gen-nxf-slurm-config  --output slurm.config --overwrite

Generating NXF Slurm config
// *****************************************************
// SlurmExecutor
// https://github.com/nextflow-io/nextflow/blob/master/modules/nextflow/src/main/groovy/nextflow/executor/SlurmExecutor.groovy
// *****************************************************

singularity.autoMounts = true

profiles {
    slurm {
        slurm.enabled          = true
        singularity.enabled    = true
        params.enable_conda    = false
        docker.enabled         = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
}

process {
    executor='slurm'
    queueSize = 15
    pollInterval = '5 min'
    dumpInterval = '6 min'
    queueStatInterval = '5 min'
    exitReadTimeout = '13 min'
    killBatchSize = 30
    submitRateLimit = '20 min'

    //****************************************
    // Defaults
    //****************************************

    cpus = 1
    memory = ''
    // In order t
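
The help text above says to use the generated configuration by giving a process a label that matches a label in the config. As a minimal sketch of what that could look like, the cell below writes a tiny Nextflow script whose process carries such a label. The label `my_partition_label`, the process name `SAY_HELLO`, and the file name `label_demo.nf` are hypothetical placeholders, not names produced by `pcluster-helper`; substitute a label that actually appears in your generated `slurm.config`.

```shell
# Write a small DSL2 script into a scratch directory (all names are hypothetical).
demo_dir=$(mktemp -d)
cat > "$demo_dir/label_demo.nf" <<'EOF'
nextflow.enable.dsl = 2

process SAY_HELLO {
    // Hypothetical label: replace with one listed in your slurm.config
    label 'my_partition_label'

    output:
    stdout

    script:
    """
    echo "hello from \$(hostname)"
    """
}

workflow {
    SAY_HELLO()
    SAY_HELLO.out.view()
}
EOF

# Show the label line that the process requests
grep -n "label '" "$demo_dir/label_demo.nf"
```

With a real label in place, a script like this could be launched with `nextflow run label_demo.nf -c slurm.config`, so that the settings attached to that label apply to the process.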

We'll also want a default configuration for jobs whose process has no label. I'll choose a small instance type for this demonstration, but you should choose whichever instance best suits your workflows.

In [5]:
cat > ./slurm-default.config <<'EOF'
// Defaults applied to processes that do not set a label
process {
    executor = 'slurm'
    // mem = 12
    cpus = 2
    memory = ''
    clusterOptions = '--partition basic --constraint t3alarge'
}
EOF

In [6]:
wget https://raw.githubusercontent.com/nf-core/rnaseq/master/conf/test.config

--2022-06-23 22:28:35--  https://raw.githubusercontent.com/nf-core/rnaseq/master/conf/test.config
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2200 (2.1K) [text/plain]
Saving to: ‘test.config.1’


2022-06-23 22:28:35 (37.0 MB/s) - ‘test.config.1’ saved [2200/2200]



In [7]:
wget https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/samplesheet/v3.4/samplesheet_test.csv
wc -l < samplesheet_test.csv

--2022-06-23 22:28:35--  https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/samplesheet/v3.4/samplesheet_test.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1121 (1.1K) [text/plain]
Saving to: ‘samplesheet_test.csv.1’


2022-06-23 22:28:35 (35.4 MB/s) - ‘samplesheet_test.csv.1’ saved [1121/1121]

0
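
In the output above, `wget` wrote to `samplesheet_test.csv.1` because a file named `samplesheet_test.csv` already existed, and the line count reported for that older file was 0. A small guard along the lines of the sketch below can catch this before the pipeline starts. The function name `require_rows` and the demo files are hypothetical, and the CSV content is made up for illustration.

```shell
# require_rows FILE MIN: succeed only if FILE exists and has at least MIN lines.
require_rows() {
    local file=$1 min=$2 count
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    count=$(wc -l < "$file")
    if [ "$count" -lt "$min" ]; then
        echo "too few lines in $file: $count < $min" >&2
        return 1
    fi
}

# Demo on scratch files so nothing in the working directory is touched.
guard_dir=$(mktemp -d)
: > "$guard_dir/empty.csv"                                   # zero lines, like the count above
printf 'sample,fastq_1\nS1,a.fq.gz\n' > "$guard_dir/ok.csv"  # made-up header plus one row

require_rows "$guard_dir/ok.csv" 2 && echo "ok.csv passes"
require_rows "$guard_dir/empty.csv" 2 || echo "empty.csv rejected"
```

When re-running a download cell, `wget -O samplesheet_test.csv <url>` writes to the given name even if the file already exists, which avoids the `.1` suffix.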


### Reduce the number of samples

The samplesheet has 8 rows, which is more than this demonstration needs. I'll keep only the header line and the first sample.

In [8]:
head -n 2 samplesheet_test.csv > samplesheet_test_t.csv && mv samplesheet_test_t.csv samplesheet_test.csv
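
To see what that command keeps, here is a sketch on a made-up samplesheet (the column names and sample IDs are illustrative, not the real test data). `head -n 2` retains the header line plus the first data row, so the trimmed file ends up with two lines.

```shell
trim_dir=$(mktemp -d)

# Made-up samplesheet: one header line followed by three data rows.
printf '%s\n' \
    'sample,fastq_1,fastq_2' \
    'S1,s1_R1.fastq.gz,s1_R2.fastq.gz' \
    'S2,s2_R1.fastq.gz,s2_R2.fastq.gz' \
    'S3,s3_R1.fastq.gz,s3_R2.fastq.gz' > "$trim_dir/sheet.csv"

# Keep the header and the first sample, then replace the original file.
# The mv only runs if head succeeded.
head -n 2 "$trim_dir/sheet.csv" > "$trim_dir/sheet_t.csv" \
    && mv "$trim_dir/sheet_t.csv" "$trim_dir/sheet.csv"

cat "$trim_dir/sheet.csv"
```

Writing to a temporary file first matters here: redirecting `head` straight back onto its own input would truncate the file before it is read.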

In [9]:
module load nextflow

In [10]:
nextflow -h

Usage: nextflow [options] COMMAND [arg...]

Options:
  -C
     Use the specified configuration file(s) overriding any defaults
  -D
     Set JVM properties
  -bg
     Execute nextflow in background
  -c, -config
     Add the specified file to configuration set
  -d, -dockerize
     Launch nextflow via Docker (experimental)
  -h
     Print this help
  -log
     Set nextflow log file path
  -q, -quiet
     Do not print information messages
  -syslog
     Send logs to syslog server (eg. localhost:514)
  -v, -version
     Print the program version

Commands:
  clean         Clean up project cache and work directories
  clone         Clone a project into a folder
  config        Print a project configuration
  console       Launch Nextflow interactive console
  drop          Delete the local copy of a project
  help          Print the usage help for a command
  info          Print project and system runtime information
  kuberun       Execute a workflow in a Kubernetes cluster (experimental

In [11]:
nextflow run -h

Execute a pipeline project
Usage: run [options] Project name or repository url
  Options:
    -E
       Exports all current system environment
       Default: false
    -ansi-log
       Enable/disable ANSI console logging
    -bucket-dir
       Remote bucket where intermediate result files are stored
    -cache
       Enable/disable processes caching
    -disable-jobs-cancellation
       Prevent the cancellation of child jobs on execution termination
    -dsl1
       Execute the workflow using DSL1 syntax
       Default: false
    -dsl2
       Execute the workflow using DSL2 syntax
       Default: false
    -dump-channels
       Dump channels for debugging purpose
    -dump-hashes
       Dump task hash keys for debugging purpose
       Default: false
    -e.
       Add the specified variable to execution environment
       Syntax: -e.key=value
       Default: {}
    -entry
       Entry workflow name to be executed
    -h, -help
       Print the command usage
       Default: false
    -

In [12]:
export NXF_SINGULARITY_CACHEDIR=$HOME/.singularity/

timeout 120 nextflow \
    run \
    nf-core/rnaseq \
    -with-dag flowchart.png \
    -with-trace \
    -w /tmp/nxf-work \
    --input ./samplesheet_test.csv \
    -resume \
    -profile slurm \
    -c test.config \
    -c slurm-default.config \
    -c slurm.config \
    --outdir ./results || echo 'Complete example'

exit 0

N E X T F L O W  ~  version 22.04.0
WARN: It appears you have never run this project before -- Option `-resume` is ignored
Launching `https://github.com/nf-core/rnaseq` [nasty_lamarck] DSL2 - revision: 89bf536ce4 [master]


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/rnaseq v3.8.1
------------------------------------------------------
Core Nextflow options
  revision                  : master
  runName                   : nasty_lamarck
  containerEngine           : singularity
  launchDir                 : /scratch/ftp/user2135/interna
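
The run cell above wraps `nextflow` in `timeout 120` so the demonstration stops after two minutes, and `|| echo 'Complete example'` keeps the cell from failing when that happens. The sketch below shows the same pattern with `sleep` standing in for the pipeline, an assumption made so the example runs anywhere GNU coreutils `timeout` is available. GNU `timeout` exits with status 124 when the time limit is reached.

```shell
# `sleep 5` stands in for a long-running pipeline; the limit is 1 second.
status=0
timeout 1 sleep 5 || status=$?
echo "exit status: $status"

# Same pattern as the run cell: swallow the non-zero status so the cell succeeds.
timeout 1 sleep 5 || echo 'Complete example'
```

Note that `||` also hides genuine pipeline failures, so outside of a demonstration it may be worth checking for status 124 specifically.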

In [13]:
# cleanup: remove the extra copies that wget saved with a numeric suffix
rm -rf test.config.*
rm -rf samplesheet_test.csv.*
rm -rf .nextflow
sleep 1m
rm -rf .nextflow