guide/nextflow_vdsl3/create-and-use-a-module.qmd

---
title: Create and use a module
order: 10
---


{{< include ../../_includes/_language_chooser.qmd >}}

```{r setup, include=FALSE}
repo_path <- system("git rev-parse --show-toplevel", intern = TRUE)
source(paste0(repo_path, "/_includes/_r_helper.R"))
source(paste0(repo_path, "/guide/component/_language_examples.R"))

temp_dir <- tempfile("create-a-module")
dir.create(temp_dir, recursive = TRUE, showWarnings = FALSE)
on.exit(unlink(temp_dir, recursive = TRUE), add = TRUE)

# create tempdir with modified files
langs <- langs %>%
  mutate(
    label = gsub("#", "\\\\#", label),
    config_path = paste0(temp_dir, "/", id, "/", basename(example_config)),
    script_path = paste0(temp_dir, "/", id, "/", basename(example_script))
  )
pwalk(
  langs %>% filter(id == "bash"),
  function(id, label, example_config, example_script, config_path, script_path, ...) {
  # create dir
  dir.create(paste0(temp_dir, "/", id), recursive = TRUE, showWarnings = FALSE)
  
  # copy script
  file.copy(example_script, script_path)
  file.copy(example_config, config_path)
})
```

Creating a VDSL3 module is as simple as adding `{ type: nextflow }` to the `platforms` section in the Viash config. Luckily, our previous example already contained such an entry:


::: {.panel-tabset}
```{r create-config, output="asis"}
pwalk(langs, function(id, label, example_config, ...) {
  qrt(
    "## {% label %}
    |
    |```yaml
    |{% paste(readr::read_lines(example_config), collapse = '\n    |') %}
    |```
    |
    |")
})
```
:::

## Generating a VDSL3 module {#generate-module}

We will now turn the Viash component into a VDSL3 module. By default, the `viash build` command will select the first platform in the list of platforms. To select the `nextflow` platform, use the `--platform nextflow` argument, or `-p nextflow` for short.

```{r viash-build-nxf}
#| echo: false
#| output: asis
id <- "bash"
qrt(
  "```{bash build-example}
  |viash build config.vsh.yaml -o target -p nextflow
  |```
  |
  |This will generate a Nextflow module in the `target/` directory:
  |
  |```{bash view-tree}
  |tree target
  |```
  |", 
  .dir = paste0(temp_dir, "/", id)
)
```

This `main.nf` file is both a [standalone Nextflow pipeline](#run-module) and a module which can be imported as [part of another pipeline](#include-module).

:::{.callout-tip}
In larger projects it's recommended to use the [`viash ns build`](/reference/cli/ns_build.qmd) command to [build all of the components](/guide/project/batch-processing.qmd) in one go. Give it a try!
:::

## Running a module as a standalone pipeline {#run-module}


Unlike typical Nextflow modules, VDSL3 modules can actually be used as a standalone pipeline.

To run a VDSL3 module as a standalone pipeline, you need to specify the input parameters and a `--publish_dir` parameter, as Nextflow will automatically choose the parameter names of the output files.

```{r nextflow-run, echo=FALSE, output="asis"}
id <- "bash"
qrt(
  "
  |You can run the executable by providing a value for `--input` and `--publish_dir`:
  |
  |```{bash}
  |nextflow run target/main.nf --input config.vsh.yaml --publish_dir output/
  |```
  |
  |This results in the following output:
  |
  |```{bash}
  |tree output
  |```
  |
  |The pipeline help can be shown by passing the `--help`
  |parameter (Output not shown).
  | 
  |```bash
  |nextflow run target/main.nf --help
  |```
  |",
  .dir = paste0(temp_dir, "/", id)
)
```


## Passing a parameter list {#param-list}

Every VDSL3 can accept a list of parameters to populate a Nextflow channel with.

```{r nextflow-run-param-list, echo=FALSE, output="asis"}
id <- "bash"
qrt(
  "
  |For example, we create a set of input files which we want to process in parallel.
  |
  |```{bash}
  |touch sample1.txt sample2.txt sample3.txt sample4.txt
  |```
  |
  |```{bash, echo=FALSE}
  |cat > param_list.yaml << HERE
  |- id: sample1
  |  input: {% paste0(temp_dir, '/', id) %}/sample1.txt
  |- id: sample2
  |  input: {% paste0(temp_dir, '/', id) %}/sample2.txt
  |- id: sample3
  |  input: {% paste0(temp_dir, '/', id) %}/sample3.txt
  |- id: sample4
  |  input: {% paste0(temp_dir, '/', id) %}/sample4.txt
  |HERE
  |```
  |
  |Next, we create a YAML file `param_list.yaml` containing an `id` 
  |and an `input` value for each parameter entry.
  |
  |```{embed, lang='yaml'}
  |param_list.yaml
  |```
  |
  |You can run the pipeline on the list of parameters using the `--param_list`
  |parameter.
  | 
  |```{bash}
  |nextflow run target/main.nf --param_list param_list.yaml --publish_dir output2
  |```
  |
  |This results in the following outputs:
  |
  |```{bash}
  |tree output2
  |```
  |
  |",
  .dir = paste0(temp_dir, "/", id)
)
```

:::{.callout-tip}
Instead of a YAML, you can also pass a JSON or a CSV to the `--param_list` 
parameter.
:::


## Module as part of a pipeline {#include-module}

This module can also be used as part of a Nextflow pipeline.
Below is a short preview of what this looks like.

```groovy
include { mymodule1 } from 'target/nextflow/mymodule1/main.nf'
include { mymodule2 } from 'target/nextflow/mymodule2/main.nf'

workflow {
  Channel.fromList([
    [
      // a unique identifier for this tuple
      "myid", 
      // the state for this tuple
      [
        input: file("in.txt"),
        module1_k: 10,
        module2_k: 4
      ]
    ]
  ])
    | mymodule1.run(
      // use a hashmap to define which part of the state is used to run mymodule1
      fromState: [
        input: "input",
        k: "module1_k"
      ],
      // use a hashmap to define how the output of mymodule1 is stored back into the state
      toState: [
        module1_output: "output"
      ]
    )
    | mymodule2.run(
      // use a closure to define which data is used to run mymodule2
      fromState: { id, state -> 
        [
          input: state.module1_output,
          k: state.module2_k
        ]
      },
      // use a closure to return only the output of module2 as a new state
      toState: { id, output, state ->
        output
      },
      auto: [
        publish: true
      ]
    )
}
```

We will discuss building pipelines with VDSL3 modules in more detail in [Create a pipeline](create-a-pipeline.qmd).