# Day 6 - Nextflow Basics: Introduction to channels and operators

Today, we will begin exploring Nextflow, the programming language that powers the advanced nf-core pipelines you worked with last week. Your task is to dive into the core concepts and syntax of Nextflow, understanding how this language enables the development of scalable and reproducible workflows.

Nextflow works quite differently from traditional programming languages like Python or Java that you may already be familiar with. To get started, it is essential to understand the foundational concepts that set Nextflow apart.

### 1. Describe the concept of Workflows, Processes and Channels we deal with in Nextflow

Workflows:
The overall structure of a pipeline. They define the order of execution by connecting processes through channels. In DSL2 they are written inside a workflow { } block.

Processes:
The computational steps of a pipeline. Each process defines its inputs, outputs, and a script (bash, python, R, etc.). A process is executed automatically for each element received from a channel.

Channels:
The data streams between processes. They carry values or files. Each new element in a channel triggers a new execution of a process. Channels can be created and transformed with built-in operators.

## Introduction to channels

Please refer to the file $\texttt{channels\_intro.nf}$ for the next exercises, then run each task below with its respective `--step` flag.

In [None]:
# Task 1 - Create a channel that enumerates the numbers from 1 to 10
!nextflow run channels_intro.nf --step 1

In [None]:
# Task 2 - Create a channel that gives out the entire alphabet
!nextflow run channels_intro.nf --step 2

In [None]:
# Task 3 - Create a channel that includes all files in the "files_dir" directory
!nextflow run channels_intro.nf --step 3

In [None]:
# Task 4 - Create a channel that includes all TXT files in the "files_dir" directory
!nextflow run channels_intro.nf --step 4

In [None]:
# Task 5 - Create a channel that includes the files "fastq_1.fq" and "fastq_2.fq" in the "files_dir" directory
!nextflow run channels_intro.nf --step 5

In [None]:
# Task 6 - Go back to the task where you included all files. Are you sure that really ALL files are included? If not, how can you include them?
!nextflow run channels_intro.nf --step 6

In [None]:
# Task 7 - Get all file pairs in the "files_dir" directory
!nextflow run channels_intro.nf --step 7

## Now that you have a solid understanding of the basic concepts of channels in Nextflow, it’s time to experiment and see how they work in practice.

To do so, Nextflow provides operators, which transform channels and pass information between them.

Please answer the questions in $\texttt{basic\_channel\_operations.nf}$ and run the code here. 
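As a rough mental model only (this is plain Python, not Nextflow syntax), the operators used in the tasks below behave much like ordinary sequence transformations. The sketch assumes a stand-in channel that emits 1, 2 and 3; the function name `emit` and the values are hypothetical and do not come from the exercise file.

```python
from itertools import chain, islice


def emit():
    """Stand-in for a channel that emits 1, 2 and 3 (hypothetical values)."""
    yield from (1, 2, 3)


first = next(emit())                              # analogous to first
last = list(emit())[-1]                           # analogous to last
first_two = list(islice(emit(), 2))               # analogous to take(2)
squared = [x * x for x in emit()]                 # analogous to map with a squaring closure
flat = list(chain.from_iterable([[1, 2], [3]]))   # analogous to flatten for one level of nesting
collected = [list(emit())]                        # analogous to collect: one item holding the whole list

print(first, last, first_two, squared, flat, collected)
```

The analogy is deliberately loose: real channels are asynchronous, so it only illustrates what each operator emits, not when.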

In [7]:
# Task 1 - Extract the first item from the channel
!nextflow run basic_channel_operations.nf --step 1

 N E X T F L O W   ~  version 25.04.7

Launching `basic_channel_operations.nf` [tiny_mercator] DSL2 - revision: d107e83b31

1


In [None]:
# Task 2 - Extract the last item from the channel
!nextflow run basic_channel_operations.nf --step 2

In [None]:
# Task 3 - Use an operator to extract the first two items from the channel
!nextflow run basic_channel_operations.nf --step 3

In [None]:
# Task 4 - Return the squared values of the channel
!nextflow run basic_channel_operations.nf --step 4

In [None]:
# Task 5 - Remember the previous task where you squared the values of the channel. Now, extract the first two items from the squared channel
!nextflow run basic_channel_operations.nf --step 5

In [None]:
# Task 6 - Remember when you used bash to reverse the output? Try to use map and Groovy to reverse the output
!nextflow run basic_channel_operations.nf --step 6

In [None]:
# Task 7 - Use fromPath to include all fastq files in the "files_dir" directory, then use map to return a pair containing the file name and the file path (Hint: include groovy code)
!nextflow run basic_channel_operations.nf --step 7

In [None]:
# Task 8 - Combine the items from the two channels into a single channel
!nextflow run basic_channel_operations.nf --step 8

In [None]:
# Task 9 - Flatten the list in the channel
!nextflow run basic_channel_operations.nf --step 9

In [None]:
# Task 10 - Collect the items of a channel into a list. What kind of channel is the output channel?
!nextflow run basic_channel_operations.nf --step 10

What kind of channel is the output channel?

The output channel is a **value channel** (also called a singleton channel): it holds a single item, here the collected list.

In [None]:
# Task 11 - From the input channel, create lists where each first item in the list of lists is the first item in the output channel, followed by a list of all the items it is paired with
!nextflow run basic_channel_operations.nf --step 11

In [None]:
# Task 12 - Create a channel that joins the input to the output channel. What do you notice?
!nextflow run basic_channel_operations.nf --step 12

Task 12 - What do you notice compared to Task 11?

In [None]:
# groupTuple() groups all items that share a key within a single channel.
# join() combines matching items from two separate channels and, by default, keeps only keys that exist in both.
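The difference between the two operators can be mimicked in plain Python. This is only a sketch of the semantics, not Nextflow code: the helper names `group_tuple` and `join` and the example data are hypothetical, and the `join` helper assumes each key appears at most once in the right-hand input.

```python
from collections import defaultdict


def group_tuple(items):
    """Group (key, value) pairs from ONE sequence by key."""
    groups = defaultdict(list)
    for key, value in items:
        groups[key].append(value)
    return [(key, values) for key, values in groups.items()]


def join(left, right):
    """Pair values from TWO sequences by key, keeping only keys found in both."""
    right_by_key = dict(right)  # assumes unique keys on the right
    return [(key, value, right_by_key[key])
            for key, value in left if key in right_by_key]


# Hypothetical data, not the channels from basic_channel_operations.nf.
pairs = [(1, "A"), (1, "B"), (2, "C"), (3, "D")]
left = [("X", 1), ("Y", 2), ("Z", 3)]
right = [("Z", 6), ("Y", 5), ("W", 4)]

grouped = group_tuple(pairs)
joined = join(left, right)
print(grouped)
print(joined)
```

In this toy data, the keys "X" and "W" each occur on one side only, so they are missing from the joined result, while grouping never drops a key.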

In [None]:
# Task 14 - Nextflow has the concept of maps. Write the names in the maps in this channel to a file called "names.txt". Each name should be on a new line. 
#           Store the file in the "results" directory under the name "names.txt"

!nextflow run basic_channel_operations.nf --step 14

!cat results/names.txt


## Now that we learned about Channels and Operators to deal with them, let's focus on Processes that make use of these channels.

Please answer the questions in $\texttt{basics\_processes.nf}$ and run the code here. 

In [8]:
# Task 1 - create a process that says Hello World! (add debug true to the process right after initializing to be able to print the output to the console)
!nextflow run basics_processes.nf --step 1

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [zen_brahmagupta] DSL2 - revision: 7d08a17680

executor >  local (1)
[d3/8cf475] SAYHELLO | 1 of 1 ✔
Hello World!



In [9]:
# Task 2 - create a process that says Hello World! using Python
!nextflow run basics_processes.nf --step 2

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [voluminous_poisson] DSL2 - revision: 7d08a17680

executor >  local (1)
[6b/c06088] SAYHELLO_PYTHON | 1 of 1 ✔
Hello World!



In [10]:
# Task 3 - create a process that reads in the string "Hello world!" from a channel and writes it to the command line
!nextflow run basics_processes.nf --step 3

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [adoring_easley] DSL2 - revision: 7d08a17680

executor >  local (1)
[b2/74a219] SAYHELLO_PARAM (1) | 1 of 1 ✔
Hello world!



In [12]:
# Task 4 - create a process that reads in the string "Hello world!" from a channel and writes it to a file.
!nextflow run basics_processes.nf --step 4

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [adoring_goodall] DSL2 - revision: c4be722323

executor >  local (1)
[1f/e58cc8] SAYHELLO_FILE (1) | 1 of 1 ✔



The file is created in both the results directory and the work directory.

In [13]:
# Task 5 - create a process that reads in a string and converts it to uppercase and saves it to a file as output. View the path to the file in the console
!nextflow run basics_processes.nf --step 5

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [kickass_lumiere] DSL2 - revision: c4be722323

executor >  local (1)
[17/4e4229] UPPERCASE (1) | 1 of 1 ✔
/Users/Nikita/Desktop/Studium/Master/Comp Workflows/computational-workflows-2025/notebooks/day_04/work/17/4e42295c6f657bac6322be8a03135a/uppercase.txt



In [14]:
# Task 6 - add another process that reads in the resulting file from UPPERCASE and prints the content to the console (debug true).
!nextflow run basics_processes.nf --step 6

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [stupefied_stallman] DSL2 - revision: c4be722323

executor >  local (2)
[e6/29cff3] UPPERCASE (1)  | 1 of 1 ✔
[0d/1b5216] PRINTUPPER (1) | 1 of 1 ✔
HELLO WORLD!



Compared to all the other runs, what changed in the output here, and why?

Task 5 uses `.view()` to show the file path, while Task 6 chains processes together - the UPPERCASE output file becomes input to PRINTUPPER, which displays the actual file contents ("HELLO WORLD!"). This demonstrates process chaining in Nextflow.
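The contrast between viewing a path and printing file contents can be mimicked in plain Python. This is a sketch of the idea only: the functions `uppercase` and `print_upper` are hypothetical stand-ins for the UPPERCASE and PRINTUPPER processes, and a temporary directory plays the role of the work directory.

```python
import tempfile
from pathlib import Path


def uppercase(text, workdir):
    """Write the upper-cased text to a file and return its path."""
    out = Path(workdir) / "uppercase.txt"
    out.write_text(text.upper())
    return out


def print_upper(path):
    """Read the file produced upstream and return its content."""
    return Path(path).read_text()


with tempfile.TemporaryDirectory() as tmp:
    produced = uppercase("Hello World!", tmp)   # upstream step hands over a file path
    content = print_upper(produced)             # downstream step consumes that file

print(produced.name)  # viewing the upstream output shows a path
print(content)        # the downstream step shows what is inside the file
```

The downstream function never sees the original string, only the file handed over by the upstream step, which is the same hand-off a channel performs between two processes.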

In [15]:
# Task 7 - based on the parameter "zip" (see at the head of the file), create a process that zips the file created in the UPPERCASE process either in "zip", "gzip" OR "bzip2" format.
#          Print out the path to the zipped file in the console
!nextflow run basics_processes.nf --step 7

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [evil_ptolemy] DSL2 - revision: c4be722323

executor >  local (2)
[f0/6fc237] UPPERCASE (1) | 1 of 1 ✔
[67/9dfd8f] COMPRESS (1)  | 1 of 1 ✔
/Users/Nikita/Desktop/Studium/Master/Comp Workflows/computational-workflows-2025/notebooks/day_04/work/67/9dfd8ffd246ba6ce727f1d495fb213/compressed.zip

In [16]:
# Task 8 - Create a process that zips the file created in the UPPERCASE process in "zip", "gzip" AND "bzip2" format. Print out the paths to the zipped files in the console
!nextflow run basics_processes.nf --step 8

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [wise_easley] DSL2 - revision: c4be722323

executor >  local (2)
[46/504035] UPPERCASE (1)    | 1 of 1 ✔
[23/eb6e1a] COMPRESS_ALL (1) | 1 of 1 ✔
[/Users/Nikita/Desktop/Studium/Master/Comp Workflows/computational-workflows-2

In [17]:
# Task 9 - Create a process that reads in a list of names and titles from a channel and writes them to a file.
#           Store the file in the "results" directory under the name "names.tsv"
!nextflow run basics_processes.nf --step 9

 N E X T F L O W   ~  version 25.04.7

Launching `basics_processes.nf` [amazing_brazil] DSL2 - revision: c4be722323

executor >  local (1)
[c9/629690] WRITETOFILE | 1 of 1 ✔



In [None]:
import pandas as pd

df = pd.read_csv("results/names.tsv", sep="\t")
df

## Now, let's try some more advanced Operators

Please answer the questions in $\texttt{advanced\_channel\_operations.nf}$ and run the code here. 

To come closer to actual pipelines, we introduce the concept of "meta-maps": dictionaries holding crucial metadata about a sample, which are passed through channels alongside the data.

Also, we will come back to samplesheets which you should remember from last week.
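To make the meta-map idea concrete, here is a small Python sketch that splits samplesheet rows into a metadata dictionary and a list of read files. It is an illustration rather than Nextflow code: the inline CSV is an assumed layout using the column names `sample`, `fastq_1`, `fastq_2` and `strandedness`, and the helper `to_meta_and_reads` is hypothetical.

```python
import csv
import io

# Assumed samplesheet layout with two example rows.
SAMPLESHEET = """sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,fq_1_R1.fastq.gz,fq_1_R2.fastq.gz,auto
CONTROL_REP2,fq_2_R1.fastq.gz,fq_2_R2.fastq.gz,forward
"""


def to_meta_and_reads(row):
    """Split one samplesheet row into [meta-map, [fastq_1, fastq_2]]."""
    meta = {"sample": row["sample"], "strandedness": row["strandedness"]}
    reads = [row["fastq_1"], row["fastq_2"]]
    return [meta, reads]


records = [to_meta_and_reads(row)
           for row in csv.DictReader(io.StringIO(SAMPLESHEET))]

for record in records:
    print(record)
```

Keeping the metadata in one dictionary means downstream steps can read `meta["strandedness"]` without caring how many other columns the samplesheet has.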

In [18]:
# Task 1 - Read in the samplesheet.

!nextflow run advanced_channel_operations.nf --step 1

 N E X T F L O W   ~  version 25.04.7

Launching `advanced_channel_operations.nf` [confident_faraday] DSL2 - revision: 92d27bba08

[sample:CONTROL_REP1, fastq_1:fq_1_R1.fastq.gz, fastq_2:fq_1_R2.fastq.gz, strandedness:auto]
[sample:CONTROL_REP2, fastq_1:fq_2_R1.fastq.gz, fastq_2:fq_2_R2.fastq.gz, strandedness:forward]
[sample:CONTROL_REP3, fastq_1:fq_3_R1.fastq.gz, fastq_2:fq_3_R2.fastq.gz, strandedness:reverse]
[sample:CONTROL_REP1, fastq_1:fq_4_R1.fastq.gz, fastq_2:fq_4_R2.fastq.gz, strandedness:auto]

In [19]:
# Task 2 - Read in the samplesheet and create a meta-map with all metadata and another list with the filenames ([[metadata_1 : metadata_1, ...], [fastq_1, fastq_2]]).
#          Set the output to a new channel "in_ch" and view the channel. YOU WILL NEED TO COPY AND PASTE THIS CODE INTO SOME OF THE FOLLOWING TASKS (sorry for that).

!nextflow run advanced_channel_operations.nf --step 2

 N E X T F L O W   ~  version 25.04.7

Launching `advanced_channel_operations.nf` [zen_panini] DSL2 - revision: 92d27bba08

[[sample:CONTROL_REP1, strandedness:auto], [fq_1_R1.fastq.gz, fq_1_R2.fastq.gz]]
[[sample:CONTROL_REP2, strandedness:forward], [fq_2_R1.fastq.gz, fq_2_R2.fastq.gz]]
[[sample:CONTROL_REP3, strandedness:reverse], [fq_3_R1.fastq.gz, fq_3_R2.fastq.gz]]
[[sample:CONTROL_REP1, strandedness:auto], [fq_4_R1.fastq.gz, fq_4_R2.fastq.gz]]


In [20]:
# Task 3 - Now we assume that we want to handle different "strandedness" values differently. 
#          Split the channel into the right amount of channels and write them all to stdout so that we can understand which is which.

!nextflow run advanced_channel_operations.nf --step 3 -dump-channels

 N E X T F L O W   ~  version 25.04.7

Launching `advanced_channel_operations.nf` [cheeky_edison] DSL2 - revision: 92d27bba08

FORWARD: [[sample:CONTROL_REP2, strandedness:forward], [fq_2_R1.fastq.gz, fq_2_R2.fastq.gz]]
REVERSE: [[sample:CONTROL_REP3, strandedness:reverse], [fq_3_R1.fastq.gz, fq_3_R2.fastq.gz]]
AUTO: [[sample:CONTROL_REP1, strandedness:auto], [fq_1_R1.fastq.gz, fq_1_R2.fastq.gz]]
AUTO: [[sample:CONTROL_REP1, strandedness:auto], [fq_4_R1.fastq.gz, fq_4_R2.fastq.gz]]
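Splitting one stream into several named ones by a metadata value can be sketched in plain Python as follows. This is an analogy, not the Nextflow solution: the helper `branch_by_strandedness` is hypothetical, the records reuse the sample names shown in the output, and the three branch names are assumed to be the only strandedness values that occur.

```python
def branch_by_strandedness(records):
    """Route each [meta, reads] record to the list named after its strandedness."""
    branches = {"forward": [], "reverse": [], "auto": []}
    for meta, reads in records:
        # An unexpected strandedness value would raise a KeyError here.
        branches[meta["strandedness"]].append([meta, reads])
    return branches


records = [
    [{"sample": "CONTROL_REP1", "strandedness": "auto"}, ["fq_1_R1.fastq.gz", "fq_1_R2.fastq.gz"]],
    [{"sample": "CONTROL_REP2", "strandedness": "forward"}, ["fq_2_R1.fastq.gz", "fq_2_R2.fastq.gz"]],
    [{"sample": "CONTROL_REP3", "strandedness": "reverse"}, ["fq_3_R1.fastq.gz", "fq_3_R2.fastq.gz"]],
    [{"sample": "CONTROL_REP1", "strandedness": "auto"}, ["fq_4_R1.fastq.gz", "fq_4_R2.fastq.gz"]],
]

branches = branch_by_strandedness(records)
for name, items in branches.items():
    for item in items:
        print(f"{name.upper()}: {item}")
```

Each record ends up in exactly one branch, so the branch sizes add up to the number of input records.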

In [21]:
# Task 4 - Group together all files with the same sample-id and strandedness value.

!nextflow run advanced_channel_operations.nf --step 4

 N E X T F L O W   ~  version 25.04.7

Launching `advanced_channel_operations.nf` [gloomy_albattani] DSL2 - revision: 92d27bba08

[[CONTROL_REP1, auto], [[[sample:CONTROL_REP1, strandedness:auto], [fq_1_R1.fastq.gz, fq_1_R2.fastq.gz]], [[sample:CONTROL_REP1, strandedness:auto], [fq_4_R1.fastq.gz, fq_4_R2.fastq.gz]]]]
[[CONTROL_REP2, forward], [[[sample:CONTROL_REP2, strandedness:forward], [fq_2_R1.fastq.gz, fq_2_R2.fastq.gz]]]]
[[CONTROL_REP3, reverse], [[[sample:CONTROL_REP3, strandedness:reverse], [fq_3_R1.fastq.gz, fq_3_R2.fastq.gz]]]]

## It's finally time to link processes and channels with each other

Please go to the file $\texttt{link\_p\_c.nf}$

In [22]:
!nextflow run link_p_c.nf

 N E X T F L O W   ~  version 25.04.7

Launching `link_p_c.nf` [disturbed_kalman] DSL2 - revision: 3834507de4

executor >  local (5)
[84/d9b59a] SPLITLETTERS (2)   | 2 of 2 ✔
[67/b8f466] CONVERTTOUPPER (3) | 0 of 11

### Give a list with the paths to the chunk files

In [23]:
# Based on samplesheet_2.csv, the chunk files created are:
# From "Hello World" (block_size=4): "Hell", "o Wo", "rld"
# From "Computational Workflows" (block_size=3): "Com", "put", "ati", "ona", "l W", "ork", "flo", "ws"

chunk_files = [
    "results/chunk_h_w_1.txt",  # "Hell"
    "results/chunk_h_w_2.txt",  # "o Wo" 
    "results/chunk_h_w_3.txt",  # "rld"
    "results/chunk_c_w_1.txt",  # "Com"
    "results/chunk_c_w_2.txt",  # "put"
    "results/chunk_c_w_3.txt",  # "ati"
    "results/chunk_c_w_4.txt",  # "ona"
    "results/chunk_c_w_5.txt",  # "l W"
    "results/chunk_c_w_6.txt",  # "ork"
    "results/chunk_c_w_7.txt",  # "flo"
    "results/chunk_c_w_8.txt"   # "ws"
]

print("Chunk files created:")
for file in chunk_files:
    print(f"- {file}")
    
print(f"\nTotal number of chunk files: {len(chunk_files)}")

Chunk files created:
- results/chunk_h_w_1.txt
- results/chunk_h_w_2.txt
- results/chunk_h_w_3.txt
- results/chunk_c_w_1.txt
- results/chunk_c_w_2.txt
- results/chunk_c_w_3.txt
- results/chunk_c_w_4.txt
- results/chunk_c_w_5.txt
- results/chunk_c_w_6.txt
- results/chunk_c_w_7.txt
- results/chunk_c_w_8.txt

Total number of chunk files: 11
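The chunk counts above can be double-checked with a few lines of Python. This is a minimal sketch that assumes the strings are cut into consecutive blocks of `block_size` characters with no trailing newline; the helper `split_into_chunks` is hypothetical and is not how SPLITLETTERS itself is implemented.

```python
def split_into_chunks(text, block_size):
    """Cut text into consecutive pieces of at most block_size characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


hello = split_into_chunks("Hello World", 4)
workflows = split_into_chunks("Computational Workflows", 3)

print(hello)
print(workflows)
print(len(hello) + len(workflows))
```

The last chunk of each string is shorter than the block size because 11 is not a multiple of 4 and 23 is not a multiple of 3.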


### Why was CONVERTTOUPPER run so often?

In [24]:
# CONVERTTOUPPER runs 11 times because SPLITLETTERS creates 11 chunk files
# and the flatten operator emits each of them as a separate channel element:

# From "Hello World" (block_size=4): 3 chunks
# From "Computational Workflows" (block_size=3): 8 chunks
# Total: 11 chunks

# Each chunk file becomes a separate input to the CONVERTTOUPPER process.
# In Nextflow, when a channel contains multiple items (11 chunk files),
# the process runs once for each item in the channel.

# This demonstrates the core principle of Nextflow: 
# processes automatically parallelize over channel elements.

print("CONVERTTOUPPER runs 11 times because:")
print("1. SPLITLETTERS creates 11 chunk files total")
print("2. The flatten operator separates each file into individual channel elements")
print("3. CONVERTTOUPPER processes each file independently")
print("4. Nextflow automatically parallelizes process execution over channel items")

CONVERTTOUPPER runs 11 times because:
1. SPLITLETTERS creates 11 chunk files total
2. The flatten operator separates each file into individual channel elements
3. CONVERTTOUPPER processes each file independently
4. Nextflow automatically parallelizes process execution over channel items
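The effect of flattening followed by one task per element can be imitated in plain Python. This is an analogy under stated assumptions, not the pipeline's code: the two inner lists stand for the outputs of the two SPLITLETTERS tasks, the function `convert_to_upper` is a hypothetical stand-in for the process, and a thread pool stands in for the local executor.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain

# One list of chunks per SPLITLETTERS task (contents as listed in the notebook).
per_task_outputs = [
    ["Hell", "o Wo", "rld"],
    ["Com", "put", "ati", "ona", "l W", "ork", "flo", "ws"],
]

# Flattening turns 2 list-valued items into 11 single items.
items = list(chain.from_iterable(per_task_outputs))


def convert_to_upper(chunk):
    """Stand-in for one CONVERTTOUPPER task."""
    return chunk.upper()


# One independent call per item; map() returns results in input order.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(convert_to_upper, items))

print(len(per_task_outputs), len(items))
print(results)
```

In this sketch, skipping the flattening step would leave only two items, one list per upstream task, and therefore only two downstream calls.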
