# Computational Workflows for Biomedical Data

Welcome to the course Computational Workflows for Biomedical Data. Over the next two weeks, you will learn how to leverage nf-core pipelines to analyze biomedical data and gain hands-on experience in creating your own pipelines, with a strong emphasis on Nextflow and nf-core.

Course Structure:

- Week 1: You will use a variety of nf-core pipelines to analyze a publicly available biomedical study.
- Week 2: We will shift focus to learning the basics of Nextflow, enabling you to design and implement your own computational workflows.
- Final Project: In the last couple of days, you will apply your knowledge to create a custom pipeline for analyzing biomedical data using Nextflow and the nf-core template.

## Basics

If you have not installed all required software yet, please do so as soon as possible.

If you have already installed all software, go ahead and start answering the questions in this notebook. If you have any questions, don't hesitate to approach us.

1. What is nf-core?

nf-core is a community project that uses Nextflow to create standardized, open-source workflow components and pipelines.

2. How many pipelines are there currently in nf-core?

139 (84 released, 43 under development, 12 archived)

3. Are there any non-bioinformatic pipelines in nf-core?

Yes, for example:
- https://nf-co.re/rangeland/1.0.0/ (Earth observation / remote sensing)
- https://nf-co.re/meerpipe/dev/ (astronomy)
- https://nf-co.re/spinningjenny/dev/ (history)

4. Let's go back a couple of steps. What is a pipeline and what do we use it for?

A pipeline is a program that runs multiple individual tools in a predetermined order to process input data.

Pipelines are used to process and analyze data in a single run in which the version and parameters of each individual tool are fixed, so the same analysis can be repeated on other data.
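To make the definition above concrete, here is a minimal sketch of a pipeline in plain shell. The three functions are made-up stand-ins for real tools, and the input is toy data; the point is only that the order of the steps is fixed inside `run_pipeline`.

```shell
# Toy pipeline: three "tools" run in a fixed order on the input.
# The function names are hypothetical and only serve as illustration.
normalize() { tr 'A-Z' 'a-z'; }          # step 1: lowercase all identifiers
deduplicate() { sort | uniq; }           # step 2: sort and drop duplicate lines
count_records() { wc -l | tr -d ' '; }   # step 3: count the remaining lines

run_pipeline() {
  # The order of the steps is defined once, so every run treats the data the same way.
  normalize | deduplicate | count_records
}

# Three input lines, two of which differ only in case: prints 2
printf 'GeneA\ngenea\nGeneB\n' | run_pipeline
```

A workflow manager such as Nextflow adds what this sketch lacks: pinned tool versions, parallel execution, and caching of intermediate results.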

5. Why do you think nf-core adheres to strict guidelines?

To ensure that all pipelines in the project meet common standards, such as being open source, functional, and easy to use, which boosts reproducibility.

6. What are the main features of nf-core pipelines?

Components:
- dependencies (Nextflow plus a Docker profile or Conda environment)
- pipeline name
- pipeline specific parameters
- input data
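The components listed above map onto the parts of a `nextflow run` command. The sketch below only assembles and prints such a command, so it can be run without Nextflow installed. The variable names and the `results` output directory are assumptions chosen for illustration; the pipeline name, version, and profiles are the ones used later in this notebook. With the `test` profile, the input data comes from the pipeline's bundled test configuration.

```shell
# Assemble a pipeline command from its components (nothing is executed here).
pipeline="nf-core/differentialabundance"   # pipeline name
revision="1.5.0"                           # pinned pipeline version
profile="test,docker"                      # test data + Docker for dependencies
outdir="results"                           # hypothetical output directory

cmd="nextflow run $pipeline -r $revision -profile $profile --outdir $outdir"
echo "$cmd"
```

Options with a single dash (`-r`, `-profile`) are handled by Nextflow itself, while options with a double dash (`--outdir`) are pipeline parameters.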



## Let's start using the pipelines

1. Find the nf-core pipeline used to measure differential abundance of genes

https://nf-co.re/differentialabundance/1.5.0/

In [None]:
# run the pipeline in a cell 
# to run bash in jupyter notebooks, simply use ! before the command
# e.g.

!pwd


# For the tasks in the first week, please use the command line to run your commands and simply paste the commands you used in the respective cells!


In [1]:
!nextflow run hello


 N E X T F L O W   ~  version 25.04.7

Pulling nextflow-io/hello ...
 downloaded from https://github.com/nextflow-io/hello.git
Launching `https://github.com/nextflow-io/hello` [stupefied_pare] DSL2 - revision: 2ce0b0e294 [master]

executor >  local (4)
[b2/52b7a5] sayHello (4) | 4 of 4 ✔
Hello world!

Ciao world!

Hola world!

Bonjour world!



In [5]:
# write the string "pwd" to a file named test, then print the file
!echo pwd > test && cat test

In [None]:
# run the pipeline in the test profile using docker containers
# make sure to specify the version you want to use (use the latest one)
!nextflow run nf-core/differentialabundance -r 1.5.0 -profile test,docker --outdir ~/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/differentialabundance-testrun/



 N E X T F L O W   ~  version 25.04.7

Launching `https://github.com/nf-core/differentialabundance` [tender_mclean] DSL2 - revision: 3dd360fed0 [1.5.0]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/differentialabundance v1.5.0-g3dd360f
------------------------------------------------------
Core Nextflow options
  revision                    : 1.5.0
  ru

In [None]:
# repeat the run. What changed?
!nextflow run nf-core/differentialabundance -r 1.5.0 -profile test,docker --outdir ~/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/differentialabundance-testrun/



 N E X T F L O W   ~  version 25.04.7

Launching `https://github.com/nf-core/differentialabundance` [berserk_lichterman] DSL2 - revision: 3dd360fed0 [1.5.0]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/differentialabundance v1.5.0-g3dd360f
------------------------------------------------------
Core Nextflow options
  revision                    : 1.5.0

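The runtime is shorter: 5 minutes on the repeat run vs. 14 minutes on the initial run, presumably because the pipeline code and the Docker images had already been downloaded during the first run. Without `-resume`, all processes were still executed again.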

In [None]:
# now add -resume to the command. What changed?
!nextflow run nf-core/differentialabundance -r 1.5.0 -profile test,docker --outdir ~/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/differentialabundance-testrun/ -resume


 N E X T F L O W   ~  version 25.04.7

Launching `https://github.com/nf-core/differentialabundance` [sleepy_fourier] DSL2 - revision: 3dd360fed0 [1.5.0]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/differentialabundance v1.5.0-g3dd360f
------------------------------------------------------
Core Nextflow options
  revision                    : 1.5.0
  r

The runtime is much shorter: 35 seconds vs. 5 minutes for the repeat run without `-resume`. The output also differs: almost all processes were restored from the cached results of the previous runs, and only the creation of the Shiny app was executed again.
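The behaviour observed with `-resume` can be imitated with a few lines of shell. This is a toy model under simplifying assumptions, not Nextflow's actual implementation: the hypothetical `run_task` function keys each result on a checksum of the command line and stores it in `cache_dir`, which plays the role of the `work` directory.

```shell
# Toy model of resumable execution (illustration only).
cache_dir=$(mktemp -d)   # stands in for the work directory

run_task() {
  # Derive a cache key from the full command line.
  key=$(printf '%s' "$*" | cksum | cut -d ' ' -f 1)
  if [ -f "$cache_dir/$key" ]; then
    echo "cached: $*"            # result already exists, skip execution
  else
    "$@" > "$cache_dir/$key"     # run the command and store its output
    echo "ran: $*"
  fi
}

run_task echo hello                          # first call: the command runs
run_task echo hello                          # second call: served from the cache
rm -rf "$cache_dir"; mkdir -p "$cache_dir"   # comparable to deleting work/
run_task echo hello                          # cache is empty, so the command runs again
```

The last two lines show why a resumed run has nothing to reuse once the cached results are gone.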

Check out the current directory. Next to the outdir you specified, what else has changed?

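Two new folders were created next to the output directory: `work`, which holds the working directory of every task (intermediate files, scripts, and logs), and `.nextflow`, which holds Nextflow's cache and run history. In addition, every run writes a log file, `.nextflow.log`; logs from earlier runs are kept as `.nextflow.log.1`, `.nextflow.log.2`, and so on.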
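A quick way to check for these artifacts is sketched below. `report_artifacts` is a hypothetical helper, and its argument is assumed to be the directory from which `nextflow run` was launched; `.nextflow.log` is the default name of the most recent log file.

```shell
# Report which Nextflow artifacts exist in a launch directory.
report_artifacts() {
  launch_dir=$1
  for name in work .nextflow .nextflow.log; do
    if [ -e "$launch_dir/$name" ]; then
      echo "$name: present"
    else
      echo "$name: missing"
    fi
  done
}

# Check the current directory.
report_artifacts .
```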

In [None]:
# delete the work directory and run the pipeline again using -resume. What changed?
!pwd
!rm -r /home/marcelullose/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/work/
!nextflow run nf-core/differentialabundance -r 1.5.0 -profile test,docker --outdir ~/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/differentialabundance-testrun/ -resume

/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01

 N E X T F L O W   ~  version 25.04.7

Launching `https://github.com/nf-core/differentialabundance` [fervent_allen] DSL2 - revision: 3dd360fed0 [1.5.0]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/differentialabundance v1.5.0-g3dd360f
---------------------------------------------------

Since the cache was removed, nothing could be resumed: the runtime and the executed processes were the same as in the second run (the repeat without `-resume`).

## Let's look at the results

### What is differential abundance analysis?

Give the most important plots from the report:

/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/computational-workflows-2025/notebooks/day_01/differentialabundance-testrun/plots/exploratory/treatment/png


### QC
#### Dispersion Plot
![dispersion](/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/differentialabundance-testrun/plots/qc/treatment_mCherry_hND6_.deseq2.dispersion.png)

### Exploratory Analysis

Boxplot of relative abundances    |   PCA (2D) 
:------:|:-------:
![boxplot](/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/differentialabundance-testrun/plots/exploratory/treatment/png/boxplot.png) | ![pca2d](/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/differentialabundance-testrun/plots/exploratory/treatment/png/pca2d.png)


### Differential Analysis
#### Volcano Plot
![volcano](/home/marcelullose/Documents/SoSe25/ComputationalWorkflows/differentialabundance-testrun/plots/differential/treatment_mCherry_hND6_sample_number/png/volcano.png)
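To read a volcano plot, it helps to know how a single point is placed. By convention, the x-axis shows the log2 fold change between two conditions and the y-axis shows the negative log10 of the (adjusted) p-value. The sketch below computes both coordinates for one gene; `volcano_coords` is a hypothetical helper and the numbers are made up, not taken from the pipeline output.

```shell
# Compute volcano plot coordinates for one gene (illustrative values only).
volcano_coords() {
  # $1 = mean abundance in the reference condition
  # $2 = mean abundance in the treatment condition
  # $3 = adjusted p-value
  awk -v ref="$1" -v trt="$2" -v p="$3" 'BEGIN {
    # awk only has the natural logarithm, so divide by log(base)
    printf "%.2f %.2f\n", log(trt / ref) / log(2), -log(p) / log(10)
  }'
}

# A fourfold increase with an adjusted p-value of 0.001: prints "2.00 3.00"
volcano_coords 2 8 0.001
```

Genes that are both strongly changed and statistically significant therefore end up in the upper left and upper right corners of the plot.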