# Computational Workflows for Biomedical Data

Welcome to the course Computational Workflows for Biomedical Data. Over the next two weeks, you will learn how to leverage nf-core pipelines to analyze biomedical data and gain hands-on experience in creating your own pipelines, with a strong emphasis on Nextflow and nf-core.

Course Structure:

- Week 1: You will use a variety of nf-core pipelines to analyze a publicly available biomedical study.
- Week 2: We will shift focus to learning the basics of Nextflow, enabling you to design and implement your own computational workflows.
- Final Project: In the last couple of days, you will apply your knowledge to create a custom pipeline for analyzing biomedical data using Nextflow and the nf-core template.

## Basics

If you have not yet installed all required software, please do so as soon as possible!


If you have already installed all software, go ahead and start answering the questions in this notebook. If you have any questions, don't hesitate to approach us.

1. What is nf-core?

nf-core is a curated collection of open-source pipelines built with Nextflow.

2. How many pipelines are there currently in nf-core?

There are currently 139 nf-core pipelines.

3. Are there any non-bioinformatic pipelines in nf-core?

Yes, for example the rangeland pipeline, which processes satellite imagery and related metadata to determine geographical trends, or the meerpipe pipeline for pulsar timing analysis in astronomy.

4. Let's go back a couple of steps. What is a pipeline and what do we use it for?

A pipeline is a connected set of programs that automatically applies a sequence of processing steps to input data. Pipelines are used to obtain reproducible insights from data without requiring the user to run each tool manually, while remaining scalable and easy to connect with other pipelines.
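
To make the idea concrete, here is a minimal sketch in plain Python, not Nextflow: each step is a function, and the pipeline feeds the output of one step into the next. The step names (trim_reads, read_lengths) and the toy reads are hypothetical and chosen only for illustration.

```python
from functools import reduce


def run_pipeline(data, steps):
    """Apply each step to the output of the previous one, in order."""
    return reduce(lambda result, step: step(result), steps, data)


# Hypothetical steps, for illustration only
def trim_reads(reads):
    """Strip trailing 'N' characters (unknown bases) from each read."""
    return [read.rstrip("N") for read in reads]


def read_lengths(reads):
    """Return the length of each read."""
    return [len(read) for read in reads]


reads = ["ACGTNN", "GGN", "TTAAC"]
lengths = run_pipeline(reads, [trim_reads, read_lengths])
print(lengths)  # [4, 2, 5]
```

Because the order of steps is fixed in code rather than typed by hand, the same input always goes through the same sequence, which is what makes the result reproducible.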

5. Why do you think nf-core adheres to strict guidelines?

Strict guidelines ensure that each pipeline is easy to use, scalable, compatible with other pipelines, and produces reproducible results.

6. What are the main features of nf-core pipelines?

They are built with Nextflow, and each component or process is strictly isolated using Docker or Conda to avoid dependency and version conflicts. This isolation also makes them easy to adapt or combine to tackle novel tasks.

## Let's start using the pipelines

1. Find the nf-core pipeline used to measure differential abundance of genes.

In [1]:
# run the pipeline in a cell
# to run bash commands in Jupyter notebooks, simply put ! before the command
# e.g.

!pwd


# For the tasks in the first week, please use the command line to run your commands and simply paste the commands you used in the respective cells!


/workspaces/computational-workflows-2025/notebooks/day_01


In [None]:
# run the pipeline in the test profile using docker containers
# make sure to specify the version you want to use with -r (use the latest one)

!nextflow run nf-core/differentialabundance -profile test,docker --outdir output
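
The cell above asks for a pinned version; Nextflow selects a pipeline release with the -r option. The sketch below assembles such a command in Python so that the individual pieces are easy to see. The helper build_nextflow_command is hypothetical, and the version string 1.5.0 is only a placeholder, so check the pipeline's release page for the actual latest release. Nextflow's own options use a single dash (-r, -profile, -resume), while pipeline parameters such as --outdir use two.

```python
import shlex


def build_nextflow_command(pipeline, version, profiles, outdir, resume=False):
    """Assemble a `nextflow run` command as a list of arguments.

    Hypothetical helper for illustration; it only builds the command
    and does not execute anything.
    """
    cmd = [
        "nextflow", "run", pipeline,
        "-r", version,                   # pin the pipeline release
        "-profile", ",".join(profiles),  # e.g. test data + Docker containers
        "--outdir", outdir,              # pipeline parameter (double dash)
    ]
    if resume:
        cmd.append("-resume")            # reuse cached task results
    return cmd


# "1.5.0" is a placeholder version, not necessarily the latest release
cmd = build_nextflow_command(
    "nf-core/differentialabundance", "1.5.0", ["test", "docker"], "output"
)
print(shlex.join(cmd))
```

The printed line can be pasted into the terminal as is; passing resume=True appends -resume for the later exercises.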


In [None]:
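# repeat the run. What changed?

# The run is faster, presumably because the pipeline code and container images are already downloaded, but every task is executed again and the results remain the same.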


In [None]:
# now add -resume to the command. What changed?

# The run reuses the cached results of previously completed steps, so it finishes faster and produces the same results.

Check out the current directory. Besides the outdir you specified, what else has changed?

There is a work directory, a null directory, and Nextflow log files.

In [None]:
# delete the work directory and run the pipeline again using -resume. What changed?


What changed?

With the work directory deleted, there are no cached task results left to resume from, so the pipeline has to run every step from scratch.
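
The behaviour seen across these runs can be illustrated with a toy cache. This is a conceptual sketch only, not Nextflow's actual implementation: it assumes each task result is stored in a work directory under a hash of the task name and its input, and all names (run_task, work, the align task) are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def run_task(name, payload, work_dir, resume, log):
    """Run a toy task, reusing a cached result when resume is enabled."""
    key = hashlib.sha256(f"{name}:{payload}".encode()).hexdigest()[:8]
    cached = Path(work_dir) / key / "result.txt"
    if resume and cached.exists():
        log.append(f"{name}: cached")
        return cached.read_text()
    result = payload.upper()  # stand-in for the real computation
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_text(result)
    log.append(f"{name}: executed")
    return result


log = []
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    run_task("align", "acgt", work, resume=False, log=log)  # first run
    run_task("align", "acgt", work, resume=False, log=log)  # repeat without resume
    run_task("align", "acgt", work, resume=True, log=log)   # repeat with resume
    shutil.rmtree(work)                                     # delete the work directory
    run_task("align", "acgt", work, resume=True, log=log)   # nothing left to resume
print(log)
```

The log shows the task executed on the first two runs, served from the cache on the third, and executed again once the work directory is gone.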

## Let's look at the results

### What is differential abundance analysis?

Give the most important plots from the report: