# A short introduction to containerized software

After using nf-core pipelines to answer bioinformatic questions, we will now look at the processes that lie behind these pipelines.

Today, we will focus on containerization, specifically with Docker.



1. Check if Docker is installed.

In [1]:
!docker info

Client:
 Version:    28.4.0
 Context:    default
 Debug Mode: false
 Plugins:
  ai: Docker AI Agent - Ask Gordon (Docker Inc.)
    Version:  v1.9.11
    Path:     /usr/local/lib/docker/cli-plugins/docker-ai
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.28.0-desktop.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  cloud: Docker Cloud (Docker Inc.)
    Version:  v0.4.29
    Path:     /usr/local/lib/docker/cli-plugins/docker-cloud
  compose: Docker Compose (Docker Inc.)
    Version:  v2.39.4-desktop.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.42
    Path:     /usr/local/lib/docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Docker Inc.)
    Version:  v0.2.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-desktop
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.31
    Path:     /usr/local/lib/docker/cli-plugins/docker
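
The same check can be scripted from Python. The following is a minimal sketch using only the standard library; the helper names `docker_available` and `docker_cli_version` are hypothetical. It only looks for the `docker` executable on the `PATH` and asks the CLI for its version. It does not confirm that the Docker daemon is running, which `docker info` additionally reports.

```python
import shutil
import subprocess
from typing import Optional


def docker_available() -> bool:
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


def docker_cli_version() -> Optional[str]:
    """Return the output of `docker --version`, or None if it cannot be run."""
    if not docker_available():
        return None
    try:
        result = subprocess.run(
            ["docker", "--version"],
            capture_output=True, text=True, check=False, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


print("Docker CLI found:", docker_available())
print("Docker CLI version:", docker_cli_version())
```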

### What is a container?

A container is a lightweight, portable, and isolated environment that packages software together with all of its dependencies.

### Why do we use containers?

Applications run reliably across different systems. Containers share the host system's OS kernel instead of shipping their own, while their processes, filesystems, and network settings are kept separate from other containers and from the host.
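
One way to see the shared kernel in practice is to compare the kernel release reported on the host with the one reported inside a container. The sketch below assumes the small public `alpine` image (not used elsewhere in this notebook) and a hypothetical helper `container_kernel`. On a native Linux host the two values should match. With Docker Desktop, containers run inside a Linux virtual machine, so the container reports that machine's kernel instead.

```python
import platform
import shutil
import subprocess
from typing import Optional


def host_kernel() -> str:
    """Kernel release of the machine running this Python process."""
    return platform.release()


def container_kernel(image: str = "alpine") -> Optional[str]:
    """Run `uname -r` inside a throwaway container; None if Docker is unusable."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "run", "--rm", image, "uname", "-r"],
            capture_output=True, text=True, check=False, timeout=120,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


print("host kernel:     ", host_kernel())
print("container kernel:", container_kernel())
```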

### What is a docker image?

A Docker image is a read-only blueprint from which containers are created.

### Let's run our first docker image:

### Log in to Docker

In [None]:
# Do this directly on the command line, for example with `docker login`

### Run your first docker container

In [2]:
!docker run hello-world


Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/



### Find the container ID

In [3]:
!docker ps -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED              STATUS                          PORTS     NAMES
99ef5b42f105   hello-world                                                                   "/hello"                 6 seconds ago        Exited (0) 5 seconds ago                  great_wing
515ef65d68b8   hello-world                                                                   "/hello"                 About a minute ago   Exited (0) About a minute ago             distracted_ride
82ef260fe8a9   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   9 minutes ago        Exited (0) 9 minutes ago                  nxf-w2rL9HSdD4Ph3KB8orvdbexH
594d7f616a7c   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   37 minutes ago       Exited (0) 36 minutes ago                 nxf-VQO6MgdSBQBuPIDsicO4dTcl
bbf9

In [11]:
!docker container ls -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED          STATUS                      PORTS     NAMES
82ef260fe8a9   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   11 minutes ago   Exited (0) 10 minutes ago             nxf-w2rL9HSdD4Ph3KB8orvdbexH
594d7f616a7c   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   38 minutes ago   Exited (0) 38 minutes ago             nxf-VQO6MgdSBQBuPIDsicO4dTcl
bbf9e95225ec   quay.io/biocontainers/bioconductor-deseq2:1.34.0--r41hc247a5b_3               "/usr/local/env-exec…"   2 hours ago      Exited (137) 2 hours ago              nxf-37rho3Bmf005doYV15Ha06wy
8a56f7a882fb   quay.io/biocontainers/bioconductor-deseq2:1.34.0--r41hc247a5b_3               "/usr/local/env-exec…"   2 hours ago      Exited (137) 2 hours ago              nxf-b0VlhUhLT24cBBIimuf1DHYM
86d6a9744db3 

### Delete the container again and prove that it is deleted

In [8]:
!docker container rm 515ef65d68b8

515ef65d68b8


In [10]:
!docker ps -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED          STATUS                      PORTS     NAMES
82ef260fe8a9   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   10 minutes ago   Exited (0) 10 minutes ago             nxf-w2rL9HSdD4Ph3KB8orvdbexH
594d7f616a7c   quay.io/biocontainers/r-shinyngs:1.8.8--r43hdfd78af_0                         "/usr/local/env-exec…"   38 minutes ago   Exited (0) 37 minutes ago             nxf-VQO6MgdSBQBuPIDsicO4dTcl
bbf9e95225ec   quay.io/biocontainers/bioconductor-deseq2:1.34.0--r41hc247a5b_3               "/usr/local/env-exec…"   2 hours ago      Exited (137) 2 hours ago              nxf-37rho3Bmf005doYV15Ha06wy
8a56f7a882fb   quay.io/biocontainers/bioconductor-deseq2:1.34.0--r41hc247a5b_3               "/usr/local/env-exec…"   2 hours ago      Exited (137) 2 hours ago              nxf-b0VlhUhLT24cBBIimuf1DHYM
86d6a9744db3 
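
Instead of reading the table by eye, the proof can be automated. This is a small sketch with hypothetical helper names: `list_all_container_ids` asks Docker for one short ID per line (using the `--format` option of `docker ps`), and `id_in_listing` checks whether a given ID is among them. The demonstration at the bottom uses IDs copied from the listing above, so it runs without Docker.

```python
import subprocess


def id_in_listing(container_id: str, listing: str) -> bool:
    """Return True if `container_id` matches the first column of any line."""
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        first = fields[0]
        # Accept both short and full-length IDs by comparing prefixes.
        if first.startswith(container_id) or container_id.startswith(first):
            return True
    return False


def list_all_container_ids() -> str:
    """Return the short IDs of all containers (running or stopped), one per line."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.ID}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# IDs copied from the `docker ps -a` output shown after the removal.
listing_after_rm = """\
82ef260fe8a9
594d7f616a7c
bbf9e95225ec
8a56f7a882fb
"""

print(id_in_listing("515ef65d68b8", listing_after_rm))  # → False
print(id_in_listing("82ef260fe8a9", listing_after_rm))  # → True
```

With a running Docker daemon, `id_in_listing("515ef65d68b8", list_all_container_ids())` performs the same check against the live state.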

### FastQC is a very useful tool, as you learned last week. Let's try to run it from the command line

Link to the software: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Please describe the steps you took to download and run the software for the example fastq file from last week below:

1. Download the FastQC zip file.
2. Extract it.
3. Make the launcher executable: `chmod 755 fastqc`.
4. Add the FastQC directory to the `PATH`.
5. Download the FASTQ file.
6. Run `fastqc <file.fastq>`.
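
The extract, chmod, and PATH steps can be sketched in Python as follows. The helper names are hypothetical, and the code assumes that the archive unpacks into a `FastQC/` directory containing the `fastqc` launcher script; check the archive you downloaded. The two download steps are left out on purpose.

```python
import os
import zipfile
from pathlib import Path
from typing import Dict


def install_fastqc(zip_path: Path, dest: Path) -> Path:
    """Extract the FastQC archive into `dest` and make the launcher executable."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    launcher = dest / "FastQC" / "fastqc"  # assumed layout of the archive
    launcher.chmod(0o755)                  # same effect as `chmod 755 fastqc`
    return launcher


def env_with_fastqc(launcher: Path) -> Dict[str, str]:
    """Return a copy of the environment with the launcher's directory first on PATH."""
    env = dict(os.environ)
    env["PATH"] = str(launcher.parent) + os.pathsep + env.get("PATH", "")
    return env
```

After `launcher = install_fastqc(Path("fastqc.zip"), Path("tools"))`, the tool can be started with `subprocess.run([str(launcher), "<file.fastq>"], check=True)`. Every one of these steps has to be repeated, and can go wrong, on each new machine.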

### Very well, now let's try to make use of its docker container

1. Create a container image holding FastQC using Seqera Containers (https://seqera.io/containers/).
2. Use the container to generate a FastQC HTML report for the example FASTQ file.

In [12]:
# pull the container
!docker pull community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29

0.12.1--af7a5314d5015c29: Pulling from library/fastqc
c6865366: Pulling fs layer
acc3b8ff: Pulling fs layer
Digest: sha256:b7f6caf359264cf86da901b0aa5f66735a6506fcfbf103c66db6987253ad44c1

In [33]:
# run the container and save the results to a new "fastqc_results" directory
!mkdir fastqc_results

!docker run -v '/mnt/c/Users/NicolaiOswald/OneDrive - UT Cloud/Dokumente/Studium Tübingen/Computational Workflows/computational-workflows-2025/notebooks/day_02/fetchngs/fastq:/data' \
    -v '/mnt/c/Users/NicolaiOswald/OneDrive - UT Cloud/Dokumente/Studium Tübingen/Computational Workflows/computational-workflows-2025/notebooks/day_03_part2/fastqc_results:/output' \
    community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29 \
    fastqc /data/SRX19144486_SRR23195516_1.fastq.gz --outdir /output

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
mkdir: cannot create directory ‘fastqc_results’: File exists
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
application/gzip
Started analysis of SRX19144486_SRR23195516_1.fastq.gz
Approx 5% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 10% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 15% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 20% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 25% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 30% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 35% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 40% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 45% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 50% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 55% complete for SRX19144486_SRR2319551

In [27]:
!pwd

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
pwd: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory


In [26]:
# run the container and save the results to a new "fastqc_results" directory
!mkdir fastqc_results

!docker run -v '/mnt/c/Users/NicolaiOswald/OneDrive - UT Cloud/Dokumente/Studium Tübingen/Computational Workflows/computational-workflows-2025/notebooks/day_02/fetchngs/fastq:/data' \
    -v '/mnt/c/Users/NicolaiOswald/OneDrive - UT Cloud/Dokumente/Studium Tübingen/Computational Workflows/computational-workflows-2025/notebooks/day_03_part2/fastqc_results:/output' \
    community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29 \
    fastqc /data/SRX19144486_SRR23195516_1.fastq.gz --outdir /output

mkdir: cannot create directory ‘fastqc_results’: File exists
Specified output directory '/data/fastqc_results' does not exist
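
The error above shows that FastQC only writes to an output directory that already exists inside the container, so the host directory has to be created and mounted first. The sketch below builds the same `docker run` call from Python. The function name `fastqc_docker_command` is hypothetical, and the `--rm` and `:ro` options are additions that are not part of the cells above. Passing the command as a list avoids shell quoting problems with host paths that contain spaces.

```python
from pathlib import Path
from typing import List

IMAGE = "community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29"


def fastqc_docker_command(fastq: Path, outdir: Path, image: str = IMAGE) -> List[str]:
    """Build a `docker run` command that mounts input and output directories."""
    fastq = fastq.resolve()    # bind mounts need absolute host paths
    outdir = outdir.resolve()
    return [
        "docker", "run", "--rm",                # remove the container afterwards
        "-v", f"{fastq.parent}:/data:ro",       # input directory, read-only
        "-v", f"{outdir}:/output",              # results end up on the host
        image,
        "fastqc", f"/data/{fastq.name}", "--outdir", "/output",
    ]


command = fastqc_docker_command(
    Path("fastq/SRX19144486_SRR23195516_1.fastq.gz"),
    Path("fastqc_results"),
)
print(command)
```

To execute it, create the output directory with `Path("fastqc_results").mkdir(parents=True, exist_ok=True)` and then call `subprocess.run(command, check=True)`.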


### Now that you know how to use a docker container, which approach was easier and which approach will be easier in the future?

### What would you say, which approach is more reproducible?

### Compare the file to last week's FastQC results. Are they identical?
### Is the FastQC version identical?
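
A simple way to test whether two report files are identical is to compare their checksums. The following sketch uses hypothetical helper names and only the standard library. Note that a byte-for-byte comparison is strict: two reports can differ in their bytes even when the underlying statistics agree, so a mismatch is a reason to look closer rather than proof that the analysis differs.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(first: Path, second: Path) -> bool:
    """Return True if both files have exactly the same content."""
    return sha256_of(first) == sha256_of(second)
```

For example, `files_identical(Path("last_week_fastqc.html"), Path("fastqc_results/report_fastqc.html"))`, where both file names are placeholders. The tool version can be checked with `fastqc --version`, both for the local installation and inside the container.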

## Dockerfiles

We have now used Docker containers and images directly to support our research.

Let's create our own toy Dockerfile that includes the "cowsay" tool (https://en.wikipedia.org/wiki/Cowsay).

Hints:
1. Docker images are usually built on a Linux distribution. With a Debian- or Ubuntu-based image, you install "cowsay" with the apt-get command.

In [None]:
# open the file "my_dockerfile" in a text editor

### Explain the RUN and ENV lines you added to the file
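
As an illustration, a Dockerfile along the following lines would fit the base image and the greeting that appear in this notebook's build log and run output. It is a sketch and not necessarily the file used here; the helper `write_dockerfile` is hypothetical. `RUN` executes a command while the image is built and stores the result in the image. `ENV` sets an environment variable for later build steps and for every container started from the image. It is needed here because Debian installs cowsay into `/usr/games`, which is not on the default `PATH` of the base image.

```python
from pathlib import Path

COWSAY_DOCKERFILE = """\
FROM debian:bullseye-slim

# RUN executes at build time: install cowsay, then clear the apt cache
RUN apt-get update && \\
    apt-get install -y cowsay && \\
    rm -rf /var/lib/apt/lists/*

# ENV persists into the container: Debian puts cowsay in /usr/games
ENV PATH="/usr/games:${PATH}"

# Default command when the container starts without arguments
CMD ["cowsay", "Hello from cowsay!"]
"""


def write_dockerfile(path: Path, content: str = COWSAY_DOCKERFILE) -> Path:
    """Write the Dockerfile text to `path` and return the path."""
    path.write_text(content)
    return path
```

Calling `write_dockerfile(Path("my_dockerfile"))` creates the file in the current directory, after which it can be built with `docker build -f my_dockerfile -t my_dockerfile:test .`.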

In [9]:
!docker build --help

Usage:  docker buildx build [OPTIONS] PATH | URL | -

Start a build

Aliases:
  docker build, docker builder build, docker image build, docker buildx b

Options:
      --add-host strings              Add a custom host-to-IP mapping
                                      (format: "host:ip")
      --allow stringArray             Allow extra privileged entitlement
                                      (e.g., "network.host",
                                      "security.insecure")
      --annotation stringArray        Add annotation to the image
      --attest stringArray            Attestation parameters (format:
                                      "type=sbom,generator=image")
      --build-arg stringArray         Set build-time variables
      --build-context stringArray     Additional build contexts (e.g.,
                                      name=path)
      --builder string                Override the configured builder
                                      instance (default "defa

In [26]:
# build the docker image
!docker build -f my_dockerfile -t my_dockerfile:test .

[+] Building 0.2s (1/2)                                          docker:default
 => [internal] load build definition from my_dockerfile                    0.0s
 => => transferring dockerfile: 230B                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim    0.2s

In [28]:
# list all containers; to confirm that the image itself exists, use `docker image ls my_dockerfile`
!docker ps -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED             STATUS                         PORTS     NAMES
7046ae31ce35   a12e546e17f1                                                                  "bash"                   8 minutes ago       Exited (0) 8 minutes ago                 zen_haibt
f6e35b3300f5   832045f85859                                                                  "bash"                   10 minutes ago      Exited (0) 10 minutes ago                boring_zhukovsky
3c81c083a910   832045f85859                                                                  "bash"                   11 minutes ago      Exited (0) 11 minutes ago                charming_shannon
5de9a5806dd8   community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29              "/usr/local/bin/_ent…"   45 minutes ago      Exited (0) 32 minutes ago                interesting_murdock
b5970c6f9858   community.wave.seqer

In [34]:
# run a container from the image we just built
!docker run my_dockerfile:test

 ____________________
< Hello from cowsay! >
 --------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


## Let's do some bioinformatics: create a new Dockerfile that holds the Salmon tool used in RNA-seq analysis

To do so, use "curl" in your new dockerfile to get salmon from https://github.com/COMBINE-lab/salmon/releases/download/v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz

In [37]:
# use the file "salmon_docker" in this directory to build a new docker image
!cat salmon_docker

FROM debian:bullseye-slim

LABEL image.author.name="yourname"
LABEL image.author.email="yourmail"

# Install wget, then download and unpack Salmon v1.5.2
RUN apt-get update && apt-get install -y wget && \
    wget https://github.com/COMBINE-lab/salmon/releases/download/v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz \
    -O salmon-1.5.2_linux_x86_64.tar.gz && \
    tar -xvzf salmon-1.5.2_linux_x86_64.tar.gz

# Add the Salmon binaries to the PATH
ENV PATH="/salmon-1.5.2_linux_x86_64/bin:${PATH}"



In [40]:
# build the image
!docker build -t salmon:latest -f salmon_docker .

[+] Building 0.2s (1/2)                                          docker:default
 => [internal] load build definition from salmon_docker                    0.0s
 => => transferring dockerfile: 674B                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim    0.2s

In [44]:
# run the docker image to print the version of salmon
!docker run salmon:latest salmon --version

salmon 1.5.2


## Do you think bioinformaticians have to create a docker image every time they want to run a tool?

Find the salmon docker image online and run it on your computer.

What is https://biocontainers.pro/ ?

In [None]:
!docker pull combinelab/salmon:latest
!docker run combinelab/salmon:latest salmon --version

latest: Pulling from combinelab/salmon
Digest: sha256:cefd8bb0b2ed9b07f22b5f0fc317ddda540e5b0dc00810d1ff0d92fee5d80370
Status: Image is up to date for combinelab/salmon:latest
docker.io/combinelab/salmon:latest
salmon v1.10.3

Usage:  salmon -h|--help or 
        salmon -v|--version or 
        salmon -c|--cite or 
        salmon [--no-version-check] <COMMAND> [-h | options]

Commands:
     index      : create a salmon index
     quant      : quantify a sample
     alevin     : single cell analysis
     swim       : perform super-secret operation
     quantmerge : merge multiple quantifications into a single file
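
The pull above returned Salmon v1.10.3 for the `latest` tag, while the self-built image contains 1.5.2. Because `latest` can point to a different version over time, pinning an explicit tag helps reproducibility. The sketch below, with hypothetical helper names, splits an image reference into name and tag and flags unpinned references. It deliberately ignores digest references of the form `name@sha256:...`.

```python
from typing import Tuple


def split_image_reference(reference: str) -> Tuple[str, str]:
    """Split `name[:tag]` into (name, tag); Docker uses 'latest' if no tag is given."""
    last_slash = reference.rfind("/")
    last_colon = reference.rfind(":")
    # A colon before the last slash belongs to a registry port, not to a tag.
    if last_colon > last_slash:
        return reference[:last_colon], reference[last_colon + 1:]
    return reference, "latest"


def is_pinned(reference: str) -> bool:
    """Return True if the reference names an explicit tag other than 'latest'."""
    return split_image_reference(reference)[1] != "latest"


print(is_pinned("combinelab/salmon:latest"))  # → False
print(is_pinned("hello-world"))               # → False
print(is_pinned("quay.io/biocontainers/bioconductor-deseq2:1.34.0--r41hc247a5b_3"))  # → True
```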

## Are there other ways to create Docker (or Apptainer) images?

What is https://seqera.io/containers/ ?