# A short introduction to containerized software

After spending time using nf-core pipelines to answer bioinformatic questions, we will now look at what lies behind these pipelines.

Today, we will focus on containerization, namely via Docker. 



1. Check if Docker is installed.

In [1]:
!docker info

Client: Docker Engine - Community
 Version:    28.4.0
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  ai: Docker AI Agent - Ask Gordon (Docker Inc.)
    Version:  v1.9.11
    Path:     /Users/patrick/.docker/cli-plugins/docker-ai
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.28.0-desktop.1
    Path:     /Users/patrick/.docker/cli-plugins/docker-buildx
  cloud: Docker Cloud (Docker Inc.)
    Version:  v0.4.29
    Path:     /Users/patrick/.docker/cli-plugins/docker-cloud
  compose: Docker Compose (Docker Inc.)
    Version:  v2.39.4-desktop.1
    Path:     /Users/patrick/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.42
    Path:     /Users/patrick/.docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Docker Inc.)
    Version:  v0.2.0
    Path:     /Users/patrick/.docker/cli-plugins/docker-desktop
  extension: Manages Docker extensions (Docker Inc.)
    Version:  
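
`docker info` only succeeds when both the command-line client and the Docker daemon are available. The following is a minimal sketch of a scripted check, assuming a POSIX shell; the variable name `docker_state` is purely illustrative.

```shell
# Distinguish "CLI missing" from "CLI present but daemon not running".
if ! command -v docker >/dev/null 2>&1; then
    docker_state="cli-missing"
elif ! docker info >/dev/null 2>&1; then
    docker_state="daemon-unreachable"
else
    docker_state="ready"
    # Print only the client and server versions instead of the full report.
    docker version --format 'client {{.Client.Version}}, server {{.Server.Version}}'
fi
echo "docker: $docker_state"
```

If the state is `daemon-unreachable`, starting Docker Desktop (or the Docker service) is usually the missing step.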

### What is a container?

### Why do we use containers?

### What is a docker image?

### Let's run our first docker image:

### Login to docker

In [None]:
# This you need to do on the command line directly

### Run your first docker container

In [2]:
!docker run hello-world

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world

Digest: sha256:54e66cc1dd1fcb1c3c58bd8017914dbed8701e2d8c74d9262e26bd9cc1642d31
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more exa

### Find the container ID

In [5]:
!docker ps -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED              STATUS                          PORTS     NAMES
025a1fb6b506   hello-world                                                                   "/hello"                 About a minute ago   Exited (0) About a minute ago             flamboyant_blackburn
50104af93baa   community.wave.seqera.io/library/cutadapt_trim-galore_pigz:a98edd405b34582d   "/usr/local/bin/_ent…"   3 hours ago          Exited (137) 3 hours ago                  nxf-inl6YRyBVLJytPCW8x8ebuJz
6a9fca9d4542   community.wave.seqera.io/library/cutadapt_trim-galore_pigz:a98edd405b34582d   "/usr/local/bin/_ent…"   3 hours ago          Exited (137) 3 hours ago                  nxf-wfwXYs6XdKzimPXLh5A8h3xR
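
When many containers exist, the full table becomes hard to read. As a sketch, the listing can be narrowed to containers created from one image and reduced to a few columns; `--filter`, `--format` and `-q` are standard `docker ps` options, while the variable names are illustrative.

```shell
# Show only containers created from the hello-world image, in a compact layout.
image="hello-world"
if docker info >/dev/null 2>&1; then
    docker ps -a --filter "ancestor=${image}" \
        --format 'table {{.ID}}\t{{.Image}}\t{{.Status}}\t{{.Names}}'
    # -q prints only the container IDs, which is convenient for scripting.
    container_ids=$(docker ps -aq --filter "ancestor=${image}")
else
    container_ids=""
    echo "Docker is not available; skipping." >&2
fi
echo "containers from ${image}: ${container_ids:-none}"
```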


### Delete the container again and prove that it is deleted

In [6]:
!docker container rm 025a1fb6b506

025a1fb6b506


In [7]:
!docker ps -a

CONTAINER ID   IMAGE                                                                         COMMAND                  CREATED       STATUS                     PORTS     NAMES
50104af93baa   community.wave.seqera.io/library/cutadapt_trim-galore_pigz:a98edd405b34582d   "/usr/local/bin/_ent…"   3 hours ago   Exited (137) 3 hours ago             nxf-inl6YRyBVLJytPCW8x8ebuJz
6a9fca9d4542   community.wave.seqera.io/library/cutadapt_trim-galore_pigz:a98edd405b34582d   "/usr/local/bin/_ent…"   3 hours ago   Exited (137) 3 hours ago             nxf-wfwXYs6XdKzimPXLh5A8h3xR


### FastQC is a very useful tool, as you learned last week. Let's try to run it from the command line

Link to the software: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Please describe the steps you took to download and run the software for the example fastq file from last week below:

1. `mamba install fastqc`
2. `mkdir fastqc_manual_out`
3. `fastqc day_02/fetch_out/fastq/*.fastq.gz -o fastqc_manual_out`
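
The three steps above install FastQC into whatever environment is currently active. A slightly more reproducible variant, sketched here under the assumption that `mamba` is installed and the bioconda channel is reachable, pins the version and uses a dedicated environment. The environment name `fastqc_manual` is hypothetical.

```shell
env_name="fastqc_manual"      # hypothetical environment name
fastqc_version="0.12.1"       # version reported later in this notebook
out_dir="fastqc_manual_out"

# FastQC does not create its output directory, so create it first.
mkdir -p "$out_dir"

if command -v mamba >/dev/null 2>&1; then
    # Create an isolated environment with a pinned FastQC version.
    mamba create -y -n "$env_name" -c conda-forge -c bioconda "fastqc=${fastqc_version}"
    # Run FastQC inside that environment without activating it.
    mamba run -n "$env_name" fastqc day_02/fetch_out/fastq/*.fastq.gz -o "$out_dir"
else
    echo "mamba not found; nothing was installed or run." >&2
fi
```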

### Very well, now let's try to make use of its docker container

1. create a container holding fastqc using seqera containers (https://seqera.io/containers/)
2. use the container to generate a fastqc html of the example fastq file

In [8]:
# pull the container
!docker image pull community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29

0.12.1--af7a5314d5015c29: Pulling from library/fastqc

c6865366: Pulling fs layer
Digest: sha256:b7f6caf359264cf86da901b0aa5f66735a6506fcfbf103c66db6987253ad44c1

In [30]:
!ls -a "/Users/patrick/Studium_Local/3.Sem/computational-workflows-2025/notebooks/day_02/fetch_out/fastq/"

ls: cannot access '/Users/patrick/Studium_Local/3.Sem/computational-workflows-2025/notebooks/day_02/fetch_out/fastq/*': No such file or directory


In [34]:
# run the container and save the results to a new "fastqc_results" directory
!docker run --rm -v "/Users/patrick/Studium_Local/3.Sem/computational-workflows-2025/notebooks/day_02/fetch_out/fastq":/data -v "$PWD/fastqc_results":/out b7f6caf35926 fastqc "/data/SRX19144488_SRR23195511_1.fastq.gz" -o /out

application/gzip
Started analysis of SRX19144488_SRR23195511_1.fastq.gz
Approx 5% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 10% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 15% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 20% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 25% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 30% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 35% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 40% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 45% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 50% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 55% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 60% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 65% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 70% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 75% complete for SRX19144488_SRR23195511_1.fastq.gz
Approx 80% complete for SRX
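
The command above relies on two bind mounts: each `-v HOST:CONTAINER` pair makes a host directory visible inside the container. The same call can be sketched with variables and the full image name instead of the image ID. The host paths below are illustrative and need to be adjusted; the `:ro` suffix (read-only input) is an addition that the original command does not use.

```shell
# Host directories (illustrative paths, adjust to your machine).
fastq_dir="$PWD/day_02/fetch_out/fastq"
results_dir="$PWD/fastqc_results"
image="community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29"
sample="SRX19144488_SRR23195511_1.fastq.gz"

# Create the output directory up front; on Linux hosts a missing bind-mount
# source would otherwise be created by the daemon and owned by root.
mkdir -p "$results_dir"

if docker info >/dev/null 2>&1 && [ -f "$fastq_dir/$sample" ]; then
    # /data is the input mount (read-only), /out receives the report files.
    docker run --rm \
        -v "$fastq_dir":/data:ro \
        -v "$results_dir":/out \
        "$image" fastqc "/data/$sample" -o /out
else
    echo "Docker or the input file is not available; nothing was run." >&2
fi
```

Paths passed to `fastqc` are container paths (`/data`, `/out`), not host paths, because the tool only sees the container's file system.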

### Now that you know how to use a docker container, which approach was easier and which approach will be easier in the future?

If the software is already packaged for conda, both approaches are about equally fast. With complicated dependencies, however, especially for outdated software, a reproducible Docker container that can simply be pulled is much faster.

### What would you say, which approach is more reproducible?

The Docker container, because the image fixes the tool version together with all of its dependencies.

### Compare the file to last week's FastQC results. Are they identical?
### Is the FastQC version identical?

The Docker and manual FastQC results are identical.

Both versions are 0.12.1.
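
One way to back up such a comparison is to look at the `fastqc_data.txt` file that FastQC stores inside each `_fastqc.zip` archive; its first line has the form `##FastQC<TAB>version`. The sketch below uses two small stand-in files rather than real FastQC output, and the helper name `fastqc_version_of` is hypothetical.

```shell
# Print the FastQC version recorded in the first line of a fastqc_data.txt file.
fastqc_version_of() {
    head -n 1 "$1" | cut -f 2
}

# Self-contained demo with two stand-in files (not real FastQC output).
demo_dir=$(mktemp -d)
printf '##FastQC\t0.12.1\n>>Basic Statistics\tpass\n' > "$demo_dir/manual.txt"
printf '##FastQC\t0.12.1\n>>Basic Statistics\tpass\n' > "$demo_dir/docker.txt"

manual_version=$(fastqc_version_of "$demo_dir/manual.txt")
docker_version=$(fastqc_version_of "$demo_dir/docker.txt")
echo "manual: $manual_version, docker: $docker_version"

# cmp -s prints nothing and returns 0 only if the files are byte-identical.
if cmp -s "$demo_dir/manual.txt" "$demo_dir/docker.txt"; then
    same_content="yes"
else
    same_content="no"
fi
echo "identical module data: $same_content"
```

For real results, point the helper and `cmp` at the two extracted `fastqc_data.txt` files. Comparing the text data is likely more robust than comparing the HTML reports byte by byte.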

## Dockerfiles

We have now used Docker containers and images directly to boost our research.

Let's create our own toy Dockerfile including the "cowsay" tool (https://en.wikipedia.org/wiki/Cowsay)

Hints:
1. Docker images are Linux-based, so on a Debian or Ubuntu base image you need the apt-get command to install "cowsay"

In [None]:
# open the file "my_dockerfile" in a text editor

### Explain the RUN and ENV lines you added to the file
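
As a reference point, here is a minimal sketch of what such a Dockerfile could look like. It is not the course's `my_dockerfile`; the file name `cowsay_example.Dockerfile` is illustrative, and the base image is assumed to be `debian:bullseye-slim`. The comments explain the role of `RUN` and `ENV`.

```shell
# Write an example Dockerfile (illustrative file name).
cat > cowsay_example.Dockerfile <<'EOF'
# Small Debian base image (assumption for this sketch).
FROM debian:bullseye-slim

# RUN executes a command at build time and stores the result as an image layer.
# The apt package lists are removed afterwards to keep the layer small.
RUN apt-get update \
    && apt-get install -y cowsay \
    && rm -rf /var/lib/apt/lists/*

# ENV sets an environment variable for later build steps and for containers.
# Debian installs cowsay into /usr/games, which is not on PATH by default.
ENV PATH="/usr/games:${PATH}"
EOF

echo "wrote cowsay_example.Dockerfile"
```

Without the `ENV` line, `cowsay` would have to be called by its full path, `/usr/games/cowsay`.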

In [6]:
# build the docker image
!docker build -f my_dockerfile docker_container

[+] Building 0.3s (1/2)                                    docker:desktop-linux
 => [internal] load build definition from my_dockerfile                    0.0s
 => => transferring dockerfile: 831B                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim    0.3s
[+] Building 0.5s (1/2)                                    docker:desktop-linux
 => [internal] load build definition from my_do

In [7]:
# make sure that the image has been built
!docker images --all

REPOSITORY                                                   TAG                        IMAGE ID       CREATED          SIZE
<none>                                                       <none>                     4154aeb46525   10 minutes ago   220MB
hello-world                                                  latest                     54e66cc1dd1f   7 weeks ago      16.9kB
community.wave.seqera.io/library/cutadapt_trim-galore_pigz   a98edd405b34582d           4e56e5205f5a   9 months ago     1.73GB
community.wave.seqera.io/library/fastqc                      0.12.1--af7a5314d5015c29   b7f6caf35926   11 months ago    1.37GB
quay.io/biocontainers/fq                                     0.12.0--h9ee0642_0         74b59572f1d0   14 months ago    20MB
quay.io/biocontainers/r-shinyngs                             1.8.8--r43hdfd78af_0       e0de72408557   17 months ago    1.99GB
quay.io/biocontainers/atlas-gene-annotation-manipulation     1.1.1--hdfd78af_0          099d0e113ec8   18 mon
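
The `<none>` entry in the listing is most likely the freshly built image: the build command did not assign a name, so the image can only be addressed by its ID. A sketch of naming it, where `my_cowsay:0.1` is a hypothetical name and tag:

```shell
image_name="my_cowsay:0.1"   # hypothetical repository:tag

if docker info >/dev/null 2>&1 && [ -f my_dockerfile ]; then
    # -t assigns repository:tag at build time.
    docker build -t "$image_name" -f my_dockerfile docker_container
    # List only this image to confirm that the name was applied.
    docker images "$image_name"
else
    echo "Docker or my_dockerfile is not available; nothing was built." >&2
fi

# An image that already exists can be named afterwards by its ID, e.g.:
#   docker tag 4154aeb46525 "$image_name"
```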

In [1]:
# run the docker image
!fortune | docker run -i 4154aeb46525 cowsay

 ______________________________________
/ If we do not change our direction we \
| are likely to end up where we are    |
\ headed.                              /
 --------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


## Let's do some bioinformatics: create a new Dockerfile that holds the salmon tool used in RNA-seq

To do so, use "curl" in your new dockerfile to get salmon from https://github.com/COMBINE-lab/salmon/releases/download/v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz
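
A minimal sketch of a curl-based Dockerfile follows. It is not the file from the salmon repository that this notebook actually builds. Assumptions: a `debian:bullseye-slim` base image, an archive with a single top-level directory containing `bin/`, and the illustrative file name `salmon_example.Dockerfile`.

```shell
cat > salmon_example.Dockerfile <<'EOF'
FROM debian:bullseye-slim

# Release to install; the URL below follows the link given in the text.
ENV SALMON_VERSION=1.5.2

# curl and CA certificates are needed for the HTTPS download.
RUN apt-get update \
    && apt-get install -y curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Download and unpack in one step. --strip-components=1 drops the archive's
# top-level directory (assumed: one top-level directory containing bin/).
RUN mkdir -p /opt/salmon \
    && curl -fsSL "https://github.com/COMBINE-lab/salmon/releases/download/v${SALMON_VERSION}/salmon-${SALMON_VERSION}_linux_x86_64.tar.gz" \
       | tar -xz --strip-components=1 -C /opt/salmon

# Make the salmon executable available on PATH.
ENV PATH="/opt/salmon/bin:${PATH}"
EOF

echo "wrote salmon_example.Dockerfile"
```

Because the release is an x86_64 binary, the image has to be built and run for the `linux/amd64` platform.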

In [None]:
# use the file "salmon_docker" in this directory to build a new docker image
# file contents from salmon github dockerfile

In [12]:
# build the image
!docker build --platform linux/amd64 -f salmon_docker . -t salmon_docker

[+] Building 0.3s (1/2)                                    docker:desktop-linux
 => [internal] load build definition from salmon_docker                    0.0s
 => => transferring dockerfile: 753B                                       0.0s
 => WARN: FromAsCasing: 'as' and 'FROM' keywords' casing do not match (li  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:18.04            0.3s

In [3]:
# list the images to confirm that salmon_docker has been built
!docker images --all

REPOSITORY                                                   TAG                        IMAGE ID       CREATED         SIZE
salmon_docker                                                latest                     07a89e432efd   2 minutes ago   274MB
<none>                                                       <none>                     594ba522afd9   19 hours ago    1.3GB
<none>                                                       <none>                     4154aeb46525   20 hours ago    220MB
hello-world                                                  latest                     54e66cc1dd1f   7 weeks ago     16.9kB
community.wave.seqera.io/library/cutadapt_trim-galore_pigz   a98edd405b34582d           4e56e5205f5a   10 months ago   1.73GB
community.wave.seqera.io/library/fastqc                      0.12.1--af7a5314d5015c29   b7f6caf35926   11 months ago   1.37GB
quay.io/biocontainers/fq                                     0.12.0--h9ee0642_0         74b59572f1d0   15 months ago

In [22]:
# run the docker image to print the salmon version
!docker run --platform linux/amd64 salmon_docker salmon --version

salmon 1.5.2
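
The `--platform linux/amd64` flag is needed here because the salmon release is an x86_64 binary, while the `hello-world` output earlier shows an arm64 host. A small sketch of how a script could decide whether the flag is required (variable names are illustrative):

```shell
# Map the host CPU architecture to a Docker platform string.
host_arch=$(uname -m)
case "$host_arch" in
    x86_64|amd64)  native_platform="linux/amd64" ;;
    arm64|aarch64) native_platform="linux/arm64" ;;
    *)             native_platform="unknown" ;;
esac

# The salmon release used here is an x86_64 build.
required_platform="linux/amd64"
if [ "$native_platform" = "$required_platform" ]; then
    echo "Native run: --platform can be omitted."
else
    echo "Host is ${host_arch}: pass --platform ${required_platform} (needs emulation support)."
fi
```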


## Do you think bioinformaticians have to create a docker image every time they want to run a tool?

Find the salmon docker image online and run it on your computer.

The image is `combinelab/salmon` on Docker Hub; see the output below.

What is https://biocontainers.pro/ ?

A community-driven project that provides ready-to-use containers for bioinformatics software, based on Conda, Docker, and Singularity.

Ideally not, if a container is already provided by the package developers.

In [21]:
!docker run --rm --platform linux/amd64 combinelab/salmon:latest salmon --version

salmon 1.10.3
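
Note that `latest` is a moving tag: it reports 1.10.3 here, while the image built above contains 1.5.2. For reproducible work it helps to record exactly which image was used. One possible approach, sketched under the assumption that the image was pulled from a registry, is to look up its content digest and run by that digest:

```shell
image="combinelab/salmon:latest"

if docker info >/dev/null 2>&1; then
    docker pull --platform linux/amd64 "$image"
    # RepoDigests holds the content-addressed name (repository@sha256:...).
    pinned=$(docker image inspect --format '{{index .RepoDigests 0}}' "$image")
    echo "pinned reference: $pinned"
    # Running by digest resolves to exactly this image, even if the tag moves later.
    docker run --rm --platform linux/amd64 "$pinned" salmon --version
else
    echo "Docker is not available; skipping." >&2
fi
```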


## Are there other ways to create Docker (or Apptainer) images?

What is https://seqera.io/containers/ ?
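
Regarding Apptainer: one common route is to convert an existing Docker image into a SIF file rather than writing a new recipe. The sketch below assumes `apptainer` is installed and reuses the FastQC image pulled earlier; the output file name is hypothetical.

```shell
image="community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29"
sif_file="fastqc_0.12.1.sif"   # hypothetical output file name

if command -v apptainer >/dev/null 2>&1; then
    # The docker:// prefix tells Apptainer to fetch an OCI image from a registry
    # and convert it into a single SIF file.
    apptainer pull "$sif_file" "docker://${image}"
    # Run a command inside the converted image.
    apptainer exec "$sif_file" fastqc --version
else
    echo "apptainer not found; nothing was pulled." >&2
fi
```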