# A short introduction to containerized software

After using nf-core pipelines to answer bioinformatics questions, we now turn to the processes that lie behind these pipelines.

Today we focus on containerization, specifically with Docker.



1. Check if Docker is installed.

In [1]:
!docker info

Client:
 Version:    28.1.1+1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.20.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.33.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 12
 Server Version: 28.1.1+1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
 runc version: 
 init version: de40ad0
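
The same check can be scripted from a notebook cell. The following is a minimal sketch, assuming that a missing CLI and an unreachable daemon should be reported separately; the helper names `find_cli` and `daemon_reachable` are hypothetical, and the sketch relies on `docker info` exiting with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess
from typing import Optional


def find_cli(name: str = "docker") -> Optional[str]:
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


def daemon_reachable(cli_path: str, timeout_s: float = 30.0) -> bool:
    """Return True if `<cli> info` exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            [cli_path, "info"], capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


docker_path = find_cli("docker")
if docker_path is None:
    print("docker CLI not found on PATH")
elif daemon_reachable(docker_path):
    print(f"docker CLI found at {docker_path}; daemon is reachable")
else:
    print(f"docker CLI found at {docker_path}; daemon did not respond")
```

Separating the two checks helps when the client is installed but the service is not running, which is a common reason for `docker info` to show only the Client section.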
 

### What is a container?

A container is a lightweight, portable environment that bundles a piece of software with all the dependencies and configuration it needs to run.

### Why do we use containers?

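Containers are reproducible, do not depend on the software installed on the host, and are portable to any machine that runs a container engine. They are not fully hardware-independent, though: the image must match the CPU architecture of the host (e.g. amd64).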

### What is a docker image?

A Docker image is a frozen, read-only state of a filesystem with software installed. A container is a running instance of such an image.

### Let's run our first docker image:

### Login to docker

In [4]:
# Run this directly on the command line, since the login is interactive
!docker login

USING WEB-BASED LOGIN

i Info → To sign in with credentials on the command line, use 'docker login -u <username>'

Your one-time device confirmation code is: BQQJ-FHBJ
Press ENTER to open your browser or submit your device code here: https://login.docker.com/activate

Waiting for authentication in the browser…
login canceled
^C


### Run your first docker container

In [2]:
!docker run hello-world


Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/



### Find the container ID

In [8]:
#find container ID
!docker ps -a 

CONTAINER ID   IMAGE         COMMAND    CREATED          STATUS                      PORTS     NAMES
2537a472a813   hello-world   "/hello"   10 minutes ago   Exited (0) 10 minutes ago             naughty_shtern
e6788864ccc4   hello-world   "/hello"   2 days ago       Exited (0) 2 days ago                 agitated_fermi


### Delete the container again and prove that it is deleted

In [9]:
!docker rm 2537a472a813

2537a472a813


In [10]:
!docker ps -a 

CONTAINER ID   IMAGE         COMMAND    CREATED      STATUS                  PORTS     NAMES
e6788864ccc4   hello-world   "/hello"   2 days ago   Exited (0) 2 days ago             agitated_fermi
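
The proof can also be made explicit in code. This is a small sketch, assuming the table layout printed by `docker ps -a` above (a header row followed by one row per container, with the ID in the first column); `container_ids` is a hypothetical helper, and the two tables are copied from the cell outputs.

```python
from typing import List


def container_ids(ps_table: str) -> List[str]:
    """Extract the CONTAINER ID column from the text printed by `docker ps -a`."""
    lines = [line for line in ps_table.splitlines() if line.strip()]
    # The first non-empty line is the header row; every later row starts with the ID.
    return [line.split()[0] for line in lines[1:]]


before = """\
CONTAINER ID   IMAGE         COMMAND    CREATED          STATUS                      PORTS     NAMES
2537a472a813   hello-world   "/hello"   10 minutes ago   Exited (0) 10 minutes ago             naughty_shtern
e6788864ccc4   hello-world   "/hello"   2 days ago       Exited (0) 2 days ago                 agitated_fermi
"""

after = """\
CONTAINER ID   IMAGE         COMMAND    CREATED      STATUS                  PORTS     NAMES
e6788864ccc4   hello-world   "/hello"   2 days ago   Exited (0) 2 days ago             agitated_fermi
"""

removed = "2537a472a813"
print(removed in container_ids(before))  # True
print(removed in container_ids(after))   # False
```

In practice `docker ps -aq` prints only the IDs, which avoids parsing the table at all.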


### FastQC is a very useful tool, as you learned last week. Let's try to run it from the command line

Link to the software: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Please describe the steps you took to download and run the software for the example fastq file from last week below:

1. Download FastQC from the link above
2. Make the launcher executable: chmod +x fastqc
3. Set up the Java environment that FastQC needs
4. Run fastqc on the example fastq file

### Very well, now let's try to make use of its docker container

1. Create a container holding FastQC using Seqera Containers (https://seqera.io/containers/)
2. Use the container to generate a FastQC HTML report for the example fastq file

In [16]:
# pull the container
!docker pull community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29

0.12.1--af7a5314d5015c29: Pulling from library/fastqc

2b0c44d2: Pulling fs layer
b097362e: Pulling fs layer
a01cff0b: Pulling fs layer
b700ef54: Pulling fs layer
97a3ef36: Pulling fs layer
74b0f85e: Pulling fs layer
c00c10a5: Pulling fs layer
7ea432cc: Pulling fs layer
d6c3110d: Waiting
a16bbe82: Pulling fs layer
47592a0a: Pulling fs layer
acc3b8ff: Pulling fs layer
c6865366: Pull complete

In [17]:
# run the container and save the results to a new "fastqc_results" directory

!mkdir -p fastqc_results

In [19]:
!ls -la /home/mitch/Dokumente/Worklflow/computational-workflows-2025/test_fastq_small/

total 3608
drwxrwxr-x  2 mitch mitch   4096 Oct  1 13:59 .
drwxrwxr-x 10 mitch mitch   4096 Oct  1 13:53 ..
-rw-rw-r--  1 mitch mitch 447334 Sep 28  2017 Test01_L001_R1_001.fastq
-rw-rw-r--  1 mitch mitch  93310 Oct  1 13:59 Test01_L001_R1_001.zip
-rw-rw-r--  1 mitch mitch 446240 Sep 28  2017 Test01_L001_R2_001.fastq
-rw-rw-r--  1 mitch mitch 447253 Sep 28  2017 Test02_L001_R1_001.fastq
-rw-rw-r--  1 mitch mitch 446289 Sep 28  2017 Test02_L001_R2_001.fastq
-rw-rw-r--  1 mitch mitch 447630 Sep 28  2017 Test03_L001_R1_001.fastq
-rw-rw-r--  1 mitch mitch 446788 Sep 28  2017 Test03_L001_R2_001.fastq
-rw-rw-r--  1 mitch mitch 447066 Sep 28  2017 Test04_L001_R1_001.fastq
-rw-rw-r--  1 mitch mitch 446204 Sep 28  2017 Test04_L001_R2_001.fastq


In [20]:
!docker run --rm \
    -v /home/mitch/Dokumente/Worklflow/computational-workflows-2025/test_fastq_small:/input \
    -v $(pwd)/fastqc_results:/output \
    community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29 \
    sh -c "fastqc /input/*.fastq -o /output"

null
null
null
null
null
null
null
null
Started analysis of Test01_L001_R1_001.fastq
Analysis complete for Test01_L001_R1_001.fastq
Started analysis of Test01_L001_R2_001.fastq
Analysis complete for Test01_L001_R2_001.fastq
Started analysis of Test02_L001_R1_001.fastq
Analysis complete for Test02_L001_R1_001.fastq
Started analysis of Test02_L001_R2_001.fastq
Analysis complete for Test02_L001_R2_001.fastq
Started analysis of Test03_L001_R1_001.fastq
Analysis complete for Test03_L001_R1_001.fastq
Started analysis of Test03_L001_R2_001.fastq
Analysis complete for Test03_L001_R2_001.fastq
Started analysis of Test04_L001_R1_001.fastq
Analysis complete for Test04_L001_R1_001.fastq
Started analysis of Test04_L001_R2_001.fastq
Analysis complete for Test04_L001_R2_001.fastq
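
The structure of the `docker run` call above can be spelled out as an argument list. The following is a sketch only, assuming the same image and the same `/input` and `/output` mount points as in the cell; `fastqc_docker_command` is a hypothetical helper, and the relative directory names in the usage line are placeholders.

```python
import shlex
from pathlib import Path
from typing import List

IMAGE = "community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29"


def fastqc_docker_command(input_dir: str, output_dir: str, image: str = IMAGE) -> List[str]:
    """Build the argument list for running FastQC on every .fastq file in input_dir."""
    # Bind mounts (-v host:container) need absolute host paths.
    host_in = Path(input_dir).resolve()
    host_out = Path(output_dir).resolve()
    return [
        "docker", "run", "--rm",          # --rm removes the container after it exits
        "-v", f"{host_in}:/input",        # host fastq directory, visible as /input
        "-v", f"{host_out}:/output",      # host results directory, visible as /output
        image,
        # The glob is passed as one string so the shell inside the container expands it.
        "sh", "-c", "fastqc /input/*.fastq -o /output",
    ]


cmd = fastqc_docker_command("test_fastq_small", "fastqc_results")
print(shlex.join(cmd))
```

Wrapping the FastQC call in `sh -c` matters because `/input/*.fastq` only exists inside the container; the host shell would have nothing to expand. The list could be handed to `subprocess.run(cmd, check=True)` to execute it.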


### Now that you know how to use a docker container, which approach was easier and which approach will be easier in the future?

For a single use of one tool, the first (manual) approach was probably easier. If more tools are needed, a container setup makes more sense, especially when a single tool requires many dependencies (Java etc.). If your environment already satisfies those requirements, simply downloading the tool is easier.

### What would you say, which approach is more reproducible?

Because dependencies can differ between machines, a container is far more reproducible: the image always specifies them.

I wasn't able to download the fastq files, so I downloaded a test fastq file instead.
My guess is that, if the same FastQC container was used in the Nextflow pipeline, the results should be identical, while a manually installed FastQC might differ slightly.

### Compare the file to last week's fastqc results. Are they identical?
### Is the fastqc version identical?

## Dockerfiles

We have now used Docker containers and images directly to support our research.

Let's create our own toy Dockerfile including the "cowsay" tool (https://en.wikipedia.org/wiki/Cowsay)

Hints:
1. Docker images are usually based on a Linux distribution, so on a Debian- or Ubuntu-based image you need the apt-get command to install "cowsay"

In [31]:
!docker build -t my-cowsay -f my_dockerfile .

[+] Building 0.3s (1/1)                                          docker:default
 => [internal] load build definition from my_dockerfile                    0.2s
 => => transferring dockerfile: 843B                                       0.1s

In [35]:
!docker ps -a

CONTAINER ID   IMAGE         COMMAND    CREATED      STATUS                  PORTS     NAMES
e6788864ccc4   hello-world   "/hello"   2 days ago   Exited (0) 2 days ago             agitated_fermi


In [36]:
!docker images | grep my-cowsay

# Run the cowsay container if the image exists
!docker run my-cowsay cowsay "Hello from my Docker container!"

my-cowsay                                                  latest                     991afb26a5b6   6 minutes ago   128MB
 _________________________________
< Hello from my Docker container! >
 ---------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


### Explain the RUN and ENV lines you added to the file

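RUN executes a command while the image is being built and stores the result as a new image layer, for example installing a package with apt-get.
ENV sets an environment variable that persists in the image and in every container started from it, for example extending PATH so that an installed tool can be called by name.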
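
To make the two instructions concrete, here is a sketch of what such a Dockerfile might contain, held in a Python string so it can be written out from a notebook cell. The contents of the actual `my_dockerfile` are not shown in this notebook, so the base image, the cleanup step, and the helper `write_dockerfile` are all assumptions.

```python
from pathlib import Path

# Hypothetical Dockerfile contents; the base image and the PATH entry are assumptions.
COWSAY_DOCKERFILE = """\
FROM debian:bullseye-slim

# RUN executes at build time; its filesystem changes become a new image layer.
RUN apt-get update \\
    && apt-get install -y --no-install-recommends cowsay \\
    && rm -rf /var/lib/apt/lists/*

# ENV persists into every container started from the image.
# Debian installs cowsay under /usr/games, which is not on the default PATH.
ENV PATH="/usr/games:${PATH}"
"""


def write_dockerfile(path: str, contents: str = COWSAY_DOCKERFILE) -> Path:
    """Write the Dockerfile text to disk and return the path that was written."""
    target = Path(path)
    target.write_text(contents)
    return target


# List the instructions used, skipping comments and continuation lines.
instructions = [
    line.split()[0]
    for line in COWSAY_DOCKERFILE.splitlines()
    if line and line[0].isupper()
]
print(instructions)  # ['FROM', 'RUN', 'ENV']
```

Chaining the apt-get steps in a single RUN keeps them in one layer, so the package lists removed at the end do not remain in an earlier layer of the image.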

## Let's do some bioinformatics with Dockerfiles: create a new Dockerfile that holds the salmon tool used in rnaseq

To do so, use "curl" in your new dockerfile to get salmon from https://github.com/COMBINE-lab/salmon/releases/download/v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz
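
One possible shape for such a Dockerfile is sketched below, again as a Python string. The real `salmon_docker` file is not shown here, so everything except the download URL is an assumption: the base image, the install location `/opt/salmon`, and the expectation that the archive ships its executable in a `bin/` subdirectory.

```python
SALMON_URL = (
    "https://github.com/COMBINE-lab/salmon/releases/download/"
    "v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz"
)

# Hypothetical Dockerfile contents; only SALMON_URL comes from the exercise text.
SALMON_DOCKERFILE = f"""\
FROM debian:bullseye-slim

# curl and CA certificates are needed to download over HTTPS.
RUN apt-get update \\
    && apt-get install -y --no-install-recommends curl ca-certificates \\
    && rm -rf /var/lib/apt/lists/*

# Download the release tarball and unpack it into /opt/salmon (assumed location).
# --strip-components=1 drops the archive's top-level directory, whatever it is called.
RUN mkdir -p /opt/salmon \\
    && curl -fsSL {SALMON_URL} \\
    | tar -xz --strip-components=1 -C /opt/salmon

# Assumes the archive ships its executable in a bin/ subdirectory.
ENV PATH="/opt/salmon/bin:${{PATH}}"
"""

print(SALMON_DOCKERFILE)
```

Pinning the release in the URL means that rebuilding the image later should give the same salmon version, which is the main reason to prefer an explicit version over a "latest" download.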

In [39]:
# use the file "salmon_docker" in this directory to build a new docker image

!docker build -t my-salmon -f salmon_docker .

[+] Building 0.3s (1/2)                                          docker:default
 => [internal] load build definition from salmon_docker                    0.0s
 => => transferring dockerfile: 599B                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim    0.3s

In [41]:
# check that the image was built
!docker images | grep my-salmon

my-salmon                                                  latest                     a3c7f1dda93f   About a minute ago   306MB


In [42]:
# run the docker image to print the salmon version
!docker run my-salmon salmon --version

salmon 1.5.2


## Do you think bioinformaticians have to create a docker image every time they want to run a tool?

No. For most existing tools they will find a ready-made Docker image and can simply pull it from a container registry such as Docker Hub (hub.docker.com).

Find the salmon docker image online and run it on your computer.

What is https://biocontainers.pro/ ?

BioContainers is a community-driven project that provides ready-to-use container images for bioinformatics tools, together with guidelines and best practices for building them.

In [43]:
!docker pull combinelab/salmon #https://hub.docker.com/r/combinelab/salmon

Using default tag: latest
latest: Pulling from combinelab/salmon

7f213c76: Pulling fs layer
1ed9ab84: Pulling fs layer
0bdd40c3: Pulling fs layer
893c1bc1: Pulling fs layer
Digest: sha256:cefd8bb0b2ed9b07f22b5f0fc317ddda540e5b0dc00810d1ff0d92fee5d80370

In [46]:
!docker run combinelab/salmon salmon --version

salmon 1.10.3
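
The pulled image reports a different salmon version than the image built earlier from the 1.5.2 tarball. A short sketch of how the two `salmon --version` outputs could be compared follows; `parse_version` is a hypothetical helper and assumes the version is the last whitespace-separated token and consists of dot-separated integers.

```python
from typing import Tuple


def parse_version(version_line: str) -> Tuple[int, ...]:
    """Turn output such as 'salmon 1.5.2' into a tuple of integers, e.g. (1, 5, 2)."""
    number = version_line.strip().split()[-1]
    return tuple(int(part) for part in number.split("."))


built = parse_version("salmon 1.5.2")    # image built from the v1.5.2 tarball
pulled = parse_version("salmon 1.10.3")  # combinelab/salmon with the default tag

print(built, pulled, pulled > built)  # (1, 5, 2) (1, 10, 3) True
print("1.10.3" > "1.5.2")             # False: plain string comparison is misleading
```

Because the default tag is `latest`, the version obtained this way can change between pulls; pulling an explicit tag is the more reproducible choice.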


In [47]:
!docker ps -a

CONTAINER ID   IMAGE               COMMAND                  CREATED          STATUS                      PORTS     NAMES
ed0ba9aac3d3   combinelab/salmon   "salmon --version"       40 seconds ago   Exited (0) 38 seconds ago             gifted_wiles
6cbbdb97a505   my-salmon           "salmon --version"       12 minutes ago   Exited (0) 12 minutes ago             jolly_joliot
df4cb6802055   my-salmon           "bash"                   14 minutes ago   Exited (0) 14 minutes ago             determined_yalow
b47ca001901b   my-cowsay           "cowsay 'Hello from …"   19 hours ago     Exited (0) 19 hours ago               objective_easley
e6788864ccc4   hello-world         "/hello"                 2 days ago       Exited (0) 2 days ago                 agitated_fermi


## Are there other ways to create Docker (or Apptainer) images?

- Dockerfiles (the approach used above)
- build tools such as Kaniko
- cloud build services


What is https://seqera.io/containers/ ?

A handy service for finding container images for many different tools.
Images are easy to find with the search and easy to pull.