# A short introduction to containerized software

After spending time using nf-core pipelines to answer bioinformatics questions, we will now look at the processes behind these pipelines.

Today, we will focus on containerization, specifically with Docker.



1. Check if Docker is installed.

In [1]:
!docker info

Client:
 Version:    28.4.0
 Context:    default
 Debug Mode: false
 Plugins:
  ai: Docker AI Agent - Ask Gordon (Docker Inc.)
    Version:  v1.9.11
    Path:     /usr/local/lib/docker/cli-plugins/docker-ai
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.28.0-desktop.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  cloud: Docker Cloud (Docker Inc.)
    Version:  v0.4.29
    Path:     /usr/local/lib/docker/cli-plugins/docker-cloud
  compose: Docker Compose (Docker Inc.)
    Version:  v2.39.4-desktop.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.42
    Path:     /usr/local/lib/docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Docker Inc.)
    Version:  v0.2.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-desktop
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.31
    Path:     /usr/local/lib/docker/cli-plugins/docker
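
The same check can be scripted from a Python cell. This is a minimal sketch that assumes only the Python standard library; the function name `docker_status` and its three return labels are invented for illustration, and it relies on `docker info` exiting with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_status() -> str:
    """Return 'missing', 'daemon-unreachable', or 'ok' for the local Docker setup."""
    # Step 1: is the docker command-line client on the PATH at all?
    if shutil.which("docker") is None:
        return "missing"
    # Step 2: can the client talk to the daemon? `docker info` queries it.
    try:
        result = subprocess.run(
            ["docker", "info"], capture_output=True, text=True, timeout=60
        )
    except subprocess.TimeoutExpired:
        return "daemon-unreachable"
    return "ok" if result.returncode == 0 else "daemon-unreachable"


if __name__ == "__main__":
    print(docker_status())
```

Separating "client missing" from "daemon unreachable" is useful because the two cases need different fixes: installing Docker versus starting Docker Desktop or the Docker service.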

### What is a Docker container?

A Docker container is a standardized, encapsulated environment that runs applications.

https://en.wikipedia.org/wiki/Docker_(software)

### Why do we use containers?

Containers isolate software from its environment and ensure that it works uniformly despite differences, for instance between development and staging.

https://www.docker.com/resources/what-container/

### What is a Docker image?

A Docker image is a read-only template (a kind of blueprint) used to build containers. Images are used to store and ship applications.[1] Container images become containers at runtime; in the case of Docker, images become containers when they run on Docker Engine.[2]

1 https://en.wikipedia.org/wiki/Docker_(software)  
2 https://www.docker.com/resources/what-container/

### Let's run our first Docker image:

### Log in to Docker

In [None]:
# You need to do this directly on the command line

### Run your first docker container

In [2]:
!docker run hello-world

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
Digest: sha256:54e66cc1dd1fcb1c3c58bd8017914dbed8701e2d8c74d9262e26bd9cc1642d31
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 h

In [5]:
!docker --help

Usage:  docker [OPTIONS] COMMAND

A self-sufficient runtime for containers

Common Commands:
  run         Create and run a new container from an image
  exec        Execute a command in a running container
  ps          List containers
  build       Build an image from a Dockerfile
  bake        Build from a file
  pull        Download an image from a registry
  push        Upload an image to a registry
  images      List images
  login       Authenticate to a registry
  logout      Log out from a registry
  search      Search Docker Hub for images
  version     Show the Docker version information
  info        Display system-wide information

Management Commands:
  ai*         Docker AI Agent - Ask Gordon
  builder     Manage builds
  buildx*     Docker Buildx
  cloud*      Docker Cloud
  compose*    Docker Compose
  container   Manage containers
  context     Manage contexts
  debug*      Get a shell into any image or container
  desktop*    Docker Desktop commands
  extension*  Man

### Find the container ID

In [14]:
!docker ps -a

CONTAINER ID   IMAGE                                                              COMMAND                  CREATED          STATUS                           PORTS     NAMES
c41d8bc80ef7   cowsay                                                             "cowsay 'hello world'"   52 minutes ago   Exited (0) 52 minutes ago                  inspiring_mayer
c7874cfac07a   cowsay                                                             "hello world"            53 minutes ago   Created                                    awesome_bassi
084bfff4282e   cowsay                                                             "bash"                   53 minutes ago   Exited (0) 53 minutes ago                  wizardly_carson
4ef9bcac4d86   community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29   "/usr/local/bin/_ent…"   2 hours ago      Exited (130) About an hour ago             festive_morse
1fce1701c548   community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29   "/usr/local/bin/_

### Delete the container again and prove that it is deleted

In [15]:
!docker rm 6a580cb9beb2

Error response from daemon: No such container: 6a580cb9beb2


In [16]:
!docker ps -a

CONTAINER ID   IMAGE                                                              COMMAND                  CREATED          STATUS                           PORTS     NAMES
c41d8bc80ef7   cowsay                                                             "cowsay 'hello world'"   53 minutes ago   Exited (0) 53 minutes ago                  inspiring_mayer
c7874cfac07a   cowsay                                                             "hello world"            53 minutes ago   Created                                    awesome_bassi
084bfff4282e   cowsay                                                             "bash"                   53 minutes ago   Exited (0) 53 minutes ago                  wizardly_carson
4ef9bcac4d86   community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29   "/usr/local/bin/_ent…"   2 hours ago      Exited (130) About an hour ago             festive_morse
1fce1701c548   community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29   "/usr/local/bin/_
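
The `docker rm` call above failed because the ID it was given does not appear in the `docker ps -a` listing. One way to avoid copying IDs by hand is to ask Docker for machine-readable output and filter it. The sketch below assumes the `--format` option of `docker ps` with the `{{.ID}}`, `{{.Image}}` and `{{.Status}}` placeholders; the helper names `parse_ps`, `exited_ids` and `list_all_containers` are invented for illustration, and the sample rows are taken from the listing above.

```python
import subprocess

# Tab-separated Go template understood by `docker ps --format`.
PS_FORMAT = "{{.ID}}\t{{.Image}}\t{{.Status}}"


def parse_ps(output: str) -> list[dict[str, str]]:
    """Turn tab-separated `docker ps` output into a list of row dictionaries."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip empty lines
        container_id, image, status = line.split("\t", 2)
        rows.append({"id": container_id, "image": image, "status": status})
    return rows


def exited_ids(rows: list[dict[str, str]], image: str) -> list[str]:
    """Return the IDs of containers of `image` whose status starts with 'Exited'."""
    return [
        row["id"]
        for row in rows
        if row["image"] == image and row["status"].startswith("Exited")
    ]


def list_all_containers() -> list[dict[str, str]]:
    """Query the local daemon; needs a working Docker installation."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


# Offline demonstration with two rows copied from the listing above.
sample = (
    "c41d8bc80ef7\tcowsay\tExited (0) 52 minutes ago\n"
    "c7874cfac07a\tcowsay\tCreated\n"
)
print(exited_ids(parse_ps(sample), "cowsay"))
```

The IDs returned this way can be passed to `docker rm`, and a second call to `list_all_containers()` afterwards shows whether the container is really gone.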

### FastQC is a very useful tool, as you learned last week. Let's try to run it from the command line

Link to the software: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Please describe the steps you took to download and run the software for the example fastq file from last week below:

1. I followed the link and clicked "download now".
2. I tried to read the instructions, but they were more confusing than helpful.
3. I downloaded the zip file for Linux and Windows.
4. I opened the software with "run_fastqc".
5. Then I chose the first FASTQ file from yesterday: SRX19144486_SRR23195516_1.fastq

I got roughly the same results as yesterday.

### Very well, now let's try to make use of its Docker container

1. Create a container holding FastQC using Seqera Containers (https://seqera.io/containers/)
2. Use the container to generate a FastQC HTML report of the example FASTQ file

In [17]:
# pull the container
!docker pull community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29



0.12.1--af7a5314d5015c29: Pulling from library/fastqc
Digest: sha256:b7f6caf359264cf86da901b0aa5f66735a6506fcfbf103c66db6987253ad44c1
Status: Image is up to date for community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29
community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29


In [18]:
!docker images

REPOSITORY                                                 TAG                        IMAGE ID       CREATED             SIZE
cowsay                                                     latest                     da4660cce0ba   About an hour ago   191MB
hello-world                                                latest                     54e66cc1dd1f   7 weeks ago         20.3kB
community.wave.seqera.io/library/fastqc                    0.12.1--af7a5314d5015c29   b7f6caf35926   11 months ago       1.37GB
quay.io/biocontainers/samtools                             1.21--h50ea8bc_0           783c6646029a   12 months ago       108MB
quay.io/biocontainers/fq                                   0.12.0--h9ee0642_0         74b59572f1d0   14 months ago       20MB
quay.io/biocontainers/r-shinyngs                           1.8.8--r43hdfd78af_0       e0de72408557   17 months ago       1.99GB
quay.io/biocontainers/atlas-gene-annotation-manipulation   1.1.1--hdfd78af_0          099d0e113ec8   18 months

In [19]:
!docker run community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29

In [29]:
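# run the container and save the results to a new "fastqc_results" directory
# first attempt (fails): the -v option comes after the image name, so it is passed to fastqc instead of mounting a host directory into the container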
!docker run community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29 fastqc \
    -v /mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_02/test/fastq/SRX19144486_SRR23195516_1.fastq \
    fastqc -o /mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_02/test/fastq/SRX19144486_SRR23195516_1.fastq


shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
Specified output directory '/mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_02/test/fastq/SRX19144486_SRR23195516_1.fastq' does not exist


In [20]:
!docker run \
    -v "/mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_02/test/fastq:/data" \
    -v "/mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_02/test/fastqc_results:/results" \
    community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29 \
    fastqc /data/SRX19144486_SRR23195516_1.fastq.gz -o /results

application/gzip
Started analysis of SRX19144486_SRR23195516_1.fastq.gz
Approx 5% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 10% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 15% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 20% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 25% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 30% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 35% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 40% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 45% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 50% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 55% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 60% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 65% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 70% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 75% complete for SRX19144486_SRR23195516_1.fastq.gz
Approx 80% complete for SRX19144486_SRR23195
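
The working command differs from the failed attempt in where `-v` is placed: Docker options go before the image name, and everything after the image name is handed to the container as its command. The sketch below builds such a command as an argument list so the ordering is explicit. The function name `fastqc_command`, the use of `--rm`, and the `/path/to/...` host directories are assumptions for illustration; `/data` and `/results` are the mount points used in the cell above.

```python
import shlex
from pathlib import PurePosixPath

FASTQC_IMAGE = "community.wave.seqera.io/library/fastqc:0.12.1--af7a5314d5015c29"


def fastqc_command(host_fastq_dir: str, host_results_dir: str,
                   fastq_name: str, image: str = FASTQC_IMAGE) -> list[str]:
    """Build a `docker run` argument list that bind-mounts input and output folders."""
    return [
        "docker", "run",
        "--rm",  # remove the container once fastqc has finished
        # HOST_PATH:CONTAINER_PATH pairs; these must come before the image name.
        "-v", f"{host_fastq_dir}:/data",
        "-v", f"{host_results_dir}:/results",
        image,
        # Everything from here on runs inside the container, so the paths
        # are container paths, not host paths.
        "fastqc", str(PurePosixPath("/data") / fastq_name), "-o", "/results",
    ]


# Hypothetical host directories; replace them with real ones before running.
command = fastqc_command("/path/to/fastq", "/path/to/fastqc_results",
                         "SRX19144486_SRR23195516_1.fastq.gz")
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be passed directly to `subprocess.run`. Note that the host results directory should exist before the container starts.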

### Now that you know how to use a docker container, which approach was easier and which approach will be easier in the future?

I am really sorry to say that it was much easier to download FastQC and run it directly. It took me about 50 minutes to pull the image and run the container with the right commands and input/output file names. The container also took longer to run than the desktop application.

### What would you say, which approach is more reproducible?

I haven't tested reproducibility, but I think the container would be more stable and reproducible. And now that I have figured out how to run it, I might be faster next time.

### Compare the file to last week's FastQC results. Are they identical?
### Is the FastQC version identical?

## Dockerfiles

We have now used Docker containers and images directly to boost our research.

Let's create our own toy Dockerfile including the "cowsay" tool (https://en.wikipedia.org/wiki/Cowsay)

Hints:
1. Docker images are usually based on Linux, so you need to know the apt-get command to install "cowsay"

In [None]:
# open the file "my_dockerfile" in a text editor


### Explain the RUN and ENV lines you added to the file

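RUN: executes a command while the image is being built (for example, installing a package with apt-get) and saves the result as a new image layer.
ENV: sets an environment variable that is available in later build steps and in containers started from the image.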
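
To make the two instructions concrete, here is a sketch of what a cowsay Dockerfile could look like, held as a Python string so it can be written to disk from a notebook cell. The contents of the real `my_dockerfile` are not shown in this notebook, so the base image `debian:bullseye-slim` and the exact lines are assumptions; the helper names `instructions` and `write_dockerfile` are invented for illustration. The `ENV` line is there because Debian installs cowsay into `/usr/games`, which is not on the default `PATH`.

```python
from pathlib import Path

# Assumed Dockerfile contents; the actual my_dockerfile may differ.
COWSAY_DOCKERFILE = """\
FROM debian:bullseye-slim
# RUN executes a command at build time and stores the result as an image layer.
RUN apt-get update && apt-get install -y cowsay
# ENV sets a variable for later build steps and for running containers.
# Debian puts the cowsay executable in /usr/games, so add it to PATH.
ENV PATH=$PATH:/usr/games/
"""


def instructions(text: str) -> list[str]:
    """List the instruction keyword of every non-comment, non-empty line."""
    found = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            found.append(stripped.split()[0])
    return found


def write_dockerfile(path: str, text: str = COWSAY_DOCKERFILE) -> Path:
    """Write the Dockerfile text to `path` and return the path object."""
    target = Path(path)
    target.write_text(text)
    return target


print(instructions(COWSAY_DOCKERFILE))
```

Without the `ENV` line, `docker run cowsay cowsay "hello world"` would likely fail with a "not found" error, and the full path `/usr/games/cowsay` would be needed instead.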

In [1]:
!pwd

/mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_03_part2


In [None]:
# build the docker image
!docker build -t cowsay -f "/mnt/c/Users/Johanna/Documents/Studium_Tübingen/2_Semester/computational-workflows-2025/notebooks/day_03_part2/my_dockerfile" .

[+] Building 0.4s (1/2)                                          docker:default
 => [internal] load build definition from my_dockerfile                    0.2s
 => => transferring dockerfile: 864B                                       0.1s
 => [internal] load 

In [6]:
# make sure that the image has been built
!docker images 

REPOSITORY                                                 TAG                        IMAGE ID       CREATED         SIZE
cowsay                                                     latest                     da4660cce0ba   5 minutes ago   191MB
hello-world                                                latest                     54e66cc1dd1f   7 weeks ago     20.3kB
community.wave.seqera.io/library/fastqc                    0.12.1--af7a5314d5015c29   b7f6caf35926   11 months ago   1.37GB
quay.io/biocontainers/samtools                             1.21--h50ea8bc_0           783c6646029a   12 months ago   108MB
quay.io/biocontainers/fq                                   0.12.0--h9ee0642_0         74b59572f1d0   14 months ago   20MB
quay.io/biocontainers/r-shinyngs                           1.8.8--r43hdfd78af_0       e0de72408557   17 months ago   1.99GB
quay.io/biocontainers/atlas-gene-annotation-manipulation   1.1.1--hdfd78af_0          099d0e113ec8   18 months ago   1.82GB
quay.io/nf-cor

In [9]:
# run a container from the image
!docker run cowsay cowsay "hello world"


 _____________
< hello world >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


## Let's do some bioinformatics: create a new Dockerfile that holds the salmon tool used in RNA-seq

To do so, use "curl" in your new Dockerfile to get salmon from https://github.com/COMBINE-lab/salmon/releases/download/v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz

In [None]:
# use the file "salmon_docker" in this directory to build a new docker image
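
A Dockerfile for this task could look roughly like the text generated below. This is a sketch, not the actual `salmon_docker` file, which is not shown in this notebook: the base image `debian:bullseye-slim` is taken from the build log, while the `mv` steps assume that the archive unpacks into a `salmon-*` directory containing `bin/` and `lib/`. The function name `salmon_dockerfile` is invented for illustration.

```python
SALMON_URL = (
    "https://github.com/COMBINE-lab/salmon/releases/download/"
    "v1.5.2/salmon-1.5.2_linux_x86_64.tar.gz"
)


def salmon_dockerfile(url: str = SALMON_URL) -> str:
    """Return Dockerfile text that downloads and installs a salmon release."""
    lines = [
        "FROM debian:bullseye-slim",
        # curl is not part of the slim base image, so install it first.
        "RUN apt-get update && apt-get install -y curl",
        # Download the archive and unpack it in one step, then move the
        # executable and its libraries to standard locations (assumed layout).
        f"RUN curl -sSL {url} | tar xz \\",
        "    && mv /salmon-*/bin/* /usr/bin/ \\",
        "    && mv /salmon-*/lib/* /usr/lib/",
    ]
    return "\n".join(lines) + "\n"


print(salmon_dockerfile())
```

Piping `curl` into `tar` inside a single `RUN` keeps the downloaded archive out of the image, because nothing is written to disk before extraction.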

In [28]:
# build the image
!docker build -t salmon -f salmon_docker .

[+] Building 0.3s (1/2)                                          docker:default
 => [internal] load build definition from salmon_docker                    0.0s
 => => transferring dockerfile: 508B                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullsey

In [31]:
# run the docker image to print the salmon version
!docker run salmon salmon

salmon v1.5.2

Usage:  salmon -h|--help or 
        salmon -v|--version or 
        salmon -c|--cite or 
        salmon [--no-version-check] <COMMAND> [-h | options]

Commands:
     index      : create a salmon index
     quant      : quantify a sample
     alevin     : single cell analysis
     swim       : perform super-secret operation
     quantmerge : merge multiple quantifications into a single file


## Do you think bioinformaticians have to create a docker image every time they want to run a tool?

Find the salmon docker image online and run it on your computer.

What is https://biocontainers.pro/ ?

In [32]:
!docker pull combinelab/salmon:latest

latest: Pulling from combinelab/salmon
9485d7ab: Pulling fs layer 
7f213c76: Pulling fs layer 
0bdd40c3: Pulling fs layer 
1ed9ab84: Pulling fs layer 
Digest: sha256:cefd8bb0b2ed9b07f22b5f0fc317ddda540e5b0dc00810d1ff0d92fee5d80370
Status: Downloaded newer image for combinelab/salmon:latest
docker.io/combinelab/salmon:latest


In [None]:
!docker run combinelab/salmon salmon

salmon v1.10.3

Usage:  salmon -h|--help or 
        salmon -v|--version or 
        salmon -c|--cite or 
        salmon [--no-version-check] <COMMAND> [-h | options]

Commands:
     index      : create a salmon index
     quant      : quantify a sample
     alevin     : single cell analysis
     swim       : perform super-secret operation
     quantmerge : merge multiple quantifications into a single file


In [None]:
# I also found the salmon docker image here:
!docker pull quay.io/biocontainers/salmon:1.10.3--h45fbf2d_5

1.10.3--h45fbf2d_5: Pulling from biocontainers/salmon
Digest: sha256:3938cc6dfaca6f7ee14eac0cdc0f305fdff8faa7f14541c72e684feb1b443a74

BioContainers is a community-driven project that provides the infrastructure and basic guidelines to create, manage and distribute bioinformatics containers, with a special focus on proteomics, genomics, transcriptomics and metabolomics. BioContainers is based on Docker. [1] The website hosts a registry that lists all BioContainers and workflows, including metadata and statistics. It also documents the specifications and architecture used to create, deploy and maintain software containers with Conda and Docker technologies. [2]

1) https://cyverse-foundational-open-science-skills-2019.readthedocs-hosted.com/en/latest/Containers/biocontainers.html  
2) https://biocontainers.pro/
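
The image names pulled above follow the pattern `registry/repository:tag`, and the tag decides which software version you get: `combinelab/salmon:latest` delivered salmon v1.10.3, while the BioContainers reference pins a specific build. The sketch below splits a reference into its parts. It is a simplified illustration that ignores digests (`@sha256:...`), and the function name `split_image_reference` is invented; it assumes Docker's convention that a missing registry means Docker Hub and a missing tag means `latest`.

```python
def split_image_reference(reference: str) -> dict[str, str]:
    """Split 'registry/repository:tag' into its parts (digests are not handled)."""
    # A tag separator is a colon that comes after the last slash, so a
    # registry port such as 'localhost:5000/tool' is not mistaken for a tag.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        name, tag = reference[:colon], reference[colon + 1:]
    else:
        name, tag = reference, "latest"  # default tag when none is given

    first, _, rest = name.partition("/")
    # The first component counts as a registry host if it looks like one.
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = "docker.io", name
    return {"registry": registry, "repository": repository, "tag": tag}


for ref in ("combinelab/salmon:latest",
            "quay.io/biocontainers/salmon:1.10.3--h45fbf2d_5"):
    print(split_image_reference(ref))
```

For reproducible analyses, a pinned tag such as `1.10.3--h45fbf2d_5` is preferable to `latest`, because `latest` can point to a different image the next time it is pulled.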

## Are there other ways to create Docker (or Apptainer) images?

What is https://seqera.io/containers/ ?

On this website I can select different packages, combine them, and then copy a link via "get container" that I can use in the same way as we did with FastQC.

