In [None]:
source vars.sh

# Containers on Milton

# Motivation

## Why Containers?

* Reproducibility: ensure the analysis is insulated from changes in the outside system such as upgrades, environment variables etc.
    * e.g. you can compile a container with the exact R version, packages and data required to reproduce your analysis!
* Easy Usage: no need to install dependencies, use `make` etc, since they are all bundled. Sometimes users don't have permissions to install dependencies.
    * e.g. WEHI's AlphaFold
* Portability: the same `.sif` container can be run on any Linux machine with the same CPU architecture. No need to install using `apt`/`yum`/`conda` etc
    * e.g. deploying a web application such as Shiny
* Maintainability: developers don't need to edit code to accommodate updates to language version etc
* OS Support: can be used to run old or new software that isn't compatible with the OS without breaking everything else.

[More information](https://apptainer.org/docs/user/main/introduction.html#use-cases)

## Why Apptainer?

Linux supports various container engines:

* Docker: most popular, but insecure on shared systems
* Podman: open source re-implementation of Docker. More secure
* Apptainer: different API from Docker, designed to be HPC-first
* Singularity: the former name of Apptainer

## What about Conda?

* Conda solves the same problem of software installation
* You can share conda environment files to increase reproducibility
* Conda is even usable with Nextflow
* However:
    * Sharing Conda environment files is like sharing a recipe rather than sharing your cookies
    * Installation can be quite slow due to resolving the dependencies each time
    * Not contained: using Conda still means your analysis is influenced by your environment, your operating system, and everything else you have installed
    * Certain platforms and engines only accept containers: Cromwell/WDL/terra.bio, AWS
    * If a tool is not in Conda, you can often compile a tool from scratch inside a container, which might be difficult or impossible with Conda

# Running Containers

## Setting Up Apptainer

First, we need to load Apptainer. The most common way to do this is as an [Environment Module](https://modules.readthedocs.io/en/latest/). On Milton, the latest version we have access to is `1.1.0`.

<div class="alert alert-info" role="alert">
    ▶ Try this yourself!
</div>

In [None]:
module load apptainer/1.1.0

In [None]:
apptainer version

In [None]:
apptainer help
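
If a later command fails with "command not found", the module was probably not loaded in the current shell. A minimal sketch of a check you could run first (the variable name `apptainer_status` is made up for illustration):

```bash
# Sketch: confirm that Apptainer is on the PATH before continuing.
if command -v apptainer >/dev/null 2>&1; then
    apptainer_status="available"
    apptainer version
else
    apptainer_status="missing"
    echo "apptainer not found; try: module load apptainer/1.1.0" >&2
fi
echo "apptainer is ${apptainer_status}"
```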

## Apptainer Run

You can run Apptainer images using the `apptainer run` command. This executes the "runscript" inside the container, which is the default executable:

In [None]:
apptainer run --help

If you're working with an image you have downloaded, this will probably be a `.sif` file:

<div class="alert alert-info" role="alert">
    ▶ Try this yourself!
</div>

In [None]:
apptainer run WelcomeImage/hello.sif

In [None]:
#apptainer run oras://ghcr.io/WEHI-ResearchComputing/hello:latest

Next, we will try to run a Docker container. This is the most common type of container you will likely find in the wild, since Apptainer is mostly only used on HPC. To run a Docker image in Apptainer, you will need to add `docker://` to the name of the image.

<div class="alert alert-info" role="alert">
    ▶ Try this yourself!
</div>

In [None]:
apptainer run docker://hello-world

Note that the Docker image was converted to `.sif` behind the scenes!
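
If you would rather keep the converted image as a file, `apptainer pull` can do the conversion explicitly. This is a sketch only: the output filename `hello-world.sif` is an arbitrary choice, and the block skips the pull when Apptainer is not loaded.

```bash
# Sketch: convert a Docker image to a local SIF file once, then reuse it.
image_uri="docker://hello-world"   # same image as in the cell above
sif_file="hello-world.sif"         # hypothetical output filename

if command -v apptainer >/dev/null 2>&1; then
    # `apptainer pull <output> <uri>` downloads the image and writes a SIF file
    [ -f "$sif_file" ] || apptainer pull "$sif_file" "$image_uri"
    # Running the local file avoids contacting Docker Hub again
    apptainer run "$sif_file"
else
    echo "apptainer is not loaded; skipping pull of $image_uri" >&2
fi
```

A local `.sif` file is also easier to copy to another machine or to reference from a job script.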

## Docker Hub

* Docker Hub is a common place to host Docker images
* However, it's not a great source for bioinformatics images (more on that later)
* You can search Docker Hub at: https://hub.docker.com/search?q=

<div class="alert alert-info" role="alert">
    ▶ Try this yourself! 
    See if you can find the official R Docker image. Then, <code>apptainer run</code> it.
    <br/>
    Tip: You might have to use some keywords.
    <br/>
    Tip 2: Remember to use the <code>docker://</code> prefix!
</div>

In [1]:
apptainer run docker://r-base Rscript -e 1

INFO:    Using cached SIF image
[1] 1


<div class="alert alert-success" role="alert"><code>apptainer run docker://r-base</code></div>

## Apptainer Exec

* Does **not** run the runscript
* Executes a custom command inside the container
* More likely to use this than `apptainer run`
* Good practice for biocontainers
* Can be used to run interactive commands (similar to SSH!)

In [2]:
apptainer exec --help

Run a command within a container

Usage:
  apptainer exec [exec options...] <container> <command>

Description:
  apptainer exec supports the following formats:

  *.sif               Singularity Image Format (SIF). Native to Singularity
                      (3.0+) and Apptainer (v1.0.0+)
  
  *.sqsh              SquashFS format.  Native to Singularity 2.4+

  *.img               ext3 format. Native to Singularity versions < 2.4.

  directory/          sandbox format. Directory containing a valid root file 
                      system and optionally Apptainer meta-data.

  instance://*        A local running instance of a container. (See the instance
                      command group.)

  library://*         A SIF container hosted on a Library (no default)

  docker://*          A Docker/OCI container hosted on Docker Hub or another
                      OCI registry.

  shub://*            A container hosted on Singularity Hub.

  oras://*            A SIF container hosted on an O

      --no-privs                      drop all privileges from root user
                                      in container)
      --no-umask                      do not propagate umask to the
                                      container, set default 0022 umask
      --nv                            enable Nvidia support
      --nvccli                        use nvidia-container-cli for GPU
                                      setup (experimental)
      --oom-kill-disable              Disable OOM killer
  -o, --overlay strings               use an overlayFS image for
                                      persistent data storage or as
                                      read-only layer of container
      --passphrase                    prompt for an encryption passphrase
      --pem-path string               enter an path to a PEM formatted RSA
                                      key for an encrypted container
  -p, --pid                           run container in a new PID names

<div class="alert alert-info" role="alert">
    ▶ Try this yourself!
</div>

In [3]:
apptainer exec WelcomeImage/hello.sif ls /opt

message.txt


In [4]:
 ls /opt

[0m[01;34mnvidia[0m  [01;34mquantum[0m


In [None]:
apptainer exec docker://docker/whalesay cowsay boo

## Interactive Sessions with Apptainer Shell

* Lets you run commands interactively inside a container
* Acts like `ssh`


<div class="alert alert-info" role="alert">
    ▶ Try this yourself! 
</div>

In [10]:
apptainer exec docker://ubuntu cat /etc/os-release

INFO:    Using cached SIF image
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy


```bash
apptainer shell docker://ubuntu
cat /etc/os-release
```

## Filesystems

* **Unlike** Docker, and most other container engines, Apptainer automatically "mounts" parts of the HPC filesystem into each container by default
* This means that you can access some of your local files inside the container, along with the tools provided by the container

<div class="alert alert-info" role="alert">
    ▶ Try this yourself!
</div>

In [8]:
apptainer exec docker://ubuntu ls ~

INFO:    Using cached SIF image
ApptainerTutorial	future.tmpl
GASAL2			hca.sqlite-journal
HCAquery		job.sh
HcaBenchmarking		job436761f0f31a180479329fee9b7cc0fe.job
HcaPython		jobRegistry
IntegrationBenchmarks	metadata.sqlite
R			ondemand
Scrooge			output
Untitled.ipynb		result
Untitled1.ipynb		run.R
_targets.R		run.sh
clustermq.tmpl		scratch
curated_annotation.rds	script.job
dials.find_spots.log	test.tex
files_for_michael.rds	testdata
files_list.rds		wf-clone-validation


In [9]:
ls ~

[0m[01;34mApptainerTutorial[0m                        metadata.sqlite
clustermq.tmpl                           [01;34mondemand[0m
curated_annotation.rds                   [01;34moutput[0m
dials.find_spots.log                     [01;34mR[0m
files_for_michael.rds                    [40;31;01mresult[0m
files_list.rds                           run.R
future.tmpl                              run.sh
[01;34mGASAL2[0m                                   [01;36mscratch[0m
[01;34mHcaBenchmarking[0m                          script.job
[01;34mHcaPython[0m                                [01;34mScrooge[0m
[01;34mHCAquery[0m                                 _targets.R
hca.sqlite-journal                       [01;34mtestdata[0m
[01;34mIntegrationBenchmarks[0m                    test.tex
job436761f0f31a180479329fee9b7cc0fe.job  Untitled1.ipynb
[01;34mjobRegistry[0m                              Untitled.ipynb
job.sh                                   [01;34mwf-clone-validation
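
Directories that are not mounted automatically can be added with the `--bind` flag. The sketch below uses a temporary directory as a stand-in for a project directory and `/mnt` as an arbitrary mount point inside the container. Both are choices made for this example, not Milton defaults.

```bash
# Sketch: make an extra host directory visible inside the container.
host_dir="$(mktemp -d)"      # stand-in for a directory you want to share
container_dir="/mnt"         # hypothetical mount point inside the container
echo "hello from the host" > "$host_dir/example.txt"

# --bind takes <host path>:<container path>
bind_spec="${host_dir}:${container_dir}"

if command -v apptainer >/dev/null 2>&1; then
    apptainer exec --bind "$bind_spec" docker://ubuntu cat "$container_dir/example.txt"
else
    echo "apptainer is not loaded; would bind $bind_spec" >&2
fi
```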

## Writable Containers

* Outside of the mounted directories, containers are read-only by default. You need either the `--writable` or `--writable-tmpfs` flags to edit files
* Also, if you want to act as the root (administrator) user, you need `--fakeroot`
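
To see the difference, you can try creating a file outside the mounted directories with and without `--writable-tmpfs`. This is a sketch under the assumption that overlay support is enabled on the system. The path `/opt/scratch.txt` is just an example.

```bash
# Sketch: compare the default read-only behaviour with --writable-tmpfs.
target_file="/opt/scratch.txt"   # hypothetical path outside the mounted directories

if command -v apptainer >/dev/null 2>&1; then
    # Without the flag, this write is expected to fail on a read-only image
    apptainer exec docker://ubuntu touch "$target_file" \
        || echo "could not write $target_file (read-only container)"
    # With --writable-tmpfs, writes go to a temporary overlay that is discarded on exit
    apptainer exec --writable-tmpfs docker://ubuntu \
        sh -c "touch $target_file && ls /opt"
else
    echo "apptainer is not loaded; skipping writable demo" >&2
fi
```

Because the overlay is temporary, the file will be gone the next time the container starts.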

## Biocontainers

* Docker images automatically built from bioconda
* Most bioinformatics tools are available
* Can search at https://bioconda.github.io/search.html?q=

![](media/biocontainers.png)

Talk about metagenomics fasta annotation example

<div class="alert alert-success" role="alert"><code>apptainer shell docker://quay.io/biocontainers/diamond:2.0.15--hb97b32f_1</code></div>

## Compiling from Scratch

* Relevant flags: `--fakeroot` and `--writable`

In [None]:
cd WelcomeImage
apptainer build hello.sif hello.def

In [None]:
apptainer run hello.sif
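
The contents of `hello.def` are not shown here, so the following is only a guess at what a minimal definition file for such an image could look like. The base image, the message text and the temporary working directory are all assumptions. Only the `/opt/message.txt` path is taken from the `ls /opt` output earlier.

```bash
# Sketch: write a hypothetical definition file; the real hello.def may differ.
workdir="$(mktemp -d)"
cat > "$workdir/hello.def" <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Runs once at build time, inside the container
    echo "Hello from inside the container!" > /opt/message.txt

%runscript
    # Runs whenever the image is started with `apptainer run`
    cat /opt/message.txt
EOF

if command -v apptainer >/dev/null 2>&1; then
    # --fakeroot lets an unprivileged user act as root during the build,
    # provided the site has enabled it
    apptainer build --fakeroot "$workdir/hello.sif" "$workdir/hello.def"
    apptainer run "$workdir/hello.sif"
else
    echo "apptainer is not loaded; wrote $workdir/hello.def only" >&2
fi
```

The `%post` section is where you would install packages or compile tools, and `%runscript` defines what `apptainer run` executes.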

In [None]:
ls /vast/projects/alphafold/alphafold/alphafold-2.2.0.0/

In [None]:
cat /vast/projects/alphafold/alphafold/alphafold-2.2.0.0/bin/alphafold

In [None]:
singularity run /vast/projects/alphafold/alphafold/alphafold-2.2.0.0/AlphaFold-2.2.0.0.sif \
    

## Building Containers

## Hosting Containers

### Create a Repository

### Create a Token

* Visit https://github.com/settings/tokens
* "Generate a new token (classic)"
* Check `write:packages`
* Click "Generate token"

### Log In and Push

* Run `apptainer remote login --username <YOUR GITHUB USERNAME> docker://ghcr.io`, and then paste in the access token
* Run `apptainer push WelcomeImage/hello.sif oras://ghcr.io/<YOUR GITHUB USERNAME>/hello:latest`

In [None]:
apptainer remote login --username $GITHUB_USERNAME --password $GITHUB_TOKEN docker://ghcr.io

In [None]:
apptainer push WelcomeImage/hello.sif oras://ghcr.io/$GITHUB_USERNAME/hello:latest
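
Passing the token with `--password` can leave it visible in your shell history and in process listings. A possible alternative, sketched here, pipes it in with `--password-stdin` and then runs the image straight from the registry to check the push. It assumes `GITHUB_USERNAME` and `GITHUB_TOKEN` are already set in your environment. The variable names `registry` and `image_ref` are made up for this example.

```bash
# Sketch: log in without putting the token on the command line,
# then run the pushed image directly from the registry.
registry="docker://ghcr.io"
image_ref="oras://ghcr.io/${GITHUB_USERNAME:-}/hello:latest"

if command -v apptainer >/dev/null 2>&1 && [ -n "${GITHUB_TOKEN:-}" ]; then
    # --password-stdin reads the token from standard input instead of an argument
    printf '%s' "$GITHUB_TOKEN" |
        apptainer remote login --username "$GITHUB_USERNAME" --password-stdin "$registry"
    # If the push succeeded, this runs the image's runscript
    apptainer run "$image_ref"
else
    echo "apptainer or GITHUB_TOKEN missing; skipping login to $registry" >&2
fi
```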