![<CW3E Logo>](https://cw3e.ucsd.edu/images/cw3e_logo_files/wetransfer-b4ff74/CW3E%20Final%20Logo%20Suite/4-Horizontal-Acronym/Digital/PNG/CW3E-Logo-Horizontal-Acronym-FullColor.png "Center for Western Weather and Water Extremes Logo")

# Creating a Singularity Container for a Conda Environment

---

## Overview
---

Singularity containers package a complete, uniform software environment that behaves the same on any host where Singularity is installed. Building a container lets us develop and run code within a consistent environment, which makes computing workflows reproducible.

This notebook will go over the following:
1. Singularity definition files and their necessary components. 
2. Building, running, and transferring a Singularity container.
3. Utilizing containers for both Bash and Python scripts.
4. Using a container to launch a Jupyter notebook server.

## Prerequisites
---

| **Concept** | **Importance** | **Notes** |
|:-----------:|:--------------:|:---------:|
| Bash | Necessary | Understand on an intermediate level |
| Python | Necessary | Understand on an intermediate level |
| Singularity | Helpful | Understanding the basics will help with this workflow, but not necessary |

* **Time to Learn:** 2 hours est.

## Creating a Singularity Definition/Recipe File
---

Definition files (aka container recipes) are the blueprint for creating a custom Singularity container. These files consist of a **header** and **sections**, both of which are briefly discussed below. For more information on definition files, refer to the [Singularity Documentation](https://docs.sylabs.io/guides/latest/user-guide/definition_files.html).

### Generating a Definition File

---

Run the following command to create your own definition file. Create it on a system that can build containers (e.g., skyriver or feather).

```
vim Singularity.def
```

> Usually, the definition file for a Singularity container is just named "Singularity". However, if you are making a container for a specific environment, make sure to name it appropriately.
>> Ex: `Singularity_NetCDF.def` as a definition file name to build a NetCDF environment container.

### Header

---

The header of a definition file describes the core OS to build and configure within the container. The header can include several keywords, but for the purposes of our container, we only need `Bootstrap` and `From`. An example header is shown and explained below.

```
Bootstrap: docker
From: centos:7.9.2009
```

1. **Bootstrap** - this keyword refers to what kind of base you want to use (in this case, we want to use Docker).
2. **From** - this keyword refers to the named container/reference to layers you want to use (in this case, centos:7.9.2009).

For more on the docker bootstrap agent, refer to the [Singularity appendix](https://docs.sylabs.io/guides/latest/user-guide/appendix.html#build-docker-module).

### Sections

---

The main content of a definition file is within the sections. Each section has its own function and provides different content and commands. The following sections were used to create the miniconda Singularity container (in this order).

##### %labels

This section is used to store metadata within the container. It is often filled with information about the author and application. An example `%labels` section is shown below.

```
%labels

    APPLICATION_NAME Miniconda - CentOS 7.9.2009
    APPLICATION_URL https://cw3e.ucsd.edu
    APPLICATION_VERSION 1.0

    AUTHOR_NAME Patrick Mulrooney
    AUTHOR_EMAIL pmulrooney@ucsd.edu

    CO_AUTHOR_NAME Jozette Conti
    CO_AUTHOR_EMAIL jlconti@ucsd.edu

    LAST_UPDATED 2023.07.11
```

##### %setup

Any commands in this section are executed on the host system outside the container after the base OS has been installed. For the purposes of the conda container, leave this blank.

```
%setup
```

##### %environment

This section allows you to define environment variables that are set at runtime. If any variables are needed at build time, place them in the [`%post` section](#%post). During the build, this section is written to a file in the container's metadata folder, and that file is sourced at runtime. For the miniconda environment, the following environment variable was set.

```
%environment
    PATH=/opt/conda/bin:$PATH
```

##### %files

This section is significant for our container. In this section, you can copy any necessary files into the container. For the purposes of our miniconda container, we need to generate a list of requirements for the conda environment in Comet. This can be achieved by running the following line in the base conda environment within Comet. 

```
conda list --explicit
```

This list can then be copied and pasted into a text file named `requirements.txt`. This file can then be embedded into the container through the definition file, as shown below.

```
%files

    ./requirements.txt /
```

The list of required conda packages from Comet will now be copied into the container when it is built.

> If your code uses more than just the base conda environment (e.g., NetCDF or iPython), you will have to generate a new requirements file specific to that environment to include in your container (i.e., run `conda list --explicit` in the terminal again while that specific environment is active).
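
Instead of copying and pasting, the list can also be redirected straight into a file and sanity-checked. The sketch below assumes the output of `conda list --explicit` contains an `@EXPLICIT` marker line followed by package URLs; `is_explicit_spec` is a hypothetical helper name, not part of conda.

```shell
# Hypothetical helper: succeed only if the file looks like an explicit conda
# spec, i.e. it has an "@EXPLICIT" marker line and at least one package URL.
is_explicit_spec() {
    grep -q '^@EXPLICIT$' "$1" && grep -q '^http' "$1"
}

# On Comet, inside the environment you want to capture (requires conda):
#   conda list --explicit > requirements.txt
#   is_explicit_spec requirements.txt && echo "requirements.txt looks usable"
```

A check like this catches an empty or hand-edited file before it is copied into the container by `%files`.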

<a id="%post"></a>
##### %post

The `%post` section is the core of the definition file. Creating directories and installing software and libraries happen here. The section consists of the commands a root user would type in a terminal to install your program successfully.

* The first part of our `%post` section is as follows

```
%post -c /bin/bash

    export CONDA_VERSION=py310_23.3.1-0
    export SHA256SUM=aef279d6baea7f67940f16aad17ebe5f6aac97487c7c03466ff01f4819e5a651
```

In these first few commands, we define the version and hash of the Miniconda installer we are using. These values can be found in the [conda documentation](https://docs.conda.io/en/latest/miniconda_hashes.html).

* The second part of the `%post` section is

```
    echo "=========== "
    yum install -y wget
```

The purpose of this `echo` command (as well as the ones that follow) is to provide separation between chunks of code and update the user during the building of the container.

`yum` is the package manager for Red Hat-based distributions (e.g., CentOS, which we are using). Running `yum install -y wget` installs a command needed for the next chunk of code.
> The `-y` flag tells the system to assume yes during the installation of wget

* The next part of our `%post` section is

```
    echo "=========== wget conda & verify"
    wget --quiet -O miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh && \
      echo "${SHA256SUM}  miniconda.sh" > miniconda.sha256 && \
      if ! sha256sum --strict -c miniconda.sha256; then exit 1; fi
```

With our now installed wget, we can download the installation script for the conda version previously defined. 
- `--quiet` suppresses wget's output
- `-O miniconda.sh` saves the download under the file name that follows the flag (here, `miniconda.sh`)
- The `&& \` at the end of the lines of code lets you do something based on whether the previous command was completed successfully (will complete the following line if the previous one was successful).

Once the download finishes, the SHA256SUM we defined earlier and the name of the newly downloaded file (`miniconda.sh`) are written to a new file named `miniconda.sha256`.

Then, using a control statement, we can check if the hash of the downloaded miniconda installer file matches what we have on record. If not, the build will be stopped.
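
The verify-then-abort pattern can be tried outside of a container build. The sketch below runs the same `sha256sum --strict -c` check on a throwaway file rather than the real installer; the file names and variable names are made up for illustration.

```shell
# Demonstrate the checksum pattern on a throwaway file,
# not on the real Miniconda installer.
workdir=$(mktemp -d)
printf 'echo "pretend installer"\n' > "${workdir}/demo.sh"

# Record the expected hash in the "<hash>  <file>" format that sha256sum -c reads.
expected=$(sha256sum "${workdir}/demo.sh" | cut -d ' ' -f 1)
echo "${expected}  ${workdir}/demo.sh" > "${workdir}/demo.sha256"

# An untouched file passes the check.
if sha256sum --strict -c "${workdir}/demo.sha256" > /dev/null; then
    intact_result="match"
else
    intact_result="mismatch"
fi

# Any change to the file makes the check fail; in %post this is the point
# where `exit 1` would stop the build.
echo "# tampered" >> "${workdir}/demo.sh"
if sha256sum --strict -c "${workdir}/demo.sha256" > /dev/null 2>&1; then
    tampered_result="match"
else
    tampered_result="mismatch"
fi

echo "intact: ${intact_result}, tampered: ${tampered_result}"
rm -r "${workdir}"
```

The two spaces between the hash and the file name match the format `sha256sum` itself prints, which is why the definition file writes `"${SHA256SUM}  miniconda.sh"`.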

* The next `%post` section is

```
    echo "=========== install conda"
    mkdir -p /opt && \
      sh miniconda.sh -b -p /opt/conda
```

This chunk of code makes sure the `/opt` directory exists and then runs the installer in batch mode (`-b`), installing miniconda into the prefix `/opt/conda` (`-p`).

* The next `%post` section

```
    echo "=========== add links"
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
      echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
      echo "conda activate base" >> ~/.bashrc
```

This code block creates a symbolic link to conda's shell setup script and appends commands to the `.bashrc` file, so that conda is set up and the base environment is activated when the user logs in or starts a new shell.

> Keep the line `echo "conda activate base" >> ~/.bashrc` unchanged even if you are building a container for a non-base environment (e.g., NetCDF or iPython).

* Next `%post` section

```
    echo "=========== Cleanup "
    rm miniconda.sh miniconda.sha256 && \
      find /opt/conda/ -follow -type f -name '*.a' -delete && \
      find /opt/conda/ -follow -type f -name '*.js.map' -delete && \
      /opt/conda/bin/conda clean -afy
```

Now that conda is installed, we can clean up our environment and clear any cache. To do this, we can remove the download files from earlier as well as any files ending in the extension `.a` or `.js.map`.
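
To see what the two `find ... -delete` commands match before pointing them at a real installation, they can be run on a scratch directory. The sketch below uses made-up file names in a temporary directory as a stand-in for `/opt/conda`.

```shell
# Build a throwaway directory tree (a stand-in for /opt/conda).
demo_root=$(mktemp -d)
mkdir -p "${demo_root}/lib" "${demo_root}/share"
touch "${demo_root}/lib/libdemo.a" "${demo_root}/lib/libdemo.so" \
      "${demo_root}/share/app.js" "${demo_root}/share/app.js.map"

# Same find expressions as the cleanup step, pointed at the demo tree.
find "${demo_root}/" -follow -type f -name '*.a' -delete
find "${demo_root}/" -follow -type f -name '*.js.map' -delete

# List what survived: only libdemo.so and app.js should remain.
remaining=$(find "${demo_root}" -type f | sort)
echo "${remaining}"
rm -r "${demo_root}"
```

Static libraries (`.a`) and JavaScript source maps (`.js.map`) are not needed at runtime, so removing them only shrinks the image.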

* Next `%post` section

```
    echo "=========== Init environment"
    . /opt/conda/etc/profile.d/conda.sh
```

This block of code initializes the conda environment by running the conda script.

* Last `%post` section for our miniconda environment

```
    echo "=========== Install from requirements.txt"
    /opt/conda/bin/conda install -y --file /requirements.txt
```

Now that the environment is initialized, we can install the packages listed in the requirements file from Comet (i.e., the `requirements.txt` file from earlier). After this, we are done with the `%post` section.

##### %runscript

This section contains the commands used to run the now-installed program from the `%post` section; they are executed when the container is run. This section is simple for our definition file.

```
%runscript

    /bin/bash
```

##### %test

This is the last section of our definition file! This section is run at the end of the build process. The test section for our definition files is just

```
%test
```

#### Finalized Definition File
---

Now that we have added all the necessary information, our final definition file should look like the following.  

```
cat Singularity.def
```

```
Bootstrap: docker
From: centos:7.9.2009

%labels

    APPLICATION_NAME Miniconda - CentOS 7.9.2009
    APPLICATION_URL https://cw3e.ucsd.edu
    APPLICATION_VERSION 1.0

    AUTHOR_NAME Patrick Mulrooney
    AUTHOR_EMAIL pmulrooney@ucsd.edu

    CO_AUTHOR_NAME Jozette Conti
    CO_AUTHOR_EMAIL jlconti@ucsd.edu

    LAST_UPDATED 2023.07.11

%setup

%environment
    PATH=/opt/conda/bin:$PATH

%files

    ./requirements.txt /

%post -c /bin/bash

    export CONDA_VERSION=py310_23.3.1-0
    export SHA256SUM=aef279d6baea7f67940f16aad17ebe5f6aac97487c7c03466ff01f4819e5a651

    yum install -y wget

    wget --quiet -O miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh && \
      echo "${SHA256SUM}  miniconda.sh" > miniconda.sha256 && \
      if ! sha256sum --strict -c miniconda.sha256; then exit 1; fi

    mkdir -p /opt && \
      sh miniconda.sh -b -p /opt/conda

    ln -s /opt/conda/etc/profile.d/conda.sh /et
```

## Building a Singularity Container From a Definition File
---

Now that our definition file is complete, we can build our Singularity environment in feather/skyriver. This can be achieved by using the following syntax in the same directory as our definition file.


```
singularity build ContainerName.sif YourDefineFile.def
```

If you have the correct permission, running this line of code will generate a Singularity image named `ContainerName.sif`

For the miniconda environment I made, I ran the following line of code

```
singularity build miniconda.sif Singularity.def
```

This process will take a little while to run, and a message is printed when the build is complete.

> If you are a non-root user in skyriver/feather at this step, you will need to add the flag `--fakeroot`
>> Hence, your line of code should be `singularity build --fakeroot ContainerName.sif YourDefineFile.def`
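
The root versus non-root choice can be captured in a small helper that prints the right command instead of running it. This is only a sketch: `build_command` is a hypothetical function name, and it assumes `--fakeroot` has been enabled for your account on the build host.

```shell
# Hypothetical helper: print the build command appropriate for a given user ID.
# Root (UID 0) builds directly; every other user gets --fakeroot.
build_command() {
    uid=$1
    image=$2
    definition=$3
    if [ "${uid}" -eq 0 ]; then
        echo "singularity build ${image} ${definition}"
    else
        echo "singularity build --fakeroot ${image} ${definition}"
    fi
}

# Print the command for the current user; review it, then run it yourself.
build_command "$(id -u)" miniconda.sif Singularity.def
```

Printing the command first keeps the decision visible, which is handy when the same notes are shared between root and non-root users.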

## Running/Shelling into a Singularity Container
---

Now that our Singularity container is built, we can do a simple check by running the image or shelling into it.

* To run the container, the following command was used

```
singularity run miniconda.sif
```

* Similarly, to shell into the container, the following command was used

```
singularity shell miniconda.sif
```

> For both of these commands, the prompt on the command line (e.g., `feather[~]$`) should change to `Singularity>`, indicating that you are now within the container.
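
The prompt change is only visible in an interactive session. Inside a script, one option is to inspect the environment instead. The sketch below assumes Singularity sets the `SINGULARITY_CONTAINER` variable inside a running container (Apptainer installations may use `APPTAINER_CONTAINER` instead); `in_container` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if this shell appears to be running inside
# a Singularity container (assumes SINGULARITY_CONTAINER is set there).
in_container() {
    [ -n "${SINGULARITY_CONTAINER:-}" ]
}

if in_container; then
    echo "inside container: ${SINGULARITY_CONTAINER}"
else
    echo "on the host"
fi
```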

## Transferring a Singularity Container To Comet

---

As Singularity containers cannot be built on Comet, we must transfer the container we made on a different system to Comet. We can run an rsync command (preferred over scp) to do this. The following syntax can be used for this command.

```
rsync -avh --progress /path/to/miniconda.sif <whoami>@comet.sdsc.edu:/cw3e/mead/path/to/where/you/want/miniconda.sif
```

For example, the line of code I ran for this was, 

```
rsync -avh --progress /data/projects/containers/miniconda/miniconda.sif jlconti@comet.sdsc.edu:/cw3e/mead/projects/cwp106/scratch/jlconti/miniconda.sif
```

> This line of code is to be run in feather/skyriver. It then copies the Singularity image to Comet.

* The `--progress` flag shows the progress of the image transfer. Once it completes, make sure to verify that the image has, in fact, been transferred to the desired directory in Comet.

<a id="verify"></a>
## Verifying Your Container

---

Once the miniconda container has been transferred, run the following line of code with your desired path in Comet.

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/path/to/miniconda.sif conda list
```

For example, the line of code I ran was

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/miniconda.sif conda list
```

Once this command has been successfully run, you should see a list of all the installed packages in your container. Verify this list to ensure the container was built with the correct requirements.

> Make sure you run this code line on a compute node in Comet, not the login node. To go onto a compute node, run the following line of code with your specific project ID.
>> `srun -A <project_ID> --partition=compute --pty --nodes=1 --wait=0 --export=ALL -t 08:00:00 /bin/bash`

## Implementing the Container into Bash Scripts

---

Now that our container has been built, transferred, and verified, we can implement it into our scripts. For these next few steps, I will use scripts from the [MET-tools/Grid-Stat repository](https://github.com/CW3E/MET-tools/tree/main). I will work with the [`run_wrfout_cf.sh`](https://github.com/CW3E/MET-tools/blob/main/Grid-Stat/run_wrfout_cf.sh) and [`batch_wrfout_cf.sh`](https://github.com/CW3E/MET-tools/blob/main/Grid-Stat/batch_wrfout_cf.sh) scripts as examples. As these require a NetCDF environment, I built a container (`MET_tools_conda_netcdf.sif`) the same way as above, but now with a different requirements file that is specific to a NetCDF environment. Below are the steps to implement a container into your scripts using `MET_tools_conda_netcdf.sif` as an example.

### Remove any Existing conda Environment 

First, we will have to remove any reference to conda. This is because our newly built Singularity container replaces the need for a conda environment to be hard-coded within our scripts. For example, in the `batch_wrfout_cf.sh` script, commands such as `conda init bash` or `conda activate netcdf` can be removed.

### Add Singularity Container into Scripts 

Now that any reference to conda has been removed, we have to add in our Singularity container for our commands to work. As we are working with a NetCDF environment, we have to reference our container with the following line of code before any calls to NetCDF tools (e.g., `ncks`, `cdo`, `ncl`).

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/path/to/MET_tools_conda_netcdf.sif
```

For example, the line of code I used was

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif
```

This line of code can be added before any NetCDF calls, as shown in the steps below.

```
cmd="ncl 'file_in=\"${file_2}\"' "
```

This is from [line 203](https://github.com/CW3E/MET-tools/blob/main/Grid-Stat/run_wrfout_cf.sh#L203) in the `run_wrfout_cf.sh` script. To implement the Singularity container, we can add the `singularity exec ...` command right before the call to `ncl`, as shown below. (And similar for any other NetCDF-related command in your scripts).

```
cmd="singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif ncl 'file_in=\"${file_2}\"' "
```

Check to see if the calls to the Singularity container will successfully run the script. If so, we know everything is working and can move on to the next step.
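
When a script makes several NetCDF calls, repeating the full `singularity exec` line each time gets noisy. One option is to keep it in a variable, as sketched below. The variable names `sif_path` and `container_prefix` and the input path are hypothetical; they do not come from the MET-tools scripts.

```shell
# Path copied from the example above; adjust it for your own project.
sif_path=/cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif
container_prefix="singularity exec --bind /cw3e:/cw3e,/scratch:/scratch ${sif_path}"

# Hypothetical input file, standing in for the value the script computes.
file_2=/path/to/wrfout_file.nc

# Prepend the prefix to each NetCDF-related command string.
cmd="${container_prefix} ncl 'file_in=\"${file_2}\"' "
echo "${cmd}"
```

With this layout, moving the image to a new location means editing one line rather than every call site.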

## Using Containers to Run Python Scripts
---

For Bash scripts, we only had to reference the container for specific NetCDF commands. However, for Python scripts, we will need to use the container itself to run the code. This can be achieved by running a similar `singularity exec --bind ...` command from before directly in the terminal of the working directory. The form of the command to use for Python scripts is shown below.

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/path/to/MET_tools_conda_ipython.sif python /path/to/scripts/python_script.py
```

For example, the line of code that I used to run the `proc_gridstat.py` script from the MET-tools/Grid-Stat directory was 

```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_ipython.sif python /home/jlconti/MET-tools/Grid-Stat/proc_gridstat.py
```

This line of code can be used to run any Python script that is dependent on the Singularity container environment.

## Using a Singularity Container to Launch a Jupyter Notebook
---

Another way we can utilize our container is by using it to launch a Jupyter notebook server on Comet. Before we can do this, however, we need to check that our container has the required Jupyter packages. To check this, we can run the same line of code we used to [verify our container](#verify). For my specific container, I ran the following line


```
singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/miniconda.sif conda list
```
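
Scanning a long package list by eye is error-prone, so the output can be filtered instead. The sketch below assumes `conda list` prints one package per line with the package name in the first column; `has_package` is a hypothetical helper, and `notebook` is only an example of a package name to look for.

```shell
# Hypothetical helper: read `conda list` output on stdin and succeed only if
# a package with exactly the given name appears in the first column.
has_package() {
    awk -v pkg="$1" '$1 == pkg { found = 1 } END { exit !found }'
}

# Example use on Comet (requires Singularity and the image):
#   singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/path/to/miniconda.sif conda list \
#     | has_package notebook && echo "notebook is installed"
```

Matching on the whole first column avoids false positives from names that merely contain the search string.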

If the Jupyter notebook packages are listed within the contained environment, then the container can be used to launch a Jupyter notebook from the terminal. To do this, we first must make sure we are not in any active conda environment. If we are currently in a conda environment, then we can run the following command repeatedly until we are sure no conda environment is active.

```
conda deactivate
```
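
The repetition can be automated. The sketch below relies on the `CONDA_SHLVL` variable, which conda maintains as a count of stacked active environments; `deactivate_all` is a hypothetical helper name.

```shell
# Hypothetical helper: keep deactivating until no conda environment is active.
# CONDA_SHLVL is unset or 0 when nothing is active.
deactivate_all() {
    while [ "${CONDA_SHLVL:-0}" -gt 0 ]; do
        # Stop rather than loop forever if deactivation fails.
        conda deactivate || break
    done
}

deactivate_all
echo "active conda levels: ${CONDA_SHLVL:-0}"
```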

Once we are out of any active conda environment, we can run the following line of code in the terminal.

```
galyleo launch -B  /cw3e:/cw3e,/scratch:/scratch --sif /cw3e/mead/path/to/miniconda.sif -j notebook -A <project_ID> -t 7-00:00:00 
```

As an example, the line of code I used is below.

```
galyleo launch -B  /cw3e:/cw3e,/scratch:/scratch --sif /cw3e/mead/projects/cwp106/scratch/jlconti/miniconda.sif -j notebook -A cwp106 -t 7-00:00:00 
```

This command uses galyleo to launch a Jupyter notebook from a Singularity container environment for a set amount of time. 

## Summary 
---
This notebook explained Singularity containers and how to make one using a definition file. Once built, it was shown that a container replaces the need for any hard-coded conda environments. Commands that once depended on a conda environment now rely on the container made for that respective environment. Lastly, Singularity containers can be used to run Python scripts as well as launch Jupyter notebooks.  

## Next Steps
---
In the next notebook, [Creating a Configuration File](./Creating-a-Configuration-File.ipynb), we will review how to generate a universal configuration file for both Bash and Python scripts.