---
title: "DSCI 522 Lecture 3"
subtitle: "Customizing and Building Containers"
author: "Sky Sheng"
execute:
  eval: false
format:
  revealjs:
    theme: default
    transition: slide
    slide-number: true
    chalkboard: true
    logo: ../images/mds-hex-sticker.png
---


## 😺 iClicker: How is your group collaboration going?

::: {.column width="33%"}
<div style="text-align: center; font-weight: bold; font-size: 24px; margin-bottom: 10px;">(A)</div>
<div style="text-align: center;">
<img src="https://media1.giphy.com/media/v1.Y2lkPTc5MGI3NjExdHRiejRzbDhlOGcyNTI5aGc5MTl0ZWJiMnEwNW1tZWFueTQxZDZ3ayZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/gl8ymnpv4Sqha/giphy.gif" style="width: 60%; max-width: 400px;"/>
</div>
:::

::: {.column width="33%"}
<div style="text-align: center; font-weight: bold; font-size: 24px; margin-bottom: 10px;">(B)</div>
<div style="text-align: center;">
<img src="https://media2.giphy.com/media/v1.Y2lkPTc5MGI3NjExODc3cXlvZHd1MmlybDZxenM5ZTNiOHcyZXd5djFmaDluamRvNWV0ZSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/gcW5zU5Te3qPm/giphy.gif" style="width: 65%; max-width: 400px;"/>
</div>
:::

::: {.column width="33%"}
<div style="text-align: center; font-weight: bold; font-size: 24px; margin-bottom: 10px;">(C)</div>
<div style="text-align: center;">
<img src="https://media1.giphy.com/media/v1.Y2lkPTc5MGI3NjExMmQ3a2VweWowNHlhZ3hmZTFzdmJibWQzd2Jrd2xwa3dlbnFjYmIweiZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/JIX9t2j0ZTN9S/giphy.gif" style="width: 60%; max-width: 400px;"/>
</div>
:::


<p style="position: fixed; bottom: 0px; right: 0px; font-size: 20px; color: gray; z-index: 1000;">[Source](https://giphy.com)</p>

## [Conda](https://ubc-dsci.github.io/reproducible-and-trustworthy-workflows-for-data-science/lectures/085-virtual_env-python-conda.html) Recap: 🧐 How to...

::: {style="font-size: 0.7em;"}
- Create a new conda environment
- List all conda environments
- Activate a conda environment
- Deactivate a conda environment
- Remove a conda environment
- List all packages in a conda environment
- Remove a package from a conda environment
- Update a package in a conda environment
- Update all packages in a conda environment
- Share a conda environment with someone else
- Create a conda environment from `environment.yml` file
- Duplicate a conda environment
:::

## 📜 Conda Cheat Sheet Part 1️⃣

::: {style="font-size: 0.7em;"}
| Task | Command |
|------|---------|
| Create a new conda environment | `conda create -n <env_name>` |
| Create with specific Python version | `conda create --name <env_name> python=3.11` |
| Create with packages | `conda create --name <env_name> numpy pandas` |
| List all conda environments | `conda env list` or `conda info --envs` |
| Activate a conda environment | `conda activate <env_name>` |
| Deactivate a conda environment | `conda deactivate` |
| Remove a conda environment | `conda env remove -n <env_name>` |
| List all packages in environment | `conda list` |
| Remove a package from **current** environment| `conda remove <package_name>` |
:::

## 📜 Conda Cheat Sheet Part 2️⃣

::: {style="font-size: 0.65em;"}
| Task | Command |
|------|---------|
| Update a package in **current** environment | `conda update <package_name>` |
| Remove a package in **other** environment | `conda remove -n <env_name> <package_name>` |
| Update a package in **other** environment | `conda update -n <env_name> <package_name>` |
| Update all packages in **current** environment | `conda update --all` |
| Share environment | `conda env export --from-history > environment.yml` |
| Create conda environment from `environment.yml` | `conda env create --file environment.yml` |
| Duplicate a conda environment | `conda create --name <new_env> --clone <old_env>` |
:::

## 📕 Command Line Notes

* `--name` and `-n` are equivalent.
* `--file` and `-f` are equivalent.
* In `conda-lock`, `-p` and `--platform` are equivalent and specify the target platform (e.g., `linux-64`, `osx-arm64`, `win-64`). In `conda` itself, `-p` means `--prefix` instead.
* A fuller command line cheat sheet is [here](https://github.com/skysheng7/linux_command_in_aws_ec2/blob/main/linux_cheatsheet/basic_command_line.md).

## [Conda-lock](https://ubc-dsci.github.io/reproducible-and-trustworthy-workflows-for-data-science/lectures/090-conda-lock.html) Recap

🔒 What are the commands for the following tasks?

- Generate a general conda-lock file for all platforms
- Generate a conda-lock file for a specific platform


## üìú Conda-lock Cheat Sheet

::: {style="font-size: 0.7em;"}
|Command | Output |
|-------------------|-------|
|# General lock file for all platforms <br> `conda-lock lock --file environment.yml` | `conda-lock.yml` |
|# General lock file for one platform (e.g., Linux) <br> `conda-lock lock --file environment.yml -p linux-64` | `conda-lock.yml` |
|# Explicit lock file for one platform (e.g., Linux) <br> `conda-lock -k explicit --file environment.yml -p linux-64` | `conda-linux-64.lock` |
|# Explicit lock file rendered from `conda-lock.yml` <br> `conda-lock render -p linux-64` | `conda-linux-64.lock` |
:::

## 🙀 `conda-lock.yml` VS `conda-linux-64.lock`? {style="font-size: 0.8em;"}

::: {style="font-size: 0.73em;"}
| Feature | `conda-lock.yml` | `conda-linux-64.lock` |
|------|-----------------------|----------------------------|
| **Format** | Unified YAML (multi-platform) | Explicit (single-platform) |
| **Content** | Structured metadata + dependencies for all platforms | Simple list of package URLs |
| **File Size** | Larger (contains all platforms) | Smaller (one platform only) |
| **Installation** | `conda-lock install --name <env_name> conda-lock.yml` | `conda create --name <env_name> --file conda-linux-64.lock` |
| **Use case** | Development across multiple platforms | Production deployment, Docker, single platform |
| **Speed** | Slightly slower (conda-lock processes it) | Fastest |
:::

## Some data science conventions

::: {style="font-size: 0.8em;"}
* 🐍 Use snake_case for all folder & file names (lowercase + underscore)
* Use Markdown for documentation.
    * Need a review? [Here is the Markdown tutorial](https://www.markdowntutorial.com/)
* Use YAML (`.yml` or `.yaml`) files for data storage and system configuration (NOT for documentation)
    * YAML = "YAML Ain't Markup Language" (originally "Yet Another Markup Language")
    * Block structure needs no braces or square brackets
    * Use `#` for comments
    * Python-style indentation using spaces (NOT TABS)
    * JSON is a **subset** of YAML (YAML 1.2), so a JSON file can be parsed by a YAML parser.
:::

## 🚢 Docker Recap

1. ⚠️ Command lines are very sensitive to whitespace and quotation marks!


```{bash}
docker run \
    --rm \
    -p 8788:8787 \
    -e PASSWORD="apassword" \
    rocker/rstudio:4.4.2

```


2. 🏷️ Don't use the `latest` tag for an image; use a specific version tag instead.
3. A Docker cheat sheet is available in the [textbook](https://ubc-dsci.github.io/reproducible-and-trustworthy-workflows-for-data-science/lectures/110-containerization-2.html)

# We don't like manual work!

<div style="text-align: center;">
  <img src="https://media3.giphy.com/media/v1.Y2lkPTc5MGI3NjExeWxxaDBuOG1wdnhzcXFhYXc4bzNuZGhjamJrYTRkcjJkemlycWdxMyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/jnhXd7KT8UTk5WIgiV/giphy.gif" style="width: 90%; max-width: 700px;"/>
</div>

<p style="position: fixed; bottom: 0px; right: 0px; font-size: 20px; color: gray; z-index: 1000;">Source: Minions movie</p>


## 🤩 `docker-compose.yml` file comes to the rescue!


```{yaml}
services:
  analysis-env:
    image: rocker/rstudio:4.4.2
    ports:
      - "8789:8787"
    volumes:
      - .:/home/rstudio/project
    environment:
      PASSWORD: password
    deploy:
      resources:
        limits:
          memory: 5G
```


## 😳 But how do we use `docker-compose.yml`?

* Launch the container: `docker compose up`
* Stop the container: press `Ctrl + C` in the terminal where you launched the container, and then run `docker compose rm` to remove it
* Read more in the textbook [here](https://ubc-dsci.github.io/reproducible-and-trustworthy-workflows-for-data-science/lectures/115-docker-compose.html)
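
## Optional: running `docker compose` in the background

The launch and stop steps can also be run without tying up the terminal. The sketch below is one way to do it, assuming Docker with the Compose plugin is installed and a `docker-compose.yml` sits in the current folder. The `compose_cycle` function name is made up for this example; the `docker compose` subcommands are standard.

```bash
# Sketch: start the service in the background, inspect it, then clean up.
compose_cycle() {
    # Start the services detached, so the terminal stays free
    docker compose up -d || return 1
    # List the running services and their port mappings
    docker compose ps
    # Show what the services have printed so far
    docker compose logs
    # Stop AND remove the containers (and the default network)
    docker compose down
}

# Only run when docker is installed and a compose file is present
if command -v docker >/dev/null 2>&1 && [ -f docker-compose.yml ]; then
    compose_cycle
else
    echo "Skipping: docker or docker-compose.yml not found"
fi
```

`docker compose down` stops and removes the containers in one step, so no separate `rm` is needed.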

## Today's topic: Let's build our own [container](https://ubc-dsci.github.io/reproducible-and-trustworthy-workflows-for-data-science/lectures/120-containerization-3.html)!

![](../images/docker_otter.png){style="position: fixed; bottom: -100px; left: -55px; margin: 0; padding: 0;"}

# Docker images for RStudio, VSCode, and Cursor!

* Daniel has kindly created this tutorial on how to build a Docker image for an R environment managed by `renv`: [docker-renv](https://github.com/chendaniely/docker-renv.git)
* Sky has created this tutorial on how to build a Docker image for VSCode & Cursor: [docker-vscode-cursor](https://github.com/skysheng7/docker-vscode-cursor.git)

# Command line using `docker-compose.yml` file

## Code we run in class: `docker-compose.yml` practice

Step-by-step instructions:

```{bash}
# 1. Please first `cd` to a local folder of your choice, don't put all files in your home directory!
# Example `cd` command: 
cd /Users/skysheng/Desktop/github/dsci522

# 2. make a new folder called `demo_docker` (or any other name you like)
mkdir demo_docker

# 3. move into that folder we just created
cd demo_docker

# 4. create a docker-compose.yml file using nano
nano docker-compose.yml

```


## Code we run in class: `docker-compose.yml` practice

::: {.smaller style="font-size: 0.8em;"}
5. Copy and paste the following code into the `docker-compose.yml` file. The local port is set to 8789, the password is `password`, and the username is `rstudio`.

```{yaml}
services:
  analysis-env:
    image: rocker/rstudio:4.4.2
    ports:
      - "8789:8787"
    volumes:
      - .:/home/rstudio/project
    environment:
      PASSWORD: password
    deploy:
      resources:
        limits:
          memory: 5G
```


* You can also create this file using a graphical user interface (GUI) editor like VSCode.
* If you used `nano` to create this file, press `Ctrl + X` to exit. Nano will ask whether to save the file: press `Y`, then `Enter`.

:::
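
## Optional: write `docker-compose.yml` without an editor

If pasting into `nano` is awkward, a shell heredoc can write the same file in one go. This is only a sketch of an alternative to steps 4 and 5, and it overwrites any `docker-compose.yml` already in the current folder.

```bash
# Write docker-compose.yml without opening an editor.
# The quoted 'EOF' stops the shell from expanding anything inside the block.
cat > docker-compose.yml <<'EOF'
services:
  analysis-env:
    image: rocker/rstudio:4.4.2
    ports:
      - "8789:8787"
    volumes:
      - .:/home/rstudio/project
    environment:
      PASSWORD: password
    deploy:
      resources:
        limits:
          memory: 5G
EOF

# Print the file to confirm the indentation survived
cat docker-compose.yml
```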

## Code we run in class: `docker-compose.yml` practice

::: {.smaller style="font-size: 0.9em;"}

```{bash}
# 6. Print out the content of the file to make sure it is correct.
cat docker-compose.yml

# 7. Launch the container using docker compose files.
docker compose up
```


8. After the container is launched, your terminal will stay attached to it and show its logs. Open your browser and go to `http://localhost:8789` to access RStudio.

9. To stop the container, press `Ctrl + C` in the terminal where you launched it, and then run:


```{bash}
# 10. Remove the container.
docker compose rm
```

:::

# Command line creating `Dockerfile` 

## Code we run in class: `Dockerfile` practice

Step-by-step instructions:

```{bash}
# 1. Please first `cd` to a local folder of your choice, don't put all files in your home directory!
# Example `cd` command: 
cd /Users/skysheng/Desktop/github/dsci522

# 2. make a new folder called `demo_docker` (or any other name you like)
mkdir demo_docker

# 3. move into that folder we just created
cd demo_docker

# 4. create an environment.yml file using nano
nano environment.yml
```


## Code we run in class: `Dockerfile` practice

::: {.smaller style="font-size: 0.8em;"}

5. Copy and paste the following code into the `environment.yml` file.

```{yaml}
name: my_env
channels:
- conda-forge
dependencies:
- conda-lock=3.0.4
- pandas=2.3.3
- pandera=0.26.1
- pip=25.3
- python=3.11.14
- pip:
  - deepchecks==0.19.1
```


* You can also create this file using a graphical user interface (GUI) editor like VSCode.
* If you used `nano` to create this file, press `Ctrl + X` to exit. Nano will ask whether to save the file: press `Y`, then `Enter`.

:::

## Code we run in class: `Dockerfile` practice


```{bash}
# 6. Print out the content of the file to make sure it is correct.
cat environment.yml
```


7. Create a conda-lock file using the following command: 

* MacBook users with Apple Silicon chips need the following command to create an explicit lock file for Linux on ARM:


```{bash}
conda-lock -k explicit --file environment.yml -p linux-aarch64
```


* Everyone else can use the following command: 


```{bash}
conda-lock -k explicit --file environment.yml -p linux-64
```
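
## Optional: pick the lock platform automatically

The choice between the two commands above depends only on the CPU architecture, so it can be scripted. The sketch below assumes `uname -m` reports `arm64` or `aarch64` on ARM machines (as it does on Apple Silicon macOS and ARM Linux) and treats everything else as `linux-64`. The `lock_platform` helper is a hypothetical name for this example.

```bash
# Map a CPU architecture to the Linux platform used inside Docker.
# With no argument, the architecture of this machine is used.
lock_platform() {
    case "${1:-$(uname -m)}" in
        arm64|aarch64) echo "linux-aarch64" ;;
        *)             echo "linux-64" ;;
    esac
}

platform="$(lock_platform)"
echo "Locking for: ${platform}"

# Only run conda-lock when it is installed and environment.yml is present
if command -v conda-lock >/dev/null 2>&1 && [ -f environment.yml ]; then
    conda-lock -k explicit --file environment.yml -p "${platform}"
else
    echo "Skipping: conda-lock or environment.yml not found"
fi
```

The resulting file is named `conda-<platform>.lock`, which is the name the `Dockerfile` has to copy.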


## Code we run in class: `Dockerfile` practice

8. Create a file named `Dockerfile` using nano, or use a GUI editor like VSCode.


```{bash}
# 8. Create the Dockerfile (the file name has no extension).
nano Dockerfile
```


## Code we run in class: `Dockerfile` practice

::: {.smaller style="font-size: 0.8em;"}
9. Copy and paste the following code into the `Dockerfile`. We use the Jupyter minimal-notebook image as the base image and copy the conda lock file into it.

* MacBook users with Apple Silicon chips will need to use the following code:


```{dockerfile}
FROM quay.io/jupyter/minimal-notebook:afe30f0c9ad8

COPY conda-linux-aarch64.lock /tmp/conda-linux-aarch64.lock
```


* Everyone else can use the following code: 


```{dockerfile}
FROM quay.io/jupyter/minimal-notebook:afe30f0c9ad8

COPY conda-linux-64.lock /tmp/conda-linux-64.lock
```

:::

## Code we run in class: `Dockerfile` practice

10. Build the docker image locally using the following command: 


```{bash}
# 10. Build the Docker image locally and tag it `testing_cmds`.
# Pay attention to the dot at the end of the command: it sets the build context to the current folder.
docker build --tag testing_cmds .
```
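
## Optional: check the build context first

`docker build` fails at the `COPY` step if the lock file is not in the build context (the current folder). A small pre-flight check can catch that early. This is a sketch only: `check_build_context` is a hypothetical helper, and the lock file name assumes the `linux-64` platform.

```bash
lock_file="conda-linux-64.lock"   # use conda-linux-aarch64.lock on Apple Silicon

# Return 0 only if the Dockerfile and the given lock file both exist here
check_build_context() {
    for f in Dockerfile "$1"; do
        if [ ! -f "$f" ]; then
            echo "Missing: $f"
            return 1
        fi
    done
    echo "Build context looks complete"
}

# Build and list the image only when the files and docker are available
if check_build_context "$lock_file" && command -v docker >/dev/null 2>&1; then
    docker build --tag testing_cmds .
    docker image ls testing_cmds
fi
```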


## Code we run in class: `Dockerfile` practice

11. Run the docker image you just built using the following command: 

* Launch a terminal only (on Macs with an M4 chip, you may run into bugs here):


```{bash}
docker run --rm -it testing_cmds ../../bin/bash
```


* Launch interactive terminal on web browser:


```{bash}
docker run --rm -it -p 8888:8888 testing_cmds
```
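
## Optional: confirm the lock file is in the image

A quick way to check that the `COPY` instruction worked is to list the file from a throwaway container. The sketch below assumes the image was tagged `testing_cmds`, that the `linux-64` lock file was copied to `/tmp`, and that the Jupyter base image passes a command given to `docker run` through to the container.

```bash
platform="linux-64"   # use linux-aarch64 on Apple Silicon
lock_in_image="/tmp/conda-${platform}.lock"
echo "Expecting lock file at: ${lock_in_image}"

# Only run when docker is installed and the image has been built
if command -v docker >/dev/null 2>&1 && docker image inspect testing_cmds >/dev/null 2>&1; then
    # --rm deletes the container again as soon as `ls` finishes
    docker run --rm testing_cmds ls -l "${lock_in_image}"
else
    echo "Skipping: docker or the testing_cmds image is not available"
fi
```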