<div align="center"><a href="https://www.nvidia.com/en-us/deep-learning-ai/education/"><img src="./images/DLI_Header.png"></a></div>

# Docker for Recommender Systems

When it comes time to take a model from experiment to production, there are many operational aspects to consider. Running a model out of a notebook is difficult to scale to millions of users. Considering how many items we could be making predictions for and how active our user base might be, we could easily be making thousands of predictions per second! (For comparison, Netflix receives [20,000 requests per second during peak traffic](https://netflixtechblog.com/making-the-netflix-api-more-resilient-a8ec62159c2d).)

There are many different strategies to scale, but we'll be discussing [Docker](https://www.docker.com/).

## Objectives
* Understand how to set up a Docker container
  * [1. Dockerfile](#1.-Dockerfile)
* Understand how to set up multiple containers with [Docker Compose](https://docs.docker.com/compose/)
  * [2. Docker Compose](#2.-Docker-Compose)

## 1. Dockerfile

Docker is a platform for building and running [containers](https://www.docker.com/resources/what-container), which package code and its dependencies so that it can easily be copied and run in different computing environments. Containers are similar to [virtual machines](https://en.wikipedia.org/wiki/Virtual_machine), with one key difference: a virtual machine has its own virtualized hardware and operating system, separate from its host machine, whereas a container shares its host machine's operating system kernel.

Let's get meta for a moment, and look at the [Dockerfile](https://docs.docker.com/engine/reference/builder/) for this class by running the cell below. A number of Dockerfile instructions were used to build this notebook environment:
* [FROM](https://docs.docker.com/engine/reference/builder/#from): The base image to build from. Images can be built on top of other images. In our case, we'll be using [NVIDIA's TensorFlow container](https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow), which already configures TensorFlow to run on top of a recent version of CUDA.
* [ENV](https://docs.docker.com/engine/reference/builder/#env): Sets an environment variable.
* [RUN](https://docs.docker.com/engine/reference/builder/#run): Runs a shell command while the image is being built.
* [WORKDIR](https://docs.docker.com/engine/reference/builder/#workdir): Changes the working directory inside the container. `RUN cd` is ineffective because each `RUN` command gets a fresh shell, as described in this [Stack Overflow](https://stackoverflow.com/questions/58847410/difference-between-run-cd-and-workdir-in-dockerfile) post.
* [ADD](https://docs.docker.com/engine/reference/builder/#add): Copies files from the build environment into the image. In this case, we're copying the labs, like this one here.
* [EXPOSE](https://docs.docker.com/engine/reference/builder/#expose): Declares the port that the container listens on at runtime. It does not publish the port to the host by itself. This lab uses port `8888`.
* [ENTRYPOINT](https://docs.docker.com/engine/reference/builder/#entrypoint): Sets the command that runs when the container starts, which allows us to run our container as an executable and pass it command line arguments.

In [1]:
import IPython
IPython.display.Code(filename="../Dockerfile", language="dockerfile")

This Dockerfile can be broken down into four goals:
1. Install the necessary libraries to set up Jupyter.
2. Install libraries for students to interact with in the notebooks.
3. Install libraries to interact with a Triton server.
4. Start the Jupyter server.

Point 3 is how this all relates to recommender systems. We're going to be working with a [Triton Server](https://docs.nvidia.com/deeplearning/triton-inference-server/master-user-guide/docs/) to scale out our Wide & Deep model from the previous lab so we can make web-based requests to it.
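To make the instructions listed earlier in this section concrete, here is a minimal sketch. The Dockerfile in the string is hypothetical and much simpler than the one used for this class: the base image, the installed package, and the paths are all assumptions chosen for illustration. The small helper (`list_instructions`, also a made-up name) just lists each instruction in the order Docker would process it.

```python
# Hypothetical Dockerfile for illustration only; it is NOT the course's file.
EXAMPLE_DOCKERFILE = """\
# Start from a small Python base image (assumed, not the NVIDIA image).
FROM python:3.10-slim
ENV PYTHONUNBUFFERED=1
RUN pip install jupyterlab
WORKDIR /workspace
ADD notebooks/ /workspace/notebooks/
EXPOSE 8888
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--allow-root"]
"""


def list_instructions(dockerfile_text):
    """Return (instruction, arguments) pairs, skipping blank lines and comments.

    This is a simplified reader: it does not handle line continuations.
    """
    pairs = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # The instruction name is the first word; the rest are its arguments.
        instruction, _, arguments = stripped.partition(" ")
        pairs.append((instruction.upper(), arguments.strip()))
    return pairs


instructions = list_instructions(EXAMPLE_DOCKERFILE)
for name, arguments in instructions:
    print(f"{name:<10} {arguments}")
```

Each instruction adds a step to the image build, so the order matters: `WORKDIR` comes before `ADD` here so that later steps start from `/workspace`.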

## 2. Docker Compose

For this lab, we're actually running multiple containers. Let's take a look at a different file to see how to set that up. Below is the [Docker Compose](https://docs.docker.com/compose/) file. It's similar to our `Dockerfile` above, but it's written in YAML (`.yml`) instead.

For instance, [image](https://docs.docker.com/compose/compose-file/#image) below corresponds to [FROM](https://docs.docker.com/engine/reference/builder/#from) above. Under [services](https://docs.docker.com/compose/compose-file/#service-configuration-reference), we have the containers used to build the course.

To focus on recommender systems, we're going to look at `triton` and `prometheus`. The other services are boilerplate for getting JupyterLab up and running, but they're visible for the curious.

For now, let's focus on `triton` and break down each of the keys:
* [command](https://docs.docker.com/compose/compose-file/#command): The command for the container to run when it starts. In this case, it's the same command we would use to start the server if we had installed the Triton Inference Server library locally, as [described here](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/quickstart.html#run-triton-inference-server).
* [image](https://docs.docker.com/compose/compose-file/#image): The image to start the container from, in this case, the Triton Inference Server image.
* [shm_size](https://docs.docker.com/compose/compose-file/#shm_size): The size of the shared memory (`/dev/shm`) available to the container. In this case, we're giving it 1 gigabyte for faster computation.
* [ulimits](https://docs.docker.com/compose/compose-file/#ulimits): Overrides the container's default resource limits, such as the maximum number of open file descriptors per process explained in this [Stack Overflow](https://stackoverflow.com/questions/24955883/what-is-the-max-opened-files-limitation-on-linux) post.
* [ports](https://docs.docker.com/compose/compose-file/#ports): The container ports to publish so they can be reached from outside the container.
* [volumes](https://docs.docker.com/compose/compose-file/#volume-configuration-reference): A directory that can be shared between a container and its host.

In [2]:
IPython.display.Code(filename="../docker-compose.yml", language="yaml")
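One way to build intuition for the service keys discussed in this section is to see roughly which `docker run` flags they correspond to. The sketch below is an approximation under stated assumptions: the `triton_service` dictionary is hypothetical (the image tag is a placeholder, and the paths, limits, and port list are illustrative rather than copied from this course's file), and `compose_service_to_docker_run` is a made-up helper, not part of Docker. Compose also handles things this sketch ignores, such as networks and container names.

```python
import shlex

# Hypothetical service definition mirroring the keys described in this section.
# Every value is an assumption for illustration, not the course's configuration.
triton_service = {
    "image": "nvcr.io/nvidia/tritonserver:<tag>",  # placeholder tag
    "command": "tritonserver --model-repository=/models",
    "shm_size": "1g",
    "ulimits": {"nofile": {"soft": 65535, "hard": 65535}},
    # Triton's default HTTP, gRPC, and metrics ports.
    "ports": ["8000:8000", "8001:8001", "8002:8002"],
    "volumes": ["/srv/models:/models"],  # hypothetical host path
}


def compose_service_to_docker_run(service):
    """Build an approximate `docker run` argument list for one service."""
    args = ["docker", "run"]
    if "shm_size" in service:
        args += ["--shm-size", str(service["shm_size"])]
    for name, limit in service.get("ulimits", {}).items():
        if isinstance(limit, dict):
            # Separate soft and hard limits.
            args += ["--ulimit", f"{name}={limit['soft']}:{limit['hard']}"]
        else:
            # A single value applies to both limits.
            args += ["--ulimit", f"{name}={limit}"]
    for port in service.get("ports", []):
        args += ["-p", port]
    for volume in service.get("volumes", []):
        args += ["-v", volume]
    # The image comes after the options, followed by the command to run in it.
    args.append(service["image"])
    args += shlex.split(service.get("command", ""))
    return args


docker_run_args = compose_service_to_docker_run(triton_service)
print(shlex.join(docker_run_args))
```

Keeping these settings in a Compose file rather than a long command line makes it easier to start several containers together with one consistent configuration.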

## Wrap Up

Docker is useful for production because a container configuration can be developed locally on one machine before being deployed to a cloud service or server farm.

We already launched a Triton Docker container when we launched the container for this lab. Check out the [next notebook](3-03_triton.ipynb) to start interacting with it.

In [3]:
import IPython
# Shut down this notebook's kernel and restart it.
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}
