# Containerization with Docker

![Status](https://img.shields.io/static/v1.svg?label=Status&message=Finished&color=brightgreen)
[![Source](https://img.shields.io/static/v1.svg?label=GitHub&message=Source&color=181717&logo=GitHub)](https://github.com/particle1331/inefficient-networks/blob/master/docs/notebooks/deployment/docker.ipynb)
[![Stars](https://img.shields.io/github/stars/particle1331/inefficient-networks?style=social)](https://github.com/particle1331/inefficient-networks)

---

```text
𝗔𝘁𝘁𝗿𝗶𝗯𝘂𝘁𝗶𝗼𝗻: Builds on Week 5: Deploying Machine Learning Models of the ML Zoomcamp (2021). (github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp/05-deployment)
```


## Introduction

Docker takes virtual environments one step further by isolating the entire application from the rest of the infrastructure of your host machine. Recall that to resolve dependency conflicts, we usually create a separate virtual environment for each service. With Docker, we can instead create a separate **container** for each service.

Containers allow services to run isolated from each other. This is especially useful if an application has side effects that can affect other services or the host. Moreover, containers can be run anywhere (i.e. any x86 server running a modern Linux kernel) with the same standard behavior. As an example, we will show how to deploy the [Prediction Serving API](https://particle1331.github.io/inefficient-networks/notebooks/deployment/model-serving-api.html) in the cloud using AWS Elastic Beanstalk.



**Readings**
* [Creating an AWS Account](https://mlbookcamp.com/article/aws)
* [A Beginner’s Guide to Understanding and Building Docker Images](https://jfrog.com/knowledge-base/a-beginners-guide-to-understanding-and-building-docker-images/#products)
* [AWS Elastic Beanstalk Features](https://aws.amazon.com/elasticbeanstalk/details/)

## Getting started

We start by pulling `python:3.8.12-slim` from Docker Hub. This is a small, optimized Docker image that runs Python 3.8.12. We run it with `-it` to get an interactive terminal attached to the Python interpreter, and with `--rm` so that the container is removed after exiting (the downloaded image itself stays in the local cache).

```bash
docker run -it --rm python:3.8.12-slim
```
```
Unable to find image 'python:3.8.12-slim' locally
3.8.12-slim: Pulling from library/python
279a020076a7: Pull complete
035530c61301: Pull complete
430f5ca6cd82: Pull complete
594f692a6b57: Pull complete
70b1dc4462d0: Pull complete
Digest: sha256:a2d8844be9a3d5df8cd64c11bba476156cbfe5991db643c83e88ae383c15b5d0
Status: Downloaded newer image for python:3.8.12-slim
Python 3.8.12 (default, Mar  1 2022, 21:13:32)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
```

We can instead set the entry point to `bash` to get a shell inside the container:

```bash
docker run -it --rm --entrypoint=bash python:3.8.12-slim
```
```bash
root@fc40952e1b70:/# ls
bin   dev  home  media	opt   root  sbin  sys  usr
boot  etc  lib	 mnt	proc  run   srv   tmp  var
root@fc40952e1b70:/#
```
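
When these commands are launched from a script rather than typed by hand, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of the original project: the helper name `docker_run_command` is hypothetical, and it only covers the flags used so far (`-it`, `--rm`, and `--entrypoint`).

```python
import shlex


def docker_run_command(image, entrypoint=None):
    """Build the argument list for an interactive, self-cleaning `docker run`.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    args = ["docker", "run", "-it", "--rm"]
    if entrypoint is not None:
        # Override the default entry point of the image, e.g. with bash.
        args.append(f"--entrypoint={entrypoint}")
    args.append(image)
    return args


# Print the command as it would be typed in a shell.
print(shlex.join(docker_run_command("python:3.8.12-slim", entrypoint="bash")))
# docker run -it --rm --entrypoint=bash python:3.8.12-slim
```

The resulting list can be passed as is to `subprocess.run` on a machine where Docker is installed.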

## Dockerfile

Docker can build images automatically by reading the instructions from a `Dockerfile`. This contains all the commands a user could call on the command line to assemble an image. We will save our `Dockerfile` in the root of [`model-deployment/api`](https://github.com/particle1331/model-deployment/tree/eb/api). Note that we switch to `python:3.9.12-slim` to match the development environment of our prediction serving API.

```{margin}
[`api/Dockerfile`](https://github.com/particle1331/model-deployment/blob/eb/api/Dockerfile)
```

```Dockerfile
FROM python:3.9.12-slim

WORKDIR /app

COPY ["*.py", "*.txt", "./"]

COPY ["app", "./app/"]

RUN pip install -r requirements.txt

EXPOSE 8000

ENTRYPOINT ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

This sets `/app` as the working directory, which creates it if it does not exist and makes it the current directory for the instructions that follow. The next two statements copy the `.py` and `.txt` files and the `app` directory from the build context, i.e. the `api` directory on the host that contains our `Dockerfile`, into the working directory of the container. Then, dependencies are installed and port `8000` is declared as the port that the container listens on. Finally, the uvicorn server at `0.0.0.0:8000` is set as the container's default entry point.
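
To get a feel for which files the wildcard `COPY` statement picks up, we can emulate the pattern matching in Python. This is only a rough sketch: Docker matches patterns with Go's `filepath.Match` rules, which we approximate here with `fnmatchcase` for top-level file names. The function name and the file listing below are hypothetical and only serve as an illustration.

```python
from fnmatch import fnmatchcase

# Source patterns from the first COPY statement in the Dockerfile.
COPY_PATTERNS = ("*.py", "*.txt")


def matched_by_copy(filenames, patterns=COPY_PATTERNS):
    """Return the top-level file names that match at least one pattern."""
    return sorted(
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )


# Hypothetical listing of the build context (not the actual repository).
build_context = [
    "Dockerfile",
    "requirements.txt",
    "test_requirements.txt",
    "tox.ini",
    "setup.py",
]

print(matched_by_copy(build_context))
# ['requirements.txt', 'setup.py', 'test_requirements.txt']
```

Files that match neither pattern, such as `tox.ini` in this listing, are not copied by that statement.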

## Running containers

To build the image from the `Dockerfile` in the same directory:

```bash
docker build -t service .
```

Here `-t` sets the tag of the image, in this case `service`. Note that `EXPOSE 8000` only declares the port that the container listens on. We still have to publish it by mapping it to port `8000` of the host, which is the one accessed by the browser. We do this below by running the container with `-p 8000:8000` (the format is `host:container`):

```bash
docker run -it --rm -p 8000:8000 service
```
```text
INFO:     Started server process [1]
2022-05-11 16:30:55.668 | INFO     | uvicorn.server:serve:75 - Started server process [1]
INFO:     Waiting for application startup.
2022-05-11 16:30:55.668 | INFO     | uvicorn.lifespan.on:startup:45 - Waiting for application startup.
INFO:     Application startup complete.
2022-05-11 16:30:55.668 | INFO     | uvicorn.lifespan.on:startup:59 - Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2022-05-11 16:30:55.668 | INFO     | uvicorn.server:_log_started_message:206 - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```

```{figure} ../../img/docker-host.png
---
width: 30em
---
Connecting the container ports with localhost ports.
```
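
A quick way to confirm that the port mapping works is to try opening a TCP connection to the published port from the host. The sketch below uses only the standard library. The helper name `port_is_open` is hypothetical, and we assume that the container is running with `-p 8000:8000` on the same machine.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Expected to print True while the container is running with -p 8000:8000.
    print(port_is_open("localhost", 8000))
```

If this prints `False` while the server logs show that uvicorn is running, the `-p` flag is the first thing to check.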

We can again set the entry point to `bash` to see the files inside the container:

```bash
docker run -it --rm --entrypoint=bash service
```
```bash
root@35864429a154:/app# ls
Procfile  mypy.ini	    runtime.txt		   tests
app	  requirements.txt  test_requirements.txt  tox.ini
root@35864429a154:/app#
```

## Deploying to the cloud

In this section, we deploy our application to the cloud using AWS Elastic Beanstalk (EB). With EB, we simply upload our code and EB automatically handles the deployment, from capacity provisioning, load balancing, and automatic scaling to web application health monitoring. Regarding pricing, there is no additional charge for EB: what you pay for are the AWS resources that are created to store and run your web application, like Amazon S3 buckets and Amazon EC2 instances.


This section requires an AWS account and an IAM user in that account. This can be set up fairly quickly by following [this article](https://mlbookcamp.com/article/aws). After this, we have to install the EB CLI:

```bash
pip install awsebcli
```

### Initialization

We first initialize our EB application. We set the platform `-p` to `docker`, the region `-r` to `us-east-1`, and the name of the application to `house-prices`. This requires IAM user credentials, which we enter in the terminal.

```bash
eb init -p docker -r us-east-1 house-prices
```
```text
You have not yet set up your credentials or your credentials are incorrect
You must provide your credentials.
(aws-access-id): *********
(aws-secret-key): ***************************
Application house-prices has been created.
```

Notice this creates a `.elasticbeanstalk` directory containing a `config.yml` file:

```YAML
branch-defaults:
  default:
    environment: null
global:
  application_name: house-prices
  branch: null
  default_ec2_keyname: null
  default_platform: Docker
  default_region: us-east-1
  include_git_submodules: true
  instance_profile: null
  platform_name: null
  platform_version: null
  profile: eb-cli
  repository: null
  sc: null
  workspace_type: Application
```

### Testing locally

It is good practice to first test the container locally with the EB CLI:

```bash
eb local run --port 8000
```
```text
3.9.12-slim: Pulling from library/python
dfdd5ffb2577: Already exists
22d252b4015f: Already exists
38a20a308c16: Already exists
74b110b743da: Already exists
573e544d3cdf: Already exists
Digest: sha256:49082c5b5851e62d5daa510b65fe1120b295ae08a96d7f2cb854f2aa054b5939
Status: Downloaded newer image for python:3.9.12-slim
docker.io/library/python:3.9.12-slim
#1 [internal] load build definition from Dockerfile
#1 sha256:acee64af890dcf198eaccfc920f8dfb2cf1918b27a4851b90cb764e113295108
#1 transferring dockerfile: 37B 0.0s done
#1 DONE 0.1s
#2 [internal] load .dockerignore
#2 sha256:46314d5fc4c9248a818ba2a1b91451a6b08b21e730c78edc9c9952952e7d4421
#2 transferring context: 34B done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/library/python:3.9.12-slim
#3 sha256:c1f4e52590624a81c4bc4ec7e3ed4861d41ab781ab36aaf259f7f7767d36f68c
#3 DONE 0.0s
#4 [1/5] FROM docker.io/library/python:3.9.12-slim
#4 sha256:6776e43cfb11ad7ef6edd8ab8ef9d3899c912fd1fdb680b44fda719e54ccf196
#4 DONE 0.0s
#6 [internal] load build context
#6 sha256:bb9f786c56a5eb86bb5c84216cbe2897c14ca0a3d64efcfd01c555b11e3e65cd
#6 transferring context: 916B 0.0s done
#6 DONE 0.0s
#8 [4/5] COPY [app, ./app/]
#8 sha256:4f55637d9c3689c282b25ba0444f63b0c83ac448d6b9103fef3be1a660fc72a8
#8 CACHED
#7 [3/5] COPY [*.py, *.txt, ./]
#7 sha256:128f400076c6f3660053196b9faf8de680097f2c514447e5729eb65cccdd3916
#7 CACHED
#5 [2/5] WORKDIR /app
#5 sha256:51924f7a2648c15715590a60e237a35d77e1c14c70c8f66a2f7089d2eec93a33
#5 CACHED
#9 [5/5] RUN pip install -r requirements.txt
#9 sha256:13a22462bc4dcec6b6acec1628de97a6a0779be00a56074fb4f62b966ce8373b
#9 CACHED
#10 exporting to image
#10 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#10 exporting layers done
#10 writing image sha256:cf6140074279c0ec497e6d016fb8588b38b6dbd8c96cf3807d65940c76fddabe done
#10 naming to docker.io/library/jg71b5:may6ra done
#10 DONE 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
INFO:     Started server process [1]
2022-05-12 16:56:37.551 | INFO     | uvicorn.server:serve:75 - Started server process [1]
INFO:     Waiting for application startup.
2022-05-12 16:56:37.552 | INFO     | uvicorn.lifespan.on:startup:45 - Waiting for application startup.
INFO:     Application startup complete.
2022-05-12 16:56:37.555 | INFO     | uvicorn.lifespan.on:startup:59 - Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2022-05-12 16:56:37.572 | INFO     | uvicorn.server:_log_started_message:206 - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```

### Creating the environment

Since the app works with EB locally, we can now create an actual EB environment in the cloud that underlies the web application:

```bash
eb create house-prices-env
```
```text
Creating application version archive "app-220513_010945910655".
Uploading: [##################################################] 100% Done...
Environment details for: house-prices-env
  Application name: house-prices
  Region: us-east-1
  Deployed Version: app-220513_010945910655
  Environment ID: e-vcacgx5qpw
  Platform: arn:aws:elasticbeanstalk:us-east-1::platform/Docker running on 64bit Amazon Linux 2/3.4.15
  Tier: WebServer-Standard-1.0
  CNAME: UNKNOWN
  Updated: 2022-05-12 17:10:55.488000+00:00
Printing Status:
2022-05-12 17:10:53    INFO    createEnvironment is starting.
2022-05-12 17:10:55    INFO    Using elasticbeanstalk-us-east-1-886789456404 as Amazon S3 storage bucket for environment data.
2022-05-12 17:11:21    INFO    Created security group named: sg-0079e2cdd92f7bd73
2022-05-12 17:11:36    INFO    Created load balancer named: awseb-e-v-AWSEBLoa-1E6LZRJC4OG9G
2022-05-12 17:11:36    INFO    Created security group named: awseb-e-vcacgx5qpw-stack-AWSEBSecurityGroup-16CWEUUKXFVWK
2022-05-12 17:11:36    INFO    Created Auto Scaling launch configuration named: awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingLaunchConfiguration-2fyUMTUX8xJC
2022-05-12 17:13:11    INFO    Created Auto Scaling group named: awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingGroup-SH3I567DJCWT
2022-05-12 17:13:11    INFO    Waiting for EC2 instances to launch. This may take a few minutes.
2022-05-12 17:13:11    INFO    Created Auto Scaling group policy named: arn:aws:autoscaling:us-east-1:886789456404:scalingPolicy:c67fd119-c153-45b7-abf0-a53e04cdcb2a:autoScalingGroupName/awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingGroup-SH3I567DJCWT:policyName/awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingScaleUpPolicy-140KQ45K015EH
2022-05-12 17:13:11    INFO    Created Auto Scaling group policy named: arn:aws:autoscaling:us-east-1:886789456404:scalingPolicy:9532adfe-20c6-4892-9e65-33cad550506c:autoScalingGroupName/awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingGroup-SH3I567DJCWT:policyName/awseb-e-vcacgx5qpw-stack-AWSEBAutoScalingScaleDownPolicy-1W6BQSN057XIR
2022-05-12 17:13:11    INFO    Created CloudWatch alarm named: awseb-e-vcacgx5qpw-stack-AWSEBCloudwatchAlarmHigh-L6AEUYI9GTPI
2022-05-12 17:13:11    INFO    Created CloudWatch alarm named: awseb-e-vcacgx5qpw-stack-AWSEBCloudwatchAlarmLow-1IPO2R5GWMUVZ
2022-05-12 17:14:59    INFO    Instance deployment completed successfully.
2022-05-12 17:15:04    INFO    Application available at house-prices-env.eba-kxppppph.us-east-1.elasticbeanstalk.com.
2022-05-12 17:15:09    INFO    Successfully launched environment: house-prices-env
```

```{figure} ../../img/ebhome.png
---
---
The prediction serving API can now be accessed using the internet.
```

```{figure} ../../img/eb-ui.png
---
---

Elastic Beanstalk web UI.

```

```{figure} ../../img/eb-ui2.png
---
---

Dashboard for the prediction serving API web app environment showing logs and health. Here the application is being terminated.

```

### Making a request

Let us try making a request for a house price prediction given input data:

```python
import requests
import json

inputs = {
  "inputs": [
    {
      "MSSubClass": 20,
      "MSZoning": "RH",
      "LotFrontage": 80,
      "LotArea": 11622,
      "Street": "Pave",
      "LotShape": "Reg",
      "LandContour": "Lvl",
      "Utilities": "AllPub",
      "LotConfig": "Inside",
      "LandSlope": "Gtl",
      "Neighborhood": "NAmes",
      "Condition1": "Feedr",
      "Condition2": "Norm",
      "BldgType": "1Fam",
      "HouseStyle": "1Story",
      "OverallQual": 5,
      "OverallCond": 6,
      "YearBuilt": 1961,
      "YearRemodAdd": 1961,
      "RoofStyle": "Gable",
      "RoofMatl": "CompShg",
      "Exterior1st": "VinylSd",
      "Exterior2nd": "VinylSd",
      "MasVnrType": "None",
      "MasVnrArea": 0,
      "ExterQual": "TA",
      "ExterCond": "TA",
      "Foundation": "CBlock",
      "BsmtQual": "TA",
      "BsmtCond": "TA",
      "BsmtExposure": "No",
      "BsmtFinType1": "Rec",
      "BsmtFinSF1": 468,
      "BsmtFinType2": "LwQ",
      "BsmtFinSF2": 144,
      "BsmtUnfSF": 270,
      "TotalBsmtSF": 882,
      "Heating": "GasA",
      "HeatingQC": "TA",
      "CentralAir": "Y",
      "Electrical": "SBrkr",
      "FirstFlrSF": 896,
      "SecondFlrSF": 0,
      "LowQualFinSF": 0,
      "GrLivArea": 896,
      "BsmtFullBath": 0,
      "BsmtHalfBath": 0,
      "FullBath": 1,
      "HalfBath": 0,
      "BedroomAbvGr": 2,
      "KitchenAbvGr": 1,
      "KitchenQual": "TA",
      "TotRmsAbvGrd": 5,
      "Functional": "Typ",
      "Fireplaces": 0,
      "GarageType": "Attchd",
      "GarageYrBlt": 1961,
      "GarageFinish": "Unf",
      "GarageCars": 1,
      "GarageArea": 730,
      "GarageQual": "TA",
      "GarageCond": "TA",
      "PavedDrive": "Y",
      "WoodDeckSF": 140,
      "OpenPorchSF": 0,
      "EnclosedPorch": 0,
      "ThreeSsnPortch": 0,
      "ScreenPorch": 120,
      "PoolArea": 0,
      "Fence": "MnPrv",
      "MiscVal": 0,
      "MoSold": 6,
      "YrSold": 2010,
      "SaleType": "WD",
      "SaleCondition": "Normal"
    }
  ]
}


host = 'http://house-prices-env.eba-kxppppph.us-east-1.elasticbeanstalk.com'
url = f'{host}/api/v1/predict'
response = requests.post(url, json=inputs)
result = response.json()

print(json.dumps(result, indent=4))
```
```text
{
    "errors": null,
    "version": "0.1.0",
    "predictions": [
        113422.55344864173
    ]
}
```
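
In client code, we would typically not use the response blindly. The following sketch checks the `errors` field before reading `predictions`. The helper name `extract_predictions` is hypothetical, and the response schema is only inferred from the single output shown above, so treat it as an assumption rather than the API's documented contract.

```python
def extract_predictions(result):
    """Return the list of predictions, or raise if the API reported errors.

    Assumes a response body with "errors" and "predictions" keys, as in
    the example response above.
    """
    errors = result.get("errors")
    if errors is not None:
        raise ValueError(f"API returned errors: {errors}")

    predictions = result.get("predictions")
    if not isinstance(predictions, list):
        raise ValueError("Response has no list of predictions.")

    return [float(p) for p in predictions]


# Example using the response body shown above.
sample = {"errors": None, "version": "0.1.0", "predictions": [113422.55344864173]}
print(extract_predictions(sample))
# [113422.55344864173]
```

When calling a remote service with `requests`, it is also worth passing a `timeout` to `requests.post` and calling `response.raise_for_status()` before parsing the body, so that network problems and HTTP errors surface early.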


### Terminating the environment

Note that this service is accessible to anyone on the internet, which is a security risk. In actual production, the service should only be accessible to the intended users. Hence, we now terminate the environment using the EB CLI:

```bash
eb terminate house-prices-env
```
```text
The environment "house-prices-env" and all associated instances will be terminated.
To confirm, type the environment name: house-prices-env
2022-05-12 21:54:53    INFO    terminateEnvironment is starting.
2022-05-12 21:55:11    INFO    Deleted CloudWatch alarm named: awseb-e-23gau7nwzi-stack-AWSEBCloudwatchAlarmHigh-J443745IGHJ1
2022-05-12 21:55:11    INFO    Deleted CloudWatch alarm named: awseb-e-23gau7nwzi-stack-AWSEBCloudwatchAlarmLow-MB18SO8WWYAQ
2022-05-12 21:55:11    INFO    Deleted Auto Scaling group policy named: arn:aws:autoscaling:us-east-1:886789456404:scalingPolicy:d7505a15-779c-443a-9177-e7f7fb014bb3:autoScalingGroupName/awseb-e-23gau7nwzi-stack-AWSEBAutoScalingGroup-5NOVWKOF9ZLC:policyName/awseb-e-23gau7nwzi-stack-AWSEBAutoScalingScaleUpPolicy-FUW2YBNF7V2M
2022-05-12 21:55:11    INFO    Deleted Auto Scaling group policy named: arn:aws:autoscaling:us-east-1:886789456404:scalingPolicy:35fc0640-05dd-4814-893c-f32bb0615313:autoScalingGroupName/awseb-e-23gau7nwzi-stack-AWSEBAutoScalingGroup-5NOVWKOF9ZLC:policyName/awseb-e-23gau7nwzi-stack-AWSEBAutoScalingScaleDownPolicy-15K1L1CIJDG7I
2022-05-12 21:55:11    INFO    Waiting for EC2 instances to terminate. This may take a few minutes.
2022-05-12 21:57:44    INFO    Deleted Auto Scaling group named: awseb-e-23gau7nwzi-stack-AWSEBAutoScalingGroup-5NOVWKOF9ZLC
2022-05-12 21:57:59    INFO    Deleted load balancer named: awseb-e-2-AWSEBLoa-1LYFCW64ZS7SZ
2022-05-12 21:58:00    INFO    Deleted Auto Scaling launch configuration named: awseb-e-23gau7nwzi-stack-AWSEBAutoScalingLaunchConfiguration-a5VmF3A2Bp0Y
2022-05-12 21:58:00    INFO    Deleted security group named: awseb-e-23gau7nwzi-stack-AWSEBSecurityGroup-DN3MPY78253L
2022-05-12 21:58:45    INFO    Deleted security group named: sg-0b2e76e66001617ea
2022-05-12 21:58:47    INFO    Deleting SNS topic for environment house-prices-env.
2022-05-12 21:58:52    INFO    terminateEnvironment completed successfully.
```

**Remark.** The astute reader will notice that this is a different environment from above. Indeed, we recreated `house-prices-env` for the sake of demonstration.

### Load balancing

EB performs automatic load balancing under the hood. It adds more instances of the service when there are lots of requests during peak hours, and automatically removes instances when the requests die down. Moreover, the load balancer listens for incoming HTTP traffic and distributes it across the instances on the same port, so that requests are handled efficiently. The same process can also restart the application if it crashes for any reason.

As a consequence of this automatic nature, applications should be designed to be **stateless**, as containers can be terminated and replaced at any time without your knowledge or involvement. For example, an application should not keep any data in memory that will be needed for future requests (recall that RESTful APIs adhere to this requirement).

```{figure} ../../img/load-balancing.png
---
width: 35em
---

EB automatically load balances by scaling up or scaling down the number of containers depending on the rate of requests. The load is distributed to the multiple instances of the application.

```
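
To see why statelessness matters, consider the toy simulation below. It is only a sketch: we assume a simple round-robin policy to distribute requests, which is not necessarily what the EB load balancer does, and all class and function names are hypothetical. The field names in the stateless handler are taken from the request payload used earlier.

```python
from itertools import cycle


class StatefulInstance:
    """Keeps a request counter in memory, which breaks behind a load balancer."""

    def __init__(self, name):
        self.name = name
        self.count = 0

    def handle(self, request):
        self.count += 1
        return {"instance": self.name, "requests_seen": self.count}


def stateless_handle(request):
    """The response depends only on the request, so any instance can serve it."""
    return {"total_area": request["FirstFlrSF"] + request["SecondFlrSF"]}


# Two instances behind a round-robin balancer (assumed policy).
instances = [StatefulInstance("a"), StatefulInstance("b")]
balancer = cycle(instances)

responses = [next(balancer).handle({}) for _ in range(4)]
print([r["requests_seen"] for r in responses])
# [1, 1, 2, 2]
```

Although four requests were sent, no instance reports having seen more than two, and the counts would be lost entirely if an instance were replaced. The stateless handler has no such problem since it returns the same result regardless of which instance serves the request.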