# [CPSC 322](https://github.com/GonzagaCPSC322) Data Science Algorithms
[Gonzaga University](https://www.gonzaga.edu/)

[Gina Sprint](http://cs.gonzaga.edu/faculty/sprint/)

# Model Deployment
What are our learning objectives for this lesson?
* Pickle and unpickle Python objects
* Create our own Docker container image for a Flask app
* Deploy the Docker container image to Heroku

Content used in this lesson is based upon information in the following sources:
* [Heroku Deploying with Docker](https://devcenter.heroku.com/categories/deploying-with-docker)

## Warm-up Task(s)
1. I'm teaching a sample class this Saturday for GEL weekend. Could you do me a favor and please fill out a short Google Form to give me some data to use in the class? https://forms.gle/iMx4yQA5Czep4Rxb6
1. Test that your deployment from Tuesday's class worked by going to https://dashboard.heroku.com/apps, clicking on your app, and clicking "Open App"
    1. It may take a few seconds to spin up before you see "Welcome to my App!!"
1. Test your `/predict` endpoint from `interview_predictor.py` by updating the `url` to your app
1. To recap, here were the commands we executed to deploy:
    1. `git init`, `git add -A`, `git commit -m "initial"`
    1. `heroku login -i`
    1. `heroku create interview-app-<your name here>`
    1. `heroku stack:set container`
    1. `git push heroku master`

## Today
* Announcements
    * RQ9 is due on Monday
    * PA6 is due on Wednesday. Questions?
    * I'll read project proposals Friday likely. Questions?
    * I'm posting ~7 YouTube videos on APIs, Flask, Pickling, and Heroku/Docker deployment to my [Data Science YouTube playlist](https://youtube.com/playlist?list=PL7uPCUbavAWf66__MqJfMvFgDqN7ghLSo) 😎
* Pickling and making a prediction using a tree
* Start ensemble learning
* IQ8 ~15 mins of class

## Introduction to Model Deployment
Once we have trained, tested, and evaluated a classifier, we often want to "deploy" the classifier's model (e.g. a decision tree) so it can be used to make predictions for real unseen instances. For example, suppose we have a mobile app that is supposed to make predictions about what new hobbies a user might enjoy. The mobile app collects some data about the user, perhaps via a quiz, then needs to get a prediction from an already trained model. There are two main options for how to set this up:
1. Package the trained model with the app executable. One major downside of this is the model is stored as a static file and is not always up-to-date.
1. Host the model on the web and have the app query the model via the internet. This requires hosting the model via a web app, but it allows the model to always be up-to-date (that is, if the programmers set this up).

We are going to see how we can host a trained model via a Flask web app so clients, like a mobile app, can use the model to make predictions for unseen data! This will roughly correspond to three major steps/new topics:
1. Exporting a model: how to "save" a trained model and use it for predictions "later"
1. Creating Docker container images: while not required to deploy a Flask web app, creating a Docker image for our Flask web app will make the app more portable, allowing it to be easily hosted on several cloud platforms that support Docker container deployment, such as [Heroku](https://devcenter.heroku.com/categories/deploying-with-docker), [Amazon Web Services Elastic Container Service](https://docs.aws.amazon.com/AmazonECS/latest/userguide/docker-basics.html), and [Microsoft Azure Container Instances](https://docs.microsoft.com/en-us/azure/devops/pipelines/apps/cd/deploy-docker-webapp?view=azure-devops&tabs=java).
1. Deploying a Docker container image to the cloud: as mentioned in the previous step, we have several options for where to deploy our web app. We will use Heroku in this demo!
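
From the client's side, querying a hosted model amounts to sending an HTTP request and reading the response. Here is a minimal sketch using only the standard library. It assumes a hypothetical `/predict` endpoint that takes the instance's attribute values as query parameters and responds with JSON; the app URL, the parameter names, and the response format are all placeholders, so adjust them to match your own app.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_predict_url(base_url, instance):
    """Build the /predict URL for an unseen instance (dict of attribute -> value)."""
    query = urlencode(instance)
    return f"{base_url.rstrip('/')}/predict?{query}"


def fetch_prediction(base_url, instance, timeout=30):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urlopen(build_predict_url(base_url, instance), timeout=timeout) as response:
        return json.load(response)


# Hypothetical app URL and attribute names, for illustration only
unseen = {"level": "Junior", "lang": "Java", "tweets": "yes", "phd": "no"}
url = build_predict_url("https://interview-app-example.herokuapp.com/", unseen)
print(url)
```

Calling `fetch_prediction()` with your own app's URL performs the actual request. The generous timeout is deliberate, since a hosted app may need a few seconds to spin up before it answers.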

## Exporting a Model
Once we train our classifier, we need a way to save the state of the classifier so it can be used later for predictions (likely in a different process/application). A straightforward way to do this is to save the classifier's Python object to a file after it has been trained. Generally, this is called object serialization. You can move this file to another location, like a production server. Later, you can create a Python object at runtime from this file. This is called object de-serialization. In Python, serialization and de-serialization can be done with the [`pickle` standard library](https://docs.python.org/3/library/pickle.html).

Example: Let's say we just trained a `MyDecisionTreeClassifier` via its `fit()` method and we have a reference to this classifier via `clf` (or if you are writing a method of `MyDecisionTreeClassifier`, then `self`). We can now "pickle" (AKA serialize) `clf` with:

```python
import pickle

# serialize to file (pickle); clf is the trained classifier
# the with statement closes the file even if dump() raises
with open("tree.p", "wb") as outfile:
    pickle.dump(clf, outfile)
```
Note the file mode is "wb" because pickling writes a binary file, which is not going to be human readable if you try to open it with a text editor.

Later, when we want to "unpickle" (AKA de-serialize), we can open this "tree.p" with the "rb" file mode:

```python
import pickle

# deserialize to object (unpickle)
# the with statement closes the file even if load() raises
with open("tree.p", "rb") as infile:
    clf = pickle.load(infile)
```

To summarize, we will pickle a classifier to a file, then copy that file into our production environment. 
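
To make the full round trip concrete, here is a minimal sketch that pickles a model to a file, unpickles it, and makes a prediction with the restored copy. As a stand-in for a trained classifier it uses a hypothetical nested-list decision tree with a list of attribute names; this representation, the attribute values, and the helper names (`save_model`, `load_model`, `tree_predict`) are assumptions made for illustration, not a required design.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: attribute names plus a
# nested-list decision tree (illustrative values only)
header = ["level", "phd"]
tree = ["Attribute", "level",
        ["Value", "Junior", ["Leaf", "True"]],
        ["Value", "Senior", ["Leaf", "False"]]]


def save_model(obj, path):
    """Pickle obj to a binary file at path."""
    with open(path, "wb") as outfile:
        pickle.dump(obj, outfile)


def load_model(path):
    """Unpickle and return the object stored at path."""
    with open(path, "rb") as infile:
        return pickle.load(infile)


def tree_predict(header, tree, instance):
    """Walk the nested-list tree until a leaf is reached."""
    if tree[0] == "Leaf":
        return tree[1]
    att_index = header.index(tree[1])
    for value_branch in tree[2:]:
        if value_branch[1] == instance[att_index]:
            return tree_predict(header, value_branch[2], instance)
    return None  # attribute value was never seen in this subtree


# "Training" side: save the model; "production" side: load and predict
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tree.p")
    save_model((header, tree), path)
    restored_header, restored_tree = load_model(path)

prediction = tree_predict(restored_header, restored_tree, ["Junior", "no"])
print(prediction)
```

One caveat when pickling an instance of your own class, such as `MyDecisionTreeClassifier`: `pickle` stores a reference to the class rather than its code, so the module that defines the class must also be importable wherever the file is unpickled. Also, only unpickle files you trust, because loading a pickle can execute arbitrary code.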

## Creating Docker Container Images
Docker containers are instances of Docker images. We have been using the [continuumio/anaconda3](https://hub.docker.com/r/continuumio/anaconda3) Docker image and a container based on this image as our Python development environment in this class. We are now going to learn how to make our own Docker images, and then deploy a container based on this image to the web. Let's get started!

Build specifications for an image are listed in a special file called `Dockerfile`. From the [Docker Docs](https://docs.docker.com/engine/reference/builder/):
>Docker can build images automatically by reading the instructions from a Dockerfile. A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build users can create an automated build that executes several command-line instructions in succession.

Here is an example `Dockerfile` for building a Flask image:

```docker
FROM continuumio/anaconda3:2020.11

ADD . /code
WORKDIR /code
ENTRYPOINT ["python", "app.py"]
```

Here is what each part of this `Dockerfile` specifies about the to-be-built image:
* `FROM <image>[:<tag>]`: A Dockerfile must begin with a FROM instruction. The FROM instruction specifies the Parent Image from which you are building.
    * For our Flask app, our base image will be the same [continuumio/anaconda3](https://hub.docker.com/r/continuumio/anaconda3) Docker image we have been using in this class.
* `ADD <src> <dest>`: The ADD instruction copies new files, directories or remote file URLs from `<src>` and adds them to the filesystem of the image at the path `<dest>`.
    * For our Flask app, we will copy the contents of the build context (the current working directory when we run `docker build`, which should be our project directory containing `app.py`) into a new `/code` directory of the image. This copy happens when the image is built, not when a container is instantiated.
* `WORKDIR /path/to/workdir`: The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.
    * For our Flask app, we want the current working directory of a container to be `/code` so our next command (`ENTRYPOINT`) can use a simple relative path to refer to `app.py`.
* `ENTRYPOINT ["executable", "param1", "param2"]`: An ENTRYPOINT allows you to configure a container that will run as an executable. Note: You can use the exec form of ENTRYPOINT to set fairly stable default commands and arguments and then use either form of CMD to set additional defaults that are more likely to be changed.
    * For our Flask app, our container is an executable that runs the Flask app. We will invoke the Flask app the same way we did when we were debugging it directly.
    
To create the image, `cd` into the project directory with the `Dockerfile` (typically this is your project's top-level directory) and run `docker build -t flaskapp/interview:latest .` On success, your output should look like this:

```bash
sprint@cps-25626 APIServiceFun % docker build -t flaskapp/interview:latest .
[+] Building 0.2s (8/8) FINISHED                                                                           
 => [internal] load build definition from Dockerfile                                                  0.0s
 => => transferring dockerfile: 219B                                                                  0.0s
 => [internal] load .dockerignore                                                                     0.0s
 => => transferring context: 2B                                                                       0.0s
 => [internal] load metadata for docker.io/continuumio/anaconda3:2020.11                              0.0s
 => [internal] load build context                                                                     0.0s
 => => transferring context: 703B                                                                     0.0s
 => [1/3] FROM docker.io/continuumio/anaconda3:2020.11                                                0.0s
 => CACHED [2/3] ADD . /code                                                                          0.0s
 => CACHED [3/3] WORKDIR /code                                                                        0.0s
 => exporting to image                                                                                0.0s
 => => exporting layers                                                                               0.0s
 => => writing image sha256:519113531d754a411acdd4b18411dd5d89fd61257c3f9df46aa9890f862feac5          0.0s
 => => naming to docker.io/flaskapp/interview:latest   
```

To confirm the image was created, run `docker images`. You can also open Docker Desktop and click on Images. Now, you can create a container from this image with the `docker run` command just like we did with the [continuumio/anaconda3](https://hub.docker.com/r/continuumio/anaconda3) Docker image!! How cool!!

## Deploying a Docker Container Image to Heroku
[Heroku](https://www.heroku.com) is a platform as a service (PaaS) that enables developers to build, run, and operate applications entirely in the cloud. We will use Heroku to host our Flask app on the web so anyone can use it (plus, [Heroku has a partnership with the Github Student Developer Pack](https://www.heroku.com/github-students) that is worth checking out)! If you are interested in using a different cloud platform, here are the Docker docs for deployment to [Azure Container Instances](https://docs.docker.com/cloud/aci-integration/) and [AWS Elastic Container Service](https://docs.docker.com/cloud/ecs-integration/).

Heroku runs apps in dynos, which are lightweight Linux containers. There are different tiers of dynos. We will be running a "free web dyno" that has some restrictions. From the [Heroku free dyno hours docs](https://devcenter.heroku.com/articles/free-dyno-hours):
>If an app has a free web dyno, and that dyno receives no web traffic in a 30-minute period, it will sleep. In addition to the web dyno sleeping, the worker dyno (if present) will also sleep.  
>Free web dynos do not consume free dyno hours while sleeping.  
>If a sleeping web dyno receives web traffic, it will become active again after a short delay (assuming your account has free dyno hours available).

This means that your app will be slow to respond to an initial request if it is sleeping. To avoid this issue, I encourage you to sign up for the free [Github Student Developer Pack](https://education.github.com/pack) because [Heroku has a partnership with the Github Student Developer Pack](https://www.heroku.com/github-students) where you can get a free "hobby dyno" that won't sleep.

Here are some steps to complete before deploying to Heroku:
1. Make sure your Flask `run()` command has the following arguments (read more about this [here](https://medium.com/@ksashok/containerise-your-python-flask-using-docker-and-deploy-it-onto-heroku-a0b48d025e43)):
```python
import os

# Heroku provides the port via the PORT environment variable; fall back to 5000 locally
port = int(os.environ.get("PORT", 5000))
app.run(host='0.0.0.0', port=port, debug=False)
```
1. Create a Heroku account
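
The `PORT` lookup in step 1 matters because Heroku tells a web dyno which port to bind to through the `PORT` environment variable, while on your own machine that variable is usually not set. Here is a small sketch of that logic in isolation, so it can be tried without starting Flask. The helper name `resolve_port` is hypothetical.

```python
import os


def resolve_port(environ=os.environ, default=5000):
    """Return the port the web server should bind to.

    Uses the PORT environment variable when it is set (as on Heroku)
    and falls back to a local development default otherwise.
    """
    return int(environ.get("PORT", default))


# Simulate the two situations with plain dictionaries
print(resolve_port({"PORT": "8080"}))  # a platform-assigned port
print(resolve_port({}))                # no PORT set, e.g. running locally
```

Environment variable values are always strings, which is why the result is passed through `int()` before being handed to `app.run()`.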

There are two main approaches to deploying a Flask app to Heroku:
1. [Deploy your app as a Python app](https://devcenter.heroku.com/categories/python-support). To do this, define a [`Procfile`](https://devcenter.heroku.com/articles/procfile) and a `requirements.txt` for your Python dependencies (this will cause the [Python buildpack](https://elements.heroku.com/buildpacks/heroku/heroku-buildpack-python) to automatically be detected and used by Heroku). Enable [Github Integration](https://devcenter.heroku.com/articles/github-integration) for your Heroku app, which connects your Heroku app to your app's Github repository.
1. [Deploy your app as a Docker container](https://devcenter.heroku.com/categories/deploying-with-docker). Heroku's ["Deploying with Docker" documentation](https://devcenter.heroku.com/categories/deploying-with-docker) provides two ways for you to deploy your app with Docker. There is also (at least) one other way to deploy a Flask app with Docker that I'll mention here:
    1. (Heroku CLI w/pre-built Docker images) Container Registry allows you to deploy pre-built Docker images to Heroku
        * Instructions: https://devcenter.heroku.com/articles/container-registry-and-runtime
        * This approach requires you to install the Heroku CLI on your host machine
    1. (Heroku CLI w/Heroku building the Docker images) Build your Docker images with heroku.yml for deployment to Heroku
        * Instructions: https://devcenter.heroku.com/articles/build-docker-images-heroku-yml
        * This approach does not require you to install the Heroku CLI on your host machine. Instead, if you wish, you can install the Heroku CLI in our anaconda3_cpsc322 container.
    1. (Unofficial w/Github Action building the Docker images) Use a Github Action, such as [Heroku Deploy](https://github.com/marketplace/actions/deploy-to-heroku)

Since we are learning Docker, here are steps for the [Deploy your app as a Docker container](https://devcenter.heroku.com/categories/deploying-with-docker) approaches A, B, and C. First, there are some shared initial steps for Docker approaches A and B:
1. Install the Heroku CLI. 
    1. For approach A, install it on your host machine. 
        * On Mac: `brew tap heroku/brew && brew install heroku`
        * On Windows: Download the appropriate Windows installer [here](https://devcenter.heroku.com/articles/heroku-cli)
    1. For approach B, install it in the anaconda3_cpsc322 Docker container:
        1. Install `gnupg`: `apt install gnupg`
        1. Install the Heroku CLI: `curl https://cli-assets.heroku.com/install-ubuntu.sh | sh`
1. Log in to Heroku via the CLI: `heroku login -i` (then enter your Heroku email and password)

### Approach A: Deploy Pre-built Docker Images
1. Go to https://dashboard.heroku.com/apps and create a new Heroku app, then choose "Container Registry Use Heroku CLI"
1. Back at the command line, log in to Container Registry: `heroku container:login`
1. Then build an image and push it to Heroku's Container Registry: `heroku container:push web`
1. Release the newly pushed images to deploy your app: `heroku container:release web`

### Approach B: Build your Docker Images on Heroku
1. If you have not already done so, make your top-level project directory into a local git repository (e.g. `git init`, `git add -A`, `git commit -m "initial"`)
1. In your project's top-level directory, create a Heroku app from your project: `heroku create <app name>`
    1. This should add a new remote to your local Git repo for Heroku (confirm with `git remote -v`)
1. Create a `heroku.yml` file with the following contents:

```yaml
build:
    docker:
        web: Dockerfile
```
1. This specifies how to build the docker image. Commit this file to your local repo.
1. Set the stack of your app to container: `heroku stack:set container`
1. Push your app to Heroku: `git push heroku master`

### Approach C: Set up Github Actions to Deploy Docker Images to Heroku
For this approach, we will set up our Github repo so that when we push to it, a Github Action automatically builds our Docker image and then pushes the image to Heroku for deployment. Following the instructions for the [Deploy to Heroku](https://github.com/marketplace/actions/deploy-to-heroku) Github Action:
1. Via the Heroku web interface, create a new app.
1. In your local Git repo, create a directory called `.github` and in that directory create a directory called `workflows`. In `.github/workflows`, create a file called `main.yml`. Put the following in this `main.yml` file:

```yaml
name: Deploy
on:
  push:
    branches:
      - master
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: akhileshns/heroku-deploy@v3.12.12 # This is the action
        with:
          heroku_api_key: ${{secrets.HEROKU_API_KEY}}
          heroku_app_name: "YOUR APP's NAME" #Must be unique in Heroku
          heroku_email: "YOUR EMAIL"
          usedocker: true
```
1. Replace the `heroku_app_name` and `heroku_email` with values for your app name and for your Heroku account email.
1. Go to your Heroku Account Settings page. Scroll to the bottom until you see API Key. Copy this key to your clipboard.
1. In your Github Repo, go to Settings -> Secrets and click on "New Secret". Then enter HEROKU_API_KEY as the name and paste the copied API Key as the value.
1. You can now push your project to GitHub and it will be automatically deployed to Heroku henceforth.

Note: with this approach, you can add a "build" job that runs your unit tests on push as part of test-driven development. You can specify that your push should only be deployed to Heroku when the "build" job succeeds by adding the following line to the "deploy" job: `needs: build`. Here is a full `main.yml` file showing this in action:

```yaml
name: Dockerized Test and Deploy Workflow

on:
  push:
    branches:
      - master

jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    container: continuumio/anaconda3:2020.11
    steps:
      - name: Clone repo
        uses: actions/checkout@v2
      - name: Test code in Docker container
        run: |
          pip install tabulate
          pytest --verbose test_interview_tree.py
  deploy:
    needs: build
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - name: Clone repo
        uses: actions/checkout@v2
      - name: Deploy to Heroku
        uses: akhileshns/heroku-deploy@v3.12.12 # This is the action
        with:
          heroku_api_key: ${{secrets.HEROKU_API_KEY}}
          heroku_app_name: "interview-flask-app-action" 
          heroku_email: "sprint@gonzaga.edu"
          usedocker: true
```