## MLZoomcamp: Homework 5

In [9]:
import pickle
import requests

# Multiple cell outputs
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

### Question 1
* Install Pipenv
* What's the version of pipenv you installed?
* Use --version to find out

In [10]:
!pipenv --version

pipenv, version 2021.5.29

In [11]:
!pipenv install scikit-learn==1.0 flask

Installing scikit-learn==1.0...
Adding scikit-learn to Pipfile's [packages]...
✔ Installation Succeeded
Installing flask...
Adding flask to Pipfile's [packages]...
✔ Installation Succeeded
Installing dependencies from Pipfile.lock (52faf6)...

To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.

### Question 2
* Use Pipenv to install Scikit-Learn version 1.0
* What's the first hash for scikit-learn you get in Pipfile.lock?
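
The hash can also be read programmatically instead of opening `Pipfile.lock` by hand. A minimal sketch, assuming the lock file follows Pipenv's usual JSON layout (a `default` section keyed by package name, each entry carrying a `hashes` list); the helper names `load_lock` and `first_hash` are made up for illustration.

```python
import json
from pathlib import Path


def load_lock(path="Pipfile.lock"):
    """Parse Pipfile.lock, which is plain JSON."""
    return json.loads(Path(path).read_text())


def first_hash(lock, package, section="default"):
    """Return the first hash listed for `package` in a parsed lock file."""
    return lock[section][package]["hashes"][0]


# Usage, run from the project directory:
# first_hash(load_lock(), "scikit-learn")
```

The order of the hashes is whatever Pipenv wrote to the file, so this returns the same entry you would see first when reading the file top to bottom.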

"sha256:121f78d6564000dc5e968394f45aac87981fcaaf2be40cfcd8f07b2baa1e1829",

### Models

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

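from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# df is the churn DataFrame and y the target vector (neither is defined here)
features = ['tenure', 'monthlycharges', 'contract']
dicts = df[features].to_dict(orient='records')

dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)

model = LogisticRegression().fit(X, y)

Note: You don't need to train the model. This code is just for your reference.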

And then saved with Pickle. Download them:
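
The saving step itself is not shown. A minimal sketch of what it might look like, assuming plain `pickle.dump` into the file names used later in this notebook (`dv.bin`, `model1.bin`); `save_artifact` and `load_artifact` are illustrative names, not part of the course code.

```python
import pickle


def save_artifact(obj, path):
    """Serialize any picklable object (e.g. a fitted vectorizer or model) to disk."""
    with open(path, "wb") as f_out:
        pickle.dump(obj, f_out)


def load_artifact(path):
    """Load an object previously written by save_artifact."""
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


# Hypothetical usage with the objects from the training snippet:
# save_artifact(dv, "dv.bin")
# save_artifact(model, "model1.bin")
```

Unpickling runs code stored in the file, so only load pickles from a source you trust. Loading scikit-learn objects also generally works best with the same library version that saved them, which is likely why the homework pins scikit-learn to 1.0.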

### Question 3

Let's use these models!

* Write a script for loading these models with pickle
* Score this customer:

In [12]:
dv_file = 'dv.bin'
model_file = 'model1.bin'

with open(dv_file, 'rb') as f_in:
    dv = pickle.load(f_in)

with open(model_file, 'rb') as f_in:
    model = pickle.load(f_in)

dv, model



(DictVectorizer(sparse=False), LogisticRegression())

In [13]:
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 19.7}

In [14]:
X = dv.transform(customer)
model.predict_proba(X)[0, 1]

0.11549580587832914

The churn probability for this customer is therefore 0.11549580587832914.

### Question 4

Now let's serve this model as a web service.

* Install Flask and Gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this customer using requests:
    * url = "YOUR_URL"
    * customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
    * requests.post(url, json=customer).json()
* What's the probability that this customer is churning?

In [15]:
# !pip install gunicorn
# !pip install flask

In [16]:
# !python churn_service.py

In [17]:
url = 'http://localhost:9696/predict'

customer2 = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
requests.post(url, json=customer2).json()

{'churn': True, 'churn_probability': 0.9988892771007961}

In [18]:
# Checking with previous customer
requests.post(url, json=customer).json()
# And vice versa
X = dv.transform(customer2)
model.predict_proba(X)[0, 1]

{'churn': False, 'churn_probability': 0.11549580587832914}

0.9988892771007961
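
`churn_service.py` itself is not included in this notebook. A minimal sketch of what such a service could look like, assuming the `/predict` endpoint and port 9696 from the URL above, the response keys seen in the output, and a 0.5 decision threshold (an assumption; the actual cutoff is not shown). `score_customer` and `create_app` are illustrative names.

```python
def score_customer(dv, model, customer, threshold=0.5):
    """Return the churn probability and a yes/no decision for one customer dict."""
    X = dv.transform([customer])
    # Column 1 of predict_proba holds the probability of the positive (churn) class.
    probability = float(model.predict_proba(X)[0][1])
    return {"churn": probability >= threshold, "churn_probability": probability}


def create_app(dv, model):
    """Build a Flask app exposing POST /predict."""
    # Imported here so score_customer stays usable without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask("churn")

    @app.route("/predict", methods=["POST"])
    def predict():
        customer = request.get_json()
        return jsonify(score_customer(dv, model, customer))

    return app


# Hypothetical entry point for local debugging:
# app = create_app(dv, model)
# app.run(debug=True, host="0.0.0.0", port=9696)
```

Keeping the scoring logic in a plain function makes it easy to compare the service's answer with a direct `predict_proba` call, as the cell above does. With a module-level `app` object in `churn_service.py`, gunicorn could then be started with something like `gunicorn --bind 0.0.0.0:9696 churn_service:app`.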

### Docker

Install Docker. We will use it for the next two questions.

For these questions, I prepared a base image: agrigorev/zoomcamp-model:3.8.12-slim. You'll need to use it (see Question 5 for an example).

This image is based on python:3.8.12-slim and contains a logistic regression model (a different one) as well as a dictionary vectorizer.

This is what the Dockerfile for this image looks like:

FROM python:3.8.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]

I already built it and pushed it to agrigorev/zoomcamp-model:3.8.12-slim.

Note: You don't need to build this Docker image; it's just for your reference.

### Question 5

Now create your own Dockerfile based on the image I prepared.

It should start like this:

FROM agrigorev/zoomcamp-model:3.8.12-slim
\# add your stuff here

Now complete it:

* Install all the dependencies from the Pipenv file
* Copy your Flask script
* Run it with gunicorn

When you build your image, what's the digest for agrigorev/zoomcamp-model:3.8.12-slim?

Look at the first step of your build log. It should look something like this:

Step 1/3 : FROM python:3.8.12-slim
 ---> 2e56f6b0af69

In [19]:
!pwd

/Users/rmcmaster/Documents/git_repos/pub_portfolio/mlz/05-deployment


In [20]:
!docker build -t zoomcamp-model .

[+] Building 0.3s (2/3)
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 801B                                       0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B

In [21]:
!docker images --no-trunc

REPOSITORY       TAG       IMAGE ID                                                                  CREATED          SIZE
zoomcamp-model   latest    sha256:0f4e1d5d8419ba541a961372a4ec9aff18071b6d2044485da3d597da2d48d633   21 minutes ago   510MB
<none>           <none>    sha256:48b4bfc1593a8bb070b35847b48eaf49ba88cc2d2a10ed7a6b9f3420ceca6248   25 hours ago     510MB


In [22]:
!docker images --digests

REPOSITORY       TAG       DIGEST    IMAGE ID       CREATED          SIZE
zoomcamp-model   latest    <none>    0f4e1d5d8419   21 minutes ago   510MB
<none>           <none>    <none>    48b4bfc1593a   25 hours ago     510MB


Image ID from Terminal: sha256:1ee036b365452f8a1da0dbc3bf5e7dd0557cfd33f0e56b28054d1dbb9c852023

Built Image ID (shown above): sha256:48b4bfc1593a8bb070b35847b48eaf49ba88cc2d2a10ed7a6b9f3420ceca6248
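
Copying IDs out of the table by hand is error-prone. A small sketch that parses the text printed by `docker images --no-trunc`, assuming the default column layout seen above (repository, tag and image ID come first); `parse_image_ids` is a hypothetical helper name.

```python
def parse_image_ids(table):
    """Return (repository, tag, image_id) tuples from `docker images --no-trunc` text."""
    rows = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) >= 3:
            # The first three whitespace-separated fields are repository, tag, ID.
            rows.append((parts[0], parts[1], parts[2]))
    return rows


# Hypothetical usage, capturing the CLI output directly:
# import subprocess
# table = subprocess.run(["docker", "images", "--no-trunc"],
#                        capture_output=True, text=True, check=True).stdout
# parse_image_ids(table)
```

Note that an image ID is not the same thing as a registry digest: the `DIGEST` column shows `<none>` for the locally built images above, since a digest is only recorded for images pushed to or pulled from a registry.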

### Question 6

In [23]:
!docker run -it -p 9696:9696 --rm zoomcamp-model
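# To stop the service, interrupt this cell, or run docker stop with the container's ID or name from another terminal.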

[2021-10-11 12:15:48 +0000] [1] [INFO] Starting gunicorn 20.1.0
[2021-10-11 12:15:48 +0000] [1] [INFO] Listening at: http://0.0.0.0:9696 (1)
[2021-10-11 12:15:48 +0000] [1] [INFO] Using worker: sync
[2021-10-11 12:15:48 +0000] [8] [INFO] Booting worker with pid: 8
[2021-10-11 12:25:04 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:8)
[2021-10-11 12:25:04 +0000] [8] [INFO] Worker exiting (pid: 8)
[2021-10-11 12:25:05 +0000] [11] [INFO] Booting worker with pid: 11
[2021-10-11 15:53:20 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:11)
[2021-10-11 15:53:20 +0000] [11] [INFO] Worker exiting (pid: 11)
[2021-10-11 15:53:20 +0000] [14] [INFO] Booting worker with pid: 14
[2021-10-11 16:25:43 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:14)
[2021-10-11 16:25:43 +0000] [14] [INFO] Worker exiting (pid: 14)
[2021-10-11 16:25:45 +0000] [17] [INFO] Booting worker with pid: 17
[2021-10-11 16:35:32 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:17)
[2021-10-11 16:35:32 +0000] [17] [INFO] Worker exiting (pid: 17)
[2021-

In [None]:
url = 'http://localhost:9696/predict'

customer3 = {"contract": "two_year", "tenure": 12, "monthlycharges": 10}

In [None]:
requests.post(url, json=customer3).json()

{'churn': False, 'churn_probability': 0.32940789808151005}

The churn probability for this customer is therefore 0.32940789808151005.