# 05-deployment

https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/cohorts/2022/05-deployment/homework.md

In [1]:
import pickle
import os
import requests

In [2]:
import warnings
warnings.filterwarnings('ignore')

# Question 1

* Install Pipenv
* What's the version of pipenv you installed?
* Use --version to find out

In [3]:
!pipenv --version

pipenv, version 2022.9.24



Answer: pipenv 2022.9.24

# Question 2

In [4]:
# Use Pipenv to install Scikit-Learn version 1.0.2
# pipenv install scikit-learn==1.0.2

What's the first hash for scikit-learn you get in Pipfile.lock?

Answer: sha256:08ef968f6b72033c16c479c966bf37ccd49b06ea91b765e1cc27afefe723920b
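The hash can also be read programmatically, since `Pipfile.lock` is a JSON file. The sketch below assumes the usual layout, where runtime packages sit under the `default` section and each has a `hashes` list; the helper name `first_hash` is hypothetical.

```python
import json


def first_hash(lock_path, package, section="default"):
    """Return the first hash listed for `package` in a Pipfile.lock file.

    Assumes the package is listed under `section` with a "hashes" list.
    """
    with open(lock_path) as f_in:
        lock = json.load(f_in)
    return lock[section][package]["hashes"][0]
```

Calling `first_hash("Pipfile.lock", "scikit-learn")` in the project directory should return the same string as the first entry you see when opening the file by hand.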

## Models

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

    features = ['reports', 'share', 'expenditure', 'owner']
    dicts = df[features].to_dict(orient='records')

    dv = DictVectorizer(sparse=False)
    X = dv.fit_transform(dicts)

    model = LogisticRegression(solver='liblinear').fit(X, y)

And then saved with Pickle.
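The saving step is not shown in the homework. A minimal sketch of how objects are typically written to and read back from a binary file with `pickle` follows; the helper names `save_artifact` and `load_artifact` are hypothetical.

```python
import pickle


def save_artifact(obj, path):
    """Serialize any picklable object to a binary file."""
    with open(path, "wb") as f_out:
        pickle.dump(obj, f_out)


def load_artifact(path):
    """Load an object previously written by save_artifact."""
    with open(path, "rb") as f_in:
        return pickle.load(f_in)
```

For example, `save_artifact(dv, "dv.bin")` followed by `load_artifact("dv.bin")` would round-trip the vectorizer. Pickled Scikit-Learn objects are generally best loaded with the same library version that created them, which is presumably why Question 2 pins `scikit-learn==1.0.2`. Only unpickle files from a source you trust.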

In [4]:
!md5sum ./data/model1.bin ./data/dv.bin

3f57f3ebfdf57a9e1368dcd0f28a4a14 *./data/model1.bin
6b7cded86a52af7e81859647fa3a5c2e *./data/dv.bin
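On systems without `md5sum`, the same check can be done in Python with `hashlib`. This is a sketch; the helper name `md5sum` and the `EXPECTED` dictionary are introduced here for illustration, with the checksum values copied from the output above.

```python
import hashlib

# Checksums copied from the md5sum output above.
EXPECTED = {
    "./data/model1.bin": "3f57f3ebfdf57a9e1368dcd0f28a4a14",
    "./data/dv.bin": "6b7cded86a52af7e81859647fa3a5c2e",
}


def md5sum(path, chunk_size=8192):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f_in:
        for chunk in iter(lambda: f_in.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A quick check would then be `all(md5sum(path) == checksum for path, checksum in EXPECTED.items())`.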


In [5]:
with open(os.path.join('data','model1.bin'), 'rb') as f_model:
    model = pickle.load(f_model)
    
with open(os.path.join('data','dv.bin'), 'rb') as f_dv:
    dv = pickle.load(f_dv)
    
model, dv

(LogisticRegression(solver='liblinear'), DictVectorizer(sparse=False))

# Question 3

Let's use these models!

Write a script that loads these models with pickle.

Score this client:
{"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}

What's the probability that this client will get a credit card?

In [6]:
client = {"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}

In [7]:
X = dv.transform([client])
model.predict_proba(X)[0, 1]

0.16213414434326598

Answer: __0.162__

# Question 4

Now let's serve this model as a web service

* Install Flask and gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this client using requests:

In [8]:
url = 'http://localhost:9696/predict'
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()

{'y_pred': 0.9282218018527452}

Answer: __0.928__
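The Flask service behind `http://localhost:9696/predict` is not included in this notebook. A minimal sketch of what it might look like is below. The `/predict` route, port 9696, and the `y_pred` key come from the request and response above; the function names, the app name, and the `data/` paths are assumptions.

```python
import pickle


def load_pickle(path):
    """Load a pickled object from a binary file."""
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


def predict_single(client, dv, model):
    """Return the probability of the positive class for one client dict."""
    X = dv.transform([client])
    return float(model.predict_proba(X)[0, 1])


def create_app(dv, model):
    """Build a Flask app that scores clients posted to /predict."""
    # Imported here so predict_single can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask("credit-card")  # hypothetical app name

    @app.route("/predict", methods=["POST"])
    def predict():
        client = request.get_json()
        return jsonify({"y_pred": predict_single(client, dv, model)})

    return app


def main():
    # Assumed file locations, matching the notebook's data/ directory.
    dv = load_pickle("data/dv.bin")
    model = load_pickle("data/model1.bin")
    create_app(dv, model).run(host="0.0.0.0", port=9696)
```

Calling `main()` starts Flask's development server. To serve with Gunicorn or waitress instead, one option is to expose a module-level `app = create_app(dv, model)` and point the server at that object.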

# Question 5

Download the base image svizor/zoomcamp-model:3.9.12-slim. You can get it with the docker pull command.

So what's the size of this base image?

In [9]:
!docker images | grep svizor

svizor/zoomcamp-model   3.9.12-slim   571a6fdc554b   3 days ago       125MB


Answer: __125 MB__

# Dockerfile

Now create your own Dockerfile based on the image we prepared.

It should start like this:

    FROM svizor/zoomcamp-model:3.9.12-slim

Now complete it:

* Install all the dependencies from the Pipenv files
* Copy your Flask script
* Run it with Gunicorn

After that, you can build your docker image.

# Question 6

Let's run your docker container!

After running it, score this client once again:

In [10]:
url = "http://localhost:9696/predict"
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()

{'y_pred': 0.7692649226628628}

What's the probability that this client will get a credit card now?

Answer: __0.769__