## Homework

In this homework, we will use Credit Card Data from [the previous homework](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/cohorts/2022/04-evaluation/homework.md).

> Note: sometimes your answer doesn't match one of the options exactly. That's fine. 
Select the option that's closest to your solution.


## Question 1

* Install Pipenv
* What's the version of pipenv you installed?
* Use `--version` to find out


In [1]:
!pipenv --version

pipenv, version 2022.10.9

## Question 2

* Use Pipenv to install Scikit-Learn version 1.0.2
* What's the first hash for scikit-learn you get in Pipfile.lock?

Note: you should create an empty folder for homework
and do it there. 

**"sha256:08ef968f6b72033c16c479c966bf37ccd49b06ea91b765e1cc27afefe723920b"**
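The hash can also be read programmatically, since `Pipfile.lock` is a JSON file. The sketch below assumes the usual layout, in which packages installed with `pipenv install` appear under the `default` section; `first_hash` is a hypothetical helper name, not part of Pipenv.

```python
import json


def first_hash(lock_path, package, section="default"):
    """Return the first hash listed for `package` in a Pipfile.lock file."""
    with open(lock_path, encoding="utf-8") as f_in:
        lock = json.load(f_in)
    # Each package entry carries a "hashes" list; take its first item.
    return lock[section][package]["hashes"][0]
```

Calling `first_hash("Pipfile.lock", "scikit-learn")` in the homework folder should return the first `sha256:...` string for that package.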


## Models

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

```python
features = ['reports', 'share', 'expenditure', 'owner']
dicts = df[features].to_dict(orient='records')

dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)

model = LogisticRegression(solver='liblinear').fit(X, y)
```

> **Note**: You don't need to train the model. This code is just for your reference.

And then saved with Pickle. Download them:

* [DictVectorizer](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/cohorts/2022/05-deployment/homework/dv.bin?raw=true)
* [LogisticRegression](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/cohorts/2022/05-deployment/homework/model1.bin?raw=true)

With `wget`:

```bash
PREFIX=https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework
wget $PREFIX/model1.bin
wget $PREFIX/dv.bin
```



In [3]:
PREFIX="https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework"

!wget $PREFIX/model1.bin
!wget $PREFIX/dv.bin

--2022-10-11 02:26:24--  https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework/model1.bin
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 889 [application/octet-stream]
Saving to: ‘model1.bin’


2022-10-11 02:26:25 (32.4 MB/s) - ‘model1.bin’ saved [889/889]

--2022-10-11 02:26:25--  https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework/dv.bin
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 333

In [5]:
!ls

HW5.ipynb  Pipfile.lock		    dv.bin    homework.md
Pipfile    credit_predict_flask.py  homework  model1.bin


In [1]:
import pickle

from flask import Flask
from flask import request
from flask import jsonify

import requests

## Question 3

Let's use these models!

* Write a script for loading these models with pickle
* Score this client:

```json
{"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}
```

What's the probability that this client will get a credit card? 

* 0.162
* 0.391
* 0.601
* 0.993

If you're getting errors when unpickling the files, check their checksum:

```bash
$ md5sum model1.bin dv.bin
3f57f3ebfdf57a9e1368dcd0f28a4a14  model1.bin
6b7cded86a52af7e81859647fa3a5c2e  dv.bin
```



In [2]:
!md5sum model1.bin dv.bin

3f57f3ebfdf57a9e1368dcd0f28a4a14  model1.bin
6b7cded86a52af7e81859647fa3a5c2e  dv.bin
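Where `md5sum` is not available (for example on Windows), the same check can be done from Python with `hashlib`. This is a minimal sketch; `md5_of_file` and `find_mismatches` are hypothetical helper names, and the expected values are the checksums listed above.

```python
import hashlib

# Checksums quoted in the homework text
EXPECTED = {
    "model1.bin": "3f57f3ebfdf57a9e1368dcd0f28a4a14",
    "dv.bin": "6b7cded86a52af7e81859647fa3a5c2e",
}


def md5_of_file(path, chunk_size=8192):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f_in:
        for chunk in iter(lambda: f_in.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected):
    """Return the paths whose MD5 digest differs from the expected one."""
    return [path for path, md5 in expected.items() if md5_of_file(path) != md5]
```

With both files downloaded intact, `find_mismatches(EXPECTED)` should return an empty list.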


In [3]:
model_file = 'model1.bin'
dv_file = 'dv.bin'

# The with-statement closes each file automatically
with open(model_file, 'rb') as f_in:
    model1 = pickle.load(f_in)

with open(dv_file, 'rb') as f_in:
    dv1 = pickle.load(f_in)


client = {"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}


def predict(client):
    # Vectorize the client and take the probability of the positive class
    X = dv1.transform([client])
    y_pred = model1.predict_proba(X)[0, 1]
    credit_bool = y_pred >= 0.5

    # Convert NumPy types to plain Python types
    return float(y_pred), bool(credit_bool)


credit_prob, credit_bool = predict(client)

print("Probability that customer will get a credit card is: ", round(credit_prob, 3))
print("Will he get a credit card? ", credit_bool)

Probability that customer will get a credit card is:  0.162
Will he get a credit card?  False
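For intuition about what `predict_proba(X)[0, 1]` returns here: for a binary logistic regression, the positive-class probability is the sigmoid of a weighted sum of the features plus a bias. The sketch below illustrates that calculation in plain Python. The weights and bias passed to it are placeholders chosen by the caller, not the coefficients stored in `model1.bin`, and both function names are hypothetical.

```python
import math


def predict_proba_positive(x, weights, bias):
    """Sigmoid of the linear score w.x + b, written to avoid overflow."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def decide(prob, threshold=0.5):
    """Apply the same 0.5 cut-off used in the cell above."""
    return prob >= threshold
```

A linear score of zero gives a probability of exactly 0.5, so the 0.5 threshold corresponds to the sign of the score.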


## Question 4

Now let's serve this model as a web service

* Install Flask and gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this client using `requests`:

```python
url = "YOUR_URL"
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit card?

* 0.274
* 0.484
* 0.698
* 0.928



```python
# Flask app that predicts whether a client is likely to get a credit card.
# Run it from a terminal (here: an Ubuntu shell).

import pickle

from flask import Flask
from flask import request
from flask import jsonify

model_file = "model1.bin"
dv_file = "dv.bin"

# The with-statement closes each file automatically
with open(model_file, 'rb') as f_in:
    model1 = pickle.load(f_in)

with open(dv_file, 'rb') as f_in:
    dv1 = pickle.load(f_in)

app = Flask('credit_card')


@app.route('/predict', methods=['POST'])
def predict():
    # Vectorize the posted client and score it with the model
    client = request.get_json()
    X = dv1.transform([client])
    y_pred = model1.predict_proba(X)[0, 1]
    credit_bool = y_pred >= 0.5

    result = {
        'credit_card_prob': float(y_pred),
        'credit_bool': bool(credit_bool)
    }
    return jsonify(result)


if __name__ == '__main__':
    # Binding to 0.0.0.0 listens on all interfaces, so the service is
    # reachable locally at 127.0.0.1 (or localhost).
    app.run(debug=True, host='0.0.0.0', port=9696)
```


After starting the credit_card_predict service from VS Code on port 9696:

In [5]:
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}

# The service answers on all four URLs:
# url = 'http://0.0.0.0:9696/predict'
# url = 'http://localhost:9696/predict'
url = 'http://127.0.0.1:9696/predict'
# url = 'http://172.30.58.76:9696/predict'


response = requests.post(url, json=client).json()

print(response)

if response['credit_bool']:
    print('\nSending promo email to client with these details to get a credit card %s' % client)
else:
    print('\nNot sending promo email for credit card to:\n %s' % client)

{'credit_bool': True, 'credit_card_prob': 0.9282218018527452}

Sending promo email to client with these details to get a credit card {'reports': 0, 'share': 0.245, 'expenditure': 3.438, 'owner': 'yes'}
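The same request can be made with only the standard library, which may help when `requests` is not installed in the environment calling the service. This is a minimal sketch; `build_request` and `score` are hypothetical helper names, and the URL is whatever address the service is running on.

```python
import json
import urllib.request


def build_request(url, client):
    """Build a POST request carrying the client as a JSON body."""
    body = json.dumps(client).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(url, client, timeout=5):
    """Send the client to the service and return the decoded JSON reply."""
    request = build_request(url, client)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `score('http://127.0.0.1:9696/predict', client)` should return the same dictionary as the `requests` call above.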


## Docker

Install [Docker](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/05-deployment/06-docker.md). We will use it for the next two questions.

For these questions, we prepared a base image: `svizor/zoomcamp-model:3.9.12-slim`. 
You'll need to use it (see Question 5 for an example).

This image is based on `python:3.9.12-slim` and has a logistic regression model 
(a different one) as well as a dictionary vectorizer inside. 

This is what the Dockerfile for this image looks like:

```docker 
FROM python:3.9.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]
```

We already built it and then pushed it to [`svizor/zoomcamp-model:3.9.12-slim`](https://hub.docker.com/r/svizor/zoomcamp-model).

> **Note**: You don't need to build this docker image, it's just for your reference.



## Question 5

Download the base image `svizor/zoomcamp-model:3.9.12-slim`. You can easily get it with the [docker pull](https://docs.docker.com/engine/reference/commandline/pull/) command.

So what's the size of this base image?

* 15 MB
* 125 MB
* 275 MB
* 415 MB

You can get this information by running `docker images`. It will be in the "SIZE" column.

In [4]:
!docker pull svizor/zoomcamp-model:3.9.12-slim

3.9.12-slim: Pulling from svizor/zoomcamp-model
Digest: sha256:10445b40653d5ac17ede84db17f42ae8c4090b347a979372b8102174498b33b9
Status: Image is up to date for svizor/zoomcamp-model:3.9.12-slim
docker.io/svizor/zoomcamp-model:3.9.12-slim


In [11]:
!docker images

REPOSITORY              TAG           IMAGE ID       CREATED             SIZE
zoomcamp                latest        959ca4cc3b25   22 minutes ago      755MB
zoomcamp-test           latest        39a3f542fe81   About an hour ago   755MB
svizor/zoomcamp-model   3.9.12-slim   571a6fdc554b   9 days ago          125MB


Size is: 125MB

## Dockerfile

Now create your own Dockerfile based on the image we prepared.

It should start like this:

```docker
FROM svizor/zoomcamp-model:3.9.12-slim
# add your stuff here
```

Now complete it:

* Install all the dependencies from the Pipenv file
* Copy your Flask script
* Run it with Gunicorn 

After that, you can build your docker image.


Dockerfile:

```docker
FROM svizor/zoomcamp-model:3.9.12-slim

RUN pip install pipenv

WORKDIR /app
COPY ["Pipfile", "Pipfile.lock", "./"]

RUN pipenv install --system --deploy

COPY ["docker_predict_credit.py", "./"]

EXPOSE 9000
ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9000", "docker_predict_credit:app"]
```


**In the prediction code used for Docker (`docker_predict_credit.py`), the port was changed to 9000 and `model2.bin` was set as the model file.**
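One way to avoid editing the script by hand between the local run and the Docker run is to read the file names from the environment, falling back to the names used in the base image. This is only a sketch of that idea, not the code used for the answer below; `load_artifacts` and the `MODEL_FILE` and `DV_FILE` variable names are assumptions introduced for illustration.

```python
import os
import pickle


def load_artifacts(model_path=None, dv_path=None):
    """Load the pickled model and vectorizer.

    Explicit arguments win; otherwise the (hypothetical) MODEL_FILE and
    DV_FILE environment variables are used, then the default file names.
    """
    model_path = model_path or os.environ.get("MODEL_FILE", "model2.bin")
    dv_path = dv_path or os.environ.get("DV_FILE", "dv.bin")

    with open(model_path, "rb") as f_in:
        model = pickle.load(f_in)
    with open(dv_path, "rb") as f_in:
        dv = pickle.load(f_in)
    return model, dv
```

Inside the container the defaults resolve to the `model2.bin` and `dv.bin` files that the base image copies into `/app`.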

## Question 6

Let's run your docker container!

After running it, score this client once again:

```python
url = "YOUR_URL"
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit card now?

* 0.289
* 0.502
* 0.769
* 0.972



In [2]:
client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}

url = 'http://0.0.0.0:9000/predict'

response = requests.post(url, json=client).json()

print(response)

if response['credit_bool']:
    print('\nSending promo email to client with these details to get a credit card %s' % client)
else:
    print('\nNot sending promo email for credit card to:\n %s' % client)

{'credit_bool': True, 'credit_card_prob': 0.7692649226628628}

Sending promo email to client with these details to get a credit card {'reports': 0, 'share': 0.245, 'expenditure': 3.438, 'owner': 'yes'}
