In [1]:
import pickle

## Question 1

* Install Pipenv
* What's the version of pipenv you installed?
* Use `--version` to find out


In [3]:
!pipenv --version

[1mpipenv[0m, version 2023.10.3


## Question 2

* Use Pipenv to install Scikit-Learn version 1.3.1
* What's the first hash for scikit-learn you get in Pipfile.lock?

> **Note**: you should create an empty folder for homework
and do it there. 

`sha256:0c275a06c5190c5ce00af0acbb61c06374087949f643ef32d355ece12c4db043`

In [2]:
!curl -O https://github.com/DataTalksClub/machine-learning-zoomcamp/raw/master/cohorts/2023/05-deployment/homework/model1.bin

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0


In [3]:
!ls

Pipfile        Pipfile.lock   dv.bin         homework.ipynb model1.bin


## Models

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

```python
features = ['job','duration', 'poutcome']
dicts = df[features].to_dict(orient='records')

dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)

model = LogisticRegression().fit(X, y)
```

> **Note**: You don't need to train the model. This code is just for your reference.

And then saved with Pickle. Download them:

* [DictVectorizer](https://github.com/DataTalksClub/machine-learning-zoomcamp/tree/master/cohorts/2023/05-deployment/homework/dv.bin?raw=true)
* [LogisticRegression](https://github.com/DataTalksClub/machine-learning-zoomcamp/tree/master/cohorts/2023/05-deployment/homework/model1.bin?raw=true)


## Question 3

Let's use these models!

* Write a script for loading these models with pickle
* Score this client:

```json
{"job": "retired", "duration": 445, "poutcome": "success"}
```

What's the probability that this client will get a credit? 

* 0.162
* 0.392
* 0.652
* 0.902 <- correct

In [2]:
dv_file = 'dv.bin'
model_file = 'model1.bin'

In [3]:
with open(dv_file, 'rb') as f_dv:
    dv = pickle.load(f_dv)
dv

In [4]:
f_dv.close()

In [5]:
with open('model1.bin', 'rb') as f_model:
    model = pickle.load(f_model)
f_model.close()
model

In [7]:
client = {"job": "retired", "duration": 445, "poutcome": "success"}

In [8]:
X = dv.transform(client)

In [10]:
model.predict_proba(X)[0, 1]

0.9019309332297606

## Question 4

Now let's serve this model as a web service

* Install Flask and gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this client using `requests`:

```python
url = "YOUR_URL"
client = {"job": "unknown", "duration": 270, "poutcome": "failure"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit?

* 0.140 <- correct
* 0.440
* 0.645
* 0.84

In [7]:
from flask import Flask, request, jsonify

In [15]:
import requests

In [17]:
url = "http://localhost:2912/credit_predict"
client = {"job": "unknown", "duration": 270, "poutcome": "failure"}
requests.post(url, json=client).json()

{'approved': False, 'probability': 0.14}

## Question 5

Download the base image `svizor/zoomcamp-model:3.10.12-slim`. You can easily make it by using [docker pull](https://docs.docker.com/engine/reference/commandline/pull/) command.

So what's the size of this base image?

* 47 MB
* 147 MB <- correct
* 374 MB
* 574 MB

You can get this information when running `docker images` - it'll be in the "SIZE" column.

In [19]:
!docker images

REPOSITORY              TAG            IMAGE ID       CREATED         SIZE
svizor/zoomcamp-model   3.10.12-slim   08266c8f0c4b   3 days ago      147MB
python                  3.9.18-slim    cdecdc3a8469   6 weeks ago     126MB
ubuntu                  latest         01f29b872827   2 months ago    77.8MB
python                  3.9            8bdfd6cc4bbf   2 months ago    997MB
python                  latest         cd9c1d09c087   2 months ago    1.01GB
hello-world             latest         9c7a54a9a43c   5 months ago    13.3kB
python                  3.8.12-slim    513da2530098   19 months ago   122MB


## Question 6

Let's run your docker container!

After running it, score this client once again:

```python
url = "YOUR_URL"
client = {"job": "retired", "duration": 445, "poutcome": "success"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit now?

* 0.168
* 0.530
* 0.730
* 0.968 <- closest to my result

In [21]:
client = {"job": "retired", "duration": 445, "poutcome": "success"}
requests.post(url, json=client).json()

{'approved': True, 'probability': 0.902}