**Question 1**

* Install Pipenv
* What's the version of pipenv you installed?
* Use --version to find out


```bash
pip install pipenv
```

In [1]:
!pipenv --version

pipenv, version 2023.6.12


**Question 2**
* Use Pipenv to install Scikit-Learn version 1.3.1
* What's the first hash for scikit-learn you get in Pipfile.lock?
> Note: you should create an empty folder for homework and do it there.

```bash
pipenv install scikit-learn==1.3.1
```

In [5]:
!grep -A2 "scikit-learn"  Pipfile.lock

        "scikit-learn": {
            "hashes": [
                "sha256:0c275a06c5190c5ce00af0acbb61c06374087949f643ef32d355ece12c4db043",


Answer: 0c275a06c5190c5ce00af0acbb61c06374087949f643ef32d355ece12c4db043
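
The same lookup can be done without `grep` by parsing `Pipfile.lock` as JSON. The sketch below assumes the usual lock-file layout, where runtime packages sit under the `default` key and each entry carries a `hashes` list. `first_hash` is a hypothetical helper name, and `sample_lock` is a trimmed stand-in for the real file.

```python
import json


def first_hash(lock: dict, package: str, section: str = "default") -> str:
    """Return the first hash listed for `package`, without the 'sha256:' prefix."""
    entry = lock[section][package]["hashes"][0]
    # Hash entries look like "sha256:<hex digest>"; keep only the digest.
    return entry.split(":", 1)[1]


# Trimmed stand-in for Pipfile.lock, reusing the first hash shown above.
sample_lock = json.loads("""
{
    "default": {
        "scikit-learn": {
            "hashes": [
                "sha256:0c275a06c5190c5ce00af0acbb61c06374087949f643ef32d355ece12c4db043"
            ],
            "version": "==1.3.1"
        }
    }
}
""")

print(first_hash(sample_lock, "scikit-learn"))
```

For the real file, load it with `json.load(open("Pipfile.lock"))` and pass the result instead of `sample_lock`.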

**Models**

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:
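```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# df is the training DataFrame and y the target; neither is defined here
features = ['job', 'duration', 'poutcome']
dicts = df[features].to_dict(orient='records')

dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)

model = LogisticRegression().fit(X, y)
```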

_Note_: You don't need to train the model. This code is just for your reference.

And then saved with Pickle. Download them:

* DictVectorizer: `dv.bin`
* LogisticRegression: `model1.bin`

With wget:
```bash
PREFIX=https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework
wget $PREFIX/model1.bin
wget $PREFIX/dv.bin
```

In [15]:
!PREFIX=https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework && wget $PREFIX/model1.bin
!PREFIX=https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework && wget $PREFIX/dv.bin

--2023-10-15 15:00:55--  https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework/model1.bin
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8003::154, 2606:50c0:8000::154, 2606:50c0:8001::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8003::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 842 [application/octet-stream]
Saving to: ‘model1.bin’


2023-10-15 15:00:55 (9,93 MB/s) - ‘model1.bin’ saved [842/842]

--2023-10-15 15:00:55--  https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework/dv.bin
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8003::154, 2606:50c0:8000::154, 2606:50c0:8001::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8003::154|:443... connected.
HTTP request sent, awaiting response... 

**Question 3**
  
Let's use these models!

Write a script for loading these models with _pickle_.
Score this client:
```json
{"job": "retired", "duration": 445, "poutcome": "success"}
```

What's the probability that this client will get a credit?

* 0.162
* 0.392
* 0.652
* 0.902 <--- this

In [11]:
from sklearn.pipeline import Pipeline
import pickle

In [31]:
def score_new_client(model: Pipeline, client_information: dict):
    return model.predict_proba(client_information)

In [32]:
def get_model_pipeline():
    with open('dv.bin', 'rb') as f_in:
        dict_vectorizer = pickle.load(f_in)
    
    with open('model1.bin', 'rb') as f_in:
        model = pickle.load(f_in) 
    
    model_pipeline = Pipeline([('dv', dict_vectorizer), ('model', model)])
    return model_pipeline

In [33]:
model_pipeline = get_model_pipeline()

In [34]:
new_client = {"job": "retired", "duration": 445, "poutcome": "success"}

In [35]:
score_new_client(model_pipeline, new_client)

array([[0.09806907, 0.90193093]])
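
`predict_proba` returns one row per client and one column per class, so the answer is the second column of the first row. This assumes the model's classes are ordered `[0, 1]`, which is worth checking via `model.classes_` on the real model. A minimal sketch, with a plain nested list standing in for the NumPy array above and a hypothetical helper name:

```python
def positive_probability(proba, classes=(0, 1), positive=1) -> float:
    """Pick the probability of the positive class for the first (only) client."""
    column = list(classes).index(positive)
    return float(proba[0][column])


# Values copied from the output above.
proba = [[0.09806907, 0.90193093]]
print(round(positive_probability(proba), 3))  # 0.902
```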

In [30]:
model_pipeline

**Question 4**

Now let's serve this model as a web service

* Install Flask and gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this client using requests:
```python
import requests

url = "YOUR_URL"
client = {"job": "unknown", "duration": 270, "poutcome": "failure"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit?

* 0.140 <--- this
* 0.440
* 0.645
* 0.845
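
The notebook does not show the service itself, so here is one possible shape for it, as a sketch rather than the reference solution. The route `/predict`, the app name `credit`, the response key `probability`, and the function names are all assumptions. Flask is imported inside `create_app` so that the scoring helper can be used and tested without Flask installed.

```python
import pickle


def predict_probability(dv, model, client: dict) -> float:
    """Return the positive-class probability for a single client dict."""
    X = dv.transform([client])
    # Row 0 is the only client; column 1 is assumed to be the positive class.
    return float(model.predict_proba(X)[0][1])


def create_app(dv_path: str = "dv.bin", model_path: str = "model1.bin"):
    """Build a Flask app that scores clients sent as JSON (hypothetical layout)."""
    from flask import Flask, jsonify, request

    with open(dv_path, "rb") as f_in:
        dv = pickle.load(f_in)
    with open(model_path, "rb") as f_in:
        model = pickle.load(f_in)

    app = Flask("credit")

    @app.route("/predict", methods=["POST"])
    def predict():
        client = request.get_json()
        probability = predict_probability(dv, model, client)
        return jsonify({"probability": probability})

    return app
```

If this were saved as `predict.py` (a hypothetical file name), something like `gunicorn --bind 0.0.0.0:9696 "predict:create_app()"` should serve it, and `YOUR_URL` in the request above would then be `http://localhost:9696/predict`. The port is also an arbitrary choice.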

**Docker**

Install Docker. We will use it for the next two questions.

For these questions, we prepared a base image: svizor/zoomcamp-model:3.10.12-slim. You'll need to use it (see Question 5 for an example).

This image is based on python:3.10.12-slim and has a logistic regression model (a different one) as well as a dictionary vectorizer inside.

This is what the Dockerfile for this image looks like:
```dockerfile
FROM python:3.10.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]
```
We already built it and then pushed it to svizor/zoomcamp-model:3.10.12-slim.


**Question 5**

Download the base image svizor/zoomcamp-model:3.10.12-slim. You can get it with the `docker pull` command.

So what's the size of this base image?

* 47 MB
* 147 MB <-- this
* 374 MB
* 574 MB

You can get this information by running `docker images`: it'll be in the "SIZE" column.

In [37]:
!docker images

REPOSITORY                                                    TAG                 IMAGE ID       CREATED         SIZE
svizor/zoomcamp-model                                         3.10.12-slim        08266c8f0c4b   6 days ago      147MB
predict_lambda_function                                       v1                  53cc51131204   7 weeks ago     2.8GB
<none>                                                        <none>              a2be8880efbf   7 weeks ago     2.8GB
<none>                                                        <none>              37fd9ba5042c   7 weeks ago     2.8GB
heart-stroke-prediction-service                               v1                  980b1f755bf4   7 weeks ago     1.52GB
prefect_development_environment                               v1                  b502788f8e08   8 weeks ago     1.84GB
<none>                                                        <none>              ab30eb5771e1   2 months ago    1.84GB
<none>                                        
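
With many images present, the relevant row is easy to miss. As a sketch, the listing can be parsed in Python. This assumes columns are separated by at least two spaces, as in the output above, and `image_size` is a hypothetical helper. The sample text reuses two rows from that output.

```python
import re


def image_size(listing: str, repository: str, tag: str):
    """Return the SIZE column for repository:tag, or None if it is not listed."""
    for line in listing.splitlines()[1:]:  # skip the header row
        columns = re.split(r"\s{2,}", line.strip())
        if len(columns) == 5 and columns[0] == repository and columns[1] == tag:
            return columns[4]
    return None


listing = """\
REPOSITORY                        TAG            IMAGE ID       CREATED        SIZE
svizor/zoomcamp-model             3.10.12-slim   08266c8f0c4b   6 days ago     147MB
predict_lambda_function           v1             53cc51131204   7 weeks ago    2.8GB
"""

print(image_size(listing, "svizor/zoomcamp-model", "3.10.12-slim"))  # 147MB
```

To run it against live output, `subprocess.run(["docker", "images"], capture_output=True, text=True).stdout` can supply the `listing` string.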

**Dockerfile**

Now create your own Dockerfile based on the image we prepared.

It should start like this:
```dockerfile
FROM svizor/zoomcamp-model:3.10.12-slim
# add your stuff here
```
Now complete it:

* Install all the dependencies from the Pipenv files
* Copy your Flask script
* Run it with Gunicorn

After that, you can build your Docker image.

**Question 6**

Let's run your docker container!

After running it, score this client once again:
```python
import requests

url = "YOUR_URL"
client = {"job": "retired", "duration": 445, "poutcome": "success"}
requests.post(url, json=client).json()
```

What's the probability that this client will get a credit now?

* 0.168
* 0.530
* 0.730 <-- this
* 0.968 