# Week 5 - Deploying

# Q1 What's the version of pipenv you installed?

In [1]:
!pip show pipenv

Name: pipenv
Version: 2023.10.3
Summary: Python Development Workflow for Humans.
Home-page: 
Author: 
Author-email: Pipenv maintainer team <distutils-sig@python.org>
License: The MIT License (MIT)

# Q2 What's the first hash for scikit-learn you get in Pipfile.lock?

After running `pipenv install scikit-learn==1.3.1`, the first hash listed for scikit-learn in `Pipfile.lock` is `sha256:0c275a06c5190c5ce00af0acbb61c06374087949f643ef32d355ece12c4db043`.
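Instead of reading the hash by eye, it can be pulled out of the lock file programmatically. The snippet below is a minimal sketch, assuming the usual `Pipfile.lock` layout in which packages sit under a `default` section and each has a `hashes` list; the helper name `first_hash` is hypothetical.

```python
import json


def first_hash(lock: dict, package: str, section: str = "default") -> str:
    """Return the first hash listed for `package` in a parsed Pipfile.lock.

    Assumes the layout {"default": {"<package>": {"hashes": [...]}}}.
    """
    return lock[section][package]["hashes"][0]


def load_lock(path: str = "Pipfile.lock") -> dict:
    """Parse a Pipfile.lock file, which is plain JSON."""
    with open(path) as f:
        return json.load(f)


# Usage, from the directory that contains Pipfile.lock:
#   print(first_hash(load_lock(), "scikit-learn"))
```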

# Q3 What's the probability that this client will get a credit?

```python
{"job": "retired", "duration": 445, "poutcome": "success"}
```

Run the following commands in a terminal:
```shell
PREFIX=https://raw.githubusercontent.com/DataTalksClub/machine-learning-zoomcamp/master/cohorts/2023/05-deployment/homework
wget $PREFIX/model1.bin
wget $PREFIX/dv.bin
```
Verify that the checksums match:
```shell
$ md5sum model1.bin dv.bin
8ebfdf20010cfc7f545c43e3b52fc8a1  model1.bin
924b496a89148b422c74a62dbc92a4fb  dv.bin
```
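Where `md5sum` is not available (for example on Windows), the same check can be done from Python. This is a sketch using the standard `hashlib` module; the expected checksums are the ones listed above, while the helper names `md5_of_file` and `verify` are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksums copied from the md5sum output above.
EXPECTED = {
    "model1.bin": "8ebfdf20010cfc7f545c43e3b52fc8a1",
    "dv.bin": "924b496a89148b422c74a62dbc92a4fb",
}


def verify(expected: dict) -> dict:
    """Map each file name to True if its MD5 matches the expected value."""
    return {name: md5_of_file(name) == md5 for name, md5 in expected.items()}


# Usage, from the directory that contains the downloaded files:
#   print(verify(EXPECTED))
```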

Load the vectorizer and the model (assuming the checksums match):

In [2]:
import pickle 
# from sklearn.feature_extraction import DictVectorizer
# from sklearn.linear_model import LogisticRegression, LinearRegression

with open('dv.bin', 'rb') as f:
    dv = pickle.load(f)
    
with open('model1.bin', 'rb') as f:
    model = pickle.load(f)
    
dv, model

https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


(DictVectorizer(sparse=False), LogisticRegression())

In [4]:
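# Named `client` rather than `input` to avoid shadowing the built-in input()
client = {"job": "retired", "duration": 445, "poutcome": "success"}
X = dv.transform([client])
# Column 1 holds the probability of the positive class
model.predict_proba(X)[:, 1]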

array([0.90193093])

The answer is `0.902`.

# Q4 What's the probability that this client will get a credit?


Run the following commands
```shell
pipenv install flask gunicorn
pipenv shell 
```

Start `homework_app.py`, then run the next cell while the server is running.
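The notebook does not show `homework_app.py` itself. The following is only a sketch of what such a Flask service might look like, not the actual file. It assumes the `/predict` route and port 9696 used by the request in the next cell and the `probability` and `result` keys seen in its response; the 0.5 threshold and the names `predict_single` and `create_app` are assumptions.

```python
import pickle

THRESHOLD = 0.5  # assumed decision threshold; not stated in the notebook


def predict_single(client: dict, dv, model) -> dict:
    """Score one client dict with a fitted vectorizer and classifier."""
    X = dv.transform([client])
    # Row 0, column 1: probability of the positive class for this client
    probability = float(model.predict_proba(X)[0][1])
    return {"probability": probability, "result": probability >= THRESHOLD}


def create_app(dv_path: str = "dv.bin", model_path: str = "model1.bin"):
    """Build the Flask app; the pickled artifacts are loaded once at startup."""
    # Imported here so predict_single can be reused without Flask installed.
    from flask import Flask, jsonify, request

    with open(dv_path, "rb") as f:
        dv = pickle.load(f)
    with open(model_path, "rb") as f:
        model = pickle.load(f)

    app = Flask("credit")

    @app.route("/predict", methods=["POST"])
    def predict():
        return jsonify(predict_single(request.get_json(), dv, model))

    return app


# To serve locally with Flask's development server:
#   create_app().run(host="0.0.0.0", port=9696)
```

Keeping the scoring logic in a plain function separate from the web layer makes it easy to check against the Q3 result before any server is started.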

In [16]:
import requests 

url = "http://10.0.0.7:9696/predict"
client = {"job": "unknown", "duration": 270, "poutcome": "failure"}
response = requests.post(url, json=client).json()
response

{'probability': 0.13968947052356817, 'result': False}

The probability is close to `0.140`.

# Q5 So what's the size of this base image?

Run the following commands:
```shell
docker pull svizor/zoomcamp-model:3.10.12-slim
docker images
```

The image size is 147MB.

# Q6 What's the probability that this client will get a credit now?

We will use the provided `Dockerfile`, `Pipfile`, and `predict_app.py`.
Run the following commands:
```shell
docker build -t zoomcamp_hw5 .
docker run -it --rm -p 9696:9696 zoomcamp_hw5
```
Now run the next cell while the container is running.

In [18]:
url = "http://10.0.0.7:9696/predict"
client = {"job": "retired", "duration": 445, "poutcome": "success"}
requests.post(url, json=client).json()

{'probability': 0.726936946355423, 'result': True}

The probability is approximately `0.73`, so this client is likely to get a credit.