# Homework

In this homework, we will use Credit Card Data from the previous homework.

Note: sometimes your answer doesn't match one of the options exactly. That's fine. Select the option that's closest to your solution.

## Question 1

- Install Pipenv
- What's the version of pipenv you installed?
- Use --version to find out

2022.10.9

## Question 2

- Use Pipenv to install Scikit-Learn version 1.0.2
- What's the first hash for scikit-learn you get in Pipfile.lock?

Note: you should create an empty folder for homework and do it there.

6d618de8e2f5dfa46023d41e14f147a45f4586ba27af7a346bc65a4aa602b80e
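
If you would rather not scan Pipfile.lock by eye, the hash can also be read programmatically. The sketch below assumes the usual lock-file layout, where installed packages sit under the `default` key and each entry carries a `hashes` list. The helper name `first_hash` is made up for illustration and is not part of Pipenv.

```python
import json


def first_hash(lock_path, package, section="default"):
    """Return the first hash listed for `package` in a Pipfile.lock.

    Hypothetical helper; assumes the lock file is JSON with the
    structure {section: {package: {"hashes": [...]}}}.
    """
    with open(lock_path) as f_in:
        lock = json.load(f_in)
    return lock[section][package]["hashes"][0]


# Assumed usage, run from the folder that contains Pipfile.lock:
# print(first_hash("Pipfile.lock", "scikit-learn"))
```

Entries in the lock file are typically prefixed with the algorithm name (for example `sha256:`), so strip that prefix before comparing with the value above.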

## Models


We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

    features = ['reports', 'share', 'expenditure', 'owner']
    dicts = df[features].to_dict(orient='records')

    dv = DictVectorizer(sparse=False)
    X = dv.fit_transform(dicts)

    model = LogisticRegression(solver='liblinear').fit(X, y)
    
Note: You don't need to train the model. This code is just for your reference.

They were then saved with Pickle. Download them:

- DictVectorizer
- LogisticRegression

With wget:

    PREFIX=https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/cohorts/2022/05-deployment/homework

    wget $PREFIX/model1.bin
    wget $PREFIX/dv.bin
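
For reference, the saving step mentioned above is not shown in the training code. A minimal sketch of how it might look, assuming the standard pickle module and the file names used in this homework; the helper name `save` is hypothetical:

```python
import pickle


def save(obj, filename):
    """Serialize `obj` to `filename` with pickle (hypothetical helper)."""
    with open(filename, 'wb') as f_out:
        pickle.dump(obj, f_out)


# Assumed usage, with `dv` and `model` trained as in the reference code:
# save(dv, 'dv.bin')
# save(model, 'model1.bin')
```

Files written this way are read back with `pickle.load` on a file opened in `'rb'` mode.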

## Question 3

Let's use these models!

- Write a script for loading these models with pickle
- Score this client:

    {"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}

What's the probability that this client will get a credit card?

- 0.162
- 0.391
- 0.601
- 0.993

If you're getting errors when unpickling the files, check their checksum:

    $ md5sum model1.bin dv.bin
    3f57f3ebfdf57a9e1368dcd0f28a4a14  model1.bin
    6b7cded86a52af7e81859647fa3a5c2e  dv.bin
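
If `md5sum` is not available on your system, the same check can be done in Python. This is a sketch using only the standard `hashlib` module; the function names `md5_of_file` and `check_files` are made up for illustration, and the expected values are copied from the output above.

```python
import hashlib


def md5_of_file(path, chunk_size=8192):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f_in:
        for chunk in iter(lambda: f_in.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Expected checksums, copied from the md5sum output above
EXPECTED = {
    "model1.bin": "3f57f3ebfdf57a9e1368dcd0f28a4a14",
    "dv.bin": "6b7cded86a52af7e81859647fa3a5c2e",
}


def check_files(expected=EXPECTED):
    """Map each file name to True if its checksum matches the expected one."""
    return {name: md5_of_file(name) == md5 for name, md5 in expected.items()}


# Assumed usage, run from the folder with the downloaded files:
# print(check_files())
```

A `False` for either file suggests the download is incomplete or corrupted, in which case downloading it again is the first thing to try.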

    import pickle


    def load(filename):
        # Read a pickled object from disk
        with open(filename, 'rb') as f_in:
            return pickle.load(f_in)


    dv = load('dv.bin')
    model = load('model1.bin')

    customer = {"reports": 0, "share": 0.001694, "expenditure": 0.12, "owner": "yes"}

    # Turn the dict into a feature matrix, then take the probability of the positive class
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)[0, 1]

    print(round(y_pred, 3))

0.162


## Question 4

Now let's serve this model as a web service

- Install Flask and gunicorn (or waitress, if you're on Windows)
- Write Flask code for serving the model
- Now score this client using requests:


    import requests

    url = "YOUR_URL"
    client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
    requests.post(url, json=client).json()

What's the probability that this client will get a credit card?

- 0.274
- 0.484
- 0.698
- 0.928

    import pickle

    from flask import Flask, jsonify, request


    def load(filename):
        # Read a pickled object from disk
        with open(filename, 'rb') as f_in:
            return pickle.load(f_in)


    dv = load('dv.bin')
    model = load('model1.bin')

    app = Flask('card')


    @app.route('/predict', methods=['POST'])
    def predict():
        # The client is sent as JSON in the request body
        customer = request.get_json()

        X = dv.transform([customer])
        y_pred = model.predict_proba(X)[0, 1]
        card_pred = y_pred >= 0.5

        # Convert NumPy types to plain Python types so they can be serialized to JSON
        result = {
            'card_probability': float(y_pred),
            'card': bool(card_pred)
        }

        return jsonify(result)


    if __name__ == "__main__":
        app.run(debug=True, host='0.0.0.0', port=9696)

