# 5.10 Homework
---
In this homework, we'll use the churn prediction model trained on a smaller set of features.

## Question 1
---
- Install Pipenv
- What's the version of pipenv you installed?
- Use --version to find out

In [1]:
!pipenv --version

[39m[1mpipenv[39m[22m, version 2021.5.29
[0m

In [2]:
print("[ANSWER-1] The installed version of pipenv is: 2021.5.29")

[ANSWER-1] The installed version of pipenv is: 2021.5.29


## Question 2
---
- Use Pipenv to install Scikit-Learn version 1.0
- What's the first hash for scikit-learn you get in Pipfile.lock?

In [3]:
import json

with open("Pipfile.lock") as f:
    data = json.load(f)
    first_hash = data['default']['scikit-learn']['hashes'][0]

print(f"[ANSWER-2] The first hash for scikit-learn is:\n{first_hash}")

[ANSWER-2] The first hash for scikit-learn is:
sha256:121f78d6564000dc5e968394f45aac87981fcaaf2be40cfcd8f07b2baa1e1829


### Models
---
We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

```
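from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# df (feature DataFrame) and y (churn target) are assumed to be prepared beforehand
features = ['tenure', 'monthlycharges', 'contract']
dicts = df[features].to_dict(orient='records')
dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)
model = LogisticRegression().fit(X, y)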
```


They were then saved with Pickle. Load them:

- DictVectorizer
- LogisticRegression


In [4]:
import pickle

with open("homework/model1.bin", 'rb') as f_model:
    model = pickle.load(f_model)

with open("homework/dv.bin", 'rb') as f_dv:
    dv = pickle.load(f_dv)


In [5]:
model

LogisticRegression()

In [6]:
dv

DictVectorizer(sparse=False)

## Question 3
---
Let's use these models!

- Write a script for loading these models
- Score this customer:

> {"contract": "two_year", "tenure": 12, "monthlycharges": 19.7}

What's the probability that this customer is churning?



In [7]:
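def make_pred(customer, model, dv):
    """Return the churn probability for a single customer dict."""
    # DictVectorizer expects a list of records, so wrap the single customer
    X = dv.transform([customer])
    # One row per record; column 1 holds the probability of the positive (churn) class
    return model.predict_proba(X)[0, 1]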

In [8]:
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 19.7}
pred = make_pred(customer, model, dv)

print(f"[ANSWER-3] The probability that this customer is churning is: {round(pred, 3)}")

[ANSWER-3] The probability that this customer is churning is: 0.115


## Question 4
---
Now let's serve this model as a web service.

- Install Flask and Gunicorn (or waitress, if you're on Windows)
- Write Flask code for serving the model
- Now score this customer using requests:

```
url = "YOUR_URL"
customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
requests.post(url, json=customer).json()
```

What's the probability that this customer is churning?

**I ran this command in the console:**

>gunicorn --bind 0.0.0.0:9999 churn_service:app
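
The `churn_service` module that this command points at is not shown in the notebook. The following is only a minimal sketch of what such a module could look like, assuming a `/predict` endpoint and a `churn_probability` response key (both taken from the request code in this question). The helper names `load_pickle` and `create_app` are hypothetical and chosen for illustration.

```python
import pickle

from flask import Flask, jsonify, request


def load_pickle(path):
    """Load a pickled object (model or vectorizer) from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def create_app(model, dv):
    """Build a Flask app that scores one customer per POST request.

    `model` needs a predict_proba method and `dv` a transform method,
    as LogisticRegression and DictVectorizer provide.
    """
    app = Flask("churn")

    @app.route("/predict", methods=["POST"])
    def predict():
        customer = request.get_json()
        X = dv.transform([customer])
        # Cast to a plain float so the value is JSON serializable
        churn_probability = float(model.predict_proba(X)[0][1])
        return jsonify({"churn_probability": churn_probability})

    return app
```

For gunicorn to find `churn_service:app`, the module would also need a module-level line such as `app = create_app(load_pickle("homework/model1.bin"), load_pickle("homework/dv.bin"))`, using the file paths from the loading cell earlier. Passing the model and vectorizer into a factory keeps the route logic testable without the pickle files.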

In [15]:
import requests

url = "http://localhost:9999/predict"
customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
pred = requests.post(url, json=customer).json()

print(f"[ANSWER-4] The probability that this customer is churning is: {round(pred['churn_probability'], 3)}")

[ANSWER-4] The probability that this customer is churning is: 0.999


## Docker
---
Install Docker. We will use it for the next two questions.

For these questions, I prepared a base image: **agrigorev/zoomcamp-model:3.8.12-slim**. You'll need to use it (see Question 5 for an example).

This image is based on **python:3.8.12-slim** and has a logistic regression model (a different one) as well as a dictionary vectorizer inside.

This is what the Dockerfile for this image looks like:

```
FROM python:3.8.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]
```

I already built it and then pushed it to **agrigorev/zoomcamp-model:3.8.12-slim**.

> Note: You don't need to build this Docker image; it's just for your reference.

## Question 5
Now create your own Dockerfile based on the image I prepared.

It should start like this:

```
FROM agrigorev/zoomcamp-model:3.8.12-slim
# add your stuff here
```

Now complete it:

- Install all the dependencies from the Pipenv file
- Copy your Flask script
- Run it with gunicorn

When you build your image, what's the digest for **agrigorev/zoomcamp-model:3.8.12-slim**?

Look at the first step of your build log. It should look something like this:

```
Step 1/3 : FROM python:3.8.12-slim
 ---> 2e56f6b0af69
```


**I built the image with the following command:**

>docker build -t zoomcamp_homework_5 .

**and got this log:**

```
Sending build context to Docker daemon  75.26kB
Step 1/7 : FROM agrigorev/zoomcamp-model:3.8.12-slim
3.8.12-slim: Pulling from agrigorev/zoomcamp-model
bd897bb914af: Pull complete 
aee78d822213: Pull complete 
6d9f6b5c1e71: Pull complete 
cf9f290bd6be: Pull complete 
5e4b501cbda5: Pull complete 
bd464adb9682: Pull complete 
c803b748156d: Pull complete 
Digest: sha256:1ee036b365452f8a1da0dbc3bf5e7dd0557cfd33f0e56b28054d1dbb9c852023
Status: Downloaded newer image for agrigorev/zoomcamp-model:3.8.12-slim
 ---> f0f43f7bc6e0
```
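
The Dockerfile used for this build is not included in the notebook. As a rough sketch only, one plausible layout that follows the three bullet points of the question could look like the block below. The file name `churn_service.py` is an assumption inferred from the gunicorn target used in Question 4, and port 9999 is taken from the commands in this notebook. Because the base image ships `model2.bin` rather than `model1.bin`, the Flask script inside the container would need to load `model2.bin` and `dv.bin` from `/app`.

```
FROM agrigorev/zoomcamp-model:3.8.12-slim

# Install pipenv so the locked dependencies can be installed
RUN pip install pipenv

# WORKDIR /app is inherited from the base image, next to model2.bin and dv.bin
COPY ["Pipfile", "Pipfile.lock", "./"]

# Install into the system interpreter; --deploy aborts if Pipfile.lock is out of date
RUN pipenv install --system --deploy

# Hypothetical file name for the Flask script
COPY ["churn_service.py", "./"]

EXPOSE 9999

ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9999", "churn_service:app"]
```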



In [16]:
print("[ANSWER-5] The digest for agrigorev/zoomcamp-model:3.8.12-slim is: f0f43f7bc6e0")

[ANSWER-5] The digest for agrigorev/zoomcamp-model:3.8.12-slim is: f0f43f7bc6e0


## Question 6
---
Let's run your docker container!

After running it, score the same customer:

```
url = "YOUR_URL"
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 10}
requests.post(url, json=customer).json()
```

What's the probability that this customer is churning?

**I ran the container with the following command:**

>docker run -it --rm -p 9999:9999 zoomcamp_homework_5

In [19]:
import requests

url = "http://localhost:9999/predict"
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 10}
pred = requests.post(url, json=customer).json()

print(f"[ANSWER-6] The probability that this customer is churning is: {round(pred['churn_probability'], 3)}")

[ANSWER-6] The probability that this customer is churning is: 0.728


## Testing Heroku Service

In [3]:
import requests

url = "https://churn-service.herokuapp.com/predict"
customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
pred = requests.post(url, json=customer).json()

print(f"The probability that this customer is churning is: {round(pred['churn_probability'], 3)}")

The probability that this customer is churning is: 0.999
