```bash
pip install pipenv
pipenv --version
```

You can run a Python script inside the Pipenv environment without activating `pipenv shell`. Just type:

```bash
pipenv run python predict.py
```

## Question 1

* Install Pipenv
* What's the version of pipenv you installed?
* Use `--version` to find out


**ANSWER**: ```pipenv, version 2022.6.7```



## Question 2

* Use Pipenv to install Scikit-Learn version 1.0
* What's the first hash for scikit-learn you get in Pipfile.lock? 
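
One way to read that hash without scrolling through the lock file is sketched below. It assumes `Pipfile.lock` is JSON with a `default` section keyed by package name, each entry carrying a `hashes` list; the helper name `first_hash` is made up for illustration.

```python
import json
from pathlib import Path


def first_hash(lock_text, package="scikit-learn", section="default"):
    """Return the first hash listed for `package` in Pipfile.lock content."""
    lock = json.loads(lock_text)
    return lock[section][package]["hashes"][0]


# Assumption: the script is run from the folder that holds Pipfile.lock.
lock_path = Path("Pipfile.lock")
if lock_path.exists():
    print(first_hash(lock_path.read_text()))
```

Opening the file in an editor and copying the first entry under `scikit-learn` gives the same result.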


## Models

We've prepared a dictionary vectorizer and a model.

They were trained (roughly) using this code:

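```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# df (the customer DataFrame) and y (the churn target) are assumed to be prepared elsewhere
features = ['tenure', 'monthlycharges', 'contract']
dicts = df[features].to_dict(orient='records')

# Turn each record into a numeric feature vector
dv = DictVectorizer(sparse=False)
X = dv.fit_transform(dicts)

model = LogisticRegression().fit(X, y)
```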

> **Note**: You don't need to train the model. This code is just for your reference.

And then saved with Pickle. Download them:

* [DictVectorizer](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/05-deployment/homework/dv.bin?raw=true)
* [LogisticRegression](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/05-deployment/homework/model1.bin?raw=true)

With wget:

```bash
PREFIX=https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/course-zoomcamp/05-deployment/homework
wget $PREFIX/model1.bin
wget $PREFIX/dv.bin
```
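
If `wget` is not available (for example on Windows), the same two files can be fetched from Python. This is a minimal sketch using only the standard library; the helper names `file_url` and `download` are hypothetical, while the prefix and file names are the ones from the command above.

```python
from urllib.request import urlretrieve

PREFIX = (
    "https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/"
    "master/course-zoomcamp/05-deployment/homework"
)


def file_url(name, prefix=PREFIX):
    """Build the download URL for one of the homework files."""
    return f"{prefix}/{name}"


def download(names=("model1.bin", "dv.bin")):
    """Save each file into the current directory under its own name."""
    for name in names:
        urlretrieve(file_url(name), name)


# Call download() once to fetch both files (requires network access).
```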

**ANSWER** (Question 2): ```sha256:121f78d6564000dc5e968394f45aac87981fcaaf2be40cfcd8f07b2baa1e1829```

In [None]:
!pipenv install scikit-learn==1.0

## Question 3

Let's use these models!

* Write a script for loading these models with pickle
* Score this customer:

```json
{"contract": "two_year", "tenure": 12, "monthlycharges": 19.7}
```

What's the probability that this customer is churning? 

If you're getting errors when unpickling the files, check their checksum:

```bash
$ md5sum model1.bin dv.bin
5868e129bfbb309ba60bf750263afab1  model1.bin
c49b69f8a5a3c560882ff5daa3c0ff4d  dv.bin
```
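
Where `md5sum` is not installed, the same check can be done with `hashlib`. The sketch below is one possible way to do it: the expected checksums are copied from the output above, and the helper names `md5_of_file` and `verify` are made up for illustration.

```python
import hashlib

# Checksums copied from the md5sum output above
EXPECTED = {
    "model1.bin": "5868e129bfbb309ba60bf750263afab1",
    "dv.bin": "c49b69f8a5a3c560882ff5daa3c0ff4d",
}


def md5_of_file(path, chunk_size=8192):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f_in:
        for chunk in iter(lambda: f_in.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected=EXPECTED):
    """Map each file name to True if its checksum matches the expected one."""
    return {name: md5_of_file(name) == md5 for name, md5 in expected.items()}


# Call verify() from the folder that contains both downloaded files.
```

A `False` for either file suggests the download is corrupted (for example, an HTML page saved instead of the binary) and should be repeated.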

**ANSWER**: ```0.11549580587832914```

In [1]:
%%writefile predict.py
import pickle

model_name = "model1.bin"
dv_name = "dv.bin"

with open(model_name, "rb") as f_in:
    model = pickle.load(f_in)
with open(dv_name, "rb") as f_in:
    dv = pickle.load(f_in)
    
def predict(customer):
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)
    churn_proba = y_pred[0][1]
    
    return churn_proba

customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 19.7}

print(predict(customer))

Overwriting predict.py


## Question 4

Now let's serve this model as a web service

* Install Flask and Gunicorn (or waitress, if you're on Windows)
* Write Flask code for serving the model
* Now score this customer using `requests`:

```python
url = "YOUR_URL"
customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
requests.post(url, json=customer).json()
```

What's the probability that this customer is churning?


## Docker

Install [Docker](06-docker.md). We will use it for the next two questions.

For these questions, I prepared a base image: `agrigorev/zoomcamp-model:3.8.12-slim`. 
You'll need to use it (see Question 5 for an example).

This image is based on `python:3.8.12-slim` and has a logistic regression model 
(a different one) as well as a dictionary vectorizer inside. 

This is what the Dockerfile for this image looks like:

```docker 
FROM python:3.8.12-slim
WORKDIR /app
COPY ["model2.bin", "dv.bin", "./"]
```

I already built it and then pushed it to [`agrigorev/zoomcamp-model:3.8.12-slim`](https://hub.docker.com/r/agrigorev/zoomcamp-model).

> **Note**: You don't need to build this docker image, it's just for your reference.

**ANSWER** (Question 4): ```0.9988892771007961```

In [2]:
%%writefile predict.py
import pickle

from flask import Flask, request, jsonify

# Load Model
model_name = "model1.bin"
dv_name = "dv.bin"

with open(model_name, "rb") as f_in:
    model = pickle.load(f_in)
with open(dv_name, "rb") as f_in:
    dv = pickle.load(f_in)

# Flask App
app = Flask("churn-app-h5")

def predict_proba(customer):
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)
    churn_proba = y_pred[0][1]
    
    return churn_proba
    
@app.route("/predict_churn", methods = ["POST"])
def predict_post():
    customer = request.get_json()
    
    churn_proba = predict_proba(customer)
    churn_bool = churn_proba >= 0.5
    
    result = {
        "churn_proba": float(churn_proba),
        "churn_bool": bool(churn_bool)
    }
    
    return jsonify(result)


if __name__ == "__main__":
    app.run(debug = True, host = '0.0.0.0', port = 7878)

Overwriting predict.py


In [3]:
import requests

url = "http://127.0.0.1:7878/predict_churn"
customer = {"contract": "two_year", "tenure": 1, "monthlycharges": 10}
requests.post(url, json = customer).json()

{'churn_bool': True, 'churn_proba': 0.9988892771007961}

## Question 5

Now create your own Dockerfile based on the image I prepared.

It should start like this:

```docker
FROM agrigorev/zoomcamp-model:3.8.12-slim
# add your stuff here
```

Now complete it:

* Install all the dependencies from the Pipenv file
* Copy your Flask script
* Run it with gunicorn 


When you build your image, what's the image id for `agrigorev/zoomcamp-model:3.8.12-slim`?

Look at the first step of your build log. It should look something like this:

```
$ docker some-command-for-building
Sending build context to Docker daemon  2.048kB
Step 1/N : FROM agrigorev/zoomcamp-model:3.8.12-slim
 ---> XXXXXXXXXXXX
Step 2/N : ....
```

You need this `XXXXXXXXXXXX`.

Alternatively, you can get this information when running `docker images` - it'll be in the "IMAGE ID" column.
Submitting DIGEST (long string starting with "sha256") is also fine.
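
If the build log was saved to a file or captured as text, the ID can be pulled out with a small parser. This is only a sketch: it assumes the log format shown above (a `Step 1/N : FROM ...` line followed by a ` ---> <id>` line), and other Docker builders may print a different log, in which case `docker images` is the simpler route. The function name `base_image_id` is hypothetical.

```python
import re


def base_image_id(build_log):
    """Return the image ID printed right after the first FROM step, or None."""
    lines = build_log.splitlines()
    for i, line in enumerate(lines):
        if re.match(r"Step 1/\d+ : FROM ", line):
            for follow in lines[i + 1:]:
                if follow.startswith("Step "):
                    break  # reached the next step without finding an ID
                match = re.match(r"\s*--->\s+([0-9a-f]{12,})\s*$", follow)
                if match:
                    return match.group(1)
    return None
```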

**ANSWER**: ```f0f43f7bc6e0```

In [4]:
%%writefile predict_to_docker.py
import pickle

from flask import Flask, request, jsonify

# Load Model
model_name = "model2.bin"
dv_name = "dv.bin"

with open(model_name, "rb") as f_in:
    model = pickle.load(f_in)
with open(dv_name, "rb") as f_in:
    dv = pickle.load(f_in)

# Flask App
app = Flask("churn-app-h5")

def predict_proba(customer):
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)
    churn_proba = y_pred[0][1]
    
    return churn_proba
    
@app.route("/predict_churn", methods = ["POST"])
def predict_post():
    customer = request.get_json()
    
    churn_proba = predict_proba(customer)
    churn_bool = churn_proba >= 0.5
    
    result = {
        "churn_proba": float(churn_proba),
        "churn_bool": bool(churn_bool)
    }
    
    return jsonify(result)


if __name__ == "__main__":
    app.run(debug = True, host = '0.0.0.0', port = 7878)

Overwriting predict_to_docker.py


In [5]:
%%writefile Dockerfile
FROM agrigorev/zoomcamp-model:3.8.12-slim

RUN pip install pipenv

COPY ["Pipfile", "Pipfile.lock", "./"]

RUN pipenv install --system --deploy

COPY ["predict_to_docker.py", "./"]

# Port where the App will be Exposed
EXPOSE 7878

ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:7878", "predict_to_docker:app"]

Overwriting Dockerfile


## Question 6

Let's run your docker container!

After running it, score this customer:

```python
url = "YOUR_URL"
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 10}
requests.post(url, json=customer).json()
```

What's the probability that this customer is churning?
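
For the request to reach the service, the container's port has to be published on the host. The sketch below assembles a `docker run` command for that. It assumes the service listens on port 7878 inside the container, and the image tag `churn-app:latest` is a hypothetical name; substitute whatever tag you used when building.

```python
import shlex


def docker_run_command(image, port=7878, debug_shell=False):
    """Build the argument list for running the service container."""
    cmd = [
        "docker", "run",
        "-it",   # -i keeps STDIN open, -t allocates a terminal
        "--rm",  # remove the container (not the image) when it exits
    ]
    if debug_shell:
        # Start a bash shell instead of the image's default entrypoint
        cmd.append("--entrypoint=bash")
    else:
        # Publish the container port on the same host port
        cmd += ["-p", f"{port}:{port}"]
    cmd.append(image)
    return cmd


# "churn-app:latest" is a placeholder tag, not a name from the homework
print(shlex.join(docker_run_command("churn-app:latest")))
```

Copy the printed command into a terminal, or pass the list to `subprocess.run`. With `debug_shell=True` the container opens a shell instead of starting the web server, which is handy for checking that the files were copied into the image.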

In [6]:
import requests

url = "http://127.0.0.1:7878/predict_churn"
customer = {"contract": "two_year", "tenure": 12, "monthlycharges": 10}
requests.post(url, json=customer).json()

{'churn_bool': True, 'churn_proba': 0.7284944888182928}

In [54]:
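# Finally, try it on my laptop

In [None]:
# Explain this in my notes and add it to the zoomcamp notes: docker run -it --rm --entrypoint=bash
# -it: interactive mode (-i keeps STDIN open, -t allocates a terminal)
# --rm: remove the container (not the image) when it exits
# --entrypoint=bash: start in a bash shell instead of the image's default entrypoint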