# Evaluation Metrics

## 4.1 Overview

The fourth week of Machine Learning Zoomcamp is about different metrics to evaluate a binary classifier. These measures include accuracy, confusion table, precision, recall, ROC curves(TPR, FRP, random model, and ideal model), AUROC, and cross-validation.

In [30]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [31]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

In [32]:
df = pd.read_csv('customer_data.csv')
df.columns = df.columns.str.lower().str.replace(' ', '_')
categorical_columns = list(df.dtypes[df.dtypes == 'object'].index)

for cell in categorical_columns:
    df[cell] = df[cell].str.lower().str.replace(' ', '_')
    
df.totalcharges = pd.to_numeric(df.totalcharges, errors='coerce')
df.totalcharges = df.totalcharges.fillna(0)

df.churn = (df.churn == 'yes').astype(int)

In [33]:
df_full_train, df_test = train_test_split(df, test_size=0.2, random_state=1)
df_train, df_val = train_test_split(df_full_train, test_size=0.25, random_state=1)

df_train = df_train.reset_index(drop=True)
df_val = df_val.reset_index(drop=True)
df_test = df_test.reset_index(drop=True)

y_train = df_train.churn.values
y_val = df_val.churn.values
y_test = df_val.churn.values

del df_train['churn']
del df_val['churn']
del df_test['churn']

In [34]:
numerical = ['tenure', 'monthlycharges', 'totalcharges']

categorical = ['gender', 'seniorcitizen', 'partner', 'dependents', 
        'phoneservice', 'multiplelines', 'internetservice',
       'onlinesecurity', 'onlinebackup', 'deviceprotection', 
        'techsupport','streamingtv', 'streamingmovies', 
        'contract', 'paperlessbilling',
       'paymentmethod']

In [35]:
from sklearn.metrics import roc_auc_score

In [36]:
# to implement, we have to create a train function

def train(df_train, y_train, C=1.0):
    dicts = df_train[categorical + numerical].to_dict(orient='records')
    
    dv = DictVectorizer(sparse=False)
    X_train = dv.fit_transform(dicts)
    
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)
    
    return dv, model

In [37]:
def predict(df, dv, model):
    dicts = df[categorical + numerical].to_dict(orient='records')
    
    X = dv.transform(dicts)
    y_pred = model.predict_proba(X)[:, 1]
    
    return y_pred

In [38]:
# k-fold cross-validation

from sklearn.model_selection import KFold

In [39]:
n_splits = 5
C = 1.0
kfold = KFold(n_splits=n_splits, shuffle=True, random_state=1)

scores = []

for train_idx, val_idx in kfold.split(df_full_train):
    df_train = df_full_train.iloc[train_idx]
    df_val = df_full_train.iloc[val_idx]

    y_train = df_train.churn.values
    y_val = df_val.churn.values

    dv, model = train(df_train, y_train, C=C)
    y_pred = predict(df_val, dv, model)

    auc = roc_auc_score(y_val, y_pred)
    scores.append(auc)

print('C=%s %.3f +- %.3f' % (C, np.mean(scores), np.std(scores)))

C=1.0 0.841 +- 0.008


In [40]:
scores

[0.8435682269548084,
 0.8458337471858032,
 0.8311780052177403,
 0.8301724275756219,
 0.8517774779580183]

In [41]:
# we train our model once more on C=1.0


dv, model = train(df_full_train, df_full_train.churn.values, C=1.0)
y_pred = predict(df_test, dv, model)

auc = roc_auc_score(y_test, y_pred)
auc

# the result ought to be 0.8572386167896259

0.5035238731962783

# Model Deployment

## 5.1 Intro/Session Overview

In This session we talked about the earlier model we made in chapter 3 for churn prediction. This chapter contains the deployment of the model. If we want to use the model to predict new values without running the code, There's a way to do this. The way to use the model in different machines without running the code, is to deploy the model in a server (run the code and make the model). After deploying the code in a machine used as server we can make some endpoints (using api's) to connect from another machine to the server and predict values.

To deploy the model in a server there are some steps:

- After training the model save it, to use it for making predictions in future (session 02-pickle).
- Make the API endpoints in order to request predictions. (session 03-flask-intro and 04-flask-deployment)
- Some other server deployment options (sessions 5 to 9)

We train a model in jupyter notebook, then we use the model by saving it to a file(model.bin). We want to load the file from a service e.g. churn service and the model will be inside the service. 

Say we have another service such as a marketing service that contain all the users. The marketing service can send a request to the churn service with info about the user, then the churn service sends back prediciton to the marketing service and based on the prediction received, the marketing service can decide whether they want to send promotional email with discount prices.

We put the churn prediction model into a web service using flask (a framework for creating web services in python). We want to wrap the service in such a way that it does not interfere with other services that we have in our machine. What we want to do is to create a special environment for our python dependencies - for this, we will use `pipenv`. We have another layer for system dependencies, then deploy the container to the cloud

![deployment_image](deployment.png)

## 5.2 Saving and Loading the Model

- Saving the model to pickle
- Loading the model to pickle
- Turning our notebook into a python script


### Saving the Model

In [42]:
# pickle for saving python objects

import pickle

In [43]:
# output_file = 'model_C=%s.bin' % C
output_file = f'model_C={C}.bin'
output_file

'model_C=1.0.bin'

In [44]:
# # create a file and write to it
# f_out = open(output_file, 'wb')

# # save our model
# pickle.dump((dv, model), f_out)

# # close the file
# f_out.close()

In [45]:
# using a methodology that ensures that the file is always closed
# instead of the cell above, we can rewrite:

with open(output_file, 'wb') as f_out:
    pickle.dump((dv, model), f_out)
    # do stuff
# here, once outside the 'with' statement, the file closes
# and we can do other things.

### Loading the Model

We restarted the kernel so that we can mimick loading the model

In [1]:
import pickle

In [2]:
model_file = 'model_C=10.bin'

In [3]:
# we read the file here

with open(model_file, 'rb') as f_in:
    dv, model = pickle.load(f_in)

In [4]:
dv, model

(DictVectorizer(sparse=False), LogisticRegression(max_iter=1000))

In [5]:
## a new customer informations
customer = {
  'customerid': '8879-zkjof',
  'gender': 'female',
  'seniorcitizen': 0,
  'partner': 'no',
  'dependents': 'no',
  'tenure': 41,
  'phoneservice': 'yes',
  'multiplelines': 'no',
  'internetservice': 'dsl',
  'onlinesecurity': 'yes',
  'onlinebackup': 'no',
  'deviceprotection': 'yes',
  'techsupport': 'yes',
  'streamingtv': 'yes',
  'streamingmovies': 'yes',
  'contract': 'one_year',
  'paperlessbilling': 'yes',
  'paymentmethod': 'bank_transfer_(automatic)',
  'monthlycharges': 79.85,
  'totalcharges': 3320.75
}


In [6]:
# we want to turn the customer above into a feature matrix
X = dv.transform([customer])
model.predict_proba(X)[0, 1]
# the result shows that the customer is going to churn for the value of
# 0.6363584152715288 cos the result from the cell is wrong.

0.06224295541445037

It is not convenient to train a model from jupyter notebook. The best way is to create a python file/script that does the training. We do that by downloading the notebook as a python file.

## 5.3 Web Services: Introduction to Flask

Web services are services you communicate with over a network using some protocol. We can use flask for implementing it.

In this session, we create a simple service where user sends a `ping` address and it responds with a `pong`.

## 5.4 Serving the Churn Model with Flask