<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>

# Hyperparameter Tuning

## *Data Science Unit 4 Sprint 2 Assignment 4*

## Your Mission, should you choose to accept it...

To hyperparameter tune and extract every ounce of accuracy out of this telecom customer churn dataset: <https://drive.google.com/file/d/1dfbAsM9DwA7tYhInyflIpZnYs7VT-0AQ/view> 

## Requirements

- Load the data
- Clean the data if necessary (it will be)
- Create and fit a baseline Keras MLP model to the data.
- Hyperparameter tune (at least) the following parameters:
 - batch_size
 - training epochs
 - optimizer
 - learning rate (if applicable to optimizer)
 - momentum (if applicable to optimizer)
 - activation functions
 - network weight initialization
 - dropout regularization
 - number of neurons in the hidden layer
 
 You must use Grid Search and Cross Validation for your initial pass of the above hyperparameters
 
 Try and get the maximum accuracy possible out of this data! You'll save big telecoms millions! Doesn't that sound great?


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow import keras
from keras.layers import Dense
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold, cross_val_score, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OrdinalEncoder, OneHotEncoder


Using TensorFlow backend.


In [2]:
pd.set_option('display.max_columns', 100)

In [3]:
df = pd.read_csv('./WA_Fn-UseC_-Telco-Customer-Churn.csv')

In [4]:
df.head()

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,Female,0,Yes,No,1,No,No phone service,DSL,No,Yes,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,No
1,5575-GNVDE,Male,0,No,No,34,Yes,No,DSL,Yes,No,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,No
2,3668-QPYBK,Male,0,No,No,2,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,Yes
3,7795-CFOCW,Male,0,No,No,45,No,No phone service,DSL,Yes,No,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75,No
4,9237-HQITU,Female,0,No,No,2,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65,Yes


In [5]:
df.describe(include='all')

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
count,7043,7043,7043.0,7043,7043,7043.0,7043,7043,7043,7043,7043,7043,7043,7043,7043,7043,7043,7043,7043.0,7043.0,7043
unique,7043,2,,2,2,,2,3,3,3,3,3,3,3,3,3,2,4,,6531.0,2
top,3049-NDXFL,Male,,No,No,,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,,,No
freq,1,3555,,3641,4933,,6361,3390,3096,3498,3088,3095,3473,2810,2785,3875,4171,2365,,11.0,5174
mean,,,0.162147,,,32.371149,,,,,,,,,,,,,64.761692,,
std,,,0.368612,,,24.559481,,,,,,,,,,,,,30.090047,,
min,,,0.0,,,0.0,,,,,,,,,,,,,18.25,,
25%,,,0.0,,,9.0,,,,,,,,,,,,,35.5,,
50%,,,0.0,,,29.0,,,,,,,,,,,,,70.35,,
75%,,,0.0,,,55.0,,,,,,,,,,,,,89.85,,


In [6]:
yes_no = ['Partner',
          'Dependents',
          'PhoneService',
          'OnlineSecurity',
          'PaperlessBilling',
          'Churn'
         ]

In [7]:
df[yes_no] = df[yes_no].applymap(lambda x: 1 if x == 'Yes' else 0)

In [8]:
df.head()

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,Female,0,1,0,1,0,No phone service,DSL,0,Yes,No,No,No,No,Month-to-month,1,Electronic check,29.85,29.85,0
1,5575-GNVDE,Male,0,0,0,34,1,No,DSL,1,No,Yes,No,No,No,One year,0,Mailed check,56.95,1889.5,0
2,3668-QPYBK,Male,0,0,0,2,1,No,DSL,1,Yes,No,No,No,No,Month-to-month,1,Mailed check,53.85,108.15,1
3,7795-CFOCW,Male,0,0,0,45,0,No phone service,DSL,1,No,Yes,Yes,No,No,One year,0,Bank transfer (automatic),42.3,1840.75,0
4,9237-HQITU,Female,0,0,0,2,1,No,Fiber optic,0,No,No,No,No,No,Month-to-month,1,Electronic check,70.7,151.65,1


In [9]:
df['gender'] = df['gender'].map(lambda x: 1 if x == 'Female' else 0)

In [10]:
df['InternetService'].unique()

array(['DSL', 'Fiber optic', 'No'], dtype=object)

In [11]:
df['InternetService'] = df['InternetService'].map({'No': 0, 'DSL': 1, 'Fiber optic': 2})

In [12]:
df['OnlineBackup'].unique()

array(['Yes', 'No', 'No internet service'], dtype=object)

In [13]:
df['OnlineBackup'] = df['OnlineBackup'].map({'No internet service': 0, 'No': 1, 'Yes': 2})

In [14]:
for column in ['DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies']:
    df[column] = df[column].map({'No internet service': 0, 'No': 1, 'Yes': 2})

In [15]:
df['MultipleLines'].unique()

array(['No phone service', 'No', 'Yes'], dtype=object)

In [16]:
df['MultipleLines'] = df['MultipleLines'].map({'No phone service': 0, 'No': 1, 'Yes': 2})

In [17]:
df['Contract'].unique()

array(['Month-to-month', 'One year', 'Two year'], dtype=object)

In [18]:
df['Contract'] = df['Contract'].map({'Month-to-month': 0, 'One year': 1, 'Two year': 2})

In [19]:
df['PaymentMethod'].unique()

array(['Electronic check', 'Mailed check', 'Bank transfer (automatic)',
       'Credit card (automatic)'], dtype=object)

In [20]:
df['PaymentMethod'] = df['PaymentMethod'].map({'Mailed check': 0, 
                                               'Electronic check': 1, 
                                               'Bank transfer (automatic)': 2,
                                               'Credit card (automatic)': 3})

In [21]:
df.head()

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,1,0,1,0,1,0,0,1,0,2,1,1,1,1,0,1,1,29.85,29.85,0
1,5575-GNVDE,0,0,0,0,34,1,1,1,1,1,2,1,1,1,1,0,0,56.95,1889.5,0
2,3668-QPYBK,0,0,0,0,2,1,1,1,1,2,1,1,1,1,0,1,0,53.85,108.15,1
3,7795-CFOCW,0,0,0,0,45,0,0,1,1,1,2,2,1,1,1,0,2,42.3,1840.75,0
4,9237-HQITU,1,0,0,0,2,1,1,2,0,1,1,1,1,1,0,1,1,70.7,151.65,1


### All cleaned up.

In [22]:
X = df.drop(columns=['Churn', 'customerID', 'TotalCharges'])
y = df['Churn']

In [23]:
scaler = StandardScaler()
X = scaler.fit_transform(X)

In [24]:
X.shape, y.shape

((7043, 18), (7043,))

In [25]:
inputs = X.shape[1]
epochs = 20
batch_size = 42

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=inputs))
model.add(Dense(64, activation='relu'))
model.add(Dense(1))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

model.fit(X, y, validation_split=0.33, epochs=epochs, batch_size=batch_size)

W0816 00:56:59.369577 4643050944 deprecation_wrapper.py:119] From /Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

W0816 00:56:59.432763 4643050944 deprecation_wrapper.py:119] From /Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0816 00:56:59.439653 4643050944 deprecation_wrapper.py:119] From /Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W0816 00:56:59.496935 4643050944 deprecation_wrapper.py:119] From /Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/keras/optimize

Train on 4718 samples, validate on 2325 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1093d1438>

In [26]:
model.evaluate(X, y)



[0.4486779142120896, 0.8090302427942638]

In [27]:
model.metrics_names

['loss', 'acc']

In [28]:
def model_creator(optimizer='adam', learning_rate=.01):
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=inputs))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(learning_rate=learning_rate, loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

In [36]:
model_2 = KerasClassifier(build_fn=model_creator, verbose=1)


In [37]:
params = {'batch_size': [10, 50, 100, 250, 500, 1000, 2500],
          'epochs': [20]}

grid = GridSearchCV(estimator=model_2, param_grid=params, n_jobs=2)
grid_result = grid.fit(X, y)
print(f'Best: {grid_result.best_score_} using {grid_result.best_params_}')
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f'Mean: {mean}, Stdev: {stdev} with : {param}')

E0816 01:16:04.265345 123145444253696 _base.py:627] exception calling callback for <Future at 0x1a40871128 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/jobli

TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {EXIT(1)}

In [35]:
params = {'optimizer': ['adam', 'adagrad', 'sgd'],
          'epochs': [20]}
grid1 = GridSearchCV(estimator=model_2, param_grid= params,
                   n_jobs=-1)
grid_result = grid1.fit(X, y, verbose=1)

print(f'Best: {grid_result.best_score_} using {grid_result.best_params_}')
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f'Mean: {mean}, Stdev: {stdev} with : {param}')

E0816 00:59:33.754279 123145444253696 _base.py:627] exception calling callback for <Future at 0x1a40aae0b8 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/lambda_school_loaner_153/anaconda3/envs/unit4wk2/lib/python3.7/site-packages/jobli

TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {EXIT(1)}

In [None]:
params = {'learning_rate': [.1, .01, .001, .0001],
          'epochs': [20]}
grid1 = GridSearchCV(estimator=grid_result.best_estimator_, param_grid= params,
                   n_jobs=-1)
grid_result = grid1.fit(X, y, verbose=1)

print(f'Best: {grid_result.best_score_} using {grid_result.best_params_}')
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f'Mean: {mean}, Stdev: {stdev} with : {param}')

In [None]:
training epochs
optimizer
learning rate (if applicable to optim
momentum (if applicable to optimizer)
activation functions
network weight initialization
dropout regularization
number of neurons in the hidden layer

In [None]:
import numpy

def create_model(optimizer='adam'):
	# create model
	model = Sequential()
	model.add(Dense(12, input_dim=18, activation='relu'))
	model.add(Dense(1, activation='sigmoid'))
	# Compile model
	model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
	return model
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# create model
model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=10, verbose=0)
# define the grid search parameters
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']
param_grid = dict(optimizer=optimizer)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X, y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

## Stretch Goals:

- Try to implement Random Search Hyperparameter Tuning on this dataset
- Try to implement Bayesian Optimiation tuning on this dataset
- Practice hyperparameter tuning other datasets that we have looked at. How high can you get MNIST? Above 99%?
- Study for the Sprint Challenge
 - Can you implement both perceptron and MLP models from scratch with forward and backpropagation?
 - Can you implement both perceptron and MLP models in keras and tune their hyperparameters with cross validation?