<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>

# Hyperparameter Tuning

## *Data Science Unit 4 Sprint 2 Assignment 4*

## Your Mission, should you choose to accept it...

Tune hyperparameters to extract every ounce of accuracy out of this telecom customer churn dataset: <https://drive.google.com/file/d/1dfbAsM9DwA7tYhInyflIpZnYs7VT-0AQ/view>

## Requirements

- Load the data
- Clean the data if necessary (it will be)
- Create and fit a baseline Keras MLP model to the data.
- Hyperparameter tune (at least) the following parameters:
 - batch_size
 - training epochs
 - optimizer
 - learning rate (if applicable to optimizer)
 - momentum (if applicable to optimizer)
 - activation functions
 - network weight initialization
 - dropout regularization
 - number of neurons in the hidden layer
 
 You must use Grid Search and Cross Validation for your initial pass of the above hyperparameters
 
 Try to get the maximum accuracy possible out of this data! You'll save big telecoms millions! Doesn't that sound great?
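
Because the initial pass must be an exhaustive grid search with cross validation, the number of model fits grows multiplicatively with every hyperparameter added. The sketch below only does that arithmetic; the candidate values and the fold count of 5 are illustrative assumptions, not values prescribed by the assignment.

```python
from itertools import product

# Hypothetical candidate values for a few of the hyperparameters listed above.
param_grid = {
    "batch_size": [10, 20, 40],
    "epochs": [20, 50],
    "optimizer": ["sgd", "adam"],
    "activation": ["relu", "tanh"],
    "dropout_rate": [0.0, 0.2],
}

def expand_grid(grid):
    """Return every combination in the grid as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]

combinations = expand_grid(param_grid)
cv_folds = 5  # assumed number of cross-validation folds
total_fits = len(combinations) * cv_folds

print(f"{len(combinations)} combinations x {cv_folds} folds = {total_fits} fits")
```

Tuning a few parameters at a time and fixing the best values before moving on is one way to keep the number of fits manageable.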


In [123]:
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.optimizers import Adam

In [5]:
#!pip install wandb
# Key is read from the WANDB_API_KEY environment variable; avoid hard-coding secrets in a notebook
!wandb login $WANDB_API_KEY
import wandb
from wandb.keras import WandbCallback

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /home/ec2-user/.netrc
[32mSuccessfully logged in to Weights & Biases![0m


In [35]:
seed = 7
np.random.seed(seed)  # numpy is imported as np

path = "data/WA_Fn-UseC_-Telco-Customer-Churn+.csv"
df = pd.read_csv(path)

df.head().T

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| customerID | 7590-VHVEG | 5575-GNVDE | 3668-QPYBK | 7795-CFOCW | 9237-HQITU |
| gender | Female | Male | Male | Male | Female |
| SeniorCitizen | 0 | 0 | 0 | 0 | 0 |
| Partner | Yes | No | No | No | No |
| Dependents | No | No | No | No | No |
| tenure | 1 | 34 | 2 | 45 | 2 |
| PhoneService | No | Yes | Yes | No | Yes |
| MultipleLines | No phone service | No | No | No phone service | No |
| InternetService | DSL | DSL | DSL | DSL | Fiber optic |
| OnlineSecurity | No | Yes | Yes | Yes | No |


In [97]:
# df.customerID.unique #all values are unique for customerID
# df.tenure.value_counts()
df.shape

(7043, 21)

In [44]:
df.isna().sum()

customerID          0
gender              0
SeniorCitizen       0
Partner             0
Dependents          0
tenure              0
PhoneService        0
MultipleLines       0
InternetService     0
OnlineSecurity      0
OnlineBackup        0
DeviceProtection    0
TechSupport         0
StreamingTV         0
StreamingMovies     0
Contract            0
PaperlessBilling    0
PaymentMethod       0
MonthlyCharges      0
TotalCharges        0
Churn               0
dtype: int64

In [49]:
df.values[0]

array(['7590-VHVEG', 'Female', 0, 'Yes', 'No', 1, 'No',
       'No phone service', 'DSL', 'No', 'Yes', 'No', 'No', 'No', 'No',
       'Month-to-month', 'Yes', 'Electronic check', 29.85, '29.85', 'No'],
      dtype=object)
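
The row printed above ends with `29.85, '29.85', 'No'`: going by the column order, `MonthlyCharges` is a float while `TotalCharges` is stored as text, even though `isna()` reports no missing values. One possible cleaning step is sketched below on a tiny made-up frame (the rows, the extra `Contract` values, and the blank entry are assumptions for illustration, not data from the file). It coerces the text column to numbers, fills any unparseable entries, and one-hot encodes a categorical column.

```python
import pandas as pd

# Tiny hypothetical frame mimicking the columns involved; not rows from the real file.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "One year", "Two year"],
    "MonthlyCharges": [29.85, 56.95, 20.0],
    "TotalCharges": ["29.85", "1889.5", " "],  # blank string = unparseable entry (assumed)
    "Churn": ["No", "No", "Yes"],
})

# Coerce the text column to floats; entries that cannot be parsed become NaN.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# One simple option for the resulting gaps: fill with the column median.
toy["TotalCharges"] = toy["TotalCharges"].fillna(toy["TotalCharges"].median())

# One-hot encode the categorical feature and map the target to 0/1.
X_toy = pd.get_dummies(toy.drop(columns="Churn"), columns=["Contract"])
y_toy = (toy["Churn"] == "Yes").astype(int)

print(X_toy.dtypes)
print(y_toy.tolist())
```

Keeping numeric columns numeric preserves their magnitudes, which an ordinal encoding of every column would replace with rank-like category codes.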

In [116]:
from sklearn.preprocessing import OrdinalEncoder

X = df.drop(columns=['customerID', 'Churn'])
y = df.Churn
y = np.array(y).reshape(-1,1)

data_enc = OrdinalEncoder()
target_enc = OrdinalEncoder()

data_enc.fit(X)
X = data_enc.transform(X)

target_enc.fit(y)
y = target_enc.transform(y)

In [117]:
data_enc.inverse_transform(X)[0]

array(['Female', 0, 'Yes', 'No', 1, 'No', 'No phone service', 'DSL', 'No',
       'Yes', 'No', 'No', 'No', 'No', 'Month-to-month', 'Yes',
       'Electronic check', 29.85, '29.85'], dtype=object)

In [118]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

X = scaler.fit_transform(X)

In [119]:
data_enc.inverse_transform(scaler.inverse_transform(X))[0]

array(['Female', 0, 'Yes', 'No', 1, 'No', 'No phone service', 'DSL', 'No',
       'Yes', 'No', 'No', 'No', 'No', 'Month-to-month', 'Yes',
       'Electronic check', 29.85, '29.85'], dtype=object)

In [125]:
# Baseline model evaluated with a single-point grid search

inputs = X.shape[1]

def create_model():
    # create model
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(inputs,)))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, verbose=0)

# define the grid search parameters
batch_size = [20]
epochs = [20]
param_grid = dict(batch_size=batch_size, epochs=epochs)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 

Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable
wandb: Network error resolved after 0:00:11.442145, resuming normal operation.
Error generating diff: Command '['git', 'diff', '--submodule=diff', 'HEAD']' timed out after 5 seconds


Best: 0.7655828496906444 using {'batch_size': 20, 'epochs': 20}
Means: 0.7655828496906444, Stdev: 0.0014129647852479726 with: {'batch_size': 20, 'epochs': 20}
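
A cross-validated accuracy is easier to interpret next to the accuracy of always predicting the most common class. The sketch below computes that reference value; the toy labels are hypothetical, and on the real data `labels` would be the flattened `y` array (for example `y.ravel().tolist()`).

```python
from collections import Counter

def majority_class_accuracy(labels):
    """Accuracy achieved by always predicting the most frequent label."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)

# Hypothetical labels for illustration only (0 = stayed, 1 = churned).
toy_labels = [0, 0, 0, 1, 0, 1, 0, 0]
print(majority_class_accuracy(toy_labels))  # 6 of 8 labels are 0 -> 0.75
```

If the tuned model's score is not clearly above this reference, accuracy alone says little about how well churners are being identified.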


In [126]:
wandb.init(project="assignment", entity="ds8")  # Initializes an experiment

inputs = X.shape[1]
wandb.config.epochs = 50
wandb.config.batch_size = 10

def create_model():
    # create model
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(inputs,)))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, verbose=0)

model.fit(X, y, 
          validation_split=0.33, 
          epochs=wandb.config.epochs, 
          batch_size=wandb.config.batch_size, 
          callbacks=[WandbCallback()]
         )


Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable
Error generating diff: Command '['git', 'diff', '--submodule=diff', 'HEAD']' timed out after 5 seconds


<tensorflow.python.keras.callbacks.History at 0x7f9089c6ac50>

In [None]:
sweep_config = {
    'method': 'random',
    'parameters': {
        'learning_rate': {'distribution': 'normal',
                         'min': .05,
                         'max': .15},
        'epochs': {'distribution': 'uniform',
                    'min': 100,
                    'max': 1000},
        'batch_size': {'distribution': 'uniform',
            'min': 10,
            'max': 400}
    }
}
sweep_id = wandb.sweep(sweep_config)

from tensorflow.keras.optimizers import Adam

inputs = X.shape[1]

def train():
    
    wandb.init(project="assignment", entity="ds8") 
    
    config = wandb.config

    # Create Model
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(inputs,)))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Optimizer 
    adam = Adam(learning_rate=config.learning_rate)

    # Compile Model
    model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])

    # Fit Model
    model.fit(X, y, 
              validation_split=0.33, 
              epochs=config.epochs, 
              batch_size=config.batch_size, 
              callbacks=[WandbCallback()]
             )
    
wandb.agent(sweep_id, function=train)

Create sweep with ID: k2vkjfau
Sweep URL: https://app.wandb.ai/ds8/assignment/sweeps/k2vkjfau


Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable


wandb: Agent Starting Run: g923fyg6 with config:
	batch_size: 87.90212490010397
	epochs: 695.7622066029073
	learning_rate: 1.53821913791661


Error generating diff: Command '['git', 'diff', '--submodule=diff', 'HEAD']' timed out after 5 seconds


wandb: Agent Started Run: g923fyg6


Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable
Error generating diff: Command '['git', 'diff', '--submodule=diff', 'HEAD']' timed out after 5 seconds
Process Process-2:
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/wandb/wandb_agent.py", line 62, in _start
    function()
  File "<ipython-input-129-4cc7c45405ec>", line 35, in train
    adam = Adam(learning_rate=config.learning_rate)
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/wandb/wandb_config.py", line 212, in __getattr__
    return self.__getitem__(key)
  File "/home/ec2-user/anac
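
The traceback above is cut off, so the actual exception is not visible. Independently of it, the logged run configuration hints at two problems with the sweep definition: `batch_size` and `epochs` were sampled as floats, and `learning_rate` (about 1.54) fell far outside the intended 0.05 to 0.15 range. A possible revision is sketched below. It assumes the W&B sweep distributions `int_uniform` (integer samples) and `uniform` (bounded floats); `normal` appears to be parameterized by `mu` and `sigma` rather than `min` and `max`, which would explain the out-of-range draw. The helper `as_fit_kwargs` is a hypothetical name introduced here, not part of wandb or Keras.

```python
# Revised sweep configuration (a sketch; check the distribution names
# against the installed wandb version before relying on them).
sweep_config = {
    "method": "random",
    "parameters": {
        "learning_rate": {"distribution": "uniform", "min": 0.05, "max": 0.15},
        "epochs": {"distribution": "int_uniform", "min": 100, "max": 1000},
        "batch_size": {"distribution": "int_uniform", "min": 10, "max": 400},
    },
}

def as_fit_kwargs(config):
    """Hypothetical helper: cast sampled values to the integers fit() expects."""
    return {"epochs": int(config["epochs"]), "batch_size": int(config["batch_size"])}

# Example using the float values logged by the run above.
print(as_fit_kwargs({"epochs": 695.7622066029073, "batch_size": 87.90212490010397}))
```

Casting inside `train()` is a defensive extra: even with integer distributions in place, it keeps `model.fit` from receiving a float if the configuration changes later.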

## Stretch Goals:

- Try to implement Random Search Hyperparameter Tuning on this dataset
- Try to implement Bayesian Optimization tuning on this dataset using hyperas or hyperopt (if you're brave)
- Practice hyperparameter tuning other datasets that we have looked at. How high can you get MNIST? Above 99%?
- Study for the Sprint Challenge
 - Can you implement both perceptron and MLP models from scratch with forward and backpropagation?
 - Can you implement both perceptron and MLP models in Keras and tune their hyperparameters with cross validation?
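
For the first stretch goal, random search replaces the exhaustive grid with a fixed number of randomly sampled configurations. The library-free sketch below shows the loop. The search space, the trial count, and `placeholder_objective` are all hypothetical; in practice the objective would build and train a Keras model and return its validation accuracy.

```python
import random

# Hypothetical search space; each entry maps a name to a sampling function.
search_space = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),  # log-uniform
    "batch_size": lambda rng: rng.choice([16, 32, 64, 128]),
    "hidden_units": lambda rng: rng.randint(16, 128),
    "dropout_rate": lambda rng: rng.uniform(0.0, 0.5),
}

def random_search(objective, space, n_trials, seed=7):
    """Sample n_trials configurations and return the best (score, params)."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        score = objective(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

def placeholder_objective(params):
    """Stand-in for 'build, train, and return validation accuracy'."""
    # Peaks at dropout_rate = 0.2; purely illustrative, not a real model score.
    return -abs(params["dropout_rate"] - 0.2)

best_score, best_params = random_search(placeholder_objective, search_space, n_trials=25)
print(best_score, best_params)
```

Sampling the learning rate on a log scale is a common choice because useful values often span several orders of magnitude.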