<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Train Practice

## *Data Science Unit 4 Sprint 2 Assignment 3*

Continue to use TensorFlow Keras & a sample of the [Quickdraw dataset](https://github.com/googlecreativelab/quickdraw-dataset) to build a sketch classification model. The dataset has been sampled to only 10 classes and 10000 observations per class. Using your baseline model from yesterday, hyperparameter tune it and report on your highest validation accuracy. Your singular goal today is to achieve the highest accuracy possible.

*Don't forgot to switch to GPU on Colab!*

### Hyperparameters to Tune

At a minimum, tune each of these hyperparameters using any strategy we discussed during lecture today: 
- Optimizer
- Learning Rate
- Activiation Function
  - At least 1 subparameter within the Relu activation function
- Number of Neurons in Hidden Layers
- Number of Hidden Layers
- Weight Initialization

In [4]:
# Your Code Starts Here


# Imports
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

def load_quickdraw10(path):
    # Load the data
    df = np.load(path)
    X = df['arr_0']
    y = df['arr_1']

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.10, random_state=79, stratify=y
    )

    # Normalize the data
    X_train = X_train.astype('float32') / 255
    X_test = X_test.astype('float32') / 255

    return X_train, y_train, X_test, y_test

# Load my data
X_train, y_train, X_test, y_test = load_quickdraw10('/content/quickdraw10.npz')

# Check my work
print(f'X Train: {X_train.shape},', f'y Train: {y_train.shape},', 
      f'X Test: {X_test.shape},', f'y Test: {y_test.shape}')




X Train: (90000, 784), y Train: (90000,), X Test: (10000, 784), y Test: (10000,)


In [5]:

# Imports
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam, SGD

# Set up Baseline model from last assignment
# Function to create a model
def create_model(learn_rate=0.01, optimizer=SGD):
    opt = optimizer(lr=learn_rate)
    
    # Instantiate the model
    model = Sequential([
                        Dense(128, input_dim=784, activation='sigmoid', 
                              name='Input_Layer'),
                        Dense(64, activation='relu', name='Hidden_64'),
                        Dense(32, activation='relu', name='Hidden_32'),
                        Dense(10, activation='softmax', name='Output_Layer')
    ])

    # Compile the model
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])
    
    return model

# Create my default model and look a the summary
base_model = create_model()
base_model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
Input_Layer (Dense)          (None, 128)               100480    
_________________________________________________________________
Hidden_64 (Dense)            (None, 64)                8256      
_________________________________________________________________
Hidden_32 (Dense)            (None, 32)                2080      
_________________________________________________________________
Output_Layer (Dense)         (None, 10)                330       
Total params: 111,146
Trainable params: 111,146
Non-trainable params: 0
_________________________________________________________________


In [6]:
# Fit my model
baseline = base_model.fit(X_train, y_train,
                          epochs=15,
                          validation_split=0.1)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


In [7]:
base_model.evaluate(X_test, y_test)



[0.5954679846763611, 0.8208000063896179]

In [9]:
import numpy
import pandas as pd
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Function to create model, required for KerasClassifier
def create_model2(units=128, layers=2):
    # Create model
    model = Sequential()
    model.add(Dense(units, input_dim=784, activation='relu', name='Input_layer'))
    
    units = units/2

    for num in range(layers):
        model.add(Dense(units=units/2, activation='relu', 
                        name=f'Hidden_layer{(units/2)}'))
        units = units/2

    model.add(Dense(10, activation='softmax', name='Output_layer'))

    opt = SGD(learning_rate=0.01)

    # Compile model
    model.compile(loss='sparse_categorical_crossentropy', 
                  optimizer=opt, metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model2, verbose=1)

# define the grid search parameters
param_grid = {'batch_size': [32,64,128],
              'epochs': [5],
              'units':[128]}
# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=2)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Best: 0.8306111097335815 using {'batch_size': 32, 'epochs': 5, 'units': 128}
Means: 0.8306111097335815, Stdev: 0.0041073570459400945 with: {'batch_size': 32, 'epochs': 5, 'units': 128}
Means: 0.804022216796875, Stdev: 0.003870515641919862 with: {'batch_size': 64, 'epochs': 5, 'units': 128}
Means: 0.7677666664123535, Stdev: 0.009403939213673062 with: {'batch_size': 128, 'epochs': 5, 'units': 128}


In [10]:
# Load the tensorboard extension
%load_ext tensorboard

In [12]:

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

import os
import datetime

In [13]:

HP_UNITS = hp.HParam('num_units', hp.Discrete([256, 64, 128]))
HP_LR = hp.HParam('learning_rate', hp.RealInterval(0.01, 0.3))
HP_OPT = hp.HParam('optimizer', hp.Discrete(['adam', 'adamax', 'sgd']))
HP_LAYERS = hp.HParam('layers', hp.Discrete([1, 2, 3]))

METRIC_ACCURACY = 'accuracy'

with tf.summary.create_file_writer('logs/hp_tuning').as_default():
    hp.hparams_config(
        hparams=[HP_UNITS, HP_LR, HP_OPT, HP_LAYERS],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')])

In [14]:

from tensorflow.keras.optimizers import Adamax

def train_test_model(hparams):
    model = Sequential([
        Dense(hparams[HP_UNITS], activation='relu', input_dim=784),
        Dense(hparams[HP_UNITS]/2, activation='relu'),
        Dense(hparams[HP_UNITS]/4, activation='relu'),
        Dense(10, activation='softmax')])

    opt_name = hparams[HP_OPT]
    lr = hparams[HP_LR]

    if opt_name == 'sgd':
        opt = tf.keras.optimizers.SGD(learning_rate=lr)
    
    elif opt_name == 'adam':
        opt = tf.keras.optimizers.Adam(learning_rate=lr)
    
    elif opt_name == 'adamax':
        opt = tf.keras.optimizers.Adamax(learning_rate=lr)
    
    else:
        raise Exception("Unrecognized optimizer. Must be either 'sgd', 'adam' or 'adamax")

    model.compile(optimizer=opt,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    model.fit(X_train, y_train, 
              batch_size=32,
              epochs=10)

    _, val_acc = model.evaluate(X_test, y_test)

    return val_acc

In [15]:

def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)
        accuracy = train_test_model(hparams)
        tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)

In [16]:
%%time
session_num = 0 
for num_units in HP_UNITS.domain.values:
    for layer in HP_LAYERS.domain.values:
        for learning_rate in tf.linspace(HP_LR.domain.min_value, 
                                        HP_LR.domain.max_value, num=3):
            for optimizer in HP_OPT.domain.values:
                hparams = {
                    HP_UNITS: num_units,
                    HP_LAYERS: layer,
                    HP_LR: float(learning_rate),
                    HP_OPT: optimizer
                }
        run_name = f"run-{session_num}"
        print(f"---> Starting trial: {run_name}")
        print({h.name: hparams[h] for h in hparams})
        run('logs/hp_tuning/' + run_name, hparams)
        session_num += 1

---> Starting trial: run-0
{'num_units': 64, 'layers': 1, 'learning_rate': 0.30000001192092896, 'optimizer': 'sgd'}
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
---> Starting trial: run-1
{'num_units': 64, 'layers': 2, 'learning_rate': 0.30000001192092896, 'optimizer': 'sgd'}
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
---> Starting trial: run-2
{'num_units': 64, 'layers': 3, 'learning_rate': 0.30000001192092896, 'optimizer': 'sgd'}
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
---> Starting trial: run-3
{'num_units': 128, 'layers': 1, 'learning_rate': 0.30000001192092896, 'optimizer': 'sgd'}
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
---> Starting trial: run-4
{'num_units': 128, 'layers': 2, 'learning_rate': 0.3000000119209

In [None]:
%tensorboard --logdir logs/hp_tuning

### Stretch Goals
- Implement Bayesian Hyper-parameter Optimization
- Select a new dataset and apply a neural network to it.
- Use a cloud base experiment tracking framework such as weights and biases
- Research potential architecture ideas for this problem. Try Lenet-10 for example. 