<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Train Practice

## *Data Science Unit 4 Sprint 2 Assignment 4*

Continue to use TensorFlow Keras & a sample of the [Quickdraw dataset](https://github.com/googlecreativelab/quickdraw-dataset) to build a sketch classification model. The dataset has been sampled to only 10 classes and 10000 observations per class. Apply regularization techniques to your model. 

*Don't forgot to switch to GPU on Colab!*

## Regularization

Using your best performing model from the previous module, apply each of the following regularization strategies: 
* Early Stopping
* Dropout
* Weight Decay
* Weight Constraint


In [5]:
from tensorflow.keras import regularizers
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard
from tensorflow.keras.layers import Dense, ReLU, Dropout
import tensorflow as tf
import os
import numpy as np
from sklearn.model_selection import train_test_split

In [6]:
def split_zip_df(path):
  data = np.load(path)
  features = 'arr_0'
  target = 'arr_1'
  X = data[features]
  y = data[target]
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.20,
      stratify= y,
      random_state=17)
  X_train, X_val, y_train, y_val = train_test_split(
      X_train, y_train, test_size=0.20,
      stratify= y_train,
      random_state=17)

  return X_train, y_train, X_val, y_val, X_test, y_test

  
X_train, y_train, X_val, y_val, X_test, y_test = split_zip_df('quickdraw10.npz')

X_train.shape, y_train.shape, X_val.shape, y_val.shape, X_test.shape, y_test.shape

((64000, 784), (64000,), (16000, 784), (16000,), (20000, 784), (20000,))

In [10]:
tf.random.set_seed(17)
logdir = os.path.join("logs", "EarlyStopping-Loss")

tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
stop = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=5)

# Normal values tend to be 0 to 0.01 on log scale
wd = 0.001
model = tf.keras.Sequential([
    Dense(128, activation='relu', input_dim= 784, kernel_regularizer=regularizers.L2(wd)),
    Dense(128, activation='relu', kernel_regularizer=regularizers.L2(wd)),
    Dense(128, activation='relu', kernel_regularizer=regularizers.L2(wd)),
    Dense(10, activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, epochs=72, 
          validation_data=(X_test,y_test),
          callbacks=[tensorboard_callback, stop])

Epoch 1/72
Epoch 2/72
Epoch 3/72
Epoch 4/72
Epoch 5/72
Epoch 6/72
Epoch 7/72
Epoch 8/72
Epoch 9/72
Epoch 10/72
Epoch 11/72
Epoch 12/72
Epoch 13/72
Epoch 14/72
Epoch 15/72
Epoch 16/72
Epoch 17/72
Epoch 18/72


<tensorflow.python.keras.callbacks.History at 0x7fb19ce93ef0>

## Deploy

Save your model's weights using the Checkpoint function. Try reloading the model and making inference on your validation dataset.

In [14]:
stop = EarlyStopping(monitor='val_accuracy', min_delta=0.005, patience=5)
mcp = ModelCheckpoint('best_weights.h5', 
                      monitor='val_accuracy', 
                      verbose=1, 
                      save_best_only=True,
                      save_weights_only=True)

def get_model(dropout_rate):
  model = tf.keras.Sequential([
      Dense(128, activation='relu', input_dim= 784, kernel_constraint=MaxNorm(3)),
      Dropout(dropout_rate),
      Dense(128, activation='relu', kernel_constraint=MaxNorm(3)),
      Dropout(dropout_rate),
      Dense(128, activation='relu', kernel_constraint=MaxNorm(3)),
      Dropout(dropout_rate),
      Dense(10, activation='softmax')
  ])

  model.compile(loss='sparse_categorical_crossentropy', 
                optimizer='adam', 
                metrics=['accuracy'])
  return model

model2 = get_model(0.2)
model.fit(X_train, y_train, 
          epochs=100, 
          validation_data=(X_test,y_test),
          callbacks=[stop, mcp])

Epoch 1/100
Epoch 00001: val_accuracy improved from -inf to 0.84850, saving model to best_weights.h5
Epoch 2/100
Epoch 00002: val_accuracy did not improve from 0.84850
Epoch 3/100
Epoch 00003: val_accuracy improved from 0.84850 to 0.85335, saving model to best_weights.h5
Epoch 4/100
Epoch 00004: val_accuracy did not improve from 0.85335
Epoch 5/100
Epoch 00005: val_accuracy did not improve from 0.85335
Epoch 6/100
Epoch 00006: val_accuracy did not improve from 0.85335


<tensorflow.python.keras.callbacks.History at 0x7fb2005526a0>

In [15]:
model2.load_weights('best_weights.h5')
model2.evaluate(X_test, y_test)



[0.490261971950531, 0.8533499836921692]

### Stretch Goals
- Mount your Google Drive to Colab to persist your model checkpoint files. 
- Research L2 normalization (weight decay)
- Write a custom callback function to stop training after you reach .88 validation accuracy. 
- Select a new dataset and apply a neural network to it.
- Research TensorFlow Serving
- Play [QuickDraw](https://quickdraw.withgoogle.com/data)
- Create a static webpage using TensorFlow.js to serve a model. Check out [Teachable Machine Learning](https://teachablemachine.withgoogle.com/) for ideas. 

In [17]:
from google.colab import drive

drive.mount('/content/gdrive')

Mounted at /content/gdrive
