# Notebook on Keras/TensorFlow "Deep" NN Modeling for Images

This notebook is again from Dane Morgan's class, with almost all materials taken from others (see below). We will build a standard deep NN for the American Sign Language dataset (images of hands making letters in [American Sign Language](http://www.asl.gs/)). Note that this NN is not convolutional and not very "deep", but it has multiple layers and many parameters.
* Most taken from [Nvidia Deep Learning course](https://www.nvidia.com/en-us/training/).

Learning goal(s):
* Be able to build a simple deep learning neural network, then train and assess it using Keras/TensorFlow.
* Prepare image data for training.
* Create and compile a simple model for image classification.
* Train an image classification model and observe the results.




# Colab setup

## GPUs
This lab runs much faster on GPUs/TPUs. To use a GPU or TPU, do the following (they might not always be available):
* Enabling GPU. To enable GPU in your notebook, (i) Edit/Notebook Settings, or (ii) select the following menu options − Runtime / Change runtime type. See which works.
* Testing for GPU. You can easily check if the GPU is enabled by executing the following code:

In [None]:
# Test if running GPU
import tensorflow as tf
tf.test.gpu_device_name()

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Here, modify the path where you saved this notebook and the associated lab files on your Google drive, if different from below.

In [None]:
path = '/content/drive/MyDrive/Path to Lab3/Lab3'
#path = '/content/drive/MyDrive/Online Filed/Dane Morgan Mixed Sharing/Projects/Courses/MSE803 F22/Dane student view/MSE803_Lab1'

Check the path name is assigned correctly.

In [None]:
!echo $path
import os
os.path.isdir(path)


In [None]:
# This is needed so we can import packages from a path other than the standard library locations
import sys
sys.path.append(path)

# Some useful functions

In [None]:
# This plots scores for training and validation data vs. epoch.
def err_plot(acc, val_acc, loss, val_loss):
  import matplotlib.pyplot as plt

  epochs = range(1, len(acc) + 1)

  plt.plot(epochs, acc, 'bo', label='Training acc')
  plt.plot(epochs, val_acc, 'b', label='Validation acc')
  plt.title('Training and validation accuracy')
  plt.legend()

  plt.figure()

  plt.plot(epochs, loss, 'bo', label='Training loss')
  plt.plot(epochs, val_loss, 'b', label='Validation loss')
  plt.title('Training and validation loss')
  plt.legend()

  plt.show()



# Setting up the Data
This dataset is not available via Keras in the same way that some data sets are (e.g., MNIST). This dataset is available from the website [Kaggle](http://www.kaggle.com), which is a fantastic place to find datasets and other deep learning resources. In addition to providing resources like datasets and "kernels" that are like these notebooks, Kaggle hosts competitions that you can take part in, competing with others in training highly accurate models.

If you're looking to practice or see examples of many deep learning projects, Kaggle is a great site to visit.

We will be learning to classify images into letters in the American Sign Language Alphabet (ASL). The [ASL alphabet](http://www.asl.gs/) contains 26 letters. Two of those letters (j and z) require movement, so they are not included in the training dataset. We display the images below.

Note: You need to download figures (see Hw3 description) and put them into the "images" directory before you can run these "display" cells.

In [None]:
from IPython.display import Image, display
filename = os.path.join(path, 'images/asl.png')
print('Loading image from: ',filename)
display(Image(filename, width=600))

## Reading in the Data
We read the CSV files into Pandas DataFrames.

In [None]:
import pandas as pd
train_df = pd.read_csv(os.path.join(path, 'sign_mnist_train.csv'))
valid_df = pd.read_csv(os.path.join(path, 'sign_mnist_valid.csv'))

## Exploring the Data

Let's take a look at our data. We can use the [head](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html) method to print the first few rows of the DataFrame. Each row is one image: it has a `label` column followed by 784 columns holding the pixel values of the image, just like the MNIST dataset. Note that the labels are currently numerical values, not letters of the alphabet:

In [None]:
train_df.head()
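
To read the numerical labels as letters, a small lookup helps. The sketch below assumes that labels 0 to 23 index the alphabet in order with `j` and `z` removed; that mapping is an assumption, not something stated in this notebook, so check it against your copy of the data. The names `asl_letters` and `label_to_letter` are made up for illustration.

```python
import string

# Assumption: labels 0-23 index the alphabet with 'j' and 'z' removed, in order.
# Check this against your copy of the dataset before relying on it.
asl_letters = [c for c in string.ascii_lowercase if c not in "jz"]

def label_to_letter(label):
    """Map an integer class label to its (assumed) ASL letter."""
    return asl_letters[label]

print(len(asl_letters))
print(label_to_letter(0), label_to_letter(9), label_to_letter(23))
```

Under this assumption there are 24 letters, which matches the 24 classes used later in the notebook.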

## Extracting the Labels

We would like to store our training and validation labels in `y_train` and `y_valid` variables. Here we create those variables and then delete the labels from our original dataframes, where they are no longer needed:

In [None]:
y_train = train_df['label']
y_valid = valid_df['label']
del train_df['label']
del valid_df['label']

## Extracting the Images

We would like to store our training and validation images in `x_train` and `x_valid` variables. Here we create those variables:

In [None]:
x_train = train_df.values
x_valid = valid_df.values

## Summarizing the Training and Validation Data

We now have 27,455 images with 784 pixels each for training...

In [None]:
x_train.shape

...as well as their corresponding labels:

In [None]:
y_train.shape

For validation, we have 7,172 images...

In [None]:
x_valid.shape

...and their corresponding labels:

In [None]:
y_valid.shape

## Visualizing the Data

To visualize the images, we will again use the matplotlib library. We don't need to worry about the details of this visualization, but if interested, you can learn more about [matplotlib](https://matplotlib.org/) at a later time.

Note that we'll have to reshape the data from its current 1D shape of 784 pixels, to a 2D shape of 28x28 pixels to make sense of the image:

In [None]:
import matplotlib.pyplot as plt
plt.figure(figsize=(40,40))

num_images = 20
for i in range(num_images):
    row = x_train[i]
    label = y_train[i]

    image = row.reshape(28,28)
    plt.subplot(1, num_images, i+1)
    plt.title(label, fontdict={'fontsize': 30})
    plt.axis('off')
    plt.imshow(image, cmap='gray')
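
The reshape step can be checked on synthetic data without loading the dataset. This is a minimal sketch using a made-up `row` of 784 values; it only shows how a flat vector maps onto a 28x28 grid in NumPy's default row-major order.

```python
import numpy as np

# Synthetic stand-in for one row of x_train: 784 pixel values in row-major order.
row = np.arange(784)

image = row.reshape(28, 28)   # 1D vector -> 28x28 grid
restored = image.reshape(-1)  # back to a flat vector of 784 values

print(image.shape, restored.shape)
# Pixel k of the flat row lands at (k // 28, k % 28) in the image.
print(image[1, 0] == row[28])
```

Because reshaping only changes how the values are indexed, flattening the image again recovers the original row exactly.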

# Preprocess the Image Data

## Normalize the Image Data

We are going to normalize the image data, meaning that their pixel values, instead of being between 0 and 255 as they are currently:

In [None]:
x_train.min()

In [None]:
x_train.max()

...should be floating point values between 0 and 1. The following cell does this by dividing every pixel value by 255:

In [None]:
x_train = x_train / 255
x_valid = x_valid / 255
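
As a quick sanity check of what the division does, here is a small sketch on a made-up `pixels` array (not the real dataset). It shows that dividing integer pixel values by 255 yields floats in the range 0 to 1.

```python
import numpy as np

# Synthetic stand-in for a few pixel rows (integers in 0..255).
pixels = np.array([[0, 64, 128, 255],
                   [255, 32, 16, 0]])

scaled = pixels / 255  # true division promotes integers to float64

print(scaled.dtype, scaled.min(), scaled.max())
```

Note that the normalization cell is not idempotent: running it a second time divides by 255 again, so it is worth re-checking `x_train.max()` if you re-run cells out of order.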

## Categorize the Labels

We are going to categorically (one-hot) encode the labels. We can use the [keras.utils.to_categorical](https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical) method to accomplish this by passing it the values to encode and the number of categories to encode them into.

In [None]:
import tensorflow.keras as keras
num_classes = 24
if y_train.shape[-1] != num_classes:  # Skip if the labels are already one-hot encoded
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_valid = keras.utils.to_categorical(y_valid, num_classes)
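
To see what one-hot encoding produces, the following is a minimal NumPy sketch of the same idea. The helper `one_hot` is hypothetical and written only for illustration; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([0, 2, 23], num_classes=24)
print(encoded.shape)
print(encoded.argmax(axis=1))  # recovers the original integer labels
```

Each row has exactly one entry equal to 1, at the column given by the label, so `argmax` along each row returns the original class index.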

# Build and Fit the Model

## Build the Model

The data is all prepared: we have normalized images for training and validation, as well as categorically encoded labels for training and validation.

For this exercise we are going to build a sequential model that:
* Has a dense input layer. This layer should contain 512 neurons, use the `relu` activation function, and expect input images with a shape of `(784,)`.
* Has a dense output layer with neurons equal to the number of classes, using the `softmax` activation function.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(units = 512, activation='relu', input_shape=(784,)))
model.add(Dense(units = num_classes, activation='softmax'))

## Summarizing the Model

Run the cell below to summarize the model you just created:

In [None]:
model.summary()
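
The parameter counts in the summary can be reproduced by hand. A dense layer has one weight per input-unit pair plus one bias per unit; the sketch below applies that to the two layers described above (784 inputs, 512 hidden units, 24 outputs). The helper name `dense_params` is made up for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 512)   # first dense layer
output = dense_params(512, 24)    # output layer
print(hidden, output, hidden + output)
```

This gives 401,920 parameters for the first layer and 12,312 for the output layer, 414,232 in total, which should agree with what `model.summary()` reports for this architecture.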

## Compiling the Model

We'll [compile](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#compile) our model with a loss function of [categorical crossentropy](https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy), which reflects the fact that we want to assign each image to one of many categories. We also ask Keras to track the accuracy of the model as it is fit. Because no optimizer is passed, Keras uses its default (`rmsprop`):

In [None]:
model.compile(loss='categorical_crossentropy', metrics=['accuracy'])
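
To make the loss more concrete, here is a minimal NumPy sketch of softmax followed by categorical crossentropy. It is an illustration of the formulas on made-up scores, not the Keras implementation, and the function names and the `eps` clipping value are assumptions of this sketch.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -sum(y_true * log(y_prob)) over the samples."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=-1)))

logits = np.array([[2.0, 0.5, 0.1],    # made-up scores for 2 samples, 3 classes
                   [0.2, 0.3, 3.0]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])   # second sample is misclassified

probs = softmax(logits)
print(probs.sum(axis=1))
print(categorical_crossentropy(y_true, probs))
```

Only the probability assigned to the true class enters the loss, so a confident wrong prediction (like the second sample) is penalized much more heavily than a confident correct one.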

## Train the Model

Use the model's `fit` method to train it for 20 epochs using the training and validation images and labels created above (~2 min).

In [None]:
hist=model.fit(x_train, y_train, epochs=20, verbose=1, validation_data=(x_valid, y_valid))
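
The progress bar printed during training counts batches, not images. Since the `fit` call above does not set `batch_size`, Keras uses its default of 32, so the number of steps per epoch can be estimated as follows (a small sketch using the training set size reported earlier):

```python
import math

n_train = 27455    # number of training images reported above
batch_size = 32    # Keras default when fit() is not given batch_size

steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)
```

The last batch of each epoch is smaller than the others because 27,455 is not a multiple of 32.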

## Analyze Model Results

A key quantity to assess is how the loss function evolves with each epoch, which shows whether the model is actually learning. The accuracy (number of correct predictions divided by the total number of predictions) is also a good metric of how well our model is doing. We usually want both quantities for the training and the validation data.
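
As an illustration of the accuracy definition, the sketch below computes it by hand for one-hot labels and predicted probabilities. The arrays are made up and the helper `accuracy` is hypothetical; Keras computes this metric for you during training.

```python
import numpy as np

def accuracy(y_true_onehot, y_prob):
    """Fraction of samples whose highest-probability class matches the label."""
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_onehot, axis=1)
    return float(np.mean(predicted == actual))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, label is class 2
                   [0.2, 0.6, 0.2]])
print(accuracy(y_true, y_prob))
```

Three of the four made-up predictions are correct, so the accuracy here is 0.75.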

In [None]:
# See what data we have saved in history from model run
print(hist.history.keys())

In [None]:
# Plot accuracy and loss
acc = hist.history['accuracy']
val_acc = hist.history['val_accuracy']
loss = hist.history['loss']
val_loss = hist.history['val_loss']

err_plot(acc,val_acc,loss,val_loss)

In [None]:
# Output top accuracy in validation data
max_acc = max(hist.history['val_accuracy'])
maxacc_index = hist.history['val_accuracy'].index(max_acc)+1
print("Best validation accuracy during run was: %1.2f on epoch %i"
      % (max_acc, maxacc_index))

## Discussion: What happened?

We can see that the training accuracy got to a fairly high level, but the validation accuracy was not as high. What happened here?

This is an example of the model learning to categorize the training data but performing poorly on new data that it has not been trained on. Essentially, it is memorizing the dataset rather than gaining a robust and general understanding of the problem. This is called *overfitting*.
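
One simple way to quantify this from a training history is to look at where the validation loss is lowest and how far the final training accuracy sits above the validation accuracy. The sketch below does this for a dictionary shaped like `hist.history`; the numbers are invented for illustration and are not results from this notebook, and `summarize_history` is a hypothetical helper.

```python
def summarize_history(history):
    """Report the best validation epoch and the final train/validation gap."""
    val_loss = history["val_loss"]
    best_epoch = val_loss.index(min(val_loss)) + 1  # epochs are 1-based
    gap = history["accuracy"][-1] - history["val_accuracy"][-1]
    return best_epoch, gap

# Made-up numbers for illustration only, not results from this notebook.
fake_history = {
    "accuracy":     [0.60, 0.85, 0.95, 0.99],
    "val_accuracy": [0.58, 0.75, 0.80, 0.79],
    "loss":         [1.20, 0.50, 0.20, 0.05],
    "val_loss":     [1.25, 0.80, 0.70, 0.90],
}
best_epoch, gap = summarize_history(fake_history)
print(best_epoch, round(gap, 2))
```

A validation loss that starts rising while the training loss keeps falling, together with a large accuracy gap, is the typical signature of overfitting. With a real run you could call `summarize_history(hist.history)`.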

# Summary

In this section you built your own neural network to perform image classification, and it is quite accurate on the training data. Congrats!

At this point we should be getting somewhat familiar with the process of loading data (including labels), preparing it, creating a model, and then training the model with the prepared data.

# Question 1
Add another dense layer of 512 nodes and retrain, so that you can compare the best validation accuracy to what you got with the initial architecture above. Provide a similar plot of training and validation accuracy, and give the best validation accuracy from this NN alongside the result above.

Answer: