# DS/CMPSC 410 Spring 2021
# Instructor: Professor John Yen
# TA: Rupesh Prajapati 
# LAs: Lily Jakielaszek and Cayla Shan Pun
# Lab 9: Deep Neural Networks

## The goals of this lab are for you to be able to
### - Use tensorflow and keras to implement a deep learning application (MNIST)
### - Be able to assess the result of DNN learning using validation data
### - Be able to identify potential overfitting risk of a DNN model
### - Be able to reduce potential overfitting risk by adjusting epoch and size of batch
### - Be able to compare learning outcomes of different DNN architectures

## Exercises: 
- Exercise 1: 5 points
- Exercise 2: 10 points
- Exercise 3: 10 points
- Exercise 4: 15 points
- Exercise 5: 20 points
- Exercise 6: 25 points

## Total Points (Lab): 85 points

# Due: midnight, April 21 (Thursday), 2022

# Install tensorflow and keras
The first thing to do is to install tensorflow and keras in your ICDS Roar environment.
- Open a terminal window in Jupyter Lab
- Type the following in the terminal window 
```pip install tensorflow```
- After the installation of tensorflow completes, type the following in the terminal window 
```pip install keras```
- Wait until the installation completes. Then run the `import tensorflow as tf` cell in the Jupyter Notebook and continue with the instructions in the notebook.

In [1]:
import tensorflow as tf

In [2]:
from tensorflow import keras

In [3]:
from tensorflow.keras import *

# Exercise 1 (5 points)
Enter your name here: Haichen Wei

In [4]:
mnist = tf.keras.datasets.mnist

In [5]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [6]:
len(train_labels)

60000

In [7]:
len(test_labels)

10000

In [8]:
train_images.shape

(60000, 28, 28)

# Part A Pre-processing Input Data
In this lab, we pre-process each 28 by 28 input image into a vector of 784 (28*28) input features (one for each pixel). An alternative is to use a Convolutional Neural Network directly on the 28 by 28 input images without reshaping.

In [9]:
train_images2 = train_images.reshape(60000, 784)
test_images2 = test_images.reshape(10000, 784)

In [10]:
train_images2.shape

(60000, 784)
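
The flattening step can be sketched on a tiny array. This is a minimal illustration, assuming a made-up stand-in tensor of two 3 by 3 "images" rather than the real MNIST data; the names `images` and `flat` are only for this example. It shows that `reshape` keeps the pixels in row-major order, so pixel (row r, column c) ends up at index `r * width + c` of the vector.

```python
import numpy as np

# A tiny stand-in for the image tensor: 2 "images" of 3 x 3 pixels
# (the real training data is 60000 images of 28 x 28 pixels).
images = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# -1 lets NumPy infer the flattened length (3 * 3 = 9 here, 28 * 28 = 784 for MNIST).
flat = images.reshape(len(images), -1)

print(flat.shape)   # (2, 9)
print(flat[0])      # [0 1 2 3 4 5 6 7 8]

# Flattening is row-major: pixel (row r, column c) lands at index r * width + c.
r, c, width = 1, 2, 3
assert flat[0][r * width + c] == images[0, r, c]
```

Using `-1` instead of a hard-coded 784 is a design choice that avoids repeating the image size; the explicit `reshape(60000, 784)` in the cell above does the same thing for the full dataset.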

In [11]:
train_images[0]

array([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   3,
         18,  18,  18, 126, 136, 175,  26, 166, 255, 247, 127,   0,   0,
          0,   0],
       [  

In [12]:
train_images2[0]

array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   3,  18,  18,  18,
       126, 136, 175,  26, 166, 255, 247, 127,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,  30,  36,  94, 154, 17

### The computation of nodes in a DNN is based on floating point. Therefore, we need to convert the input data (reshaped image data) to floating point.

In [13]:
train_images3 = train_images2.astype('float32')
test_images3 = test_images2.astype('float32')

In [14]:
train_images3[0]

array([  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   

## The value of each pixel ranges from 0 (white) to 255 (darkest black). We want to transform the input into the range of [0, 1]. This is easily done by dividing the original pixel value by 255.

In [15]:
train_images3 /= 255
test_images3 /= 255
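
The order of the two steps (convert to float, then scale) matters. The sketch below uses a small made-up array named `pixels` to show why: an in-place division on 8-bit integers cannot store fractional results, so NumPy raises an error, while the float32 copy scales cleanly into [0, 1].

```python
import numpy as np

pixels = np.array([0, 3, 126, 255], dtype=np.uint8)  # raw 8-bit pixel values

# In-place true division cannot store fractions in a uint8 array, so NumPy refuses.
try:
    pixels /= 255
    in_place_ok = True
except TypeError:
    in_place_ok = False

# Converting first gives a float32 copy that can hold values in [0, 1].
scaled = pixels.astype("float32") / 255

print(in_place_ok)                  # False
print(scaled.min(), scaled.max())   # 0.0 1.0
```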

In [16]:
train_images3[0]

array([0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.     

# Exercise 2 (10 points)
Fill in the shape of the input data. Specify the number of nodes in each hidden layer (3 hidden layers total) for the DNN architecture.
Recommended number of nodes in each hidden layer: 512. The last layer is the output layer. Hence, it should have
10 nodes (one for each single-digit character, 0 through 9).

In [17]:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
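
To get a feel for the size of this architecture, the number of trainable parameters can be counted by hand. The helper `dense_param_count` below is a hypothetical name used only for this sketch; it assumes every layer is a fully connected layer with one bias per node, as in the model above.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's node count.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per node
    return total

# 784 inputs, three hidden layers of 512 nodes, 10 output nodes.
three_hidden = dense_param_count([784, 512, 512, 512, 10])
print(three_hidden)  # 932362
```

Calling `model.summary()` on the Keras model should report the same total, which makes it a quick check that the layers were specified as intended.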

In [18]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
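
The loss named in `compile` takes integer labels (such as 5) rather than one-hot vectors, which is why `train_labels` can be passed in unchanged. The following is a minimal NumPy sketch of what that loss and the accuracy metric measure, assuming made-up softmax outputs; `mean_sparse_crossentropy` is a hypothetical helper for illustration, not the Keras implementation.

```python
import numpy as np

def mean_sparse_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class.

    probs:  (batch, classes) array of softmax outputs, rows summing to 1
    labels: (batch,) array of integer class ids (not one-hot)
    """
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())

# Made-up softmax outputs for two samples over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])

loss = mean_sparse_crossentropy(probs, labels)
# Accuracy: fraction of samples whose most probable class matches the label.
accuracy = float((probs.argmax(axis=1) == labels).mean())

print(round(loss, 4))  # 0.2899
print(accuracy)        # 1.0
```

Note that the loss is still positive even though both predictions are correct: it penalizes low confidence in the true class, so loss and accuracy can move in different directions during training.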

# Exercise 3 (10 points)
Fill in the training input and testing input (validation data) in the code below, both
of which should be images after the three transformation steps (reshape, astype, and scaling).
In this first DNN training run, we use a batch size of 128 (the default batch size is 32). We set verbose to 1 so that we can see the loss and accuracy on both the training data and the validation data after each epoch.

In [19]:
model.fit(train_images3, train_labels, batch_size=128, epochs=30, verbose=1,
          validation_data=(test_images3, test_labels))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x2b6bd5b77eb0>

# Exercise 4 (15 points)
- (a) Does the final learned DNN model indicate a risk of overfitting? Explain your answer. (5 points)
- (b) Which epoch do you believe generated the best model? Explain the rationale for your decision. (10 points)

# Answers to Exercise 4
- (a) Yes, the loss on the validation data is much worse than the loss on the training data.
- (b) Epoch 14 is the best, because it has the highest validation accuracy, 0.9838, and its validation loss, 0.0756, is relatively small.
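
Picking the best epoch by eye from 30 lines of output is error-prone, so here is a minimal sketch of doing it programmatically. It assumes the return value of `model.fit` has been captured (for example `history = model.fit(...)`, which the cell above does not do); in that case `history.history` is a dict of per-epoch lists with keys such as `loss` and `val_loss`. The dict below is a hypothetical stand-in with made-up numbers, not the values from the run above, and `best_epoch` is a name invented for this example.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1

# Hypothetical per-epoch values, NOT the numbers from the run above.
history = {
    "loss":     [0.30, 0.12, 0.07, 0.04, 0.02],
    "val_loss": [0.15, 0.10, 0.08, 0.09, 0.11],
}

print(best_epoch(history["val_loss"]))  # 3
```

In this made-up example the training loss keeps falling after epoch 3 while the validation loss rises, which is the pattern that signals overfitting.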

# Exercise 5 (20 points)
Change the batch size to 1000 and complete the following code (keeping the number of nodes in each layer identical to Exercise 2). Compare the results of the learned DNN to those of the previous one. Answer the following questions in the Markdown cell at the bottom of the Notebook.
- (a) Which epoch generates the best DNN model? Why? (10 points)
- (b) Based on the results of Exercise 4 and 5, which choice of batch size is better? Why? (10 points)
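
One concrete difference between batch sizes is how many weight updates happen in each epoch. The sketch below computes that count for the 60000 training images, assuming the final batch of an epoch may be smaller than the rest; `updates_per_epoch` is a hypothetical helper name.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of gradient updates in one pass over the data (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

for batch_size in (32, 128, 1000):
    print(batch_size, updates_per_epoch(60000, batch_size))
# 32 1875
# 128 469
# 1000 60
```

With a batch size of 1000 each update averages the gradient over more examples, but the model gets far fewer updates per epoch, which is worth keeping in mind when comparing the two runs epoch by epoch.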

In [20]:
model2 = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

In [21]:
model2.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [22]:
model2.fit(train_images3, train_labels, batch_size=1000, epochs=30, verbose=1,
           validation_data=(test_images3, test_labels))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x2b6bd6795fa0>

# Answers to Exercise 5 (20 points)
- (a) (10 points)

Epoch 7 is the best, because it has the smallest validation loss, 0.0630, and its validation accuracy, 0.9819, is relatively high.

- (b) (10 points)

A batch size of 1000 is better. A larger batch size lets each weight adjustment consider more training examples at once.

# Exercise 6 (25 points)
Copy the code above to train a DNN with only two hidden layers, using the same number of nodes in each layer as you chose for Exercise 2 and 4, but use the batch size that gave the better result.
- (a) What is the performance result of the DNN learned? (10 points)
- (b) Will you choose this DNN over the one with three layers? Why? (10 points)
- (c) Compare the overfitting risk of this DNN with that of the previous two. (5 points)

# Answer to Exercise 6 
- (a) It has high validation accuracy, and its validation loss is in general smaller than that of the previous two. 
- (b) No, this DNN performs worse than the other two because of overfitting. The loss measure for validation data is much worse than the loss measure for training data, and it decreases as training proceeds.
- (c) It has a higher overfitting risk than the previous two models. 
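
One way to make an overfitting comparison between models more concrete is to look at the gap between validation loss and training loss. The sketch below assumes each model's per-epoch losses are available as lists (for example from a captured `history.history`); the curves for "model A" and "model B" are made-up numbers for illustration, not results from this notebook, and `generalization_gaps` is a hypothetical helper.

```python
def generalization_gaps(train_losses, val_losses):
    """Per-epoch gap between validation and training loss (positive = validation is worse)."""
    return [round(v - t, 4) for t, v in zip(train_losses, val_losses)]

# Hypothetical loss curves for two models, NOT taken from the runs in this lab.
runs = {
    "model A": {"loss": [0.20, 0.05, 0.01], "val_loss": [0.12, 0.08, 0.10]},
    "model B": {"loss": [0.25, 0.10, 0.06], "val_loss": [0.14, 0.09, 0.08]},
}

final_gap = {
    name: generalization_gaps(h["loss"], h["val_loss"])[-1]
    for name, h in runs.items()
}
print(final_gap)  # {'model A': 0.09, 'model B': 0.02}
```

A larger final gap, as for the made-up "model A", suggests the model fits the training data much better than unseen data, which is the usual sign of higher overfitting risk.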

In [23]:
model3 = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

In [24]:
model3.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [25]:
model3.fit(train_images3, train_labels, batch_size=1000, epochs=30, verbose=1,
           validation_data=(test_images3, test_labels))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x2b6c0786ab80>