### University of Virginia
### DS 5110: Big Data Systems
### TensorFlow Neural Network Assignment
### Last Updated: May 3, 2021

### Assignment

In this assignment, you will review some code that trains and evaluates a neural network using `TensorFlow`.  You will then run and modify the code to answer the questions below. 

**TOTAL POINTS: 10**

### Source

The code to implement this neural network in `TensorFlow` was sourced from this GitHub repo (under an unrestricted license):

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/neural_network.py

I have added some annotations to explain the steps of the process.

**Background for Understanding the Code**

`neural_net()` sets up the architecture and returns the output layer.

`model_fn()` sets up the following:
- construction of the neural net
- `softmax()` layer to produce probabilities for each class
- output of the predicted class, which is the class with highest probability
- forward propagation and the loss calculation, based on cross entropy
- backpropagation and parameter updating using minibatch gradient descent
- accuracy calculation
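
To make the softmax, predicted-class, cross-entropy, and accuracy steps above concrete, here is a minimal NumPy sketch. It is not the TensorFlow implementation; the function names `softmax` and `sparse_cross_entropy` and the small batch of logits are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_cross_entropy(logits, labels):
    # Mean negative log-probability assigned to the true class of each example.
    probs = softmax(logits)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return -np.log(true_class_probs).mean()

# Hypothetical logits for a batch of 3 examples and 4 classes.
logits = np.array([[2.0, 1.0, 0.1, -1.0],
                   [0.5, 2.5, 0.3, 0.0],
                   [1.0, 1.2, 3.0, 0.2]])
labels = np.array([0, 1, 1])  # integer class labels, as with one_hot=False

probs = softmax(logits)              # each row sums to 1
pred_classes = probs.argmax(axis=1)  # class with the highest probability
loss = sparse_cross_entropy(logits, labels)
accuracy = (pred_classes == labels).mean()
```

Because softmax preserves the ordering of the logits, taking the argmax of the logits gives the same predicted class as taking the argmax of the probabilities.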

`tf.estimator.Estimator` is an `Estimator` class to train and evaluate `TensorFlow` models.  It wraps a model, which is specified by a `model_fn`.

`tf.estimator.inputs.numpy_input_fn()` returns an input function that feeds a dict of NumPy arrays into the model.
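
As a rough picture of what such an input function does, the sketch below yields minibatches of a features dict and a label array from NumPy data. The helper `minibatch_input_fn` and the placeholder arrays are hypothetical; this is not how TensorFlow implements it, and it covers a single pass over the data, whereas `num_epochs=None` in the assignment code keeps cycling through the data until training stops.

```python
import numpy as np

def minibatch_input_fn(x, y, batch_size, shuffle=True, seed=0):
    """Hypothetical helper: yield (features_dict, labels) batches for one epoch."""
    n = len(y)
    indices = np.arange(n)
    if shuffle:
        # Shuffle example order so each batch is a random sample.
        np.random.default_rng(seed).shuffle(indices)
    for start in range(0, n, batch_size):
        batch = indices[start:start + batch_size]
        # Slice every named feature array with the same row indices.
        yield {name: arr[batch] for name, arr in x.items()}, y[batch]

# Placeholder data shaped like flattened 28*28 images with digit labels.
images = np.zeros((300, 784), dtype=np.float32)
labels = np.arange(300) % 10

batches = list(minibatch_input_fn({'images': images}, labels, batch_size=128))
batch_shapes = [features['images'].shape for features, _ in batches]
```

With 300 examples and a batch size of 128, the last batch is smaller than the others because the data does not divide evenly.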


**NOTE**  

This URL allows you to easily enter and run TensorFlow code from a browser.  

https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/hello_t2t.ipynb#scrollTo=xGUjWehq8Vxq

**NEURAL NET CODE**

In [4]:
""" 
Use TensorFlow to implement a neural network with 2 hidden layers.
This example is using the MNIST database of handwritten digits (http://yann.lecun.com/exdb/mnist/).
Links:
    [MNIST Dataset](http://yann.lecun.com/exdb/mnist/).
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
"""

from __future__ import print_function
import tensorflow as tf

# Import MNIST data (TensorFlow 1.x tutorial helper), saving it to the
# MNIST_data/ subfolder of the working directory
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot = False)

# set up hyperparameters
learning_rate = 0.1
num_steps = 1000
batch_size = 128
display_step = 100

n_hidden_1 = 256 # number of neurons in first (fully-connected) hidden layer
n_hidden_2 = 256 # number of neurons in second (fully-connected) hidden layer
num_input = 784 # MNIST data input (image shape: 28*28)
num_classes = 10 # MNIST classes (0-9 digits)

# Define the neural network
def neural_net(x_dict):
    # TF Estimator input is a dict, in case of multiple inputs
    x = x_dict['images']
    # initialize first hidden layer
    layer_1 = tf.layers.dense(x, n_hidden_1)
    # initialize second hidden layer
    layer_2 = tf.layers.dense(layer_1, n_hidden_2)
    # initialize output layer, with one neuron per class
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer

# Define the model function (following TF Estimator Template)
def model_fn(features, labels, mode):

    print('entering mode: {}'.format(mode))
    
    # Build the neural network, returning the output layer
    # (logits, i.e. unnormalized class scores, not yet probabilities)
    logits = neural_net(features)

    # Predictions
    # extract class with highest probability
    pred_classes = tf.argmax(logits, axis=1)
    # extract the probabilities for each class
    pred_probas = tf.nn.softmax(logits)

    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes)

    # Define loss, using cross entropy
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=tf.cast(labels, dtype=tf.int32)))
    
    # Define optimizer, using gradient descent
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(loss_op,
                                  global_step=tf.train.get_global_step())

    # Evaluate the accuracy of the model
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)

    # TF Estimators are required to return an EstimatorSpec that specifies
    # the different ops for training, evaluating, ...
    estim_specs = tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=pred_classes,
        loss=loss_op,
        train_op=train_op,
        eval_metric_ops={'accuracy': acc_op})

    return estim_specs

print('building estimator definition...')
model = tf.estimator.Estimator(model_fn)
print('estimator definition complete')

print('defining the input function for training; loading MNIST data...')
input_fn_train = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=None, shuffle=True)
print('training data setup complete')

print('training the model...')
model.train(input_fn_train, steps=num_steps)
print('model training complete')

print('setting up eval data...')
input_fn_eval = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, y=mnist.test.labels,
    batch_size=batch_size, shuffle=False)
print('setting up eval data complete')

print('evaluating the model...')
e = model.evaluate(input_fn_eval)
print('model evaluation complete')

print("Testing Accuracy:", e['accuracy'])

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
building estimator definition...
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpnr8e133h', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6afbceffd0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluat

1) (1 PT) Run the code.  I will check for logging that looks like this:

In [None]:
building estimator definition...
estimator definition complete
defining the input function for training; loading MNIST data...
training data setup complete
training the model...
entering mode: train
model training complete
setting up eval data...
setting up eval data complete
evaluating the model...
entering mode: eval
model evaluation complete
Testing Accuracy: 0.916 # this may differ

2) (1 PT) Enter the test accuracy in the cell below

3) (1 PT) Change `learning_rate` from **0.1** to **0.2** and report test accuracy below.

4) (1 PT) How does this accuracy compare to the original model?  Provide an explanation of why this might occur.

5) (1 PT) Change `num_steps` from **1000** to **10** and report the test accuracy below.

6) (1 PT) How does this accuracy compare to the original model?  Provide an explanation of why this might occur.  

7) (2 PTS) Change one or more of the hyperparameters to see if you can improve the accuracy of the model.  For your final model, report the changes you made, and the accuracy.  If you cannot improve accuracy, report at least three experiments that you tried, with their accuracies.  Please provide a brief, organized summary.

8) (2 PTS) Finalize the code in the block below to show the first image in the MNIST training set.

In [None]:
# import packages
import numpy as np
import matplotlib.pyplot as plt

first_image = None
first_image = np.array(first_image, dtype='float')
pixels = first_image.reshape((None, None))
plt.imshow(None, cmap='gray')
plt.show()