# Session 7: Neural Networks #

Machine learning and artificial intelligence technologies are advancing at an impressive rate. From robotics and self-driving cars to augmented reality devices and facial recognition software, models that make predictions from data are all around us. Many of these applications use neural networks, which are models loosely inspired by the way the human brain processes information.

With recent advancements in computing power and the explosion of big data, we can now implement large models that perform end-to-end learning (deep learning). This means that we can create a model, feed it large amounts of data, and the model will learn the features of the data that are important for accomplishing the task.

Session outline:
* Introduce the simplest neural network, the perceptron
* Discuss the general architecture for neural networks
* Implement a neural network to solve a handwriting recognition task
* Introduce deep learning (convolutional neural networks)
* Implement a deep neural network to solve a handwriting recognition task

#### Preparation for the workshop: ####

1. Watch the following videos:
* https://www.youtube.com/watch?v=aircAruvnKk
* https://www.youtube.com/watch?v=uXt8qF2Zzfo&t=1973s (Watch first 12 min)
* https://www.youtube.com/watch?v=YRhxdVk_sIs

2. Pull session 7 materials from GitHub
* https://github.com/pabloinsente/LUCID_data_workshop

## Breakout Session #1 ##

The following code has been modified from:

https://www.tensorflow.org/tutorials/keras/basic_classification

_Please visit the website to see the original code and explanation._

#### Instructions: ####

Read through the explanation for each section of code, then run the code blocks in order. If you have any questions about the code or the underlying theory, please raise your hand and ask (_if you have questions after the session, please send me an email at doudlah@wisc.edu_).

\

For this session, we will use **TensorFlow**, an open-source deep learning library, and **Keras**, a neural network API. To learn more, visit the TensorFlow website (https://www.tensorflow.org/). There are many great tutorials to check out!

## Step 1: Preparing the data ##

_**Note:** The MNIST dataset is freely available (http://yann.lecun.com/exdb/mnist/). Feel free to visit the website to learn more about the dataset and how it was created._

Keras has a built-in module for downloading the MNIST dataset, which makes it easy to load the data already split into training and testing sets. It is always a good idea to get to know your data before you dive into building a model: examining it gives you a better feel for how to approach the analysis.

Here, we plot the first 25 training images. What are some things that you notice about the images? Do you think you could code your own rules for classifying these handwritten digits?

In [0]:
# Import required libraries
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Check TensorFlow version
print(tf.__version__)

# Load MNIST data
mnistData = keras.datasets.mnist
(trainImages, trainLabels), (testImages, testLabels) = mnistData.load_data()

# Check sizes of data
print('Shape of trainImages: ',trainImages.shape)
print('Length of trainLabels: ',len(trainLabels))
print('Shape of testImages: ',testImages.shape)
print('Length of testLabels: ',len(testLabels))

# Check the image labels
print(trainLabels)
print(testLabels)

# Set labels for each class (digits 0:9)
class_names = ['Digit_0','Digit_1','Digit_2','Digit_3','Digit_4','Digit_5',
              'Digit_6','Digit_7','Digit_8','Digit_9',]

# Display example handwritten digits
plt.figure(figsize=(10,10))
for i in range(25):
  plt.subplot(5,5,i+1)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  plt.imshow(trainImages[i], cmap=plt.cm.binary)
  plt.title(class_names[trainLabels[i]])
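
Plots are one way to get to know the data; a quick numeric summary is another. The sketch below is not part of the original tutorial: `summarize_images` is a hypothetical helper, and it runs on synthetic arrays that only mimic the MNIST layout (a stack of 28x28 images with integer pixel values) so that it works without downloading anything.

```python
import numpy as np

def summarize_images(images, labels, num_classes=10):
    """Return basic statistics for a batch of labeled grayscale images."""
    counts = np.bincount(labels, minlength=num_classes)  # images per class
    return {
        "shape": images.shape,
        "dtype": str(images.dtype),
        "pixel_min": int(images.min()),
        "pixel_max": int(images.max()),
        "class_counts": counts.tolist(),
    }

# Synthetic stand-in (an assumption, not real MNIST data): 100 random images
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=100)

summary = summarize_images(fake_images, fake_labels)
print(summary)
```

In the notebook, you could call `summarize_images(trainImages, trainLabels)` instead to check the pixel range and whether the ten digits are roughly equally represented.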


## Step 2: Building the Neural Network Model ##

The first layer will be a flattened version of our images, so it will have 784 nodes (28x28). We can then choose how many hidden layers we want, and how many nodes each layer will have. Feel free to modify the code to add more layers or change the number of nodes in each layer to see how the accuracy of the model changes. The final layer must have 10 nodes because we have 10 different classes that we are trying to discriminate between.
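To get a feel for how the layer sizes translate into model size, the sketch below counts the trainable parameters of a fully connected network. `dense_param_count` is a hypothetical helper written for this illustration, and it assumes every node connects to every node in the next layer, with one bias per node. The hidden layer size of 128 matches the model defined later in this step.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # One weight per connection, plus one bias per output node
        total += n_in * n_out + n_out
    return total

# 784 input nodes (28x28 pixels), 128 hidden nodes, 10 output nodes
layer_sizes = [28 * 28, 128, 10]
print(dense_param_count(layer_sizes))  # 101770
```

Try changing `layer_sizes` to see how quickly the number of parameters grows when you add layers or nodes.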

\

**Note:**

'_relu_' is a type of activation function, which stands for "rectified linear unit" and is defined as:

$
f(x)= 
\begin{cases}
    x,& \text{for } x > 0\\
    0,              & \text{otherwise}
\end{cases}
$
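
As a quick illustration, the definition above can be written in one line of NumPy. This is only a sketch of the function itself (the name `relu` is ours); in the model, Keras applies its own built-in version.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keep positive values, replace the rest with 0."""
    return np.maximum(x, 0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x).tolist())  # [0.0, 0.0, 0.0, 0.5, 2.0]
```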

\

Recall from the lecture slides all of the connections between the nodes in the different layers. Can you imagine having to code each connection by hand? TensorFlow and Keras allow you to build complicated models in relatively few lines of code.

Here we first define our model architecture and then compile the model and choose a loss function and optimizer. 


In [0]:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28,28)),
    keras.layers.Dense(128,activation=tf.nn.relu),
    keras.layers.Dense(10,activation=tf.nn.softmax)
])

model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy'])
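
The loss chosen above, `'sparse_categorical_crossentropy'`, compares the model's output probabilities with integer class labels: it is the average negative log of the probability the model assigned to the correct class. The NumPy sketch below illustrates that idea on made-up numbers. The function name `sparse_crossentropy` and the `eps` clipping value are assumptions for this example, and the Keras implementation may differ in its details.

```python
import numpy as np

def sparse_crossentropy(probs, labels, eps=1e-7):
    """Mean negative log-probability of the true class (integer labels)."""
    probs = np.clip(probs, eps, 1.0)                 # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(-np.mean(np.log(picked)))

# Two made-up predictions over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])

loss = sparse_crossentropy(probs, labels)
print(round(loss, 4))  # 0.2899
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1, and grows quickly when the model is confidently wrong.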


## Step 3: Train the network ##

Here, we will train the neural network and print the accuracy at the end of training. Feel free to play with the number of epochs (the number of times the model trains on the full set of training images), but be careful of overfitting.

There is a lot that goes into training a model. You need to send some images through your model, calculate a loss from the true labels, and then update all of the weights in the network. With TensorFlow and Keras, all of that is condensed into one line of code!

_**Note:** This may take a few minutes to run because each epoch processes 60,000 images. Training time grows roughly in proportion to the number of epochs._

In [0]:
model.fit(trainImages,trainLabels,epochs=5)
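
To see what that single line is doing for you, here is a minimal NumPy sketch of the three steps described above: send data through the model, calculate a loss from the true labels, and update the weights. It makes several simplifying assumptions that differ from the Keras model: a single softmax layer with no hidden layer, plain gradient descent instead of the Adam optimizer, and small random data instead of MNIST. The names `softmax` and `train_step` are ours.

```python
import numpy as np

def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_step(W, b, X, y, lr=0.1):
    """One gradient-descent update; returns the loss measured before the update."""
    n = X.shape[0]
    probs = softmax(X @ W + b)                        # 1. forward pass
    loss = -np.mean(np.log(probs[np.arange(n), y]))   # 2. cross-entropy loss
    grad = probs.copy()
    grad[np.arange(n), y] -= 1.0                      # gradient of loss w.r.t. scores
    grad /= n
    W -= lr * (X.T @ grad)                            # 3. update weights in place
    b -= lr * grad.sum(axis=0)
    return float(loss)

# Random stand-in data (an assumption): 64 samples, 20 features, 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))
y = rng.integers(0, 3, size=64)
W = np.zeros((20, 3))
b = np.zeros(3)

losses = [train_step(W, b, X, y) for _ in range(50)]
print(losses[0], losses[-1])  # the loss should go down as the weights are updated
```

Each call to `train_step` plays the role of one pass over the data, so the loop of 50 calls is loosely analogous to setting `epochs=50` on a very small dataset.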

## Step 4: Test the network ##

As with any kind of machine learning, it is always important to test the network on data that it did not see in training. Here, we use our "testing" dataset to check the actual accuracy of the model. 

In [0]:
testLoss, testAccuracy = model.evaluate(testImages,testLabels)
print('Test accuracy:', testAccuracy)
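
The accuracy reported here is the fraction of test images for which the class with the highest predicted probability matches the true label. The sketch below computes that by hand on a few made-up predictions. The function name `accuracy` and the numbers are illustrative assumptions, not output from the model.

```python
import numpy as np

def accuracy(probabilities, labels):
    """Fraction of rows whose highest-probability class equals the true label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == labels))

# Four made-up predictions over three classes
probs = np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7]])
labels = np.array([0, 1, 2, 2])

print(accuracy(probs, labels))  # 0.75
```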

## Step 5: Check your results ##

Even with a simple neural network with relatively few parameters, we already get fairly high accuracy. It is always advisable to check the output of your model to verify that it is working as expected.

Here, we look at one prediction vector to see what the data looks like. Then we plot some of the test images and look at the model's guesses. 

In [0]:
# Print what a prediction looks like from your model
predictions = model.predict(testImages)
print(predictions[0])

# Print which label the model predicted
print(class_names[np.argmax(predictions[0])])

# Plot image with model predictions
# Plot first numRows or randomly select
isRand = False
numRows = 5
plt.figure(figsize=(8,numRows*2))
for i in range(numRows):
  # if isRand == True, randomly select
  if isRand:
    j = np.random.randint(len(testLabels))  # random index into the test set
  else:
    j = i
    
  # Plot image with true label
  plt.subplot(numRows,2,(2*i)+1)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  plt.imshow(testImages[j].squeeze(), cmap=plt.cm.binary)
  plt.title(class_names[testLabels[j]])
  
  # Plot histogram with guess percentage
  plt.subplot(numRows,2,(2*i)+2)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  thisplot = plt.bar(range(10), predictions[j], color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions[j])
  
  # Set color based on correct/incorrect
  if predicted_label == testLabels[j]:
    color = 'green'
  else:
    color = 'red'
  plt.title("{} {:2.0f}%".format(class_names[predicted_label],
                                      100*np.max(predictions[j])),
           color=color)
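
Looking at the first few test images mostly shows correct guesses. One way to dig deeper, sketched below, is to find the images the model got wrong while being most confident, since these are often the most informative to inspect. `most_confident_errors` is a hypothetical helper, shown here on made-up prediction vectors rather than real model output.

```python
import numpy as np

def most_confident_errors(predictions, labels, k=5):
    """Return indices of up to k misclassified rows, most confident first."""
    predicted = np.argmax(predictions, axis=1)
    confidence = np.max(predictions, axis=1)
    wrong = np.flatnonzero(predicted != labels)       # indices of the mistakes
    order = wrong[np.argsort(-confidence[wrong])]     # sort by descending confidence
    return order[:k].tolist()

# Four made-up prediction vectors over two classes
example_predictions = np.array([[0.9, 0.1],
                                [0.4, 0.6],
                                [0.8, 0.2],
                                [0.3, 0.7]])
example_labels = np.array([1, 1, 1, 0])

print(most_confident_errors(example_predictions, example_labels, k=2))  # [0, 2]
```

In the notebook, you could pass `predictions` and `testLabels` to this function and use the returned indices in place of `j` in the plotting loop above.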