# Session 7: Neural Networks #

Machine learning and artificial intelligence technologies are advancing at an impressive rate. From robotics and self-driving cars to augmented reality devices and facial recognition software, models that make predictions from data are all around us. Many of these applications use neural networks, which allow a computer to analyze data in a way loosely inspired by how the human brain processes information.

With recent advances in computing power and the explosion of big data, we can now train large models that perform end-to-end learning (deep learning). This means that we can create a model, feed it large amounts of data, and let the model learn the features of the data that are important for accomplishing the task.

Session outline:
* Introduce the simplest neural network, the perceptron
* Discuss the general architecture for neural networks
* Implement a neural network to solve a handwriting recognition task
* Introduce deep learning (convolutional neural networks)
* Implement a deep neural network to solve a handwriting recognition task

#### Preparation for the workshop: ####

1. Watch the following videos:
* https://www.youtube.com/watch?v=aircAruvnKk
* https://www.youtube.com/watch?v=uXt8qF2Zzfo&t=1973s (Watch first 12 min)
* https://www.youtube.com/watch?v=YRhxdVk_sIs

2. Pull session 7 materials from GitHub
* https://github.com/pabloinsente/LUCID_data_workshop

## Breakout Session #2: Convolutional Neural Networks ##

The following code has been modified from:

https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

_Please visit the website to see the original code and explanation._

#### Instructions: ####

Read through the explanation for each section of code and then run each block of code in consecutive order. If you have any questions about the code or the underlying theory, please raise your hand and ask (_if you have questions after the session, please send me an email at doudlah@wisc.edu_).


For this session, we will use **PyTorch**, another open-source deep learning library. To learn more, visit the PyTorch website (https://pytorch.org/). There are many great tutorials to check out!

## Step 1: Preparing the data ##

_**Note:** The MNIST dataset is freely available (http://yann.lecun.com/exdb/mnist/). Feel free to visit the website to learn more about the dataset and how it was created._

PyTorch (through the `torchvision` package) also has a built-in function for downloading the MNIST dataset. We will load a training dataset and a test dataset, and then display some example images from the MNIST dataset. Do you notice any similarities to breakout session #1?

In [0]:
# Import required libraries
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np

# Transform torchvision dataset from [0,1] to [-1,1]
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))])

# Load training data
trainData = torchvision.datasets.MNIST(root='./CNN_Data',train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainData, batch_size=10,
                                         shuffle=False, num_workers=2)

# Load testing data
testData = torchvision.datasets.MNIST(root='./CNN_Data',train=False,
                                      download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testData, batch_size=10,
                                         shuffle=False, num_workers=2)

classes = ('Digit_0','Digit_1','Digit_2','Digit_3','Digit_4','Digit_5',
           'Digit_6','Digit_7','Digit_8','Digit_9')

# Get one batch of example images and their labels
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Display example hand written digits
plt.figure(figsize=(8, 8))
for i in range(9):
  plt.subplot(3, 3, i+1)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  plt.imshow(images[i].squeeze(), cmap=plt.cm.binary)
  plt.title(classes[labels[i]])
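
The `transforms.Normalize((0.5,), (0.5,))` step in the cell above subtracts a mean of 0.5 from every pixel and divides by a standard deviation of 0.5. The sketch below reproduces that arithmetic in plain Python (so it runs without PyTorch) to show why pixel values in [0, 1] end up in [-1, 1]. The helper name `normalize_pixel` and the sample intensities are made up for illustration.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Apply the same per-pixel arithmetic as Normalize: (value - mean) / std."""
    return (value - mean) / std

# ToTensor() scales raw 0..255 intensities to the range [0, 1]
raw_intensities = [0, 64, 128, 255]
scaled = [v / 255 for v in raw_intensities]

# Normalize((0.5,), (0.5,)) then maps [0, 1] onto [-1, 1]
normalized = [normalize_pixel(v) for v in scaled]

for raw, norm in zip(raw_intensities, normalized):
    print(f"raw {raw:3d} -> normalized {norm:+.3f}")
```

Black pixels (0) map to -1 and white pixels (255) map to +1, so the inputs are centered around zero.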

## Step 2: Building the Convolutional Neural Network Model ##

To build the model, we define a class and add all of the layers and computational steps. Here we have two convolutional layers (with 6 and 16 filters, each using a 4x4 kernel) followed by three fully connected layers with 120, 84, and 10 nodes, respectively. Notice how the last layer must still have 10 nodes because we are differentiating between 10 classes.

The "forward" function takes the input image $x$ and propagates the data through the network. Notice that after each convolution we apply the "relu" activation function and then pool (or downsample) the data. 
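
To see where the `16 * 4 * 4` in the model definition comes from, one can trace the image size through the layers by hand. This is a minimal sketch, assuming 28x28 MNIST inputs, convolutions with stride 1 and no padding (the `nn.Conv2d` defaults), and 2x2 max pooling with stride 2. The helper names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Side length after a convolution with stride 1 and no padding."""
    return size - kernel + 1

def pool_output_size(size, window=2):
    """Side length after non-overlapping max pooling (rounds down)."""
    return size // window

size = 28                              # MNIST images are 28x28 pixels
size = conv_output_size(size, 4)       # conv1: 28 -> 25
size = pool_output_size(size)          # pool:  25 -> 12
size = conv_output_size(size, 4)       # conv2: 12 -> 9
size = pool_output_size(size)          # pool:   9 -> 4

channels = 16                          # conv2 produces 16 feature maps
flattened = channels * size * size     # number of inputs to the first linear layer
print(size, flattened)
```

If you change a kernel size or add a layer, the same arithmetic tells you what the input size of the first fully connected layer needs to be.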

Feel free to modify the number of layers and the number of nodes in each layer. Remember that the time it takes to train the network grows with the number of layers and the number of nodes in each layer.
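
One rough way to gauge how a change affects training cost is to count the trainable parameters. The sketch below does this by hand for the layer sizes used in this session (one weight per input-output connection plus one bias per output). The helper names are made up for illustration, and the parameter count is only a proxy for training time.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights (in * out * kernel * kernel) plus one bias per output channel."""
    return in_channels * out_channels * kernel * kernel + out_channels

def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output node."""
    return in_features * out_features + out_features

# Layer sizes copied from the Net class used in this session
layer_params = {
    'conv1': conv_params(1, 6, 4),
    'conv2': conv_params(6, 16, 4),
    'fc1': linear_params(16 * 4 * 4, 120),
    'fc2': linear_params(120, 84),
    'fc3': linear_params(84, 10),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:5s} {count:6d}")
print(f"total {total_params:6d}")
```

Most of the parameters sit in the first fully connected layer, so widening that layer has the largest effect. With PyTorch available, `sum(p.numel() for p in net.parameters())` should report the same total.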

In [0]:
# Define the model architecture
class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.conv1 = nn.Conv2d(1, 6, 4)
    self.pool = nn.MaxPool2d(2, 2)
    self.conv2 = nn.Conv2d(6, 16, 4)
    self.fc1 = nn.Linear(16 * 4 * 4, 120)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)
    
  def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(-1, 16 * 4 * 4)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
  
net = Net()

# Define a loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
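
For intuition about what `optim.SGD` with `momentum=0.9` does on each call to `optimizer.step()`, here is a minimal sketch of the update for a single scalar weight, written in plain Python. It follows the formulation described in the PyTorch documentation (velocity = momentum * velocity + gradient, then weight = weight - lr * velocity), assuming dampening and weight decay are zero. The toy loss function and all names are hypothetical.

```python
def sgd_momentum_step(weight, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity + grad   # accumulate a running direction
    weight = weight - lr * velocity         # move against the accumulated gradient
    return weight, velocity

# Toy problem (not from the session): minimize loss(w) = (w - 3)^2,
# whose gradient is 2 * (w - 3) and whose minimum is at w = 3
weight, velocity = 0.0, 0.0
for step in range(5000):
    grad = 2 * (weight - 3.0)
    weight, velocity = sgd_momentum_step(weight, grad, velocity)

print(round(weight, 3))
```

The velocity term lets consecutive gradients that point the same way build up speed, which is why momentum often trains faster than plain gradient descent at the same learning rate.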

## Step 3: Train the network ##

Here, we will train the neural network and print the average loss at regular intervals during training. Feel free to play with the number of epochs (the number of times the model will train on all of the images), but be careful of overfitting.

There is a lot that goes into training a model. You need to send some images through your model, calculate a loss from the true labels, and then update all of the weights in the network.

_**Note:** This may take a few minutes to run because each epoch processes 60,000 images. The running time grows with the number of epochs._
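
The loss calculation described above can be made concrete. `nn.CrossEntropyLoss` applies a softmax to the raw network outputs and takes the negative log of the probability assigned to the true class (averaged over the batch). The sketch below computes this for a single image in plain Python. The function name and the example scores are made up for illustration.

```python
import math

def cross_entropy(logits, label):
    """Negative log of the softmax probability assigned to the true class."""
    shift = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]

# A network that scores all 10 digits equally is just guessing
uniform_logits = [0.0] * 10
uniform_loss = cross_entropy(uniform_logits, label=3)

# A network that strongly favors digit 3
confident_logits = [0.0] * 10
confident_logits[3] = 8.0
confident_loss = cross_entropy(confident_logits, label=3)  # true digit is 3
wrong_loss = cross_entropy(confident_logits, label=7)      # true digit is 7

print(f"guessing: {uniform_loss:.3f}")
print(f"confident and right: {confident_loss:.3f}")
print(f"confident and wrong: {wrong_loss:.3f}")
```

A loss near ln(10), roughly 2.30, therefore suggests the network is still guessing among the 10 digits, and the printed loss should fall below that as training progresses.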

In [0]:
# Set the number of times you want to loop over the dataset
numEpochs = 2

# Train the network
for epoch in range(numEpochs):
  running_loss = 0.0
  for i, data in enumerate(trainloader, 0):
    # Get the image and the corresponding label
    inputs, labels = data
    
    # Zero the gradients of the parameters
    optimizer.zero_grad()
    
    # Run data through the network, calculate the loss and update weights
    outputs = net(inputs)
#     print('Inputs:',inputs.shape)
#     print('Outputs:',outputs.shape)
#     print('Labels:',labels.shape)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    
    # Print the average loss every 2000 mini-batches
    running_loss += loss.item()
    if i % 2000 == 1999:
      print('[%d, %5d] loss: %0.3f' %
           (epoch + 1, i + 1, running_loss/2000))
      # Reset so the next report averages only the next 2000 mini-batches
      running_loss = 0.0
      
print('Finished Training!')

## Step 4: Test the network ##

As with any kind of machine learning, it is always important to test the network on data that it did not see in training. Here, we use our "testing" dataset to check the actual accuracy of the model. 

In [0]:
correct = 0
total = 0
with torch.no_grad():
  for data in testloader:
    images, labels = data
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
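
`torch.max(outputs.data, 1)` in the cell above returns, for each image, the largest score and the index where it occurs; that index is the predicted digit. Here is a plain-Python sketch of the same idea on a batch of three images. The scores and labels are invented for illustration.

```python
def argmax(scores):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda i: scores[i])

# One row of 10 class scores per image (made-up numbers)
batch_scores = [
    [0.1, 2.3, -0.5, 0.0, 0.2, 0.1, -1.0, 0.4, 0.3, 0.0],  # largest at index 1
    [1.5, 0.2, 0.1, 0.0, -0.3, 0.2, 0.1, 3.1, 0.4, 0.9],   # largest at index 7
    [0.0, 0.1, 0.2, 1.9, 0.3, 2.2, 0.1, 0.0, 0.5, 0.4],    # largest at index 5
]
true_labels = [1, 7, 3]

predicted = [argmax(row) for row in batch_scores]
correct = sum(p == t for p, t in zip(predicted, true_labels))
accuracy = 100 * correct / len(true_labels)

print(predicted, f"{accuracy:.1f}%")
```

The third image is a near miss: the true digit 3 has the second-highest score, but only the top score counts toward accuracy.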

## Step 5: Check your results ##

Even a small convolutional neural network with only a few fully connected layers can reach a pretty high accuracy. It is always advisable to check the output of your model to verify that it is working as expected.

We can also check the accuracy for each class that was tested. Did any class have a lower accuracy than the rest?

In [0]:
# Create list to hold accuracy of each class
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

with torch.no_grad():
  for images, labels in testloader:
    outputs = net(images)
    _, predicted = torch.max(outputs, 1)
    c = (predicted == labels)
    # Update accuracy for each class (loop over every image in the batch,
    # so this also works if the batch size is changed)
    for i in range(labels.size(0)):
      label = labels[i].item()
      class_correct[label] += c[i].item()
      class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
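
If one class does score lower, a natural follow-up is to ask which digits it gets mistaken for. One way to look at this (an extension, not part of the original tutorial) is to count (true label, predicted label) pairs for the misclassified images. The sketch below uses made-up labels in plain Python; in the notebook, the pairs would come from `labels` and `predicted` inside the test loop.

```python
from collections import Counter

# Made-up example labels; replace with values collected from the test loop
true_labels      = [4, 9, 4, 7, 9, 9, 4, 1]
predicted_labels = [4, 4, 4, 7, 9, 4, 9, 1]

# Count only the mistakes, keyed by (true digit, predicted digit)
confusions = Counter(
    (t, p) for t, p in zip(true_labels, predicted_labels) if t != p)

for (t, p), count in confusions.most_common():
    print(f"true {t} predicted as {p}: {count} time(s)")
```

In this toy data, 9 is mistaken for 4 more often than the reverse, which is the kind of pattern that a single per-class accuracy number hides.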