<a href="https://colab.research.google.com/github/DanielOlson/CompBioAsia/blob/main/CompBioAsia_nn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# REAL Deep Learning
We are now ready to do real deep learning. Here you're going to build your own models to try to solve real problems. I'll take care of the optimization code so you can focus on the fun stuff. Your goal is to get the highest test accuracy you can!

## MNIST
The first thing we're going to try to learn is handwritten digits. First we'll download the MNIST dataset (our dataset of handwritten digits). Then we'll build a model and train that model on MNIST. Finally we'll test our model and try to figure out how we can make it better.

#### Setup

---

Load libraries, download data, build some helper functions

In [None]:
import torch # The big pytorch library
from torch import nn # Pytorch's neural network library
import torch.nn.functional as F # Some useful helper functions

# Torchvision will let us load up the MNIST dataset EZ PZ
from torchvision import datasets, transforms 

import matplotlib.pyplot as plt # Matplotlib for visualization


In [None]:
# This cell grabs MNIST and loads it into tensors for us

transform = transforms.Compose([
                                transforms.ToTensor(),
                                transforms.Normalize(0.1307, 0.3081)
])

batch_size = 128

train_data = datasets.MNIST('../data', train=True, download=True,
                       transform=transform)
test_data = datasets.MNIST('../data', train=False, download=True,
                       transform=transform)


train_loader = torch.utils.data.DataLoader(train_data, batch_size=len(train_data), shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=len(test_data), shuffle=True)

# The loader's single batch holds the whole training set; use the built-in next()
x_train, y_train = next(iter(train_loader))
x_train = x_train.squeeze()
y_train = F.one_hot(y_train)
y_train = y_train.float()

x_small = x_train[0:500]
y_small = y_train[0:500]
#y_small = F.one_hot(y_small)
y_small = y_small.float()

# Same trick for the test set: one batch containing every test image
x_test, y_test = next(iter(test_loader))
x_test = x_test.squeeze()
y_test = F.one_hot(y_test)
y_test = y_test.float()


print("x_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)

print("x_test shape:", x_test.shape)
print("y_test shape:", y_test.shape)

In [None]:
# This cell has a bunch of helper functions

def train_with_data(x, y, 
                    model, batch_size, 
                    steps, learning_rate, 
                    loss_function, checkin=100):
  dev = 'cpu'
  if torch.cuda.is_available():
    dev = 'cuda:0'

  model.to(dev)
  optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

  for step in range(steps):
    batch_i = torch.randint(0, len(x), (batch_size,))
    batch_x = x[batch_i].to(dev)
    batch_y = y[batch_i].to(dev)

    out = model(batch_x)
    loss = loss_function(out, batch_y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    if step % checkin == 0:
      print('progress: {:.2%} '.format(step / steps), 'loss:', float(loss))

  model.to('cpu')

# This function just tests our accuracy
def test_accuracy(x, y, model, show_failures=0):
  dev = 'cpu'
  if torch.cuda.is_available():
    dev = 'cuda:0'

  model.to(dev)
  x = x.to(dev)
  y = y.to(dev)

  result = 0
  with torch.no_grad():
    out = model(x)

    y = torch.argmax(y, dim=-1)
    out = torch.argmax(out, dim=-1)
    
    result = out == y
    result = result.float()
    result = torch.mean(result, dim=-1)

    if show_failures > 0:
      failures = out != y

      failure_out = out[failures]
      failure_y = y[failures]

      failure_images = x[failures]
      
      random_failures = torch.randint(0, len(failure_images), (show_failures,))

      for f in random_failures:
        print("Target:", int(failure_y[f]), "Predicted:", int(failure_out[f]))
        plt.imshow(failure_images[f].to('cpu'))
        plt.show()

  model.to('cpu')

  return float(result)

#### Digits and recognizing them!

---

The interesting code starts here!

In [None]:
# Let's have a look at some of these handwritten digits!

figure = plt.figure()
num_of_images = 100
for index in range(1, num_of_images + 1):
    plt.subplot(10, 10, index)
    plt.axis('off')
    plt.imshow(x_train[index], cmap='gray_r')
plt.show()

#### Training on the entire training set

---

It's finally time! The placeholder model I've made is awful. Real bad. It connects 784 inputs to a single neuron, and then connects that neuron to 10 outputs. The first layer of the model needs to stay there (nn.Flatten()), and the last layer of the model should stay there too (nn.Softmax()). Everything else can change! Some ways you might try to make improvements:



1.   Making the network deeper (more layers).
2.   Making the network wider (more neurons).
3.   Using different activation functions (replacing nn.Tanh()) such as nn.Sigmoid(), nn.ReLU(), or nn.ELU(). A complete list of activation functions can be found at https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity.
4.   Trying bigger and smaller batch sizes.
5.   Trying more or fewer steps.

Feel free to change any of the code in the cell, and also feel free to use any of the tools in the nn library: https://pytorch.org/docs/stable/nn.html
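
As one way to combine the first three ideas, here is a minimal sketch of a wider, deeper network. The name `sketch_model`, the hidden sizes (128 and 64), and the choice of nn.ReLU() are arbitrary assumptions for illustration, not tuned or recommended values. The random batch is only a stand-in for real images so we can check the shapes.

```python
import torch
from torch import nn

# A sketch of a deeper, wider model. The hidden sizes (128, 64) and
# nn.ReLU() are illustrative guesses, not tuned values.
sketch_model = nn.Sequential(nn.Flatten(),           # 28x28 image -> 784 values
                             nn.Linear(28*28, 128),  # wider first hidden layer
                             nn.ReLU(),
                             nn.Linear(128, 64),     # an extra hidden layer
                             nn.ReLU(),
                             nn.Linear(64, 10),      # one output per digit
                             nn.Softmax(dim=-1))     # each row of outputs sums to 1

# Sanity check with a fake batch of 4 "images" shaped like x_train
fake_batch = torch.randn(4, 28, 28)
sketch_out = sketch_model(fake_batch)
print("output shape:", sketch_out.shape)

# Count trainable parameters to see how much bigger the model got
n_params = sum(p.numel() for p in sketch_model.parameters())
print("parameters:", n_params)
```

To try something like this for real, put the layers into `model` in the cell below and compare the ending accuracy against the placeholder.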

In [None]:

batch_size = 32 # How many images we train on at every step
steps = 20000 # How many total steps we will train for
learning_rate = 0.001 # How big a step the optimizer takes when updating the weights
loss_function = nn.MSELoss() # Which loss function we're using
checkin = int(steps / 5) # How often we print our loss (smaller = more frequently)

# Change this model to try to improve results
model = nn.Sequential(nn.Flatten(), # <--- the model needs to begin with this
                      nn.Linear(28*28, 1), 
                      nn.Tanh(),
                      nn.Linear(1, 10), 
                      nn.Softmax(dim=-1)) # <--- the model needs to end with this

# Here we get a starting testing accuracy
acc = test_accuracy(x_test, y_test, model)
print("Starting accuracy: {:.2%}".format(acc))

# Train our model with our train_with_data helper function
train_with_data(x_train, y_train,
                model, batch_size, steps,
                learning_rate, loss_function, checkin=checkin)

# Check the ending testing accuracy
acc = test_accuracy(x_test, y_test, model)
print("Ending accuracy: {:.2%}".format(acc))

# Show some random failed predictions
_ = test_accuracy(x_test, y_test, model, show_failures=1) 
# Change this to change how many failures are shown    ^

#### Training on a small training set

---

Before, we were training on a 'uuuge dataset. Well, maybe not ginormous, but it was a whole 60,000 images: the entire MNIST training set.

Deep learning becomes way harder when we have a smaller dataset to work with. Below we're going to try to build a model that can learn using only 500 images. Feel free to change anything in the cell below.
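
One tool from the nn library that people often try when data is scarce is nn.Dropout, which randomly zeroes activations during training. The sketch below is an assumption on my part, not the intended solution: the name `dropout_model`, the dropout rate of 0.5, and the hidden size of 64 are all arbitrary, and whether it actually helps with 500 images is something you'd have to test. Note that the helper functions above never switch modes themselves, so with dropout in the model you would call `model.train()` before training and `model.eval()` before measuring accuracy.

```python
import torch
from torch import nn

# Hypothetical small-data model: the dropout rate and hidden size are guesses.
dropout_model = nn.Sequential(nn.Flatten(),
                              nn.Linear(28*28, 64),
                              nn.ReLU(),
                              nn.Dropout(p=0.5),  # only active in train mode
                              nn.Linear(64, 10),
                              nn.Softmax(dim=-1))

fake_batch = torch.randn(4, 28, 28)  # stand-in for a few MNIST images

# In eval mode dropout is switched off, so repeated passes agree
dropout_model.eval()
with torch.no_grad():
  eval_a = dropout_model(fake_batch)
  eval_b = dropout_model(fake_batch)
print("eval passes match:", torch.equal(eval_a, eval_b))

# Switch back before training so dropout is active again
dropout_model.train()
print("training mode:", dropout_model.training)
```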

In [None]:

batch_size = 32 # How many images we train on at every step
steps = 20000 # How many total steps we will train for
learning_rate = 0.001 # How big a step the optimizer takes when updating the weights
loss_function = nn.MSELoss() # Which loss function we're using
checkin = int(steps / 5) # How often we print our loss (smaller = more frequently)

# Change this model to try to improve results
model = nn.Sequential(nn.Flatten(), # <--- the model needs to begin with this
                      nn.Linear(28*28, 1), 
                      nn.Tanh(),
                      nn.Linear(1, 10), 
                      nn.Softmax(dim=-1)) # <--- the model needs to end with this

# Here we get a starting testing accuracy
acc = test_accuracy(x_test, y_test, model)
print("Starting accuracy: {:.2%}".format(acc))

# Train our model with our train_with_data helper function
train_with_data(x_small, y_small,
                model, batch_size, steps,
                learning_rate, loss_function, checkin=checkin)

# Check the ending testing accuracy
acc = test_accuracy(x_test, y_test, model)
print("Ending accuracy: {:.2%}".format(acc))

# Show some random failed predictions
_ = test_accuracy(x_test, y_test, model, show_failures=1) 
# Change this to change how many failures are shown    ^