## Adding Cost Function and Optimization to Neural Networks

In this exercise, you will add a cost function and an optimizer to the neural network you built in the previous exercise. You will need to work out which cost function and optimizer suit your network architecture. The steps are:

1. Complete the `create_model()` function. You can either create a new network or reuse the one you built in the previous exercise.
2. Add your cost function and optimizer.

**Note**: It may take 5-10 minutes to download the datasets.

In case you get stuck, you can look at the solution below.

### Try It Out!
- Change the parameters of your optimizer and your network (for example, the learning rate or the layer sizes). How does your model's accuracy change? These values are called hyperparameters, and they can change the performance of your model. In a later lesson, we will learn how to automatically search for hyperparameters that give the best results.

### Import Libraries

In [None]:
import numpy as np
import torch
from torchvision import datasets, transforms
from torch import nn, optim

### Download and load data

#### Preprocess data
Here we are not yet applying any transformations; we are only instantiating the transformation functions that will later be applied to the datasets.

In [None]:
# Instantiate the preprocessing/transformation functions
# Data augmentation is applied to the training dataset only
training_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
    ])

testing_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
    ])
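
To make the `Normalize` step more concrete, here is a minimal sketch of the arithmetic it applies to each pixel of a single-channel image. The tiny `image` tensor is made up for illustration; only the mean and standard deviation values come from the cell above.

```python
import torch

# Hypothetical 1x2x2 "image" with values in [0, 1], the range ToTensor() produces
image = torch.tensor([[[0.0, 0.1307], [0.5, 1.0]]])

# The values passed to transforms.Normalize in the cell above
mean, std = 0.1307, 0.3081

# Normalize computes (x - mean) / std for every pixel of the channel
normalized = (image - mean) / std

print(normalized)
```

A pixel equal to the mean maps to roughly 0, and brighter pixels map to positive values measured in units of the standard deviation.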

In [None]:
# Download and load data
batch_size = 64

trainset = datasets.MNIST('data/', download=True, train=True, transform=training_transform)
testset = datasets.MNIST('data/', download=True, train=False, transform=testing_transform)

train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
# Shuffling is not needed for evaluation, since accuracy does not depend on sample order
test_loader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False)
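
The sketch below shows what a loader like the ones above yields per batch, and why the network's `input_size` is 784. It uses random stand-in tensors with MNIST's image shape rather than the real dataset, so it runs without a download; the `fake_*` names are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with MNIST's image shape: 256 random 1x28x28 "images" and labels 0-9
fake_images = torch.randn(256, 1, 28, 28)
fake_labels = torch.randint(0, 10, (256,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=64, shuffle=True)

# Each iteration yields one batch of inputs and targets
data, target = next(iter(fake_loader))
print(data.shape)    # torch.Size([64, 1, 28, 28])
print(target.shape)  # torch.Size([64])

# Flatten each image into a 784-element vector (1 * 28 * 28), as the training loop does
flat = data.view(data.shape[0], -1)
print(flat.shape)    # torch.Size([64, 784])
```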

### Create and Instantiate a NN model

In [None]:
def create_model():
    '''Creates a PyTorch NN using the Sequential API'''

    input_size = 784
    output_size = 10

    model = nn.Sequential(nn.Linear(input_size, 128),
                          nn.ReLU(),
                          nn.Linear(128, 64),
                          nn.ReLU(),
                          nn.Linear(64, output_size),
                          nn.LogSoftmax(dim=1))
    
    return model

In [None]:
model = create_model()
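
A quick sanity check can confirm what the network outputs before any training. This sketch rebuilds the same architecture under the hypothetical name `sketch_model` so it runs on its own, and feeds it a random batch in place of real images.

```python
import torch
from torch import nn

# Same layers as create_model() above, rebuilt so the snippet is self-contained
sketch_model = nn.Sequential(nn.Linear(784, 128),
                             nn.ReLU(),
                             nn.Linear(128, 64),
                             nn.ReLU(),
                             nn.Linear(64, 10),
                             nn.LogSoftmax(dim=1))

# Hypothetical batch of 8 flattened 28x28 images
dummy_batch = torch.randn(8, 784)
log_probs = sketch_model(dummy_batch)

print(log_probs.shape)             # torch.Size([8, 10])
# Exponentiating the log-probabilities gives probabilities that sum to ~1 per row
print(log_probs.exp().sum(dim=1))
```

Because the last layer is `LogSoftmax`, the outputs are log-probabilities rather than raw scores, which matters when choosing the cost function.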

### Train the model

In [None]:
# Define training loop
def train(model, train_loader, cost, optimizer, epoch):
    model.train()
    for e in range(epoch):
        running_loss = 0
        correct = 0
        # 1. Loop through data
        for data, target in train_loader:
            # Reshape data
            data = data.view(data.shape[0], -1)
            # 2. Zero the gradients
            # Before the backward pass, use the optimizer object to zero all of the
            # gradients for the variables it will update (the learnable weights of
            # the model). By default, gradients are accumulated in buffers (i.e., not
            # overwritten) whenever .backward() is called.
            optimizer.zero_grad()
            # 3. Forward pass
            pred = model(data)
            # 4. Compute loss
            loss = cost(pred, target)
            # Accumulate as a Python float so the autograd graph is not kept alive
            running_loss += loss.item()
            # 5. Backpropagation: compute gradient of the loss with respect to model parameters
            loss.backward()
            # 6. Update the weights using the optimizer
            optimizer.step()
            # Get the index of the max log-probability
            pred = pred.argmax(dim=1, keepdim=True)
            # Count the number of correct predictions
            correct += pred.eq(target.view_as(pred)).sum().item()
        # Average over batches, assuming the cost function returns a per-batch mean (the default reduction)
        print(f"Epoch {e}: Loss {running_loss/len(train_loader)}, Accuracy {100*(correct/len(train_loader.dataset))}%")

# Define testing loop
def test(model, test_loader):
    # Set the model to evaluation mode
    model.eval()
    correct = 0
    with torch.no_grad(): # Disable gradient calculation
        # Loop through data in batches
        for data, target in test_loader:
            data = data.view(data.shape[0], -1) # Reshape data
            output = model(data) 
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    print(f'Test set: Accuracy: {correct}/{len(test_loader.dataset)} = {100*(correct/len(test_loader.dataset))}%')
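
The reason for calling `optimizer.zero_grad()` on every iteration can be seen on a two-element tensor. This is a minimal sketch using a hypothetical parameter `w`, not part of the model above.

```python
import torch

# Hypothetical parameter tensor that tracks gradients
w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w
(w ** 2).sum().backward()
first = w.grad.clone()   # tensor([2., 4.])

# Second backward pass without zeroing: the new gradient is added to the old one
(w ** 2).sum().backward()
second = w.grad.clone()  # tensor([4., 8.])

# Zero the gradient in place, then run backward again: back to the true gradient
w.grad.zero_()
(w ** 2).sum().backward()
third = w.grad.clone()   # tensor([2., 4.])

print(first, second, third)
```

Without the reset, each update step would use the sum of all gradients computed so far instead of the gradient for the current batch.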

In [None]:
# Set the cost function and optimizer
loss_fn = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Set training hyperparameters
epochs = 10

train(model,
      train_loader,
      loss_fn,
      optimizer,
      epochs)
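
`nn.NLLLoss` is paired with this network because its final layer is `LogSoftmax`. As a small check on random stand-in data, the sketch below shows that `LogSoftmax` followed by `NLLLoss` gives the same value as `nn.CrossEntropyLoss` applied to raw scores. The `logits` and `targets` tensors are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical raw scores for 4 samples over 10 classes, plus their class labels
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 7, 1])

# Option 1: log-probabilities from LogSoftmax, scored with NLLLoss (as in this notebook)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)

# Option 2: raw scores scored directly with CrossEntropyLoss
ce = nn.CrossEntropyLoss()(logits, targets)

print(torch.allclose(nll, ce))  # True
```

So if you build a network that ends in a plain `nn.Linear` layer without `LogSoftmax`, `nn.CrossEntropyLoss` would be the matching choice; using it on top of `LogSoftmax` would apply the log-softmax twice.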

In [None]:

test(model, test_loader)
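
To experiment with the hyperparameters mentioned under Try It Out, one option is to train a fresh model for each candidate value and compare the results. The sketch below does this for three learning rates. It assumes a much smaller network and a single random batch as stand-ins so that it runs in seconds; the names `small_model`, `trial_model`, and `results` are hypothetical, and the numbers it prints say nothing about MNIST.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Synthetic stand-in for one batch of flattened images and labels (not real MNIST)
data = torch.randn(64, 784)
target = torch.randint(0, 10, (64,))

def small_model():
    # Hypothetical smaller network so the sketch runs quickly
    return nn.Sequential(nn.Linear(784, 32),
                         nn.ReLU(),
                         nn.Linear(32, 10),
                         nn.LogSoftmax(dim=1))

results = {}
for lr in (0.1, 0.01, 0.001):
    # Start from a fresh model and optimizer for every learning rate
    trial_model = small_model()
    trial_optimizer = optim.Adam(trial_model.parameters(), lr=lr)
    trial_loss_fn = nn.NLLLoss()
    for _ in range(20):
        trial_optimizer.zero_grad()
        loss = trial_loss_fn(trial_model(data), target)
        loss.backward()
        trial_optimizer.step()
    # Record the loss from the last training step
    results[lr] = loss.item()

for lr, final_loss in results.items():
    print(f"lr={lr}: final training loss {final_loss:.4f}")
```

With the real notebook, the same loop could call `create_model()`, `train()`, and `test()` instead, at the cost of a full training run per value.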