# Fully Connected Feed-Forward Neural Network

These notes are from Harrison's tutorial, linked below:
Data - Deep Learning and Neural Networks with Python and Pytorch 
https://youtu.be/i2yPxY2rOzs

pip install torch torchvision

In [4]:
# Load the MNIST training and test sets, then wrap them in DataLoaders that serve batches of 10
import torch
from torchvision import transforms, datasets

train = datasets.MNIST('', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor()
                       ]))

tests = datasets.MNIST('', train=False, download=True,
                      transform=transforms.Compose([
                          transforms.ToTensor()
                      ]))

trainset = torch.utils.data.DataLoader(train, batch_size=10, shuffle=True)
testset = torch.utils.data.DataLoader(tests, batch_size=10, shuffle=False)

In [3]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)
net = Net()
print(net)

Net(
  (fc1): Linear(in_features=784, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=64, bias=True)
  (fc3): Linear(in_features=64, out_features=64, bias=True)
  (fc4): Linear(in_features=64, out_features=10, bias=True)
)


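Our Neural Network class Net contains:

__nn.Linear 4 layers__
Parameter 1 :: input size (in_features)
    For fc1 this is the flattened 28 * 28 pixel image (1x784); for later layers it is the previous layer's output size (64)
Parameter 2 :: output size (out_features)
    64 for the hidden layers; for the final layer fc4 it is the number of output classes (10)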
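
As a quick check of how the two `nn.Linear` parameters map to tensor shapes, here is a minimal sketch. The batch is random dummy data, not real MNIST images, and the variable names are only illustrative.

```python
import torch
import torch.nn as nn

# A single fully connected layer: 784 input features -> 64 output features.
layer = nn.Linear(28 * 28, 64)

# Dummy batch of 10 flattened "images" (random values, an assumption for illustration).
batch = torch.rand(10, 28 * 28)
out = layer(batch)

print(layer.weight.shape)  # torch.Size([64, 784])
print(layer.bias.shape)    # torch.Size([64])
print(out.shape)           # torch.Size([10, 64])
```

The weight matrix is stored as (out_features, in_features), which is why it prints as 64 x 784.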
    
__feed-forward func__
x :: batch of images, each flattened to a 1x784 row vector
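
To make the flattening concrete, here is a small sketch of what `view(-1, 784)` does to a batch. The tensor is random dummy data with the shape a `DataLoader` batch of MNIST would have (batch, channels, height, width).

```python
import torch

# Dummy batch shaped like 10 MNIST images: (batch, channels, height, width).
images = torch.rand(10, 1, 28, 28)

# -1 lets PyTorch infer the batch dimension; each image becomes one row of 784 values.
flat = images.view(-1, 28 * 28)

print(images.shape)  # torch.Size([10, 1, 28, 28])
print(flat.shape)    # torch.Size([10, 784])
```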

__relu activation function__
output :: max(0, input). The weights are applied by the nn.Linear layer before it; relu only replaces negative values with 0
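
ReLU itself keeps positive values and replaces negative ones with zero. A minimal sketch on a hand-picked tensor (the values are arbitrary):

```python
import torch
import torch.nn.functional as F

# Arbitrary example values standing in for the output of a linear layer.
z = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# relu computes max(0, z) element-wise.
a = F.relu(z)

print(a.tolist())  # [0.0, 0.0, 0.0, 0.5, 2.0]
```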

__soft-max eval__
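Softmax is for multi-class problems where each sample belongs to exactly one class. It turns the 10 outputs into probabilities that sum to 1; log_softmax returns the logarithms of those probabilities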
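
The following sketch shows what `F.log_softmax` returns for one made-up row of raw scores: exponentiating the result gives probabilities that sum to 1, and `argmax` picks the most likely class.

```python
import torch
import torch.nn.functional as F

# One sample with three made-up raw scores (logits).
logits = torch.tensor([[2.0, 1.0, 0.1]])

# dim=1 normalizes across the class dimension of each row.
log_probs = F.log_softmax(logits, dim=1)
probs = log_probs.exp()

print(probs.sum(dim=1))                 # sums to 1, up to floating-point error
print(torch.argmax(log_probs, dim=1))   # the largest logit is at index 0
```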

In [22]:
import torch.optim as optim

loss_function = nn.NLLLoss()  # Net already applies log_softmax, so NLLLoss is the matching loss (same as F.nll_loss)
optimizer = optim.Adam(net.parameters(), lr=0.001)

loss_function: calculates "how far off" our classifications are from reality
Optimizer: adjusts the model's trainable parameters, such as the weights, to gradually fit our data
Adam: Adaptive Moment Estimation, a common go-to optimizer
lr: learning rate, the step size of each update; 0.001 (1e-3) is a common starting value
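
Since the network returns `log_softmax` and the training loop uses `F.nll_loss`, it helps to see how that pairing relates to `nn.CrossEntropyLoss`. The sketch below, on random made-up logits, checks that cross-entropy on raw scores equals negative log-likelihood on log-softmax outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up raw scores for 4 samples and 10 classes, plus arbitrary target labels.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 7, 1])

# CrossEntropyLoss expects raw logits and applies log_softmax internally.
ce = nn.CrossEntropyLoss()(logits, targets)

# nll_loss expects log-probabilities, so log_softmax is applied first.
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll))  # True
```

So a model that already ends in `log_softmax` pairs naturally with `nll_loss`, while a model that returns raw logits pairs with `CrossEntropyLoss`.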

In [23]:
for epoch in range(3): # 3 full passes over the data
    for data in trainset:  # `data` is a batch of data
        X, y = data  # X is the batch of features, y is the batch of targets.
        net.zero_grad()  # sets gradients to 0 before loss calc. You will do this likely every step.
        output = net(X.view(-1,784))  # pass in the reshaped batch (recall they are 28x28 atm)
        loss = F.nll_loss(output, y)  # calc and grab the loss value
        loss.backward()  # backpropagate: compute gradients of the loss w.r.t. the network's parameters
        optimizer.step()  # update the weights using those gradients
    print(loss)  # loss of the last batch in this epoch. We hope loss (a measure of wrong-ness) declines!

tensor(0.1320, grad_fn=<NllLossBackward0>)
tensor(0.0460, grad_fn=<NllLossBackward0>)
tensor(0.0327, grad_fn=<NllLossBackward0>)


Grab the features (X) and labels (y) from current batch
Zero the gradients (net.zero_grad)
Pass the data through the network
Calculate the loss
Backpropagate the loss to compute gradients (loss.backward)
Adjust weights in the network with the hopes of decreasing loss (optimizer.step)
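
The reason for zeroing the gradients at every step is that PyTorch adds new gradients to the existing ones on each `backward()` call. Here is a minimal sketch on a tiny made-up layer and random input, not the MNIST network:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)   # tiny stand-in model (hypothetical, for illustration only)
x = torch.rand(5, 3)      # random dummy input

# First backward pass: gradients are stored in .grad.
layer(x).sum().backward()
first = layer.weight.grad.clone()

# Second backward pass without zeroing: gradients are added to the stored ones.
layer(x).sum().backward()
accumulated = layer.weight.grad.clone()

# Zero the gradients, then run backward again: we get a fresh gradient.
layer.zero_grad()
layer(x).sum().backward()
fresh = layer.weight.grad.clone()

print(torch.allclose(accumulated, 2 * first))  # True
print(torch.allclose(fresh, first))            # True
```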

In [24]:
correct = 0
total = 0

with torch.no_grad():
    for data in testset:
        X, y = data
        output = net(X.view(-1,784))
        #print(output)
        for idx, i in enumerate(output):
            #print(torch.argmax(i), y[idx])
            if torch.argmax(i) == y[idx]:
                correct += 1
            total += 1

print("Accuracy: ", round(correct/total, 3))

Accuracy:  0.964
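
The per-sample loop above can also be written without the inner `for` loop. This is a sketch on a hand-made batch of 4 samples and 3 classes (the scores and labels are invented); the same two lines would work on the real `output` and `y` tensors.

```python
import torch

# Invented model outputs for 4 samples and 3 classes, plus invented labels.
output = torch.tensor([[0.1, 0.7, 0.2],
                       [0.8, 0.1, 0.1],
                       [0.2, 0.2, 0.6],
                       [0.3, 0.4, 0.3]])
y = torch.tensor([1, 0, 2, 0])

# argmax over dim=1 gives the predicted class for every row at once.
predictions = torch.argmax(output, dim=1)

# Comparing tensors element-wise and summing counts the matches.
correct = (predictions == y).sum().item()
total = y.size(0)

print(correct, total)  # 3 4
```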
