<a href="https://colab.research.google.com/github/JonathanKernaghan/JupyterNotebook/blob/main/BasicFNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Feedforward Artificial Neural Network (Basic)**

Building a basic feedforward neural network takes 7 steps:

1) Import Libraries & Set Device (GPU if Available)

In [8]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

2) Define Hyperparameters

In [9]:
batch_size = 100
input_size = 784
hidden_size = 100
num_classes = 10
num_epochs = 2
learning_rate = 0.001

3) Define datasets & loaders

In [None]:
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)
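
To make the loader output concrete, here is a minimal sketch of what one batch looks like and how it is flattened before it reaches the model. It uses a small synthetic dataset (random pixels and labels, an assumption made so that nothing needs downloading) in place of MNIST; `fake_images`, `fake_labels` and `fake_loader` are illustrative names only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST (assumption): 250 random 1x28x28 "images" with random labels
fake_images = torch.rand(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=100, shuffle=False)

batch_shapes = []
for images, labels in fake_loader:
    # Same flattening as the training loop: (N, 1, 28, 28) -> (N, 784)
    flat = images.reshape(-1, 28 * 28)
    batch_shapes.append((tuple(images.shape), tuple(flat.shape), tuple(labels.shape)))

print(len(fake_loader))   # number of batches per pass
print(batch_shapes[0])    # a full batch
print(batch_shapes[-1])   # the last batch is smaller when the dataset size is not a multiple of batch_size
```

With the real 60,000-image MNIST training set and `batch_size = 100`, every batch is full and one epoch is 600 steps.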

4) Define Model

In [11]:
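# Move the model to the same device as the input batches; otherwise training fails on GPU
model = nn.Sequential(
        nn.Linear(input_size, hidden_size),
        nn.ReLU(),
        nn.Linear(hidden_size, num_classes)
).to(device)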
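
As a quick sanity check on the size of this architecture, the sketch below counts its trainable parameters. Each `nn.Linear` layer holds a weight matrix plus a bias vector; `sketch_model` is an illustrative copy of the model above, not a replacement for it.

```python
import torch.nn as nn

input_size, hidden_size, num_classes = 784, 100, 10

# Illustrative copy of the notebook's model
sketch_model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, num_classes),
)

# Count every trainable value in the model
n_params = sum(p.numel() for p in sketch_model.parameters())

# Weights + biases of the hidden layer, then of the output layer
expected = (input_size * hidden_size + hidden_size) + (hidden_size * num_classes + num_classes)

print(n_params, expected)
```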

5) Define Loss Function and Optimiser

In [12]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
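
Note that the model ends in a plain `nn.Linear` layer with no softmax. That works because `nn.CrossEntropyLoss` expects raw scores (logits) and applies log-softmax internally. The sketch below checks this on random toy logits; the tensors are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # toy batch: 4 samples, 10 class scores each
targets = torch.tensor([3, 0, 9, 1])   # toy class labels

# Built-in loss applied directly to the logits
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# Manual equivalent: log-softmax, then the mean negative log-probability of the true class
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(4), targets].mean()

print(loss_ce.item(), loss_manual.item())
```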

6) Training Loop

In [None]:
n_total_steps = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
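        # Flatten each 1x28x28 image into a 784-element vector and move the batch to the device
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        # Forward pass: compute the class scores (logits) and the loss
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass: clear old gradients, backpropagate, then update the weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()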

        if (i+1) % 100 == 0:
            print(f'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.4f}')
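
To see what a single `optimizer.step()` does in the loop above, here is a minimal sketch on a tiny, hypothetical linear model (`tiny`, with toy data). For plain SGD without momentum, each parameter should move by the learning rate times its gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
tiny = nn.Linear(3, 2)                       # hypothetical toy model
opt = torch.optim.SGD(tiny.parameters(), lr=0.1)
x = torch.randn(5, 3)                        # toy inputs
y = torch.tensor([0, 1, 1, 0, 1])            # toy labels

loss = nn.CrossEntropyLoss()(tiny(x), y)
opt.zero_grad()                              # clear gradients left over from earlier steps
loss.backward()                              # fill .grad for every parameter

before = tiny.weight.detach().clone()
grad = tiny.weight.grad.detach().clone()
opt.step()                                   # apply the update
after = tiny.weight.detach().clone()

# Plain SGD update rule: w_new = w_old - lr * grad
matches = torch.allclose(after, before - 0.1 * grad)
print(matches)
```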

7) Testing Loop

In [None]:
with torch.no_grad():
    n_correct = 0
    n_samples = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)
        outputs = model(images)

        # torch.max returns (values, indices); the index of the largest score is the predicted class
        _, predictions = torch.max(outputs, 1)
        n_samples += labels.shape[0]
        n_correct += (predictions == labels).sum().item()

    acc = 100.0 * n_correct / n_samples
    print(f'accuracy = {acc}')
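
The accuracy calculation can be traced by hand on a toy example. The sketch below uses made-up scores for 4 samples and 3 classes, and also shows that `torch.max(outputs, 1)` yields the same predictions as `argmax`.

```python
import torch

# Made-up class scores for 4 samples and 3 classes
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3],
                        [0.0, 0.1, 3.0],
                        [2.0, 2.5, 0.0]])
labels = torch.tensor([1, 0, 2, 0])

# Index of the largest score per row is the predicted class
_, predictions = torch.max(outputs, 1)
same_as_argmax = torch.equal(predictions, outputs.argmax(dim=1))

# Three of the four predictions match the labels
acc = 100.0 * (predictions == labels).sum().item() / labels.shape[0]
print(predictions.tolist(), acc)  # [1, 0, 2, 1] 75.0
```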

***The basic moving parts of a simple Feedforward Neural Network:***

- The **architecture** of the model
  - Number of hidden layers i.e. the depth
  - Number of neurons in hidden layers i.e. the width
  - Activation function used for neurons, e.g. Sigmoid, ReLU, PReLU
- **Loss function** used, e.g. MSE, Cross Entropy Loss
- **Optimization algorithm** used, e.g. SGD, SGD w/ Momentum
- **Learning rate**, e.g. 0.001
- **Epochs**, i.e. the number of full passes through the training set
- **Batch size**
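
The parts listed above can be exposed as arguments so they are easy to vary. The following is a minimal sketch, not a tuned configuration: `build_mlp` is a hypothetical helper, and the hidden sizes, learning rate and momentum values are arbitrary illustrations.

```python
import torch
import torch.nn as nn

def build_mlp(input_size, hidden_sizes, num_classes, activation=nn.ReLU):
    """Build a feedforward net: depth = len(hidden_sizes), width = each entry."""
    layers = []
    in_features = input_size
    for width in hidden_sizes:
        layers.append(nn.Linear(in_features, width))
        layers.append(activation())   # new activation module per hidden layer
        in_features = width
    layers.append(nn.Linear(in_features, num_classes))  # output layer, raw logits
    return nn.Sequential(*layers)

# Two hidden layers with PReLU activations (illustrative values only)
wider_deeper = build_mlp(784, [256, 128], 10, activation=nn.PReLU)

# SGD with momentum, one of the optimizer options named in the list
momentum_opt = torch.optim.SGD(wider_deeper.parameters(), lr=0.01, momentum=0.9)

out = wider_deeper(torch.rand(4, 784))
print(out.shape)
```

Passing the activation as a class rather than an instance gives each hidden layer its own module, which matters for activations with learnable parameters such as PReLU.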

The network above is by no means optimal and does not score well on MNIST, most likely because plain SGD with a small learning rate (0.001) has only 2 epochs to train. Tuning the components listed above should raise the accuracy considerably. Other techniques help too, such as data augmentation, which jitters the digits to different angles and positions to create more samples for the training set.

A good post explaining the additional techniques and special features:

[How to score 97%, 98%, 99%, and 100% in MNIST](https://www.kaggle.com/c/digit-recognizer/discussion/61480)

I will create an optimised feedforward model in another notebook to see how much accuracy I can gain with this non-convolutional, fully connected architecture.