# 04_Feed-forward_Neural_Networks
In this notebook, we will see how to define simple feed-forward neural networks.

In [17]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

import matplotlib.pyplot as plt  # plt is used for plotting later in the notebook
%matplotlib inline

torch.manual_seed(777)  # reproducibility

## Neural Networks
A typical training procedure for a neural network is as follows:

- Define the neural network that has some learnable parameters (or weights)
- Iterate over a dataset of inputs
- Process input through the network
- Compute the loss (how far the output is from being correct)
- Propagate gradients back into the network’s parameters
- Update the weights of the network, typically using an optimizer.

We will look at all the above processes with a concrete example, MNIST.
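
The steps above can be sketched end to end on synthetic data before moving to MNIST. This is a minimal sketch, not the notebook's training code: the random tensors, the layer sizes, the SGD optimizer, and the names `net`, `inputs`, `targets`, and `losses` are all assumptions chosen so that the block runs without downloading anything.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for a dataset (assumption): 64 random inputs, 10 classes.
inputs = torch.randn(64, 784)
targets = torch.randint(0, 10, (64,))

# 1. Define a network with learnable parameters.
net = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.05)

losses = []
for step in range(20):                   # 2. Iterate over the data (a single batch here).
    outputs = net(inputs)                # 3. Process the input through the network.
    loss = criterion(outputs, targets)   # 4. Compute the loss.
    optimizer.zero_grad()                # Clear gradients left over from the previous step.
    loss.backward()                      # 5. Propagate gradients back to the parameters.
    optimizer.step()                     # 6. Update the weights.
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because the same batch is reused at every step, the loss should fall steadily. With a real dataset, the inner part of the loop would run once per mini-batch.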

### Define the network
First of all, we need a new feed-forward neural network for performing image classification on MNIST.
In PyTorch, you can build your own neural network using the `torch.nn` package:

In [18]:
# Hyper-parameters
input_size = 784
hidden_size = 256
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# Device configuration
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device = torch.device('cpu')

# Fully connected neural network with one hidden layer
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        # Define the operations to use for input processing.
        # torch.nn.Linear(in_features, out_features, bias=True)
        #               : a fully connected (fc) layer mapping in_features -> out_features
        # torch.nn.ReLU(inplace=False): a ReLU activation function
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)  
    
    def forward(self, x):
        # Define the input processing through network
        z1 = self.fc1(x)
        h1 = self.relu(z1)
        out = self.fc2(h1)
        return out

model = NeuralNet(input_size, hidden_size, num_classes).to(device)
print(model)

NeuralNet(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=256, out_features=10, bias=True)
)


You only have to define the `forward` function. The backward pass (where gradients are computed) is derived for you automatically by `autograd`.
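
To make the role of `autograd` concrete, here is a small sketch on a single linear layer. The layer size, the input, and the choice of `out.sum()` as a scalar loss are illustrative assumptions, not part of the MNIST model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 2)          # a tiny layer, used only for illustration
x = torch.randn(3, 4)            # a batch of 3 inputs

print(layer.weight.grad)         # None: no backward pass has run yet

out = layer(x)                   # the forward pass records the computation graph
loss = out.sum()                 # reduce to a scalar so backward() needs no arguments
loss.backward()                  # autograd computes d(loss)/d(parameter)

print(layer.weight.grad.shape)   # torch.Size([2, 4]), the same shape as layer.weight
print(layer.bias.grad)           # each bias receives one unit of gradient per input: 3.0
```

No backward code was written by hand: the gradients in `.grad` come entirely from the operations recorded during the forward pass.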

The architecture of the above `NeuralNet` is as follows:
<img src="images/nn_architecture.png" width="500">

Here, x is the input and y is the target (true label).

The learnable parameters of a model are returned by `model.parameters()`.

In [19]:
params = list(model.parameters())
print(len(params))
print(params[0].size())  # fc1's .weight

4
torch.Size([256, 784])
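
The four entries are the weight and bias of each of the two linear layers. As a sketch, `named_parameters()` shows which tensor is which, and `numel()` gives the total parameter count. The network is rebuilt here with `nn.Sequential` (an assumption made so the block runs on its own), using the same layer sizes as `NeuralNet` above.

```python
import torch.nn as nn

# Same layer sizes as NeuralNet above, rebuilt so this block is self-contained.
net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Print each parameter's name and shape; ReLU has no parameters.
for name, p in net.named_parameters():
    print(name, tuple(p.shape))

# Count all trainable scalars.
total = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(total)  # 784*256 + 256 + 256*10 + 10 = 203530
```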


### Loss function and Optimizer
A loss function takes the (output, target) pair of inputs, and computes a value that estimates how far away the output is from the target.

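There are several different loss functions in the `torch.nn` package.
We use `nn.CrossEntropyLoss()`, which takes raw, unnormalized scores (logits) and integer class indices, and applies log-softmax internally.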

In [21]:
input = torch.randn(1, 784) # a random input, for example
output = model(input) # output: (batch_size, num_classes)
print(output)

target = torch.tensor([0])  # a dummy target, for example. target: (batch_size) where 0 <= each element < num_classes
criterion = nn.CrossEntropyLoss()

loss = criterion(output, target)
print(loss)

tensor([[ 0.0156,  0.0507,  0.2329, -0.0002, -0.1893,  0.0986,  0.1121, -0.2747,
          0.0184, -0.2787]], grad_fn=<ThAddmmBackward>)
tensor(2.2784, grad_fn=<NllLossBackward>)
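
To show what `nn.CrossEntropyLoss()` computes, the sketch below reproduces it by hand with `F.log_softmax`. The random logits and the label values are made-up inputs for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)              # (batch_size, num_classes), raw scores
labels = torch.tensor([0, 3, 9, 1])      # (batch_size,), class indices

ce = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax over classes, pick the true-class entry,
# negate it, and average over the batch.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), labels].mean()

print(ce.item(), manual.item())          # the two values agree
```

This is also why the model's `forward` returns raw scores without a softmax layer. For a model whose outputs are close to uniform over 10 classes, the loss sits near ln(10) ≈ 2.30, which is consistent with the value printed above.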


Furthermore, PyTorch supports several optimizers from `torch.optim`.
We use an Adam optimizer.

In [22]:
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
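
The optimizer only changes the weights when `step()` is called after a backward pass. The sketch below runs one such update on a small stand-in model and measures how far the weights moved. The single `nn.Linear` layer and the random batch are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(784, 10)                       # small stand-in model (assumption)
optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 784)                        # random batch, illustration only
y = torch.randint(0, 10, (8,))

before = net.weight.detach().clone()           # snapshot of the weights

optimizer.zero_grad()                          # reset accumulated gradients
loss = criterion(net(x), y)
loss.backward()                                # fill .grad for every parameter
optimizer.step()                               # apply the Adam update in place

max_change = (net.weight.detach() - before).abs().max().item()
print(max_change)                              # largest change to any single weight
```

On its first step, Adam moves each weight by at most about the learning rate, largely independent of the gradient's magnitude, so `max_change` should come out close to 0.001.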

### DataLoader

In [23]:
# MNIST dataset 
train_dataset = torchvision.datasets.MNIST(root='./data', 
                                           train=True, 
                                           transform=transforms.ToTensor(),  
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='./data', 
                                          train=False, 
                                          transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

# Plot one example
print(train_dataset.data.size())                     # (60000, 28, 28)
print(train_dataset.targets.size())                  # (60000)

idx = 0
plt.imshow(train_dataset.data[idx, :, :].numpy(), cmap='gray')
plt.title('%i' % train_dataset.targets[idx].item())
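
With `transforms.ToTensor()`, each MNIST image is a tensor of shape (1, 28, 28), while the first linear layer expects vectors of length 784, so each batch has to be flattened before it is passed to the model. The sketch below shows this reshaping on a synthetic stand-in dataset. The random `images` and `labels` and the dataset size of 250 are assumptions, used so that nothing needs to be downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for MNIST (assumption): 250 random 1x28x28 "images", labels in [0, 10).
images = torch.rand(250, 1, 28, 28)
labels = torch.randint(0, 10, (250,))
loader = DataLoader(TensorDataset(images, labels), batch_size=100, shuffle=True)

for batch_images, batch_labels in loader:
    # Flatten (batch, 1, 28, 28) -> (batch, 784) to match the model's input_size.
    flat = batch_images.reshape(-1, 28 * 28)
    print(tuple(batch_images.shape), "->", tuple(flat.shape), tuple(batch_labels.shape))
```

The last batch holds only 50 samples because 250 is not a multiple of the batch size. Using `-1` in `reshape` handles this without special-casing.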
