## Neural Networks

Neural networks can be constructed using the `torch.nn` package.

Now that you have had a glimpse of `autograd`, note that `nn` depends on `autograd` to define models and differentiate them. An `nn.Module` contains layers and a method `forward(input)` that returns the `output`.

For example, look at a simple convolutional neural network that classifies digit images:

![Convnet](../images/mnist.png)

This is a feed-forward network: it takes the input, feeds it through several layers one after another, and finally produces the output.

A typical training procedure for a neural network is as follows:

- Define the neural network that has some learnable parameters (or weights)
- Iterate over a dataset of inputs
- Process input through the network
- Compute the loss *(how far the output is from being correct)*
- Propagate gradients back into the network's parameters
- Update the weights of the network, typically using a simple update rule: `weight = weight - learning_rate * gradient`
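
The steps above can be sketched without any framework. The following is a minimal illustration in plain Python, not PyTorch code: the one-parameter model `output = weight * x`, the toy `dataset`, and the value of `learning_rate` are all assumptions made for the example, and the gradient is derived by hand (in PyTorch, `autograd` would compute it).

```python
# Toy data (hypothetical): each target is exactly 2 * input,
# so the ideal weight is 2.0.
dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

weight = 0.0          # step 1: the model's single learnable parameter
learning_rate = 0.05  # assumed value, small enough for this data to converge

for epoch in range(100):
    for x, target in dataset:                 # step 2: iterate over the dataset
        output = weight * x                   # step 3: process input through the model
        loss = (output - target) ** 2         # step 4: squared-error loss
        gradient = 2 * (output - target) * x  # step 5: d(loss)/d(weight), by hand
        weight = weight - learning_rate * gradient  # step 6: the update rule

print(weight)  # approaches 2.0
```

Each update shrinks the distance between `weight` and 2.0, which is why the loop settles on the value that makes the loss zero. A real network repeats the same six steps with many parameters at once.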

## Define the Network

Let's define this network:

In [2]:
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # in_channels: 1
        # out_channels: 6
        # kernel_size: 5x5 
        # Defaults -> stride: 1, padding: 0, dilation: 1, groups: 1, bias: True
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # dense (or fully connected, or affine) layer
        # y = Wx + b
        # in_features: size of each input sample
        # out_features: size of each output sample
        # bias: learn an additive bias. Default: `True`
        # 16 * 5 * 5 = 16 channels of 5x5 feature maps expected after the second pooling step
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    
    def forward(self, x):
        # Max pooling over a 2x2 window
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten
        x = x.view(-1, self.num_flat_features(x))
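        # Apply relu to the first two fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        # The last layer returns raw class scores (no activation)
        x = self.fc3(x)
        return x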
    
    def num_flat_features(self, x):
        # get all dimensions except for the batch dimension
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

    
# Create an instance of the Net class
net = Net()
print(net)
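
The `16 * 5 * 5` passed to `fc1` follows from the layer arithmetic. The sketch below traces it in plain Python. The helper `conv_out` is hypothetical (it is not a PyTorch function) and applies the standard output-size formula for convolution and pooling. The 32x32 input size is an assumption: the text above does not state it, but it is the size for which these layers yield 5x5 feature maps.

```python
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


size = 32                                  # assumed input height and width
size = conv_out(size, kernel=5)            # conv1, 5x5 kernel: 32 -> 28
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pooling:   28 -> 14
size = conv_out(size, kernel=5)            # conv2, 5x5 kernel: 14 -> 10
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pooling:   10 -> 5

out_channels = 16                          # out_channels of conv2
flat_features = out_channels * size * size
print(size, flat_features)  # 5 400
```

With a 28x28 input the same arithmetic gives 4x4 feature maps and 256 features, which would not match `fc1`, so the input size and the first linear layer have to be chosen together.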

In [14]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        
        self.conv1 = nn.Conv2d(1, 16, 5)
        self.conv2 = nn.Conv2d(16, 32, 5)
        # fully connected layers
        self.fc1 = nn.Linear(32 * 5 * 5, 512)
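        # 512 inputs match fc1's output; 10 outputs (one per digit class) is an assumption
        self.fc2 = nn.Linear(512, 10)
    
    def forward(self, x):
        # Same conv -> relu -> 2x2 max-pool pattern as the first network
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten everything except the batch dimension
        x = x.view(-1, self._flatten(x))
        x = F.relu(self.fc1(x))
        return self.fc2(x)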
    
    def _flatten(self, x):
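        # Returns the number of features per sample (the product of all
        # dimensions except the batch dimension), not a flattened tensor
        size = x.size()[1:]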
        features = 1
        for s in size:
            features *= s
        return features