# Neural Networks
Building a neural network using the `torch.nn` package.\
\
A typical training procedure for a neural network is as follows:
1. Define the neural network that has some learnable parameters (weights and biases)
2. Iterate over a dataset of inputs
3. Process input through the network (forward pass)
4. Compute the loss (how far the output is from the labels)
5. Propagate gradients back into the network's parameters (backward pass)
6. Update the weights of the network, typically using a simple update rule: `weight` = `weight` - `learning_rate` * `gradient`
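
The six steps above can be sketched end to end on a toy problem. This is a minimal illustration, not the network built later in this notebook: the `nn.Linear` model, the made-up dataset (`inputs`, `labels`), the batch size, and the learning rate are all assumptions chosen so the loop runs quickly.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible illustration

# 1. Define a network with learnable parameters (hypothetical toy model).
model = nn.Linear(3, 1)

# 2. A made-up dataset: 8 samples whose label is the sum of the 3 inputs.
inputs = torch.randn(8, 3)
labels = inputs.sum(dim=1, keepdim=True)

learning_rate = 0.05  # assumed value, small enough to keep the updates stable
losses = []
for epoch in range(50):
    # Iterate over the dataset in mini-batches of 4 samples.
    for x, y in zip(inputs.split(4), labels.split(4)):
        output = model(x)                         # 3. forward pass
        loss = nn.functional.mse_loss(output, y)  # 4. compute the loss
        model.zero_grad()                         # clear gradients from the previous step
        loss.backward()                           # 5. backward pass
        with torch.no_grad():                     # 6. weight = weight - learning_rate * gradient
            for p in model.parameters():
                p -= learning_rate * p.grad
        losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The update runs under `torch.no_grad()` so that the in-place change to the parameters is not itself recorded by autograd.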

## Define the neural network

In [1]:
# import libraries
import torch
import torch.nn as nn
import torch.nn.functional as F

In [2]:
# build model
class Net(nn.Module):
    
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120) # 5x5 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        
    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        
        return x

net = Net()
print(net)

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


In [3]:
# the learnable parameters of a model are returned by net.parameters()
parameters = list(net.parameters())

# shape of the first parameter (conv1's weight)
print(parameters[0].size())

torch.Size([6, 1, 5, 5])


In [4]:
# try random 32x32 input
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

tensor([[ 0.0579,  0.1460,  0.0127,  0.1587,  0.0546,  0.0322,  0.0606, -0.1081,
          0.0075, -0.1608]], grad_fn=<AddmmBackward0>)
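
The `16 * 5 * 5` input size of `fc1` follows from how each layer changes the spatial size of a 32x32 input. The sketch below traces those shapes with standalone layers that mirror `conv1` and `conv2`; the `shapes` list is only a helper for this illustration, and ReLU is left out because it does not change shapes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Standalone layers with the same settings as conv1 and conv2 in Net.
conv1 = nn.Conv2d(1, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(1, 1, 32, 32)
shapes = [tuple(x.shape)]        # (1, 1, 32, 32)

x = conv1(x)                     # 5x5 kernel, no padding: 32 - 5 + 1 = 28
shapes.append(tuple(x.shape))    # (1, 6, 28, 28)

x = F.max_pool2d(x, 2)           # 2x2 pooling halves each side: 28 / 2 = 14
shapes.append(tuple(x.shape))    # (1, 6, 14, 14)

x = conv2(x)                     # 14 - 5 + 1 = 10
shapes.append(tuple(x.shape))    # (1, 16, 10, 10)

x = F.max_pool2d(x, 2)           # 10 / 2 = 5
shapes.append(tuple(x.shape))    # (1, 16, 5, 5)

x = torch.flatten(x, 1)          # 16 * 5 * 5 = 400 features per sample
shapes.append(tuple(x.shape))    # (1, 400)

for s in shapes:
    print(s)
```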


**Recap:**
- `torch.Tensor` - A multi-dimensional array with support for autograd operations
- `nn.Module` - Neural network module. Convenient way of encapsulating parameters, with helpers for moving them to GPU, exporting, loading, etc.
- `nn.Parameter` - A kind of Tensor, that is automatically registered as a parameter when assigned as an attribute to a `Module`.
- `autograd.Function` - Implements forward and backward definitions of an autograd operation.
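
Two of the recap items can be illustrated with a short sketch. The `Scale` module and the `Square` function are hypothetical names invented for this example: the first shows that only `nn.Parameter` attributes are registered as parameters, the second shows a hand-written forward and backward definition.

```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    """Hypothetical module used only to show parameter registration."""
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))  # registered as a parameter
        self.plain = torch.ones(3)                 # plain tensor: not registered

class Square(torch.autograd.Function):
    """Hypothetical autograd operation computing y = x^2."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)   # keep x for the backward pass
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output  # dy/dx = 2x, chained with the incoming gradient

names = [name for name, _ in Scale().named_parameters()]

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
Square.apply(x).sum().backward()

print(names)   # ['weight']
print(x.grad)  # tensor([2., 4., 6.])
```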

## Compute Loss

In [5]:
output = net(input)
target = torch.randn(10) # a dummy target
target = target.view(1, -1)
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)

tensor(0.3354, grad_fn=<MseLossBackward0>)
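
With its default settings, `nn.MSELoss` averages the squared differences over all elements. The sketch below checks that against a manual computation; the random `output` tensor is an assumed stand-in for `net(input)` so the example runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
output = torch.randn(1, 10, requires_grad=True)  # stand-in for net(input)
target = torch.randn(1, 10)                      # dummy target, as in the cell above

loss = nn.MSELoss()(output, target)
manual = ((output - target) ** 2).mean()  # mean of the squared errors

print(loss.item(), manual.item())
print(type(loss.grad_fn).__name__)  # the autograd node that produced the loss
```

Because `loss` carries a `grad_fn`, calling `loss.backward()` can walk the graph back to every tensor that requires gradients.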


## Backprop

In [6]:
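net.zero_grad() # zero the gradient buffers of all parameters

print(net.conv1.bias.grad) # None: no gradient has been computed yet

loss.backward() # backpropagate the loss computed above

print(net.conv1.bias.grad) # gradient of the loss w.r.t. conv1's bias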

None
tensor([-0.0106,  0.0033, -0.0070, -0.0204, -0.0194,  0.0068])
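
Gradient buffers are zeroed before backpropagation because `backward()` adds to an existing `.grad` instead of overwriting it. A minimal sketch with a single hypothetical tensor `w`:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()    # tensor([3., 3.])

(w * 3).sum().backward()  # a new graph; its gradient is added to w.grad
second = w.grad.clone()   # tensor([6., 6.])

w.grad.zero_()            # reset the buffer, as net.zero_grad() does for every parameter
(w * 3).sum().backward()
third = w.grad.clone()    # tensor([3., 3.])

print(first, second, third)
```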


## Update the weights
The simplest update rule used in practice is Stochastic Gradient Descent (SGD):
`weight` = `weight` - `learning_rate` * `gradient`

In [7]:
lr = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * lr)
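
The manual rule can be checked against PyTorch's built-in `torch.optim.SGD`, which applies the same update when no momentum or weight decay is set. This is a sketch on a small hypothetical `nn.Linear` model with made-up data, not the `Net` above; it writes the manual update under `torch.no_grad()`, an alternative to going through `.data`.

```python
import copy
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model_a = nn.Linear(4, 2)          # updated by hand
model_b = copy.deepcopy(model_a)   # updated by optim.SGD, same starting weights
x = torch.randn(5, 4)
y = torch.randn(5, 2)
lr = 0.01

# Manual update: weight = weight - lr * gradient
loss_a = nn.functional.mse_loss(model_a(x), y)
loss_a.backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= lr * p.grad

# The same step through the optimizer
optimizer = optim.SGD(model_b.parameters(), lr=lr)
optimizer.zero_grad()
loss_b = nn.functional.mse_loss(model_b(x), y)
loss_b.backward()
optimizer.step()

same = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same)
```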

## Optimization

In [8]:
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target) # recompute the loss so backward() runs on a fresh graph
loss.backward()
optimizer.step() # apply the update
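
Calling `backward()` frees the saved intermediate values of a graph, so every training iteration needs a fresh forward pass and a freshly computed loss. The sketch below shows both the failure and the working loop. The small `nn.Sequential` model, the random data, and the 20 steps are assumptions made for this illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))  # hypothetical stand-in
x = torch.randn(16, 4)
y = torch.randn(16, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Reusing a loss whose graph was already consumed raises a RuntimeError.
stale_loss = criterion(model(x), y)
stale_loss.backward()
try:
    stale_loss.backward()
    reused_graph_failed = False
except RuntimeError:
    reused_graph_failed = True

# A working loop: forward pass and loss are recomputed on every step.
losses = []
for step in range(20):
    optimizer.zero_grad()          # also clears the gradients left by stale_loss
    loss = criterion(model(x), y)  # fresh graph for this iteration
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(reused_graph_failed)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```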