# Network Operations

Let's see how a network performs its operations when its layers are defined as modules.

In [2]:
import torch
import numpy as np

from torch import nn as nn

In **Linear Layers**, the operation is a matrix multiplication. Given the output $\mathbf{y}_{i}$ of the $i$th layer, the output of the $(i+1)$th layer is $W_{i+1}\mathbf{y}_{i}$, where $W_{i+1}$ is the matrix of weights on the connections between layers $i$ and $i+1$.

In [9]:
in_features = torch.tensor([1,2,3,4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1,2,3,4],
    [2,3,4,5],
    [3,4,5,6]
], dtype=torch.float32)
weight_matrix.matmul(in_features)

tensor([30., 40., 50.])

A ```torch.Tensor``` object (created here with the ```torch.tensor()``` function) has a ```matmul()``` method that performs matrix multiplication.
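As a small aside, the same product can be written in more than one way. The sketch below reuses the tensors from the cell above and compares the method call with ```torch.matmul``` and the ```@``` operator; the variable names ```via_method```, ```via_function``` and ```via_operator``` are only illustrative.

```python
import torch

in_features = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
], dtype=torch.float32)

# Three spellings of the same matrix-vector product
via_method = weight_matrix.matmul(in_features)
via_function = torch.matmul(weight_matrix, in_features)
via_operator = weight_matrix @ in_features

# All three should agree element for element
print(torch.equal(via_method, via_function))
print(torch.equal(via_method, via_operator))
```

Which spelling to use is mostly a matter of taste; the ```@``` operator tends to read closest to the mathematical notation.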

Layers built from ```torch.nn``` modules are themselves callable: calling a layer on an input performs that layer's operation.

In [12]:
dense = nn.Linear(in_features=4, out_features=3)
dense.weight

Parameter containing:
tensor([[ 0.0697, -0.1287,  0.4758, -0.0875],
        [ 0.1852, -0.2088, -0.0620,  0.1778],
        [-0.3900,  0.3577,  0.4638, -0.4139]], requires_grad=True)

The ```nn.Linear``` module contains a randomly initialised weight matrix. It is a learnable parameter, as indicated by ```requires_grad=True```. The module is also callable, and calling it applies the matrix multiplication to its input.
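To make the layout of the weight matrix concrete, here is a minimal sketch that inspects a freshly built layer. It assumes the same ```in_features=4, out_features=3``` configuration as above; ```weight_shape``` is an illustrative name.

```python
import torch
from torch import nn

dense = nn.Linear(in_features=4, out_features=3)

# The weight is stored with shape (out_features, in_features),
# so it can multiply a length-4 input and give a length-3 output.
weight_shape = tuple(dense.weight.shape)
print(weight_shape)

# The weight is an nn.Parameter, which is why it is tracked for gradients.
print(isinstance(dense.weight, nn.Parameter))
print(dense.weight.requires_grad)
```

The concrete values differ on every run because the initialisation is random, but the shape and the ```requires_grad``` flag stay the same.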

In [13]:
dense(in_features)

tensor([0.6775, 0.2688, 0.2498], grad_fn=<AddBackward0>)

The output tensor carries a ```grad_fn``` attribute, which references the operation that produced it and links the tensor into the computation graph used to compute gradients.
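The following sketch shows one way that graph can be used. It reduces the layer output to a scalar with ```sum()``` (a choice made only for illustration) and calls ```backward()```, after which the gradient with respect to the weights is available in ```dense.weight.grad```.

```python
import torch
from torch import nn

dense = nn.Linear(in_features=4, out_features=3)
x = torch.tensor([1, 2, 3, 4], dtype=torch.float32)

out = dense(x)
print(out.grad_fn is not None)  # the output is attached to the graph

# Backpropagate from a scalar; summing the outputs is an arbitrary choice here.
out.sum().backward()

# For sum(Wx + b), the derivative with respect to W[i, j] is x[j],
# so every row of the weight gradient should equal the input.
weight_grad = dense.weight.grad
print(weight_grad)
```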

We can replace the ```weight``` of the ```dense``` layer by wrapping our own matrix in ```nn.Parameter()```, with the aim of reproducing the output of the explicit matrix multiplication above.

In [14]:
dense.weight = nn.Parameter(weight_matrix)
dense.weight

Parameter containing:
tensor([[1., 2., 3., 4.],
        [2., 3., 4., 5.],
        [3., 4., 5., 6.]], requires_grad=True)

In [15]:
dense(in_features)

tensor([29.7880, 39.9760, 50.1886], grad_fn=<AddBackward0>)

The result does not match the earlier product ```tensor([30., 40., 50.])```. The difference comes from the bias vector that the linear layer adds to the matrix product.

In [16]:
dense.bias

Parameter containing:
tensor([-0.2120, -0.0240,  0.1886], requires_grad=True)
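A quick way to check this explanation is to compute the matrix product by hand, add the bias, and compare with the layer output. The sketch below rebuilds the layer, so its random bias will differ from the values printed above; ```layer_out```, ```manual_out``` and ```matches``` are illustrative names.

```python
import torch
from torch import nn

x = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
], dtype=torch.float32)

dense = nn.Linear(in_features=4, out_features=3)
dense.weight = nn.Parameter(weight_matrix)

# no_grad keeps this check out of the computation graph
with torch.no_grad():
    layer_out = dense(x)
    manual_out = weight_matrix.matmul(x) + dense.bias

matches = torch.allclose(layer_out, manual_out)
print(matches)
```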

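We can remove the effect of the bias in two ways: construct the layer with ```bias=False``` (commented out below), or overwrite the existing bias with zeros.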

In [19]:
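# Option 1: rebuild the layer without a bias term
# (this re-initialises the weights, so they would need to be set again)
# dense = nn.Linear(in_features=4, out_features=3, bias=False)

# Option 2: keep the layer and overwrite its bias with zeros
dense.bias = nn.Parameter(torch.zeros(3))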
dense(in_features)

tensor([30., 40., 50.], grad_fn=<AddBackward0>)
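For completeness, here is a sketch of the first option, which the cell above leaves commented out. A layer created with ```bias=False``` has no bias parameter at all, so its weights must be assigned again after construction; ```dense_no_bias``` is an illustrative name.

```python
import torch
from torch import nn

x = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
], dtype=torch.float32)

# A new layer without a bias term; its weights start out random
dense_no_bias = nn.Linear(in_features=4, out_features=3, bias=False)
dense_no_bias.weight = nn.Parameter(weight_matrix)

# With bias=False the bias attribute is None rather than a zero vector
bias_is_none = dense_no_bias.bias is None
print(bias_is_none)

out = dense_no_bias(x)
print(out)
```

The practical difference between the two options is that a zeroed bias is still a learnable parameter and can move away from zero during training, whereas a layer built with ```bias=False``` never has one.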