A neural network has three components: an input layer, one or more hidden layers, and an output layer. Information travels from the input layer through the hidden layers to the output layer (left to right in the usual diagram).

The input layer is not composed of neurons. It just holds the raw data, so its number of nodes equals the number of features in the data.

Hidden layers are the brain of a neural network. The width of a hidden layer is the number of neurons in it, and the depth is the number of layers. Each neuron in a hidden layer is connected to all outputs of the previous layer. Successive layers can detect increasingly complex patterns: for example, layer 1 may detect shapes, layer 2 may combine shapes into objects, and so on.
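
To make width and depth concrete, here is a minimal sketch that stacks fully connected layers. It uses PyTorch's built-in `nn.Linear` as a stand-in for a hand-written layer, and the sizes (4 input features, hidden widths 8 and 6, 2 outputs) are arbitrary assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not taken from the notes).
n_features = 4          # number of nodes in the input layer
hidden_widths = [8, 6]  # width of each hidden layer; two hidden layers
n_outputs = 2           # number of neurons in the output layer

layers = []
in_size = n_features
for width in hidden_widths:
    # Every neuron in this layer sees all outputs of the previous layer.
    layers.append(nn.Linear(in_size, width))
    layers.append(nn.ReLU())
    in_size = width
layers.append(nn.Linear(in_size, n_outputs))  # output layer, no activation here

network = nn.Sequential(*layers)

x = torch.randn(3, n_features)  # a batch of 3 examples
out = network(x)
print(out.shape)                # torch.Size([3, 2])
```

Changing `hidden_widths` changes the width of each hidden layer (the values) and the number of hidden layers (the length of the list).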

The output layer contains as many neurons as the number of outputs we want. The activation function used here is task-specific, for example softmax for multi-class classification or no activation for regression.
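
As a rough illustration of what "task-specific" means, the sketch below applies three common choices to the same raw outputs: no activation (regression), sigmoid (binary classification), and softmax (multi-class classification). The tensor values are made up for the example.

```python
import torch

# Made-up raw outputs (pre-activations) of an output layer with 3 neurons,
# for a batch of 2 examples.
raw = torch.tensor([[2.0, -1.0, 0.5],
                    [0.0,  0.0, 0.0]])

# Regression: usually no activation, the raw values are the predictions.
regression_out = raw

# Binary classification (or independent yes/no labels): sigmoid squashes
# each value into the range (0, 1).
binary_out = torch.sigmoid(raw)

# Multi-class classification: softmax turns each row into probabilities
# that sum to 1.
multiclass_out = torch.softmax(raw, dim=1)

print(binary_out)
print(multiclass_out.sum(dim=1))  # each row sums to 1
```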

# **NEURON TO LAYER**

Say we have n neurons, each with m weights. We can then build a weight matrix W of shape m x n, where each column is the weight vector of one neuron. Similarly, we can have a bias vector b of size n, where each element is the bias of the corresponding neuron.

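For a single neuron, the linear step was w^T x, where w was the weight vector. For a whole layer it becomes the matrix operation x @ W + b, and the result is a vector containing the pre-activations of all n neurons. With a batch of inputs of shape (batch_size, m), the result has shape (batch_size, n), one row per example.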
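
A small sanity check of this idea: the sketch below builds W by stacking per-neuron weight vectors as columns and confirms that x @ W + b gives the same numbers as computing each neuron separately. The sizes and variable names (`neuron_weights`, `neuron_biases`) are assumptions made up for the example.

```python
import torch

torch.manual_seed(0)
m, n = 5, 3  # m weights per neuron, n neurons (arbitrary sizes)

# One weight vector of size m per neuron, and one bias per neuron.
neuron_weights = [torch.randn(m) for _ in range(n)]
neuron_biases = torch.randn(n)

# Stack the weight vectors as columns, so W has shape (m, n).
W = torch.stack(neuron_weights, dim=1)
b = neuron_biases                      # shape (n,)

x = torch.randn(m)                     # a single input with m features

# Neuron by neuron: w^T x + bias for each neuron.
one_by_one = torch.stack(
    [w @ x + neuron_biases[j] for j, w in enumerate(neuron_weights)]
)

# Whole layer at once.
layer_at_once = x @ W + b

print(W.shape)                                                # torch.Size([5, 3])
print(torch.allclose(one_by_one, layer_at_once, atol=1e-5))   # True
```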

In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F

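class Layer(nn.Module):
  """A fully connected layer: n_neurons neurons, each with n_input weights."""
  def __init__(self, n_input, n_neurons, activation = None):
    super().__init__()
    # Column j of `weights` is the weight vector of neuron j.
    self.weights = nn.Parameter(torch.randn(n_input, n_neurons))
    # One bias per neuron, initialised to zero.
    self.bias = nn.Parameter(torch.zeros(n_neurons))
    self.activation = activation

  def forward_pass(self, x):
    # x: (batch_size, n_input) -> pre-activations: (batch_size, n_neurons)
    pre_activation = x @ self.weights + self.bias
    if self.activation is not None:
      return self.activation(pre_activation)
    return pre_activation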



In [6]:
n_neurons = 3
n_inputs = 5
batch_size = 2

my_layer = Layer(n_inputs, n_neurons, activation = F.relu)
x = torch.randn(batch_size, n_inputs)
output = my_layer.forward_pass(x)
print(output.shape)

torch.Size([2, 3])
