<a href="https://colab.research.google.com/github/and-is/learning-pytorch/blob/main/nn_module.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Learning to use NN module to better create our neural network.

In [1]:
# create model  class (we're building a 5 features to one output)
import torch
import torch.nn as nn

In [5]:
class Model(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.linear = nn.Linear(num_features, 1)     # we give no of features on input and that on output which is 1 here
    self.sigmoid = nn.Sigmoid()    # activation function for our simple nn

  def forward(self, features):
    out = self.linear(features)      # this does the o=wz+b thingy
    out = self.sigmoid(out)      # applying the activation obv
    return out



In [6]:
# Testing the model now
features = torch.rand(10,5)     # dataset with 10 rows and 5 columns

model = Model(features.shape[1])     # creates a model instance of that class above with column shape of features array features

model(features)     # can also do model.forward(features) but this is the standard way built in pytorch only. forward method is automatically triggered here

tensor([[0.4378],
        [0.5530],
        [0.4483],
        [0.5417],
        [0.5097],
        [0.5116],
        [0.5196],
        [0.4696],
        [0.4800],
        [0.4431]], grad_fn=<SigmoidBackward0>)

In [13]:
# viewing weights and biases
model.linear.weight

Parameter containing:
tensor([[-0.0008,  0.0886, -0.4345, -0.2613,  0.0685]], requires_grad=True)

In [14]:
model.linear.bias

Parameter containing:
tensor([0.2013], requires_grad=True)

In [15]:
!pip install torchinfo

Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl.metadata (21 kB)
Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


In [16]:
from torchinfo import summary
summary(model, input_size=(10,5))

Layer (type:depth-idx)                   Output Shape              Param #
Model                                    [10, 1]                   --
├─Linear: 1-1                            [10, 1]                   6
├─Sigmoid: 1-2                           [10, 1]                   --
Total params: 6
Trainable params: 6
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00

Now a little more complex neural network with a 5 feature input layer, 3 feature hidden layer and 1 output layer. ReLu used in hidden layer and sigmoid on the output layer.

In [18]:
class Model2(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.linear = nn.Linear(num_features, 3)     # hidden layer will get 3 outputs from the output now.
    self.relu = nn.ReLU()
    self.linear2 = nn.Linear(3,1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    out = self.linear(features)
    out = self.sigmoid(out)
    out = self.linear2(out)
    out = self.sigmoid(out)

    return out

In [19]:
features = torch.rand(10,5)

model2 = Model2(features.shape[1])

model2(features)

tensor([[0.5152],
        [0.5077],
        [0.5190],
        [0.5031],
        [0.5203],
        [0.5178],
        [0.5124],
        [0.5121],
        [0.5225],
        [0.5209]], grad_fn=<SigmoidBackward0>)

In [21]:
model2.linear2.weight

Parameter containing:
tensor([[0.4893, 0.4547, 0.1855]], requires_grad=True)

In [22]:
model2.linear.weight

Parameter containing:
tensor([[ 0.0770,  0.1483, -0.0442,  0.0839, -0.4253],
        [ 0.1490, -0.2899, -0.3300,  0.2447,  0.3597],
        [ 0.2289,  0.0722, -0.1697, -0.3163,  0.0849]], requires_grad=True)

The way of feeding one into another is hectic. So, let's make sequential containers.

In [25]:
class Model3(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.network = nn.Sequential(
        nn.Linear(num_features, 3),
        nn.ReLU(),
        nn.Linear(3,1),
        nn.Sigmoid()
    )

  def forward(self, features):
    out = self.network(features)

    return out

In [26]:
features = torch.rand(10,5)

model3 = Model3(features.shape[1])

model3(features)

tensor([[0.6394],
        [0.5984],
        [0.6010],
        [0.6111],
        [0.5884],
        [0.6287],
        [0.6205],
        [0.6259],
        [0.6295],
        [0.6157]], grad_fn=<SigmoidBackward0>)

Now a ploy to improve that earlier training pipeline that we did

In [27]:
class NN(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.linear = nn.Linear(num_features, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    out = self.linear(features)
    out = self.sigmoid(out)
    return out

  def loss_function(self, y_pred, y):
    epsilon = 1e-7
    y_pred = torch.clamp(y_pred, epsilon, 1 - epsilon)

    # Calculate loss
    loss = -(y_train_tensor * torch.log(y_pred) + (1 - y_train_tensor) * torch.log(1 - y_pred)).mean()
    return loss

Now let's so for built in loss function and optimizers.

In [29]:
loss_function = nn.BCELoss()

# Then simply during the training in epochs, can use
# loss = loss_function(y_pred, y_train_tensor)

In [30]:
class NN(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.linear = nn.Linear(num_features, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    out = self.linear(features)
    out = self.sigmoid(out)
    return out

  # def loss_function(self, y_pred, y):
  #   epsilon = 1e-7
  #   y_pred = torch.clamp(y_pred, epsilon, 1 - epsilon)
# can simply use loss_function(y_pred,y) now!

Now on to torch.optim which is a module with optimization algorithms.

In [31]:
# define optimizer
learning_rate = 0.1
# model.parameters() gives all weights and biases with a looping way
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)             # stochastic gradient descent
# To call, we can use
# optimizer.step()
# optimizer.zero_grad()    to turn all gradients zero with a single work. do this clearing before backward pass usually.