# What is torch.nn?
torch.nn is a module in PyTorch that provides classes and functions to build neural networks. It includes layers, loss functions, and other tools necessary for building and training models.

In [1]:
import torch
import torch.nn as nn

nn.Linear: This layer applies a linear (affine) transformation, y = x * W^T + b, to the input data. It is defined as nn.Linear(in_features, out_features), where in_features is the number of input features and out_features is the number of output features. Its parameters (the weights W and biases b) are learned during training.

Capabilities: Linear transformation of input data.
Limitations: Cannot capture non-linear relationships in the data.
Weaknesses: Limited in modeling complex patterns without activation functions.
Type: It is a hidden layer when placed between input and output layers.
Data Transfer: It transforms data through a weighted sum and adds a bias.
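
To make the limitation concrete, here is a minimal sketch showing that two nn.Linear layers stacked without an activation in between collapse into a single linear map. The names `layer1`, `layer2`, `W`, and `b` are illustrative assumptions and do not come from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random initial weights are repeatable

layer1 = nn.Linear(3, 2)
layer2 = nn.Linear(2, 1)

x = torch.tensor([[0.1, 0.2, 0.3]])

# Output of the two layers applied one after the other
stacked = layer2(layer1(x))

# The same result from a single affine map: W = W2 @ W1, b = W2 @ b1 + b2
W = layer2.weight @ layer1.weight               # shape (1, 3)
b = layer2.weight @ layer1.bias + layer2.bias   # shape (1,)
collapsed = x @ W.T + b

print(layer1.weight.shape, layer1.bias.shape)  # torch.Size([2, 3]) torch.Size([2])
print(torch.allclose(stacked, collapsed, atol=1e-6))
```

Because the stacked layers behave like one linear layer, extra depth adds modelling power only when a non-linear activation sits between the layers.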

nn.Sigmoid: The sigmoid activation function maps input values to a range between 0 and 1. It is defined as nn.Sigmoid().

Why Sigmoid?: It is commonly used as the final layer for binary classification because its output, a value between 0 and 1, can be interpreted as a probability.
Function: It introduces non-linearity, allowing the network to learn complex patterns.
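
A short sketch of this behaviour, assuming three hand-picked input values: nn.Sigmoid squashes each one into the range between 0 and 1, and the result agrees with the formula 1 / (1 + e^(-x)).

```python
import torch
import torch.nn as nn

sigmoid = nn.Sigmoid()
x = torch.tensor([-5.0, 0.0, 5.0])  # illustrative inputs
y = sigmoid(x)

# Manual formula: sigmoid(x) = 1 / (1 + e^(-x))
manual = 1 / (1 + torch.exp(-x))

print(y)  # every value lies strictly between 0 and 1; sigmoid(0) is 0.5
print(torch.allclose(y, manual))
```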

nn.Sequential is a container module that sequences a series of layers in a linear stack. It simplifies the model-building process by allowing you to define the forward pass of the network in a straightforward manner.

Capabilities: Easy to define and stack layers sequentially.
Limitations: Less flexible for complex models with branching or skip connections.

In [4]:
# Define the model using nn.Sequential
model = nn.Sequential(
  nn.Linear(3,2), # nn.Linear(in_features, out_features)
  nn.Linear(2,1), # Hidden layer to Output layer
  nn.Sigmoid()  # Activation Layer
)
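
The sketch below illustrates how the container behaves, using the same three layers: it supports `len()` and integer indexing, and calling it gives the same result as calling each layer in order. The variable names `x` and `manual` are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 2),
    nn.Linear(2, 1),
    nn.Sigmoid(),
)

x = torch.tensor([[0.1, 0.2, 0.3]])

# nn.Sequential supports len() and integer indexing into its layers
print(len(model))  # 3
print(model[0])    # the first nn.Linear layer

# Calling the container is the same as calling each layer in order
manual = x
for layer in model:
    manual = layer(manual)

print(torch.allclose(model(x), manual))
```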

In [6]:
# Each row (sub-list) of the tensor must have as many values as the in_features of the model's first nn.Linear layer (3 here), and the values must be floats.

# Acceptable
torch.tensor([
    [1., 1., 1.]
])
torch.tensor([
    [0.1, 0.1, 0.1]
])

torch.tensor([
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],  # each row of the tensor gets its own output between 0 and 1
    [1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5]
])

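# Unacceptable
# Integer values create an integer (int64) tensor, but nn.Linear expects floats
# torch.tensor([
#     [1, 1, 1]
# ])
# Each row has 4 values, but the first layer expects 3 (in_features=3)
# torch.tensor([
#     [0.1, 0.2, 0.3, 0.4],
#     [0.1, 0.2, 0.3, 0.4],
#     [0.1, 0.2, 0.3, 0.4]
# ])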

tensor([[0.1000, 0.2000, 0.3000],
        [0.4000, 0.5000, 0.6000],
        [0.7000, 0.8000, 0.9000],
        [1.0000, 1.1000, 1.2000],
        [1.3000, 1.4000, 1.5000]])
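
As a hedged illustration of why the "unacceptable" tensors fail, the sketch below passes each one to a standalone nn.Linear(3, 2) layer and records the resulting RuntimeError. The names `layer`, `int_input`, and `wide_input` are assumptions for this example, and the exact error message may differ between PyTorch versions.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)

good = torch.tensor([[0.1, 0.2, 0.3]])
int_input = torch.tensor([[1, 1, 1]])              # integer literals give dtype torch.int64
wide_input = torch.tensor([[0.1, 0.2, 0.3, 0.4]])  # 4 features, but the layer expects 3

print(good.dtype, int_input.dtype)  # torch.float32 torch.int64

errors = []
for name, bad in [("integer dtype", int_input), ("wrong width", wide_input)]:
    try:
        layer(bad)
    except RuntimeError as exc:
        errors.append(name)
        print(f"{name}: {type(exc).__name__}")

# Converting the integer tensor to float makes it acceptable
print(layer(int_input.float()).shape)  # torch.Size([1, 2])
```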

In [7]:
# Define a tensor for our model
input_tensor = torch.tensor([
    [0.1, 0.2, 0.3]
])

# Ensure the tensor is of type float
input_tensor = input_tensor.float()

# Define the model using nn.Sequential
model = nn.Sequential(
  nn.Linear(3,2), # Input layer to Hidden Layer
  nn.Linear(2,1), # Hidden layer to Output layer
  nn.Sigmoid()  # Activation Layer
)

# Feed tensor into the model
output = model(input_tensor)
print(output)

tensor([[0.6428]], grad_fn=<SigmoidBackward0>)


In [8]:
# Weights and biases are the parameters of the linear layers that are learned during training. You can inspect them as follows:

# View weights and biases
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, param.data)

0.weight tensor([[-0.0406,  0.1965, -0.1198],
        [-0.1632,  0.4881,  0.0515]])
0.bias tensor([-0.0393, -0.2983])
1.weight tensor([[-0.4210, -0.6575]])
1.bias tensor([0.4380])
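
The shapes in this output follow directly from the layer sizes. As a small sketch (the variable `total` is an assumption for illustration), each nn.Linear holds in_features * out_features weights plus out_features biases, so the parameters can be counted like this:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 2),
    nn.Linear(2, 1),
    nn.Sigmoid(),
)

# Print the name, shape, and element count of every parameter
for name, param in model.named_parameters():
    print(name, tuple(param.shape), param.numel())

total = sum(p.numel() for p in model.parameters())
print(total)  # (3*2 + 2) + (2*1 + 1) = 11
```

nn.Sigmoid contributes no parameters, which is why only the entries for layers 0 and 1 appear.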


# Weights

Weights are the parameters that connect neurons between layers. Each weight determines the strength and direction of the relationship between two neurons. In your model:

In [None]:
# First Layer Weights (0.weight): These weights connect the 3 input features to the 2 neurons in the hidden layer. Each row corresponds to the weights of a neuron in the hidden layer, and each column corresponds to an input feature.

# 0.weight tensor([[-0.0406,  0.1965, -0.1198],
#        [-0.1632,  0.4881,  0.0515]])

# Second Layer Weights (1.weight): These weights connect the 2 neurons in the hidden layer to the single output neuron. Each weight corresponds to the connection strength from a hidden layer neuron to the output neuron.

# 1.weight tensor([[-0.4210, -0.6575]])

# Biases

Biases are additional parameters that allow the model to fit the data better by providing an offset. They are added to the weighted sum of inputs before applying the activation function.

In [None]:
# First Layer Biases (0.bias): These biases are added to the weighted sum of inputs for each neuron in the hidden layer.

# 0.bias tensor([-0.0393, -0.2983])

# Second Layer Bias (1.bias): This bias is added to the weighted sum of inputs for the output neuron.

# 1.bias tensor([0.4380])

# How weights and biases are applied
First Linear Layer: The input tensor [[0.1, 0.2, 0.3]] is matrix-multiplied by the transpose of the first layer's weights, and then the biases are added.

mathematically: hidden_layer_output = input_tensor * 0.weight^T + 0.bias

result (with the weights and biases printed above): hidden_layer_output = [[0.1, 0.2, 0.3]] * [[-0.0406, 0.1965, -0.1198], [-0.1632, 0.4881, 0.0515]]^T + [-0.0393, -0.2983]


Second Linear Layer: The output from the hidden layer is multiplied by the second layer's weights and the biases are added.

mathematically: output = hidden_layer_output * 1.weight^T + 1.bias

# Activation Function
The Sigmoid activation function is applied to the output of the second linear layer to produce the final output.

mathematically: final_output = sigmoid(output), where sigmoid(x) = 1 / (1 + e^(-x))

# EXAMPLE
Using the weights and biases printed above by named_parameters():

hidden_layer_output = [0.1, 0.2, 0.3] * [[-0.0406, 0.1965, -0.1198], [-0.1632, 0.4881, 0.0515]]^T + [-0.0393, -0.2983]

hidden_layer_output[0] = (0.1 * -0.0406) + (0.2 * 0.1965) + (0.3 * -0.1198) - 0.0393
                      = -0.00406 + 0.03930 - 0.03594 - 0.0393
                      = -0.04000

hidden_layer_output[1] = (0.1 * -0.1632) + (0.2 * 0.4881) + (0.3 * 0.0515) - 0.2983
                      = -0.01632 + 0.09762 + 0.01545 - 0.2983
                      = -0.20155

output = [-0.04000, -0.20155] * [[-0.4210, -0.6575]]^T + [0.4380]

output = (-0.04000 * -0.4210) + (-0.20155 * -0.6575) + 0.4380
       ≈ 0.01684 + 0.13252 + 0.4380
       ≈ 0.58736

final_output = sigmoid(0.58736) = 1 / (1 + e^(-0.58736)) ≈ 0.6428

This agrees with the tensor([[0.6428]]) that the model printed earlier.
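
The same three steps can also be checked in code. This sketch plugs in the weights and biases printed earlier by named_parameters() (rounded to four decimals) and runs the forward pass by hand. The variable names `w0`, `b0`, `w1`, and `b1` are assumptions for illustration.

```python
import torch

# Values copied from the named_parameters() output shown earlier (rounded to 4 decimals)
w0 = torch.tensor([[-0.0406, 0.1965, -0.1198],
                   [-0.1632, 0.4881, 0.0515]])
b0 = torch.tensor([-0.0393, -0.2983])
w1 = torch.tensor([[-0.4210, -0.6575]])
b1 = torch.tensor([0.4380])

x = torch.tensor([[0.1, 0.2, 0.3]])

hidden = x @ w0.T + b0         # first linear layer, shape (1, 2)
output = hidden @ w1.T + b1    # second linear layer, shape (1, 1)
final = torch.sigmoid(output)  # sigmoid activation

print(hidden)
print(output)
print(final)  # approximately 0.6428, the value the model printed
```

Here `@` is matrix multiplication, so `x @ w0.T` computes the weighted sums for both hidden neurons at once.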

In [None]:
# Setting weights and biases
# Although not recommended, there may be times when you need to set weights and biases manually. PyTorch lets you do this by assigning a new nn.Parameter, shown here for the first layer's weights:

model[0].weight = nn.Parameter(torch.tensor([[-0.5, 0.2, 0.1], [0.4, -0.1, -0.3]]))
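
An alternative, sketched here under the assumption that you want to keep the existing Parameter objects (for example because an optimizer already holds references to them), is to copy new values in place inside `torch.no_grad()`. The bias values below are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 2),
    nn.Linear(2, 1),
    nn.Sigmoid(),
)

new_weight = torch.tensor([[-0.5, 0.2, 0.1], [0.4, -0.1, -0.3]])
new_bias = torch.tensor([0.0, 0.0])  # hypothetical bias values, for illustration only

# Copy values into the existing parameters without recording the copy in autograd
with torch.no_grad():
    model[0].weight.copy_(new_weight)
    model[0].bias.copy_(new_bias)

print(model[0].weight)
print(model[0].bias)
```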

# Defining a Model with nn.Module
We just went in depth into building our model with the built-in nn.Sequential container. It works well, but it has limitations that we may not be able to work around depending on the model we are building. For more complex models, it is better to define a custom neural network class by extending nn.Module, so let's turn this model into a Python class.

In [None]:
class BinaryModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(3, 2)
        self.linear2 = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()
    
    def forward(self, x):
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.sigmoid(x)
        return x

# Create an instance of the model
model = BinaryModel()
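
To show what a custom class makes possible that nn.Sequential does not, here is a minimal sketch of a model with a skip connection. `SkipModel` and its layer sizes are hypothetical and not part of the original notebook.

```python
import torch
import torch.nn as nn

class SkipModel(nn.Module):  # hypothetical example class, not from the original notebook
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(3, 3)  # same width in and out so the input can be added back
        self.linear2 = nn.Linear(3, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Skip connection: add the layer's input back onto its output
        h = self.linear1(x) + x
        return self.sigmoid(self.linear2(h))

model = SkipModel()
x = torch.tensor([[0.1, 0.2, 0.3]])
out = model(x)
print(out.shape)  # torch.Size([1, 1])
```

Because `forward` is ordinary Python, the data flow can branch, merge, or reuse tensors in ways a purely sequential stack cannot express.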

# Expand your knowledge
Modify the Model: Add another nn.Linear layer and experiment with different activation functions like "nn.ReLU()".

Inspect Parameters: Write code to inspect and print the weights and biases before and after training the model.

Dataset Preparation: Generate a simple synthetic dataset and pre-process it for training.
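
As one possible starting point for the last two exercises, the sketch below builds a synthetic dataset, trains the model briefly, and compares the parameters before and after. Everything here is an assumption for illustration: the labelling rule (features summing to more than 1.5), the loss function, the optimizer, the learning rate, and the number of steps.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical synthetic dataset: 3 features per sample,
# label 1.0 when the features sum to more than 1.5, otherwise 0.0
X = torch.rand(200, 3)
y = (X.sum(dim=1, keepdim=True) > 1.5).float()

model = nn.Sequential(
    nn.Linear(3, 2),
    nn.Linear(2, 1),
    nn.Sigmoid(),
)

loss_fn = nn.BCELoss()  # binary cross-entropy on probabilities
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Snapshot the parameters before training
before = {name: p.detach().clone() for name, p in model.named_parameters()}
initial_loss = loss_fn(model(X), y).item()

for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

final_loss = loss_fn(model(X), y).item()
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")

# Compare each parameter before and after training
for name, p in model.named_parameters():
    print(name, before[name].flatten().tolist(), "->", p.detach().flatten().tolist())
```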