# Introduction to PyTorch


## Sequential Model
A Sequential model is a linear stack of layers, limited to a single input and a single output. It is the best choice when the model is a plain layer-by-layer stack. Each `nn.Linear` layer must be told its number of input features, so the first layer's `in_features` has to match the last dimension of the input data.

In [2]:
import torch
from torch import nn
model = nn.Sequential(
  nn.Linear(4, 5),
  nn.ReLU(),
  nn.Linear(5, 8),
  nn.ReLU(),
  nn.Linear(8, 3),
  nn.Softmax(dim=-1),
)
# Model Architecture
print(model)
# forward pass with random input
output = model(torch.randn(5, 4))
print(output)

Sequential(
  (0): Linear(in_features=4, out_features=5, bias=True)
  (1): ReLU()
  (2): Linear(in_features=5, out_features=8, bias=True)
  (3): ReLU()
  (4): Linear(in_features=8, out_features=3, bias=True)
  (5): Softmax(dim=-1)
)
tensor([[0.2808, 0.4005, 0.3187],
        [0.3185, 0.3433, 0.3382],
        [0.2809, 0.4010, 0.3181],
        [0.3347, 0.3279, 0.3374],
        [0.2722, 0.4062, 0.3216]], grad_fn=<SoftmaxBackward0>)
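
As a small sketch of the points above (not part of the original notebook), the same stack can be built from an `OrderedDict` so that each layer has a name. The layer names and the variables `named_model`, `batch`, and `probs` are illustrative assumptions. The sketch also checks that the softmax rows sum to 1 and shows what happens when the input's feature dimension does not match the first layer.

```python
from collections import OrderedDict

import torch
from torch import nn

# Same layer sizes as the model above; the layer names are illustrative.
named_model = nn.Sequential(OrderedDict([
    ("hidden1", nn.Linear(4, 5)),
    ("act1", nn.ReLU()),
    ("hidden2", nn.Linear(5, 8)),
    ("act2", nn.ReLU()),
    ("out", nn.Linear(8, 3)),
    ("softmax", nn.Softmax(dim=-1)),
]))

batch = torch.randn(5, 4)      # last dimension must equal in_features=4
probs = named_model(batch)

print(named_model.hidden1)     # layers are reachable by name
print(probs.shape)             # torch.Size([5, 3])
print(probs.sum(dim=-1))       # each row sums to approximately 1

# An input with the wrong feature dimension fails at the first Linear layer.
try:
    named_model(torch.randn(5, 7))
except RuntimeError as err:
    print("shape mismatch:", type(err).__name__)
```

Naming the layers changes nothing about the computation; it only makes individual layers easier to inspect.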


## Building Parameters Manually and Applying Operations with the torch.nn.functional API
The torch.nn.functional API offers a lower-level and more flexible way to build models than nn.Sequential: its functions are stateless, so you create the parameters yourself and pass them in. This makes it suitable for complex models, models with multiple inputs or outputs, custom operations, and models that share layers.

In [3]:
import torch
import torch.nn.functional as F
# Generate random input: 4 examples with 28*28 features each
inputs = torch.randn(4, 28*28)

# Create the parameters (weights and biases) for each layer.
# requires_grad=True enables gradient tracking, which backpropagation needs.
# Weights have shape (out_dim, in_dim), as F.linear expects.
W1 = torch.randn((64, 28*28), requires_grad=True)
W2 = torch.randn((64, 64), requires_grad=True)
W3 = torch.randn((10, 64), requires_grad=True)

# Bias terms have shape (out_dim,)
B1 = torch.randn(64, requires_grad=True)
B2 = torch.randn(64, requires_grad=True)
B3 = torch.randn(10, requires_grad=True)

# Layer 1 operation
h1 = F.relu(F.linear(inputs, W1, B1))
# Layer 2 operation
h2 = F.relu(F.linear(h1, W2, B2))
# Output Layer operation
output = F.softmax(F.linear(h2, W3, B3), dim=-1)
print(output)
print(output.shape)

tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00,
         0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00,
         0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
         1.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00,
         8.7662e-40, 0.0000e+00, 0.0000e+00, 0.0000e+00]],
       grad_fn=<SoftmaxBackward0>)
torch.Size([4, 10])
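
The probabilities printed above are almost one-hot, most likely because the weights come from an unscaled standard normal and the logits grow very large. The following is a minimal sketch, not part of the original notebook, of how the manually created parameters receive gradients. It makes three assumptions: weights are scaled by `1/sqrt(fan_in)` (one common heuristic, swapped in here), the labels in `targets` are random placeholders, and the loss is `F.cross_entropy`, which expects raw logits rather than softmax output. The helper `scaled_param` is hypothetical.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

inputs = torch.randn(4, 28 * 28)
targets = torch.randint(0, 10, (4,))   # placeholder class labels (assumption)

def scaled_param(out_dim, in_dim):
    # Hypothetical helper: scale by 1/sqrt(fan_in) to keep activations moderate,
    # then mark the result as a leaf tensor that tracks gradients.
    return (torch.randn(out_dim, in_dim) / in_dim ** 0.5).requires_grad_()

W1, B1 = scaled_param(64, 28 * 28), torch.zeros(64, requires_grad=True)
W2, B2 = scaled_param(64, 64), torch.zeros(64, requires_grad=True)
W3, B3 = scaled_param(10, 64), torch.zeros(10, requires_grad=True)

h1 = F.relu(F.linear(inputs, W1, B1))
h2 = F.relu(F.linear(h1, W2, B2))
logits = F.linear(h2, W3, B3)          # no softmax: cross_entropy applies log-softmax

loss = F.cross_entropy(logits, targets)
loss.backward()                        # fills .grad on every parameter

print(F.softmax(logits, dim=-1)[0])    # probabilities for the first example
print(W1.grad.shape)                   # torch.Size([64, 784])
```

After `backward()`, each parameter's `.grad` has the same shape as the parameter itself, which is what an optimizer step would consume.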


## Model Subclassing
Model subclassing is fully customizable and lets us write our own forward pass. In PyTorch, nn.Module is the base class used to define a model architecture. In the implementation below, MyModel inherits from nn.Module: the layers are created in the initializer __init__(), and the forward pass is defined in the forward() method.

In [4]:
import torch
from torch import nn
# MyModel inherits from the nn.Module class
class MyModel(nn.Module):
  # Initialize the layers (nn.Module.__init__ takes no keyword arguments)
  def __init__(self, in_features):
    super().__init__()
    self.dense1 = nn.Linear(in_features, 16)
    self.act1 = nn.ReLU()
    self.dense2 = nn.Linear(16, 32)
    self.act2 = nn.ReLU()
    self.dense3 = nn.Linear(32, 5)
    self.act3 = nn.Softmax(dim=-1)

  # Forward pass
  def forward(self,inputs):
    x = self.act1(self.dense1(inputs))
    x = self.act2(self.dense2(x))
    x = self.act3(self.dense3(x))
    return x
# Model initialization and architecture
model = MyModel(4)
print(model)
# Forward pass with random input
output = model(torch.randn(8, 4))
print(output)

MyModel(
  (dense1): Linear(in_features=4, out_features=16, bias=True)
  (act1): ReLU()
  (dense2): Linear(in_features=16, out_features=32, bias=True)
  (act2): ReLU()
  (dense3): Linear(in_features=32, out_features=5, bias=True)
  (act3): Softmax(dim=-1)
)
tensor([[0.1520, 0.1910, 0.1675, 0.2593, 0.2302],
        [0.1750, 0.2015, 0.1694, 0.2297, 0.2243],
        [0.1388, 0.2035, 0.1608, 0.2643, 0.2326],
        [0.1638, 0.2022, 0.1655, 0.2385, 0.2300],
        [0.1533, 0.2049, 0.1726, 0.2422, 0.2270],
        [0.1462, 0.2080, 0.1634, 0.2507, 0.2317],
        [0.1659, 0.2094, 0.1630, 0.2315, 0.2301],
        [0.1485, 0.2055, 0.1657, 0.2484, 0.2318]], grad_fn=<SoftmaxBackward0>)
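
Layers assigned as attributes in `__init__()` are registered automatically by nn.Module, so their parameters can be listed and counted. The sketch below is an illustration rather than part of the original notebook. `SmallNet` is a hypothetical, compact class with the same layer sizes as `MyModel(4)`, redefined here so the block runs on its own.

```python
import torch
from torch import nn

class SmallNet(nn.Module):  # hypothetical name; same layer sizes as MyModel(4)
    def __init__(self, in_features):
        super().__init__()
        self.dense1 = nn.Linear(in_features, 16)
        self.dense2 = nn.Linear(16, 32)
        self.dense3 = nn.Linear(32, 5)

    def forward(self, inputs):
        x = torch.relu(self.dense1(inputs))
        x = torch.relu(self.dense2(x))
        return torch.softmax(self.dense3(x), dim=-1)

net = SmallNet(4)

# Every nn.Linear contributes a weight and a bias tensor.
for name, param in net.named_parameters():
    print(name, tuple(param.shape))

# (4*16 + 16) + (16*32 + 32) + (32*5 + 5) = 80 + 544 + 165
total = sum(p.numel() for p in net.parameters())
print(total)  # 789
```

The same `net.parameters()` iterator is what would be handed to an optimizer such as `torch.optim.SGD`.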


## Multi-input and Multi-output Models
PyTorch model subclassing allows for flexible architectures, including multiple inputs and outputs. Below are examples for single/multi input/output scenarios and shared layers.

In [5]:
import torch
from torch import nn
import torch.nn.functional as F
class SingleInpMultiOutModel(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        # Common Layer
        self.dense = nn.Linear(in_features, 128)

        # Output layers
        self.out1 = nn.Linear(128, 3)
        self.out2 = nn.Linear(128, 1)

    def forward(self, inputs):
        comm = F.relu(self.dense(inputs))
        out1 = F.softmax(self.out1(comm), dim=-1)
        out2 = torch.sigmoid(self.out2(comm))
        return out1, out2

# Define Model
model = SingleInpMultiOutModel(64)
print(model)
# Forward pass with random input
out1, out2 = model(torch.randn(4, 64))
print(out1)
print(out2)

SingleInpMultiOutModel(
  (dense): Linear(in_features=64, out_features=128, bias=True)
  (out1): Linear(in_features=128, out_features=3, bias=True)
  (out2): Linear(in_features=128, out_features=1, bias=True)
)
tensor([[0.2984, 0.4038, 0.2978],
        [0.3555, 0.3255, 0.3190],
        [0.3401, 0.3732, 0.2867],
        [0.3355, 0.3006, 0.3639]], grad_fn=<SoftmaxBackward0>)
tensor([[0.4712],
        [0.4777],
        [0.4711],
        [0.4636]], grad_fn=<SigmoidBackward0>)
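
One way to train a model with two outputs is to add the per-head losses into a single scalar and backpropagate once. The sketch below is an assumption about how this might look, not the notebook's own method. `TwoHeadNet` is a hypothetical class that returns raw logits instead of softmax and sigmoid outputs, the targets are random placeholders, and the losses are simply summed with equal weight.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TwoHeadNet(nn.Module):  # hypothetical; mirrors the shared layer plus two heads
    def __init__(self, in_features):
        super().__init__()
        self.dense = nn.Linear(in_features, 128)
        self.out1 = nn.Linear(128, 3)   # 3-class head
        self.out2 = nn.Linear(128, 1)   # binary head

    def forward(self, inputs):
        comm = F.relu(self.dense(inputs))
        # Return raw logits; the loss functions below apply softmax/sigmoid.
        return self.out1(comm), self.out2(comm)

net = TwoHeadNet(64)
x = torch.randn(4, 64)
class_targets = torch.randint(0, 3, (4,))             # placeholder labels
binary_targets = torch.randint(0, 2, (4, 1)).float()  # placeholder labels

logits1, logits2 = net(x)
loss = (
    F.cross_entropy(logits1, class_targets)
    + F.binary_cross_entropy_with_logits(logits2, binary_targets)
)
loss.backward()  # gradients from both heads accumulate in the shared layer

print(loss.item())
print(net.dense.weight.grad.shape)  # torch.Size([128, 64])
```

Because both heads read from the shared `dense` layer, its gradient is the sum of the contributions from the two losses. In practice the losses are often weighted rather than added equally.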
