# **Introduction to PyTorch**
In this reading assignment, you'll be introduced to different methods for building a PyTorch model. We will be using PyTorch throughout the course as well.

## Prerequisites:


1.   Basics of Python Programming
2.   Basics of Object Oriented Programming

## Learning Objectives:


1.   Learn to implement a PyTorch model





# <h1><b>Methods to build a PyTorch model</b></h1>



*   Sequential
*   Functional API
*   Model Subclassing





## <h2><b>Sequential Model</b></h2>

A sequential model is a linear stack of layers, which limits you to a single input and a single output. It is the best choice when you want a simple model in which each layer feeds directly into the next. The first layer of a sequential model needs to know the number of input features in the dataset (the `in_features` argument of `nn.Linear`).


In [None]:
import torch
from torch import nn

In [None]:
model = nn.Sequential(
  nn.Linear(4, 5),
  nn.ReLU(),
  nn.Linear(5, 8),
  nn.ReLU(),
  nn.Linear(8, 3),
  nn.Softmax(dim=-1),
)

In [None]:
# Model Architecture
model

Sequential(
  (0): Linear(in_features=4, out_features=5, bias=True)
  (1): ReLU()
  (2): Linear(in_features=5, out_features=8, bias=True)
  (3): ReLU()
  (4): Linear(in_features=8, out_features=3, bias=True)
  (5): Softmax(dim=-1)
)

In [None]:
# forward pass with random input
model(torch.randn(5, 4))
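
The forward pass above returns one row of class probabilities per input example. A minimal sketch of how this can be checked, assuming the same layer sizes as the model above (the names `seq_model`, `batch`, and `probs` are illustrative and not part of the original notebook):

```python
import torch
from torch import nn

# Same architecture as the Sequential model above, rebuilt so this cell runs on its own
seq_model = nn.Sequential(
    nn.Linear(4, 5),
    nn.ReLU(),
    nn.Linear(5, 8),
    nn.ReLU(),
    nn.Linear(8, 3),
    nn.Softmax(dim=-1),
)

batch = torch.randn(5, 4)   # 5 examples with 4 features each
probs = seq_model(batch)    # layers are applied in the order they were listed

print(probs.shape)          # torch.Size([5, 3])
print(probs.sum(dim=-1))    # each row sums to approximately 1 because of the Softmax
print(seq_model[0])         # individual layers can be accessed by index
```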

## <h2><b>Building parameters manually and operating on them with the torch.nn.functional API</b></h2>

The `torch.nn.functional` API offers a lower-level and more flexible way to build a model than the sequential model. It can handle complex models, models with multiple inputs or outputs, custom operations, and models that share layers.



First we need to define an input and the parameters of each layer of the ANN (random for this tutorial). The `torch.nn.functional` API is then used to transform the inputs with these custom-defined parameters.



In [None]:
import torch
import torch.nn.functional as F

In [None]:
# Generate a random input: 4 examples with 28*28 features each
inputs = torch.randn(4, 28*28)

# Generate the parameters (weights and biases) for each layer.
# Gradient calculation must be enabled for these parameters so they can take part in backpropagation.
# Define the weights; each has shape (out_dim, in_dim)
W1 = torch.randn((64, 28*28), requires_grad=True)
W2 = torch.randn((64, 64), requires_grad=True)
W3 = torch.randn((10, 64), requires_grad=True)

# Define the bias terms; each has shape (out_dim,)
B1 = torch.randn((64), requires_grad=True)
B2 = torch.randn((64), requires_grad=True)
B3 = torch.randn((10), requires_grad=True)

# Layer 1 operation
h1 = F.relu(F.linear(inputs, W1, B1))
# Layer 2 operation
h2 = F.relu(F.linear(h1, W2, B2))
# Output Layer operation
output = F.softmax(F.linear(h2, W3, B3), dim=-1)
output

tensor([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], grad_fn=<SoftmaxBackward0>)

In [None]:
output.shape

torch.Size([4, 10])
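
The code comments above note that gradient calculation has to be enabled on the parameters for backpropagation. The following is a rough sketch of what that enables, not part of the original notebook. It makes several assumptions for illustration: the `targets` labels are hypothetical, the weights are scaled by 0.01 to keep the logits moderate, only two layers are used, and `F.cross_entropy` is applied to the raw logits instead of the softmax output used above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

inputs = torch.randn(4, 28 * 28)
targets = torch.tensor([1, 5, 5, 7])  # hypothetical class labels, for illustration only

# Small random weights (the 0.01 scale is an assumption made for this sketch)
W1 = (0.01 * torch.randn(64, 28 * 28)).requires_grad_()
B1 = torch.zeros(64, requires_grad=True)
W2 = (0.01 * torch.randn(10, 64)).requires_grad_()
B2 = torch.zeros(10, requires_grad=True)

# Forward pass with the functional API
h1 = F.relu(F.linear(inputs, W1, B1))
logits = F.linear(h1, W2, B2)

# cross_entropy applies log-softmax internally, so it takes raw logits
loss = F.cross_entropy(logits, targets)

# Backward pass: fills the .grad attribute of every tensor created with requires_grad=True
loss.backward()

print(W1.grad.shape, B1.grad.shape)
print(W2.grad.shape, B2.grad.shape)
```

Each gradient has the same shape as the parameter it belongs to, which is what an optimizer step would rely on.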

## <h2><b>Model Subclassing</b></h2>

The last method to build a PyTorch model is called model subclassing. Model subclassing is completely customizable and allows us to write our own custom forward pass for the model. In PyTorch, the `nn.Module` class is the base class used to define a model architecture. In the model subclassing implementation, `MyModel` inherits from the `nn.Module` class. The structure of model subclassing is that we create the layers in the initializer `__init__()` and define the forward pass in the `forward()` method.

<center>
<figure>
<img src="https://doc.google.com/a/fusemachines.com/uc?export=download&id=1_4Xbdu_folAQh4tUzGl1JLwiozwW0eHm" alt="gradient_update">
<figcaption align="center">Model Subclassing</figcaption>
</figure>
</center>

In [None]:
import torch
from torch import nn

In [None]:
# MyModel inherits from the nn.Module class
class MyModel(nn.Module):
  #Initialize the layers
  def __init__(self, in_features, **kwargs):
    super(MyModel,self).__init__(**kwargs)
    self.dense1 = nn.Linear(in_features, 16)
    self.act1 = nn.ReLU()
    self.dense2 = nn.Linear(16, 32)
    self.act2 = nn.ReLU()
    self.dense3 = nn.Linear(32, 5)
    self.act3 = nn.Softmax(dim=-1)

  # Forward pass
  def forward(self,inputs):
    x = self.act1(self.dense1(inputs))
    x = self.act2(self.dense2(x))
    x = self.act3(self.dense3(x))
    return x

In [None]:
# Model initialization and architecture
model = MyModel(4)
model

MyModel(
  (dense1): Linear(in_features=4, out_features=16, bias=True)
  (act1): ReLU()
  (dense2): Linear(in_features=16, out_features=32, bias=True)
  (act2): ReLU()
  (dense3): Linear(in_features=32, out_features=5, bias=True)
  (act3): Softmax(dim=-1)
)

In [None]:
# Forward pass with random input
model(torch.randn(8, 4))

tensor([[0.1960, 0.1673, 0.2338, 0.1879, 0.2150],
        [0.2018, 0.1580, 0.2144, 0.1979, 0.2279],
        [0.1986, 0.1812, 0.2193, 0.2048, 0.1960],
        [0.2236, 0.1663, 0.2267, 0.1839, 0.1995],
        [0.2124, 0.1840, 0.2268, 0.1893, 0.1875],
        [0.1929, 0.1562, 0.2399, 0.1763, 0.2346],
        [0.1839, 0.1659, 0.2421, 0.1830, 0.2250],
        [0.1875, 0.1620, 0.2132, 0.2181, 0.2192]], grad_fn=<SoftmaxBackward0>)
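
Layers assigned as attributes in `__init__()` are registered by `nn.Module`, so their parameters can be listed without any extra bookkeeping. A small sketch, using a hypothetical `TinyModel` as a shorter stand-in for `MyModel`:

```python
import torch
from torch import nn

class TinyModel(nn.Module):  # hypothetical, smaller stand-in for MyModel
    def __init__(self, in_features):
        super().__init__()
        # Assigning layers as attributes registers them with nn.Module
        self.dense1 = nn.Linear(in_features, 16)
        self.dense2 = nn.Linear(16, 5)

    def forward(self, inputs):
        x = torch.relu(self.dense1(inputs))
        return torch.softmax(self.dense2(x), dim=-1)

tiny = TinyModel(4)

# Parameter names follow the attribute names used in __init__
for name, param in tiny.named_parameters():
    print(name, tuple(param.shape))

# Total number of trainable values: (4*16 + 16) + (16*5 + 5)
n_params = sum(p.numel() for p in tiny.parameters())
print(n_params)  # 165
```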

# **Multiple inputs and outputs**

Scenarios


*   One input and two outputs: we only have image data, and we need to find out whether the image shows a flower and, if it does, what kind of flower it is.


In [None]:
import torch
from torch import nn
import torch.nn.functional as F

In [None]:
class SingleInpMultiOutModel(nn.Module):
    def __init__(self, in_features, **kwargs):
        super(SingleInpMultiOutModel, self).__init__(**kwargs)
        # Common Layer
        self.dense = nn.Linear(in_features, 128)

        # Output layers
        self.out1 = nn.Linear(128, 3)
        self.out2 = nn.Linear(128, 1)

    def forward(self, inputs):
        comm = F.relu(self.dense(inputs))
        out1 = F.softmax(self.out1(comm), dim=-1)
        out2 = torch.sigmoid(self.out2(comm))
        return out1, out2

# Define Model
model = SingleInpMultiOutModel(64)
model

SingleInpMultiOutModel(
  (dense): Linear(in_features=64, out_features=128, bias=True)
  (out1): Linear(in_features=128, out_features=3, bias=True)
  (out2): Linear(in_features=128, out_features=1, bias=True)
)

In [None]:
# Forward pass with random input
model(torch.randn(4, 64))

(tensor([[0.2854, 0.3190, 0.3956],
         [0.3456, 0.2737, 0.3807],
         [0.3145, 0.2875, 0.3980],
         [0.2339, 0.4925, 0.2736]], grad_fn=<SoftmaxBackward0>),
 tensor([[0.4719],
         [0.5441],
         [0.5017],
         [0.4375]], grad_fn=<SigmoidBackward0>))
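
Since the model returns a tuple, the two outputs can be unpacked and post-processed separately. A minimal sketch under stated assumptions: `TwoHeadModel` is a hypothetical stand-in with the same layer sizes as the model above, and the 0.5 decision threshold is chosen only for illustration.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TwoHeadModel(nn.Module):  # hypothetical stand-in for SingleInpMultiOutModel
    def __init__(self, in_features):
        super().__init__()
        self.dense = nn.Linear(in_features, 128)   # common layer
        self.kind_head = nn.Linear(128, 3)         # which kind of flower
        self.flower_head = nn.Linear(128, 1)       # flower or not

    def forward(self, inputs):
        comm = F.relu(self.dense(inputs))
        kind = F.softmax(self.kind_head(comm), dim=-1)
        flower = torch.sigmoid(self.flower_head(comm))
        return kind, flower

two_head = TwoHeadModel(64)

# Unpack the tuple returned by forward()
kind_probs, flower_prob = two_head(torch.randn(4, 64))

predicted_kind = kind_probs.argmax(dim=-1)   # index of the most likely class per example
is_flower = flower_prob.squeeze(-1) > 0.5    # threshold of 0.5 is an assumption

print(predicted_kind.shape, is_flower.shape)
```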

*   Two inputs and one output: we have both image data and structured data, and we need to find out what kind of flower it is.

In [None]:
class MultiInpSingleOutModel(nn.Module):
    def __init__(self, in_features1, in_features2, **kwargs):
        super(MultiInpSingleOutModel, self).__init__(**kwargs)
        # Separate input-path layers
        self.dense1 = nn.Linear(in_features1, 10)
        self.dense2 = nn.Linear(in_features2, 64)

        # Output layer
        self.output = nn.Linear(64+10, 3)  # after concatenation, the features fed to this layer have shape (batch_size, 64+10)

    def forward(self, input1, input2):
        x1 = F.relu(self.dense1(input1))
        x2 = F.relu(self.dense2(input2))
        x = torch.concat([x1, x2], dim=-1)  # concatenate using last dimension
        out = F.softmax(self.output(x), dim=-1)
        return out

# Define Model
model = MultiInpSingleOutModel(784, 4)
model

MultiInpSingleOutModel(
  (dense1): Linear(in_features=784, out_features=10, bias=True)
  (dense2): Linear(in_features=4, out_features=64, bias=True)
  (output): Linear(in_features=74, out_features=3, bias=True)
)

In [None]:
# Forward pass with random inputs
model(torch.randn(5, 784),
      torch.randn(5, 4))

tensor([[0.2752, 0.3207, 0.4041],
        [0.2402, 0.3278, 0.4320],
        [0.2811, 0.2587, 0.4602],
        [0.3392, 0.3496, 0.3112],
        [0.2647, 0.2636, 0.4718]], grad_fn=<SoftmaxBackward0>)
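
The output layer above expects `64+10` features because concatenating along the last dimension adds the feature counts of the two paths while leaving the batch size unchanged. A short sketch with random tensors standing in for the two hidden representations (`torch.concat` is an alias of `torch.cat`):

```python
import torch

x1 = torch.randn(5, 10)   # stand-in for the output of the first input path
x2 = torch.randn(5, 64)   # stand-in for the output of the second input path

# Concatenate along the last (feature) dimension
x = torch.cat([x1, x2], dim=-1)

print(x.shape)  # torch.Size([5, 74])
```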


*   Two inputs and two outputs: we have both image data and structured data, and we need to find out whether the image shows a flower and, if it does, what kind of flower it is.

In [None]:
class MultiInpMultiOutModel(nn.Module):
    def __init__(self, in_features1, in_features2, **kwargs):
        super(MultiInpMultiOutModel, self).__init__(**kwargs)
        # Separate input-path layers
        self.dense1 = nn.Linear(in_features1, 10)
        self.dense2 = nn.Linear(in_features2, 64)

        # Output layers
        self.output1 = nn.Linear(64+10, 3)  # after concatenation, the features fed to this layer have shape (batch_size, 64+10)
        self.output2 = nn.Linear(64+10, 1)

    def forward(self, input1, input2):
        x1 = F.relu(self.dense1(input1))
        x2 = F.relu(self.dense2(input2))
        x = torch.concat([x1, x2], dim=-1)  # concatenate using last dimension
        out1 = F.softmax(self.output1(x), dim=-1)
        out2 = torch.sigmoid(self.output2(x))
        return out1, out2

# Define Model
model = MultiInpMultiOutModel(784, 4)
model

MultiInpMultiOutModel(
  (dense1): Linear(in_features=784, out_features=10, bias=True)
  (dense2): Linear(in_features=4, out_features=64, bias=True)
  (output1): Linear(in_features=74, out_features=3, bias=True)
  (output2): Linear(in_features=74, out_features=1, bias=True)
)

In [None]:
# Forward pass with random inputs
model(torch.randn(5, 784),
      torch.randn(5, 4))

(tensor([[0.2486, 0.4236, 0.3278],
         [0.2756, 0.3695, 0.3549],
         [0.2355, 0.3857, 0.3788],
         [0.2473, 0.3889, 0.3638],
         [0.1644, 0.3778, 0.4577]], grad_fn=<SoftmaxBackward0>),
 tensor([[0.5052],
         [0.4569],
         [0.4962],
         [0.4677],
         [0.4761]], grad_fn=<SigmoidBackward0>))

## <h2><b>Shared layer</b></h2>

One of the benefits of model subclassing is that it allows us to define shared layers, which reuse the same parameters. Shared layers are layer instances that are reused multiple times in the same model. As the implementation shows, to share a layer in the model we call the same layer instance multiple times in the `forward` method.

In [None]:
class SharedLayerModel(nn.Module):
    def __init__(self, in_features, **kwargs):
        super(SharedLayerModel, self).__init__(**kwargs)
        self.input_layer = nn.Linear(in_features, 64)
        self.share_layer = nn.Linear(64, 64)
        self.output_layer = nn.Linear(64, 5)

    def forward(self, inputs):
        x = F.relu(self.input_layer(inputs))
        # Now we apply the shared layer three times
        for _ in range(3):
            x = F.relu(self.share_layer(x))

        x = F.softmax(self.output_layer(x), dim=-1)
        return x

# Define Model
model = SharedLayerModel(128)
model

SharedLayerModel(
  (input_layer): Linear(in_features=128, out_features=64, bias=True)
  (share_layer): Linear(in_features=64, out_features=64, bias=True)
  (output_layer): Linear(in_features=64, out_features=5, bias=True)
)

In [None]:
# Forward pass with random inputs
model(torch.randn(5, 128))

tensor([[0.2091, 0.2111, 0.2165, 0.1879, 0.1754],
        [0.2042, 0.2142, 0.2182, 0.1879, 0.1755],
        [0.2057, 0.2142, 0.2177, 0.1875, 0.1749],
        [0.2042, 0.2124, 0.2193, 0.1894, 0.1747],
        [0.2035, 0.2132, 0.2198, 0.1894, 0.1741]], grad_fn=<SoftmaxBackward0>)
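
Because the same layer instance is reused, applying it more often does not add parameters. A small sketch that checks this, using a hypothetical `SharedDemo` module that contains only the shared layer (the `repeats` argument and the `count_params` helper are introduced here for illustration):

```python
import torch
from torch import nn
import torch.nn.functional as F

class SharedDemo(nn.Module):  # hypothetical module containing only a shared layer
    def __init__(self, repeats):
        super().__init__()
        self.repeats = repeats
        self.share_layer = nn.Linear(64, 64)

    def forward(self, x):
        # The same layer instance (same weights and bias) is applied repeatedly
        for _ in range(self.repeats):
            x = F.relu(self.share_layer(x))
        return x

def count_params(module):
    # Illustrative helper: total number of values across all parameters
    return sum(p.numel() for p in module.parameters())

once = SharedDemo(repeats=1)
thrice = SharedDemo(repeats=3)

# 64*64 weights + 64 biases, regardless of how often the layer is applied
print(count_params(once), count_params(thrice))  # 4160 4160
```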

## Key Takeaways:


1.   Learn to implement a PyTorch model in three different ways
2.   The Sequential model is used for simple stacks of layers; the `torch.nn.functional` API offers lower-level building blocks for complex models, such as those with multiple inputs and outputs or shared layers; and model subclassing allows us to build our own custom model.

