# Sequential & Batch Norm
**In this episode, we're going to learn how to use PyTorch's Sequential class to build neural networks.**
## PyTorch Sequential Module
The `Sequential` class allows us to build PyTorch neural networks on-the-fly **without** having to build an **explicit class**. This makes it much easier to rapidly build networks and allows us to skip the step where we implement the `forward()` method. When we build a PyTorch network the sequential way, the `forward()` method is constructed implicitly from the order in which we define the network's layers.
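
To make the implicit `forward()` concrete, here is a minimal sketch. The layer sizes and the random input are assumptions chosen only for illustration. It compares calling the container directly with applying each contained module by hand, in order.

```python
import torch
import torch.nn as nn

# Hypothetical toy sizes, chosen only for illustration.
model = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),
    nn.Linear(3, 2),
)

x = torch.rand(5, 4)  # a batch of 5 random samples

# Calling the container runs its implicit forward pass.
out_implicit = model(x)

# The same computation written out by hand: feed each module's
# output into the next module, in the order they were passed in.
t = x
for layer in model:
    t = layer(t)
out_manual = t

print(torch.equal(out_implicit, out_manual))
```

Both paths run the same modules in the same order, which is all the implicit `forward()` amounts to.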

A sequential module is a **container or wrapper** class that **extends** the `nn.Module` base class and allows us to compose modules together. We can compose any `nn.Module` with any other `nn.Module`.

This means that we can compose layers to make networks, and since networks are also `nn.Module` instances, we can also compose networks with one another. Additionally, since the `Sequential` class is itself an `nn.Module`, we can even compose `Sequential` modules with one another.
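
As a rough sketch of this kind of composition, the snippet below nests two `Sequential` modules inside a third. The names `feature_extractor` and `classifier`, along with the layer sizes, are hypothetical and used only for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical sub-networks; names and sizes are for illustration only.
feature_extractor = nn.Sequential(
    nn.Flatten(start_dim=1),
    nn.Linear(28 * 28, 64),
    nn.ReLU(),
)
classifier = nn.Sequential(
    nn.Linear(64, 10),
)

# A Sequential is itself an nn.Module, so it can be nested in another one.
composed = nn.Sequential(feature_extractor, classifier)

batch = torch.rand(2, 1, 28, 28)  # random stand-in for two 28x28 images
preds = composed(batch)
print(preds.shape)
```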

At this point, we may be wondering about other required functions and operations, like pooling operations or activation functions. Well, the answer is that the commonly used functions and operations in the `nn.functional` API have been wrapped up into `nn.Module` classes. This allows us to pass things like activation functions to `Sequential` wrappers to fully build out our networks in a sequential way.
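
The short sketch below compares two functional calls with their module wrappers on a random tensor (the tensor shape is an arbitrary assumption). It is meant only to illustrate that `nn.ReLU` and `nn.MaxPool2d` compute the same results as `F.relu` and `F.max_pool2d`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

t = torch.randn(1, 1, 4, 4)  # arbitrary random input

# Activation: functional call vs. module wrapper.
relu_functional = F.relu(t)
relu_module = nn.ReLU()(t)

# Pooling: functional call vs. module wrapper.
pool_functional = F.max_pool2d(t, kernel_size=2, stride=2)
pool_module = nn.MaxPool2d(kernel_size=2, stride=2)(t)

print(torch.equal(relu_functional, relu_module))
print(torch.equal(pool_functional, pool_module))
```

Because the module versions are `nn.Module` instances, they can be passed to `Sequential`, whereas the plain functions cannot.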

## Building PyTorch Sequential Networks
There are **three** ways to create a Sequential model. Let's see them in action.
### Code Setup
Firstly, we handle our imports.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

import matplotlib.pyplot as plt
import math

from collections import OrderedDict

torch.set_printoptions(linewidth=150)

Then, we need to create a dataset that we can use for the purposes of passing a sample to the networks we will be building.

In [2]:
train_set = torchvision.datasets.FashionMNIST(
    root = './data',
    train = True,
    download = True,
    transform = transforms.Compose([transforms.ToTensor()])
)

Now, we'll grab a sample image from the FashionMNIST dataset instance.

In [3]:
image, label = train_set[0]
image.shape

torch.Size([1, 28, 28])

Now, we'll grab some values that will be used to construct our networks.

In [4]:
in_features = image.numel()
in_features

784

In [5]:
out_features = math.floor(in_features / 2)
out_features

392

In [6]:
type(image.numel())

int

In [7]:
out_classes = len(train_set.classes)
out_classes

10

### Sequential Model Initialization: Way 1
The first way to create a sequential model is to pass `nn.Module` instances **directly** to the `Sequential` class constructor.

In [8]:
network1 = nn.Sequential(
    nn.Flatten(start_dim = 1),
    nn.Linear(in_features, out_features),
    nn.Linear(out_features, out_classes)
)

In [10]:
network1

Sequential(
  (0): Flatten()
  (1): Linear(in_features=784, out_features=392, bias=True)
  (2): Linear(in_features=392, out_features=10, bias=True)
)

### Sequential Model Initialization: Way 2
The second way to create a sequential model is to create an `OrderedDict` that contains `nn.Module` instances. Then, pass the dictionary to the `Sequential` class constructor.

In [9]:
layers = OrderedDict([
    ('flat',nn.Flatten(start_dim=1)),
    ('hidden',nn.Linear(in_features, out_features)),
    ('output',nn.Linear(out_features, out_classes))
])

network2 = nn.Sequential(layers)

In [11]:
network2

Sequential(
  (flat): Flatten()
  (hidden): Linear(in_features=784, out_features=392, bias=True)
  (output): Linear(in_features=392, out_features=10, bias=True)
)

This way of initialization allows us to name the `nn.Module` instances explicitly.
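
One practical consequence of naming, sketched below with the sizes hard-coded as an assumption so the snippet stands alone: named modules can be reached as attributes of the container, while integer indexing continues to work.

```python
from collections import OrderedDict

import torch.nn as nn

# Sizes hard-coded here (784 -> 392 -> 10) so the sketch is self-contained.
named = nn.Sequential(OrderedDict([
    ('flat', nn.Flatten(start_dim=1)),
    ('hidden', nn.Linear(784, 392)),
    ('output', nn.Linear(392, 10)),
]))

# Named modules are available as attributes of the container...
print(named.hidden)

# ...and positional indexing still refers to the same objects.
print(named[1] is named.hidden)
```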
### Sequential Model Initialization: Way 3
The third way of creating a sequential model is to create a sequential instance using an empty constructor. Then, we can use the `add_module()` method to add `nn.Module` instances to the network after it has already been initialized.

In [12]:
network3 = nn.Sequential()
network3.add_module('flat',nn.Flatten(start_dim = 1))
network3.add_module('hidden',nn.Linear(in_features,out_features))
network3.add_module('output',nn.Linear(out_features,out_classes))

In [13]:
network3

Sequential(
  (flat): Flatten()
  (hidden): Linear(in_features=784, out_features=392, bias=True)
  (output): Linear(in_features=392, out_features=10, bias=True)
)
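
To check that the three construction styles behave the same way, the sketch below rebuilds each one and passes a single sample through it. A random tensor stands in for the FashionMNIST image so that nothing needs to be downloaded, and the helper `make_layers` is a hypothetical convenience, not part of PyTorch.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

in_features, out_features, out_classes = 784, 392, 10

def make_layers():
    # Hypothetical helper: returns fresh (name, module) pairs on each call.
    return [
        ('flat', nn.Flatten(start_dim=1)),
        ('hidden', nn.Linear(in_features, out_features)),
        ('output', nn.Linear(out_features, out_classes)),
    ]

# Way 1: pass module instances directly to the constructor.
net_a = nn.Sequential(*[module for _, module in make_layers()])

# Way 2: pass an OrderedDict of named modules.
net_b = nn.Sequential(OrderedDict(make_layers()))

# Way 3: start empty and add modules one at a time.
net_c = nn.Sequential()
for name, module in make_layers():
    net_c.add_module(name, module)

# Random stand-in for one FashionMNIST sample of shape [1, 28, 28].
image = torch.rand(1, 28, 28)
batch = image.unsqueeze(0)  # add a batch dimension -> [1, 1, 28, 28]

shapes = [net(batch).shape for net in (net_a, net_b, net_c)]
print(shapes)
```

Each network has its own randomly initialized weights, so the prediction values differ, but every output has one row of `out_classes` scores.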

## Class Definition Vs Sequential
So far in this course, we've been working with a CNN that we defined using a class definition. The network is defined like this:

In [14]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 12, 5)

        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):

        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        t = t.flatten(start_dim=1)
        t = F.relu(self.fc1(t))
        t = F.relu(self.fc2(t))
        t = self.out(t)

        return t

We get an instance of the network like so:
```python
network = Network()
```
Now, let's see how this same network can be created using the `Sequential` class. It works like this:

In [15]:
sequential = nn.Sequential(
      nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
    , nn.ReLU()
    , nn.MaxPool2d(kernel_size=2, stride=2)
    , nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
    , nn.ReLU()
    , nn.MaxPool2d(kernel_size=2, stride=2)
    , nn.Flatten(start_dim=1)  
    , nn.Linear(in_features=12*4*4, out_features=120)
    , nn.ReLU()
    , nn.Linear(in_features=120, out_features=60)
    , nn.ReLU()
    , nn.Linear(in_features=60, out_features=10)
)

We said that these networks are the **same**. But what do we mean? In this case, we mean that the networks have the **same architecture**. From a programming standpoint, the two networks are **different types** under the hood.

Note that we can get the same output predictions from these two networks if we fix the seed that is used to generate random numbers in PyTorch. Both networks start with randomly generated weights, so without a fixed seed their weights, and therefore their predictions, will differ. To be sure the weights are the same, we call the PyTorch function below before creating each network.
`torch.manual_seed(50)`
It's important to note that the method must be called twice, once before each network initialization.
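
A minimal sketch of this check follows, assuming a CPU run and a random tensor in place of a real FashionMNIST image. It relies on both definitions creating their weighted layers in the same order, so the same sequence of random numbers initializes them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 12, 5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        t = F.max_pool2d(F.relu(self.conv1(t)), kernel_size=2, stride=2)
        t = F.max_pool2d(F.relu(self.conv2(t)), kernel_size=2, stride=2)
        t = t.flatten(start_dim=1)
        t = F.relu(self.fc1(t))
        t = F.relu(self.fc2(t))
        return self.out(t)

# Seed once before the first network...
torch.manual_seed(50)
network = Network()

# ...and again before the second, so both draw the same random weights.
torch.manual_seed(50)
sequential = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(start_dim=1),
    nn.Linear(in_features=12*4*4, out_features=120),
    nn.ReLU(),
    nn.Linear(in_features=120, out_features=60),
    nn.ReLU(),
    nn.Linear(in_features=60, out_features=10),
)

image = torch.rand(1, 1, 28, 28)  # random stand-in for one batched image

with torch.no_grad():
    same_weights = torch.equal(network.conv1.weight, sequential[0].weight)
    same_preds = torch.allclose(network(image), sequential(image))

print(same_weights, same_preds)
```

If the second `torch.manual_seed(50)` call is left out, the second network is initialized from a later point in the random stream and the comparison should no longer hold.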

## Quiz 01
1. The `Sequential` class allows us to build PyTorch neural networks on-the-fly without having to build an explicit _______________.
  * class

2. When we build a `Sequential` model, our `forward()` method is defined explicitly.
  * False
  
3. A sequential module is a container or _______________ class that allows us to compose modules together.
  * wrapper
  
4. The Sequential class extends the _______________ class.
  * nn.Module
  
5. The `nn.Flatten()` module is a wrapper class that wraps the `torch.flatten()` function.
  * True
  
---