# The Sequential Model

Keras is an open-source software library that provides a Python interface for artificial neural networks. Keras acts as an interface for the TensorFlow library.

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. When to use a Sequential Model
A `Sequential` model is appropriate for a **plain stack of layers** where each layer has **exactly one input tensor and one output tensor.** Schematically, the following `Sequential` model:

In [2]:
model = keras.Sequential(
    [
        layers.Dense(2, activation = 'relu', name = 'layer1'),
        layers.Dense(3, activation = 'relu', name = 'layer2'),
        layers.Dense(4, name = 'layer3')
    ]
)

x = tf.ones((3,3))
y = model(x)

In [3]:
y

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

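is equivalent to this function (the printed values differ from the ones above only because each set of layers gets its own randomly initialized weights):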

In [4]:
# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))

In [5]:
y

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0.04394899,  0.04787931, -0.1282379 , -0.02191474],
       [ 0.04394899,  0.04787931, -0.1282379 , -0.02191474],
       [ 0.04394899,  0.04787931, -0.1282379 , -0.02191474]],
      dtype=float32)>
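Because the two cells above create separate layers with separate random weights, their printed outputs cannot be compared directly. The following is a minimal sketch of one way to check the equivalence, assuming the same three layer objects are reused in both code paths; the names `y_manual`, `y_sequential`, and `outputs_match` are introduced here for illustration only.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Create the three layers once so that both code paths share the same weights.
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

x = tf.ones((3, 3))

# Path 1: call the layers by hand (this also creates their weights).
y_manual = layer3(layer2(layer1(x)))

# Path 2: wrap the very same layer objects in a Sequential model.
model = keras.Sequential([layer1, layer2, layer3])
y_sequential = model(x)

# With shared weights, both paths should produce the same values.
outputs_match = np.allclose(y_manual.numpy(), y_sequential.numpy())
print(outputs_match)
```

With shared weights the two results agree, which is the sense in which the model and the nested function call are equivalent.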

A Sequential model is not appropriate when:

- Your model has multiple inputs or multiple outputs
- Any of your layers has multiple inputs or multiple outputs
- You need to do layer sharing
- You want non-linear topology (e.g. a residual connection, a multi-branch model)
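To make the last bullet concrete, here is a small sketch of a residual connection written with the Functional API. The layer sizes and the names `residual_model` and `residual_sketch` are arbitrary choices for illustration, not taken from the text above. The `Add` layer receives two input tensors, which is exactly what a plain stack of layers cannot express.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(8,))
hidden = layers.Dense(8, activation="relu")(inputs)

# Residual connection: Add takes two tensors (the input and the hidden
# activation), so this graph is not a simple chain of layers.
residual = layers.Add()([inputs, hidden])
outputs = layers.Dense(4)(residual)

residual_model = keras.Model(inputs=inputs, outputs=outputs, name="residual_sketch")

y = residual_model(tf.ones((3, 8)))
print(y.shape)
```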

# 2. Creating a Sequential model

You can create a Sequential model by passing a list of layers to the `Sequential` constructor:

In [6]:
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

Its layers are accessible via the `layers` attribute:

In [7]:
model.layers

[<keras.layers.core.dense.Dense at 0x21db470db80>,
 <keras.layers.core.dense.Dense at 0x21db470d160>,
 <keras.layers.core.dense.Dense at 0x21db470d370>]

You can also create a Sequential model incrementally via the `add()` method. Also note that the `Sequential` constructor accepts a `name` argument, just like any layer or model in Keras:

In [8]:
model = keras.Sequential(name = "my_sequential")
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

Note that there's also a corresponding `pop()` method to remove layers: a Sequential model behaves very much like a list of layers.

In [9]:
model.pop()
print(len(model.layers))

2


# 3. Specifying the input shape in advance

Generally, all layers in Keras need to know the shape of their inputs in order to be able to create their weights. So when you create a layer like this, initially, it has no weights:

In [10]:
layer = layers.Dense(3)
layer.weights

[]

It creates its weights the first time it is called on an input, since the shape of the weights depends on the shape of the inputs:

In [11]:
x = tf.ones((1,4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

[<tf.Variable 'dense_6/kernel:0' shape=(4, 3) dtype=float32, numpy=
 array([[ 0.6092166 ,  0.5091746 , -0.71757865],
        [-0.7222787 , -0.5411536 ,  0.16909516],
        [ 0.05130666, -0.04509193,  0.25447464],
        [-0.53271365, -0.31469423, -0.82948416]], dtype=float32)>,
 <tf.Variable 'dense_6/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
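To see why the weight shapes depend on the input shape, it can help to write out what a `Dense` layer without an activation computes: `y = x @ kernel + bias`. The NumPy sketch below is an illustration only, not Keras code. It uses normally distributed random values as a stand-in for the layer's own initializer, and the names `in_features` and `units` are chosen here for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

in_features = 4  # last dimension of the input, as in x = tf.ones((1, 4))
units = 3        # as in layers.Dense(3)

x = np.ones((1, in_features), dtype=np.float32)

# The kernel must have one row per input feature and one column per unit,
# so it cannot be allocated before the input's last dimension is known.
kernel = rng.normal(size=(in_features, units)).astype(np.float32)
bias = np.zeros((units,), dtype=np.float32)

y = x @ kernel + bias
print(kernel.shape, bias.shape, y.shape)  # (4, 3) (3,) (1, 3)
```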

Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and calling `model.weights` results in an error stating just this). The weights are created when the model first sees some input data:

In [12]:
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

x = tf.ones((1, 4))
y = model(x)

Once a model is "built", you can call its `summary()` method to display its contents:

In [13]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_7 (Dense)             (1, 2)                    10        
                                                                 
 dense_8 (Dense)             (1, 3)                    9         
                                                                 
 dense_9 (Dense)             (1, 4)                    16        
                                                                 
Total params: 35
Trainable params: 35
Non-trainable params: 0
_________________________________________________________________
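The `Param #` column can be checked by hand: a `Dense` layer with a bias holds `in_features * units` kernel entries plus `units` bias entries. The helper `dense_param_count` below is a hypothetical name introduced for this sketch; it is not part of Keras.

```python
def dense_param_count(in_features: int, units: int) -> int:
    """Number of weights in a Dense layer with a bias term."""
    return in_features * units + units


layer_units = [2, 3, 4]  # units of the three Dense layers in the model above
in_features = 4          # last dimension of the input, x = tf.ones((1, 4))

counts = []
for units in layer_units:
    counts.append(dense_param_count(in_features, units))
    in_features = units  # the next layer consumes this layer's output

print(counts, sum(counts))  # [10, 9, 16] 35
```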


However, it can be very useful when building a Sequential model incrementally to be able to display the summary of the model so far, including the current output shape. In this case, you should start your model by passing an `Input` object to your model, so that it knows its input shape from the start:

In [14]:
model = keras.Sequential(
    [
        keras.Input(shape = (4, )),
        layers.Dense(2, activation = "relu")
    ]
)

model.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_10 (Dense)            (None, 2)                 10        
                                                                 
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________


Note that the `Input` object is not displayed as part of `model.layers`, since it isn't a layer:

In [15]:
model.layers

[<keras.layers.core.dense.Dense at 0x21da9b5a910>]

A simple alternative is to pass an `input_shape` argument to your first layer. The `input_shape` ***specifies the shape of a single training example***; ***the batch dimension is not included.***

In [16]:
model = keras.Sequential(
    [
        layers.Dense(2, activation = "relu", input_shape = (4, ))
    ]
)

model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_11 (Dense)            (None, 2)                 10        
                                                                 
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________


Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.
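A quick way to confirm this is to inspect the model before it has seen any data. The sketch below rebuilds the small model from the `Input` example; the variable name `weight_shapes` is introduced here for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(2, activation="relu"),
    ]
)

# No data has been passed through the model yet.
weight_shapes = [tuple(w.shape) for w in model.weights]

print(model.built)         # whether the weights have been created
print(weight_shapes)       # kernel and bias shapes
print(model.output_shape)  # the batch dimension is reported as None
```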

__*In general, it's a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is*__.

# 4. A common debugging workflow: `add()`  + `summary()`

When building a new Sequential architecture, it's useful to incrementally stack layers with `add()` and frequently print model summaries. For instance, this enables you to monitor how a stack of `Conv2D` and `MaxPooling2D` layers is downsampling image feature maps:

In [17]:
model = keras.Sequential()
model.add(keras.Input(shape = (250,250,3)))
model.add(layers.Conv2D(32, 5, strides = 2, activation = "relu"))
model.add(layers.Conv2D(32, 3, activation = "relu"))
model.add(layers.MaxPooling2D(3))

model.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 123, 123, 32)      2432      
                                                                 
 conv2d_1 (Conv2D)           (None, 121, 121, 32)      9248      
                                                                 
 max_pooling2d (MaxPooling2D  (None, 40, 40, 32)       0         
 )                                                               
                                                                 
Total params: 11,680
Trainable params: 11,680
Non-trainable params: 0
_________________________________________________________________


The current output shape is (40, 40, 32), so we can keep downsampling:

In [18]:
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

model.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 123, 123, 32)      2432      
                                                                 
 conv2d_1 (Conv2D)           (None, 121, 121, 32)      9248      
                                                                 
 max_pooling2d (MaxPooling2D  (None, 40, 40, 32)       0         
 )                                                               
                                                                 
 conv2d_2 (Conv2D)           (None, 38, 38, 32)        9248      
                                                                 
 conv2d_3 (Conv2D)           (None, 36, 36, 32)        9248      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 12, 12, 32)       0         
 2D)                                                  
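The spatial sizes reported in these summaries can be reproduced with plain arithmetic. With the default `"valid"` padding, a window of size `window` moved with step `stride` over an axis of length `size` yields `(size - window) // stride + 1` positions, and `MaxPooling2D` uses its pool size as the stride unless told otherwise. The helper `valid_output_size` is a hypothetical name used only in this sketch.

```python
def valid_output_size(size: int, window: int, stride: int) -> int:
    """Output length along one spatial axis with 'valid' (no) padding."""
    return (size - window) // stride + 1


# (kind, window, stride) for every layer added in the two cells above.
stack = [
    ("conv", 5, 2),
    ("conv", 3, 1),
    ("pool", 3, 3),
    ("conv", 3, 1),
    ("conv", 3, 1),
    ("pool", 3, 3),
    ("conv", 3, 1),
    ("conv", 3, 1),
    ("pool", 2, 2),
]

size = 250  # height and width of the input images
sizes = []
for kind, window, stride in stack:
    size = valid_output_size(size, window, stride)
    sizes.append(size)

print(sizes)  # [123, 121, 40, 38, 36, 12, 10, 8, 4]
```

The first six values match the shapes shown in the summary, and the last three follow from the same rule for the remaining layers.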

# 5. What to do once you have a model?

Once your model architecture is ready, you will want to:

- Train your model, evaluate it, and run inference.

- Save your model to disk and restore it.

- Speed up model training by leveraging multiple GPUs.
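As a rough illustration of the first item, the sketch below trains, evaluates, and runs inference with a tiny Sequential model on random data. The data, layer sizes, optimizer, and loss are arbitrary assumptions made for this example; saving and multi-GPU training are not covered here.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data: 64 samples with 4 features and 3 integer classes.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 4)).astype("float32")
y_train = rng.integers(0, 3, size=(64,))

model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(8, activation="relu"),
        layers.Dense(3),  # raw logits, one per class
    ]
)

model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

history = model.fit(x_train, y_train, batch_size=16, epochs=2, verbose=0)
loss, accuracy = model.evaluate(x_train, y_train, verbose=0)
predictions = model.predict(x_train[:5], verbose=0)

print(predictions.shape)
```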


# 6. Feature Extraction with a Sequential Model

Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an `input` and an `output` attribute. These attributes can be used to do neat things, like quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model:

In [19]:
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),  # 250x250 RGB images (3 channels)
        layers.Conv2D(32, 5, strides = 2, activation = 'relu'),
        layers.Conv2D(32, 3, activation = 'relu'),
        layers.Conv2D(32, 3, activation = 'relu')
    ]
)

feature_extractor = keras.Model(
    inputs = initial_model.inputs, outputs = [layer.output for layer in initial_model.layers]
)

x = tf.ones((1,250,250,3))
features = feature_extractor(x)
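Because `outputs` was given as a list, the extractor returns one tensor per layer. The sketch below rebuilds the same model so that it runs on its own and then lists the shape of each extracted feature map; the name `feature_shapes` is introduced here for illustration.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)

feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

features = feature_extractor(tf.ones((1, 250, 250, 3)))

# One feature map per layer, in the order the layers were stacked.
feature_shapes = [tuple(f.shape) for f in features]
for layer, shape in zip(initial_model.layers, feature_shapes):
    print(layer.name, shape)
```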

Here's a similar example that only extracts features from one layer:

In [20]:
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),  # 250x250 RGB images (3 channels)
        layers.Conv2D(32, 5, strides = 2, activation = 'relu'),
        layers.Conv2D(32, 3, activation = 'relu', name = 'my_intermediate_layer'),
        layers.Conv2D(32, 3, activation = 'relu')
    ]
)

feature_extractor = keras.Model(
    inputs = initial_model.inputs, outputs = initial_model.get_layer(name = 'my_intermediate_layer').output
)

x = tf.ones((1,250,250,3))
features = feature_extractor(x)

# 7. Transfer Learning with a Sequential Model

Transfer learning consists of freezing the bottom layers in a model and only training the top layers.

Here are two common transfer learning blueprints involving Sequential models.

First, let's say that you have a Sequential model, and you want to freeze all layers except the last one. In this case, you would simply iterate over `model.layers` and set `layer.trainable = False` on each layer, except the last one. Like this:

In [21]:
"""
model = keras.Sequential([
    keras.Input(shape = (784,)),
    layers.Dense(32, activation = 'relu'),
    layers.Dense(32, activation = 'relu'),
    layers.Dense(32, activation = 'relu'),
    layers.Dense(10)
])

model.load_weights(...)

for layer in model.layers[:-1]:
    layer.trainable = False


model.compile(...)
model.fit(...)
"""

Another common blueprint is to use a Sequential model to stack a pre-trained model and some freshly initialized classification layers. Like this:

In [22]:
"""
# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)
"""

If you do transfer learning, you will probably find yourself frequently using these two patterns.
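The first blueprint can be checked without any pre-trained weights. The sketch below builds the same architecture with freshly initialized weights (the weight-loading step is deliberately left out), freezes every layer except the last, and inspects which weights remain trainable. The names `trainable_shapes` and `frozen_count` are introduced here for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(784,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(10),
    ]
)

# In a real workflow, pre-trained weights would be loaded at this point.

# Freeze every layer except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False

# Only the kernel and bias of the final Dense layer remain trainable.
trainable_shapes = [tuple(w.shape) for w in model.trainable_weights]
frozen_count = len(model.non_trainable_weights)

print(trainable_shapes)
print(frozen_count)
```

Three frozen `Dense` layers contribute a kernel and a bias each, which accounts for the six non-trainable weights.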