# Keras Sequential model

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2020/04/12<br>
**Last modified:** 2023/06/25<br>
**Description:** Complete guide to the Sequential model.

## Setup

In [None]:
import keras
from keras import layers
from keras import ops

## When to use a Sequential model

A `Sequential` model is appropriate for **a plain stack of layers**
where each layer has **exactly one input tensor and one output tensor**.

For example, the following `Sequential` model stacks three `Dense` layers:

In [None]:
# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        keras.Input(shape=(3,1)),
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)


A Sequential model is **not appropriate** when:

- Your model has multiple inputs or multiple outputs
- Any of your layers has multiple inputs or multiple outputs
- You need to do layer sharing
- You want non-linear topology (e.g. a residual connection, a multi-branch
model)

## Creating a Sequential model

You can create a Sequential model by passing a list of layers to the Sequential
constructor:

In [None]:
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

Its layers are accessible via the `layers` attribute:

In [None]:
model.layers

[<Dense name=dense, built=False>,
 <Dense name=dense_1, built=False>,
 <Dense name=dense_2, built=False>]

You can also create a Sequential model incrementally via the `add()` method:

In [None]:
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

Note that there's also a corresponding `pop()` method to remove layers:
a Sequential model behaves very much like a list of layers.

In [None]:
model.pop()
print(len(model.layers))  # 2

2


Also note that the Sequential constructor accepts a `name` argument, just like
any layer or model in Keras. This is useful to annotate TensorBoard graphs
with semantically meaningful names.

In [None]:
model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

## Specifying the input shape in advance

Generally, all layers in Keras need to know the shape of their inputs
in order to be able to create their weights. So when you create a layer like
this, initially, it has no weights:

In [None]:
layer = layers.Dense(3)
layer.weights  # Empty

[]

It creates its weights the first time it is called on an input, since the shape
of the weights depends on the shape of the inputs:

In [None]:
# Call layer on a test input
x = ops.ones((1, 4)) # creates a tensor of shape (1, 4) filled with ones.
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

[<Variable path=dense_6/kernel, shape=(4, 3), dtype=float32, value=[[ 0.772823    0.559764   -0.4932681 ]
  [ 0.01619363 -0.34241778  0.6054685 ]
  [ 0.7129122   0.34047163 -0.08443046]
  [-0.6911168  -0.6273356   0.03513336]]>,
 <Variable path=dense_6/bias, shape=(3,), dtype=float32, value=[0. 0. 0.]>]
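To see where the shapes `(4, 3)` and `(3,)` come from: a `Dense` layer computes `activation(x @ kernel + bias)`, so the kernel needs one row per input feature and one column per unit. Here is a minimal NumPy sketch of that shape logic; `init_dense_weights` is a hypothetical helper, and its random initialization is only a stand-in for the Keras initializers.

```python
import numpy as np


def init_dense_weights(input_dim, units, seed=0):
    """Hypothetical helper: create a kernel and a bias for a Dense-like layer."""
    rng = np.random.default_rng(seed)
    kernel = rng.normal(scale=0.1, size=(input_dim, units)).astype("float32")
    bias = np.zeros((units,), dtype="float32")
    return kernel, bias


x = np.ones((1, 4), dtype="float32")

# The kernel shape can only be chosen once the last input dimension is known
kernel, bias = init_dense_weights(input_dim=x.shape[-1], units=3)
y = x @ kernel + bias

print(kernel.shape, bias.shape, y.shape)  # (4, 3) (3,) (1, 3)
```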

Naturally, this also applies to Sequential models. When you instantiate a
Sequential model without an input shape, it isn't "built": it has no weights yet
(`model.weights` is empty, as the output below shows). The weights are created
when the model first sees some input data:

In [None]:
model = keras.Sequential(
    [
        # keras.Input(shape=(4,)),  # Without this line, no weights are defined at this stage
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

model.weights


[]

In [None]:
# Call the model on a test input
# performs a forward pass of the input x through the model.
x = ops.ones((1, 4))
y = model(x)
# Now we have weights because the input shape is known
model.weights

[<Variable path=sequential_11/dense_31/kernel, shape=(4, 2), dtype=float32, value=[[ 0.5186703  -0.89275837]
  [-0.32034516  0.22481966]
  [ 0.1135633  -0.88960075]
  [-0.47233725  0.6958945 ]]>,
 <Variable path=sequential_11/dense_31/bias, shape=(2,), dtype=float32, value=[0. 0.]>,
 <Variable path=sequential_11/dense_32/kernel, shape=(2, 3), dtype=float32, value=[[ 0.10157859  0.19588745  0.05911839]
  [ 0.4548638   0.4834206  -1.0363262 ]]>,
 <Variable path=sequential_11/dense_32/bias, shape=(3,), dtype=float32, value=[0. 0. 0.]>,
 <Variable path=sequential_11/dense_33/kernel, shape=(3, 4), dtype=float32, value=[[ 0.08667135  0.63437176  0.64811933 -0.4640239 ]
  [ 0.4898951  -0.38968176  0.40150666 -0.8447258 ]
  [-0.09795547  0.2846216  -0.6438199  -0.68790245]]>,
 <Variable path=sequential_11/dense_33/bias, shape=(4,), dtype=float32, value=[0. 0. 0. 0.]>]

In [None]:
y

<tf.Tensor: shape=(1, 4), dtype=float32, numpy=
array([[-0.06730284, -0.033949  , -0.05837943,  0.04649062]],
      dtype=float32)>
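The weight shapes printed above also determine the number of parameters in each layer: a `Dense` layer with `input_dim` inputs and `units` outputs has `input_dim * units` kernel entries plus `units` bias entries. A model summary reports these per-layer counts. A small sketch of the arithmetic follows; `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(input_dim, units):
    """Hypothetical helper: kernel entries plus bias entries of a Dense layer."""
    return input_dim * units + units


# Input of size 4 followed by Dense(2), Dense(3), Dense(4), as in the model above
input_dim = 4
layer_units = [2, 3, 4]

counts = []
for units in layer_units:
    counts.append(dense_param_count(input_dim, units))
    input_dim = units  # the output of one layer is the input of the next

print(counts, sum(counts))  # [10, 9, 16] 35
```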

Once a model is "built", you can call its `summary()` method to display its
contents:

In [None]:
model.summary()

However, it can be very useful when building a Sequential model incrementally
to be able to display the summary of the model so far, including the current
output shape. In this case, you should start your model by passing an `Input`
object to your model, so that it knows its input shape from the start:

In [None]:
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

Note that the `Input` object is not displayed as part of `model.layers`, since
it isn't a layer:

In [None]:
model.layers

Models built with a predefined input shape like this always have weights (even
before seeing any data) and always have a defined output shape.

- In general, it's a recommended best practice to always specify the input shape
of a Sequential model in advance if you know what it is.

- When building a new Sequential architecture, it's useful to incrementally stack
layers with `add()` and frequently print model summaries, so that you can
monitor how the output shape changes as the stack grows.
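A short sketch of that incremental workflow, assuming Keras 3: because the model starts with an `Input`, it has weights and a defined output shape after every `add()`. The layer sizes are arbitrary.

```python
import keras
from keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))

model.add(layers.Dense(2, activation="relu"))
# One kernel and one bias so far, and the output shape is already known
print(len(model.weights), model.output_shape)

model.add(layers.Dense(3))
# Two more weights; the output shape now reflects the new last layer
print(len(model.weights), model.output_shape)
```

Calling `model.summary()` after each `add()` gives the same information in tabular form.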

## Training, evaluation, and inference

Training, evaluation, and inference work exactly in the same way for models
built using the functional API as for `Sequential` models.

The `Model` class offers a built-in training loop (the `fit()` method)
and a built-in evaluation loop (the `evaluate()` method). Note
that you can easily customize these loops to implement your own training routines.

Here, we load the MNIST image data, reshape it into vectors,
fit the model on the data (while monitoring performance on a validation split),
then evaluate the model on the test data. First, define the model:

In [None]:
model = keras.Sequential()
model.add(keras.Input(shape=(784,)))
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))

model.summary()

Here's what the typical end-to-end workflow looks like, consisting of:

- Training
- Validation on a holdout set generated from the original training data
- Evaluation on the test data

We'll use MNIST data for this example.

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Preprocess the data (these are NumPy arrays)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255

y_train = y_train.astype("float32")
y_test = y_test.astype("float32")

# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


## The `compile()` method: specifying a loss, metrics, and an optimizer

To train a model with `fit()`, you need to specify a loss function, an optimizer, and
optionally, some metrics to monitor.

You pass these to the model as arguments to the `compile()` method. Many built-in
optimizers, losses, and metrics are available; which loss and metric to use depends
on the task and on how the labels are encoded:

* Use `SparseCategoricalCrossentropy` and `SparseCategoricalAccuracy` for multi-class
classification when your labels are provided as integers (e.g. `[0, 1, 2, ...]`) rather than one-hot encoded vectors.
* Use `CategoricalCrossentropy` and `CategoricalAccuracy` for multi-class
classification with one-hot labels (e.g. `[0, 0, 1]`, `[1, 0, 0]`).
* Use `BinaryCrossentropy` and `BinaryAccuracy` for binary classification
with 0/1 labels (e.g. `[0, 1, 1, 0]`).

In [None]:
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),  # Optimizer
    # Loss function to minimize
    loss=keras.losses.SparseCategoricalCrossentropy(),
    # List of metrics to monitor
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
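The sparse and one-hot loss variants compute the same quantity; they differ only in how the labels are encoded. The sketch below shows this in plain NumPy (it does not use the Keras loss classes), with made-up predicted probabilities chosen purely for illustration.

```python
import numpy as np

# Made-up predicted probabilities for 2 samples over 3 classes (illustrative only)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])

# The same targets in two encodings
int_labels = np.array([0, 2])       # integer labels -> "sparse" variant
one_hot = np.eye(3)[int_labels]     # [[1, 0, 0], [0, 0, 1]] -> one-hot variant

# Sparse form: pick the probability of the true class by index
sparse_ce = -np.log(probs[np.arange(len(int_labels)), int_labels]).mean()

# One-hot form: sum over classes, weighted by the one-hot target
categorical_ce = -(one_hot * np.log(probs)).sum(axis=1).mean()

print(np.isclose(sparse_ce, categorical_ce))  # True
```

Integer labels avoid materializing the one-hot matrix, which matters when the number of classes is large.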

Now, let's review each piece of this workflow in detail.

The `metrics` argument should be a list -- your model can have any number of metrics.

If your model has multiple outputs, you can specify different losses and metrics for
each output, and you can modulate the contribution of each output to the total loss of
the model. You will find more details about this in the **Passing data to multi-input,
multi-output models** section of the Keras guide on training and evaluation with the
built-in methods.

Note that if you're satisfied with the default settings, in many cases the optimizer,
loss, and metrics can be specified via string identifiers as a shortcut:

In [None]:
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"],
)

We call `fit()`, which will train the model by slicing the data into "batches" of size
`batch_size`, and repeatedly iterating over the entire dataset for a given number of
`epochs`.
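As a quick check of the batching arithmetic: with the 50,000 training samples that remain after reserving the validation set and the batch size of 64 used in this guide, one epoch consists of 782 batches, the last of which is smaller. This is the step counter shown in the training log.

```python
import math

num_samples = 60000 - 10000  # training samples left after the validation split
batch_size = 64

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch_size)  # 782 16
```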

In [None]:
print("Fit model on training data")
history = model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=2,
    # We pass some validation for
    # monitoring validation loss and metrics
    # at the end of each epoch
    validation_data=(x_val, y_val),
)

Fit model on training data
Epoch 1/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 6s 5ms/step - loss: 0.5874 - sparse_categorical_accuracy: 0.8398 - val_loss: 0.1857 - val_sparse_categorical_accuracy: 0.9482
Epoch 2/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 7s 3ms/step - loss: 0.1720 - sparse_categorical_accuracy: 0.9499 - val_loss: 0.1634 - val_sparse_categorical_accuracy: 0.9500


| Parameter | Description |
|---|---|
| `x` | Input data (NumPy array, tensor, or dataset) |
| `y` | Target labels (same format as output of the model) |
| `batch_size` | Number of samples per gradient update (default: 32) |
| `epochs` | Number of passes through the entire dataset |
| `validation_data` | Tuple `(val_x, val_y)` for evaluating the model at the end of each epoch |
| `shuffle` | Whether to shuffle the training data before each epoch (default: `True`) |
| `verbose` | 0 = silent, 1 = progress bar, 2 = one line per epoch |

The returned `history` object holds a record of the loss values and metric values
during training:

In [None]:
print(history.history)
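`history.history` is a dictionary that maps each loss or metric name to a list with one value per epoch; validation entries are prefixed with `val_`. The sketch below shows that structure on a tiny model trained on random data, which is a stand-in assumed here to keep the example fast and not the MNIST model above.

```python
import numpy as np
import keras
from keras import layers

# Random stand-in data: 64 samples, 8 features, 3 classes (illustrative only)
rng = np.random.default_rng(0)
x = rng.random((64, 8)).astype("float32")
y = rng.integers(0, 3, size=(64,))

model = keras.Sequential(
    [keras.Input(shape=(8,)), layers.Dense(3, activation="softmax")]
)
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"],
)

history = model.fit(
    x, y, batch_size=16, epochs=2, validation_split=0.25, verbose=0
)

# One list per tracked quantity, one entry per epoch
for name, values in history.history.items():
    print(name, len(values))
```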

We evaluate the model on the test data via `evaluate()`:

In [None]:
# Evaluate the model on the test data using `evaluate`
print("Evaluate on test data")
results = model.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results)

# Generate predictions (probabilities -- the output of the last layer)
# on new data using `predict`
print("Generate predictions for 3 samples")
predictions = model.predict(x_test[:3])
print("predictions shape:", predictions.shape)
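Since the last layer uses a softmax activation, each row of `predictions` is a probability distribution over the classes, and the predicted class is the index of the largest entry. The sketch below uses a made-up probability array with 4 classes instead of real model output.

```python
import numpy as np

# Made-up probabilities for 3 samples over 4 classes (stand-in for `predictions`)
fake_predictions = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

# The index of the highest probability in each row is the predicted class
predicted_classes = fake_predictions.argmax(axis=-1)
print(predicted_classes)  # [1 0 3]
```

For the MNIST model, `predictions.argmax(axis=-1)` would give digit labels that can be compared with `y_test[:3]`.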

### Exercise: Try different optimizers

- `SGD()` (with or without momentum)
- `Adam()`
- etc.

Then increase `batch_size` and `epochs` and note how training time and accuracy change.
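One possible starting point for the exercise is to build a fresh model per optimizer, so that each run starts from newly initialized weights. The helper `build_model` and the learning rates below are assumptions chosen for illustration, not recommended settings.

```python
import keras
from keras import layers


def build_model():
    """Hypothetical helper: the same MNIST architecture as above, freshly initialized."""
    return keras.Sequential(
        [
            keras.Input(shape=(784,)),
            layers.Dense(64, activation="relu"),
            layers.Dense(64, activation="relu"),
            layers.Dense(10, activation="softmax"),
        ]
    )


# Illustrative optimizer settings to compare
optimizers = {
    "sgd": keras.optimizers.SGD(learning_rate=0.01),
    "sgd_momentum": keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "adam": keras.optimizers.Adam(learning_rate=1e-3),
}

compiled = {}
for name, optimizer in optimizers.items():
    model = build_model()
    model.compile(
        optimizer=optimizer,
        loss=keras.losses.SparseCategoricalCrossentropy(),
        metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    compiled[name] = model

print(list(compiled))
```

Each model in `compiled` can then be trained with the same `fit()` call and data, so that only the optimizer differs between runs.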


# Self-Study: Keras Functional API

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2019/03/01<br>
**Last modified:** 2023/06/25<br>
**Description:** Complete guide to the functional API.

## Introduction

The Keras *functional API* is a way to create models that are more flexible
than the `keras.Sequential` API. The functional API can handle models
with non-linear topology, shared layers, and even multiple inputs or outputs.

The main idea is that a deep learning model is usually
a directed acyclic graph (DAG) of layers.
So the functional API is a way to build *graphs of layers*.

Consider the following model:

<div class="k-default-codeblock">
```
(input: 784-dimensional vectors)
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (10 units, no activation)]
       ↧
(output: logits of a probability distribution over 10 classes)
```
</div>

This is a basic graph with three layers.
To build this model using the functional API, start by creating an input node:

In [None]:
inputs = keras.Input(shape=(784,))

The shape of the data is set as a 784-dimensional vector.
The batch size is always omitted since only the shape of each sample is specified.

If, for example, you have an image input with a shape of `(32, 32, 3)`,
you would use:

In [None]:
# Just for demonstration purposes.
img_inputs = keras.Input(shape=(32, 32, 3))

The `inputs` that is returned contains information about the shape and `dtype`
of the input data that you feed to your model.
Here's the shape:

In [None]:
inputs.shape

Here's the dtype:

In [None]:
inputs.dtype

You create a new node in the graph of layers by calling a layer on this `inputs`
object:

In [None]:
dense = layers.Dense(64, activation="relu")
x = dense(inputs)

The "layer call" action is like drawing an arrow from "inputs" to this layer
you created.
You're "passing" the inputs to the `dense` layer, and you get `x` as the output.

Let's add a few more layers to the graph of layers:

In [None]:
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)

At this point, you can create a `Model` by specifying its inputs and outputs
in the graph of layers:

In [None]:
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")

Let's check out what the model summary looks like:

In [None]:
model.summary()

You can also plot the model as a graph:

In [None]:
keras.utils.plot_model(model, "my_first_model.png")

And, optionally, display the input and output shapes of each layer
in the plotted graph:

In [None]:
keras.utils.plot_model(model, "my_first_model_with_shape_info.png", show_shapes=True)

This figure and the code are almost identical. In the code version,
the connection arrows are replaced by the call operation.

A "graph of layers" is an intuitive mental image for a deep learning model,
and the functional API is a way to create models that closely mirrors this.