## Getting started with the Keras functional API

The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.

In my opinion, this is the API you will want to use almost all the time!

## First example: a densely-connected network

Some things to note:

* A layer instance is callable (on a tensor), and it returns a tensor.
* Input tensor(s) and output tensor(s) can then be used to define a Model.
* Such a model can be trained just like Keras Sequential models.


In [1]:
from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))

# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Using TensorFlow backend.


Notice that the only things we had to reference are the input and output tensors. We could make the interactions between layers as complex as we want, but all the model cares about are its inputs and outputs.
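
To make the "trained just like Keras Sequential models" point concrete, here is a minimal sketch that rebuilds the network above and fits it for one epoch. The random arrays `data` and `labels` are hypothetical stand-ins for a real dataset, so the resulting accuracy is meaningless; only the calling pattern matters.

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.utils import to_categorical

# Same architecture as the cell above.
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Hypothetical stand-in data: 256 random vectors with random class labels.
data = np.random.random((256, 784))
labels = to_categorical(np.random.randint(10, size=(256,)), num_classes=10)

# fit() and predict() are called exactly as on a Sequential model.
history = model.fit(data, labels, epochs=1, batch_size=32, verbose=0)
probabilities = model.predict(data[:5], verbose=0)
```

Each row of `probabilities` is a 10-way softmax output for one input vector.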

## All models are callable, just like layers

With the functional API, it is easy to re-use trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you aren't just re-using the architecture of the model, you are also re-using its weights.



In [2]:
x = Input(shape=(784,))
# This works, and returns the 10-way softmax we defined above.
y = model(x)

This lets you, for instance, quickly create models that can process sequences of inputs. You could turn an image classification model into a video classification model in just one line.

In [3]:
from keras.layers import TimeDistributed

# Input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))

# This applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
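
As a rough check of the two claims above (the wrapped model re-uses the original weights, and the output is a sequence of 20 vectors of size 10), the sketch below wraps a small untrained classifier and compares its predictions with the per-timestep output. The names `frame_model` and `sequence_model` and the random input are made up for this illustration.

```python
import numpy as np
from keras.layers import Input, Dense, TimeDistributed
from keras.models import Model

# A small stand-in for the classifier defined earlier (untrained here).
frame_input = Input(shape=(784,))
hidden = Dense(64, activation='relu')(frame_input)
frame_output = Dense(10, activation='softmax')(hidden)
frame_model = Model(inputs=frame_input, outputs=frame_output)

# Wrap the whole model so it is applied to each of the 20 timesteps.
input_sequences = Input(shape=(20, 784))
processed_sequences = TimeDistributed(frame_model)(input_sequences)
sequence_model = Model(inputs=input_sequences, outputs=processed_sequences)

# Two hypothetical sequences of 20 random vectors each.
sequences = np.random.random((2, 20, 784)).astype('float32')
sequence_predictions = sequence_model.predict(sequences, verbose=0)

# Running the original model on the first timestep alone should give
# the same numbers, because both models use the same weights.
frame_predictions = frame_model.predict(sequences[:, 0, :], verbose=0)
same_weights = np.allclose(sequence_predictions[:, 0, :],
                           frame_predictions, atol=1e-5)
```

If `same_weights` is true, the sequence model is applying the very same classifier at every timestep rather than a copy of it.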

## Multi-input and multi-output models

Here's a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams.

Consider the following:

In [12]:
from keras.layers import concatenate

x_in = Input(shape=(100,), name='x_in')
y_in = Input(shape=(100,), name='y_in')

# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(x_in)
y = Dense(64, activation='relu')(y_in)

z = concatenate([x, y])

x = Dense(1, activation='sigmoid', name='x_out')(z)
y = Dense(10, activation='softmax', name='y_out')(z)

To define a model with multiple inputs or outputs, you just need to specify a list:

In [13]:
model = Model(inputs=[x_in, y_in], outputs=[x, y])
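
A quick way to see what the list arguments mean in practice: the model expects one array per input and returns one array per output, in the order given to `Model`. The sketch below rebuilds the two-input, two-output model and runs it on a hypothetical random batch; the names `x_out`, `y_out`, and `batch` are chosen here for readability.

```python
import numpy as np
from keras.layers import Input, Dense, concatenate
from keras.models import Model

x_in = Input(shape=(100,), name='x_in')
y_in = Input(shape=(100,), name='y_in')

# One branch per input, merged into a single tensor.
z = concatenate([Dense(64, activation='relu')(x_in),
                 Dense(64, activation='relu')(y_in)])

x_out = Dense(1, activation='sigmoid', name='x_out')(z)
y_out = Dense(10, activation='softmax', name='y_out')(z)

model = Model(inputs=[x_in, y_in], outputs=[x_out, y_out])

# Hypothetical batch of 4 samples, fed to both inputs.
batch = np.random.random((4, 100))

# predict() takes a list of inputs and returns a list of outputs.
x_pred, y_pred = model.predict([batch, batch], verbose=0)
```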

There are now a couple of ways to compile the model. The first is to pass lists of losses and loss weights, in the same order as the outputs:

In [14]:
from keras.utils import to_categorical

import numpy as np
data = np.random.random((1000, 100))
xs = np.random.randint(2, size=(1000, 1))
ys = np.random.randint(10, size=(1000, 1))

model.compile(optimizer='rmsprop', loss=['binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1., 0.2])

model.fit([data, data], [xs, to_categorical(ys)],
          epochs=1, batch_size=32)

Epoch 1/1


<keras.callbacks.History at 0x113cc8290>

The second is to specify a dictionary (referring to the names of the output layers):

In [15]:
model.compile(optimizer='rmsprop',
              loss={'x_out': 'binary_crossentropy', 'y_out': 'categorical_crossentropy'},
              loss_weights={'x_out': 1., 'y_out': 0.2})

# And train it via:
model.fit({'x_in': data, 'y_in': data},
          {'x_out': xs, 'y_out': to_categorical(ys)},
          epochs=1, batch_size=32)

Epoch 1/1


<keras.callbacks.History at 0x114e48f50>
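
The loss weights scale each output's loss before the losses are summed into the single value the optimizer minimizes. The NumPy sketch below works through that arithmetic by hand with the weights used above (1.0 and 0.2). It is an illustration of the weighted sum only, not Keras's implementation, and the targets and predictions are invented for the example.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy over a batch of one-hot targets."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# Hypothetical targets and predictions for a batch of two samples
# (three classes instead of ten, to keep the arrays readable).
x_true = np.array([[1.0], [0.0]])
x_pred = np.array([[0.8], [0.3]])
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]])

losses = {'x_out': binary_crossentropy(x_true, x_pred),
          'y_out': categorical_crossentropy(y_true, y_pred)}
loss_weights = {'x_out': 1.0, 'y_out': 0.2}

# Weighted sum of the per-output losses.
total_loss = sum(loss_weights[name] * losses[name] for name in losses)
```

With these weights, an error on `y_out` contributes only a fifth as much to `total_loss` as an equally large error on `x_out`.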

## Shared layers

Another good use for the functional API is models that use shared layers. Let's take a look at them.

Usage is fairly simple: we keep a reference to the layer we want to share and apply it multiple times.

In [17]:
inputs = Input(shape=(64,))

# Create the layer once and keep a reference to it so it can be re-used
layer_we_share = Dense(64, activation='relu')

# Now we apply the layer twice
x = layer_we_share(inputs)
x = layer_we_share(x)

predictions = Dense(10, activation='softmax')(x)

model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
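
One way to convince yourself that the layer really is shared, rather than copied, is to count parameters. In the sketch below (same model as above), the shared `Dense(64)` layer is applied twice but its weights are counted once. The variable names `shared_params`, `total_params`, and `times_in_model` are introduced just for this check.

```python
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(64,))

# One layer object, applied twice.
layer_we_share = Dense(64, activation='relu')
x = layer_we_share(inputs)
x = layer_we_share(x)

predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)

# 64 * 64 kernel weights + 64 biases for the shared layer.
shared_params = layer_we_share.count_params()

# Shared layer (counted once) + output layer (64 * 10 weights + 10 biases).
total_params = model.count_params()

# The shared layer shows up a single time in the model's layer list.
times_in_model = sum(layer is layer_we_share for layer in model.layers)
```

Two independent `Dense(64)` layers would instead add their parameters twice.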

## The concept of layer "node"

Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a "node" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 0, 1, 2...

In previous versions of Keras, you could obtain the output tensor of a layer instance via layer.get_output(), or its output shape via layer.output_shape. You still can (except get_output() has been replaced by the property output). But what if a layer is connected to multiple inputs?

As long as a layer is only connected to one input, there is no confusion, and .output will return the one output of the layer:

In [18]:
a = Input(shape=(140, 256))

dense = Dense(32)
affine_a = dense(a)

assert dense.output == affine_a

Not so if the layer has multiple inputs:

In [19]:
a = Input(shape=(140, 256))
b = Input(shape=(140, 256))

dense = Dense(32)
affine_a = dense(a)
affine_b = dense(b)

dense.output

AttributeError: Layer dense_16 has multiple inbound nodes, hence the notion of "layer output" is ill-defined. Use `get_output_at(node_index)` instead.

Okay then. The following works:

In [20]:
assert dense.get_output_at(0) == affine_a
assert dense.get_output_at(1) == affine_b

Simple enough, right?

The same is true for the properties input_shape and output_shape: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of "layer output/input shape" is well defined, and that one shape will be returned by layer.output_shape/layer.input_shape. But if, for instance, you apply the same Conv2D layer to an input of shape (3, 32, 32), and then to an input of shape (3, 64, 64), the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to.
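
Here is a small sketch of that Conv2D situation. It assumes channels-last inputs, i.e. (32, 32, 3) and (64, 64, 3) rather than the channels-first shapes quoted above, and the filter count and kernel size are arbitrary. The per-node accessor `get_output_shape_at` belongs to the Keras 2 API this notebook uses, so the call is guarded in case the installed release no longer provides it.

```python
from keras.layers import Input, Conv2D

small = Input(shape=(32, 32, 3))
large = Input(shape=(64, 64, 3))

# One convolution layer, applied to two differently sized inputs.
conv = Conv2D(16, (3, 3), padding='same', data_format='channels_last')
conv_small = conv(small)
conv_large = conv(large)

# The output tensors themselves always carry their own shapes.
shape_small = tuple(conv_small.shape)
shape_large = tuple(conv_large.shape)

# With the Keras 2 node API, the same shapes can be fetched by node index
# (node 0 is the first call, node 1 the second).
if hasattr(conv, 'get_output_shape_at'):
    node_shapes = [conv.get_output_shape_at(i) for i in range(2)]
```

Because the two calls produce different spatial sizes, there is no single "output shape" for `conv`, which is why the node index is needed.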