##### Copyright 2019 The TensorFlow Authors.

In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The Keras functional API


## Setup

In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

import numpy as np

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

tf.keras.backend.clear_session()  # For easy reset of notebook state.

## Introduction

The Keras *functional API* is a way to create models that is more flexible than the `tf.keras.Sequential` API. The functional API can handle models with non-linear topology, models with shared layers, and models with multiple inputs or outputs.

The main idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers. So the functional API is a way to build *graphs of layers*.

Consider the following model:

```
(input: 784-dimensional vectors)
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (10 units, no activation)]
       ↧
(output: logits of a probability distribution over 10 classes)
```

This is a basic graph with three layers. To build this model using the functional API, start by creating an input node:

In [2]:
inputs = keras.Input(shape=(784,))

The shape of the data is set as a 784-dimensional vector. The batch size is always omitted since only the shape of each sample is specified.

If, for example, you have an image input with a shape of `(32, 32, 3)`, you would use:

In [3]:
# Just for demonstration purposes.
img_inputs = keras.Input(shape=(32, 32, 3))

The `inputs` object that is returned contains information about the shape and `dtype` of the input data that you feed to your model:

In [4]:
inputs.shape

TensorShape([None, 784])

In [5]:
inputs.dtype

tf.float32

You create a new node in the graph of layers by calling a layer on this `inputs` object:

In [6]:
dense = layers.Dense(64, activation='relu')
x = dense(inputs)

The "layer call" action is like drawing an arrow from "inputs" to the layer you created. You're "passing" the inputs to the `dense` layer, and out you get `x`.

Let's add a few more layers to the graph of layers:

In [7]:
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10)(x)

At this point, you can create a `Model` by specifying its inputs and outputs in the graph of layers:

In [8]:
model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')

Let's check out what the model summary looks like:

In [9]:
model.summary()

Model: "mnist_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 784)]             0         
_________________________________________________________________
dense (Dense)                (None, 64)                50240     
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
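
The `Param #` column can be checked by hand. A `Dense` layer with `n_in` inputs and `units` outputs holds an `n_in × units` kernel plus `units` biases. The short sketch below reproduces the three counts and the total from the summary; the helper name `dense_params` is introduced here for illustration and is not part of Keras.

```python
def dense_params(n_in, units):
    """Trainable parameters in a Dense layer: kernel entries plus biases."""
    return n_in * units + units

# Input width followed by the number of units in each Dense layer.
layer_sizes = [784, 64, 64, 10]
param_counts = [dense_params(n_in, units)
                for n_in, units in zip(layer_sizes, layer_sizes[1:])]

print(param_counts)       # [50240, 4160, 650]
print(sum(param_counts))  # 55050
```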


You can also plot the model as a graph:

In [None]:
keras.utils.plot_model(model, 'my_first_model.png')

And, optionally, display the input and output shapes of each layer in the plotted graph:

In [None]:
keras.utils.plot_model(model, 'my_first_model_with_shape_info.png', show_shapes=True)

This figure and the code are almost identical. In the code version, the connection arrows are replaced by the call operation.

A "graph of layers" is an intuitive mental image for a deep learning model, and the functional API is a way to create models that closely mirror this.

## Training, evaluation, and inference

Training, evaluation, and inference work exactly in the same way for models built using the functional API as for `Sequential` models.

Here, load the MNIST image data, reshape it into vectors, fit the model on the data (while monitoring performance on a validation split), then evaluate the model on the test data:

In [12]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

model.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer=keras.optimizers.RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=5,
                    validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print('Test loss:', test_scores[0])
print('Test accuracy:', test_scores[1])

Train on 48000 samples, validate on 12000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
10000/10000 - 0s - loss: 0.1029 - accuracy: 0.9690
Test loss: 0.10286470510240178
Test accuracy: 0.969
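
The cell above covers training and evaluation; inference goes through `model.predict`. Because the last `Dense` layer has no activation, the model returns logits, and a softmax turns them into class probabilities. The following is a minimal sketch that uses random stand-in data and an untrained copy of the architecture, so the predicted classes carry no meaning. The names `demo_model` and `fake_images` are illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Untrained copy of the three-layer architecture, for illustration only.
demo_inputs = keras.Input(shape=(784,))
hidden = layers.Dense(64, activation='relu')(demo_inputs)
hidden = layers.Dense(64, activation='relu')(hidden)
demo_outputs = layers.Dense(10)(hidden)
demo_model = keras.Model(demo_inputs, demo_outputs, name='demo_model')

# Three random "images" flattened to 784-dimensional vectors.
fake_images = np.random.random((3, 784)).astype('float32')

logits = demo_model.predict(fake_images, verbose=0)
probabilities = tf.nn.softmax(logits, axis=-1).numpy()
predicted_classes = probabilities.argmax(axis=-1)

print(logits.shape)
print(predicted_classes)
```

With the trained `model` from the cell above, the same calls would apply to slices of `x_test`.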


For further reading, see the [train and evaluate](./train_and_evaluate.ipynb) guide.

## Save and serialize

Saving and serialization work the same way for models built using the functional API as they do for `Sequential` models. The standard way to save a functional model is to call `model.save()` to save the entire model as a single file. You can later recreate the same model from this file, even if the code that built the model is no longer available.

This saved file includes the:
- model architecture
- model weight values (that were learned during training)
- model training config, if any (as passed to `compile`)
- optimizer and its state, if any (to restart training where you left off)

In [None]:
model.save('path_to_my_model')
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model('path_to_my_model')

For details, read the model [save and serialize](./save_and_serialize.ipynb) guide.
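
To make the first two items in the list above concrete, the sketch below pulls the architecture and the weight values out of a small model in memory and rebuilds an equivalent model from them. This only illustrates what those two pieces are; it does not claim to show how `model.save()` works internally, and it leaves out the training config and optimizer state. The toy model and all variable names are assumptions made for the example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

toy_inputs = keras.Input(shape=(8,))
toy_hidden = layers.Dense(4, activation='relu')(toy_inputs)
toy_outputs = layers.Dense(2)(toy_hidden)
original = keras.Model(toy_inputs, toy_outputs)

config = original.get_config()    # architecture as a plain Python dict
weights = original.get_weights()  # list of NumPy arrays (kernels and biases)

rebuilt = keras.Model.from_config(config)  # same architecture, fresh weights
rebuilt.set_weights(weights)               # copy the original values over

batch = np.random.random((5, 8)).astype('float32')
same_predictions = np.allclose(original.predict(batch, verbose=0),
                               rebuilt.predict(batch, verbose=0))
print(same_predictions)
```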

## Use the same graph of layers to define multiple models

In the functional API, models are created by specifying their inputs and outputs in a graph of layers. That means that a single graph of layers can be used to generate multiple models.

In the example below, you use the same stack of layers to instantiate two models: an `encoder` model that turns image inputs into 16-dimensional vectors, and an end-to-end `autoencoder` model for training.

In [14]:
encoder_input = keras.Input(shape=(28, 28, 1), name='img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)

autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
autoencoder.summary()

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d (Global (None, 16)                0   

Here, the decoding architecture is strictly symmetrical to the encoding architecture, so the output shape is the same as the input shape `(28, 28, 1)`.

The reverse of a `Conv2D` layer is a `Conv2DTranspose` layer, and the reverse of a `MaxPooling2D` layer is an `UpSampling2D` layer.
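
The symmetry can be verified with a little shape arithmetic. The sketch below tracks only the spatial size (height and width are equal throughout) and assumes the defaults used in the code above: `'valid'` padding, stride 1 for the convolutions, and a pooling stride equal to the pool size. The helper functions are written for this example and are not Keras APIs.

```python
def conv(n, k=3):
    """Conv2D with 'valid' padding and stride 1."""
    return n - k + 1

def pool(n, p=3):
    """MaxPooling2D with pool size p (stride defaults to p)."""
    return n // p

def conv_transpose(n, k=3):
    """Conv2DTranspose with 'valid' padding and stride 1."""
    return n + k - 1

def upsample(n, s=3):
    """UpSampling2D with factor s."""
    return n * s

encoder_sizes = [28]
for op in (conv, conv, pool, conv, conv):
    encoder_sizes.append(op(encoder_sizes[-1]))

decoder_sizes = [4]  # spatial size after Reshape((4, 4, 1))
for op in (conv_transpose, conv_transpose, upsample,
           conv_transpose, conv_transpose):
    decoder_sizes.append(op(decoder_sizes[-1]))

print(encoder_sizes)  # [28, 26, 24, 8, 6, 4]
print(decoder_sizes)  # [4, 6, 8, 24, 26, 28]
```

The decoder sizes are the encoder sizes in reverse, which is why the final output is again 28 by 28.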

## All models are callable, just like layers

You can treat any model as if it were a layer by invoking it on an `Input` or on the output of another layer. By calling a model you aren't just reusing the architecture of the model, you're also reusing its weights.

To see this in action, here's a different take on the autoencoder example that creates an encoder model and a decoder model, then chains them in two calls to obtain the autoencoder model:

In [15]:
encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

decoder_input = keras.Input(shape=(16,), name='encoded_img')
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)

decoder = keras.Model(decoder_input, decoder_output, name='decoder')
decoder.summary()

autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name='autoencoder')
autoencoder.summary()

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
original_img (InputLayer)    [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 16)                0   
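
The claim that calling a model reuses its weights, rather than copying them, can be checked on a much smaller pair of models. In this sketch (the names `inner` and `outer` are illustrative), the weights of `inner` are overwritten after `outer` has been built, and `outer` still reflects the change.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inner_input = keras.Input(shape=(4,))
inner = keras.Model(inner_input, layers.Dense(3)(inner_input), name='inner')

# Build a second model by calling the first one like a layer.
outer_input = keras.Input(shape=(4,))
outer = keras.Model(outer_input, inner(outer_input), name='outer')

# Overwrite the inner model's kernel and bias after `outer` exists.
inner.set_weights([np.ones((4, 3), dtype='float32'),
                   np.zeros(3, dtype='float32')])

batch = np.ones((2, 4), dtype='float32')
outer_prediction = outer.predict(batch, verbose=0)
print(outer_prediction)  # every entry is 4.0: four inputs of 1 times weight 1
```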

As you can see, models can be nested: a model can contain sub-models (since a model is just like a layer). A common use case for model nesting is *ensembling*. For example, here's how to ensemble a set of models into a single model that averages their predictions:

In [16]:
def get_model():
    inputs = keras.Input(shape=(128,))
    outputs = layers.Dense(1)(inputs)
    return keras.Model(inputs, outputs)

model1 = get_model()
model2 = get_model()
model3 = get_model()

inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
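
As a sanity check on the averaging, the sketch below builds the same kind of ensemble and compares its prediction with the mean of the member predictions computed in NumPy. It uses the `layers.Average` class rather than the functional `layers.average` helper; both are expected to compute the same element-wise mean. The input data is random and the member models are untrained.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def make_member():
    member_input = keras.Input(shape=(128,))
    return keras.Model(member_input, layers.Dense(1)(member_input))

members = [make_member() for _ in range(3)]

ensemble_input = keras.Input(shape=(128,))
averaged = layers.Average()([member(ensemble_input) for member in members])
ensemble = keras.Model(ensemble_input, averaged)

batch = np.random.random((4, 128)).astype('float32')
member_mean = np.mean([member.predict(batch, verbose=0) for member in members],
                      axis=0)
ensemble_prediction = ensemble.predict(batch, verbose=0)

print(np.allclose(ensemble_prediction, member_mean, atol=1e-5))
```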

## Manipulate complex graph topologies

### Models with multiple inputs and outputs

The functional API makes it easy to manipulate multiple inputs and outputs. This cannot be handled with the `Sequential` API.

For example, if you're building a system for ranking customer issue tickets by priority and routing them to the correct department, then the model will have three inputs:

- the title of the ticket (text input),
- the text body of the ticket (text input), and
- any tags added by the user (categorical input)

This model will have two outputs:

- the priority score between 0 and 1 (scalar sigmoid output), and
- the department that should handle the ticket (softmax output over the set of departments).

You can build this model in a few lines with the functional API:

In [17]:
num_tags = 12  # Number of unique issue tags
num_words = 10000  # Size of vocabulary obtained when preprocessing text data
num_departments = 4  # Number of departments for predictions

title_input = keras.Input(shape=(None,), name='title')  # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name='body')  # Variable-length sequence of ints
tags_input = keras.Input(shape=(num_tags,), name='tags')  # Binary vectors of size `num_tags`

# Embed each word in the title into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
# Embed each word in the text into a 64-dimensional vector
body_features = layers.Embedding(num_words, 64)(body_input)

# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)

# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])

# Stick a logistic regression for priority prediction on top of the features
priority_pred = layers.Dense(1, name='priority')(x)
# Stick a department classifier on top of the features
department_pred = layers.Dense(num_departments, name='department')(x)

# Instantiate an end-to-end model predicting both priority and department
model = keras.Model(inputs=[title_input, body_input, tags_input],
                    outputs=[priority_pred, department_pred])

Now plot the model:

In [None]:
keras.utils.plot_model(model, 'multi_input_and_output_model.png', show_shapes=True)

When compiling this model, you can assign different losses to each output. You can even assign different weights to each loss, to modulate their contribution to the total training loss.

In [19]:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=[keras.losses.BinaryCrossentropy(from_logits=True),
                    keras.losses.CategoricalCrossentropy(from_logits=True)],
              loss_weights=[1., 0.2])
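
As a rough picture of what `loss_weights` does, the total loss being minimized is the weighted sum of the per-output losses (ignoring any regularization terms). The per-output values below are made up purely for the arithmetic.

```python
loss_weights = [1.0, 0.2]          # same weights as in the compile() call
per_output_losses = [0.70, 1.40]   # hypothetical priority / department losses

total_loss = sum(weight * loss
                 for weight, loss in zip(loss_weights, per_output_losses))
print(round(total_loss, 4))  # 0.98
```

Here the department loss contributes only a fifth of its raw value, so the priority output dominates the gradient signal.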

Since the output layers have different names, you could also specify the loss like this:

In [20]:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority':keras.losses.BinaryCrossentropy(from_logits=True),
                    'department': keras.losses.CategoricalCrossentropy(from_logits=True)},
              loss_weights=[1., 0.2])

Train the model by passing NumPy arrays of inputs and targets, here as dictionaries keyed by input and output name:

In [21]:
# Dummy input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')

# Dummy target data
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))

model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_targets, 'department': dept_targets},
          epochs=2,
          batch_size=32)

Train on 1280 samples
Epoch 1/2
Epoch 2/2


<tensorflow.python.keras.callbacks.History at 0x7fb141a01e90>

When calling fit with a `Dataset` object, it should yield either a tuple of lists like `([title_data, body_data, tags_data], [priority_targets, dept_targets])` or a tuple of dictionaries like `({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department': dept_targets})`.
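
One way to obtain a `Dataset` with the tuple-of-dictionaries structure is `tf.data.Dataset.from_tensor_slices`. The sketch below uses random placeholder data shaped like the dummy arrays above, with a smaller sample count chosen for the example.

```python
import numpy as np
import tensorflow as tf

num_samples, num_words, num_tags, num_departments = 64, 10000, 12, 4

features = {
    'title': np.random.randint(num_words, size=(num_samples, 10)),
    'body': np.random.randint(num_words, size=(num_samples, 100)),
    'tags': np.random.randint(2, size=(num_samples, num_tags)).astype('float32'),
}
targets = {
    'priority': np.random.random(size=(num_samples, 1)),
    'department': np.random.randint(2, size=(num_samples, num_departments)),
}

# Each element is a (features_dict, targets_dict) tuple; batching keeps
# that structure and adds a leading batch dimension to every array.
dataset = tf.data.Dataset.from_tensor_slices((features, targets)).batch(32)

batch_features, batch_targets = next(iter(dataset))
print(sorted(batch_features), sorted(batch_targets))
print(batch_features['title'].shape)  # (32, 10)
```

Such a dataset is already batched, so it would be passed to `model.fit(dataset, epochs=2)` without a `batch_size` argument.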

For a more detailed explanation, refer to the [training and evaluation](./train_and_evaluate.ipynb) guide.

### A toy ResNet model

In addition to models with multiple inputs and outputs, the functional API makes it easy to manipulate non-linear connectivity topologies, that is, models with layers that are not connected sequentially. This is something the `Sequential` API cannot handle.

A common use case for this is residual connections. Let's build a toy ResNet model for CIFAR10 to demonstrate this:

In [22]:
inputs = keras.Input(shape=(32, 32, 3), name='img')
x = layers.Conv2D(32, 3, activation='relu')(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
block_1_output = layers.MaxPooling2D(3)(x)

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_2_output = layers.add([x, block_1_output])

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_3_output = layers.add([x, block_2_output])

x = layers.Conv2D(64, 3, activation='relu')(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)

model = keras.Model(inputs, outputs, name='toy_resnet')
model.summary()

Model: "toy_resnet"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
img (InputLayer)                [(None, 32, 32, 3)]  0                                            
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 30, 30, 32)   896         img[0][0]                        
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 28, 28, 64)   18496       conv2d_8[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 9, 9, 64)     0           conv2d_9[0][0]                   
_________________________________________________________________________________________

Plot the model:

In [None]:
keras.utils.plot_model(model, 'mini_resnet.png', show_shapes=True)

Now train the model:

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['acc'])

model.fit(x_train, y_train,
          batch_size=64,
          epochs=1,
          validation_split=0.2)

## Shared layers

Another good use for the functional API is models that use *shared layers*. Shared layers are layer instances that are reused multiple times in the same model; they learn features that correspond to multiple paths in the graph of layers.

Shared layers are often used to encode inputs from similar spaces (say, two different pieces of text that feature similar vocabulary). They enable sharing of information across these different inputs, and they make it possible to train such a model on less data. If a given word is seen in one of the inputs, that will benefit the processing of all inputs that pass through the shared layer.

To share a layer in the functional API, call the same layer instance multiple times. For instance, here's an `Embedding` layer shared across two different text inputs:

In [24]:
# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)

# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype='int32')

# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype='int32')

# Reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)
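
To confirm that both branches really use one set of parameters, the sketch below wraps the two calls in a model and counts its trainable weights. The model name `two_branch_model` is illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

shared = layers.Embedding(1000, 128)
input_a = keras.Input(shape=(None,), dtype='int32')
input_b = keras.Input(shape=(None,), dtype='int32')

two_branch_model = keras.Model([input_a, input_b],
                               [shared(input_a), shared(input_b)])

# A single embedding matrix serves both branches.
num_weights = len(two_branch_model.trainable_weights)
embedding_shape = tuple(two_branch_model.trainable_weights[0].shape)
print(num_weights)      # 1
print(embedding_shape)  # (1000, 128)
```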

## Extract and reuse nodes in the graph of layers

Because the graph of layers you are manipulating is a static data structure, it can be accessed and inspected. And this is how you are able to plot functional models as images.

This also means that you can access the activations of intermediate layers ("nodes" in the graph) and reuse them elsewhere, which is very useful for something like feature extraction.

Let's look at an example. This is a VGG19 model with weights pretrained on ImageNet:

In [None]:
vgg19 = tf.keras.applications.VGG19()

And these are the intermediate activations of the model, obtained by querying the graph data structure:

In [None]:
features_list = [layer.output for layer in vgg19.layers]

Use these features to create a new feature-extraction model that returns the values of the intermediate layer activations:

In [None]:
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)

img = np.random.random((1, 224, 224, 3)).astype('float32')
extracted_features = feat_extraction_model(img)
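
The same pattern can be tried without downloading pretrained weights. The sketch below applies it to a small two-layer model defined just for this example. It skips the `InputLayer` when collecting outputs, which is a choice made here for simplicity and differs from the VGG19 cell above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

small_input = keras.Input(shape=(16,))
hidden = layers.Dense(8, activation='relu', name='hidden')(small_input)
small_output = layers.Dense(2, name='head')(hidden)
small_model = keras.Model(small_input, small_output)

# Skip the InputLayer (index 0) and expose every other layer's output.
feature_outputs = [layer.output for layer in small_model.layers[1:]]
extractor = keras.Model(inputs=small_model.input, outputs=feature_outputs)

sample = np.random.random((1, 16)).astype('float32')
features = extractor.predict(sample, verbose=0)
print([f.shape for f in features])  # [(1, 8), (1, 2)]
```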

This comes in handy for tasks like [neural style transfer](https://www.tensorflow.org/tutorials/generative/style_transfer), among other things.

## Extend the API using custom layers

`tf.keras` includes a wide range of built-in layers, for example:

- Convolutional layers: `Conv1D`, `Conv2D`, `Conv3D`, `Conv2DTranspose`
- Pooling layers: `MaxPooling1D`, `MaxPooling2D`, `MaxPooling3D`, `AveragePooling1D`
- RNN layers: `GRU`, `LSTM`, `ConvLSTM2D`
- `BatchNormalization`, `Dropout`, `Embedding`, etc.

But if you don't find what you need, it's easy to extend the API by creating your own layers. All layers subclass the `Layer` class and implement:

- a `call` method, which specifies the computation done by the layer.
- a `build` method, which creates the weights of the layer (this is just a style convention since you can create weights in `__init__` as well).

To learn more about creating layers from scratch, read the [custom layers and models](./custom_layers_and_models.ipynb) guide.

The following is a basic implementation of `tf.keras.layers.Dense`:

In [25]:
class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)

For serialization support in your custom layer, define a `get_config` method that returns the constructor arguments of the layer instance:

In [26]:
class CustomDense(layers.Layer):

    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        return {'units': self.units}


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
config = model.get_config()

new_model = keras.Model.from_config(
    config, custom_objects={'CustomDense': CustomDense})

Optionally, implement the classmethod `from_config(cls, config)`, which is used when recreating a layer instance given its config dictionary. The default implementation of `from_config` is:

```python
@classmethod
def from_config(cls, config):
  return cls(**config)
```
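
A round trip through `get_config` and `from_config` might look like the following. `UnitsOnlyLayer` is a hypothetical stand-in for the `CustomDense` layer above, stripped of its weights so that only the config logic remains.

```python
from tensorflow.keras import layers

class UnitsOnlyLayer(layers.Layer):
    """Hypothetical layer whose only constructor argument is `units`."""

    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def get_config(self):
        return {'units': self.units}

    @classmethod
    def from_config(cls, config):
        # Same behaviour as the default shown above: pass the config
        # dictionary back to the constructor as keyword arguments.
        return cls(**config)

original_layer = UnitsOnlyLayer(units=10)
restored_layer = UnitsOnlyLayer.from_config(original_layer.get_config())
print(restored_layer.units)  # 10
```

Overriding `from_config` becomes useful when the constructor arguments cannot be passed back as-is, for example when the config stores a nested object in serialized form.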

## 何时使用函数式 API
<!--
## When to use the functional API
-->

什么时候你应该使用 Keras 函数式 API 来创建一个新模型，而什么时候你又应该直接继承 `Model` 类？一般而言，函数式 API 更高级、更易用、更安全，而且拥有许多派生模型类所不支持的特性。
<!--
When should you use the Keras functional API to create a new model, or just subclass the `Model` class directly? In general, the functional API is higher-level, easier and safer, and has a number of features that subclassed models do not support.
-->

然而，在所构建的模型不能容易地被表达为层的有向无环图时，模型派生为我们带来了更高的便利性。例如，你不能使用函数式 API 来实现一个 Tree-RNN，而是必须直接继承 `Model`。
<!--
However, model subclassing provides greater flexibility when building models that are not easily expressible as directed acyclic graphs of layers. For example, you could not implement a Tree-RNN with the functional API and would have to subclass `Model` directly.
-->

For an in-depth look at the differences between the functional API and model subclassing, read [What are Symbolic and Imperative APIs in TensorFlow 2.0?](https://blog.tensorflow.org/2019/01/what-are-symbolic-and-imperative-apis.html).

### Functional API strengths

The following properties are also true for `Sequential` models (which are also data structures), but they are not true for subclassed models (which are Python bytecode, not data structures).

#### Less verbose

With the functional API there is no `super(MyClass, self).__init__(...)`, no `def call(self, ...):`, and so on.

Compare:

```python
inputs = keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
mlp = keras.Model(inputs, outputs)
```

With the subclassed version:

```python
class MLP(keras.Model):

    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.dense_1 = layers.Dense(64, activation='relu')
        self.dense_2 = layers.Dense(10)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

# Instantiate the model.
mlp = MLP()
# Necessary to create the model's state.
# The model doesn't have a state until it's called at least once.
_ = mlp(tf.zeros((1, 32)))
```

#### Model validation while defining

In the functional API, the input specification (shape and dtype) is created in advance (using `Input`). Every time you call a layer, the layer checks that the specification passed to it matches its assumptions, and it raises a helpful error message if not.

This guarantees that any model you can build with the functional API will run. All debugging, other than convergence-related debugging, happens statically during model construction rather than at execution time. This is similar to type checking in a compiler.
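The construction-time check can be seen in a small sketch. Here a `Conv2D` layer, which expects a rank-4 input, is deliberately applied to a rank-2 `Input`; the layer sizes are arbitrary choices for illustration. The error is raised while the graph is being defined, before any data is passed through the model.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A rank-2 symbolic input: (batch, 32).
inputs = keras.Input(shape=(32,))

try:
    # Conv2D expects (batch, height, width, channels), so this call
    # fails immediately, while the model is still being defined.
    layers.Conv2D(8, 3)(inputs)
    error_message = None
except ValueError as e:
    error_message = str(e)

print(error_message)
```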

#### A functional model is plottable and inspectable

You can plot the model as a graph, and you can easily access intermediate nodes in this graph. For example, to extract and reuse the activations of intermediate layers (as seen in a previous example):

```python
features_list = [layer.output for layer in vgg19.layers]
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=features_list)
```
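The snippet above relies on a `vgg19` model defined earlier in the guide. The same pattern can be sketched on a small, self-contained model; the layer names and sizes below are assumptions made for this example, and the input layer is skipped so that only computed activations are collected.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A small functional model standing in for a larger pretrained network.
inputs = keras.Input(shape=(16,))
x = layers.Dense(8, activation='relu', name='hidden_1')(inputs)
x = layers.Dense(4, activation='relu', name='hidden_2')(x)
outputs = layers.Dense(2, name='predictions')(x)
model = keras.Model(inputs, outputs)

# Collect the symbolic output of every layer after the input layer.
features_list = [layer.output for layer in model.layers[1:]]
feat_extraction_model = keras.Model(inputs=model.inputs, outputs=features_list)

# One forward pass returns the activations of all three Dense layers.
features = feat_extraction_model(tf.zeros((3, 16)))
shapes = [tuple(f.shape) for f in features]
```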

#### A functional model can be serialized or cloned

Because a functional model is a data structure rather than a piece of code, it is safely serializable and can be saved as a single file that allows you to recreate the exact same model without having access to any of the original code. See the [saving and serialization guide](./save_and_serialize.ipynb).
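A minimal in-memory sketch of this idea, without writing a file: the architecture is captured with `get_config()`, rebuilt with `keras.Model.from_config`, and copied with `keras.models.clone_model`. Note that both calls recreate the architecture only, so the new models start with freshly initialized weights.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
model = keras.Model(inputs, outputs)

# The config is a plain Python dict describing the graph of layers.
config = model.get_config()
rebuilt = keras.Model.from_config(config)

# clone_model performs the same architecture-only copy in one call.
cloned = keras.models.clone_model(model)

original_shapes = [w.shape for w in model.get_weights()]
rebuilt_shapes = [w.shape for w in rebuilt.get_weights()]
cloned_shapes = [w.shape for w in cloned.get_weights()]
```

To carry the trained weights over as well, `rebuilt.set_weights(model.get_weights())` can be applied afterwards.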


### Functional API weaknesses

#### Does not support dynamic architectures

The functional API treats models as DAGs of layers. This is true for most deep learning architectures, but not all. For example, recursive networks or Tree RNNs do not follow this assumption and cannot be implemented in the functional API.
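To make the limitation concrete, here is a rough sketch of a tree-structured encoder written as a subclassed model. The class name `TreeEncoder`, its `encode` method, and the nested-tuple tree representation are all assumptions made for this example. The computation graph depends on the shape of each input tree, which is why it cannot be fixed in advance as a static DAG of layers.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class TreeEncoder(keras.Model):
    """Hypothetical recursive encoder over binary trees of tensors."""

    def __init__(self, units=8):
        super().__init__()
        self.embed_leaf = layers.Dense(units, activation='tanh')
        self.merge = layers.Dense(units, activation='tanh')

    def encode(self, node):
        # An internal node is a (left, right) tuple; a leaf is a tensor
        # of shape (1, features).
        if isinstance(node, tuple):
            left, right = node
            children = tf.concat(
                [self.encode(left), self.encode(right)], axis=-1)
            return self.merge(children)
        return self.embed_leaf(node)

    def call(self, tree):
        return self.encode(tree)


leaf = tf.ones((1, 4))
tree = ((leaf, leaf), leaf)  # The recursion depth follows the data.

encoder = TreeEncoder(units=8)
# `encode` is called directly so the sketch does not depend on how
# Keras handles nested inputs in `__call__`.
encoding = encoder.encode(tree)
```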

#### Everything from scratch

When writing advanced architectures, you may want to do things that are outside the scope of defining a DAG of layers. For example, you must use model subclassing to expose multiple custom training and inference methods on your model instance.
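As a small illustration of exposing extra methods, the sketch below adds two hypothetical inference helpers, `predict_proba` and `predict_label`, to a subclassed classifier. The method names and layer sizes are assumptions for this example, not Keras APIs.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class Classifier(keras.Model):

    def __init__(self, num_classes=3):
        super().__init__()
        self.hidden = layers.Dense(16, activation='relu')
        self.logits = layers.Dense(num_classes)

    def call(self, inputs):
        # The standard forward pass returns raw logits.
        return self.logits(self.hidden(inputs))

    def predict_proba(self, inputs):
        # Hypothetical extra method: class probabilities.
        return tf.nn.softmax(self(inputs), axis=-1)

    def predict_label(self, inputs):
        # Hypothetical extra method: index of the most likely class.
        return tf.argmax(self(inputs), axis=-1)


clf = Classifier(num_classes=3)
batch = tf.zeros((4, 8))
probs = clf.predict_proba(batch)
labels = clf.predict_label(batch)
```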

## Mix-and-match API styles

Choosing between the functional API and `Model` subclassing isn't a binary decision that restricts you to one category of models. All models in the `tf.keras` API can interact with each other, whether they're `Sequential` models, functional models, or subclassed models that are written from scratch.

You can always use a functional model or `Sequential` model as part of a subclassed model or layer:

In [27]:
units = 32
timesteps = 10
input_dim = 5

# Define a Functional model
inputs = keras.Input((None, units))
x = layers.GlobalAveragePooling1D()(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)


class CustomRNN(layers.Layer):
    def __init__(self):
        super(CustomRNN, self).__init__()
        self.units = units
        self.projection_1 = layers.Dense(units=units, activation='tanh')
        self.projection_2 = layers.Dense(units=units, activation='tanh')
        # Our previously-defined Functional model
        self.classifier = model

    def call(self, inputs):
        outputs = []
        state = tf.zeros(shape=(inputs.shape[0], self.units))
        for t in range(inputs.shape[1]):
            x = inputs[:, t, :]
            h = self.projection_1(x)
            y = h + self.projection_2(state)
            state = y
            outputs.append(y)
        features = tf.stack(outputs, axis=1)
        print(features.shape)
        return self.classifier(features)


rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, timesteps, input_dim)))

(1, 10, 32)


You can also use any subclassed layer or model in the functional API, as long as it implements a `call` method that follows one of the following patterns:

- `call(self, inputs, **kwargs)`, where `inputs` is a tensor or a nested structure of tensors (e.g. a list of tensors), and `**kwargs` are non-tensor arguments (non-inputs).
- `call(self, inputs, training=None, **kwargs)`, where `training` is a boolean indicating whether the layer should behave in training mode or inference mode.
- `call(self, inputs, mask=None, **kwargs)`, where `mask` is a boolean mask tensor (useful for RNNs, for instance).
- `call(self, inputs, training=None, mask=None, **kwargs)`, which combines masking and training-specific behavior.
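The second pattern can be sketched with a hypothetical `TrainingNoise` layer (the name and its `stddev` argument are assumptions for this example). It adds Gaussian noise only in training mode and is used inside a functional model, which forwards the `training` flag to it.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class TrainingNoise(layers.Layer):
    """Hypothetical layer: adds Gaussian noise in training mode only."""

    def __init__(self, stddev=1.0, **kwargs):
        super().__init__(**kwargs)
        self.stddev = stddev

    def call(self, inputs, training=None):
        if training:
            noise = tf.random.normal(tf.shape(inputs), stddev=self.stddev)
            return inputs + noise
        # Inference mode: pass the input through unchanged.
        return tf.identity(inputs)


inputs = keras.Input(shape=(4,))
outputs = TrainingNoise()(inputs)
model = keras.Model(inputs, outputs)

x = tf.ones((3, 4))
inference_out = model(x, training=False)  # Equal to x.
training_out = model(x, training=True)    # x plus random noise.
```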

Additionally, if you implement the `get_config` method on your custom layer or model, the functional models you create with it will still be serializable and cloneable.

Here's a quick example of a custom RNN, written from scratch, being used in a functional model:

In [28]:
units = 32
timesteps = 10
input_dim = 5
batch_size = 16


class CustomRNN(layers.Layer):
    def __init__(self):
        super(CustomRNN, self).__init__()
        self.units = units
        self.projection_1 = layers.Dense(units=units, activation='tanh')
        self.projection_2 = layers.Dense(units=units, activation='tanh')
        self.classifier = layers.Dense(1)

    def call(self, inputs):
        outputs = []
        state = tf.zeros(shape=(inputs.shape[0], self.units))
        for t in range(inputs.shape[1]):
            x = inputs[:, t, :]
            h = self.projection_1(x)
            y = h + self.projection_2(state)
            state = y
            outputs.append(y)
        features = tf.stack(outputs, axis=1)
        return self.classifier(features)


# Note that you specify a static batch size for the inputs with the `batch_shape`
# arg, because the inner computation of `CustomRNN` requires a static batch size
# (when you create the `state` zeros tensor).
inputs = keras.Input(batch_shape=(batch_size, timesteps, input_dim))
x = layers.Conv1D(32, 3)(inputs)
outputs = CustomRNN()(x)

model = keras.Model(inputs, outputs)

rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, 10, 5)))