# 7.2 Different ways to build Keras models

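Specifically, there are three ways to build a model:

- Sequential model: the most approachable option. It is modeled on a Python list and is essentially just a stack of layers.
- Functional API: builds the model as a graph of layers, striking a balance between usability and flexibility.
- Model subclassing: a low-level API where you implement everything yourself. Use it when you need control over every detail. The cost is that many built-in Keras features become unavailable, and hand-written code is more likely to contain bugs.

The figure below shows how complexity rises step by step across these ways of building Keras models.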

![progressive_disclosure_of_complexity_models](./progressive_disclosure_of_complexity_models.png)


## Sequential model


In [1]:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

This is the model from the MNIST example.


In [2]:
model = keras.Sequential()
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

As mentioned above, a Sequential model behaves much like a Python list, so the same model can also be built incrementally with `add`.


In [3]:
# model.weights

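A model is only really built the first time it is called; in other words, the layer weights are created only on that first call. Accessing `model.weights` before then, as in the (commented-out) cell above, raises an error.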


In [4]:
model.build(input_shape=(None, 3))
print(model.weights)

[<tf.Variable 'dense_2/kernel:0' shape=(3, 64) dtype=float32, numpy=
array([[-0.12450404, -0.2864981 ,  0.08807212, -0.07101445,  0.21552074,
         0.15272623, -0.27197868,  0.13138908, -0.21067362, -0.15888381,
        -0.27524817,  0.00893292,  0.15862867, -0.29530695,  0.20490652,
        -0.21327066,  0.02698416, -0.20854318, -0.2510046 , -0.17867965,
        -0.07292557, -0.07820041,  0.21656394,  0.16597024, -0.12260599,
        -0.25309044, -0.01648897,  0.13855463,  0.06278503, -0.28944796,
        -0.07055576,  0.01551777,  0.1646626 ,  0.2560295 ,  0.00901112,
        -0.27386504,  0.05184311,  0.00488189, -0.03607276, -0.10236272,
        -0.05304053,  0.00050977,  0.02817604, -0.06821556,  0.19598797,
        -0.23459485,  0.20048857,  0.27072448, -0.20874166,  0.09871808,
        -0.19106713,  0.19012186, -0.24459982, -0.28274274,  0.16840756,
        -0.17296644, -0.14608458, -0.27889997, -0.06115852,  0.26868278,
         0.12120223, -0.16153614, -0.09385027,  0.12430

Once `build` has been called, the input shape is known, so the weights can be created.
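The before/after difference can be checked directly. This is a minimal sketch, assuming TensorFlow 2.x. In the Keras bundled with TF 2.x, reading `model.weights` on an unbuilt Sequential model raises a `ValueError`; newer Keras releases may return an empty list instead, so the hypothetical helper `count_weights` below tolerates both behaviors.

```python
from tensorflow import keras
from tensorflow.keras import layers


def count_weights(model):
    """Return the number of weight tensors, or 0 if they do not exist yet."""
    try:
        return len(model.weights)
    except ValueError:
        # Some Keras versions refuse to list weights before the model is built.
        return 0


model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

before = count_weights(model)        # no input shape is known yet
model.build(input_shape=(None, 3))   # now Keras can size the kernels
after = count_weights(model)         # one kernel and one bias per Dense layer

print(before, after)
```

After `build`, the first weight is the kernel of the first `Dense` layer, with shape `(3, 64)`, matching the output shown above.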


In [5]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 64)                256       
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650       
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________


Once the model is built, `model.summary()` prints its structure, which is handy for debugging.
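The `Param #` column can be reproduced by hand: a `Dense` layer has one kernel entry per (input, unit) pair plus one bias per unit. The sketch below uses a hypothetical helper, `dense_param_count`, to redo that arithmetic for the two layers above.

```python
def dense_param_count(input_dim, units):
    """Parameters of a Dense layer: a kernel of shape (input_dim, units) plus one bias per unit."""
    return input_dim * units + units


first = dense_param_count(3, 64)     # 3 * 64 + 64
second = dense_param_count(64, 10)   # 64 * 10 + 10
total = first + second

print(first, second, total)  # 256 650 906
```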


In [6]:
model = keras.Sequential(name="my_example_model")
model.add(layers.Dense(64, activation="relu", name="my_first_layer"))
model.add(layers.Dense(10, activation="softmax", name="my_last_layer"))
model.build((None, 3))
model.summary()

Model: "my_example_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
my_first_layer (Dense)       (None, 64)                256       
_________________________________________________________________
my_last_layer (Dense)        (None, 10)                650       
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________


By default the model is named `sequential_1` and its layers `dense_2` and `dense_3`.
Keras also lets you set the model and layer names through the `name` argument. Just for fun.
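Names also have one practical use: a layer can be looked up by name later. A small sketch, assuming TensorFlow 2.x and reusing the names from the cell above:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(name="my_example_model")
model.add(layers.Dense(64, activation="relu", name="my_first_layer"))
model.add(layers.Dense(10, activation="softmax", name="my_last_layer"))
model.build((None, 3))

# Retrieve a layer by the name it was given at construction time.
first_layer = model.get_layer(name="my_first_layer")
layer_names = [layer.name for layer in model.layers]

print(model.name, layer_names, first_layer.units)
```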


In [7]:
model = keras.Sequential()
model.add(layers.InputLayer(input_shape=(3,)))
model.add(layers.Dense(64, activation="relu"))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 64)                256       
Total params: 256
Trainable params: 256
Non-trainable params: 0
_________________________________________________________________


In [8]:
model.add(layers.Dense(10, activation="softmax"))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 64)                256       
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________


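Earlier, `model.summary()` could not be called directly because the model had not been built yet and did not know its input shape. If the input shape is declared up front, `model.summary()` works right away.

Adding a layer and then checking the result with `summary()`, as above, is a very common workflow. It will come up often in the next chapter on convolutional layers.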


## Functional API

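Sequential models are very convenient to build, but they are also quite limited: they have exactly one input and one output, and the layers are stacked strictly in order. In practice, models often take several inputs at once (for example, an image together with its metadata) or produce several outputs (for example, predictions about different aspects of the data). A Sequential model can express neither case, nor any non-linear topology.

This is where the Functional API comes in. Most Keras models you will encounter are built with it.

Think of LEGO bricks: the Functional API works in a similar way.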


### 单输入单输出模型


In [9]:
inputs = keras.Input(shape=(3,), name="my_input")
features = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(features)
model = keras.Model(inputs=inputs, outputs=outputs)

This is the same two-layer, single-input, single-output model as in the examples above.


In [10]:
inputs = keras.Input(shape=(3,), name="my_input")

In [11]:
inputs.shape

TensorShape([None, 3])

In [12]:
inputs.dtype

tf.float32

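Everything starts from `Input`, which defines the model's input. (`Input` also has a `name` attribute.)

Such an object is called a symbolic tensor. It contains no actual data, but it describes the shape of the actual tensors that will be fed to the model.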


In [13]:
features = layers.Dense(64, activation="relu")(inputs)

In [14]:
features.shape

TensorShape([None, 64])

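Every Keras layer can be called on either real data tensors or symbolic tensors. When called on a symbolic tensor, the layer returns another symbolic tensor that describes the shape of its output.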
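The two kinds of call can be compared side by side. This is a minimal sketch, assuming TensorFlow 2.x; the variable names are illustrative and not part of the original notebook.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

dense = layers.Dense(64, activation="relu")

# Symbolic call: no data flows, only shape information is propagated.
symbolic_in = keras.Input(shape=(3,))
symbolic_out = dense(symbolic_in)

# Call on real data: the same layer, with the same weights, computes values.
real_in = tf.ones((2, 3))
real_out = dense(real_in)

print(symbolic_out.shape)  # batch dimension is None
print(real_out.shape)      # batch dimension is 2
```

The symbolic output keeps an unknown batch dimension, while the real output has the concrete batch size of the data passed in.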


In [15]:
outputs = layers.Dense(10, activation="softmax")(features)
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
my_input (InputLayer)        [(None, 3)]               0         
_________________________________________________________________
dense_8 (Dense)              (None, 64)                256       
_________________________________________________________________
dense_9 (Dense)              (None, 10)                650       
Total params: 906
Trainable params: 906
Non-trainable params: 0
_________________________________________________________________


Finally, we define the model's outputs and call `summary` to inspect the model.


### 多输入多输出模型

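In practice, models that are a plain stack of layers cover only a small range of use cases. In many scenarios the topology looks more like a graph than a simple stack, and building such models is where the Functional API shines.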


In [16]:
vocabulary_size = 10000  # length of the text encoding
num_tags = 100  # length of the one-hot tag encoding
num_departments = 4  # number of departments
# Inputs
title = keras.Input(shape=(vocabulary_size, ), name="title")
text_body = keras.Input(shape=(vocabulary_size, ), name="text_body")
tags = keras.Input(shape=(num_tags, ), name="tags")
# Intermediate layers
features = layers.Concatenate()([title, text_body, tags])
features = layers.Dense(64, activation="relu")(features)
# Outputs
priority = layers.Dense(1, activation="sigmoid",
                        name="priority")(features)  # priority
department = layers.Dense(num_departments,
                          activation="softmax",
                          name="department")(features)  # department

model = keras.Model(inputs=[title, text_body, tags],
                    outputs=[priority, department])

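The model above, for example, ranks customer support tickets by priority and routes them to the appropriate department.

Inputs

- The ticket title: text
- The ticket body: text
- Any tags added by the user: assumed here to be one-hot encoded

Outputs

- The ticket priority: a score between 0 and 1 (sigmoid)
- The department the ticket should be routed to: a softmax over the set of departments

The text inputs are encoded as arrays of 0s and 1s of length `vocabulary_size`; see Chapter 11 for details.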
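To make the 0/1 encoding concrete, here is a minimal sketch using only NumPy. The five-word vocabulary and the `multi_hot` helper are hypothetical and exist only for illustration; the model above assumes a vocabulary of 10,000 entries.

```python
import numpy as np

# Hypothetical toy vocabulary, standing in for the real 10,000-word one.
vocabulary = ["refund", "login", "error", "invoice", "password"]
word_index = {word: i for i, word in enumerate(vocabulary)}


def multi_hot(text, word_index):
    """Encode text as a 0/1 vector with a 1 at the index of every known word."""
    vector = np.zeros(len(word_index), dtype="int64")
    for word in text.lower().split():
        if word in word_index:
            vector[word_index[word]] = 1
    return vector


encoded = multi_hot("Login error after password reset", word_index)
print(encoded)  # [0 1 1 0 1]
```

Words outside the vocabulary ("after", "reset") are simply ignored, and word order is lost, which is why the encoded vector always has length equal to the vocabulary size.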


#### 训练多输入多输出模型


In [17]:
import numpy as np

num_samples = 1280

title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))  # titles
text_body_data = np.random.randint(0, 2,
                                   size=(num_samples, vocabulary_size))  # bodies
tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))  # tags

priority_data = np.random.random(size=(num_samples, 1))  # priorities
department_data = np.random.randint(0, 2,
                                    size=(num_samples, num_departments))  # departments

model.compile(
    optimizer="adam",  # optimizer
    loss=["mean_squared_error", "categorical_crossentropy"],  # one loss per output
    metrics=[["mean_absolute_error"], ["accuracy"]])  # one metric list per output

model.fit([title_data, text_body_data, tags_data],
          [priority_data, department_data],
          epochs=1)  # train; the data order must match the model declaration



<tensorflow.python.keras.callbacks.History at 0x194b7a758b0>

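Training works the same way as for the single-input, single-output model above; the only difference is that the inputs and the targets are each passed as a list (`[]`).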
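Note that the random `department_data` above can contain several 1s (or none) in a row. That is fine for a smoke test, but it is not a true one-hot target for `categorical_crossentropy`. The sketch below shows one way strictly one-hot dummy labels could be generated instead, using only NumPy; the names `department_ids` and `department_one_hot` are illustrative.

```python
import numpy as np

num_samples = 1280
num_departments = 4

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
# Draw exactly one department id per sample.
department_ids = rng.integers(0, num_departments, size=num_samples)
# Row i of the identity matrix is the one-hot vector for class i.
department_one_hot = np.eye(num_departments, dtype="float32")[department_ids]

print(department_one_hot.shape)  # (1280, 4)
```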


In [18]:
results = model.evaluate([title_data, text_body_data, tags_data],
                         [priority_data, department_data])  # evaluate; the data order must match the model declaration
results



[7.965667724609375,
 0.3465381860733032,
 7.6191301345825195,
 0.5110427141189575,
 0.26171875]

Evaluation works the same way; here the training data is simply reused.
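The five numbers are easier to read with labels. In the Keras bundled with TF 2.x, `evaluate` on a multi-output model returns the total loss first, then one loss per output, then the metrics in output order. That ordering is an assumption in this sketch and can be confirmed with `model.metrics_names`. The values are copied from the output above.

```python
results = [7.965667724609375,
           0.3465381860733032,
           7.6191301345825195,
           0.5110427141189575,
           0.26171875]

# Assumed ordering; check model.metrics_names on your own model.
labels = ["loss", "priority_loss", "department_loss",
          "priority_mean_absolute_error", "department_accuracy"]
report = dict(zip(labels, results))

# With default loss weights, the total loss is the sum of the per-output losses.
per_output_sum = report["priority_loss"] + report["department_loss"]
print(abs(report["loss"] - per_output_sum) < 1e-5)  # True
```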


In [19]:
priority_preds, department_preds = model.predict(
    [title_data, text_body_data, tags_data])  # get the predictions

Getting predictions works the same way.


In [20]:
model.compile(
    optimizer="adam",  # optimizer
    loss={
        "priority": "mean_squared_error",
        "department": "categorical_crossentropy"
    },  # losses, keyed by output name
    metrics={
        "priority": ["mean_absolute_error"],
        "department": ["accuracy"]
    })  # metrics, keyed by output name

model.fit({
    "title": title_data,
    "text_body": text_body_data,
    "tags": tags_data
}, {
    "priority": priority_data,
    "department": department_data
},
          epochs=1)




<tensorflow.python.keras.callbacks.History at 0x194b8d47d30>

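The auto-formatted code above looks a bit comical..

---

If you would rather not depend on a strict ordering, you can pass dictionaries keyed by the names given to the `Input` objects and the output layers.

Losses and metrics can be specified with dictionaries in the same way.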


In [21]:
priority_preds, department_preds = model.predict({
    "title": title_data,
    "text_body": text_body_data,
    "tags": tags_data
})


Evaluation and prediction work the same way.

In [25]:
keras.utils.plot_model(model, "ticket_classifier.png")

('You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) ', 'for plot_model/model_to_dot to work.')
