# Advanced Keras Usage and More

## Keras Functional API

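A Keras model defined with `Sequential` can have only one input and one output, which is often not flexible enough. Building more varied network structures requires the Functional API.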

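The two code blocks below define the same model twice, once with `Sequential` and once with the Functional API; the two are equivalent. With `Sequential` you only add layers, and the class takes care of passing data between them: the input flows through the layers one after another, pipeline-style, to produce the final result.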

In [2]:
from keras.models import Sequential, Model
from keras import layers
from keras import Input

seq_model = Sequential()
seq_model.add(layers.Dense(32, activation='relu', input_shape=(64,)))
seq_model.add(layers.Dense(32, activation='relu'))
seq_model.add(layers.Dense(10, activation='softmax'))
seq_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 32)                2080      
_________________________________________________________________
dense_5 (Dense)              (None, 32)                1056      
_________________________________________________________________
dense_6 (Dense)              (None, 10)                330       
Total params: 3,466
Trainable params: 3,466
Non-trainable params: 0
_________________________________________________________________
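
The parameter counts in the summary above can be checked by hand. A minimal sketch, assuming the standard fully connected layer layout (one weight per input-unit pair plus one bias per unit); `dense_params` is a helper name made up for this illustration, not a Keras function:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for the three Dense layers of the model above
layer_sizes = [(64, 32), (32, 32), (32, 10)]

per_layer = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
total = sum(per_layer)

print(per_layer, total)
```

The per-layer values should line up with the `Param #` column, and their sum with `Total params`.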


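With the Functional API you wire up the data flow yourself, which makes it more flexible. With `Sequential`, `layers.Dense(32, activation='relu')` is treated as a layer; in fact it is a callable object that behaves like a function: it applies a computation to its input and returns the output.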

In [4]:
input_tensor = Input(shape=(64,))
x = layers.Dense(32, activation='relu')(input_tensor)
x = layers.Dense(32, activation='relu')(x)
output_tensor = layers.Dense(10, activation='softmax')(x)

model = Model(input_tensor, output_tensor)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 64)                0         
_________________________________________________________________
dense_7 (Dense)              (None, 32)                2080      
_________________________________________________________________
dense_8 (Dense)              (None, 32)                1056      
_________________________________________________________________
dense_9 (Dense)              (None, 10)                330       
Total params: 3,466
Trainable params: 3,466
Non-trainable params: 0
_________________________________________________________________
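
To make the "a layer is just a callable" idea concrete, here is a toy sketch in plain Python. It is not Keras: `Scale`, `AddBias` and `ToySequential` are made-up names, and the toy layers compute values immediately, whereas Keras layers called on an `Input` build a symbolic graph. The point is only that a sequential container and manual chaining do the same thing.

```python
class Scale:
    """Toy stand-in for a layer: multiplies every element by a factor."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [v * self.factor for v in values]


class AddBias:
    """Toy stand-in for a layer: adds a constant to every element."""
    def __init__(self, bias):
        self.bias = bias

    def __call__(self, values):
        return [v + self.bias for v in values]


class ToySequential:
    """Feeds the input through the added layers one after another."""
    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def __call__(self, values):
        for layer in self.layers:
            values = layer(values)
        return values


# Sequential style: the container moves the data for us
seq = ToySequential()
seq.add(Scale(2))
seq.add(AddBias(1))
seq_out = seq([1, 2, 3])

# Functional style: we call each layer ourselves and pass the result on
x = Scale(2)([1, 2, 3])
func_out = AddBias(1)(x)

print(seq_out, func_out)
```

Because the functional style exposes every intermediate value, it is easy to route one value into several layers or merge several values into one, which a purely sequential container cannot express.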


## Models with Multiple Inputs

![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/05/11/5cd6a5753a213b04174bc8e1.jpg)

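For a model like the one pictured above, here is a deliberately contrived example: classify IMDB movie reviews, with the two inputs being the review text and the same text reversed. In spirit this is probably close to a bidirectional LSTM.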

In [6]:
from keras.datasets import imdb
from keras.preprocessing import sequence

vocab_size = 10000
max_len = 200

(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=vocab_size)

# pad_sequences makes all sequences the same length: longer ones are truncated, shorter ones are zero-padded
input_train = sequence.pad_sequences(input_train, maxlen=max_len)
input_test = sequence.pad_sequences(input_test, maxlen=max_len)

# Reverse every sequence in the samples
input_train_reversed = input_train[:,::-1]
input_test_reversed = input_test[:,::-1]
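
What padding and reversal do to a single sequence can be sketched in plain Python. This assumes the default behaviour of `pad_sequences` (zeros added at the front, and over-long sequences cut at the front so the last `maxlen` tokens are kept); `pad_pre` is a hypothetical helper written for this illustration, not part of Keras.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the assumed pad_sequences defaults for one sequence:
    keep the last `maxlen` items, then left-pad with `value`."""
    seq = list(seq)
    if len(seq) > maxlen:
        seq = seq[len(seq) - maxlen:]
    return [value] * (maxlen - len(seq)) + seq


short = pad_pre([5, 6, 7], maxlen=5)               # shorter than maxlen
long_ = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)   # longer than maxlen

# Reversing a padded sequence moves the zeros from the front to the end
short_reversed = short[::-1]

print(short, long_, short_reversed)
```

One consequence worth keeping in mind: in the reversed input the padding zeros come last, so the second LSTM reads real tokens first and padding afterwards.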

In [18]:
from keras import layers, Input, Model

text1_input = Input(shape=(None,), dtype='int32', name='text1')
embeded_text1 = layers.Embedding(vocab_size, 128)(text1_input)
coded_text1 = layers.LSTM(32)(embeded_text1)

text2_input = Input(shape=(None,), dtype='int32', name='text2')
embeded_text2 = layers.Embedding(vocab_size, 128)(text2_input)
coded_text2 = layers.LSTM(32)(embeded_text2)

concatenated = layers.concatenate([coded_text1, coded_text2], axis=-1)

y = layers.Dense(50, activation='relu')(concatenated)
label = layers.Dense(1, activation='sigmoid')(y)

model = Model([text1_input, text2_input], label)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc'])

In [19]:
model.fit([input_train, input_train_reversed], y_train,
          epochs=10, batch_size=128, validation_split=0.2)

Train on 20000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f80380f50f0>

It does appear to work, although with this many parameters the model overfits.

The `model.fit` call above can also be written as follows, passing the inputs as a dict. The dict keys are the `name` values given when the `Input` tensors were defined.

```python
model.fit({'text1': input_train, 'text2': input_train_reversed},
          y_train, epochs=10, batch_size=128, validation_split=0.2)
```

## Layer Sharing

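In the example above there is no need for two Embedding layers: both inputs can go through the same Embedding layer and then through separate LSTM layers. To do that, the Embedding layer has to be shared.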

In [21]:
from keras import layers, Input, Model

embedding = layers.Embedding(vocab_size, 128)

text1_input = Input(shape=(None,), dtype='int32', name='text1')
embeded_text1 = embedding(text1_input)
coded_text1 = layers.LSTM(32)(embeded_text1)

text2_input = Input(shape=(None,), dtype='int32', name='text2')
embeded_text2 = embedding(text2_input)
coded_text2 = layers.LSTM(32)(embeded_text2)

concatenated = layers.concatenate([coded_text1, coded_text2], axis=-1)

y = layers.Dense(50, activation='relu')(concatenated)
label = layers.Dense(1, activation='sigmoid')(y)

model = Model([text1_input, text2_input], label)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc'])

model.fit([input_train, input_train_reversed], y_train, epochs=5, batch_size=128, validation_split=0.2)

Train on 20000 samples, validate on 5000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f7ca1ab7ef0>
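
A back-of-the-envelope count shows how much sharing the Embedding layer saves. This is a sketch under the usual parameter formulas (an embedding table of `vocab_size * dim` weights; an LSTM with four gates, each with input weights, recurrent weights and a bias; a dense layer with weights plus biases). The helper names are made up for this illustration.

```python
def embedding_params(vocab_size, dim):
    """One dim-sized vector per vocabulary entry."""
    return vocab_size * dim

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (input_dim * units + units * units + units)

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return n_inputs * n_units + n_units


vocab_size, embed_dim, lstm_units = 10000, 128, 32

emb = embedding_params(vocab_size, embed_dim)
lstm = lstm_params(embed_dim, lstm_units)
# Dense(50) on the 64-dim concatenation, then Dense(1)
head = dense_params(2 * lstm_units, 50) + dense_params(50, 1)

separate_total = 2 * emb + 2 * lstm + head   # two Embedding layers
shared_total = emb + 2 * lstm + head         # one shared Embedding layer

print(separate_total, shared_total)
```

Under these assumptions the embedding tables dominate the parameter count, so sharing one table roughly halves the size of the model.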

## Keras Callback

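A callback is a function that gets called at specific points during training, letting the user intervene in the process, for example to stop early, show progress, print logs, or save the model.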

In [34]:
import time
import keras

class EpochTimeLogger(keras.callbacks.Callback):
    def __init__(self):
        # Let the base class set up its own state (model, params, ...)
        super().__init__()
        self.epoch_start_time = None
    
    def on_epoch_begin(self, epoch, logs=None):
        self.epoch_start_time = time.time()
        self.epochs = self.params['epochs']
    
    def on_epoch_end(self, epoch, logs=None):
        now = time.time()
        duration = now - self.epoch_start_time
        print('Epoch %d/%d - %.1fs' % (epoch + 1, self.epochs, duration))

In [36]:
callbacks_list = [
    EpochTimeLogger()
]

model.fit([input_train, input_train_reversed], y_train,
          callbacks=callbacks_list, verbose=0,
          epochs=3, batch_size=512, validation_split=0.2)

Epoch 1/3 - 24.7s
Epoch 2/3 - 24.5s
Epoch 3/3 - 24.5s


<keras.callbacks.History at 0x7f80858345c0>
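
The mechanism behind callbacks, including early stopping, can be sketched without Keras. The loop below is a toy: `toy_fit`, `ToyCallback` and `ToyEarlyStopping` are hypothetical names, and the "training" just replays a list of pretend validation losses. In Keras itself the stop flag lives on the model (`model.stop_training`) rather than on the callback.

```python
class ToyCallback:
    """Base class: hooks that the training loop calls at fixed points."""
    def on_epoch_begin(self, epoch, logs=None):
        pass

    def on_epoch_end(self, epoch, logs=None):
        pass


class ToyEarlyStopping(ToyCallback):
    """Ask for a stop once val_loss has not improved for `patience` epochs."""
    def __init__(self, patience=1):
        self.patience = patience
        self.best = float('inf')
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs=None):
        val_loss = logs['val_loss']
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stop_training = True


def toy_fit(val_losses, callbacks):
    """Replay pretend per-epoch validation losses, invoking the hooks."""
    epochs_run = 0
    for epoch, val_loss in enumerate(val_losses):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        # A real training step would run here
        epochs_run += 1
        logs = {'val_loss': val_loss}
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)
        if any(getattr(cb, 'stop_training', False) for cb in callbacks):
            break
    return epochs_run


stopper = ToyEarlyStopping(patience=2)
epochs_run = toy_fit([0.9, 0.7, 0.8, 0.75, 0.6], [stopper])
print(epochs_run, stopper.best)
```

Here the loss stops improving after the second epoch, so with `patience=2` the loop ends before the last epoch is reached.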

## TensorBoard

Run `tensorboard --logdir=logs` in a terminal, then open `http://localhost:6006` in a browser to view the dashboard.

In [42]:
import keras

callbacks = [
    keras.callbacks.TensorBoard(
        # Log files will be written at this location
        log_dir='logs',
        # We will record activation histograms every 1 epoch
        histogram_freq=1,
    )
]

model.fit([input_train, input_train_reversed], y_train,
          callbacks=callbacks,
          epochs=5, batch_size=512, validation_split=0.2)

Train on 20000 samples, validate on 5000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f7afb3c1048>