# **7  Wide & Deep Model**

## **7.1 Model Overview**

### **7.1.1 Sparse and Dense Features**

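- Released by Google in 2016, originally designed for recommender systems
- Applicable to both classification and regression
- Builds the model from both the sparse and the dense features of the data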

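**Sparse features**
- Discrete-valued (categorical) features
- Represented with one-hot encoding in the Wide & Deep model
- Sparse features are easy to combine with one another, i.e. feature crossing
    - Makes it easy to derive new descriptive features
    - Makes it easy to capture co-occurring features
- Advantages
    - Effective; widely used in industry (e.g. CTR prediction)
    - Well suited to algorithms that split on features, such as tree-based methods
- Disadvantages
    - Require manual feature design
    - Storage-intensive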
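
As a rough illustration of one-hot encoding and feature crossing, the sketch below encodes two categorical features and builds their crossed feature. The vocabularies and the helpers `one_hot` and `cross` are made up for this note and do not come from any library.

```python
def one_hot(value, vocabulary):
    """Return a one-hot list with a 1 at the position of `value`."""
    return [1 if item == value else 0 for item in vocabulary]


def cross(vec_a, vec_b):
    """Cross two one-hot vectors: slot (i, j) is 1 only if a[i] and b[j] are both 1."""
    return [a * b for a in vec_a for b in vec_b]


# Hypothetical vocabularies, for illustration only
installed_vocab = ["netflix", "pandora", "youtube"]
impression_vocab = ["hulu", "spotify"]

installed = one_hot("netflix", installed_vocab)    # [1, 0, 0]
impression = one_hot("spotify", impression_vocab)  # [0, 1]

# The crossed feature has 3 * 2 = 6 slots, exactly one of which is set.
crossed = cross(installed, impression)
print(crossed)  # [0, 1, 0, 0, 0, 0]
```

The crossed vector grows with the product of the vocabulary sizes, which is one reason sparse features are storage-intensive.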
    
    

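**Dense features**
- Vector representations
    - Vocabulary -> vectors: Word2Vec
- Advantages
    - Carry semantic and relational information
    - Can accommodate new feature combinations
    - Require less manual effort
- Disadvantages
    - Prone to over-generalization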
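
The sketch below shows, under simplifying assumptions, how a discrete ID maps to a dense vector: looking up a row of an embedding table gives the same result as multiplying a one-hot vector by that table. The table here is random rather than trained (a method such as Word2Vec would learn it from data), and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3  # arbitrary sizes for illustration

# In practice this table is learned; here it is random.
embedding_table = rng.normal(size=(vocab_size, embed_dim))

token_id = 2
dense_vector = embedding_table[token_id]      # direct row lookup

one_hot = np.eye(vocab_size)[token_id]        # sparse representation of the same ID
via_matmul = one_hot @ embedding_table        # same result, more arithmetic

print(dense_vector.shape)                     # (3,)
print(np.allclose(dense_vector, via_matmul))  # True
```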

### **7.1.2 Model Architecture**

<div style='align-items:center'>
<img src='../image/1_1_wide_deep.png' alt='wide_deep_model'>
</div>

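The **Wide & Deep** model combines a wide model and a deep model:
- The wide model has a single layer: all sparse inputs are connected directly to the output layer
- The deep model first converts sparse features into dense representations, then feeds the dense features into a deep neural network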
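
The combination can be sketched in NumPy for a regression output. This is a minimal illustration with random, untrained weights and arbitrary layer sizes, not the reference implementation. It shows that adding the wide and deep contributions is the same as concatenating the wide input with the last hidden layer and applying one linear output layer.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


rng = np.random.default_rng(0)
n_wide, n_deep, n_hidden = 4, 6, 8   # arbitrary sizes
batch = 2

x_wide = rng.normal(size=(batch, n_wide))   # stands in for sparse/crossed features
x_deep = rng.normal(size=(batch, n_deep))   # stands in for dense features

# Deep part: two hidden layers with random (untrained) weights
W1, b1 = rng.normal(size=(n_deep, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_hidden)), np.zeros(n_hidden)
h2 = relu(relu(x_deep @ W1 + b1) @ W2 + b2)

# Output weights for each part, plus a shared bias
w_wide = rng.normal(size=(n_wide, 1))
w_deep = rng.normal(size=(n_hidden, 1))
bias = 0.1

# View 1: sum of the wide and deep contributions
y_sum = x_wide @ w_wide + h2 @ w_deep + bias

# View 2: concatenate, then apply a single linear layer
features = np.concatenate([x_wide, h2], axis=1)
weights = np.concatenate([w_wide, w_deep], axis=0)
y_concat = features @ weights + bias

print(y_sum.shape)                    # (2, 1)
print(np.allclose(y_sum, y_concat))   # True
```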

**Example architecture**

<div style='align-items:center'>
<img src='../image/1_2_wide_deep.png' alt='wide_deep_model'>
</div>

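In the figure above, *user installed app* and *impression app* are crossed and fed into the wide model. Of the remaining inputs, some are passed through dense embeddings and some are not.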

## **7.2 Implementation (Regression Example)**

### **7.2.1 Loading the Data**

In [1]:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os, sys, time, gc
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K

In [2]:
from sklearn.datasets import fetch_california_housing

In [3]:
housing = fetch_california_housing()

In [4]:
from sklearn.model_selection import train_test_split
X_train_all, X_test, y_train_all, y_test = train_test_split(housing.data, housing.target, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train_all, y_train_all, random_state=1)
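
The subset sizes follow from the default `test_size=0.25` of `train_test_split`. The arithmetic below is a quick check, assuming the California housing dataset has 20,640 samples and that the held-out share is rounded up. `split_sizes` is a helper written for this note, not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_size=0.25):
    """Return (n_train, n_test), rounding the held-out part up."""
    n_test = math.ceil(n_samples * test_size)
    return n_samples - n_test, n_test


n_train_all, n_test = split_sizes(20640)    # first split: train_all / test
n_train, n_val = split_sizes(n_train_all)   # second split: train / val

print(n_train, n_val, n_test)  # 11610 3870 5160
```

These figures agree with the "Train on 11610 samples, validate on 3870 samples" line printed during training.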

In [5]:
from sklearn.preprocessing import StandardScaler
std_scaler = StandardScaler()
X_train_scaled = std_scaler.fit_transform(X_train)
X_val_scaled = std_scaler.transform(X_val)
X_test_scaled = std_scaler.transform(X_test)
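
Note that the scaler is fitted on the training set only, and the validation and test sets reuse the training statistics. The NumPy sketch below mimics that step on synthetic data, assuming the default `StandardScaler` behavior (per-feature mean and population standard deviation).

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.normal(loc=10.0, scale=2.0, size=(100, 3))  # synthetic data
val = rng.normal(loc=10.0, scale=2.0, size=(20, 3))

# "fit": statistics come from the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

# "transform": the same statistics are applied to every subset
train_scaled = (train - mean) / std
val_scaled = (val - mean) / std

print(np.allclose(train_scaled.mean(axis=0), 0.0))  # True
print(np.allclose(train_scaled.std(axis=0), 1.0))   # True
```

The validation set is only approximately standardized, which is expected: using its own statistics would leak information from data the model should not have seen.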

### **7.2.2 Building the Model (Functional API)**

In [6]:
# Build the model with the functional API
# Deep part (input_ avoids shadowing the built-in input())
input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation='relu')(input_)
hidden2 = keras.layers.Dense(30, activation='relu')(hidden1)

# Concatenate the raw input (wide part) with the deep output
concat = keras.layers.concatenate([input_, hidden2])
output = keras.layers.Dense(1)(concat)

model = keras.models.Model(inputs=[input_], outputs=[output])

In [7]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 8)]          0                                            
__________________________________________________________________________________________________
dense (Dense)                   (None, 30)           270         input_1[0][0]                    
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 30)           930         dense[0][0]                      
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 38)           0           input_1[0][0]                    
                                                                 dense_1[0][0]                
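
The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is written for this note. The output layer does not appear in the listing above, so its count is derived here from the concatenated width of 38.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus biases (n_units) of a Dense layer."""
    return n_inputs * n_units + n_units


n_features = 8
hidden1 = dense_params(n_features, 30)   # 8 * 30 + 30 = 270
hidden2 = dense_params(30, 30)           # 30 * 30 + 30 = 930
concat_width = n_features + 30           # 8 raw inputs + 30 hidden units = 38
output = dense_params(concat_width, 1)   # 38 * 1 + 1 = 39

print(hidden1, hidden2, concat_width, output)  # 270 930 38 39
print(hidden1 + hidden2 + output)              # 1239
```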

In [8]:
model.compile(loss='mse', optimizer='Adam')

In [9]:
callbacks = [keras.callbacks.EarlyStopping(min_delta=1e-3, patience=3),]
history = model.fit(X_train_scaled, y_train, batch_size=128, epochs=20, validation_data=(X_val_scaled, y_val), callbacks=callbacks)

Train on 11610 samples, validate on 3870 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
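
`EarlyStopping` monitors `val_loss` by default and stops training once the loss has failed to improve by more than `min_delta` for `patience` consecutive epochs. In the run above all 20 epochs completed, so that condition was never met. The pure-Python sketch below mimics the rule; it is a simplified model of the callback rather than its actual implementation, and the loss values are invented.

```python
def early_stopping_epoch(val_losses, min_delta=1e-3, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # improvement larger than min_delta
            best = loss
            wait = 0
        else:                         # no meaningful improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Invented validation losses: progress stalls after epoch 2
stalled = [1.0, 0.8, 0.7999, 0.7995, 0.7993, 0.5]
print(early_stopping_epoch(stalled))   # 5

# Steady improvement: the stopping rule never fires
steady = [1.0, 0.9, 0.8, 0.7]
print(early_stopping_epoch(steady))    # None
```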


In [10]:
del model; gc.collect()

556

In [11]:
K.clear_session()

### **7.2.3 Building the Model (Subclassing API)**

In [12]:
# Create a model by subclassing keras.models.Model

class WideDeepModel(keras.models.Model):
    def __init__(self):
        super(WideDeepModel, self).__init__()
        self.hidden1_layer = keras.layers.Dense(30, activation='relu')
        self.hidden2_layer = keras.layers.Dense(30, activation='relu')
        self.output_layer = keras.layers.Dense(1)
    
    def call(self, inputs):
        """Forward pass of the model."""
        hidden1 = self.hidden1_layer(inputs)
        hidden2 = self.hidden2_layer(hidden1)
        concat = keras.layers.concatenate([inputs, hidden2])
        output = self.output_layer(concat)
        return output

model = WideDeepModel()
# build() must be called to construct the model
model.build(input_shape=(None, 8))

In [13]:
model.summary()
# The subclassed model can also be placed inside a Sequential model
model.compile(loss='mse', optimizer='Adam')
callbacks = [keras.callbacks.EarlyStopping(min_delta=1e-3, patience=3),]
history = model.fit(X_train_scaled, y_train, batch_size=128, epochs=20, validation_data=(X_val_scaled, y_val), callbacks=callbacks)

Model: "wide_deep_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                multiple                  270       
_________________________________________________________________
dense_1 (Dense)              multiple                  930       
_________________________________________________________________
dense_2 (Dense)              multiple                  39        
Total params: 1,239
Trainable params: 1,239
Non-trainable params: 0
_________________________________________________________________
Train on 11610 samples, validate on 3870 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [18]:
del model; gc.collect();
K.clear_session()

### **7.2.4 Building the Model (Multiple Inputs)**

In [19]:
input_wide = keras.layers.Input(shape=[5])
input_deep = keras.layers.Input(shape=[6])
hidden1 = keras.layers.Dense(30, activation='relu')(input_deep)
hidden2 = keras.layers.Dense(30, activation='relu')(hidden1)
concat = keras.layers.concatenate([input_wide, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.models.Model(inputs=[input_wide, input_deep], outputs=[output])

In [22]:
X_train_scaled_wide = X_train_scaled[:, :5]
X_train_scaled_deep = X_train_scaled[:, 2:]
X_val_scaled_wide = X_val_scaled[:, :5]
X_val_scaled_deep = X_val_scaled[:, 2:]
model.compile(loss='mse', optimizer='Adam')
callbacks = [keras.callbacks.EarlyStopping(min_delta=1e-3, patience=3)]
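
The slices give the wide input the first five columns and the deep input the last six, so three columns are shared by both parts. The sketch below applies the same slices to the feature names of the California housing dataset. The names are listed by hand here, so treat them as an assumption and compare against `housing.feature_names`.

```python
# Feature names as documented for the California housing dataset (assumed order)
feature_names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                 "Population", "AveOccup", "Latitude", "Longitude"]

wide_features = feature_names[:5]   # columns 0..4, matches shape=[5]
deep_features = feature_names[2:]   # columns 2..7, matches shape=[6]
shared = [name for name in wide_features if name in deep_features]

print(wide_features)  # ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population']
print(deep_features)  # ['AveRooms', 'AveBedrms', 'Population', 'AveOccup', 'Latitude', 'Longitude']
print(shared)         # ['AveRooms', 'AveBedrms', 'Population']
```

The split is arbitrary in this example. In a real Wide & Deep setup the wide input would carry sparse or crossed features and the deep input would carry dense ones.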

In [23]:
history = model.fit([X_train_scaled_wide, X_train_scaled_deep], y_train,
                    validation_data=([X_val_scaled_wide, X_val_scaled_deep], y_val),
                    epochs=10, batch_size=128, callbacks=callbacks)

Train on 11610 samples, validate on 3870 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [24]:
del model; gc.collect();
K.clear_session()

### **7.2.5 Building the Model (Multiple Inputs and Outputs)**

Multi-output models are typically used for multi-task learning; the Wide & Deep model itself is not a multi-output model.

In [29]:
input_wide = keras.layers.Input(shape=[5])
input_deep = keras.layers.Input(shape=[6])
hidden1 = keras.layers.Dense(30, activation='relu')(input_deep)
hidden2 = keras.layers.Dense(30, activation='relu')(hidden1)
concat = keras.layers.concatenate([input_wide, hidden2])
output = keras.layers.Dense(1)(concat)
output2 = keras.layers.Dense(1)(hidden2)
model = keras.models.Model(inputs=[input_wide, input_deep], outputs=[output, output2])

In [30]:
model.compile(loss='mse', optimizer='Adam')
callbacks = [keras.callbacks.EarlyStopping(min_delta=1e-3, patience=3)]
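
With a single `loss='mse'` and no `loss_weights`, the same loss is applied to each output, and by default the per-output losses are added with equal weight to form the total loss that is minimized. A minimal pure-Python sketch with made-up targets and predictions:

```python
def mse(y_true, y_pred):
    """Mean squared error of two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration
y_true = [1.0, 2.0, 3.0, 4.0]
pred_main = [1.0, 2.0, 3.0, 6.0]   # stands in for `output` (wide + deep)
pred_aux = [0.0, 2.0, 3.0, 4.0]    # stands in for `output2` (deep only)

loss_weights = [1.0, 1.0]          # equal weighting, the assumed default
losses = [mse(y_true, pred_main), mse(y_true, pred_aux)]
total_loss = sum(w * l for w, l in zip(loss_weights, losses))

print(losses)      # [1.0, 0.25]
print(total_loss)  # 1.25
```

Passing `loss_weights` to `model.compile` changes the weighting, which is useful when the auxiliary output should influence training less than the main one.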

In [31]:
history = model.fit([X_train_scaled_wide, X_train_scaled_deep], [y_train, y_train],
                    validation_data=([X_val_scaled_wide, X_val_scaled_deep], [y_val, y_val]),
                    epochs=10, batch_size=128, callbacks=callbacks)

Train on 11610 samples, validate on 3870 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [32]:
del model; gc.collect();
K.clear_session()