# Factorization Machine

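The Linear Regression model covered earlier has the form $\hat{y}= w^T x$. Factorization Machine extends Linear Regression with so-called cross terms $c_{ij}x_ix_j$, giving $\hat{y} = w_1x_1 + \ldots + w_fx_f + \sum_{p=1}^{f-1} \sum_{q=p+1}^{f} c_{pq}x_px_q$, where $f$ is the number of features. Because some feature pairs never actually occur together in practice, their coefficients cannot be learned one by one, so the coefficients are built from vector products instead: take an $f \times k$ matrix whose rows are $v_1, \ldots, v_f$ and use the product of any two rows $v_i, v_j$ as a coefficient. Choosing two of the $f$ vectors yields $C_f^2 = \frac{f(f-1)}{2}$ coefficients, exactly the number of cross terms.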
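The construction above can be sketched in a few lines of NumPy. This is only an illustration: the sizes `f = 5`, `k = 2` and the random values are made up, and the product of two rows is taken to be their inner product.

```python
import numpy as np
from itertools import combinations

f, k = 5, 2                      # hypothetical sizes, chosen only for illustration
rng = np.random.default_rng(0)
w = rng.normal(size=f)           # linear weights w_1..w_f
V = rng.normal(size=(f, k))      # one k-dimensional vector v_p per feature
x = rng.normal(size=f)           # a single input sample

# Every unordered feature pair (p, q) with p < q gets one coefficient c_pq.
pairs = list(combinations(range(f), 2))
coeffs = {(p, q): float(V[p] @ V[q]) for p, q in pairs}

# Prediction: linear part plus the explicit sum over all cross terms.
y_hat = float(w @ x) + sum(c * x[p] * x[q] for (p, q), c in coeffs.items())

print(len(pairs), f * (f - 1) // 2)  # number of cross terms vs. f(f-1)/2
print(V.size)                        # number of parameters that generate them
```

The point of the design is that all $\frac{f(f-1)}{2}$ coefficients are generated from only $f \times k$ numbers, so a pair that is rare in the data still gets a coefficient from vectors that were trained on other pairs.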

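The cross terms can be rewritten so that they cost $O(kf)$ rather than $O(kf^2)$ to evaluate:

$\begin{align*}
\sum_{p=1}^{f-1} \sum_{q=p+1}^{f} c_{pq}x_px_q & = \frac{1}{2}\Big(\sum_{p=1}^f \sum_{q=1}^f c_{pq}x_px_q - \sum_{p=1}^f c_{pp}x_p^2\Big) \\
& = \frac{1}{2} \sum_{u=1}^k\Big[\Big(\sum_{p=1}^f v_{p,u}x_p\Big)\Big(\sum_{q=1}^f v_{q,u}x_q\Big) - \sum_{p=1}^f v_{p, u}^2x_p^2\Big] \\
& = \frac{1}{2} \sum_{u=1}^k\Big[\Big(\sum_{p=1}^f v_{p,u}x_p\Big)^2 - \sum_{p=1}^f v_{p, u}^2x_p^2\Big]
\end{align*}$

where $c_{pq} = \langle v_p, v_q \rangle = \sum_{u=1}^k v_{p,u}v_{q,u}$ is the inner product of $v_p$ and $v_q$.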
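As a quick sanity check of the identity above, the sketch below evaluates the cross terms both ways on random data. The function names `pairwise_naive` and `pairwise_fast` are made up for this example, and the coefficient is assumed to be the inner product of the two vectors.

```python
import numpy as np

def pairwise_naive(x, V):
    """Double loop over all pairs p < q."""
    f = len(x)
    total = 0.0
    for p in range(f - 1):
        for q in range(p + 1, f):
            total += np.dot(V[p], V[q]) * x[p] * x[q]
    return total

def pairwise_fast(x, V):
    """Same quantity via 0.5 * sum_u [(sum_p v_pu x_p)^2 - sum_p v_pu^2 x_p^2]."""
    s = V.T @ x                    # shape (k,): sum_p v_{p,u} x_p for each u
    s_sq = (V ** 2).T @ (x ** 2)   # shape (k,): sum_p v_{p,u}^2 x_p^2 for each u
    return 0.5 * np.sum(s ** 2 - s_sq)

rng = np.random.default_rng(0)
f, k = 6, 3                        # hypothetical sizes
x = rng.normal(size=f)
V = rng.normal(size=(f, k))

naive = pairwise_naive(x, V)
fast = pairwise_fast(x, V)
print(np.isclose(naive, fast))
```

The fast version touches each entry of `V` a constant number of times, which is where the linear dependence on $f$ comes from.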

The loss is still based on MSE, written here as half the sum of squared errors over the $n$ training samples: $loss = \frac{1}{2} \sum_{i=1}^n(\hat{y}_i - y_i)^2$

Gradient computation and parameter updates, where $\eta$ is the learning rate and $x_{i,p}$ is feature $p$ of sample $i$:

$\begin{align*}
w_p & = w_p - \eta \cdot \sum_{i=1}^n (\hat{y}_i - y_i) \cdot x_{i,p} \\
v_{p,u} & = v_{p,u} - \eta \cdot \sum_{i=1}^n \Big[(\hat{y}_i - y_i) \cdot \Big(x_{i, p} \sum_{q=1}^f v_{q, u} x_{i, q} - x_{i, p}^2v_{p, u}\Big)\Big]
\end{align*}$
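The update rules can be written in vectorized form. The following is a minimal sketch under stated assumptions, not a definitive implementation: it does full-batch gradient descent, the function names (`fm_predict`, `fm_loss`, `fm_gradients`, `fm_step`) are made up for this example, and the data is random.

```python
import numpy as np

def fm_predict(X, w, V):
    """Predictions for all rows of X (n x f), with w of shape (f,) and V of shape (f, k)."""
    XV = X @ V                                            # (n, k): sum_p v_{p,u} x_{i,p}
    pair = 0.5 * np.sum(XV ** 2 - (X ** 2) @ (V ** 2), axis=1)
    return X @ w + pair

def fm_loss(X, y, w, V):
    """Half the sum of squared errors."""
    r = fm_predict(X, w, V) - y
    return 0.5 * np.sum(r ** 2)

def fm_gradients(X, y, w, V):
    """Gradients of fm_loss with respect to w and V."""
    r = fm_predict(X, w, V) - y                           # residuals, shape (n,)
    grad_w = X.T @ r                                      # sum_i r_i x_{i,p}
    XV = X @ V
    # sum_i r_i * (x_{i,p} * XV[i,u] - x_{i,p}^2 * v_{p,u})
    grad_V = X.T @ (r[:, None] * XV) - ((X ** 2).T @ r)[:, None] * V
    return grad_w, grad_V

def fm_step(X, y, w, V, eta):
    """One full-batch gradient descent update."""
    grad_w, grad_V = fm_gradients(X, y, w, V)
    return w - eta * grad_w, V - eta * grad_V

# Tiny random problem; sizes and learning rate are arbitrary illustration values.
rng = np.random.default_rng(0)
n, f, k = 20, 4, 2
X = rng.normal(size=(n, f))
y = rng.normal(size=n)
w = np.zeros(f)
V = 0.1 * rng.normal(size=(f, k))

w_new, V_new = fm_step(X, y, w, V, eta=1e-3)
print(fm_loss(X, y, w, V), fm_loss(X, y, w_new, V_new))
```

In practice the step would be repeated for many iterations, and the learning rate usually needs tuning because the loss is not convex in $V$.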

As formulated here, Factorization Machine is an algorithm for regression tasks. To demonstrate it, the Boston Housing dataset is used.

In [18]:
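import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split


# Load the features into a DataFrame with named columns
boston = load_boston()
boston_data = pd.DataFrame(boston.data, columns=boston.feature_names)

# Standardize every feature to zero mean and unit variance
ss = StandardScaler()
boston_data = ss.fit_transform(boston_data)
# Append the target (house price) as the last column
boston_data = np.c_[boston_data, np.array(boston.target)]

shape = boston_data.shape
# Split features (all but the last column) and target (last column); 25% is held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    boston_data[:, :-1], boston_data[:, -1], test_size=0.25, random_state=33)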