### Factorization Machine Demo Using `xlearn`

From https://github.com/aksnzhy/xlearn/blob/master/demo/classification/scikit_learn_demo/example_FM_wine.py

In [1]:
import numpy as np
import pandas as pd
import xlearn as xl
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

Load data

In [2]:
wine_data = load_wine()
X = wine_data['data']
y = (wine_data['target'] == 1).astype(int)

In [3]:
print(wine_data.DESCR)

Wine Data Database

Notes
-----
Data Set Characteristics:
    :Number of Instances: 178 (50 in each of three classes)
    :Number of Attributes: 13 numeric, predictive attributes and the class
    :Attribute Information:
 		- 1) Alcohol
 		- 2) Malic acid
 		- 3) Ash
		- 4) Alcalinity of ash  
 		- 5) Magnesium
		- 6) Total phenols
 		- 7) Flavanoids
 		- 8) Nonflavanoid phenols
 		- 9) Proanthocyanins
		- 10)Color intensity
 		- 11)Hue
 		- 12)OD280/OD315 of diluted wines
 		- 13)Proline
        	- class:
                - class_0
                - class_1
                - class_2
		
    :Summary Statistics:
    
                                   Min   Max   Mean     SD
    Alcohol:                      11.0  14.8    13.0   0.8
    Malic Acid:                   0.74  5.80    2.34  1.12
    Ash:                          1.36  3.23    2.36  0.27
    Alcalinity of Ash:            10.6  30.0    19.5   3.3
    Magnesium:                    70.0 162.0    99.7  14.3
    Total Phenols:     

In [4]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=.2, random_state=42)
print(X_train.shape, X_val.shape)

(142, 13) (36, 13)
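
`StandardScaler` is imported above but never applied, and the summary statistics show features on very different scales (Magnesium ranges from 70 to 162, Malic Acid from 0.74 to 5.80). Gradient-based training is often sensitive to that, so standardizing may be worth trying; whether it changes the results in this notebook has not been checked here. The sketch below is a dependency-free illustration of what standardization does, with statistics taken from the training rows only. The helper names `fit_standardizer` and `standardize` are hypothetical and are not part of scikit-learn or xlearn.

```python
from statistics import fmean, pstdev


def fit_standardizer(rows):
    """Compute per-column mean and population std from training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy data (made up for illustration): two training rows, one validation row.
toy_train = [[1.0, 10.0], [3.0, 10.0]]
toy_val = [[5.0, 12.0]]

means, stds = fit_standardizer(toy_train)          # statistics from train only
toy_train_scaled = standardize(toy_train, means, stds)
toy_val_scaled = standardize(toy_val, means, stds)  # reuse the train statistics
```

In the notebook itself, the equivalent would be `StandardScaler().fit(X_train)` followed by `transform` on both `X_train` and `X_val`.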


#### Factorization Machine

In [5]:
fm_params = {
    'task'      : 'binary',  # binary classification
    'init'      : 0.1,       # scale of the initial model parameters
    'epoch'     : 20,        # number of training epochs
    'k'         : 5,         # number of latent factors per feature
    'lr'        : 0.05,      # learning rate
    'opt'       : 'sgd',     # optimization algorithm
    'metric'    : 'auc'      # evaluation metric
}
fm_clf = xl.FMModel(**fm_params)
fm_clf.fit(X_train, y_train, eval_set=[X_val, y_val])

Here the FM model is 

\\[
y = w_0 + \sum_{i=1}^{13} w_i x_i + \sum_{i=1}^{13} \sum_{j=i+1}^{13} \langle v_i, v_j \rangle x_i x_j,
\\]

where each feature \\(x_i\\) has a corresponding latent vector \\(v_i\\) of length \\(k = 5\\), matching the `k` parameter set above.
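
To make the equation concrete, here is a minimal pure-Python sketch that evaluates it in two ways: directly over all feature pairs, and with the standard algebraic identity that reduces the pairwise term to one pass per latent dimension. The function names and the toy parameters are hypothetical; this is not xlearn's implementation, and for a binary task the resulting score would normally still be passed through a sigmoid to obtain a probability.

```python
from itertools import combinations


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def fm_score_naive(x, w0, w, V):
    """Evaluate the FM equation directly: O(n^2 * k) for n features."""
    linear = w0 + dot(w, x)
    pairwise = sum(
        dot(V[i], V[j]) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)  # all pairs with i < j
    )
    return linear + pairwise


def fm_score_fast(x, w0, w, V):
    """Same value in O(n * k), using
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ].
    """
    linear = w0 + dot(w, x)
    pairwise = 0.0
    for f in range(len(V[0])):  # loop over latent dimensions
        terms = [V[i][f] * x[i] for i in range(len(x))]
        pairwise += 0.5 * (sum(terms) ** 2 - sum(t * t for t in terms))
    return linear + pairwise


# Toy parameters (made up): 3 features, latent vectors of length 2.
x = [1.0, 2.0, 3.0]
w0 = 0.5
w = [0.1, -0.2, 0.3]
V = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]

naive = fm_score_naive(x, w0, w, V)
fast = fm_score_fast(x, w0, w, V)
```

Both functions return the same score for the toy input (linear part 1.1 plus pairwise part 13). The fast form is the reason FM prediction scales linearly with the number of features.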


In [6]:
print('Raw 13 features weights (with intercept):\n{}\n'.format(fm_clf.weights[0]))
print('13 features latent vectors (13 x 5):\n{}'.format(fm_clf.weights[1]))

Raw 13 features weights (with intercept):
[-0.386988    0.0178063   0.00099325  0.00332839  0.0370448   0.138444
  0.00435295  0.00519787  0.00053309  0.00321714 -0.00539737  0.00251978
  0.00642384 -0.102563  ]

13 features latent vectors (13 x 5):
[[1.80238e-08 6.65773e-05 4.70837e-04 6.98098e-04 6.95968e-04]
 [1.30372e-04 3.53939e-04 2.73548e-04 1.80693e-04 5.08456e-04]
 [6.21657e-05 3.89033e-04 4.04194e-04 5.62013e-04 4.02764e-04]
 [3.29510e-04 5.64495e-04 6.40769e-04 4.67633e-04 6.28706e-04]
 [1.92278e-04 2.75728e-04 5.82131e-04 4.22603e-04 5.80532e-04]
 [6.19164e-04 1.06064e-04 5.52488e-04 5.83949e-04 1.90285e-04]
 [5.99864e-04 2.65669e-04 1.97633e-04 3.29899e-04 2.49576e-04]
 [5.00978e-04 4.37138e-04 3.45245e-04 3.75778e-04 5.93917e-05]
 [3.84677e-04 2.72822e-04 3.78407e-04 5.33277e-04 3.10768e-04]
 [4.87783e-04 4.52295e-05 1.21092e-04 1.41131e-04 4.97492e-04]
 [5.48885e-05 1.18542e-04 4.03129e-04 3.78580e-04 4.78461e-04]
 [2.02280e-04 2.82586e-04 2.65291e-04 5.85265e-04 6.38979

In [7]:
y_val_pred_proba_fm = fm_clf.predict(X_val)
print('FM AUC = {}'.format(roc_auc_score(y_val, y_val_pred_proba_fm)))

FM AUC = 0.8701298701298702
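
ROC AUC depends only on how the scores rank positive examples relative to negative ones, so `roc_auc_score` can be given either raw scores or probabilities. A minimal sketch of that pairwise definition follows; `auc_by_pairs` is a hypothetical helper, quadratic in the number of samples and meant for illustration only.

```python
def auc_by_pairs(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores (made up for illustration).
toy_y = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

auc = auc_by_pairs(toy_y, toy_scores)

# A strictly increasing transform of the scores leaves the ranking, and so the AUC, unchanged.
auc_shifted = auc_by_pairs(toy_y, [10 * s - 3 for s in toy_scores])
```

Three of the four positive-negative pairs in the toy data are ordered correctly, giving an AUC of 0.75 both before and after the transform.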


#### Comparison with Logistic Regression

In [8]:
lr_clf = LogisticRegression()
lr_clf.fit(X_train, y_train)
# Column 1 of predict_proba holds the probability of the positive class
y_val_pred_proba_lr = lr_clf.predict_proba(X_val)[:, 1]
print('LR AUC = {}'.format(roc_auc_score(y_val, y_val_pred_proba_lr)))

LR AUC = 0.9935064935064936
