# Factorization Machine

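Factorization Machines (FM) are a predictive model developed by Steffen Rendle. They achieve high predictive accuracy on sparse datasets and capture interactions between features efficiently, because each pairwise interaction weight is modeled as the inner product of low-dimensional latent vectors rather than as a free parameter. This makes them well suited to recommender systems and other prediction tasks on sparse inputs.
- GitHub: https://github.com/ibayer/fastFM-core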
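
To make the interaction term concrete, here is a minimal NumPy sketch of the second-order FM prediction for a single feature vector. It is an illustration rather than fastFM's implementation: the function names `fm_predict` and `fm_predict_naive`, the toy parameters, and the `(rank, n_features)` layout of `V` are all assumptions made for this example.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM prediction for one dense feature vector.

    x: (n_features,), w0: scalar bias, w: (n_features,) linear weights,
    V: (rank, n_features) latent factors (layout assumed for this sketch).
    """
    linear = w0 + w @ x
    # Pairwise term in O(rank * n_features):
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i V[f, i] x_i)^2 - sum_i V[f, i]^2 x_i^2 ]
    vx = V @ x
    v2x2 = (V ** 2) @ (x ** 2)
    pairwise = 0.5 * np.sum(vx ** 2 - v2x2)
    return float(linear + pairwise)

def fm_predict_naive(x, w0, w, V):
    """Same model with the explicit double loop, for checking only."""
    n = len(x)
    out = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            out += (V[:, i] @ V[:, j]) * x[i] * x[j]
    return float(out)

# Hypothetical toy parameters (not taken from the trained model in this notebook).
rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
w0 = 0.5
w = rng.normal(size=5)
V = rng.normal(scale=0.1, size=(2, 5))

# The fast formulation and the explicit double loop agree.
assert np.isclose(fm_predict(x, w0, w, V), fm_predict_naive(x, w0, w, V))
```

The reformulated pairwise term avoids the double loop over feature pairs, which is what keeps FMs practical on wide, sparse design matrices.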

In [9]:
%%capture
!pip install fastFM==0.2.10
!pip install category_encoders==2.6.3

In [25]:
from fastFM import als
from scipy.sparse import csr_matrix

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import category_encoders as ce
from sklearn.metrics import mean_squared_error

### Preparing the data

In [17]:
df = pd.read_csv('https://raw.githubusercontent.com/fuyu-quant/data-science-wiki/develop/datasets/fm_sample.csv')
df.head()

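| Unnamed: 0 | カテゴリ | rating | is_A | is_B | is_C | is_X | is_Y | is_Z |
|---|---|---|---|---|---|---|---|---|
| 0 | A | 5 | 0 | 0 | 1 | 0 | 1 | 0 |
| 1 | A | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| 2 | A | 3 | 0 | 0 | 1 | 0 | 1 | 0 |
| 3 | A | 2 | 1 | 0 | 0 | 0 | 0 | 1 |
| 4 | A | 4 | 0 | 0 | 0 | 0 | 0 | 1 |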


In [18]:
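# Separate the features from the target column.
# Note: 'Unnamed: 0' (the row index stored in the CSV) stays in X as a feature.
X = df.drop('rating', axis=1)
y = df['rating']

# One-hot encode the categorical column 'カテゴリ' ("category").
enc = ce.OneHotEncoder(cols=['カテゴリ'])
X = enc.fit_transform(X)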
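
For intuition about what one-hot encoding does to a categorical column, here is a minimal NumPy sketch. The category values are hypothetical, and the sketch does not reproduce the column names or ordering that `category_encoders` generates.

```python
import numpy as np

# Hypothetical category values, not taken from fm_sample.csv.
categories = np.array(['A', 'B', 'A', 'C'])

# Map each distinct level to an integer code, then select rows of an identity matrix.
levels, codes = np.unique(categories, return_inverse=True)
one_hot = np.eye(len(levels), dtype=int)[codes]

print(levels)
print(one_hot)
```

Each row contains a single 1 in the column of its category, so the resulting matrix is mostly zeros. FMs are designed for this kind of sparse input.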

In [19]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=810)

### Training the FM

In [21]:
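# ALS-trained FM regressor; rank = latent dimension, l2_reg_w / l2_reg_V = L2 penalties on linear / pairwise weights.
fm = als.FMRegression(n_iter=1000, init_stdev=0.1, rank=2, l2_reg_w=0.1, l2_reg_V=0.5)
# fastFM takes a scipy sparse matrix as input.
fm.fit(csr_matrix(X_train), y_train)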

### Predicting with the FM

In [22]:
y_pred = fm.predict(csr_matrix(X_test))

In [23]:
y_pred

array([3.99248909, 2.74152162, 1.81844739, 3.83751232, 3.05686936,
       6.38743231, 1.55447592, 1.56438976, 1.18439236, 3.56716205,
       1.91208757, 3.79094619, 4.2139054 , 1.23664624, 3.08025806,
       2.98097866, 2.62600819, 1.58702925])

In [26]:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print('RMSE : {:.3f}'.format(rmse))


RMSE : 1.286
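
The RMSE reported above is the square root of the mean squared difference between the true and predicted ratings. A minimal sketch of the same computation without scikit-learn follows; the helper name `rmse` and the toy values are assumptions for illustration and are not taken from the dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical toy ratings and predictions.
toy_true = [5, 1, 3]
toy_pred = [4.0, 1.0, 5.0]
print('RMSE : {:.3f}'.format(rmse(toy_true, toy_pred)))
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones.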
