# Matrix Factorization

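Matrix Factorization decomposes high-dimensional data into low-dimensional latent factors and is used mainly in recommender systems. It models the relationship between users and items through latent features, which lets it predict preferences for user-item pairs that have never been observed. It is particularly effective on sparse datasets and is well known for its use in the Netflix Prize competition.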
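To make the idea concrete, here is a minimal sketch of matrix factorization trained with stochastic gradient descent on a toy rating set. It is an illustration under stated assumptions, not the Cornac implementation: the function names `train_mf` and `predict`, the toy data, and the hyperparameters are all hypothetical, and bias terms are left out for brevity.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Fit r_ui ~ p_u . q_i by SGD over observed (user, item, rating) triples."""
    rng = random.Random(seed)
    # Small random initialisation of the user (P) and item (Q) factor matrices.
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pf, qf = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularisation.
                P[u][f] += lr * (err * qf - reg * pf)
                Q[i][f] += lr * (err * pf - reg * qf)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Toy data (hypothetical): 3 users, 3 items, 6 of the 9 cells observed.
toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = train_mf(toy, n_users=3, n_items=3)

rmse = (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in toy) / len(toy)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
print(f"predicted rating for the unobserved pair (0, 2): {predict(P, Q, 0, 2):.2f}")
```

The last line is the point of the method: once `P` and `Q` are learned from the observed cells, any unobserved user-item pair can be scored with the same dot product.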

<a href="https://colab.research.google.com/github/fuyu-quant/data-science-wiki/blob/main/recommendation/collaborative_filtering(linear)/matrix_factorization.ipynb" target="_blank" rel="noopener noreferrer"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
%%capture
!pip install cornac==1.17.0

In [3]:
import cornac
from cornac.data import Reader
from cornac.datasets import netflix
from cornac.eval_methods import RatioSplit

### Preparing the data

In [4]:
feedback = netflix.load_feedback(variant="small", reader=Reader(bin_threshold=1.0))

ratio_split = RatioSplit(
    data=feedback,
    test_size=0.1,
    rating_threshold=1.0,
    exclude_unknowns=True,
    verbose=True,
)

Data from https://static.preferred.ai/cornac/datasets/netflix/data_small.zip
will be cached into /root/.cornac/netflix/data_small.csv


Unzipping ...
File cached!
rating_threshold = 1.0
exclude_unknowns = True
---
Training data:
Number of users = 9985
Number of items = 4924
Number of ratings = 547022
Max rating = 1.0
Min rating = 1.0
Global mean = 1.0
---
Test data:
Number of users = 8215
Number of items = 3366
Number of ratings = 60748
Number of unknown users = 0
Number of unknown items = 0
---
Total users = 9985
Total items = 4924
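The log above reflects a random hold-out: a fraction of the ratings goes to the test set, and with `exclude_unknowns=True` the test set keeps only users and items that also appear in training. The sketch below shows that idea in plain Python. It is an assumption about the general technique, not Cornac's code; `simple_ratio_split` and the toy triples are hypothetical.

```python
import random

def simple_ratio_split(triples, test_size=0.1, exclude_unknowns=True, seed=123):
    """Randomly hold out a fraction of (user, item, rating) triples for testing."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    test, train = shuffled[:n_test], shuffled[n_test:]
    if exclude_unknowns:
        # Drop test triples whose user or item never occurs in the training part.
        known_users = {u for u, _, _ in train}
        known_items = {i for _, i, _ in train}
        test = [(u, i, r) for u, i, r in test if u in known_users and i in known_items]
    return train, test

# Toy feedback (hypothetical): 20 users x 10 items with some pairs missing.
triples = [(u, i, 1.0) for u in range(20) for i in range(10) if (u + i) % 3 != 0]
train, test = simple_ratio_split(triples)
print(f"train: {len(train)} ratings, test: {len(test)} ratings")
```

Filtering unknown users and items matters for matrix factorization because the model has no factor vector for anything it did not see during training.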


### Running Matrix Factorization

In [5]:
mf = cornac.models.MF(k=10, max_iter=25, learning_rate=0.01, lambda_reg=0.02, use_bias=True, seed=123)
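For orientation, the arguments in the `MF(...)` call can be read against the standard biased matrix factorization model. This is the textbook formulation, stated here as an assumption; Cornac's internal objective may differ in details such as scaling or how the biases are regularized.

$$
\hat{r}_{ui} = \mu + b_u + b_i + \mathbf{p}_u^{\top}\mathbf{q}_i, \qquad \mathbf{p}_u,\ \mathbf{q}_i \in \mathbb{R}^{k}
$$

$$
\min_{P,\,Q,\,b}\ \sum_{(u,i)\in\mathcal{K}} \left[ \left(r_{ui} - \hat{r}_{ui}\right)^2 + \lambda\left(\lVert\mathbf{p}_u\rVert^2 + \lVert\mathbf{q}_i\rVert^2 + b_u^2 + b_i^2\right) \right]
$$

Under this reading, `k=10` is the latent dimension $k$, `lambda_reg=0.02` plays the role of $\lambda$, `learning_rate=0.01` is the gradient step size, `max_iter=25` is the number of passes over the observed set $\mathcal{K}$, and `use_bias=True` keeps the $\mu$, $b_u$, and $b_i$ terms.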

In [6]:
# Use AUC and Recall@20 for evaluation
auc = cornac.metrics.AUC()
rec_20 = cornac.metrics.Recall(k=20)
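As a rough guide to what these two metrics measure, the sketch below computes them for a single user in plain Python. It shows the usual definitions, not Cornac's implementation (which also handles details such as excluding training items); `recall_at_k`, `auc_for_user`, and the toy scores are hypothetical.

```python
def recall_at_k(ranked_items, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)

def auc_for_user(scores, relevant):
    """Probability that a relevant item outscores a non-relevant one (ties count 0.5)."""
    pos = [s for item, s in scores.items() if item in relevant]
    neg = [s for item, s in scores.items() if item not in relevant]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical): model scores for four items, two of them relevant.
scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
relevant = {"a", "c"}
ranked = sorted(scores, key=scores.get, reverse=True)

rec = recall_at_k(ranked, relevant, k=2)
auc_value = auc_for_user(scores, relevant)
print(rec)        # 0.5: only "a" of the two relevant items is in the top 2
print(auc_value)  # 0.75: 3 of the 4 (relevant, non-relevant) pairs are ordered correctly
```

An AUC of 0.5 corresponds to a ranking that is no better than random, which is a useful reference point when reading the results table.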

### Training and prediction

In [7]:
# Keep a handle on the Experiment object, then run it; run() is called for its side effects
exp = cornac.Experiment(
    eval_method=ratio_split,
    models=[mf],
    metrics=[auc, rec_20],
    user_based=True,
)
exp.run()


[MF] Training started!

[MF] Evaluation started!



TEST:
...
   |    AUC | Recall@20 | Train (s) | Test (s)
-- + ------ + --------- + --------- + --------
MF | 0.4989 |    0.0002 |    0.6990 |  65.3450
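An AUC of about 0.5 is what a random ranking would score. One possible reading, offered as an assumption rather than a diagnosis: the training log reports `Max rating = 1.0` and `Min rating = 1.0`, so every training target equals the global mean and a squared-error model has almost nothing to fit. The toy below illustrates that effect with a hand-written biased MF loop. The loop, the Gaussian initialisation, and the toy sizes are hypothetical; only the `k`, learning rate, regularization, and epoch values are taken from the `MF(...)` call.

```python
import random

rng = random.Random(123)
n_users, n_items, k = 30, 20, 10
lr, reg, epochs = 0.01, 0.02, 25  # values from the MF(...) call; the loop is a toy

# Every observed rating is 1.0, mimicking the binarized training set.
observed = [(u, i, 1.0) for u in range(n_users) for i in range(n_items) if rng.random() < 0.3]
mu = sum(r for _, _, r in observed) / len(observed)  # global mean, 1.0 here

# Assumed initialisation: small Gaussian factors and zero biases.
P = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(n_users)]
Q = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(n_items)]
bu, bi = [0.0] * n_users, [0.0] * n_items

def score(u, i):
    """Biased MF prediction: global mean + user bias + item bias + dot product."""
    return mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i]))

for _ in range(epochs):
    for u, i, r in observed:
        err = r - score(u, i)  # close to zero from the very first step
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        for f in range(k):
            pf, qf = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qf - reg * pf)
            Q[i][f] += lr * (err * pf - reg * qf)

all_scores = [score(u, i) for u in range(n_users) for i in range(n_items)]
spread = max(all_scores) - min(all_scores)
print(f"score spread across all user-item pairs: {spread:.6f}")
```

In this toy, the scores for all pairs stay within a tiny band around 1.0, so any ranking built from them is essentially noise. If the same thing is happening in the experiment above, loading the ratings without binarizing them would be one thing to try.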

