# Non-negative Matrix Factorization (NMF)

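Non-negative matrix factorization (NMF) approximates a non-negative matrix as the product of two lower-rank non-negative matrices, which exposes latent features in the data. Besides recommendation, it is also used for data compression and denoising. This notebook uses the NMF implementation in the Surprise library.
- GitHub: https://github.com/NicolasHug/Surprise/tree/master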
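
To make the idea of the factorization concrete, here is a minimal sketch of NMF on a small dense matrix using the classic multiplicative update rules for the squared error. It is not Surprise's implementation (Surprise fits only the observed ratings of a sparse user-item matrix); the function name `nmf_multiplicative`, the toy matrix `V`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (m x n) as W @ H with W, H >= 0.

    Illustrative sketch: multiplicative updates for the Frobenius loss.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Start from strictly positive random factors.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so the
        # factors can never become negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical 4 users x 3 items rating matrix (every entry observed).
V = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 3.0, 1.0],
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 4.0],
])
W, H = nmf_multiplicative(V, rank=2)
error = np.linalg.norm(V - W @ H)  # Frobenius norm of the residual
print(np.round(W @ H, 2), error)
```

Each row of `W` can be read as a user's weights on 2 latent features and each column of `H` as an item's weights on the same features, which is the sense in which NMF "discovers" latent structure.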

<a href="https://colab.research.google.com/github/fuyu-quant/data-science-wiki/blob/main/recommendation/collaborative_filtering(linear)/nmf.ipynb" target="_blank" rel="noopener noreferrer"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%%capture
!pip install scikit-surprise

In [29]:
from surprise import NMF
from surprise import Dataset
from surprise import Reader
from surprise import accuracy
from surprise.model_selection import train_test_split

import pandas as pd

### Preparing the data

In [30]:
# The column names can be anything
ratings_dict = {
    "itemID": [1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 1],
    "userID": [9, 32, 2, 45, "user_foo", 10, 20, 32, 4, 5, 6, 2],
    "rating": [3, 2, 4, 3, 1, 2, 2, 1, 4, 4, 3, 2]
}
df = pd.DataFrame(ratings_dict)

# A Reader is needed to load a DataFrame; set rating_scale to the range of the ratings
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(df[["userID", "itemID", "rating"]], reader)
trainset, testset = train_test_split(data, test_size=0.25)
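
The split is random, so with only 12 ratings and `test_size=0.25` about 3 ratings end up in the test set and the results change from run to run. Surprise's `train_test_split` accepts a `random_state` argument to make the split repeatable. The sketch below shows the general idea of a seeded shuffle split in plain Python; `shuffle_split` is a hypothetical helper written for illustration, not Surprise's code.

```python
import math
import random


def shuffle_split(rows, test_size=0.25, seed=None):
    """Shuffle rows and split them into (train, test) lists."""
    rng = random.Random(seed)  # a fixed seed gives a repeatable shuffle
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = math.ceil(test_size * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(12))  # stand-ins for the 12 ratings
train, test = shuffle_split(rows, test_size=0.25, seed=0)
train_again, test_again = shuffle_split(rows, test_size=0.25, seed=0)
print(len(train), len(test))
```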

### Running NMF

In [31]:
algo = NMF()
algo.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.NMF at 0x7b8c56f97a00>

In [32]:
predictions = algo.test(testset)

accuracy.rmse(predictions)

RMSE: 1.2949


1.2948821908633337
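
RMSE is the square root of the mean squared difference between the true and the estimated ratings, so lower is better. A minimal hand-rolled version is sketched below; the `rmse` helper and the example pairs are assumptions for illustration and are not the values from the run above. With a test set this small, the score depends heavily on which ratings happen to be held out.

```python
import math


def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical (true, estimated) ratings.
example_pairs = [(4.0, 3.0), (2.0, 2.0), (1.0, 3.0)]
example_rmse = rmse(example_pairs)  # sqrt((1 + 0 + 4) / 3)
print(f"RMSE: {example_rmse:.4f}")
```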

### Predicting with NMF

In [33]:
uid = 20
iid = 2

pred = algo.predict(uid, iid, r_ui=4, verbose=True)

user: 20         item: 2          r_ui = 4.00   est = 2.03   {'was_impossible': False}
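
In the output, `est` is the predicted rating and `r_ui` is the true rating passed in for comparison. For the default (non-biased) NMF, the estimate is the dot product of the user's and the item's factor vectors, clipped to the rating scale. When the user or item was not in the training set, `was_impossible` is `True` and the global mean rating is used instead. The sketch below mimics that logic with hypothetical factor vectors; `estimate_rating` and its arguments are illustrative names, not part of Surprise.

```python
def estimate_rating(p_u, q_i, global_mean, rating_scale=(1, 5)):
    """Sketch of a non-biased NMF prediction with fallback and clipping.

    p_u, q_i: factor vectors as lists of floats, or None when the
    user or item was not seen during training.
    """
    low, high = rating_scale
    if p_u is None or q_i is None:
        # Unknown user or item: fall back to the global mean rating.
        est, was_impossible = global_mean, True
    else:
        est = sum(p * q for p, q in zip(p_u, q_i))
        was_impossible = False
    # Keep the estimate inside the rating scale.
    est = min(high, max(low, est))
    return est, was_impossible


known = estimate_rating([0.5, 1.0], [2.0, 1.5], global_mean=2.6)
unknown = estimate_rating(None, [2.0, 1.5], global_mean=2.6)
clipped = estimate_rating([2.0, 2.0], [2.0, 2.0], global_mean=2.6)
print(known, unknown, clipped)
```

Because the train/test split is random, whether user `20` is known to the model can differ between runs, so it is worth checking `was_impossible` before trusting `est`.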
