# Implicit Matrix Factorization
We try Implicit Matrix Factorization (IMF) on the MovieLens dataset.

## Setup
Add the parent folder to the module search path and import the modules.

In [1]:
# Add the parent folder to the module search path
import sys; sys.path.insert(0, '..')

from util.data_loader import DataLoader
from util.metric_calculator import MetricCalculator

Load the data.

In [2]:
# Load the MovieLens data
data_loader = DataLoader(num_users = 1000, num_test_items = 5, data_path = '../data/ml-10M100K/')
movielens = data_loader.load()

Load the `IMFRecommender` class from the textbook's code.

In [3]:
# Recommendation with Implicit Matrix Factorization
%load_ext autoreload
%autoreload 2

from src.imf import IMFRecommender

import os
os.environ["MKL_NUM_THREADS"] = "1"

recommender = IMFRecommender()
recommend_result = recommender.recommend(movielens)

As a first trial, check how well the recommender predicts.

In [4]:
# Evaluation
metric_calculator = MetricCalculator()
metrics = metric_calculator.calc(
    movielens.test.rating.tolist(), recommend_result.rating.tolist(),
    movielens.test_user2items, recommend_result.user2items, k=20)
print(metrics)

rmse=0.000000, Precision@K=0.013492, Recall@K=0.083817


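# Hyperparameter Tuning with Gaussian Process Regression
The IMF implementation has several hyperparameters, such as the weight $\alpha$ that converts view counts into confidence and the regularization parameter $\lambda$.
We tune these hyperparameters with Gaussian process regression.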
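
As a rough illustration of what $\alpha$ controls, the sketch below assumes the common implicit-feedback formulation in which confidence grows linearly with the interaction count, $c = 1 + \alpha r$. Whether `IMFRecommender` uses exactly this form is an assumption, and the helper name `confidence` is hypothetical.

```python
import numpy as np

def confidence(counts, alpha):
    """Assumed linear confidence c = 1 + alpha * r (hypothetical helper)."""
    return 1.0 + alpha * np.asarray(counts, dtype=float)

counts = np.array([0, 1, 5])           # made-up interaction counts
print(confidence(counts, alpha=0.5))   # a larger alpha puts more trust in repeated interactions
```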


We want to find the hyperparameters ${\bf \theta}_*$ that maximize a score $y$.
Suppose the scores ${\bf y} = (y_1, \cdots, y_n)^t$ have already been obtained for $n$ hyperparameter settings $\Theta = ({\bf \theta}_1, \cdots, {\bf \theta}_n)^t$.
Gaussian process regression then gives, for a new setting ${\bf \theta}_{n+1}$, the predicted score $\hat{y}({\bf \theta}_{n+1})$ and its standard deviation $\sigma({\bf \theta}_{n+1})$.
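
To make the prediction step concrete, here is a minimal NumPy sketch of the posterior mean and standard deviation of a Gaussian process. It assumes a zero-mean prior and an RBF kernel with a fixed length scale; the notebook itself relies on scikit-learn with a composite kernel, so the helpers `rbf` and `gp_predict` and the toy data are purely illustrative.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_predict(theta_obs, y_obs, theta_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP with an RBF kernel."""
    K = rbf(theta_obs, theta_obs) + noise * np.eye(len(theta_obs))
    K_star = rbf(theta_new, theta_obs)
    mean = K_star @ np.linalg.solve(K, y_obs)
    # Prior variance is 1 for the RBF kernel; subtract the part explained by the data.
    var = 1.0 - np.einsum("ij,ji->i", K_star, np.linalg.solve(K, K_star.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

theta_obs = np.array([[0.0], [1.0], [2.0]])   # made-up 1-D hyperparameter values
y_obs = np.array([0.1, 0.4, 0.3])             # made-up scores
theta_new = np.array([[1.0], [5.0]])          # one observed point, one far from the data
y_hat, sigma = gp_predict(theta_obs, y_obs, theta_new)
print(y_hat, sigma)
```

The standard deviation is close to zero at the already observed point and close to the prior value far from the data, which is what the acquisition function below exploits.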


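Using UCB (Upper Confidence Bound) as the acquisition function, we want to choose ${\bf \theta}_{n+1}$ so as to maximize the following quantity (here $\alpha(\cdot)$ denotes the acquisition function, not the IMF confidence weight):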
$$
\alpha({\bf \theta}_{n+1}) = 
\hat{y}({\bf \theta}_{n+1}) + \sqrt{\dfrac{\log (n+1)}{n + 1} } \sigma({\bf \theta}_{n+1})
$$


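The first term rewards regions where the predicted score is high, and the second rewards regions where the standard deviation is large (unexplored regions). The weight on the second term, $\sqrt{\log(n+1)/(n+1)}$, shrinks as $n$ grows (for $n \ge 2$), so exploration matters most early in the search, when few settings have been evaluated.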
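
The decay of the exploration weight can be checked numerically. The sketch below simply evaluates the coefficient from the formula above for a few values of $n$; the function name `exploration_weight` is hypothetical.

```python
import math

def exploration_weight(n):
    """Coefficient sqrt(log(n + 1) / (n + 1)) that multiplies sigma in the UCB."""
    return math.sqrt(math.log(n + 1) / (n + 1))

for n in (2, 20, 120):
    print(n, round(exploration_weight(n), 3))
```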


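In Gaussian process regression, the ${\bf \theta}_{n+1}$ that maximizes $\alpha({\bf \theta}_{n+1})$ cannot be found analytically.
Numerical differentiation is also cumbersome, so we instead prepare a finite set of hyperparameter candidates and select the one that maximizes $\alpha({\bf \theta}_{n+1})$.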
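
Selecting from a finite candidate set reduces to an `argmax` over the UCB values. The following sketch uses made-up predictions and standard deviations, and the helper `select_next` is hypothetical; it follows the formula above with $n$ equal to the number of settings already evaluated.

```python
import numpy as np

def select_next(pred, std, n_evaluated):
    """Return the position of the candidate with the largest UCB value."""
    n_next = n_evaluated + 1
    ucb = pred + np.sqrt(np.log(n_next) / n_next) * std
    return int(np.argmax(ucb))

pred = np.array([0.050, 0.048, 0.030])   # made-up predicted scores
std = np.array([0.001, 0.010, 0.020])    # made-up standard deviations
print(select_next(pred, std, n_evaluated=20))
```

In this toy case the second candidate wins: its prediction is slightly lower than the first, but its larger uncertainty more than compensates.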

In [34]:
import numpy as np
import pandas as pd
import itertools
import random

factors_list = [5, 10, 20, 30, 50, 100] # number of latent factors for matrix factorization
alpha_list = np.geomspace(0.01, 100.0, 20) # confidence weight
minimum_num_rating_list = [0, 1, 5, 10, 20, 50, 100, 300] # threshold on the number of ratings
regularization_list = np.geomspace(0.01, 100.0, 20) # regularization parameter

# Build every combination of the parameters
df_params = pd.DataFrame(list(itertools.product(factors_list, alpha_list, minimum_num_rating_list, regularization_list)),
                         columns = ["factors", "alpha", "minimum_num_rating", "regularization"])
df_params["Precision@K"] = np.nan # unexplored scores are NaN
df_params["Recall@K"] = np.nan # unexplored scores are NaN
df_params["Score"] = np.nan # unexplored scores are NaN
display(df_params)

index,factors,alpha,minimum_num_rating,regularization,Precision@K,Recall@K,Score
0,5,0.01,0,0.010000,,,
1,5,0.01,0,0.016238,,,
2,5,0.01,0,0.026367,,,
3,5,0.01,0,0.042813,,,
4,5,0.01,0,0.069519,,,
...,...,...,...,...,...,...,...
19195,100,100.00,300,14.384499,,,
19196,100,100.00,300,23.357215,,,
19197,100,100.00,300,37.926902,,,
19198,100,100.00,300,61.584821,,,


Before fitting the Gaussian process regression, evaluate a number of randomly chosen samples.

In [35]:
import re

num_sample_for_pre = 20 # number of samples evaluated before fitting the Gaussian process regression

for idx_sample in range(num_sample_for_pre):
    # Pick an unexplored hyperparameter combination at random
    idx_param = random.choice(df_params[df_params["Score"].isnull()].index.values)
    factors, alpha, minimum_num_rating, regularization = df_params.loc[idx_param].values[0:4]
    factors = int(factors) # the recommender expects integers
    minimum_num_rating = int(minimum_num_rating)

    print(idx_param, factors, alpha, minimum_num_rating, regularization)

    # Recommend with IMF
    recommend_result = recommender.recommend(movielens, factors = factors, regularization = regularization,
                                             minimum_num_rating = minimum_num_rating, alpha=alpha)
    metrics = metric_calculator.calc(
        movielens.test.rating.tolist(), recommend_result.rating.tolist(),
        movielens.test_user2items, recommend_result.user2items, k = 20)

    # Extract the scores by parsing the printed metrics
    # (a quick hack that relies on the order rmse, Precision@K, Recall@K)
    rmse, precision_at_k, recall_at_k = re.findall(r'-?\d+\.?\d*', str(metrics))
    precision_at_k = float(precision_at_k)
    recall_at_k = float(recall_at_k)

    # Save the scores
    df_params.at[idx_param, "Precision@K"] = precision_at_k
    df_params.at[idx_param, "Recall@K"] = recall_at_k
    df_params.at[idx_param, "Score"] = (precision_at_k + recall_at_k) / 2.0 # this is the score to optimize
    

8165 20 2.06913808111479 0 0.11288378916846889
7713 20 0.4832930238571752 1 5.455594781168514
1215 5 0.29763514416313175 20 14.38449888287663


16136 100 0.01 100 23.357214690901213


4516 10 0.4832930238571752 1 23.357214690901213
17007 100 0.18329807108324356 5 0.29763514416313175


577 5 0.04281332398719394 20 37.92690190732246


4573 10 0.4832930238571752 20 5.455594781168514


15293 50 14.38449888287663 20 5.455594781168514


17381 100 0.4832930238571752 50 0.016237767391887217


7107 20 0.06951927961775606 10 0.29763514416313175


18186 100 5.455594781168514 50 0.18329807108324356


7631 20 0.29763514416313175 50 2.06913808111479


6837 20 0.026366508987303583 50 37.92690190732246


12131 30 14.38449888287663 100 2.06913808111479


13256 50 0.026366508987303583 100 23.357214690901213


13887 50 0.18329807108324356 100 0.29763514416313175


813 5 0.11288378916846889 0 5.455594781168514
2654 5 23.357214690901213 20 8.858667904100823


15642 50 37.92690190732246 100 0.026366508987303583
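
The positional regex used in the loop depends on the order of the printed fields and would miss values in scientific notation. As a hedged alternative, the sketch below parses `name=value` pairs by name; it assumes `str(metrics)` has the form of the output printed earlier (`rmse=..., Precision@K=..., Recall@K=...`), and the helper `parse_metrics` is hypothetical.

```python
import re

def parse_metrics(text):
    """Parse 'name=value' pairs from a metrics string into a dict of floats."""
    pattern = r"([\w@]+)=(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
    return {name: float(value) for name, value in re.findall(pattern, text)}

# Example string copied from the evaluation output earlier in this notebook
metrics_text = "rmse=0.000000, Precision@K=0.013492, Recall@K=0.083817"
parsed = parse_metrics(metrics_text)
score = (parsed["Precision@K"] + parsed["Recall@K"]) / 2.0
print(parsed, score)
```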


In [36]:
df_params[~(df_params["Score"].isnull())]

index,factors,alpha,minimum_num_rating,regularization,Precision@K,Recall@K,Score
577,5,0.042813,20,37.926902,0.000684,0.004643,0.002664
813,5,0.112884,0,5.455595,0.011576,0.072569,0.042072
1215,5,0.297635,20,14.384499,0.011357,0.071438,0.041398
2654,5,23.357215,20,8.858668,0.010673,0.065645,0.038159
4516,10,0.483293,1,23.357215,0.010892,0.067688,0.03929
4573,10,0.483293,20,5.455595,0.013711,0.086462,0.050086
6837,20,0.026367,50,37.926902,0.003366,0.022478,0.012922
7107,20,0.069519,10,0.297635,0.011221,0.070142,0.040681
7631,20,0.297635,50,2.069138,0.013218,0.083342,0.04828
7713,20,0.483293,1,5.455595,0.013957,0.088287,0.051122


Prepare the Gaussian process regression (using an off-the-shelf library).

In [37]:
from sklearn.gaussian_process import GaussianProcessRegressor, kernels
kernel = kernels.RBF(1.0, (3e-1, 1e3)) + kernels.ConstantKernel(1.0, (3e-1, 1e3)) + kernels.WhiteKernel()
model_hyper_param = GaussianProcessRegressor(
    kernel = kernel,
    alpha = 1e-5,
    optimizer = "fmin_l_bfgs_b",
    n_restarts_optimizer = 20,
    normalize_y = True)

In [38]:
num_sample_for_post = 100 # number of Gaussian process regression iterations

for idx_sample in range(num_sample_for_post):
    print("{} / {}".format(idx_sample, num_sample_for_post))
    # Use Gaussian process regression to decide which parameters to explore next
    ## Fit the Gaussian process regression
    print("=======Gaussian Process Regression=======")
    df_params_w_score = df_params[~(df_params["Score"].isnull())].copy() # samples whose score is known
    x_train = df_params_w_score[["factors", "alpha", "minimum_num_rating", "regularization"]].values
    # Transform the parameters for the regression:
    # shift minimum_num_rating by 1 (it can be 0), then take logarithms
    x_train[:, 2] = x_train[:, 2] + 1
    x_train = np.log(x_train)
    y_train = df_params_w_score[["Score"]].values
    model_hyper_param.fit(x_train, y_train) # Gaussian process regression
    ## Predict with the Gaussian process regression
    df_params_wo_score = df_params[df_params["Score"].isnull()].copy() # samples whose score is still unknown
    x_test = df_params_wo_score[["factors", "alpha", "minimum_num_rating", "regularization"]].values
    x_test[:, 2] = x_test[:, 2] + 1
    x_test = np.log(x_test)
    pred_test, std_test = model_hyper_param.predict(x_test, return_std = True)
    pred_test = np.ravel(pred_test) # predict may return shape (n, 1) because y_train is 2-D; flatten to a vector
    ## Choose the next hyperparameters
    # Acquisition function: Upper Confidence Bound.
    # Note: len(pred_test) counts the unexplored candidates, so this weight differs from the
    # n + 1 in the formula above (len(x_train) + 1 would match it).
    upper_confidence_bound = pred_test + np.sqrt(np.log(len(pred_test)) / len(pred_test)) * std_test
    df_params_wo_score["UCB"] = upper_confidence_bound
    idx_param = df_params_wo_score.sort_values(by = "UCB", ascending = False).index.values[0] # index of the parameters that maximize the acquisition function

    # Train IMF
    print("=======Implicit Matrix Factorization=======")
    factors, alpha, minimum_num_rating, regularization = df_params.loc[idx_param].values[0:4]
    factors = int(factors)
    minimum_num_rating = int(minimum_num_rating)

    print(idx_param, factors, alpha, minimum_num_rating, regularization)

    recommend_result = recommender.recommend(movielens, factors = factors, regularization = regularization,
                                             minimum_num_rating = minimum_num_rating, alpha=alpha)
    metrics = metric_calculator.calc(
        movielens.test.rating.tolist(), recommend_result.rating.tolist(),
        movielens.test_user2items, recommend_result.user2items, k = 20)
    rmse, precision_at_k, recall_at_k = re.findall(r'-?\d+\.?\d*', str(metrics))
    precision_at_k = float(precision_at_k)
    recall_at_k = float(recall_at_k)

    # Save the hyperparameters together with their scores
    df_params.at[idx_param, "Precision@K"] = precision_at_k
    df_params.at[idx_param, "Recall@K"] = recall_at_k
    df_params.at[idx_param, "Score"] = (precision_at_k + recall_at_k) / 2.0
    print(precision_at_k, recall_at_k)

0 / 100
1649 5 1.2742749857031335 5 0.7847599703514611


0.011303 0.071702
1 / 100
8072 20 1.2742749857031335 10 3.359818286283781


0.014012 0.086991
2 / 100
8092 20 1.2742749857031335 20 3.359818286283781


0.014176 0.088415
3 / 100
8093 20 1.2742749857031335 20 5.455594781168514


0.014149 0.088095
4 / 100
11292 30 1.2742749857031335 20 3.359818286283781


0.014641 0.089108
5 / 100
10912 30 0.4832930238571752 1 3.359818286283781
0.013766 0.08503
6 / 100
11312 30 1.2742749857031335 50 3.359818286283781


0.014012 0.086471
7 / 100
11132 30 0.7847599703514611 20 3.359818286283781


0.013957 0.086526
8 / 100
14811 50 3.359818286283781 20 2.06913808111479


0.013054 0.079602
9 / 100
4753 10 0.7847599703514611 50 5.455594781168514


0.013328 0.083835
10 / 100
7913 20 0.7847599703514611 10 5.455594781168514


0.014368 0.089792
11 / 100
7893 20 0.7847599703514611 5 5.455594781168514


0.013875 0.087384
12 / 100
7932 20 0.7847599703514611 20 3.359818286283781


0.014204 0.089318
13 / 100
7933 20 0.7847599703514611 20 5.455594781168514


0.014122 0.08919
14 / 100
7912 20 0.7847599703514611 10 3.359818286283781


0.01445 0.090057
15 / 100
7752 20 0.4832930238571752 10 3.359818286283781


0.014149 0.088679
16 / 100
4532 10 0.4832930238571752 5 3.359818286283781


0.013355 0.083324
17 / 100
7931 20 0.7847599703514611 20 2.06913808111479


0.014012 0.088214
18 / 100
11113 30 0.7847599703514611 10 5.455594781168514


0.014258 0.087174
19 / 100
7772 20 0.4832930238571752 20 3.359818286283781


0.013793 0.086946
20 / 100
7533 20 0.29763514416313175 0 5.455594781168514
0.014122 0.089026
21 / 100
10733 30 0.29763514416313175 0 5.455594781168514
0.01393 0.085915
22 / 100
7693 20 0.4832930238571752 0 5.455594781168514
0.013957 0.088287
23 / 100
7532 20 0.29763514416313175 0 3.359818286283781
0.014231 0.090139
24 / 100
7692 20 0.4832930238571752 0 3.359818286283781
0.013903 0.08815
25 / 100
7372 20 0.18329807108324356 0 3.359818286283781
0.014149 0.089372
26 / 100
7531 20 0.29763514416313175 0 2.06913808111479
0.014039 0.088944
27 / 100
7371 20 0.18329807108324356 0 2.06913808111479
0.013684 0.08482
28 / 100
4332 10 0.29763514416313175 0 3.359818286283781
0.013246 0.081281
29 / 100
7373 20 0.18329807108324356 0 5.455594781168514
0.013848 0.086672
30 / 100
10732 30 0.29763514416313175 0 3.359818286283781
0.012781 0.079493
31 / 100
8073 20 1.2742749857031335 10 5.455594781168514


0.014012 0.086836
32 / 100
7753 20 0.4832930238571752 10 5.455594781168514


0.014176 0.089217
33 / 100
7733 20 0.4832930238571752 5 5.455594781168514


0.014231 0.089199
34 / 100
7732 20 0.4832930238571752 5 3.359818286283781


0.014012 0.087776
35 / 100
7914 20 0.7847599703514611 10 8.858667904100823


0.014231 0.089628
36 / 100
7934 20 0.7847599703514611 20 8.858667904100823


0.014149 0.088944
37 / 100
8094 20 1.2742749857031335 20 8.858667904100823


0.014067 0.087621
38 / 100
7773 20 0.4832930238571752 20 5.455594781168514


0.014395 0.091033
39 / 100
11133 30 0.7847599703514611 20 5.455594781168514


0.014231 0.087411
40 / 100
7754 20 0.4832930238571752 10 8.858667904100823


0.014258 0.089819
41 / 100
7774 20 0.4832930238571752 20 8.858667904100823


0.01434 0.089938
42 / 100
7734 20 0.4832930238571752 5 8.858667904100823


0.014122 0.089044
43 / 100
7553 20 0.29763514416313175 1 5.455594781168514
0.014122 0.089026
44 / 100
7552 20 0.29763514416313175 1 3.359818286283781
0.014231 0.090139
45 / 100
7712 20 0.4832930238571752 1 3.359818286283781
0.013903 0.08815
46 / 100
7573 20 0.29763514416313175 5 5.455594781168514


0.013957 0.08794
47 / 100
7392 20 0.18329807108324356 1 3.359818286283781
0.014149 0.089372
48 / 100
7393 20 0.18329807108324356 1 5.455594781168514
0.013848 0.086672
49 / 100
7551 20 0.29763514416313175 1 2.06913808111479
0.014039 0.088944
50 / 100
7711 20 0.4832930238571752 1 2.06913808111479
0.014012 0.088807
51 / 100
7868 20 0.7847599703514611 1 0.4832930238571752
0.014258 0.089135
52 / 100
7867 20 0.7847599703514611 1 0.29763514416313175
0.014231 0.089026
53 / 100
8027 20 1.2742749857031335 1 0.29763514416313175
0.013848 0.085532
54 / 100
7708 20 0.4832930238571752 1 0.4832930238571752
0.014039 0.088068
55 / 100
7847 20 0.7847599703514611 0 0.29763514416313175
0.014231 0.089026
56 / 100
7846 20 0.7847599703514611 0 0.18329807108324356
0.014313 0.0895
57 / 100
7709 20 0.4832930238571752 1 0.7847599703514611
0.014122 0.088807
58 / 100
7710 20 0.4832930238571752 1 1.2742749857031335
0.013903 0.087219
59 / 100
7707 20 0.4832930238571752 1 0.29763514416313175
0.01393 0.087292
60 / 100


0.014587 0.09044
74 / 100
10955 30 0.4832930238571752 10 14.38449888287663


0.012917 0.080615
75 / 100
10953 30 0.4832930238571752 10 5.455594781168514


0.013684 0.084592
76 / 100
7686 20 0.4832930238571752 0 0.18329807108324356
0.013903 0.087156
77 / 100
7869 20 0.7847599703514611 1 0.7847599703514611
0.014286 0.089299
78 / 100
7849 20 0.7847599703514611 0 0.7847599703514611
0.014286 0.089299
79 / 100
7870 20 0.7847599703514611 1 1.2742749857031335
0.014286 0.089236
80 / 100
7850 20 0.7847599703514611 0 1.2742749857031335
0.014286 0.089236
81 / 100
7691 20 0.4832930238571752 0 2.06913808111479
0.014012 0.088807
82 / 100
7690 20 0.4832930238571752 0 1.2742749857031335
0.013903 0.087219
83 / 100
7871 20 0.7847599703514611 1 2.06913808111479
0.014423 0.090257
84 / 100
7851 20 0.7847599703514611 0 2.06913808111479
0.014423 0.090257
85 / 100
8011 20 1.2742749857031335 0 2.06913808111479
0.01382 0.085057
86 / 100
7872 20 0.7847599703514611 1 3.359818286283781
0.014395 0.090285
87 / 100
7892 20 0.7847599703514611 5 3.359818286283781


0.013957 0.087867
88 / 100
7852 20 0.7847599703514611 0 3.359818286283781
0.014395 0.090285
89 / 100
7873 20 0.7847599703514611 1 5.455594781168514
0.014258 0.08909
90 / 100
7853 20 0.7847599703514611 0 5.455594781168514
0.014258 0.08909
91 / 100
7689 20 0.4832930238571752 0 0.7847599703514611
0.014122 0.088807
92 / 100
7688 20 0.4832930238571752 0 0.4832930238571752
0.014039 0.088068
93 / 100
7593 20 0.29763514416313175 10 5.455594781168514


0.014039 0.08805
94 / 100
8091 20 1.2742749857031335 20 2.06913808111479


0.013957 0.087128
95 / 100
8032 20 1.2742749857031335 1 3.359818286283781
0.014039 0.085997
96 / 100
7863 20 0.7847599703514611 1 0.04281332398719394
0.01434 0.089683
97 / 100
7862 20 0.7847599703514611 1 0.026366508987303583
0.01434 0.089591
98 / 100
7702 20 0.4832930238571752 1 0.026366508987303583
0.01393 0.087384
99 / 100
11063 30 0.7847599703514611 1 0.04281332398719394
0.013875 0.084638
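
The search loop passes log-transformed hyperparameters to the regressor, which suits grids that are spaced geometrically. The sketch below isolates that transformation; the helper `to_log_features` is hypothetical and mirrors the `+ 1` shift applied to `minimum_num_rating`, whose grid includes 0.

```python
import numpy as np

def to_log_features(params):
    """Map rows of [factors, alpha, minimum_num_rating, regularization] to log space."""
    x = np.array(params, dtype=float)   # copy, so the caller's data is left untouched
    x[:, 2] += 1.0                      # minimum_num_rating can be 0, and log(0) is -inf
    return np.log(x)

features = to_log_features([[5, 0.01, 0, 0.01], [100, 100.0, 300, 100.0]])
print(features)
```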


Sorted by score in descending order, the best hyperparameter settings found are as follows.

In [39]:
df_params.sort_values("Score", ascending = False).head(50)

index,factors,alpha,minimum_num_rating,regularization,Precision@K,Recall@K,Score
7773,20,0.483293,20,5.455595,0.014395,0.091033,0.052714
10954,30,0.483293,10,8.858668,0.014587,0.09044,0.052514
7852,20,0.78476,0,3.359818,0.014395,0.090285,0.05234
7851,20,0.78476,0,2.069138,0.014423,0.090257,0.05234
7872,20,0.78476,1,3.359818,0.014395,0.090285,0.05234
7871,20,0.78476,1,2.069138,0.014423,0.090257,0.05234
7912,20,0.78476,10,3.359818,0.01445,0.090057,0.052254
7552,20,0.297635,1,3.359818,0.014231,0.090139,0.052185
7532,20,0.297635,0,3.359818,0.014231,0.090139,0.052185
7774,20,0.483293,20,8.858668,0.01434,0.089938,0.052139
