# RePlay recommender models comparison

### Dataset
We will compare RePlay models on __MovieLens 1m__. 

### Dataset preprocessing: 
Ratings greater than or equal to 3 are considered as positive interactions.

### Data split
The dataset is split by date so that the last 20% of interactions are placed in the test part. Cold items and users are dropped.
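The split rule can be illustrated with a minimal pure-Python sketch. It is not the `DateSplitter` implementation: the function name `date_split` and the tuple layout `(user, item, timestamp)` are assumptions made for illustration only.

```python
def date_split(log, test_share=0.2):
    """Put the latest `test_share` of interactions into test, then drop
    test rows whose user or item never appears in train (cold ones)."""
    ordered = sorted(log, key=lambda row: row[2])  # oldest first
    n_test = round(len(ordered) * test_share)
    cut = len(ordered) - n_test
    train, test = ordered[:cut], ordered[cut:]
    known_users = {user for user, _, _ in train}
    known_items = {item for _, item, _ in train}
    test = [
        (user, item, ts)
        for user, item, ts in test
        if user in known_users and item in known_items
    ]
    return train, test


# Hypothetical toy log: user "u4" appears only in the last 20% of events.
toy_log = [
    ("u1", "a", 1), ("u2", "a", 2), ("u1", "b", 3), ("u2", "b", 4),
    ("u1", "c", 5), ("u2", "c", 6), ("u3", "a", 7), ("u1", "d", 8),
    ("u4", "a", 9), ("u1", "b", 10),
]
toy_train, toy_test = date_split(toy_log)
print(len(toy_train), toy_test)
```

Dropping cold rows is why the train and test sizes reported later in the notebook sum to less than the number of positive interactions.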

### Prediction
We will predict the top-10 most relevant films for each user.

### Metrics
Quality metrics used: __ndcg@k, hitrate@k, map@k, mrr@k__. HitRate is computed for k = 1, 5, 10; the other metrics are computed for k = 10.
Additional metrics used: __coverage@k__ and __surprisal@k__.
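For reference, the ranking metrics can be sketched per user with binary relevance as follows. These are textbook definitions written for illustration, not RePlay's code; in particular, normalizing average precision by `min(k, number of relevant items)` is an assumption, and libraries differ on that detail. The reported metric is the mean of the per-user values.

```python
import math


def hit_rate(recs, relevant, k):
    """1.0 if at least one of the top-k items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in recs[:k]) else 0.0


def mrr(recs, relevant, k):
    """Reciprocal rank of the first relevant item within the top-k."""
    for rank, item in enumerate(recs[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg(recs, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recs[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / ideal if ideal else 0.0


def average_precision(recs, relevant, k):
    """Sum of precision@rank at each hit, normalized by min(k, |relevant|)."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recs[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(k, len(relevant))
    return total / denom if denom else 0.0


# Hypothetical example: one user's ranked recommendations and test items.
example_recs = ["a", "b", "c", "d"]
example_relevant = {"b", "d"}
print(hit_rate(example_recs, example_relevant, 3))
print(mrr(example_recs, example_relevant, 3))
print(ndcg(example_recs, example_relevant, 3))
print(average_precision(example_recs, example_relevant, 3))
```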

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
%config Completer.use_jedi = False

In [3]:
import logging
import pandas as pd
import time


from pyspark.sql import functions as sf, types as st

from replay.data_preparator import DataPreparator
from replay.experiment import Experiment
from replay.metrics import Coverage, HitRate, MRR, MAP, NDCG, Surprisal
from replay.models import (
    ALSWrap, 
    ADMMSLIM, 
    ClassifierRec, 
    KNN, 
    LightFMWrap, 
    MultVAE, 
    NeuroMF, 
    SLIM, 
    Stack,
    PopRec, 
    RandomRec, 
    Wilson, 
    Word2VecRec
)

from replay.models.base_rec import HybridRecommender
from replay.session_handler import State
from replay.splitters import DateSplitter
from replay.utils import get_log_info

In [4]:
logger = logging.getLogger("replay")

In [5]:
spark = State().session
spark

In [6]:
from logging import INFO
State().logger.setLevel(INFO)

In [7]:
K = 10
K_list_metrics = [1, 5, 10]
BUDGET = 20
SEED = 12345

## 0. Preprocessing <a name='data-preparator'></a>

### 0.1 Data loading

In [8]:
from rs_datasets import MovieLens

data = MovieLens("1m")
data.info()

ratings


Unnamed: 0,user_id,item_id,rating,timestamp
0,1,1193,5,978300760
1,1,661,3,978302109
2,1,914,3,978301968



users


Unnamed: 0,user_id,gender,age,occupation,zip_code
0,1,F,1,10,48067
1,2,M,56,16,70072
2,3,M,25,15,55117



items


Unnamed: 0,item_id,title,genres
0,1,Toy Story (1995),Animation|Children's|Comedy
1,2,Jumanji (1995),Adventure|Children's|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance





#### log preprocessing

In [9]:
# converting log of interactions to spark-dataframe format
log = DataPreparator().transform(
    data=data.ratings,
    columns_names={
        "user_id": "user_id",
        "item_id": "item_id",
        "relevance": "rating",
        "timestamp": "timestamp"
    }
)
print(get_log_info(log))

total lines: 1000209, total users: 6040, total items: 3706


In [10]:
# ratings >= 3 are treated as positive feedback; positive feedback gets relevance = 1
only_positives_log = log.filter(sf.col('relevance') >= 3).withColumn('relevance', sf.lit(1))
only_positives_log.count()

836478

In [11]:
user_features=None
item_features=None

### 0.2. Data split

In [12]:
# train/test split 
train_spl = DateSplitter(
    test_start=0.2,
    drop_cold_items=True,
    drop_cold_users=True,
)
train, test = train_spl.split(only_positives_log)
print('train info:\n', get_log_info(train))
print('test info:\n', get_log_info(test))

train info:
 total lines: 669181, total users: 5397, total items: 3569
test info:
 total lines: 86542, total users: 1139, total items: 3279


In [13]:
# train/test split for hyperparams selection
opt_train, opt_val = train_spl.split(train)
opt_train.count(), opt_val.count()

(535343, 24241)

In [14]:
# negative feedback will be used for Classifier and Wilson models
only_negatives_log = log.filter(sf.col('relevance') < 3).withColumn('relevance', sf.lit(0.))
test_start = test.agg(sf.min('timestamp')).collect()[0][0]

# train with both positive and negative feedback
pos_neg_train=(train
              .withColumn('relevance', sf.lit(1))
              .union(only_negatives_log.filter(sf.col('timestamp') < test_start))
             )
pos_neg_train.count()

798993
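The Wilson model needs negative feedback because it scores an item by how reliably positive its feedback is. The sketch below assumes the score is the lower bound of the Wilson score interval for the share of positive interactions; the function name `wilson_lower_bound` and the default `z = 1.96` (a 95% interval) are illustrative choices, not taken from the library.

```python
import math


def wilson_lower_bound(positives, total, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion."""
    if total == 0:
        return 0.0
    p = positives / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / (1 + z * z / total)


# An item with 80 positives out of 100 outranks an item with 1 out of 1,
# because the bound penalizes small sample sizes.
print(wilson_lower_bound(80, 100), wilson_lower_bound(1, 1))
```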

In [15]:
train.show(2)

+-------+-------+---------+-------------------+
|user_id|item_id|relevance|          timestamp|
+-------+-------+---------+-------------------+
|    637|   3930|        1|2000-12-02 05:30:12|
|    637|   3932|        1|2000-12-02 05:53:52|
+-------+-------+---------+-------------------+
only showing top 2 rows



# 1. Metrics definition

In [16]:
# experiment is used for metrics calculation
e = Experiment(test, {MAP(): K, NDCG(): K, HitRate(): K_list_metrics, Coverage(train): K, Surprisal(train): K, MRR(): K})
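
`Coverage` and `Surprisal` receive the train log because both depend on the item catalogue and item popularity. The sketch below is illustrative only. It assumes coverage is the share of train items that appear in at least one top-k list, and that surprisal averages the normalized self-information `-log2(users_with_item / n_users) / log2(n_users)` over each user's top-k; the exact formulas in the library may differ.

```python
import math


def coverage(recs_by_user, train_items, k):
    """Share of train items recommended to at least one user in the top-k."""
    recommended = {item for recs in recs_by_user.values() for item in recs[:k]}
    return len(recommended & train_items) / len(train_items)


def surprisal(recs_by_user, users_per_item, n_users, k):
    """Mean normalized self-information of recommended items (assumed form)."""
    def weight(item):
        return -math.log2(users_per_item[item] / n_users) / math.log2(n_users)

    per_user = [
        sum(weight(item) for item in recs[:k]) / k
        for recs in recs_by_user.values()
    ]
    return sum(per_user) / len(per_user)


# Hypothetical toy data: 4 train users, item "a" was seen by all of them.
toy_items = {"a", "b", "c", "d"}
toy_users_per_item = {"a": 4, "b": 2, "c": 1, "d": 1}
toy_recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
print(coverage(toy_recs, toy_items, k=2))
print(surprisal(toy_recs, toy_users_per_item, n_users=4, k=2))
```

Under these definitions a popularity recommender has low coverage and low surprisal, while a uniform random recommender has high values of both, which matches the pattern in the results below.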

# 2. Models training

## 2.1. Non-personalized models

In [17]:
non_personalized_models = {'Popular Recommender': [PopRec(), 'no_opt'], 
          'Random Recommender (uniform)': [RandomRec(seed=SEED, distribution='uniform'), 'no_opt'], 
          'Random Recommender (popularity-based)': [RandomRec(seed=SEED, distribution='popular_based'), {"alpha": [-0.5, 100]}],
          'Wilson Recommender': [Wilson(), 'no_opt']}
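
The `alpha` parameter tuned for the popularity-based random recommender can be read as a smoothing term. The sketch below assumes each item is drawn without replacement with probability proportional to `count + alpha`, where `count` is the number of users who interacted with the item; this weighting and the function name `sample_popular_based` are assumptions for illustration. Sampling uses the Efraimidis-Spirakis key trick (`u ** (1 / weight)`), which is a substitute technique chosen to keep the example dependency-free.

```python
import random


def sample_popular_based(item_counts, k, alpha=0.0, seed=None):
    """Draw k distinct items with probability proportional to count + alpha."""
    rng = random.Random(seed)
    keyed = []
    for item, count in item_counts.items():
        weight = count + alpha
        if weight <= 0:
            raise ValueError("count + alpha must be positive")
        # Larger weights push the key closer to 1, so the item ranks higher.
        keyed.append((rng.random() ** (1.0 / weight), item))
    keyed.sort(reverse=True)
    return [item for _, item in keyed[:k]]


# Hypothetical popularity counts.
toy_counts = {"hit": 1000, "niche_1": 1, "niche_2": 1}
print(sample_popular_based(toy_counts, k=2, alpha=0.0, seed=12345))
```

A large `alpha` flattens the weights toward a uniform draw, while a small or slightly negative `alpha` keeps the draw close to raw popularity.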

In [18]:
def fit_predict_add_res(name, model, experiment, train, suffix=''):
    """
    Run fit_predict for the `model`, measure time on fit_predict and evaluate metrics
    """
    start_time=time.time()
    
    fit_predict_params = {'log': train, 'k': K, 'users': test.select('user_id').distinct()}
    if isinstance(model, (Wilson, ClassifierRec)):
        fit_predict_params['log'] = pos_neg_train

    if isinstance(model, HybridRecommender):
        fit_predict_params['item_features'] = item_features
        fit_predict_params['user_features'] = user_features
    
    pred=model.fit_predict(**fit_predict_params)
    pred.count()
    fit_predict_time = time.time() - start_time
    
    experiment.add_result(name + suffix, pred)
    experiment.results.loc[name + suffix, 'fit_pred_time'] = fit_predict_time
    
    print(experiment.results[['NDCG@{}'.format(K), 'MRR@{}'.format(K), 'Coverage@{}'.format(K), 'fit_pred_time']].sort_values('NDCG@{}'.format(K), ascending=False))

In [19]:
def full_pipeline(models, experiment, train, suffix='', budget=BUDGET):
    """
    For each model:
        -  if required: run hyperparameters search, set best params and save param values to `experiment`
        - pass model to `fit_predict_add_res`        
    """
    
    for name, [model, params] in models.items():
        model.logger.info(msg='{} started'.format(name))
        if params != 'no_opt':
            model.logger.info(msg='{} optimization started'.format(name))
            best_params = model.optimize(opt_train, 
                                         opt_val, 
                                         param_grid=params, 
                                         item_features=item_features,
                                         user_features=user_features,
                                         k=K, 
                                         budget=budget)
            model.set_params(**best_params)
            logger.info(msg='best params for {} are: {}'.format(name, best_params))
            experiment.results.loc[name + suffix, 'params'] = repr(best_params)
        
        logger.info(msg='{} fit_predict started'.format(name))
        fit_predict_add_res(name, model, experiment, train, suffix)        
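
The role of `budget` in the optimization step can be illustrated with a plain random search: each trial samples parameters from the given ranges, evaluates them on the validation split, and the best trial wins. This is a simplified stand-in, not what `optimize` does internally (the trial logs below suggest an Optuna study, whose sampler is more elaborate). The names `random_search` and `toy_objective` are hypothetical.

```python
import random


def random_search(objective, param_grid, budget, seed=None):
    """Evaluate `budget` random parameter sets and return the best one."""
    rng = random.Random(seed)
    best_params, best_value = None, float("-inf")
    for _ in range(budget):
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in param_grid.items()
        }
        value = objective(params)  # e.g. NDCG@K on the validation split
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value


def toy_objective(params):
    # Stand-in for a validation metric; peaks at alpha = 1.5.
    return -(params["alpha"] - 1.5) ** 2


best_params, best_value = random_search(
    toy_objective, {"alpha": [-0.5, 100]}, budget=20, seed=12345
)
print(best_params, best_value)
```

A larger budget can only improve the best value found for a fixed seed, at the cost of more fit and predict runs on `opt_train` and `opt_val`.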

In [20]:
%%time
full_pipeline(non_personalized_models, e, train)

06-Aug-21 12:11:26, replay, INFO: Popular Recommender started
06-Aug-21 12:11:26, replay, INFO: Popular Recommender fit_predict started
06-Aug-21 12:11:56, replay, INFO: Random Recommender (uniform) started
06-Aug-21 12:11:56, replay, INFO: Random Recommender (uniform) fit_predict started


                      NDCG@10    MRR@10  Coverage@10  fit_pred_time
Popular Recommender  0.243614  0.390414     0.033903      12.300273


06-Aug-21 12:12:18, replay, INFO: Random Recommender (popularity-based) started
06-Aug-21 12:12:18, replay, INFO: Random Recommender (popularity-based) optimization started
[I 2021-08-06 12:12:18,492] A new study created in memory with name: no-name-04998036-db9e-4cc2-98ac-407dba10d273


                               NDCG@10    MRR@10  Coverage@10  fit_pred_time
Popular Recommender           0.243614  0.390414     0.033903      12.300273
Random Recommender (uniform)  0.025557  0.067583     0.960773       7.897583


[I 2021-08-06 12:12:30,335] Trial 0 finished with value: 0.05814092087936753 and parameters: {'alpha': 82.88966197286092}. Best is trial 0 with value: 0.05814092087936753.
[I 2021-08-06 12:12:40,739] Trial 1 finished with value: 0.06515443071960864 and parameters: {'alpha': 35.938073963960825}. Best is trial 1 with value: 0.06515443071960864.
[I 2021-08-06 12:12:46,558] Trial 2 finished with value: 0.061019515481233515 and parameters: {'alpha': 27.20613961415028}. Best is trial 1 with value: 0.06515443071960864.
[I 2021-08-06 12:12:52,199] Trial 3 finished with value: 0.05343624977373768 and parameters: {'alpha': 72.98775725951892}. Best is trial 1 with value: 0.06515443071960864.
[I 2021-08-06 12:12:57,863] Trial 4 finished with value: 0.07314960383949339 and parameters: {'alpha': 1.5184592423221073}. Best is trial 4 with value: 0.07314960383949339.
[I 2021-08-06 12:13:03,440] Trial 5 finished with value: 0.0596

                                        NDCG@10    MRR@10  Coverage@10  \
Popular Recommender                    0.243614  0.390414     0.033903   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   

                                       fit_pred_time  
Popular Recommender                        12.300273  
Random Recommender (popularity-based)       6.827084  
Random Recommender (uniform)                7.897583  
                                        NDCG@10    MRR@10  Coverage@10  \
Popular Recommender                    0.243614  0.390414     0.033903   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   

                                       fit_pred_time  
Popular Recommender                        12.300273  
Wi

In [21]:
e.results.sort_values('NDCG@10', ascending=False)

Unnamed: 0,Coverage@10,HitRate@1,HitRate@5,HitRate@10,MAP@10,MRR@10,NDCG@10,Surprisal@10,fit_pred_time,params
Popular Recommender,0.033903,0.28446,0.53029,0.645303,0.157194,0.390414,0.243614,0.118354,12.300273,
Wilson Recommender,0.017092,0.083406,0.34504,0.414399,0.045002,0.180976,0.092121,0.26219,10.033913,
Random Recommender (popularity-based),0.653965,0.060579,0.255487,0.381914,0.028404,0.141636,0.069369,0.317856,6.827084,{'alpha': 1.2948997611910968}
Random Recommender (uniform),0.960773,0.032485,0.107112,0.183494,0.009075,0.067583,0.025557,0.53693,7.897583,


In [22]:
e.results.to_csv('res_21_rel_1.csv')

## 2.2  Personalized models without features

In [23]:
common_models = {
          'ADMM SLIM': [ADMMSLIM(seed=SEED), None],
          'Implicit ALS': [ALSWrap(seed=SEED), None], 
          'Explicit ALS': [ALSWrap(seed=SEED, implicit_prefs=False), None], 
          'KNN': [KNN(), None], 
          'LightFM': [LightFMWrap(random_state=SEED), {"no_components": [8, 512]}], 
          'SLIM': [SLIM(seed=SEED), None]}

In [24]:
%%time
full_pipeline(common_models, e, train)

06-Aug-21 12:15:09, replay, INFO: ADMM SLIM started
06-Aug-21 12:15:09, replay, INFO: ADMM SLIM optimization started
[I 2021-08-06 12:15:09,504] A new study created in memory with name: no-name-8bb28a9b-94a6-448c-99d0-74990740ca77
06-Aug-21 12:15:21, replay, INFO: Iteration: 1. primal gap: -4.6515; dual gap:  5.5049e+06; rho: 2500.0
06-Aug-21 12:15:27, replay, INFO: Iteration: 2. primal gap: -4.2503; dual gap:  2.0516e+06; rho: 1250.0
06-Aug-21 12:15:28, replay, INFO: Iteration: 3. primal gap: -3.701; dual gap:  6.8684e+05; rho: 625.0
06-Aug-21 12:15:28, replay, INFO: Iteration: 4. primal gap: -3.5454; dual gap:  9.6365e+04; rho: 312.5

INFO:replay:Iteration: 5. primal gap: -3.7216; dual gap:  -1.6622; rho: 0.019073486328125
[I 2021-08-06 12:16:40,082] Trial 3 finished with value: 0.1576700081568547 and parameters: {'lambda_1': 6.147703684504098e-09, 'lambda_2': 9.128117997418395e-06}. Best is trial 0 with value: 0.21177905832134067.
06-Aug-21 12:16:43, replay, INFO: Iteration: 1. primal gap: 1.0663e+05; dual gap:  2002.4; rho: 0.03814697265625
06-Aug-21 12:16:44, replay, INFO: Iteration: 2. primal gap: -2.4279; dual gap:  4062.6; rho: 0.019073486328125
06-Aug-21 12:16:44, replay, INFO: Iteration: 3. primal gap: -1.2609; dual gap:  -0.23763; rho: 0.019073486328125
[I 2021-08-06 12:16:54,186] Trial 4 finished with value: 0.162056165898755

[I 2021-08-06 12:18:31,002] Trial 7 finished with value: 0.1372292855196 and parameters: {'lambda_1': 3.799202266305442e-06, 'lambda_2': 0.00016976557309787273}. Best is trial 0 with value: 0.21177905832134067.
06-Aug-21 12:18:35, replay, INFO: Iteration: 1. primal gap: 8.5304e+05; dual gap:  2014.8; rho: 0.00476837158203125
06-Aug-21 12:18:35, replay, INFO: Iteration: 2. primal gap: -26.351; dual gap:  4013.2; rho: 0.002384185791015625
06-Aug-21 12:18:36, replay, INFO: Iteration: 3. primal gap: 5.2341; dual gap:  66.196; rho: 0.002384185791015625
06-Aug-21 12:18:36, replay, INFO: Iteration: 4. primal gap: 6.4237; dual gap:  13.581; rho: 0.002384185791015625

INFO:replay:Iteration: 3. primal gap: -0.8897; dual gap:  15.098; rho: 9.765625
06-Aug-21 12:21:25, replay, INFO: Iteration: 4. primal gap: -0.95838; dual gap:  -0.73327; rho: 9.765625
[I 2021-08-06 12:21:32,023] Trial 13 finished with value: 0.20411168791885143 and parameters: {'lambda_1': 0.4534854635477831, 'lambda_2': 1751.3765170172228}. Best is trial 0 with value: 0.21177905832134067.
06-Aug-21 12:21:35, replay, INFO: Iteration: 1. primal gap: 194.09; dual gap:  1.5352e+04; rho: 4.8828125
06-Aug-21 12:21:36, replay, INFO: Iteration: 2. primal gap: 22.784; dual gap:  2656.3; rho: 2.44140625
06-Aug-21 12:21:36, replay, INFO: Iteration: 3. primal gap: 19.307; dual gap:  728.25; rho: 1.220703125

                                        NDCG@10    MRR@10  Coverage@10  \
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   

                                       fit_pred_time  
Popular Recommender                        12.300273  
ADMM SLIM                                  77.647394  
Wilson Recommender                         10.033913  
Random Recommender (popularity-based)       6.827084  
Random Recommender (uniform)                7.897583  


[I 2021-08-06 12:27:21,523] Trial 0 finished with value: 0.17264983245019783 and parameters: {'rank': 93}. Best is trial 0 with value: 0.17264983245019783.
[I 2021-08-06 12:27:30,242] Trial 1 finished with value: 0.18485792192643435 and parameters: {'rank': 38}. Best is trial 1 with value: 0.18485792192643435.
[I 2021-08-06 12:27:38,006] Trial 2 finished with value: 0.19987805113792728 and parameters: {'rank': 22}. Best is trial 2 with value: 0.19987805113792728.
[I 2021-08-06 12:27:45,166] Trial 3 finished with value: 0.2036303524515266 and parameters: {'rank': 14}. Best is trial 3 with value: 0.2036303524515266.
[I 2021-08-06 12:28:07,774] Trial 4 finished with value: 0.1722055814429825 and parameters: {'rank': 139}. Best is trial 3 with value: 0.2036303524515266.
[I 2021-08-06 12:28:34,252] Trial 5 finished with value: 0.17182042728414512 and parameters: {'rank': 154}. Best is trial 3 with value: 0.20363035245

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   

                                       fit_pred_time  
Implicit ALS                               10.127573  
Popular Recommender                        12.300273  
ADMM SLIM                                  77.647394  
Wilson Recommender                         10.033913  
Random Recommender (popularity-based)       6.827084  
Random Recommender (uniform)                7.897583  


[I 2021-08-06 12:31:16,239] Trial 0 finished with value: 0.02645919715151776 and parameters: {'rank': 63}. Best is trial 0 with value: 0.02645919715151776.
[I 2021-08-06 12:31:22,573] Trial 1 finished with value: 0.026479347320400082 and parameters: {'rank': 13}. Best is trial 1 with value: 0.026479347320400082.
[I 2021-08-06 12:31:29,312] Trial 2 finished with value: 0.026479347320400082 and parameters: {'rank': 13}. Best is trial 1 with value: 0.026479347320400082.
[I 2021-08-06 12:31:35,639] Trial 3 finished with value: 0.022380826610032444 and parameters: {'rank': 11}. Best is trial 1 with value: 0.026479347320400082.
[I 2021-08-06 12:31:49,822] Trial 4 finished with value: 0.019730879968245296 and parameters: {'rank': 91}. Best is trial 1 with value: 0.026479347320400082.
[I 2021-08-06 12:31:56,186] Trial 5 finished with value: 0.026479347320400082 and parameters: {'rank': 13}. Best is trial 1 with value: 0.

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                       fit_pred_time  
Implicit ALS                               10.127573  
Popular Recommender                        12.300273  
ADMM SLIM                                  77.647394  
Wilson Recommender                         10.033913  
Random Recommender (popularity-based)       6.827084  
Random Recommender (uniform)                7.897583  
Explicit ALS          

[I 2021-08-06 12:35:21,690] Trial 0 finished with value: 0.22300418099873942 and parameters: {'num_neighbours': 43, 'shrink': 14}. Best is trial 0 with value: 0.22300418099873942.
[I 2021-08-06 12:35:35,653] Trial 1 finished with value: 0.23319577662929464 and parameters: {'num_neighbours': 72, 'shrink': 87}. Best is trial 1 with value: 0.23319577662929464.
[I 2021-08-06 12:35:50,274] Trial 2 finished with value: 0.23309568382794496 and parameters: {'num_neighbours': 81, 'shrink': 70}. Best is trial 1 with value: 0.23319577662929464.
[I 2021-08-06 12:36:04,771] Trial 3 finished with value: 0.2250762448810027 and parameters: {'num_neighbours': 47, 'shrink': 49}. Best is trial 1 with value: 0.23319577662929464.
[I 2021-08-06 12:36:19,090] Trial 4 finished with value: 0.22547371875018976 and parameters: {'num_neighbours': 68, 'shrink': 15}. Best is trial 1 with value: 0.23319577662929464.
[I 2021-08-06 12:36:33,139]

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                       fit_pred_time  
Implicit ALS                               10.127573  
KNN                                        17.963558  
Popular Recommender                        12.300273  
ADMM SLIM                                  77.647394  
Wilson Recommender                         10.033913  
Ran

[I 2021-08-06 12:40:19,519] Trial 0 finished with value: 0.1708628704470346 and parameters: {'no_components': 391}. Best is trial 0 with value: 0.1708628704470346.
[I 2021-08-06 12:40:29,122] Trial 1 finished with value: 0.1862199690416181 and parameters: {'no_components': 163}. Best is trial 1 with value: 0.1862199690416181.
[I 2021-08-06 12:40:39,056] Trial 2 finished with value: 0.16907452406936244 and parameters: {'no_components': 346}. Best is trial 1 with value: 0.1862199690416181.
[I 2021-08-06 12:40:47,157] Trial 3 finished with value: 0.19232760651255598 and parameters: {'no_components': 62}. Best is trial 3 with value: 0.19232760651255598.
[I 2021-08-06 12:40:56,837] Trial 4 finished with value: 0.19829286286113684 and parameters: {'no_components': 78}. Best is trial 4 with value: 0.19829286286113684.
[I 2021-08-06 12:41:08,150] Trial 5 finished with value: 0.1902283595535034 and parameters: {'no_compon

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                       fit_pred_time  
Implicit ALS                               10.127573  
LightFM                                    10.833092  
KNN                                        17.963558  
Popular Recommender                    

[I 2021-08-06 12:44:01,173] Trial 0 finished with value: 0.009492275594391468 and parameters: {'beta': 0.0008001450565061224, 'lambda_': 3.0664468934764985e-06}. Best is trial 0 with value: 0.009492275594391468.
[I 2021-08-06 12:44:11,971] Trial 1 finished with value: 0.17330719981821371 and parameters: {'beta': 1.1397236194099651e-09, 'lambda_': 0.005212916296850556}. Best is trial 1 with value: 0.17330719981821371.
[I 2021-08-06 12:44:43,365] Trial 2 finished with value: 0.005493043117870676 and parameters: {'beta': 9.506262224868373e-08, 'lambda_': 3.5900983065135055e-07}. Best is trial 1 with value: 0.17330719981821371.
[I 2021-08-06 12:44:55,367] Trial 3 finished with value: 0.21024139313946197 and parameters: {'beta': 1.3227323061559506, 'lambda_': 1.3072108575076162e-05}. Best is trial 3 with value: 0.21024139313946197.
[I 2021-08-06 12:46:37,325] Trial 4 finished with value: 0.005493043117870676 and parameters: {'beta

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                       fit_pred_time  
Implicit ALS                               10.127573  
LightFM                                    10.833092  
SLIM                

In [25]:
e.results.sort_values('NDCG@10', ascending=False)

Unnamed: 0,Coverage@10,HitRate@1,HitRate@5,HitRate@10,MAP@10,MRR@10,NDCG@10,Surprisal@10,fit_pred_time,params
Implicit ALS,0.13281,0.305531,0.569798,0.685689,0.171672,0.419297,0.265372,0.162866,10.127573,{'rank': 8}
LightFM,0.151303,0.317823,0.574188,0.698859,0.167327,0.431049,0.262777,0.168066,10.833092,{'no_components': 8}
SLIM,0.040347,0.310799,0.567164,0.669008,0.171509,0.418741,0.26137,0.123728,12.456171,"{'beta': 4.528603379741062, 'lambda_': 0.01886..."
KNN,0.055758,0.294996,0.555751,0.65496,0.166407,0.408699,0.256174,0.137584,17.963558,"{'num_neighbours': 56, 'shrink': 99}"
Popular Recommender,0.033903,0.28446,0.53029,0.645303,0.157194,0.390414,0.243614,0.118354,12.300273,
ADMM SLIM,0.366769,0.188762,0.460053,0.590869,0.084121,0.303578,0.159086,0.236767,77.647394,"{'lambda_1': 0.0017369838173267552, 'lambda_2'..."
Wilson Recommender,0.017092,0.083406,0.34504,0.414399,0.045002,0.180976,0.092121,0.26219,10.033913,
Random Recommender (popularity-based),0.653965,0.060579,0.255487,0.381914,0.028404,0.141636,0.069369,0.317856,6.827084,{'alpha': 1.2948997611910968}
Random Recommender (uniform),0.960773,0.032485,0.107112,0.183494,0.009075,0.067583,0.025557,0.53693,7.897583,
Explicit ALS,0.265621,0.013169,0.055312,0.093064,0.004738,0.032544,0.013044,0.684305,13.875534,{'rank': 60}
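
The table is sorted by NDCG@10. For readers who want to sanity-check such a value by hand, here is a minimal sketch of NDCG@k with binary relevance for a single user. It follows the textbook definition and is only an assumption about what the library computes; the function name `ndcg_at_k` and the toy item ids are hypothetical, and the recommendation list is assumed to contain no duplicates.

```python
import math


def ndcg_at_k(recommended, relevant, k=10):
    """Binary-relevance NDCG@k for one user (illustrative, not RePlay's code)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # DCG: each hit at 0-based position `rank` contributes 1 / log2(rank + 2)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (at most k of them) placed at the top
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


# Toy example: hits at positions 1 and 3, three relevant items in total
example = ndcg_at_k(["a", "b", "c"], {"a", "c", "z"}, k=3)
```

Averaging this per-user value over all test users would give a dataset-level score comparable in spirit to the `NDCG@10` column.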


In [26]:
e.results.to_csv('res_22_rel_1.csv')

## 2.3 Neural models

In [27]:
nets = {
    'MultVAE with default parameters': [MultVAE(), 'no_opt'],
    'NeuroMF with default parameters': [NeuroMF(), 'no_opt'],
    'Word2Vec with default parameters': [Word2VecRec(seed=SEED), 'no_opt'],
    'MultVAE with optimized parameters': [MultVAE(), {
        "learning_rate": [0.0001, 0.5],
        "dropout": [0, 0.5],
        "l2_reg": [1e-9, 5],
    }],
    'NeuroMF with optimized parameters': [NeuroMF(), {
        "learning_rate": [0.0001, 0.5],
        "l2_reg": [1e-4, 5],
        "count_negative_sample": [1, 20],
    }],
    'Word2Vec with optimized parameters': [Word2VecRec(seed=SEED), None],
}

In [28]:
%%time
full_pipeline(nets, e, train, budget=10)

06-Aug-21 12:49:42, replay, INFO: MultVAE with default parameters started
INFO:replay:MultVAE with default parameters started
06-Aug-21 12:49:42, replay, INFO: MultVAE with default parameters fit_predict started
INFO:replay:MultVAE with default parameters fit_predict started
INFO:ignite.handlers.early_stopping.EarlyStopping:EarlyStopping: Stop training
06-Aug-21 12:51:37, replay, INFO: NeuroMF with default parameters started
INFO:replay:NeuroMF with default parameters started
06-Aug-21 12:51:37, replay, INFO: NeuroMF with default parameters fit_predict started
INFO:replay:NeuroMF with default parameters fit_predict started


                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                       fit_pred_time  
Implicit ALS                               10.127573  
L

INFO:ignite.handlers.early_stopping.EarlyStopping:EarlyStopping: Stop training
06-Aug-21 13:02:05, replay, INFO: Word2Vec with default parameters started
INFO:replay:Word2Vec with default parameters started
06-Aug-21 13:02:05, replay, INFO: Word2Vec with default parameters fit_predict started
INFO:replay:Word2Vec with default parameters fit_predict started


                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
NeuroMF with default parameters        0.193122  0.317911     0.257495   
ADMM SLIM                              0.159086  0.303578     0.366769   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                           0.013044  0.032544     0.265621   

                                     

06-Aug-21 13:03:20, replay, INFO: MultVAE with optimized parameters started
INFO:replay:MultVAE with optimized parameters started
06-Aug-21 13:03:20, replay, INFO: MultVAE with optimized parameters optimization started
INFO:replay:MultVAE with optimized parameters optimization started
[I 2021-08-06 13:03:20,616] A new study created in memory with name: no-name-f36ce0ab-aad0-4bab-baf8-29824bac4b23


                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
NeuroMF with default parameters        0.193122  0.317911     0.257495   
ADMM SLIM                              0.159086  0.303578     0.366769   
Word2Vec with default parameters       0.137660  0.243760     0.145139   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)           0.025557  0.067583     0.960773   
Explicit ALS                          

INFO:ignite.handlers.early_stopping.EarlyStopping:EarlyStopping: Stop training
[I 2021-08-06 13:03:44,361] Trial 0 finished with value: 0.1874887866212654 and parameters: {'learning_rate': 0.0003862999904097142, 'dropout': 0.07923005875005928, 'l2_reg': 2.83816083728439}. Best is trial 0 with value: 0.1874887866212654.
INFO:ignite.handlers.early_stopping.EarlyStopping:EarlyStopping: Stop training
[I 2021-08-06 13:04:13,416] Trial 1 finished with value: 0.17181691224302248 and parameters: {'learning_rate': 0.00020753708855532054, 'dropout': 0.062263603220253594, 'l2_reg': 0.006427895298529179}. Best is trial 0 with value: 0.1874887866212654.
INFO:ignite.handlers.early_stopping.EarlyStopping:EarlyStopping: Stop training
[I 2021-08-06 13:04:34,704] Trial 2 finished with value: 0.16182557260096955 and parameters: {'learning_rate': 0.0005399642966597036, 'dropout': 0.06983604090587692, 'l2_reg': 5.34824137085562e-05}. Best is trial 0 with value: 0.18748878

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
MultVAE with optimized parameters      0.237955  0.395733     0.030821   
NeuroMF with default parameters        0.193122  0.317911     0.257495   
ADMM SLIM                              0.159086  0.303578     0.366769   
Word2Vec with default parameters       0.137660  0.243760     0.145139   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based)  0.069369  0.141636     0.653965   
Random Recommender (uniform)          

[I 2021-08-06 13:33:44,660] Trial 0 finished with value: 0.2171966472182154 and parameters: {'learning_rate': 0.0005164127318548905, 'l2_reg': 0.4933886344427261, 'count_negative_sample': 12}. Best is trial 0 with value: 0.2171966472182154.
[I 2021-08-06 13:40:15,299] Trial 1 finished with value: 0.19619666260424973 and parameters: {'learning_rate': 0.22633427634755302, 'l2_reg': 1.291553788083584, 'count_negative_sample': 1}. Best is trial 0 with value: 0.2171966472182154.
[I 2021-08-06 13:51:41,969] Trial 2 finished with value: 0.18114541887224037 and parameters: {'learning_rate': 0.16272475727080693, 'l2_reg': 1.110110644272328, 'count_negative_sample': 4}. Best is trial 0 with value: 0.2171966472182154.
[I 2021-08-06 14:11:09,732] Trial 3 finished with value: 0.21265404303894136 and parameters: {'learning_rate': 0.002612384539747061, 'l2_reg': 0.05352400724787759, 'count_negative_sample': 9}. Best is trial 0 with value: 0.217196647218

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
MultVAE with optimized parameters      0.237955  0.395733     0.030821   
NeuroMF with optimized parameters      0.198788  0.243165     0.076772   
NeuroMF with default parameters        0.193122  0.317911     0.257495   
ADMM SLIM                              0.159086  0.303578     0.366769   
Word2Vec with default parameters       0.137660  0.243760     0.145139   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based) 

[I 2021-08-06 17:02:11,430] Trial 0 finished with value: 0.03375827331626345 and parameters: {'rank': 208, 'window_size': 70, 'use_idf': True}. Best is trial 0 with value: 0.03375827331626345.
[I 2021-08-06 17:04:45,995] Trial 1 finished with value: 0.035382581269944056 and parameters: {'rank': 259, 'window_size': 33, 'use_idf': True}. Best is trial 1 with value: 0.035382581269944056.
[I 2021-08-06 17:07:17,193] Trial 2 finished with value: 0.03618660118348877 and parameters: {'rank': 194, 'window_size': 45, 'use_idf': True}. Best is trial 2 with value: 0.03618660118348877.
[I 2021-08-06 17:09:32,143] Trial 3 finished with value: 0.03512336075821768 and parameters: {'rank': 232, 'window_size': 31, 'use_idf': True}. Best is trial 2 with value: 0.03618660118348877.
[I 2021-08-06 17:13:15,636] Trial 4 finished with value: 0.031156451221491882 and parameters: {'rank': 220, 'window_size': 64, 'use_idf': True}. Best is trial 2 with

                                        NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                           0.265372  0.419297     0.132810   
LightFM                                0.262777  0.431049     0.151303   
SLIM                                   0.261370  0.418741     0.040347   
KNN                                    0.256174  0.408699     0.055758   
Popular Recommender                    0.243614  0.390414     0.033903   
MultVAE with default parameters        0.243479  0.393790     0.032222   
MultVAE with optimized parameters      0.237955  0.395733     0.030821   
NeuroMF with optimized parameters      0.198788  0.243165     0.076772   
NeuroMF with default parameters        0.193122  0.317911     0.257495   
ADMM SLIM                              0.159086  0.303578     0.366769   
Word2Vec with default parameters       0.137660  0.243760     0.145139   
Wilson Recommender                     0.092121  0.180976     0.017092   
Random Recommender (popularity-based) 

In [29]:
e.results.sort_values('NDCG@10', ascending=False)

Unnamed: 0,Coverage@10,HitRate@1,HitRate@5,HitRate@10,MAP@10,MRR@10,NDCG@10,Surprisal@10,fit_pred_time,params
Implicit ALS,0.13281,0.305531,0.569798,0.685689,0.171672,0.419297,0.265372,0.162866,10.127573,{'rank': 8}
LightFM,0.151303,0.317823,0.574188,0.698859,0.167327,0.431049,0.262777,0.168066,10.833092,{'no_components': 8}
SLIM,0.040347,0.310799,0.567164,0.669008,0.171509,0.418741,0.26137,0.123728,12.456171,"{'beta': 4.528603379741062, 'lambda_': 0.01886..."
KNN,0.055758,0.294996,0.555751,0.65496,0.166407,0.408699,0.256174,0.137584,17.963558,"{'num_neighbours': 56, 'shrink': 99}"
Popular Recommender,0.033903,0.28446,0.53029,0.645303,0.157194,0.390414,0.243614,0.118354,12.300273,
MultVAE with default parameters,0.032222,0.286216,0.519754,0.658472,0.154847,0.39379,0.243479,0.121923,32.609149,
MultVAE with optimized parameters,0.030821,0.287094,0.543459,0.640035,0.150969,0.395733,0.237955,0.122875,26.977435,"{'learning_rate': 0.010693178531368242, 'dropo..."
NeuroMF with optimized parameters,0.076772,0.021949,0.524144,0.653205,0.114313,0.243165,0.198788,0.231221,2791.187742,"{'learning_rate': 0.004837890834754644, 'l2_re..."
NeuroMF with default parameters,0.257495,0.187006,0.501317,0.626866,0.110592,0.317911,0.193122,0.235454,350.737231,
ADMM SLIM,0.366769,0.188762,0.460053,0.590869,0.084121,0.303578,0.159086,0.236767,77.647394,"{'lambda_1': 0.0017369838173267552, 'lambda_2'..."


In [30]:
e.results.to_csv('res_23_rel_1.csv')

## 2.4 Ensembles of recommenders

In [32]:
# Stack LightFM, KNN and ALS, reusing the hyperparameters set on the corresponding common_models instances
ensembles = {'Stack Recommender (LightFM + KNN + ALS)': [Stack(
    models=[LightFMWrap(random_state=SEED, no_components=common_models['LightFM'][0].no_components),
            KNN(num_neighbours=common_models['KNN'][0].num_neighbours, shrink=common_models['KNN'][0].shrink),
            ALSWrap(seed=SEED, rank=common_models['Implicit ALS'][0].rank)],
    n_folds=3,
    budget=BUDGET,
    seed=SEED), 'no_opt']}

In [33]:
State().logger.setLevel(logging.DEBUG)

In [34]:
%%time
full_pipeline(ensembles, e, train)

06-Aug-21 17:37:26, replay, INFO: Stack Recommender (LightFM + KNN + ALS) started
INFO:replay:Stack Recommender (LightFM + KNN + ALS) started
06-Aug-21 17:37:26, replay, INFO: Stack Recommender (LightFM + KNN + ALS) fit_predict started
INFO:replay:Stack Recommender (LightFM + KNN + ALS) fit_predict started
06-Aug-21 17:37:26, replay, DEBUG: Starting training of Stack
DEBUG:replay:Starting training of Stack
06-Aug-21 17:37:26, replay, DEBUG: Preliminary training stage (pre-fit)
DEBUG:replay:Preliminary training stage (pre-fit)
06-Aug-21 17:37:26, replay, DEBUG: Main training stage (fit)
DEBUG:replay:Main training stage (fit)
06-Aug-21 17:37:27, replay, INFO: Processing fold #0
INFO:replay:Processing fold #0
06-Aug-21 17:37:29, replay, DEBUG: Starting training of LightFMWrap
DEBUG:replay:Starting training of LightFMWrap
06-Aug-21 17:37:29, replay, DEBUG: Preliminary training stage (pre-fit)
DEBUG:replay:Preliminary training stage (pre-fit)
06-Aug-21 17:37:31, replay, DEBUG: Осн

                                          NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                             0.265372  0.419297     0.132810   
LightFM                                  0.262777  0.431049     0.151303   
SLIM                                     0.261370  0.418741     0.040347   
Stack Recommender (LightFM + KNN + ALS)  0.260570  0.416054     0.057439   
KNN                                      0.256174  0.408699     0.055758   
Popular Recommender                      0.243614  0.390414     0.033903   
MultVAE with default parameters          0.243479  0.393790     0.032222   
MultVAE with optimized parameters        0.237955  0.395733     0.030821   
NeuroMF with optimized parameters        0.198788  0.243165     0.076772   
NeuroMF with default parameters          0.193122  0.317911     0.257495   
ADMM SLIM                                0.159086  0.303578     0.366769   
Word2Vec with default parameters         0.137660  0.243760     0.145139   
Wilson Recom

In [35]:
e.results.sort_values('NDCG@10', ascending=False)

Unnamed: 0,Coverage@10,HitRate@1,HitRate@5,HitRate@10,MAP@10,MRR@10,NDCG@10,Surprisal@10,fit_pred_time,params
Implicit ALS,0.13281,0.305531,0.569798,0.685689,0.171672,0.419297,0.265372,0.162866,10.127573,{'rank': 8}
LightFM,0.151303,0.317823,0.574188,0.698859,0.167327,0.431049,0.262777,0.168066,10.833092,{'no_components': 8}
SLIM,0.040347,0.310799,0.567164,0.669008,0.171509,0.418741,0.26137,0.123728,12.456171,"{'beta': 4.528603379741062, 'lambda_': 0.01886..."
Stack Recommender (LightFM + KNN + ALS),0.057439,0.304653,0.562774,0.661106,0.169578,0.416054,0.26057,0.136999,1057.291856,
KNN,0.055758,0.294996,0.555751,0.65496,0.166407,0.408699,0.256174,0.137584,17.963558,"{'num_neighbours': 56, 'shrink': 99}"
Popular Recommender,0.033903,0.28446,0.53029,0.645303,0.157194,0.390414,0.243614,0.118354,12.300273,
MultVAE with default parameters,0.032222,0.286216,0.519754,0.658472,0.154847,0.39379,0.243479,0.121923,32.609149,
MultVAE with optimized parameters,0.030821,0.287094,0.543459,0.640035,0.150969,0.395733,0.237955,0.122875,26.977435,"{'learning_rate': 0.010693178531368242, 'dropo..."
NeuroMF with optimized parameters,0.076772,0.021949,0.524144,0.653205,0.114313,0.243165,0.198788,0.231221,2791.187742,"{'learning_rate': 0.004837890834754644, 'l2_re..."
NeuroMF with default parameters,0.257495,0.187006,0.501317,0.626866,0.110592,0.317911,0.193122,0.235454,350.737231,


In [37]:
# weights of each recommender in the ensemble
ensembles['Stack Recommender (LightFM + KNN + ALS)'][0].params

{'LightFMWrap': 0.9461744498894694,
 'KNN': 0.4653759137084064,
 'ALSWrap': 0.4443127394321243}
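
The raw weights are easier to compare once they are rescaled to sum to one. The sketch below does only that, using the values printed above; it is an illustration for reading the output, not a description of how `Stack` combines its base models internally, and the variable names are hypothetical.

```python
# Weights copied from the cell output above
stack_weights = {
    "LightFMWrap": 0.9461744498894694,
    "KNN": 0.4653759137084064,
    "ALSWrap": 0.4443127394321243,
}

# Rescale so the shares sum to 1 (illustrative only)
total = sum(stack_weights.values())
relative_share = {name: weight / total for name, weight in stack_weights.items()}

# Base model with the largest share
strongest = max(relative_share, key=relative_share.get)
```

With these values LightFM carries roughly half of the total weight, and KNN and ALS split the rest almost evenly.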

In [38]:
e.results.to_csv('res_24_rel_1.csv')

## 2.5 Models considering features

### 2.5.1 item features preprocessing

In [39]:
%%time
item_features = DataPreparator().transform(
    data=data.items,
    columns_names={
        "item_id": "item_id"
    }
)

CPU times: user 41.9 ms, sys: 0 ns, total: 41.9 ms
Wall time: 129 ms


In [40]:
item_features.show(2)

+-------+--------------------+----------------+
|item_id|              genres|           title|
+-------+--------------------+----------------+
|      1|Animation|Childre...|Toy Story (1995)|
|      2|Adventure|Childre...|  Jumanji (1995)|
+-------+--------------------+----------------+
only showing top 2 rows



In [41]:
# Titles end with "(YYYY)", so the 4 characters starting 5 from the end hold the release year
year = item_features.withColumn('year', sf.substring(sf.col('title'), -5, 4).astype(st.IntegerType())).select('item_id', 'year')
year.show(2)

+-------+----+
|item_id|year|
+-------+----+
|      1|1995|
|      2|1995|
+-------+----+
only showing top 2 rows
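
The same extraction can be checked on plain strings. This is a minimal sketch assuming every title ends with a four-digit year in parentheses, as in the rows shown; the helper name `year_from_title` is hypothetical, and returning `None` for non-matching titles mirrors the null that a failed integer cast produces in Spark.

```python
def year_from_title(title):
    """Return the release year from a title ending in '(YYYY)', or None."""
    # Same window as substring(title, -5, 4): skip the closing parenthesis
    candidate = title[-5:-1]
    return int(candidate) if candidate.isdigit() else None


toy_story_year = year_from_title("Toy Story (1995)")
missing_year = year_from_title("Untitled")
```

Titles that do not follow the pattern would end up with a missing `year`, so it is worth checking for nulls before using the column as a feature.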



In [42]:
genres = (
    State().session.createDataFrame(data.items[["item_id", "genres"]])
    .select(
        "item_id",
        sf.split("genres", r"\|").alias("genres")  # raw string: the pattern is a regex, so "|" must be escaped
    )
)

In [43]:
genres_list = (
    genres.select(sf.explode("genres").alias("genre"))
    .distinct().filter('genre <> "(no genres listed)"')
    .toPandas()["genre"].tolist()
)

In [44]:
genres_list

['Documentary',
 'Fantasy',
 'Adventure',
 'War',
 'Animation',
 'Comedy',
 'Thriller',
 'Film-Noir',
 'Crime',
 'Sci-Fi',
 'Musical',
 'Mystery',
 "Children's",
 'Drama',
 'Horror',
 'Western',
 'Romance',
 'Action']

In [45]:
from pyspark.sql.functions import col, array_contains
from pyspark.sql.types import IntegerType

# Build one 0/1 indicator column per genre (multi-hot encoding)
item_features = genres
for genre in genres_list:
    item_features = item_features.withColumn(
        genre,
        array_contains(col("genres"), genre).astype(IntegerType())
    )
item_features = item_features.drop("genres").cache()
item_features.count()

3883
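
The loop above turns the genre list of each film into one 0/1 column per genre. A minimal plain-Python sketch of the same encoding for a single row is shown below; the helper name `multi_hot` and the shortened vocabulary are hypothetical and serve only to illustrate the transformation.

```python
def multi_hot(genres_field, vocabulary):
    """Map a 'A|B|C' genre string to a {genre: 0 or 1} dictionary."""
    present = set(genres_field.split("|"))
    return {genre: int(genre in present) for genre in vocabulary}


# Shortened vocabulary for illustration; the notebook uses all 18 genres
vocabulary = ["Animation", "Children's", "Comedy", "Drama"]
row = multi_hot("Animation|Children's|Comedy", vocabulary)
```

Genres outside the vocabulary, such as the filtered-out "(no genres listed)", simply produce all zeros for that row.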

In [46]:
item_features = item_features.join(year, on='item_id', how='inner')
item_features.count()

3883

In [47]:
item_features.cache()

DataFrame[item_id: int, Documentary: int, Fantasy: int, Adventure: int, War: int, Animation: int, Comedy: int, Thriller: int, Film-Noir: int, Crime: int, Sci-Fi: int, Musical: int, Mystery: int, Children's: int, Drama: int, Horror: int, Western: int, Romance: int, Action: int, year: int]

In [48]:
item_features.show(3)

+-------+-----------+-------+---------+---+---------+------+--------+---------+-----+------+-------+-------+----------+-----+------+-------+-------+------+----+
|item_id|Documentary|Fantasy|Adventure|War|Animation|Comedy|Thriller|Film-Noir|Crime|Sci-Fi|Musical|Mystery|Children's|Drama|Horror|Western|Romance|Action|year|
+-------+-----------+-------+---------+---+---------+------+--------+---------+-----+------+-------+-------+----------+-----+------+-------+-------+------+----+
|      1|          0|      0|        0|  0|        1|     1|       0|        0|    0|     0|      0|      0|         1|    0|     0|      0|      0|     0|1995|
|      2|          0|      1|        1|  0|        0|     0|       0|        0|    0|     0|      0|      0|         1|    0|     0|      0|      0|     0|1995|
|      3|          0|      0|        0|  0|        0|     1|       0|        0|    0|     0|      0|      0|         0|    0|     0|      0|      1|     0|1995|
+-------+-----------+-------+-----

### 2.5.2 Models training

In [49]:
models_with_features = {'Classifier Recommender': [ClassifierRec(), 'no_opt'],
        'LightFM with item features': [LightFMWrap(random_state=SEED), {"no_components": [8, 512]}]}

In [50]:
%%time
full_pipeline(models_with_features, e, train)

06-Aug-21 17:56:06, replay, INFO: Classifier Recommender started
INFO:replay:Classifier Recommender started
06-Aug-21 17:56:06, replay, INFO: Classifier Recommender fit_predict started
INFO:replay:Classifier Recommender fit_predict started
06-Aug-21 17:56:06, replay, DEBUG: Starting training of ClassifierRec
DEBUG:replay:Starting training of ClassifierRec
06-Aug-21 17:56:06, replay, DEBUG: Preliminary training stage (pre-fit)
DEBUG:replay:Preliminary training stage (pre-fit)
06-Aug-21 17:56:07, replay, DEBUG: Main training stage (fit)
DEBUG:replay:Main training stage (fit)
06-Aug-21 17:56:13, replay, DEBUG: Starting prediction for ClassifierRec
DEBUG:replay:Starting prediction for ClassifierRec
06-Aug-21 17:56:38, replay, INFO: LightFM with item features started
INFO:replay:LightFM with item features started
06-Aug-21 17:56:38, replay, INFO: LightFM with item features optimization started
INFO:replay:LightFM with item features optimization started
[I 2021-08-06 17:56:38,498] A new s

                                          NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                             0.265372  0.419297     0.132810   
LightFM                                  0.262777  0.431049     0.151303   
SLIM                                     0.261370  0.418741     0.040347   
Stack Recommender (LightFM + KNN + ALS)  0.260570  0.416054     0.057439   
KNN                                      0.256174  0.408699     0.055758   
Popular Recommender                      0.243614  0.390414     0.033903   
MultVAE with default parameters          0.243479  0.393790     0.032222   
MultVAE with optimized parameters        0.237955  0.395733     0.030821   
NeuroMF with optimized parameters        0.198788  0.243165     0.076772   
NeuroMF with default parameters          0.193122  0.317911     0.257495   
ADMM SLIM                                0.159086  0.303578     0.366769   
Word2Vec with default parameters         0.137660  0.243760     0.145139   
Wilson Recom

06-Aug-21 17:56:39, replay, DEBUG: Main training stage (fit)
DEBUG:replay:Main training stage (fit)
06-Aug-21 17:58:07, replay, DEBUG: Model prediction during optimization
DEBUG:replay:Model prediction during optimization
06-Aug-21 17:58:07, replay, DEBUG: Starting prediction for LightFMWrap
DEBUG:replay:Starting prediction for LightFMWrap
  1 / concat_features.sum(axis=1).A.ravel(),
06-Aug-21 17:58:13, replay, DEBUG: Computing the metric during optimization
DEBUG:replay:Computing the metric during optimization
06-Aug-21 17:58:17, replay, DEBUG: NDCG=0.192064
DEBUG:replay:NDCG=0.192064
[I 2021-08-06 17:58:17,757] Trial 0 finished with value: 0.19206417961180125 and parameters: {'no_components': 180}. Best is trial 0 with value: 0.19206417961180125.
06-Aug-21 17:58:17, replay, DEBUG: Model fit during optimization
DEBUG:replay:Model fit during optimization
06-Aug-21 17:58:17, replay, DEBUG: Starting training of LightFMWrap
DEBUG:replay:Starting training of LightFMWrap
06-Aug-21 17:58:17, replay, DEBUG: Main training stage (fit)
DEBU

06-Aug-21 18:05:08, replay, DEBUG: Model fit during optimization
DEBUG:replay:Model fit during optimization
06-Aug-21 18:05:08, replay, DEBUG: Starting training of LightFMWrap
DEBUG:replay:Starting training of LightFMWrap
06-Aug-21 18:05:08, replay, DEBUG: Main training stage (fit)
DEBUG:replay:Main training stage (fit)
06-Aug-21 18:06:01, replay, DEBUG: Model prediction during optimization
DEBUG:replay:Model prediction during optimization
06-Aug-21 18:06:02, replay, DEBUG: Starting prediction for LightFMWrap
DEBUG:replay:Starting prediction for LightFMWrap
  1 / concat_features.sum(axis=1).A.ravel(),
06-Aug-21 18:06:07, replay, DEBUG: Computing the metric during optimization
DEBUG:replay:Computing the metric during optimization
06-Aug-21 18:06:10, replay, DEBUG: NDCG=0.193907
DEBUG:replay:NDCG=0.193907
[I 2021-08-06 18:06:10,196] Trial 8 finished with value: 0.193907033120705 and parameters: {'no_components': 52}. Best is trial 8 with value: 0.193907033120705.
06-Aug-21 18:06:10, replay, DEBUG: Model fit during optimization
DEBUG:replay:Фи

06-Aug-21 18:14:34, replay, DEBUG: NDCG=0.191595
[I 2021-08-06 18:14:34,508] Trial 15 finished with value: 0.19159450713058054 and parameters: {'no_components': 95}. Best is trial 12 with value: 0.19546122898109716.
06-Aug-21 18:14:34, replay, DEBUG: Model fit in optimization
06-Aug-21 18:14:34, replay, DEBUG: Starting training for LightFMWrap
06-Aug-21 18:14:34, replay, DEBUG: Main training stage (fit)
06-Aug-21 18:16:20, replay, DEBUG: Model predict in optimization
06-Aug-21 18:16:20, replay, DEBUG: Starting predict for LightFMWrap
  1 / concat_features.sum(axis=1).A.ravel(),
06-Aug-21 18:16:26, replay, DEBUG: Computing metric in optimization
06-Aug-21 18:16:30, replay, DEBUG: NDCG=0.181285

                                          NDCG@10    MRR@10  Coverage@10  \
Implicit ALS                             0.265372  0.419297     0.132810   
LightFM                                  0.262777  0.431049     0.151303   
SLIM                                     0.261370  0.418741     0.040347   
Stack Recommender (LightFM + KNN + ALS)  0.260570  0.416054     0.057439   
KNN                                      0.256174  0.408699     0.055758   
LightFM with item features               0.254673  0.412271     0.231718   
Popular Recommender                      0.243614  0.390414     0.033903   
MultVAE with default parameters          0.243479  0.393790     0.032222   
MultVAE with optimized parameters        0.237955  0.395733     0.030821   
NeuroMF with optimized parameters        0.198788  0.243165     0.076772   
NeuroMF with default parameters          0.193122  0.317911     0.257495   
ADMM SLIM                                0.159086  0.303578     0.366769   
Word2Vec wit

In [51]:
e.results.sort_values('NDCG@10', ascending=False)

Unnamed: 0,Coverage@10,HitRate@1,HitRate@5,HitRate@10,MAP@10,MRR@10,NDCG@10,Surprisal@10,fit_pred_time,params
Implicit ALS,0.13281,0.305531,0.569798,0.685689,0.171672,0.419297,0.265372,0.162866,10.127573,{'rank': 8}
LightFM,0.151303,0.317823,0.574188,0.698859,0.167327,0.431049,0.262777,0.168066,10.833092,{'no_components': 8}
SLIM,0.040347,0.310799,0.567164,0.669008,0.171509,0.418741,0.26137,0.123728,12.456171,"{'beta': 4.528603379741062, 'lambda_': 0.01886..."
Stack Recommender (LightFM + KNN + ALS),0.057439,0.304653,0.562774,0.661106,0.169578,0.416054,0.26057,0.136999,1057.291856,
KNN,0.055758,0.294996,0.555751,0.65496,0.166407,0.408699,0.256174,0.137584,17.963558,"{'num_neighbours': 56, 'shrink': 99}"
LightFM with item features,0.231718,0.287972,0.585601,0.690957,0.159831,0.412271,0.254673,0.194597,86.704299,{'no_components': 78}
Popular Recommender,0.033903,0.28446,0.53029,0.645303,0.157194,0.390414,0.243614,0.118354,12.300273,
MultVAE with default parameters,0.032222,0.286216,0.519754,0.658472,0.154847,0.39379,0.243479,0.121923,32.609149,
MultVAE with optimized parameters,0.030821,0.287094,0.543459,0.640035,0.150969,0.395733,0.237955,0.122875,26.977435,"{'learning_rate': 0.010693178531368242, 'dropo..."
NeuroMF with optimized parameters,0.076772,0.021949,0.524144,0.653205,0.114313,0.243165,0.198788,0.231221,2791.187742,"{'learning_rate': 0.004837890834754644, 'l2_re..."


In [52]:
e.results.to_csv('res_25_rel_1.csv')
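Reading the saved results back needs some care, because `to_csv` writes the model names as an unnamed index column. The sketch below shows one way to reload such a file with pandas. It is an illustration only: the three rows are copied from the table above to stand in for `e.results`, and the temporary file name `results_demo.csv` is a made-up placeholder so that `res_25_rel_1.csv` is not touched.

```python
import os
import tempfile

import pandas as pd

# Three rows copied from the results table above, standing in for e.results.
results = pd.DataFrame(
    {
        "NDCG@10": [0.265372, 0.262777, 0.261370],
        "fit_pred_time": [10.127573, 10.833092, 12.456171],
    },
    index=["Implicit ALS", "LightFM", "SLIM"],
)

# Hypothetical temporary path, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "results_demo.csv")
results.to_csv(path)

# Without index_col the model names come back as a column named "Unnamed: 0".
raw = pd.read_csv(path)

# With index_col=0 the model names become the index again.
restored = pd.read_csv(path, index_col=0)

print(raw.columns.tolist())
print(restored.sort_values("NDCG@10", ascending=False))
```

Passing `index_col=0` keeps the reloaded frame in the same shape as `e.results`, so the same `sort_values` call works on it.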

# 3. Results

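The commonly used models ALS, LightFM and SLIM showed the best results in both quality and time. They have the three highest NDCG@10 values (0.265, 0.263 and 0.261) with `fit_pred_time` values of about 10 to 12.5, whereas the Stack Recommender, which ranks next by NDCG@10 (0.261), has a `fit_pred_time` of 1057.3.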
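One way to make the quality and time trade-off explicit is to shortlist models that are both close to the best NDCG@10 and cheap to fit and predict. The sketch below does this on six rows copied from the results table. The two thresholds (`MIN_QUALITY_SHARE` and `MAX_TIME`) are hypothetical values chosen only for illustration and are not part of the original experiment.

```python
import pandas as pd

# Values copied from the results table above (NDCG@10 and fit_pred_time columns).
results = pd.DataFrame(
    {
        "NDCG@10": [0.265372, 0.262777, 0.261370, 0.260570, 0.256174, 0.243614],
        "fit_pred_time": [
            10.127573, 10.833092, 12.456171, 1057.291856, 17.963558, 12.300273,
        ],
    },
    index=[
        "Implicit ALS",
        "LightFM",
        "SLIM",
        "Stack Recommender (LightFM + KNN + ALS)",
        "KNN",
        "Popular Recommender",
    ],
)

# Hypothetical thresholds, chosen only for illustration.
MIN_QUALITY_SHARE = 0.98  # keep models within 2% of the best NDCG@10
MAX_TIME = 15.0           # keep models at or below this fit_pred_time

best_ndcg = results["NDCG@10"].max()

# A model is shortlisted only if it passes both the quality and the time filter.
shortlist = results[
    (results["NDCG@10"] >= MIN_QUALITY_SHARE * best_ndcg)
    & (results["fit_pred_time"] <= MAX_TIME)
].sort_values("NDCG@10", ascending=False)

print(shortlist)
```

With these thresholds the Stack Recommender passes the quality filter but is excluded by its `fit_pred_time`, while KNN and the Popular Recommender fall outside the quality margin.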