# Recommender Systems 2021/22

### Practice - Hyperparameter optimization with Skopt

### Hyperparameter optimization is essential to achieve the best recommendation quality!!

## How does it work
* Split the data into training, validation and test sets. Ensure that the distribution of the three sets is similar (e.g., the validation data should be split with the same approach used for the test data; if you know the test data contains cold items or users, ensure the validation data does too).
* Choose a recommender model, identify its hyperparameters and choose a value range and distribution for each.
* Select a hyperparameter configuration and fit your model using the training data. Evaluate it using the validation data.
* Repeat the previous step by exploring many possible hyperparameter configurations. Several exploration strategies are possible.
* Select the hyperparameter configuration with the best recommendation quality on the validation data. Use that configuration to fit the model on the union of training and validation data. Evaluate the final model on the test data and report that result (or submit to the course challenge).
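
The steps above can be sketched with a deliberately tiny example. Everything in it is an assumption made for illustration: the toy data, the "model" (a shrunk mean with a single `shrink` hyperparameter) and the `fit` and `evaluate` helpers are not part of the course repository. The point is only the order in which the three data splits are used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy observations of a constant, split into three disjoint sets.
y = 3.0 + rng.normal(scale=1.0, size=300)
y_train, y_validation, y_test = y[:180], y[180:240], y[240:]

def fit(y_fit, shrink):
    """Toy 'model': a shrunk mean, with `shrink` as its only hyperparameter."""
    return y_fit.sum() / (len(y_fit) + shrink)

def evaluate(prediction, y_eval):
    """Negative mean squared error, so that higher is better."""
    return -np.mean((y_eval - prediction) ** 2)

# Explore candidate configurations using training and validation data only.
candidates = [0, 1, 10, 100, 1000]
validation_quality = {s: evaluate(fit(y_train, s), y_validation) for s in candidates}
best_shrink = max(validation_quality, key=validation_quality.get)

# Refit on the union of training and validation data, then use the test data once.
final_model = fit(np.concatenate([y_train, y_validation]), best_shrink)
test_quality = evaluate(final_model, y_test)
print(best_shrink, round(test_quality, 3))
```

Note that the test data is never consulted while choosing `best_shrink`; it is used a single time, at the very end.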

Using any information about the composition of the test data, directly or indirectly, at any stage of the training or optimization process results in *information leakage* and causes you to overestimate the quality of your model.



## Hyperparameter optimization strategies

### Grid-search
For each hyperparameter, define a list of possible values, then explore *all* possible combinations. For example, in a KNN using Tversky similarity you could choose these values:
* number of neighbors: [10, 50, 100, 150, 200, 250]
* shrink term: [0, 10, 20, 50, 100]
* alpha: [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5]
* beta: [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5]

You end up with *1920* possible hyperparameter configurations. Furthermore, you cannot easily fine-tune. Say that you found the optimal number of neighbors is 200, but in that area of the search space you were using a step of 50. What if there is an even better result at 215? You have to define a *new* hyperparameter range around the optimal one you found and use a smaller step, e.g., [180, 190, 200, 210, 220] and so on. 
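
As a quick check of the count above, the grid can be enumerated with the standard library. The dictionary keys are illustrative names chosen for this sketch; only the value lists come from the example.

```python
from itertools import product

grid = {
    "topK": [10, 50, 100, 150, 200, 250],
    "shrink": [0, 10, 20, 50, 100],
    "alpha": [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5],
    "beta": [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5],
}

# One dictionary per configuration, covering every combination of values.
configurations = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configurations))  # 6 * 5 * 8 * 8 = 1920
```

Adding a single extra value to each list would already raise the count to 7 * 6 * 9 * 9 = 3402, which is why the cost of grid-search grows so quickly.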

Overall, grid-search is very sensitive to the range and distribution you choose, very rigid, and often impractically slow because the number of cases grows combinatorially. It has been known for more than 20 years that it is generally a bad idea.

### Random-search
In a random search you define the range and distribution of possible values for each hyperparameter, and every configuration you explore is drawn at random from them.
For example, in a KNN using Tversky similarity you could choose these values:
* number of neighbors: uniform random from 10 to 500
* shrink term: uniform random from 0 to 500
* alpha: uniform random from 0.1 to 1.5
* beta: uniform random from 0.1 to 1.5

Random search is easy to parallelize, because each configuration is drawn independently of the others, and it is effective (definitely better than grid-search).
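
A minimal sketch of how such configurations could be drawn with NumPy. The helper `sample_configuration` and the dictionary keys are hypothetical names used only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_configuration(rng):
    """Draw one configuration from the ranges listed above."""
    return {
        "topK": int(rng.integers(10, 501)),    # upper bound is exclusive, so 10..500
        "shrink": int(rng.integers(0, 501)),   # 0..500
        "alpha": float(rng.uniform(0.1, 1.5)),
        "beta": float(rng.uniform(0.1, 1.5)),
    }

# Each configuration is independent of the others, so they can be evaluated in parallel.
configurations = [sample_configuration(rng) for _ in range(20)]
print(configurations[0])
```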


### Bayesian-search
A more advanced strategy that uses a Gaussian process to model how the hyperparameters, and their interdependencies, affect the result. It starts with a phase of random search in which the search space is explored, followed by a second phase in which the Gaussian process chooses the next hyperparameter configuration. It combines exploration with exploitation.

Bayesian-search is less parallelizable (the Gaussian process is sequential) but has a good exploration-exploitation tradeoff. It is the strategy used in our research activity.
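
To make the two phases concrete, here is a hand-written sketch in plain NumPy on a one-dimensional toy problem. It is not how scikit-optimize is implemented: the RBF kernel, its length scale, the upper-confidence-bound rule with a factor of 2, and the toy objective are all assumptions chosen to keep the example short.

```python
import numpy as np

def toy_validation_quality(x):
    """Stand-in for 'fit the model and evaluate it on validation'; peaks at x = 0.3."""
    return -(x - 0.3) ** 2

def rbf_kernel(a, b, length_scale=0.2):
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP with an RBF kernel."""
    K = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    K_star = rbf_kernel(x_obs, x_new)
    solved = np.linalg.solve(K, K_star)          # K^-1 K_star
    mean = solved.T @ y_obs
    variance = 1.0 - np.sum(K_star * solved, axis=0)
    return mean, np.sqrt(np.clip(variance, 0.0, None))

rng = np.random.default_rng(0)
n_cases, n_random_starts = 10, 3
candidates = np.linspace(0.0, 1.0, 201)

# Phase 1: random exploration of the search space.
x_obs = rng.uniform(0.0, 1.0, size=n_random_starts)
y_obs = toy_validation_quality(x_obs)

# Phase 2: the Gaussian process proposes each new point, one at a time.
for _ in range(n_cases - n_random_starts):
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    upper_confidence_bound = mean + 2.0 * std    # high mean = exploit, high std = explore
    x_next = candidates[np.argmax(upper_confidence_bound)]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, toy_validation_quality(x_next))

best_x = x_obs[np.argmax(y_obs)]
print(best_x)
```

The loop in phase 2 shows why the strategy is sequential: each proposal depends on all the results collected so far.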


### Other strategies
There is no shortage of hyperparameter optimization strategies, for example the Tree-Structured Parzen Estimator, as well as techniques that use previous optimization runs to attempt transfer learning between models and datasets. All those are beyond the scope of the course.



### In the course repository you will find a SearchBayesianSkopt object in the HyperparameterTuning folder. It is a simple wrapper of another library and its purpose is to provide a very simple way to tune some of the most common hyperparameters.




## Hyperparameter sensitivity

One of the most important problems you encounter when doing hyperparameter tuning is how to select the range and distribution of the hyperparameters. There is no universal rule and some of those decisions are based on experience. Generally the suggestions we can give you are:
* Keep the search space on the larger side; it is better to have more room to maneuver than less.
* Hyperparameters like the number of neighbors, the shrink term, the number of latent dimensions of matrix factorization models etc. work well with a uniform random distribution.
* Hyperparameters like the learning rate and the L1 and L2 regularization work better with a log-uniform distribution, as the training tends to be affected by the order of magnitude rather than the absolute value. You need to be able to explore maybe from 10^-9 to 10^-3, but it is less important to choose the specific value.
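
The difference between the two distributions can be seen by sampling the 10^-9 to 10^-3 range both ways. This is a small sketch with NumPy; the helper `share_below` and the 10^-6 threshold are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10_000

# Uniform in the value itself versus uniform in the exponent (log-uniform).
uniform_samples = rng.uniform(1e-9, 1e-3, size=n_samples)
log_uniform_samples = 10.0 ** rng.uniform(-9.0, -3.0, size=n_samples)

def share_below(samples, threshold=1e-6):
    """Fraction of samples smaller than the threshold."""
    return float(np.mean(samples < threshold))

print(share_below(uniform_samples))      # close to 0.001: small magnitudes are almost never tried
print(share_below(log_uniform_samples))  # close to 0.5: every order of magnitude is equally likely
```

With a uniform distribution almost all samples fall between 10^-4 and 10^-3, so the lower orders of magnitude are practically unexplored.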

### In the course repository you will find a script called run_hyperparameter_search that contains a list of commonly used hyperparameters with the corresponding range and distribution.

### Hint: If you see that the optimization yields a hyperparameter whose value is at either end of the range (min or max), this may indicate that in that scenario you need to expand the search space.
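
A small check along these lines can be automated. The helper below is hypothetical (it is not part of the course repository); the ranges and the best configuration are the ones of the ItemKNN example further down in this notebook.

```python
def hyperparameters_at_boundary(best_hyperparameters, search_ranges):
    """Return the names of numeric hyperparameters whose best value equals the range min or max."""
    return [
        name
        for name, (low, high) in search_ranges.items()
        if best_hyperparameters[name] in (low, high)
    ]

search_ranges = {"topK": (5, 1000), "shrink": (0, 1000)}
best_hyperparameters = {"topK": 300, "shrink": 1000, "similarity": "cosine", "normalize": True}

print(hyperparameters_at_boundary(best_hyperparameters, search_ranges))  # ['shrink']
```

In that example the best shrink term sits at the upper end of its range, which suggests that a larger maximum for the shrink term could be worth trying.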



## Early-stopping

A special hyperparameter is the number of epochs you should train a machine learning model for.
It is possible to treat this number as any other hyperparameter, but that is not a very effective strategy. A common alternative is *early-stopping*. It works as follows:
* Select a maximum number of epochs, say 500
* Train the model for a certain number of epochs, say 5
* Evaluate the recommendation quality of the model on the validation data. Create a clone of the model.
* Continue training and evaluate the model periodically. Every time you find a better recommendation quality update the model clone, which will then represent the "best" model you have.
* If the recommendation quality does not improve for a certain number of consecutive validation steps, say 5, stop the training. If you reach the maximum number of epochs, stop the training.
* Use the "best" model clone to generate the recommendations on the validation data.
* When you train the model on the union of training and validation data, use the optimal number of epochs selected by the early-stopping at the previous step.
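
The procedure above can be sketched as a simple loop. The toy validation curve and the dictionary standing in for the model are assumptions made for the illustration; the names `validation_every_n` and `lower_validations_allowed` mirror the early-stopping arguments used later in this notebook.

```python
import copy

def toy_validation_quality(epoch):
    """Stand-in for a validation metric; it peaks at epoch 60 and then degrades."""
    return -((epoch - 60) ** 2)

max_epochs = 500
validation_every_n = 5
lower_validations_allowed = 5

model = {"epochs_trained": 0}   # toy model state
best_model, best_quality, best_epoch = None, float("-inf"), 0
lower_validations = 0

for epoch in range(1, max_epochs + 1):
    model["epochs_trained"] = epoch              # one "training" epoch
    if epoch % validation_every_n != 0:
        continue                                 # validate only every n epochs

    quality = toy_validation_quality(epoch)
    if quality > best_quality:
        # New best: keep a clone of the model and reset the patience counter.
        best_model, best_quality, best_epoch = copy.deepcopy(model), quality, epoch
        lower_validations = 0
    else:
        lower_validations += 1
        if lower_validations >= lower_validations_allowed:
            break                                # no improvement for 5 consecutive validations

print(best_epoch, model["epochs_trained"])  # 60 85
```

Training stops at epoch 85 instead of 500, and the clone saved at epoch 60 is the one to use; 60 is also the number of epochs to reuse when fitting on the union of training and validation data.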

Early-stopping usually saves a lot of computational time and fine-tunes the optimal number of epochs, unless the validation step takes a very large amount of time. In those cases, it can be better either to use early-stopping on the algorithm's loss function (which, however, measures a different problem than recommendation, so the best loss is not guaranteed to correspond to the model with the best recommendation quality) or to select the number of epochs as any other hyperparameter.


# Example on the course repository

In [1]:
from Data_manager.split_functions.split_train_validation_random_holdout import split_train_in_two_percentage_global_sample
from Data_manager.Movielens.Movielens1MReader import Movielens1MReader

data_reader = Movielens1MReader()
data_loaded = data_reader.load_data()

URM_all = data_loaded.get_URM_all()
ICM_all = data_loaded.get_ICM_from_name("ICM_genres")

Movielens1M: Verifying data consistency...
Movielens1M: Verifying data consistency... Passed!
DataReader: current dataset is: <class 'Data_manager.Dataset.Dataset'>
	Number of items: 3883
	Number of users: 6040
	Number of interactions in URM_all: 1000209
	Value range in URM_all: 1.00-5.00
	Interaction density: 4.26E-02
	Interactions per user:
		 Min: 2.00E+01
		 Avg: 1.66E+02
		 Max: 2.31E+03
	Interactions per item:
		 Min: 0.00E+00
		 Avg: 2.58E+02
		 Max: 3.43E+03
	Gini Index: 0.53

	ICM name: ICM_genres, Value range: 1.00 / 1.00, Num features: 18, feature occurrences: 6408, density 9.17E-02
	ICM name: ICM_year, Value range: 1.92E+03 / 2.00E+03, Num features: 1, feature occurrences: 3883, density 1.00E+00




### How do we perform hyperparameter optimization?
* Split the data into three *disjoint* sets: training, validation and testing data
* Define a set of hyperparameters with the range and distribution
* Explore hyperparameter space and select those with the best recommendation quality on the *validation* data (including the number of epochs for ML algorithms)
* Given the best hyperparameters, fit the model again using the union of training and validation data.
* Evaluate this last model on the testing data.

### Step 1: Split the data and create the evaluator objects

In [2]:
from Evaluation.Evaluator import EvaluatorHoldout

URM_train_validation, URM_test = split_train_in_two_percentage_global_sample(URM_all, train_percentage = 0.8)
URM_train, URM_validation = split_train_in_two_percentage_global_sample(URM_train_validation, train_percentage = 0.8)

evaluator_validation = EvaluatorHoldout(URM_validation, cutoff_list=[10])
evaluator_test = EvaluatorHoldout(URM_test, cutoff_list=[10])

EvaluatorHoldout: Ignoring 6024 ( 0.3%) Users that have less than 1 test interactions
EvaluatorHoldout: Ignoring 6036 ( 0.1%) Users that have less than 1 test interactions
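
Applying an 80/20 split twice leaves roughly 64% of the interactions for training, 16% for validation and 20% for testing. The sketch below illustrates the proportions on plain interaction indices; `split_in_two` is a hypothetical helper and not the repository's `split_train_in_two_percentage_global_sample`, which operates on sparse matrices.

```python
import numpy as np

def split_in_two(interaction_ids, train_percentage, rng):
    """Randomly split interaction indices into two disjoint parts."""
    shuffled = rng.permutation(interaction_ids)
    n_train = int(round(len(shuffled) * train_percentage))
    return shuffled[:n_train], shuffled[n_train:]

rng = np.random.default_rng(0)
all_interactions = np.arange(1000)

train_validation, test = split_in_two(all_interactions, 0.8, rng)
train, validation = split_in_two(train_validation, 0.8, rng)

print(len(train), len(validation), len(test))  # 640 160 200
```

This is consistent with the FunkSVD log further down, where each epoch processes about 640 thousand of the roughly one million interactions.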


### Step 2: Define hyperparameter set for the desired model, in this case ItemKNN

In [3]:
from skopt.space import Real, Integer, Categorical

hyperparameters_range_dictionary = {
    "topK": Integer(5, 1000),
    "shrink": Integer(0, 1000),
    "similarity": Categorical(["cosine"]),
    "normalize": Categorical([True, False]),
}

### Step 3: Create SearchBayesianSkopt object, providing the desired recommender class and evaluator objects

In [4]:
from Recommenders.KNN.ItemKNNCFRecommender import ItemKNNCFRecommender
from HyperparameterTuning.SearchBayesianSkopt import SearchBayesianSkopt

recommender_class = ItemKNNCFRecommender

hyperparameterSearch = SearchBayesianSkopt(recommender_class,
                                         evaluator_validation=evaluator_validation,
                                         evaluator_test=evaluator_test)

### Step 4: Provide the data needed to create an instance of the model, one trained only on URM_train, the other on URM_train_validation

In [5]:
from HyperparameterTuning.SearchAbstractClass import SearchInputRecommenderArgs
  
recommender_input_args = SearchInputRecommenderArgs(
    CONSTRUCTOR_POSITIONAL_ARGS = [URM_train],     # For a CBF model simply put [URM_train, ICM_train]
    CONSTRUCTOR_KEYWORD_ARGS = {},
    FIT_POSITIONAL_ARGS = [],
    FIT_KEYWORD_ARGS = {},
    EARLYSTOPPING_KEYWORD_ARGS = {},
)

In [6]:
recommender_input_args_last_test = SearchInputRecommenderArgs(
    CONSTRUCTOR_POSITIONAL_ARGS = [URM_train_validation],     # For a CBF model simply put [URM_train_validation, ICM_train]
    CONSTRUCTOR_KEYWORD_ARGS = {},
    FIT_POSITIONAL_ARGS = [],
    FIT_KEYWORD_ARGS = {},
    EARLYSTOPPING_KEYWORD_ARGS = {},
)

### Step 5: Create a result folder and select the number of cases (50 with 30% random is a good number)

In [7]:
import os

output_folder_path = "result_experiments/"

# If directory does not exist, create
if not os.path.exists(output_folder_path):
    os.makedirs(output_folder_path)
    
n_cases = 10  # using 10 as an example
n_random_starts = int(n_cases*0.3)
metric_to_optimize = "MAP"   
cutoff_to_optimize = 10

### Step 6: Run!

In [8]:
hyperparameterSearch.search(recommender_input_args,
                       recommender_input_args_last_test = recommender_input_args_last_test,
                       hyperparameter_search_space = hyperparameters_range_dictionary,
                       n_cases = n_cases,
                       n_random_starts = n_random_starts,
                       save_model = "last",
                       output_folder_path = output_folder_path, # Where to save the results
                       output_file_name_root = recommender_class.RECOMMENDER_NAME, # How to call the files
                       metric_to_optimize = metric_to_optimize,
                       cutoff_to_optimize = cutoff_to_optimize,
                      )

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 150, 'shrink': 314, 'similarity': 'cosine', 'normalize': True}
ItemKNNCFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
Similarity column 3883 (100.0%), 5210.63 column/sec. Elapsed time 0.75 sec
EvaluatorHoldout: Processed 6024 (100.0%) in 5.41 sec. Users per second: 1114
SearchBayesianSkopt: New best config found. Config 0: {'topK': 150, 'shrink': 314, 'similarity': 'cosine', 'normalize': True} - results: PRECISION: 0.2026228, PRECISION_RECALL_MIN_DEN: 0.2288963, RECALL: 0.1204475, MAP: 0.1059938, MAP_MIN_DEN: 0.1170157, MRR: 0.4239689, NDCG: 0.1923506, F1: 0.1510842, HIT_RATE: 0.7981408, ARHR_ALL_HITS: 0.6578372, NOVELTY: 0.0241524, AVERAGE_POPULARITY: 0.4900907, DIVERSITY_MEAN_INTER_LIST: 0.9322565, DIVERSITY_HERFINDAHL: 0.9932102, COVERAGE_ITEM: 0.2722122, COVERAGE_ITEM_CORRECT: 0.1738347, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.7960265, DIVERSITY

Similarity column 3883 (100.0%), 2689.85 column/sec. Elapsed time 1.44 sec
EvaluatorHoldout: Processed 6024 (100.0%) in 9.61 sec. Users per second: 627
SearchBayesianSkopt: Config 4 is suboptimal. Config: {'topK': 783, 'shrink': 0, 'similarity': 'cosine', 'normalize': False} - results: PRECISION: 0.1420983, PRECISION_RECALL_MIN_DEN: 0.1581309, RECALL: 0.0767443, MAP: 0.0734144, MAP_MIN_DEN: 0.0803848, MRR: 0.3458051, NDCG: 0.1431956, F1: 0.0996628, HIT_RATE: 0.6636786, ARHR_ALL_HITS: 0.4954128, NOVELTY: 0.0221433, AVERAGE_POPULARITY: 0.7557058, DIVERSITY_MEAN_INTER_LIST: 0.5132387, DIVERSITY_HERFINDAHL: 0.9513154, COVERAGE_ITEM: 0.0159670, COVERAGE_ITEM_CORRECT: 0.0121040, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.6619205, DIVERSITY_GINI: 0.0050268, SHANNON_ENTROPY: 4.5819229, RATIO_DIVERSITY_HERFINDAHL: 0.9520925, RATIO_DIVERSITY_GINI: 0.0143769, RATIO_SHANNON_ENTROPY: 0.4237471, RATIO_AVERAGE_POPULARITY: 2.7359250, RATIO_NOVELTY: 0.0750299, 

Iteration No: 5 ended. Search fi

Similarity column 3883 (100.0%), 1733.86 column/sec. Elapsed time 2.24 sec
EvaluatorHoldout: Processed 6036 (100.0%) in 8.20 sec. Users per second: 736
SearchBayesianSkopt: Best config evaluated with evaluator_test with constructor data for final test. Config: {'topK': 300, 'shrink': 1000, 'similarity': 'cosine', 'normalize': True} - results:
CUTOFF: 10 - PRECISION: 0.3289430, PRECISION_RECALL_MIN_DEN: 0.3507929, RECALL: 0.1443527, MAP: 0.2372513, MAP_MIN_DEN: 0.2481392, MRR: 0.6085585, NDCG: 0.3006104, F1: 0.2006518, HIT_RATE: 0.8727634, ARHR_ALL_HITS: 1.1395252, NOVELTY: 0.0237508, AVERAGE_POPULARITY: 0.5261314, DIVERSITY_MEAN_INTER_LIST: 0.9116772, DIVERSITY_HERFINDAHL: 0.9911526, COVERAGE_ITEM: 0.1964976, COVERAGE_ITEM_CORRECT: 0.1529745, COVERAGE_USER: 0.9993377, COVERAGE_USER_CORRECT: 0.8721854, DIVERSITY_GINI: 0.0382841, SHANNON_ENTROPY: 7.5692657, RATIO_DIVERSITY_HERFINDAHL: 0.9919632, RATIO_DIVERSITY_GINI: 0.1094867, RATIO_SHANNON_ENTROPY: 0.7000265, RATIO_AVERAGE_POPULARITY: 

### The metadata.zip file contains details on the search

In [9]:
from Recommenders.DataIO import DataIO

data_loader = DataIO(folder_path = output_folder_path)
search_metadata = data_loader.load_data(recommender_class.RECOMMENDER_NAME + "_metadata.zip")

search_metadata.keys()

dict_keys(['result_on_test_best', 'time_df', 'result_on_last', 'time_on_last_df', 'time_on_validation_avg', 'cutoff_to_optimize', 'metric_to_optimize', 'time_on_validation_total', 'algorithm_name_recommender', 'hyperparameters_df', 'result_on_validation_best', 'time_on_train_total', 'time_on_train_avg', 'result_on_test_df', 'algorithm_name_search', 'time_on_test_avg', 'hyperparameters_best', 'hyperparameters_best_index', 'result_on_validation_df', 'exception_list', 'time_on_test_total'])

In [10]:
hyperparameters_df = search_metadata["hyperparameters_df"]
hyperparameters_df

index,topK,shrink,similarity,normalize
0,150,314,cosine,True
1,383,664,cosine,False
2,159,547,cosine,True
3,300,1000,cosine,True
4,783,0,cosine,False
5,998,2,cosine,True
6,5,1000,cosine,True
7,693,122,cosine,True
8,513,46,cosine,True
9,5,182,cosine,False


In [11]:
result_on_validation_df = search_metadata["result_on_validation_df"]
result_on_validation_df

index,cutoff,PRECISION,PRECISION_RECALL_MIN_DEN,RECALL,MAP,MAP_MIN_DEN,MRR,NDCG,F1,HIT_RATE,ARHR_ALL_HITS,...,COVERAGE_ITEM_CORRECT,COVERAGE_USER,COVERAGE_USER_CORRECT,DIVERSITY_GINI,SHANNON_ENTROPY,RATIO_DIVERSITY_HERFINDAHL,RATIO_DIVERSITY_GINI,RATIO_SHANNON_ENTROPY,RATIO_AVERAGE_POPULARITY,RATIO_NOVELTY
0,10,0.202623,0.228896,0.120448,0.105994,0.117016,0.423969,0.192351,0.151084,0.798141,0.657837,...,0.173835,0.997351,0.796026,0.053959,8.034793,0.994022,0.154324,0.743077,1.774303,0.081837
1,10,0.145435,0.162662,0.080011,0.075451,0.082986,0.354088,0.147499,0.10323,0.678287,0.508359,...,0.017255,0.997351,0.67649,0.005378,4.700364,0.954948,0.015382,0.434701,2.718386,0.075125
2,10,0.203602,0.229579,0.12052,0.107041,0.118379,0.425583,0.194463,0.151413,0.800299,0.662361,...,0.164048,0.997351,0.798179,0.050306,7.938351,0.993629,0.143876,0.734158,1.806573,0.081526
3,10,0.202125,0.228466,0.120046,0.107534,0.119399,0.429007,0.197879,0.15063,0.794655,0.667068,...,0.126964,0.997351,0.79255,0.033902,7.378293,0.990586,0.09696,0.682362,1.980549,0.080017
4,10,0.142098,0.158131,0.076744,0.073414,0.080385,0.345805,0.143196,0.099663,0.663679,0.495413,...,0.012104,0.997351,0.661921,0.005027,4.581923,0.952093,0.014377,0.423747,2.735925,0.07503
5,10,0.193675,0.217827,0.112892,0.102723,0.113711,0.417543,0.186976,0.14264,0.77905,0.644213,...,0.112799,0.997351,0.776987,0.024665,6.889494,0.986169,0.070542,0.637157,2.094322,0.079146
6,10,0.157669,0.17911,0.093214,0.07686,0.084908,0.354809,0.145788,0.117162,0.72842,0.515201,...,0.201133,0.997351,0.72649,0.080293,8.627045,0.996386,0.229641,0.79785,1.564439,0.083728
7,10,0.197825,0.222848,0.115995,0.105119,0.116278,0.423346,0.191634,0.146241,0.786355,0.6555,...,0.124131,0.997351,0.784272,0.02885,7.119046,0.988209,0.082511,0.658386,2.041729,0.07956
8,10,0.20083,0.228163,0.120765,0.105947,0.1177,0.424558,0.194242,0.150831,0.797311,0.660029,...,0.139583,0.997351,0.795199,0.035164,7.40108,0.990311,0.10057,0.684469,1.96181,0.080221
9,10,0.138928,0.1543,0.074728,0.069565,0.076122,0.325661,0.134207,0.097182,0.650896,0.469209,...,0.069534,0.997351,0.649172,0.024053,6.825971,0.986343,0.068793,0.631282,2.160487,0.078851


In [12]:
result_best_on_test = search_metadata["result_on_last"]
result_best_on_test

cutoff,PRECISION,PRECISION_RECALL_MIN_DEN,RECALL,MAP,MAP_MIN_DEN,MRR,NDCG,F1,HIT_RATE,ARHR_ALL_HITS,...,COVERAGE_ITEM_CORRECT,COVERAGE_USER,COVERAGE_USER_CORRECT,DIVERSITY_GINI,SHANNON_ENTROPY,RATIO_DIVERSITY_HERFINDAHL,RATIO_DIVERSITY_GINI,RATIO_SHANNON_ENTROPY,RATIO_AVERAGE_POPULARITY,RATIO_NOVELTY
10,0.328943,0.350793,0.144353,0.237251,0.248139,0.608559,0.30061,0.200652,0.872763,1.139525,...,0.152975,0.999338,0.872185,0.038284,7.569266,0.991963,0.109487,0.700027,1.915292,0.064382


In [13]:
best_hyperparameters = search_metadata["hyperparameters_best"]
best_hyperparameters

{'topK': 300, 'shrink': 1000, 'similarity': 'cosine', 'normalize': True}
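
As a sanity check, the best configuration can be recovered by hand from the validation results. The values below are copied from the MAP column of `result_on_validation_df` and from `hyperparameters_df` shown above (topK, shrink, normalize); plain Python lists are used so that the sketch does not depend on the saved files.

```python
# MAP at cutoff 10 on the validation data, one value per explored configuration.
validation_map = [0.105994, 0.075451, 0.107041, 0.107534, 0.073414,
                  0.102723, 0.076860, 0.105119, 0.105947, 0.069565]

# (topK, shrink, normalize) for each configuration; similarity is always "cosine".
configurations = [
    (150, 314, True), (383, 664, False), (159, 547, True), (300, 1000, True),
    (783, 0, False), (998, 2, True), (5, 1000, True), (693, 122, True),
    (513, 46, True), (5, 182, False),
]

best_index = max(range(len(validation_map)), key=validation_map.__getitem__)
print(best_index, configurations[best_index])  # 3 (300, 1000, True)
```

The configuration with the highest validation MAP is the one reported in `hyperparameters_best`.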

In [14]:
time_df = search_metadata["time_df"]
time_df

index,train,validation,test
0,0.790614,5.419285,6.228587
1,1.550529,8.002338,
2,1.721653,6.326061,6.375081
3,1.495981,7.337181,7.332524
4,1.628996,9.625446,
5,1.810539,9.066864,
6,1.304739,4.667664,
7,1.568717,8.511135,
8,1.85302,7.490336,
9,1.650328,4.953592,
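
The empty cells in the test column appear to correspond to configurations that did not improve on the best validation result found so far, so no test evaluation was run for them. This reading is an inference from the tables and the log, not something the output states explicitly. A short sketch that tracks the running best of the validation MAP values listed above:

```python
# MAP at cutoff 10 on the validation data, in the order the configurations were explored.
validation_map = [0.105994, 0.075451, 0.107041, 0.107534, 0.073414,
                  0.102723, 0.076860, 0.105119, 0.105947, 0.069565]

new_best_iterations = []
best_so_far = float("-inf")
for iteration, value in enumerate(validation_map):
    if value > best_so_far:
        best_so_far = value
        new_best_iterations.append(iteration)

print(new_best_iterations)  # [0, 2, 3]
```

Rows 0, 2 and 3 are exactly the ones that have a value in the test column.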


In [15]:
exception_list = search_metadata["exception_list"]
exception_list

[None, None, None, None, None, None, None, None, None, None]

## An example with early-stopping, for FunkSVD

In [16]:
hyperparameters_range_dictionary = {
    "epochs": Categorical([500]),
    "num_factors": Integer(1, 200),
    "sgd_mode": Categorical(["sgd", "adagrad", "adam"]),
    "batch_size": Categorical([1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]),
    "item_reg": Real(low = 1e-5, high = 1e-2, prior = 'log-uniform'),
    "user_reg": Real(low = 1e-5, high = 1e-2, prior = 'log-uniform'),
    "learning_rate": Real(low = 1e-4, high = 1e-1, prior = 'log-uniform'),
}

In [17]:
earlystopping_keywargs = {"validation_every_n": 5,
                          "stop_on_validation": True,
                          "evaluator_object": evaluator_validation,
                          "lower_validations_allowed": 5,
                          "validation_metric": metric_to_optimize,
                          }

In [18]:
recommender_input_args = SearchInputRecommenderArgs(
    CONSTRUCTOR_POSITIONAL_ARGS = [URM_train],     # For a CBF model simply put [URM_train, ICM_train]
    CONSTRUCTOR_KEYWORD_ARGS = {},
    FIT_POSITIONAL_ARGS = [],
    FIT_KEYWORD_ARGS = {},
    EARLYSTOPPING_KEYWORD_ARGS = earlystopping_keywargs,     # Additional hyperparameters for the fit function
)

In [19]:
recommender_input_args_last_test = SearchInputRecommenderArgs(
    CONSTRUCTOR_POSITIONAL_ARGS = [URM_train_validation],     # For a CBF model simply put [URM_train_validation, ICM_train]
    CONSTRUCTOR_KEYWORD_ARGS = {},
    FIT_POSITIONAL_ARGS = [],
    FIT_KEYWORD_ARGS = {},
    EARLYSTOPPING_KEYWORD_ARGS = earlystopping_keywargs,     # Additional hyperparameters for the fit function
)

In [20]:
from Recommenders.MatrixFactorization.Cython.MatrixFactorization_Cython import MatrixFactorization_FunkSVD_Cython

recommender_class = MatrixFactorization_FunkSVD_Cython

hyperparameterSearch = SearchBayesianSkopt(recommender_class,
                                         evaluator_validation=evaluator_validation,
                                         evaluator_test=evaluator_test)

In [21]:
hyperparameterSearch.search(recommender_input_args,
                       recommender_input_args_last_test = recommender_input_args_last_test,
                       hyperparameter_search_space = hyperparameters_range_dictionary,
                       n_cases = n_cases,
                       n_random_starts = n_random_starts,
                       save_model = "last",
                       output_folder_path = output_folder_path, # Where to save the results
                       output_file_name_root = recommender_class.RECOMMENDER_NAME, # How to call the files
                       metric_to_optimize = metric_to_optimize,
                       cutoff_to_optimize = cutoff_to_optimize,
                      )

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'epochs': 500, 'num_factors': 53, 'sgd_mode': 'sgd', 'batch_size': 8, 'item_reg': 0.00012885833828867778, 'user_reg': 1.3327983000192677e-05, 'learning_rate': 0.0011498156311265017}
MatrixFactorization_FunkSVD_Cython_Recommender: URM Detected 238 ( 6.1%) items with no interactions.
FUNK_SVD: Processed 640136 (100.0%) in 4.02 sec. MSE loss 1.28E+00. Sample per second: 159340
FUNK_SVD: Epoch 1 of 500. Elapsed time 3.79 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.80 sec. MSE loss 1.17E+00. Sample per second: 168655
FUNK_SVD: Epoch 2 of 500. Elapsed time 7.57 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.67 sec. MSE loss 1.14E+00. Sample per second: 174457
FUNK_SVD: Epoch 3 of 500. Elapsed time 10.45 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.58 sec. MSE loss 1.11E+00. Sample per second: 178704
FUNK_SVD: Epoch 4 of 500. Elapsed time 13.36 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.04 sec

FUNK_SVD: Processed 640136 (100.0%) in 3.25 sec. MSE loss 9.18E-01. Sample per second: 197039
FUNK_SVD: Epoch 27 of 500. Elapsed time 1.57 min
FUNK_SVD: Processed 640136 (100.0%) in 2.77 sec. MSE loss 9.17E-01. Sample per second: 230949
FUNK_SVD: Epoch 28 of 500. Elapsed time 1.61 min
FUNK_SVD: Processed 640136 (100.0%) in 3.35 sec. MSE loss 9.12E-01. Sample per second: 191178
FUNK_SVD: Epoch 29 of 500. Elapsed time 1.65 min
FUNK_SVD: Processed 640136 (100.0%) in 3.27 sec. MSE loss 9.09E-01. Sample per second: 195660
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.59 sec. Users per second: 1676
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0633798, PRECISION_RECALL_MIN_DEN: 0.0693543, RECALL: 0.0307000, MAP: 0.0248637, MAP_MIN_DEN: 0.0267379, MRR: 0.1524233, NDCG: 0.0606391, F1: 0.0413640, HIT_RATE: 0.4025564, ARHR_ALL_HITS: 0.1929833, NOVELTY: 0.0242611, AVERAGE_POPULARITY: 0.4746087, DIVERSITY_MEAN_INTER_LIST: 0.5401015, DIVERSITY_HERFINDAHL: 0.9540012, COVERAGE_I

FUNK_SVD: Processed 640144 (100.0%) in 5.01 sec. MSE loss 1.46E+00. Sample per second: 127688
FUNK_SVD: Epoch 16 of 500. Elapsed time 1.28 min
FUNK_SVD: Processed 640144 (100.0%) in 4.32 sec. MSE loss 1.40E+00. Sample per second: 148304
FUNK_SVD: Epoch 17 of 500. Elapsed time 1.36 min
FUNK_SVD: Processed 640144 (100.0%) in 4.56 sec. MSE loss 1.34E+00. Sample per second: 140246
FUNK_SVD: Epoch 18 of 500. Elapsed time 1.43 min
FUNK_SVD: Processed 640144 (100.0%) in 4.82 sec. MSE loss 1.30E+00. Sample per second: 132573
FUNK_SVD: Epoch 19 of 500. Elapsed time 1.50 min
FUNK_SVD: Processed 640144 (100.0%) in 5.06 sec. MSE loss 1.26E+00. Sample per second: 126492
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.83 sec. Users per second: 1248
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1099104, PRECISION_RECALL_MIN_DEN: 0.1221991, RECALL: 0.0575444, MAP: 0.0504870, MAP_MIN_DEN: 0.0551992, MRR: 0.2610717, NDCG: 0.1046849, F1: 0.0755395, HIT_RATE: 0.5761952, ARHR_ALL_HITS: 

FUNK_SVD: Epoch 44 of 500. Elapsed time 3.66 min
FUNK_SVD: Processed 640144 (100.0%) in 4.63 sec. MSE loss 1.00E+00. Sample per second: 138161
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.87 sec. Users per second: 1238
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1073373, PRECISION_RECALL_MIN_DEN: 0.1186354, RECALL: 0.0550678, MAP: 0.0482923, MAP_MIN_DEN: 0.0525795, MRR: 0.2509920, NDCG: 0.1027217, F1: 0.0727912, HIT_RATE: 0.5634130, ARHR_ALL_HITS: 0.3425592, NOVELTY: 0.0224971, AVERAGE_POPULARITY: 0.6962233, DIVERSITY_MEAN_INTER_LIST: 0.5957626, DIVERSITY_HERFINDAHL: 0.9595664, COVERAGE_ITEM: 0.0309039, COVERAGE_ITEM_CORRECT: 0.0218903, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5619205, DIVERSITY_GINI: 0.0063850, SHANNON_ENTROPY: 4.9959996, RATIO_DIVERSITY_HERFINDAHL: 0.9603503, RATIO_DIVERSITY_GINI: 0.0182612, RATIO_SHANNON_ENTROPY: 0.4620419, RATIO_AVERAGE_POPULARITY: 2.5205771, RATIO_NOVELTY: 0.0762288, 

FUNK_SVD: Convergence reached! Terminating a

FUNK_SVD: Processed 640256 (100.0%) in 8.26 sec. MSE loss 1.24E+00. Sample per second: 77544
FUNK_SVD: Epoch 16 of 500. Elapsed time 2.65 min
FUNK_SVD: Processed 640256 (100.0%) in 9.38 sec. MSE loss 1.24E+00. Sample per second: 68258
FUNK_SVD: Epoch 17 of 500. Elapsed time 2.80 min
FUNK_SVD: Processed 640256 (100.0%) in 9.75 sec. MSE loss 1.24E+00. Sample per second: 65654
FUNK_SVD: Epoch 18 of 500. Elapsed time 2.96 min
FUNK_SVD: Processed 640256 (100.0%) in 10.05 sec. MSE loss 1.24E+00. Sample per second: 63694
FUNK_SVD: Epoch 19 of 500. Elapsed time 3.12 min
FUNK_SVD: Processed 640256 (100.0%) in 8.93 sec. MSE loss 1.24E+00. Sample per second: 71661
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.50 sec. Users per second: 1339
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0087317, PRECISION_RECALL_MIN_DEN: 0.0094239, RECALL: 0.0034604, MAP: 0.0027641, MAP_MIN_DEN: 0.0029988, MRR: 0.0251002, NDCG: 0.0057061, F1: 0.0049565, HIT_RATE: 0.0796813, ARHR_ALL_HITS: 0.02

FUNK_SVD: Epoch 43 of 500. Elapsed time 7.02 min
FUNK_SVD: Processed 640256 (100.0%) in 6.30 sec. MSE loss 1.23E+00. Sample per second: 101564
FUNK_SVD: Epoch 44 of 500. Elapsed time 7.12 min
FUNK_SVD: Processed 640256 (100.0%) in 7.25 sec. MSE loss 1.23E+00. Sample per second: 88293
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.19 sec. Users per second: 1438
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0098440, PRECISION_RECALL_MIN_DEN: 0.0107738, RECALL: 0.0042859, MAP: 0.0031124, MAP_MIN_DEN: 0.0034256, MRR: 0.0283008, NDCG: 0.0068696, F1: 0.0059718, HIT_RATE: 0.0901394, ARHR_ALL_HITS: 0.0295647, NOVELTY: 0.0316974, AVERAGE_POPULARITY: 0.0877446, DIVERSITY_MEAN_INTER_LIST: 0.9964656, DIVERSITY_HERFINDAHL: 0.9996300, COVERAGE_ITEM: 0.9984548, COVERAGE_ITEM_CORRECT: 0.0994077, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.0899007, DIVERSITY_GINI: 0.6577621, SHANNON_ENTROPY: 11.6426352, RATIO_DIVERSITY_HERFINDAHL: 1.0004467, RATIO_DIVERSITY_GINI: 1.8812215, 

FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0115206, PRECISION_RECALL_MIN_DEN: 0.0125946, RECALL: 0.0051836, MAP: 0.0037291, MAP_MIN_DEN: 0.0040975, MRR: 0.0337290, NDCG: 0.0086734, F1: 0.0071500, HIT_RATE: 0.1040837, ARHR_ALL_HITS: 0.0353673, NOVELTY: 0.0315051, AVERAGE_POPULARITY: 0.0996354, DIVERSITY_MEAN_INTER_LIST: 0.9960504, DIVERSITY_HERFINDAHL: 0.9995885, COVERAGE_ITEM: 0.9976822, COVERAGE_ITEM_CORRECT: 0.1012104, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.1038079, DIVERSITY_GINI: 0.6422826, SHANNON_ENTROPY: 11.6005743, RATIO_DIVERSITY_HERFINDAHL: 1.0004051, RATIO_DIVERSITY_GINI: 1.8369498, RATIO_SHANNON_ENTROPY: 1.0728486, RATIO_AVERAGE_POPULARITY: 0.3607159, RATIO_NOVELTY: 0.1067512, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 70 of 500. Elapsed time 10.89 min
FUNK_SVD: Processed 640256 (100.0%) in 9.77 sec. MSE loss 1.22E+00. Sample per second: 65546
FUNK_SVD: Epoch 71 of 500. Elapsed time 11.04 min
FUNK_SVD: Processed 640256 (100.0%) in 10.08 sec. MSE loss 1

FUNK_SVD: Processed 640256 (100.0%) in 8.54 sec. MSE loss 1.22E+00. Sample per second: 74952
FUNK_SVD: Epoch 96 of 500. Elapsed time 15.24 min
FUNK_SVD: Processed 640256 (100.0%) in 10.29 sec. MSE loss 1.22E+00. Sample per second: 62232
FUNK_SVD: Epoch 97 of 500. Elapsed time 15.40 min
FUNK_SVD: Processed 640256 (100.0%) in 9.50 sec. MSE loss 1.22E+00. Sample per second: 67379
FUNK_SVD: Epoch 98 of 500. Elapsed time 15.56 min
FUNK_SVD: Processed 640256 (100.0%) in 10.31 sec. MSE loss 1.22E+00. Sample per second: 62065
FUNK_SVD: Epoch 99 of 500. Elapsed time 15.72 min
FUNK_SVD: Processed 640256 (100.0%) in 9.87 sec. MSE loss 1.22E+00. Sample per second: 64889
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.61 sec. Users per second: 1306
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0138280, PRECISION_RECALL_MIN_DEN: 0.0150423, RECALL: 0.0062747, MAP: 0.0048458, MAP_MIN_DEN: 0.0053886, MRR: 0.0435508, NDCG: 0.0114577, F1: 0.0086323, HIT_RATE: 0.1235060, ARHR_ALL_HITS:

FUNK_SVD: Epoch 123 of 500. Elapsed time 19.69 min
FUNK_SVD: Processed 640256 (100.0%) in 6.52 sec. MSE loss 1.21E+00. Sample per second: 98236
FUNK_SVD: Epoch 124 of 500. Elapsed time 19.79 min
FUNK_SVD: Processed 640256 (100.0%) in 6.39 sec. MSE loss 1.21E+00. Sample per second: 100096
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.45 sec. Users per second: 1747
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0172643, PRECISION_RECALL_MIN_DEN: 0.0189638, RECALL: 0.0083705, MAP: 0.0062853, MAP_MIN_DEN: 0.0070220, MRR: 0.0559150, NDCG: 0.0154109, F1: 0.0112746, HIT_RATE: 0.1523904, ARHR_ALL_HITS: 0.0591447, NOVELTY: 0.0308105, AVERAGE_POPULARITY: 0.1429024, DIVERSITY_MEAN_INTER_LIST: 0.9920456, DIVERSITY_HERFINDAHL: 0.9991881, COVERAGE_ITEM: 0.9958795, COVERAGE_ITEM_CORRECT: 0.0994077, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.1519868, DIVERSITY_GINI: 0.5851221, SHANNON_ENTROPY: 11.3752785, RATIO_DIVERSITY_HERFINDAHL: 1.0000044, RATIO_DIVERSITY_GINI: 1.67346

EvaluatorHoldout: Processed 6024 (100.0%) in 4.32 sec. Users per second: 1395
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0211155, PRECISION_RECALL_MIN_DEN: 0.0233704, RECALL: 0.0105701, MAP: 0.0082179, MAP_MIN_DEN: 0.0091919, MRR: 0.0712811, NDCG: 0.0201063, F1: 0.0140880, HIT_RATE: 0.1807769, ARHR_ALL_HITS: 0.0764042, NOVELTY: 0.0303728, AVERAGE_POPULARITY: 0.1709661, DIVERSITY_MEAN_INTER_LIST: 0.9871562, DIVERSITY_HERFINDAHL: 0.9986992, COVERAGE_ITEM: 0.9935617, COVERAGE_ITEM_CORRECT: 0.0978625, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.1802980, DIVERSITY_GINI: 0.5488731, SHANNON_ENTROPY: 11.1916019, RATIO_DIVERSITY_HERFINDAHL: 0.9995151, RATIO_DIVERSITY_GINI: 1.5697955, RATIO_SHANNON_ENTROPY: 1.0350259, RATIO_AVERAGE_POPULARITY: 0.6189584, RATIO_NOVELTY: 0.1029147, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 150 of 500. Elapsed time 23.42 min
FUNK_SVD: Processed 640256 (100.0%) in 7.74 sec. MSE loss 1.20E+00. Sample per second: 82635
FUNK_SVD: Epoch 151 of 500. Ela

FUNK_SVD: Processed 640256 (100.0%) in 9.64 sec. MSE loss 1.20E+00. Sample per second: 66446
FUNK_SVD: Epoch 176 of 500. Elapsed time 27.72 min
FUNK_SVD: Processed 640256 (100.0%) in 9.56 sec. MSE loss 1.20E+00. Sample per second: 66996
FUNK_SVD: Epoch 177 of 500. Elapsed time 27.87 min
FUNK_SVD: Processed 640256 (100.0%) in 9.63 sec. MSE loss 1.20E+00. Sample per second: 66444
FUNK_SVD: Epoch 178 of 500. Elapsed time 28.02 min
FUNK_SVD: Processed 640256 (100.0%) in 10.15 sec. MSE loss 1.20E+00. Sample per second: 63103
FUNK_SVD: Epoch 179 of 500. Elapsed time 28.18 min
FUNK_SVD: Processed 640256 (100.0%) in 9.37 sec. MSE loss 1.20E+00. Sample per second: 68313
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.62 sec. Users per second: 1305
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0262616, PRECISION_RECALL_MIN_DEN: 0.0293764, RECALL: 0.0136763, MAP: 0.0108567, MAP_MIN_DEN: 0.0121548, MRR: 0.0914941, NDCG: 0.0266015, F1: 0.0179860, HIT_RATE: 0.2172975, ARHR_ALL_HI

FUNK_SVD: Epoch 203 of 500. Elapsed time 32.01 min
FUNK_SVD: Processed 640256 (100.0%) in 6.41 sec. MSE loss 1.20E+00. Sample per second: 99853
FUNK_SVD: Epoch 204 of 500. Elapsed time 32.10 min
FUNK_SVD: Processed 640256 (100.0%) in 6.01 sec. MSE loss 1.19E+00. Sample per second: 106528
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.34 sec. Users per second: 1801
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0311421, PRECISION_RECALL_MIN_DEN: 0.0350174, RECALL: 0.0167639, MAP: 0.0134718, MAP_MIN_DEN: 0.0151011, MRR: 0.1106827, NDCG: 0.0326634, F1: 0.0217953, HIT_RATE: 0.2516600, ARHR_ALL_HITS: 0.1218831, NOVELTY: 0.0291942, AVERAGE_POPULARITY: 0.2450652, DIVERSITY_MEAN_INTER_LIST: 0.9659987, DIVERSITY_HERFINDAHL: 0.9965838, COVERAGE_ITEM: 0.9860932, COVERAGE_ITEM_CORRECT: 0.0921968, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.2509934, DIVERSITY_GINI: 0.4530622, SHANNON_ENTROPY: 10.6078227, RATIO_DIVERSITY_HERFINDAHL: 0.9973980, RATIO_DIVERSITY_GINI: 1.29577

EvaluatorHoldout: Processed 6024 (100.0%) in 4.55 sec. Users per second: 1324
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0364708, PRECISION_RECALL_MIN_DEN: 0.0414284, RECALL: 0.0203618, MAP: 0.0160436, MAP_MIN_DEN: 0.0180564, MRR: 0.1280440, NDCG: 0.0390786, F1: 0.0261333, HIT_RATE: 0.2886786, ARHR_ALL_HITS: 0.1430159, NOVELTY: 0.0285949, AVERAGE_POPULARITY: 0.2824691, DIVERSITY_MEAN_INTER_LIST: 0.9507271, DIVERSITY_HERFINDAHL: 0.9950569, COVERAGE_ITEM: 0.9799124, COVERAGE_ITEM_CORRECT: 0.0878187, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.2879139, DIVERSITY_GINI: 0.4051828, SHANNON_ENTROPY: 10.2708056, RATIO_DIVERSITY_HERFINDAHL: 0.9958699, RATIO_DIVERSITY_GINI: 1.1588363, RATIO_SHANNON_ENTROPY: 0.9498685, RATIO_AVERAGE_POPULARITY: 1.0226391, RATIO_NOVELTY: 0.0968903, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 230 of 500. Elapsed time 35.77 min
FUNK_SVD: Processed 640256 (100.0%) in 9.46 sec. MSE loss 1.19E+00. Sample per second: 67652
FUNK_SVD: Epoch 231 of 500. Ela

FUNK_SVD: Processed 640256 (100.0%) in 9.95 sec. MSE loss 1.19E+00. Sample per second: 64327
FUNK_SVD: Epoch 256 of 500. Elapsed time 40.08 min
FUNK_SVD: Processed 640256 (100.0%) in 10.28 sec. MSE loss 1.19E+00. Sample per second: 62259
FUNK_SVD: Epoch 257 of 500. Elapsed time 40.24 min
FUNK_SVD: Processed 640256 (100.0%) in 9.57 sec. MSE loss 1.19E+00. Sample per second: 66874
FUNK_SVD: Epoch 258 of 500. Elapsed time 40.39 min
FUNK_SVD: Processed 640256 (100.0%) in 9.58 sec. MSE loss 1.19E+00. Sample per second: 66816
FUNK_SVD: Epoch 259 of 500. Elapsed time 40.54 min
FUNK_SVD: Processed 640256 (100.0%) in 9.77 sec. MSE loss 1.18E+00. Sample per second: 65515
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.51 sec. Users per second: 1334
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0425963, PRECISION_RECALL_MIN_DEN: 0.0486847, RECALL: 0.0242810, MAP: 0.0193130, MAP_MIN_DEN: 0.0217682, MRR: 0.1492216, NDCG: 0.0467004, F1: 0.0309307, HIT_RATE: 0.3243692, ARHR_ALL_HI

FUNK_SVD: Epoch 283 of 500. Elapsed time 44.28 min
FUNK_SVD: Processed 640256 (100.0%) in 5.68 sec. MSE loss 1.18E+00. Sample per second: 112653
FUNK_SVD: Epoch 284 of 500. Elapsed time 44.38 min
FUNK_SVD: Processed 640256 (100.0%) in 6.74 sec. MSE loss 1.18E+00. Sample per second: 94932
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.40 sec. Users per second: 1770
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0479416, PRECISION_RECALL_MIN_DEN: 0.0547067, RECALL: 0.0273321, MAP: 0.0219584, MAP_MIN_DEN: 0.0247636, MRR: 0.1649711, NDCG: 0.0527168, F1: 0.0348155, HIT_RATE: 0.3564077, ARHR_ALL_HITS: 0.1898037, NOVELTY: 0.0273270, AVERAGE_POPULARITY: 0.3634751, DIVERSITY_MEAN_INTER_LIST: 0.9073447, DIVERSITY_HERFINDAHL: 0.9907194, COVERAGE_ITEM: 0.9626577, COVERAGE_ITEM_CORRECT: 0.0839557, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.3554636, DIVERSITY_GINI: 0.3065650, SHANNON_ENTROPY: 9.4614627, RATIO_DIVERSITY_HERFINDAHL: 0.9915288, RATIO_DIVERSITY_GINI: 0.876786

EvaluatorHoldout: Processed 6024 (100.0%) in 4.67 sec. Users per second: 1289
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0526062, PRECISION_RECALL_MIN_DEN: 0.0601665, RECALL: 0.0301585, MAP: 0.0244564, MAP_MIN_DEN: 0.0275536, MRR: 0.1782749, NDCG: 0.0579454, F1: 0.0383382, HIT_RATE: 0.3794821, ARHR_ALL_HITS: 0.2081897, NOVELTY: 0.0267930, AVERAGE_POPULARITY: 0.3975933, DIVERSITY_MEAN_INTER_LIST: 0.8852871, DIVERSITY_HERFINDAHL: 0.9885140, COVERAGE_ITEM: 0.9492660, COVERAGE_ITEM_CORRECT: 0.0816379, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.3784768, DIVERSITY_GINI: 0.2653890, SHANNON_ENTROPY: 9.0865984, RATIO_DIVERSITY_HERFINDAHL: 0.9893216, RATIO_DIVERSITY_GINI: 0.7590215, RATIO_SHANNON_ENTROPY: 0.8403502, RATIO_AVERAGE_POPULARITY: 1.4394299, RATIO_NOVELTY: 0.0907850, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 310 of 500. Elapsed time 48.13 min
FUNK_SVD: Processed 640256 (100.0%) in 9.84 sec. MSE loss 1.17E+00. Sample per second: 65074
FUNK_SVD: Epoch 311 of 500. Elap

FUNK_SVD: Processed 640256 (100.0%) in 9.41 sec. MSE loss 1.17E+00. Sample per second: 67997
FUNK_SVD: Epoch 336 of 500. Elapsed time 52.47 min
FUNK_SVD: Processed 640256 (100.0%) in 9.75 sec. MSE loss 1.17E+00. Sample per second: 65688
FUNK_SVD: Epoch 337 of 500. Elapsed time 52.63 min
FUNK_SVD: Processed 640256 (100.0%) in 10.05 sec. MSE loss 1.18E+00. Sample per second: 63727
FUNK_SVD: Epoch 338 of 500. Elapsed time 52.78 min
FUNK_SVD: Processed 640256 (100.0%) in 9.20 sec. MSE loss 1.17E+00. Sample per second: 69548
FUNK_SVD: Epoch 339 of 500. Elapsed time 52.93 min
FUNK_SVD: Processed 640256 (100.0%) in 9.40 sec. MSE loss 1.17E+00. Sample per second: 68112
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 5.00 sec. Users per second: 1205
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0581839, PRECISION_RECALL_MIN_DEN: 0.0662999, RECALL: 0.0329649, MAP: 0.0271893, MAP_MIN_DEN: 0.0304932, MRR: 0.1910058, NDCG: 0.0636941, F1: 0.0420856, HIT_RATE: 0.4050465, ARHR_ALL_HI

FUNK_SVD: Epoch 363 of 500. Elapsed time 56.56 min
FUNK_SVD: Processed 640256 (100.0%) in 6.51 sec. MSE loss 1.17E+00. Sample per second: 98362
FUNK_SVD: Epoch 364 of 500. Elapsed time 56.66 min
FUNK_SVD: Processed 640256 (100.0%) in 6.58 sec. MSE loss 1.17E+00. Sample per second: 97299
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.39 sec. Users per second: 1779
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0625996, PRECISION_RECALL_MIN_DEN: 0.0712829, RECALL: 0.0353843, MAP: 0.0292478, MAP_MIN_DEN: 0.0327627, MRR: 0.1995559, NDCG: 0.0681370, F1: 0.0452124, HIT_RATE: 0.4244688, ARHR_ALL_HITS: 0.2406832, NOVELTY: 0.0257049, AVERAGE_POPULARITY: 0.4671956, DIVERSITY_MEAN_INTER_LIST: 0.8318940, DIVERSITY_HERFINDAHL: 0.9831756, COVERAGE_ITEM: 0.9013649, COVERAGE_ITEM_CORRECT: 0.0713366, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.4233444, DIVERSITY_GINI: 0.1850793, SHANNON_ENTROPY: 8.2581284, RATIO_DIVERSITY_HERFINDAHL: 0.9839788, RATIO_DIVERSITY_GINI: 0.5293331

EvaluatorHoldout: Processed 6024 (100.0%) in 4.84 sec. Users per second: 1244
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0669655, PRECISION_RECALL_MIN_DEN: 0.0759362, RECALL: 0.0373173, MAP: 0.0311894, MAP_MIN_DEN: 0.0348183, MRR: 0.2064615, NDCG: 0.0719500, F1: 0.0479268, HIT_RATE: 0.4404050, ARHR_ALL_HITS: 0.2526715, NOVELTY: 0.0252663, AVERAGE_POPULARITY: 0.4951144, DIVERSITY_MEAN_INTER_LIST: 0.8075010, DIVERSITY_HERFINDAHL: 0.9807367, COVERAGE_ITEM: 0.8748390, COVERAGE_ITEM_CORRECT: 0.0679887, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.4392384, DIVERSITY_GINI: 0.1543429, SHANNON_ENTROPY: 7.9011821, RATIO_DIVERSITY_HERFINDAHL: 0.9815379, RATIO_DIVERSITY_GINI: 0.4414257, RATIO_SHANNON_ENTROPY: 0.7307201, RATIO_AVERAGE_POPULARITY: 1.7924911, RATIO_NOVELTY: 0.0856120, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 390 of 500. Elapsed time 1.01 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.66 sec. MSE loss 1.17E+00. Sample per second: 66304
FUNK_SVD: Epoch 391 of 500. Elap

FUNK_SVD: Processed 640256 (100.0%) in 9.66 sec. MSE loss 1.16E+00. Sample per second: 66271
FUNK_SVD: Epoch 416 of 500. Elapsed time 1.08 hour
FUNK_SVD: Processed 640256 (100.0%) in 10.17 sec. MSE loss 1.16E+00. Sample per second: 62964
FUNK_SVD: Epoch 417 of 500. Elapsed time 1.09 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.26 sec. MSE loss 1.16E+00. Sample per second: 69125
FUNK_SVD: Epoch 418 of 500. Elapsed time 1.09 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.48 sec. MSE loss 1.16E+00. Sample per second: 67519
FUNK_SVD: Epoch 419 of 500. Elapsed time 1.09 hour
FUNK_SVD: Processed 640256 (100.0%) in 10.41 sec. MSE loss 1.16E+00. Sample per second: 61517
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.82 sec. Users per second: 1249
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0712981, PRECISION_RECALL_MIN_DEN: 0.0806413, RECALL: 0.0395003, MAP: 0.0329616, MAP_MIN_DEN: 0.0366821, MRR: 0.2125624, NDCG: 0.0756297, F1: 0.0508364, HIT_RATE: 0.4603254, ARHR_ALL_H

FUNK_SVD: Epoch 443 of 500. Elapsed time 1.15 hour
FUNK_SVD: Processed 640256 (100.0%) in 6.89 sec. MSE loss 1.16E+00. Sample per second: 92878
FUNK_SVD: Epoch 444 of 500. Elapsed time 1.15 hour
FUNK_SVD: Processed 640256 (100.0%) in 6.41 sec. MSE loss 1.16E+00. Sample per second: 99904
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.27 sec. Users per second: 1840
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0746680, PRECISION_RECALL_MIN_DEN: 0.0844733, RECALL: 0.0413233, MAP: 0.0343516, MAP_MIN_DEN: 0.0381705, MRR: 0.2169731, NDCG: 0.0786267, F1: 0.0532027, HIT_RATE: 0.4732736, ARHR_ALL_HITS: 0.2715508, NOVELTY: 0.0245021, AVERAGE_POPULARITY: 0.5444604, DIVERSITY_MEAN_INTER_LIST: 0.7597723, DIVERSITY_HERFINDAHL: 0.9759646, COVERAGE_ITEM: 0.7950039, COVERAGE_ITEM_CORRECT: 0.0615503, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.4720199, DIVERSITY_GINI: 0.1018547, SHANNON_ENTROPY: 7.2237119, RATIO_DIVERSITY_HERFINDAHL: 0.9767619, RATIO_DIVERSITY_GINI: 0.2913079

EvaluatorHoldout: Processed 6024 (100.0%) in 4.75 sec. Users per second: 1269
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0775896, PRECISION_RECALL_MIN_DEN: 0.0875525, RECALL: 0.0425381, MAP: 0.0355919, MAP_MIN_DEN: 0.0394462, MRR: 0.2201345, NDCG: 0.0810179, F1: 0.0549501, HIT_RATE: 0.4824037, ARHR_ALL_HITS: 0.2783166, NOVELTY: 0.0242273, AVERAGE_POPULARITY: 0.5625959, DIVERSITY_MEAN_INTER_LIST: 0.7397524, DIVERSITY_HERFINDAHL: 0.9739630, COVERAGE_ITEM: 0.7465877, COVERAGE_ITEM_CORRECT: 0.0592326, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.4811258, DIVERSITY_GINI: 0.0830703, SHANNON_ENTROPY: 6.9532279, RATIO_DIVERSITY_HERFINDAHL: 0.9747587, RATIO_DIVERSITY_GINI: 0.2375837, RATIO_SHANNON_ENTROPY: 0.6430510, RATIO_AVERAGE_POPULARITY: 2.0367983, RATIO_NOVELTY: 0.0820914, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 470 of 500. Elapsed time 1.22 hour
FUNK_SVD: Processed 640256 (100.0%) in 10.04 sec. MSE loss 1.15E+00. Sample per second: 63745
FUNK_SVD: Epoch 471 of 500. Ela

FUNK_SVD: Processed 640256 (100.0%) in 9.26 sec. MSE loss 1.15E+00. Sample per second: 69121
FUNK_SVD: Epoch 496 of 500. Elapsed time 1.29 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.48 sec. MSE loss 1.15E+00. Sample per second: 67557
FUNK_SVD: Epoch 497 of 500. Elapsed time 1.29 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.70 sec. MSE loss 1.15E+00. Sample per second: 65980
FUNK_SVD: Epoch 498 of 500. Elapsed time 1.30 hour
FUNK_SVD: Processed 640256 (100.0%) in 10.19 sec. MSE loss 1.15E+00. Sample per second: 62810
FUNK_SVD: Epoch 499 of 500. Elapsed time 1.30 hour
FUNK_SVD: Processed 640256 (100.0%) in 9.41 sec. MSE loss 1.15E+00. Sample per second: 68065
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.74 sec. Users per second: 1272
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0808433, PRECISION_RECALL_MIN_DEN: 0.0912505, RECALL: 0.0442404, MAP: 0.0369268, MAP_MIN_DEN: 0.0408399, MRR: 0.2237674, NDCG: 0.0836874, F1: 0.0571864, HIT_RATE: 0.4938579, ARHR_ALL_HI

FUNK_SVD: Epoch 18 of 500. Elapsed time 1.29 min
FUNK_SVD: Processed 640136 (100.0%) in 3.69 sec. MSE loss 3.55E-01. Sample per second: 173198
FUNK_SVD: Epoch 19 of 500. Elapsed time 1.35 min
FUNK_SVD: Processed 640136 (100.0%) in 4.30 sec. MSE loss 3.43E-01. Sample per second: 148969
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.83 sec. Users per second: 1246
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0600764, PRECISION_RECALL_MIN_DEN: 0.0649755, RECALL: 0.0278546, MAP: 0.0255623, MAP_MIN_DEN: 0.0274350, MRR: 0.1640084, NDCG: 0.0582670, F1: 0.0380618, HIT_RATE: 0.3929283, ARHR_ALL_HITS: 0.2027927, NOVELTY: 0.0248901, AVERAGE_POPULARITY: 0.4223439, DIVERSITY_MEAN_INTER_LIST: 0.9304402, DIVERSITY_HERFINDAHL: 0.9930286, COVERAGE_ITEM: 0.1560649, COVERAGE_ITEM_CORRECT: 0.0715941, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.3918874, DIVERSITY_GINI: 0.0401247, SHANNON_ENTROPY: 7.7060191, RATIO_DIVERSITY_HERFINDAHL: 0.9938398, RATIO_DIVERSITY_GINI: 0.1147581, 

FUNK_SVD: Processed 640160 (100.0%) in 10.57 sec. MSE loss 1.07E-01. Sample per second: 60590
FUNK_SVD: Epoch 6 of 500. Elapsed time 1.05 min
FUNK_SVD: Processed 640160 (100.0%) in 10.22 sec. MSE loss 9.43E-02. Sample per second: 62665
FUNK_SVD: Epoch 7 of 500. Elapsed time 1.21 min
FUNK_SVD: Processed 640160 (100.0%) in 9.95 sec. MSE loss 8.41E-02. Sample per second: 64322
FUNK_SVD: Epoch 8 of 500. Elapsed time 1.37 min
FUNK_SVD: Processed 640160 (100.0%) in 10.38 sec. MSE loss 7.60E-02. Sample per second: 61649
FUNK_SVD: Epoch 9 of 500. Elapsed time 1.53 min
FUNK_SVD: Processed 640160 (100.0%) in 9.95 sec. MSE loss 7.07E-02. Sample per second: 64313
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.21 sec. Users per second: 1879
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0450863, PRECISION_RECALL_MIN_DEN: 0.0484788, RECALL: 0.0188861, MAP: 0.0178595, MAP_MIN_DEN: 0.0190478, MRR: 0.1209602, NDCG: 0.0409930, F1: 0.0266210, HIT_RATE: 0.3139110, ARHR_ALL_HITS: 0.1456

Iteration No: 5 ended. Search finished for the next optimal point.
Time taken: 375.9501
Function value obtained: -0.0182
Current minimum: -0.0505
Iteration No: 6 started. Searching for the next optimal point.
SearchBayesianSkopt: Testing config: {'epochs': 500, 'num_factors': 108, 'sgd_mode': 'adagrad', 'batch_size': 32, 'item_reg': 0.0035797717161438407, 'user_reg': 1.3811120614703808e-05, 'learning_rate': 0.001228995115651476}
MatrixFactorization_FunkSVD_Cython_Recommender: URM Detected 238 ( 6.1%) items with no interactions.
FUNK_SVD: Processed 640160 (100.0%) in 10.16 sec. MSE loss 1.29E+01. Sample per second: 63008
FUNK_SVD: Epoch 1 of 500. Elapsed time 9.77 sec
FUNK_SVD: Processed 640160 (100.0%) in 10.45 sec. MSE loss 1.14E+01. Sample per second: 61248
FUNK_SVD: Epoch 2 of 500. Elapsed time 20.06 sec
FUNK_SVD: Processed 640160 (100.0%) in 10.89 sec. MSE loss 1.05E+01. Sample per second: 58763
FUNK_SVD: Epoch 3 of 500. Elapsed time 30.50 sec
FUNK_SVD: Processed 640160 (100.0%) in

FUNK_SVD: Processed 640160 (100.0%) in 10.12 sec. MSE loss 4.25E+00. Sample per second: 63287
FUNK_SVD: Epoch 26 of 500. Elapsed time 4.80 min
FUNK_SVD: Processed 640160 (100.0%) in 10.62 sec. MSE loss 4.13E+00. Sample per second: 60258
FUNK_SVD: Epoch 27 of 500. Elapsed time 4.97 min
FUNK_SVD: Processed 640160 (100.0%) in 10.90 sec. MSE loss 4.03E+00. Sample per second: 58709
FUNK_SVD: Epoch 28 of 500. Elapsed time 5.14 min
FUNK_SVD: Processed 640160 (100.0%) in 10.99 sec. MSE loss 3.93E+00. Sample per second: 58240
FUNK_SVD: Epoch 29 of 500. Elapsed time 5.31 min
FUNK_SVD: Processed 640160 (100.0%) in 11.23 sec. MSE loss 3.83E+00. Sample per second: 56995
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.77 sec. Users per second: 1263
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0932603, PRECISION_RECALL_MIN_DEN: 0.1040418, RECALL: 0.0481212, MAP: 0.0416013, MAP_MIN_DEN: 0.0455969, MRR: 0.2284368, NDCG: 0.0836111, F1: 0.0634849, HIT_RATE: 0.5242364, ARHR_ALL_HITS: 

FUNK_SVD: Epoch 53 of 500. Elapsed time 9.23 min
FUNK_SVD: Processed 640160 (100.0%) in 10.58 sec. MSE loss 2.31E+00. Sample per second: 60507
FUNK_SVD: Epoch 54 of 500. Elapsed time 9.40 min
FUNK_SVD: Processed 640160 (100.0%) in 10.76 sec. MSE loss 2.27E+00. Sample per second: 59484
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.65 sec. Users per second: 1296
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1016932, PRECISION_RECALL_MIN_DEN: 0.1132565, RECALL: 0.0524341, MAP: 0.0459877, MAP_MIN_DEN: 0.0506099, MRR: 0.2428992, NDCG: 0.0929215, F1: 0.0691920, HIT_RATE: 0.5529548, ARHR_ALL_HITS: 0.3292225, NOVELTY: 0.0226775, AVERAGE_POPULARITY: 0.6719359, DIVERSITY_MEAN_INTER_LIST: 0.7607519, DIVERSITY_HERFINDAHL: 0.9760626, COVERAGE_ITEM: 0.0999227, COVERAGE_ITEM_CORRECT: 0.0460984, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5514901, DIVERSITY_GINI: 0.0126639, SHANNON_ENTROPY: 5.9743211, RATIO_DIVERSITY_HERFINDAHL: 0.9768600, RATIO_DIVERSITY_GINI: 0.0362191, 

FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1051461, PRECISION_RECALL_MIN_DEN: 0.1171060, RECALL: 0.0544593, MAP: 0.0477506, MAP_MIN_DEN: 0.0525700, MRR: 0.2485348, NDCG: 0.0972552, F1: 0.0717543, HIT_RATE: 0.5624170, ARHR_ALL_HITS: 0.3390866, NOVELTY: 0.0225907, AVERAGE_POPULARITY: 0.6838491, DIVERSITY_MEAN_INTER_LIST: 0.7319107, DIVERSITY_HERFINDAHL: 0.9731789, COVERAGE_ITEM: 0.0741695, COVERAGE_ITEM_CORRECT: 0.0381149, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5609272, DIVERSITY_GINI: 0.0105938, SHANNON_ENTROPY: 5.7405084, RATIO_DIVERSITY_HERFINDAHL: 0.9739740, RATIO_DIVERSITY_GINI: 0.0302986, RATIO_SHANNON_ENTROPY: 0.5308958, RATIO_AVERAGE_POPULARITY: 2.4757779, RATIO_NOVELTY: 0.0765461, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 80 of 500. Elapsed time 14.29 min
FUNK_SVD: Processed 640160 (100.0%) in 10.80 sec. MSE loss 1.58E+00. Sample per second: 59274
FUNK_SVD: Epoch 81 of 500. Elapsed time 14.46 min
FUNK_SVD: Processed 640160 (100.0%) in 10.74 sec. MSE loss 1

FUNK_SVD: Processed 640160 (100.0%) in 8.14 sec. MSE loss 1.26E+00. Sample per second: 78656
FUNK_SVD: Epoch 106 of 500. Elapsed time 18.86 min
FUNK_SVD: Processed 640160 (100.0%) in 7.56 sec. MSE loss 1.25E+00. Sample per second: 84647
FUNK_SVD: Epoch 107 of 500. Elapsed time 18.99 min
FUNK_SVD: Processed 640160 (100.0%) in 8.09 sec. MSE loss 1.24E+00. Sample per second: 79072
FUNK_SVD: Epoch 108 of 500. Elapsed time 19.11 min
FUNK_SVD: Processed 640160 (100.0%) in 7.97 sec. MSE loss 1.23E+00. Sample per second: 80342
FUNK_SVD: Epoch 109 of 500. Elapsed time 19.24 min
FUNK_SVD: Processed 640160 (100.0%) in 8.40 sec. MSE loss 1.22E+00. Sample per second: 76236
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.42 sec. Users per second: 1764
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1070717, PRECISION_RECALL_MIN_DEN: 0.1190805, RECALL: 0.0553969, MAP: 0.0483031, MAP_MIN_DEN: 0.0531144, MRR: 0.2483022, NDCG: 0.0995412, F1: 0.0730164, HIT_RATE: 0.5667331, ARHR_ALL_HIT

FUNK_SVD: Processed 640160 (100.0%) in 10.55 sec. MSE loss 1.08E+00. Sample per second: 60667
FUNK_SVD: Epoch 134 of 500. Elapsed time 23.69 min
FUNK_SVD: Processed 640160 (100.0%) in 11.04 sec. MSE loss 1.07E+00. Sample per second: 57964
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.58 sec. Users per second: 1316
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1063579, PRECISION_RECALL_MIN_DEN: 0.1179115, RECALL: 0.0545429, MAP: 0.0480363, MAP_MIN_DEN: 0.0527462, MRR: 0.2481029, NDCG: 0.0996293, F1: 0.0721074, HIT_RATE: 0.5625830, ARHR_ALL_HITS: 0.3398114, NOVELTY: 0.0225893, AVERAGE_POPULARITY: 0.6831954, DIVERSITY_MEAN_INTER_LIST: 0.7049758, DIVERSITY_HERFINDAHL: 0.9704859, COVERAGE_ITEM: 0.0592326, COVERAGE_ITEM_CORRECT: 0.0321916, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5610927, DIVERSITY_GINI: 0.0093991, SHANNON_ENTROPY: 5.5742194, RATIO_DIVERSITY_HERFINDAHL: 0.9712787, RATIO_DIVERSITY_GINI: 0.0268817, RATIO_SHANNON_ENTROPY: 0.5155170, RATIO_AVERAGE

FUNK_SVD: Processed 640136 (100.0%) in 2.52 sec. MSE loss 7.59E+00. Sample per second: 254156
FUNK_SVD: Epoch 11 of 500. Elapsed time 34.95 sec
FUNK_SVD: Processed 640136 (100.0%) in 2.82 sec. MSE loss 7.34E+00. Sample per second: 226678
FUNK_SVD: Epoch 12 of 500. Elapsed time 37.26 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.15 sec. MSE loss 7.11E+00. Sample per second: 202834
FUNK_SVD: Epoch 13 of 500. Elapsed time 39.59 sec
FUNK_SVD: Processed 640136 (100.0%) in 2.59 sec. MSE loss 6.90E+00. Sample per second: 247272
FUNK_SVD: Epoch 14 of 500. Elapsed time 42.02 sec
FUNK_SVD: Processed 640136 (100.0%) in 3.03 sec. MSE loss 6.69E+00. Sample per second: 210879
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.86 sec. Users per second: 1241
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0746016, PRECISION_RECALL_MIN_DEN: 0.0827230, RECALL: 0.0380684, MAP: 0.0325137, MAP_MIN_DEN: 0.0351956, MRR: 0.1993404, NDCG: 0.0655356, F1: 0.0504121, HIT_RATE: 0.4638114, ARHR_ALL_HI

FUNK_SVD: Epoch 38 of 500. Elapsed time 2.08 min
FUNK_SVD: Processed 640136 (100.0%) in 2.55 sec. MSE loss 3.85E+00. Sample per second: 250448
FUNK_SVD: Epoch 39 of 500. Elapsed time 2.12 min
FUNK_SVD: Processed 640136 (100.0%) in 2.95 sec. MSE loss 3.78E+00. Sample per second: 216814
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 5.02 sec. Users per second: 1199
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.0970120, PRECISION_RECALL_MIN_DEN: 0.1082082, RECALL: 0.0504449, MAP: 0.0439043, MAP_MIN_DEN: 0.0477554, MRR: 0.2416044, NDCG: 0.0878581, F1: 0.0663755, HIT_RATE: 0.5418327, ARHR_ALL_HITS: 0.3217275, NOVELTY: 0.0227753, AVERAGE_POPULARITY: 0.6623323, DIVERSITY_MEAN_INTER_LIST: 0.7748141, DIVERSITY_HERFINDAHL: 0.9774685, COVERAGE_ITEM: 0.1300541, COVERAGE_ITEM_CORRECT: 0.0471285, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5403974, DIVERSITY_GINI: 0.0147258, SHANNON_ENTROPY: 6.1450618, RATIO_DIVERSITY_HERFINDAHL: 0.9782671, RATIO_DIVERSITY_GINI: 0.0421162, 

FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1031541, PRECISION_RECALL_MIN_DEN: 0.1149100, RECALL: 0.0534253, MAP: 0.0472375, MAP_MIN_DEN: 0.0514044, MRR: 0.2508450, NDCG: 0.0941957, F1: 0.0703929, HIT_RATE: 0.5589309, ARHR_ALL_HITS: 0.3394688, NOVELTY: 0.0225398, AVERAGE_POPULARITY: 0.6937215, DIVERSITY_MEAN_INTER_LIST: 0.7211373, DIVERSITY_HERFINDAHL: 0.9721018, COVERAGE_ITEM: 0.0803502, COVERAGE_ITEM_CORRECT: 0.0352820, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5574503, DIVERSITY_GINI: 0.0103738, SHANNON_ENTROPY: 5.6957925, RATIO_DIVERSITY_HERFINDAHL: 0.9728959, RATIO_DIVERSITY_GINI: 0.0296694, RATIO_SHANNON_ENTROPY: 0.5267604, RATIO_AVERAGE_POPULARITY: 2.5115196, RATIO_NOVELTY: 0.0763736, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 65 of 500. Elapsed time 3.63 min
FUNK_SVD: Processed 640136 (100.0%) in 2.57 sec. MSE loss 2.53E+00. Sample per second: 249109
FUNK_SVD: Epoch 66 of 500. Elapsed time 3.67 min
FUNK_SVD: Processed 640136 (100.0%) in 2.85 sec. MSE loss 2.51

FUNK_SVD: Processed 640136 (100.0%) in 1.46 sec. MSE loss 1.94E+00. Sample per second: 436962
FUNK_SVD: Epoch 91 of 500. Elapsed time 4.95 min
FUNK_SVD: Processed 640136 (100.0%) in 2.45 sec. MSE loss 1.92E+00. Sample per second: 260963
FUNK_SVD: Epoch 92 of 500. Elapsed time 4.98 min
FUNK_SVD: Processed 640136 (100.0%) in 1.85 sec. MSE loss 1.90E+00. Sample per second: 346836
FUNK_SVD: Epoch 93 of 500. Elapsed time 5.00 min
FUNK_SVD: Processed 640136 (100.0%) in 2.82 sec. MSE loss 1.88E+00. Sample per second: 226834
FUNK_SVD: Epoch 94 of 500. Elapsed time 5.04 min
FUNK_SVD: Processed 640136 (100.0%) in 2.20 sec. MSE loss 1.87E+00. Sample per second: 291385
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.53 sec. Users per second: 1707
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1066567, PRECISION_RECALL_MIN_DEN: 0.1187991, RECALL: 0.0551608, MAP: 0.0491069, MAP_MIN_DEN: 0.0534290, MRR: 0.2556876, NDCG: 0.0982044, F1: 0.0727149, HIT_RATE: 0.5667331, ARHR_ALL_HITS: 

FUNK_SVD: Epoch 118 of 500. Elapsed time 6.00 min
FUNK_SVD: Processed 640136 (100.0%) in 2.28 sec. MSE loss 1.57E+00. Sample per second: 280767
FUNK_SVD: Epoch 119 of 500. Elapsed time 6.03 min
FUNK_SVD: Processed 640136 (100.0%) in 1.89 sec. MSE loss 1.57E+00. Sample per second: 338934
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.43 sec. Users per second: 1754
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1078353, PRECISION_RECALL_MIN_DEN: 0.1199836, RECALL: 0.0555464, MAP: 0.0498711, MAP_MIN_DEN: 0.0543150, MRR: 0.2579400, NDCG: 0.1000591, F1: 0.0733235, HIT_RATE: 0.5693891, ARHR_ALL_HITS: 0.3530325, NOVELTY: 0.0223953, AVERAGE_POPULARITY: 0.7144836, DIVERSITY_MEAN_INTER_LIST: 0.6634879, DIVERSITY_HERFINDAHL: 0.9663378, COVERAGE_ITEM: 0.0499614, COVERAGE_ITEM_CORRECT: 0.0262683, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5678808, DIVERSITY_GINI: 0.0078667, SHANNON_ENTROPY: 5.3099273, RATIO_DIVERSITY_HERFINDAHL: 0.9671272, RATIO_DIVERSITY_GINI: 0.0224990

EvaluatorHoldout: Processed 6024 (100.0%) in 4.68 sec. Users per second: 1287
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1086653, PRECISION_RECALL_MIN_DEN: 0.1211343, RECALL: 0.0563226, MAP: 0.0504157, MAP_MIN_DEN: 0.0549187, MRR: 0.2596839, NDCG: 0.1016023, F1: 0.0741910, HIT_RATE: 0.5690571, ARHR_ALL_HITS: 0.3560143, NOVELTY: 0.0223732, AVERAGE_POPULARITY: 0.7176940, DIVERSITY_MEAN_INTER_LIST: 0.6467894, DIVERSITY_HERFINDAHL: 0.9646682, COVERAGE_ITEM: 0.0442956, COVERAGE_ITEM_CORRECT: 0.0244656, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5675497, DIVERSITY_GINI: 0.0073780, SHANNON_ENTROPY: 5.2156390, RATIO_DIVERSITY_HERFINDAHL: 0.9654563, RATIO_DIVERSITY_GINI: 0.0211013, RATIO_SHANNON_ENTROPY: 0.4823547, RATIO_AVERAGE_POPULARITY: 2.5983087, RATIO_NOVELTY: 0.0758089, 

FUNK_SVD: New best model found! Updating.
FUNK_SVD: Epoch 145 of 500. Elapsed time 7.36 min
FUNK_SVD: Processed 640136 (100.0%) in 2.80 sec. MSE loss 1.39E+00. Sample per second: 228544
FUNK_SVD: Epoch 146 of 500. Elap

FUNK_SVD: Processed 640136 (100.0%) in 3.21 sec. MSE loss 1.28E+00. Sample per second: 199369
FUNK_SVD: Epoch 171 of 500. Elapsed time 8.76 min
FUNK_SVD: Processed 640136 (100.0%) in 2.47 sec. MSE loss 1.28E+00. Sample per second: 259459
FUNK_SVD: Epoch 172 of 500. Elapsed time 8.80 min
FUNK_SVD: Processed 640136 (100.0%) in 2.76 sec. MSE loss 1.28E+00. Sample per second: 231263
FUNK_SVD: Epoch 173 of 500. Elapsed time 8.84 min
FUNK_SVD: Processed 640136 (100.0%) in 3.07 sec. MSE loss 1.28E+00. Sample per second: 208126
FUNK_SVD: Epoch 174 of 500. Elapsed time 8.88 min
FUNK_SVD: Processed 640136 (100.0%) in 2.39 sec. MSE loss 1.27E+00. Sample per second: 267603
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.54 sec. Users per second: 1326
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1089143, PRECISION_RECALL_MIN_DEN: 0.1214095, RECALL: 0.0565850, MAP: 0.0506578, MAP_MIN_DEN: 0.0552166, MRR: 0.2603165, NDCG: 0.1025430, F1: 0.0744766, HIT_RATE: 0.5687251, ARHR_ALL_HI

FUNK_SVD: Epoch 199 of 500. Elapsed time 10.26 min
FUNK_SVD: Processed 640136 (100.0%) in 2.41 sec. MSE loss 1.22E+00. Sample per second: 264961
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.83 sec. Users per second: 1248
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1094456, PRECISION_RECALL_MIN_DEN: 0.1219190, RECALL: 0.0568162, MAP: 0.0508366, MAP_MIN_DEN: 0.0554013, MRR: 0.2604169, NDCG: 0.1033511, F1: 0.0748011, HIT_RATE: 0.5695551, ARHR_ALL_HITS: 0.3578970, NOVELTY: 0.0223562, AVERAGE_POPULARITY: 0.7200156, DIVERSITY_MEAN_INTER_LIST: 0.6200192, DIVERSITY_HERFINDAHL: 0.9619916, COVERAGE_ITEM: 0.0357971, COVERAGE_ITEM_CORRECT: 0.0229204, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5680464, DIVERSITY_GINI: 0.0067667, SHANNON_ENTROPY: 5.0838441, RATIO_DIVERSITY_HERFINDAHL: 0.9627775, RATIO_DIVERSITY_GINI: 0.0193531, RATIO_SHANNON_ENTROPY: 0.4701660, RATIO_AVERAGE_POPULARITY: 2.6067138, RATIO_NOVELTY: 0.0757512, 

FUNK_SVD: New best model found! Updating.


FUNK_SVD: Processed 640136 (100.0%) in 3.04 sec. MSE loss 1.18E+00. Sample per second: 210252
FUNK_SVD: Epoch 226 of 500. Elapsed time 11.77 min
FUNK_SVD: Processed 640136 (100.0%) in 2.41 sec. MSE loss 1.18E+00. Sample per second: 265401
FUNK_SVD: Epoch 227 of 500. Elapsed time 11.81 min
FUNK_SVD: Processed 640136 (100.0%) in 2.82 sec. MSE loss 1.18E+00. Sample per second: 226688
FUNK_SVD: Epoch 228 of 500. Elapsed time 11.85 min
FUNK_SVD: Processed 640136 (100.0%) in 3.12 sec. MSE loss 1.18E+00. Sample per second: 205161
FUNK_SVD: Epoch 229 of 500. Elapsed time 11.89 min
FUNK_SVD: Processed 640136 (100.0%) in 2.45 sec. MSE loss 1.17E+00. Sample per second: 261573
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.65 sec. Users per second: 1296
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1093459, PRECISION_RECALL_MIN_DEN: 0.1217019, RECALL: 0.0567079, MAP: 0.0508657, MAP_MIN_DEN: 0.0554343, MRR: 0.2611518, NDCG: 0.1038655, F1: 0.0746839, HIT_RATE: 0.5690571, ARHR_AL

FUNK_SVD: Epoch 253 of 500. Elapsed time 13.21 min
FUNK_SVD: Processed 640136 (100.0%) in 2.82 sec. MSE loss 1.15E+00. Sample per second: 226677
FUNK_SVD: Epoch 254 of 500. Elapsed time 13.25 min
FUNK_SVD: Processed 640136 (100.0%) in 3.41 sec. MSE loss 1.15E+00. Sample per second: 187956
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.67 sec. Users per second: 1291
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1095286, PRECISION_RECALL_MIN_DEN: 0.1218028, RECALL: 0.0564752, MAP: 0.0510068, MAP_MIN_DEN: 0.0555633, MRR: 0.2618322, NDCG: 0.1043068, F1: 0.0745241, HIT_RATE: 0.5685591, ARHR_ALL_HITS: 0.3594044, NOVELTY: 0.0223568, AVERAGE_POPULARITY: 0.7196093, DIVERSITY_MEAN_INTER_LIST: 0.5999097, DIVERSITY_HERFINDAHL: 0.9599810, COVERAGE_ITEM: 0.0311615, COVERAGE_ITEM_CORRECT: 0.0206026, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5670530, DIVERSITY_GINI: 0.0063932, SHANNON_ENTROPY: 4.9964090, RATIO_DIVERSITY_HERFINDAHL: 0.9607653, RATIO_DIVERSITY_GINI: 0.01828

FUNK_SVD: Processed 640136 (100.0%) in 2.84 sec. MSE loss 1.14E+00. Sample per second: 225303
FUNK_SVD: Epoch 281 of 500. Elapsed time 14.79 min
FUNK_SVD: Processed 640136 (100.0%) in 3.19 sec. MSE loss 1.14E+00. Sample per second: 200748
FUNK_SVD: Epoch 282 of 500. Elapsed time 14.83 min
FUNK_SVD: Processed 640136 (100.0%) in 2.77 sec. MSE loss 1.14E+00. Sample per second: 231409
FUNK_SVD: Epoch 283 of 500. Elapsed time 14.87 min
FUNK_SVD: Processed 640136 (100.0%) in 3.15 sec. MSE loss 1.14E+00. Sample per second: 203249
FUNK_SVD: Epoch 284 of 500. Elapsed time 14.91 min
FUNK_SVD: Processed 640136 (100.0%) in 2.04 sec. MSE loss 1.13E+00. Sample per second: 314348
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.55 sec. Users per second: 1695
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1091799, PRECISION_RECALL_MIN_DEN: 0.1213124, RECALL: 0.0561255, MAP: 0.0509183, MAP_MIN_DEN: 0.0554402, MRR: 0.2609325, NDCG: 0.1042828, F1: 0.0741389, HIT_RATE: 0.5672311, ARHR_AL

FUNK_SVD: Processed 640192 (100.0%) in 0.95 sec. MSE loss 7.94E+00. Sample per second: 674510
FUNK_SVD: Epoch 16 of 500. Elapsed time 19.89 sec
FUNK_SVD: Processed 640192 (100.0%) in 1.26 sec. MSE loss 7.78E+00. Sample per second: 508086
FUNK_SVD: Epoch 17 of 500. Elapsed time 20.20 sec
FUNK_SVD: Processed 640192 (100.0%) in 0.73 sec. MSE loss 7.61E+00. Sample per second: 876447
FUNK_SVD: Epoch 18 of 500. Elapsed time 20.67 sec
FUNK_SVD: Processed 640192 (100.0%) in 1.23 sec. MSE loss 7.45E+00. Sample per second: 521074
FUNK_SVD: Epoch 19 of 500. Elapsed time 21.17 sec
FUNK_SVD: Processed 640192 (100.0%) in 0.57 sec. MSE loss 7.30E+00. Sample per second: 1123081
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.56 sec. Users per second: 1320
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1189575, PRECISION_RECALL_MIN_DEN: 0.1325547, RECALL: 0.0621542, MAP: 0.0569873, MAP_MIN_DEN: 0.0625789, MRR: 0.2833098, NDCG: 0.1131901, F1: 0.0816480, HIT_RATE: 0.5966135, ARHR_ALL_H

EvaluatorHoldout: Processed 6036 (100.0%) in 4.82 sec. Users per second: 1253
SearchBayesianSkopt: Config evaluated with evaluator_test. Config: {'epochs': 15, 'num_factors': 1, 'sgd_mode': 'adagrad', 'batch_size': 64, 'item_reg': 0.01, 'user_reg': 0.01, 'learning_rate': 0.0011332925114552466} - results:
CUTOFF: 10 - PRECISION: 0.1470345, PRECISION_RECALL_MIN_DEN: 0.1571106, RECALL: 0.0618970, MAP: 0.0736408, MAP_MIN_DEN: 0.0774005, MRR: 0.3169586, NDCG: 0.1292467, F1: 0.0871194, HIT_RATE: 0.6504307, ARHR_ALL_HITS: 0.4713302, NOVELTY: 0.0220970, AVERAGE_POPULARITY: 0.7643492, DIVERSITY_MEAN_INTER_LIST: 0.4658630, DIVERSITY_HERFINDAHL: 0.9465786, COVERAGE_ITEM: 0.0121040, COVERAGE_ITEM_CORRECT: 0.0108164, COVERAGE_USER: 0.9993377, COVERAGE_USER_CORRECT: 0.6500000, DIVERSITY_GINI: 0.0045915, SHANNON_ENTROPY: 4.4184519, RATIO_DIVERSITY_HERFINDAHL: 0.9473519, RATIO_DIVERSITY_GINI: 0.0131319, RATIO_SHANNON_ENTROPY: 0.4086289, RATIO_AVERAGE_POPULARITY: 2.7672173, RATIO_NOVELTY: 0.0748730, 



FUNK_SVD: Processed 640135 (100.0%) in 0.40 sec. MSE loss 1.13E+00. Sample per second: 1619413
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 4.61 sec. Users per second: 1306
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1102922, PRECISION_RECALL_MIN_DEN: 0.1217955, RECALL: 0.0561448, MAP: 0.0518545, MAP_MIN_DEN: 0.0568572, MRR: 0.2680191, NDCG: 0.1090101, F1: 0.0744105, HIT_RATE: 0.5678951, ARHR_ALL_HITS: 0.3659392, NOVELTY: 0.0222883, AVERAGE_POPULARITY: 0.7294135, DIVERSITY_MEAN_INTER_LIST: 0.3999889, DIVERSITY_HERFINDAHL: 0.9399922, COVERAGE_ITEM: 0.0121040, COVERAGE_ITEM_CORRECT: 0.0110739, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5663907, DIVERSITY_GINI: 0.0041905, SHANNON_ENTROPY: 4.2718474, RATIO_DIVERSITY_HERFINDAHL: 0.9407602, RATIO_DIVERSITY_GINI: 0.0119851, RATIO_SHANNON_ENTROPY: 0.3950706, RATIO_AVERAGE_POPULARITY: 2.6407375, RATIO_NOVELTY: 0.0755212, 

FUNK_SVD: Epoch 25 of 500. Elapsed time 34.54 sec
FUNK_SVD: Processed 640135 (100.0%) in 0.3

FUNK_SVD: Processed 640135 (100.0%) in 1.00 sec. MSE loss 1.28E+00. Sample per second: 637050
FUNK_SVD: Epoch 16 of 500. Elapsed time 15.27 sec
FUNK_SVD: Processed 640135 (100.0%) in 0.46 sec. MSE loss 1.26E+00. Sample per second: 1399274
FUNK_SVD: Epoch 17 of 500. Elapsed time 15.72 sec
FUNK_SVD: Processed 640135 (100.0%) in 0.69 sec. MSE loss 1.24E+00. Sample per second: 923494
FUNK_SVD: Epoch 18 of 500. Elapsed time 15.96 sec
FUNK_SVD: Processed 640135 (100.0%) in 0.91 sec. MSE loss 1.23E+00. Sample per second: 702973
FUNK_SVD: Epoch 19 of 500. Elapsed time 16.18 sec
FUNK_SVD: Processed 640135 (100.0%) in 1.13 sec. MSE loss 1.21E+00. Sample per second: 568212
FUNK_SVD: Validation begins...
EvaluatorHoldout: Processed 6024 (100.0%) in 3.43 sec. Users per second: 1756
FUNK_SVD: CUTOFF: 10 - PRECISION: 0.1141102, PRECISION_RECALL_MIN_DEN: 0.1266480, RECALL: 0.0590362, MAP: 0.0535163, MAP_MIN_DEN: 0.0586453, MRR: 0.2690154, NDCG: 0.1102873, F1: 0.0778143, HIT_RATE: 0.5808433, ARHR_ALL_H

FUNK_SVD: Processed 800192 (100.0%) in 0.58 sec. MSE loss 7.71E+00. Sample per second: 1375515
FUNK_SVD: Epoch 14 of 15. Elapsed time 4.51 sec
FUNK_SVD: Processed 800192 (100.0%) in 0.82 sec. MSE loss 7.51E+00. Sample per second: 972444
FUNK_SVD: Epoch 15 of 15. Elapsed time 4.75 sec
FUNK_SVD: Terminating at epoch 15. Elapsed time 4.75 sec
EvaluatorHoldout: Processed 6036 (100.0%) in 3.44 sec. Users per second: 1754
SearchBayesianSkopt: Best config evaluated with evaluator_test with constructor data for final test. Config: {'epochs': 15, 'num_factors': 1, 'sgd_mode': 'adagrad', 'batch_size': 64, 'item_reg': 0.01, 'user_reg': 0.01, 'learning_rate': 0.0011332925114552466} - results:
CUTOFF: 10 - PRECISION: 0.1763419, PRECISION_RECALL_MIN_DEN: 0.1864401, RECALL: 0.0674992, MAP: 0.1051057, MAP_MIN_DEN: 0.1090986, MRR: 0.3674129, NDCG: 0.1543828, F1: 0.0976286, HIT_RATE: 0.6706428, ARHR_ALL_HITS: 0.5930055, NOVELTY: 0.0221507, AVERAGE_POPULARITY: 0.7488238, DIVERSITY_MEAN_INTER_LIST: 0.5295

In [22]:
search_metadata = data_loader.load_data(recommender_class.RECOMMENDER_NAME + "_metadata.zip")

hyperparameters_df = search_metadata["hyperparameters_df"]
hyperparameters_df

| | epochs | num_factors | sgd_mode | batch_size | item_reg | user_reg | learning_rate |
|---|---|---|---|---|---|---|---|
| 0 | 5 | 53 | sgd | 8 | 0.000129 | 1.3e-05 | 0.00115 |
| 1 | 20 | 40 | adagrad | 16 | 0.000336 | 2.1e-05 | 0.002356 |
| 2 | 500 | 122 | sgd | 256 | 0.006262 | 0.000235 | 0.000167 |
| 3 | 10 | 48 | sgd | 2 | 0.000157 | 2.6e-05 | 0.008652 |
| 4 | 5 | 132 | adagrad | 32 | 1.6e-05 | 0.001534 | 0.090911 |
| 5 | 120 | 108 | adagrad | 32 | 0.00358 | 1.4e-05 | 0.001229 |
| 6 | 260 | 20 | adagrad | 4 | 1.5e-05 | 0.002114 | 0.000453 |
| 7 | 15 | 1 | adagrad | 64 | 0.01 | 0.01 | 0.001133 |
| 8 | 5 | 1 | adagrad | 1 | 1e-05 | 1e-05 | 0.001335 |
| 9 | 5 | 1 | adagrad | 1 | 0.01 | 1e-05 | 0.00089 |


In [23]:
time_df = search_metadata["time_df"]
time_df

| | train | validation | test |
|---|---|---|---|
| 0 | 105.788402 | 3.551863 | 3.345683 |
| 1 | 228.568995 | 4.646658 | 4.720899 |
| 2 | 4691.106388 | 4.741997 | |
| 3 | 157.704445 | 4.618278 | |
| 4 | 370.884716 | 4.37645 | |
| 5 | 1548.496231 | 4.912879 | |
| 6 | 900.066335 | 4.686435 | 4.761179 |
| 7 | 51.814222 | 4.571804 | 4.823095 |
| 8 | 41.532243 | 4.548487 | |
| 9 | 28.977696 | 3.262919 | |
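
The timing data is useful for deciding where the search budget goes. The following is a minimal sketch in plain Python, using the values copied from the `time_df` output above; the `config` column name is just a label chosen here for the unnamed index, and the times are assumed to be in seconds.

```python
import csv
import io

# Values copied from the time_df output above (assumed to be seconds).
# An empty "test" field means that configuration has no test time recorded.
TIME_CSV = """config,train,validation,test
0,105.788402,3.551863,3.345683
1,228.568995,4.646658,4.720899
2,4691.106388,4.741997,
3,157.704445,4.618278,
4,370.884716,4.37645,
5,1548.496231,4.912879,
6,900.066335,4.686435,4.761179
7,51.814222,4.571804,4.823095
8,41.532243,4.548487,
9,28.977696,3.262919,
"""

rows = list(csv.DictReader(io.StringIO(TIME_CSV)))

# Total time spent training across all explored configurations
total_train_sec = sum(float(row["train"]) for row in rows)

# The single most expensive configuration
slowest = max(rows, key=lambda row: float(row["train"]))

# Configurations that have a test time recorded
tested_configs = [int(row["config"]) for row in rows if row["test"] != ""]

print("Total training time: {:.1f} min".format(total_train_sec / 60))
print("Slowest configuration: {} ({:.1f} min)".format(
    slowest["config"], float(slowest["train"]) / 60))
print("Configurations with a test evaluation:", tested_configs)
```

Only some configurations have a test time. This is consistent with the log above, where the test evaluator is run when a new best configuration is found, although that reading is an inference from the output and not something the framework documentation is quoted on here. Note also that a single configuration with 500 epochs accounts for more than half of the total training time.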


In [24]:
best_hyperparameters = search_metadata["hyperparameters_best"]
best_hyperparameters

{'epochs': 15,
 'num_factors': 1,
 'sgd_mode': 'adagrad',
 'batch_size': 64,
 'item_reg': 0.01,
 'user_reg': 0.01,
 'learning_rate': 0.0011332925114552466}

## How to use the predefined hyperparameter ranges

The function runHyperparameterSearch_Collaborative takes as input everything needed to optimize any of the recommenders in the framework, along with other parameters such as which similarity heuristics to use for the KNNs, whether to parallelize the training of the KNNs, and so on.

In [25]:
similarity_type_list = ['cosine', 'jaccard', "asymmetric", "dice", "tversky"]

In [26]:
from HyperparameterTuning.run_hyperparameter_search import runHyperparameterSearch_Collaborative, runHyperparameterSearch_Content
from Recommenders.NonPersonalizedRecommender import TopPop, Random
from Recommenders.GraphBased.P3alphaRecommender import P3alphaRecommender

recommender_class = TopPop

runHyperparameterSearch_Collaborative(recommender_class,
       URM_train = URM_train,
       URM_train_last_test = URM_train_validation,
       metric_to_optimize = metric_to_optimize,
       cutoff_to_optimize = cutoff_to_optimize,
       evaluator_validation_earlystopping = evaluator_validation,
       evaluator_validation = evaluator_validation,
       evaluator_test = evaluator_test,
       output_folder_path = output_folder_path,
       parallelizeKNN = True,
       allow_weighting = True,
       resume_from_saved = True,
       save_model = "best",
       similarity_type_list = ['cosine', 'jaccard', "asymmetric", "dice", "tversky"],
       n_cases = n_cases,
       n_random_starts = n_random_starts)

SearchSingleCase: Resuming 'TopPopRecommender' Failed, no such file exists.

SearchSingleCase: Testing config: {}
TopPopRecommender: URM Detected 238 ( 6.1%) items with no interactions.
EvaluatorHoldout: Processed 6024 (100.0%) in 2.63 sec. Users per second: 2291
SearchSingleCase: New best config found. Config 0: {} - results: PRECISION: 0.1217961, PRECISION_RECALL_MIN_DEN: 0.1349623, RECALL: 0.0622819, MAP: 0.0596391, MAP_MIN_DEN: 0.0653490, MRR: 0.2875275, NDCG: 0.1124584, F1: 0.0824182, HIT_RATE: 0.5946215, ARHR_ALL_HITS: 0.4067459, NOVELTY: 0.0220221, AVERAGE_POPULARITY: 0.7784131, DIVERSITY_MEAN_INTER_LIST: 0.4049413, DIVERSITY_HERFINDAHL: 0.9404874, COVERAGE_ITEM: 0.0113314, COVERAGE_ITEM_CORRECT: 0.0100438, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.5930464, DIVERSITY_GINI: 0.0042281, SHANNON_ENTROPY: 4.2698400, RATIO_DIVERSITY_HERFINDAHL: 0.9412558, RATIO_DIVERSITY_GINI: 0.0120924, RATIO_SHANNON_ENTROPY: 0.3948849, RATIO_AVERAGE_POPULARITY: 2.8181335, RATIO_NOVELTY: 0.0

### The process is similar for content-based algorithms.
* Add the ICM data and its name (it will be used for file names)
* Remove the evaluator for earlystopping

In [27]:
from Recommenders.KNN.ItemKNNCBFRecommender import ItemKNNCBFRecommender

recommender_class = ItemKNNCBFRecommender

runHyperparameterSearch_Content(recommender_class,
       URM_train = URM_train,
       ICM_object = ICM_all,
       ICM_name = "ICM_genres",
       URM_train_last_test = URM_train_validation,
       metric_to_optimize = metric_to_optimize,
       cutoff_to_optimize = cutoff_to_optimize,
       evaluator_validation = evaluator_validation,
       evaluator_test = evaluator_test,
       output_folder_path = output_folder_path,
       parallelizeKNN = True,
       allow_weighting = True,
       resume_from_saved = True,
       save_model = "best",
       similarity_type_list = ['cosine', 'jaccard', "dice", "tversky"],
       n_cases = n_cases,
       n_random_starts = n_random_starts)

SearchBayesianSkopt: Resuming 'ItemKNNCBFRecommender_ICM_genres_cosine' Failed, no such file exists.

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 454, 'shrink': 169, 'similarity': 'cosine', 'normalize': True, 'feature_weighting': 'BM25'}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
SearchBayesianSkopt: Resuming 'ItemKNNCBFRecommender_ICM_genres_jaccard' Failed, no such file exists.

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 927, 'shrink': 987, 'similarity': 'jaccard', 'normalize': True}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
SearchBayesianSkopt: Resuming 'ItemKNNCBFRecommender_ICM_genres_dice' Failed, no such file exists.

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 750, 'shrink': 926, 'similarity': 'dice', 'normalize': True}
It


Iteration No: 2 ended. Evaluation done at random point.
Time taken: 6.8641
Function value obtained: -0.0218
Current minimum: -0.0221
Iteration No: 3 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 6, 'shrink': 608, 'similarity': 'dice', 'normalize': False}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
Similarity column 3883 (100.0%), 20947.92 column/sec. Elapsed time 0.19 sec
EvaluatorHoldout: Processed 6036 (100.0%) in 5.33 sec. Users per second: 1132
SearchBayesianSkopt: Config evaluated with evaluator_test. Config: {'topK': 347, 'shrink': 379, 'similarity': 'tversky', 'normalize': True, 'tversky_alpha': 0.8564596806756939, 'tversky_beta': 0.09091909290431067} - results:
CUTOFF: 10 - PRECISION: 0.0723492, PRECISION_RECALL_MIN_DEN: 0.0751475, RECALL: 0.0251246, MAP: 0.0311188, MAP_MIN_DEN: 0.0319933, MRR: 0.1671991, NDCG: 0.0438749, F1: 0.0372971, HIT_RATE: 0.4037442, ARHR_ALL_HITS: 0.2239745, NOVELTY: 0.028286


Similarity column 3883 (100.0%), 14454.75 column/sec. Elapsed time 0.27 sec
EvaluatorHoldout: Processed 6024 (100.0%) in 3.82 sec. Users per second: 1576
SearchBayesianSkopt: Config 3 is suboptimal. Config: {'topK': 5, 'shrink': 1000, 'similarity': 'jaccard', 'normalize': True} - results: PRECISION: 0.0497178, PRECISION_RECALL_MIN_DEN: 0.0540287, RECALL: 0.0230467, MAP: 0.0199906, MAP_MIN_DEN: 0.0214813, MRR: 0.1305180, NDCG: 0.0347948, F1: 0.0314943, HIT_RATE: 0.3343293, ARHR_ALL_HITS: 0.1601152, NOVELTY: 0.0292292, AVERAGE_POPULARITY: 0.1910602, DIVERSITY_MEAN_INTER_LIST: 0.9525647, DIVERSITY_HERFINDAHL: 0.9952407, COVERAGE_ITEM: 0.3757404, COVERAGE_ITEM_CORRECT: 0.1079063, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.3334437, DIVERSITY_GINI: 0.0765684, SHANNON_ENTROPY: 8.5402188, RATIO_DIVERSITY_HERFINDAHL: 0.9960537, RATIO_DIVERSITY_GINI: 0.2189882, RATIO_SHANNON_ENTROPY: 0.7898197, RATIO_AVERAGE_POPULARITY: 0.6917064, RATIO_NOVELTY: 0.0990397, 

Iteration No: 4 ended. Searc


Iteration No: 5 ended. Search finished for the next optimal point.
Time taken: 4.8693
Function value obtained: -0.0231
Current minimum: -0.0242
Iteration No: 6 started. Searching for the next optimal point.
SearchBayesianSkopt: Testing config: {'topK': 365, 'shrink': 0, 'similarity': 'tversky', 'normalize': True, 'tversky_alpha': 0.9451994341143322, 'tversky_beta': 1.9034035449301236}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
EvaluatorHoldout: Processed 6024 (100.0%) in 5.04 sec. Users per second: 1195
SearchBayesianSkopt: Config 3 is suboptimal. Config: {'topK': 236, 'shrink': 897, 'similarity': 'cosine', 'normalize': False, 'feature_weighting': 'TF-IDF'} - results: PRECISION: 0.0589475, PRECISION_RECALL_MIN_DEN: 0.0637218, RECALL: 0.0265087, MAP: 0.0242811, MAP_MIN_DEN: 0.0258535, MRR: 0.1472971, NDCG: 0.0402735, F1: 0.0365713, HIT_RATE: 0.3670319, ARHR_ALL_HITS: 0.1871194, NOVELTY: 0.0286932, AVERAGE_POPULARITY: 0.2084592, DIVERSITY_MEAN_INTER_LIST

Similarity column 3883 (100.0%), 8014.72 column/sec. Elapsed time 0.48 sec
EvaluatorHoldout: Processed 6024 (100.0%) in 5.49 sec. Users per second: 1098
SearchBayesianSkopt: Config 6 is suboptimal. Config: {'topK': 569, 'shrink': 999, 'similarity': 'jaccard', 'normalize': True} - results: PRECISION: 0.0538679, PRECISION_RECALL_MIN_DEN: 0.0583835, RECALL: 0.0247511, MAP: 0.0215132, MAP_MIN_DEN: 0.0229863, MRR: 0.1305522, NDCG: 0.0362937, F1: 0.0339177, HIT_RATE: 0.3389774, ARHR_ALL_HITS: 0.1661128, NOVELTY: 0.0290451, AVERAGE_POPULARITY: 0.1962369, DIVERSITY_MEAN_INTER_LIST: 0.8982177, DIVERSITY_HERFINDAHL: 0.9898069, COVERAGE_ITEM: 0.2547000, COVERAGE_ITEM_CORRECT: 0.0767448, COVERAGE_USER: 0.9973510, COVERAGE_USER_CORRECT: 0.3380795, DIVERSITY_GINI: 0.0385445, SHANNON_ENTROPY: 7.5063410, RATIO_DIVERSITY_HERFINDAHL: 0.9906155, RATIO_DIVERSITY_GINI: 0.1102384, RATIO_SHANNON_ENTROPY: 0.6942042, RATIO_AVERAGE_POPULARITY: 0.7104476, RATIO_NOVELTY: 0.0984159, 

Similarity column 3883 (100.0

SearchBayesianSkopt: Testing config: {'topK': 5, 'shrink': 1000, 'similarity': 'cosine', 'normalize': True, 'feature_weighting': 'BM25'}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
Iteration No: 9 ended. Search finished for the next optimal point.
Time taken: 7.1110
Function value obtained: -0.0230
Current minimum: -0.0256
Iteration No: 10 started. Searching for the next optimal point.
SearchBayesianSkopt: Testing config: {'topK': 32, 'shrink': 0, 'similarity': 'dice', 'normalize': True}
ItemKNNCBFRecommender: URM Detected 238 ( 6.1%) items with no interactions.
Similarity column 3883 (100.0%), 16843.04 column/sec. Elapsed time 0.23 sec
Similarity column 3883 (100.0%), 12634.34 column/sec. Elapsed time 0.31 sec
EvaluatorHoldout: Processed 6024 (100.0%) in 6.95 sec. Users per second: 867
SearchBayesianSkopt: Config 9 is suboptimal. Config: {'topK': 763, 'shrink': 885, 'similarity': 'jaccard', 'normalize': True} - results: PRECISION: 0.0548473, PRECISION_R


SearchBayesianSkopt: Saving model in result_experiments/ItemKNNCBFRecommender_ICM_genres_dice

ItemKNNCBFRecommender: Saving model in file 'result_experiments/ItemKNNCBFRecommender_ICM_genres_dice_best_model_last'
EvaluatorHoldout: Processed 6036 (100.0%) in 5.92 sec. Users per second: 1019
SearchBayesianSkopt: Best config evaluated with evaluator_test with constructor data for final test. Config: {'topK': 355, 'shrink': 176, 'similarity': 'cosine', 'normalize': False, 'feature_weighting': 'TF-IDF'} - results:
CUTOFF: 10 - PRECISION: 0.0870278, PRECISION_RECALL_MIN_DEN: 0.0900327, RECALL: 0.0293889, MAP: 0.0410426, MAP_MIN_DEN: 0.0421953, MRR: 0.2014210, NDCG: 0.0550319, F1: 0.0439396, HIT_RATE: 0.4541087, ARHR_ALL_HITS: 0.2806119, NOVELTY: 0.0286268, AVERAGE_POPULARITY: 0.2092813, DIVERSITY_MEAN_INTER_LIST: 0.8945452, DIVERSITY_HERFINDAHL: 0.9894397, COVERAGE_ITEM: 0.1820757, COVERAGE_ITEM_CORRECT: 0.0875612, COVERAGE_USER: 0.9993377, COVERAGE_USER_CORRECT: 0.4538079, DIVERSITY_GINI:
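
The "Function value obtained" and "Current minimum" lines in the log above are negative because scikit-optimize minimizes its objective, so a metric that should be maximized is presumably passed to it with the sign flipped. The sketch below illustrates that convention only; the `objective` function and the metric values are made up for the example and are not the framework's code or results.

```python
def objective(metric_value):
    # A minimizer looks for the smallest value, so a quality metric
    # we want to maximize is returned with a negative sign.
    return -metric_value


# Illustrative validation results for five configurations (not real data)
validation_metric = [0.0221, 0.0218, 0.0205, 0.0242, 0.0231]

current_minimum = float("inf")

for iteration, metric_value in enumerate(validation_metric, start=1):
    function_value = objective(metric_value)
    current_minimum = min(current_minimum, function_value)
    print("Iteration No: {} - function value: {:.4f}, current minimum: {:.4f}".format(
        iteration, function_value, current_minimum))

# The best metric value is recovered by flipping the sign back
best_metric = -current_minimum
print("Best metric value:", best_metric)
```

Under this convention, a "Current minimum" of -0.0242 corresponds to a best metric value of 0.0242 on the validation data.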

### You can use the "partial" function from functools to pre-set all the arguments that are needed, then loop over the recommender classes

In [28]:
from Recommenders.NonPersonalizedRecommender import TopPop, Random
from Recommenders.GraphBased.P3alphaRecommender import P3alphaRecommender
import os, multiprocessing
from functools import partial


runHyperparameterSearch_Collaborative_partial = partial(runHyperparameterSearch_Collaborative,
       URM_train = URM_train,
       URM_train_last_test = URM_train_validation,
       metric_to_optimize = metric_to_optimize,
       cutoff_to_optimize = cutoff_to_optimize,
       evaluator_validation_earlystopping = evaluator_validation,
       evaluator_validation = evaluator_validation,
       evaluator_test = evaluator_test,
       output_folder_path = output_folder_path,
       parallelizeKNN = True,
       allow_weighting = True,
       resume_from_saved = True,
       save_model = "best",
       similarity_type_list = ['cosine', 'jaccard', "asymmetric", "dice", "tversky"],
       n_cases = n_cases,
       n_random_starts = n_random_starts)
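
If `partial` is new to you, here is a minimal self-contained sketch of what it does. The `run_search` function is a hypothetical stand-in for the framework's search function, and the argument values are placeholders chosen for the example.

```python
from functools import partial


def run_search(recommender_name, n_cases, metric_to_optimize, output_folder_path):
    # Hypothetical stand-in for the real search function: it only
    # reports which arguments it received.
    return "{}: {} cases optimizing {} -> {}".format(
        recommender_name, n_cases, metric_to_optimize, output_folder_path)


# Fix every argument except the first one
run_search_partial = partial(run_search,
                             n_cases=10,
                             metric_to_optimize="MAP",
                             output_folder_path="result_experiments/")

# The resulting callable only needs the remaining positional argument
for recommender_name in ["TopPop", "P3alpha"]:
    print(run_search_partial(recommender_name))

# Pre-set keyword arguments can still be overridden at call time
print(run_search_partial("TopPop", n_cases=50))
```

Because the partial object takes a single positional argument, it can be passed directly to functions such as `map` that call it once per element of a list.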


In [29]:
collaborative_algorithm_list = [
    Random,
    TopPop,
    P3alphaRecommender,
]

### You can call it as part of a parallel pool, hence parallelizing the optimization of the various models

In [30]:
pool = multiprocessing.Pool(processes=int(multiprocessing.cpu_count()), maxtasksperchild=1)
pool.map(runHyperparameterSearch_Collaborative_partial, collaborative_algorithm_list)

SearchSingleCase: Resuming 'RandomRecommender' Failed, no such file exists.

SearchSingleCase: Testing config: {}
RandomRecommender: URM Detected 238 ( 6.1%) items with no interactions.
SearchBayesianSkopt: Resuming 'P3alphaRecommender' Failed, no such file exists.

Iteration No: 1 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 593, 'alpha': 1.5162625884190866, 'normalize_similarity': False}
P3alphaRecommender: URM Detected 238 ( 6.1%) items with no interactions.
SearchSingleCase: Resuming 'TopPopRecommender'... Loaded 1 configurations.
TopPopRecommender: URM Detected 212 ( 5.5%) items with no interactions.
SearchSingleCase: Resuming 'TopPopRecommender'... Result on last already available.
EvaluatorHoldout: Processed 6024 (100.0%) in 4.76 sec. Users per second: 1266
SearchSingleCase: New best config found. Config 0: {} - results: PRECISION: 0.0072709, PRECISION_RECALL_MIN_DEN: 0.0077562, RECALL: 0.0028381, MAP: 0.0022469, MAP_MIN_DEN: 0.0024


P3alphaRecommender: Saving model in file 'result_experiments/P3alphaRecommender_best_model'
P3alphaRecommender: Saving complete
Iteration No: 2 ended. Evaluation done at random point.
Time taken: 15.3414
Function value obtained: -0.1040
Current minimum: -0.1040
Iteration No: 3 started. Evaluating function at random point.
SearchBayesianSkopt: Testing config: {'topK': 18, 'alpha': 1.7608333614795886, 'normalize_similarity': False}
P3alphaRecommender: URM Detected 238 ( 6.1%) items with no interactions.
EvaluatorHoldout: Processed 6024 (100.0%) in 4.06 sec. Users per second: 1484
SearchBayesianSkopt: Config 2 is suboptimal. Config: {'topK': 18, 'alpha': 1.7608333614795886, 'normalize_similarity': False} - results: PRECISION: 0.0425299, PRECISION_RECALL_MIN_DEN: 0.0468866, RECALL: 0.0216756, MAP: 0.0173111, MAP_MIN_DEN: 0.0192689, MRR: 0.1231091, NDCG: 0.0325182, F1: 0.0287160, HIT_RATE: 0.3014608, ARHR_ALL_HITS: 0.1451413, NOVELTY: 0.0303234, AVERAGE_POPULARITY: 0.1446706, DIVERSITY_MEA


Iteration No: 8 ended. Search finished for the next optimal point.
Time taken: 10.3573
Function value obtained: -0.1004
Current minimum: -0.1078
Iteration No: 9 started. Searching for the next optimal point.
SearchBayesianSkopt: Testing config: {'topK': 5, 'alpha': 1.0204916543650424, 'normalize_similarity': True}
P3alphaRecommender: URM Detected 238 ( 6.1%) items with no interactions.
EvaluatorHoldout: Processed 6024 (100.0%) in 4.43 sec. Users per second: 1361
SearchBayesianSkopt: Config 8 is suboptimal. Config: {'topK': 5, 'alpha': 1.0204916543650424, 'normalize_similarity': True} - results: PRECISION: 0.0985060, PRECISION_RECALL_MIN_DEN: 0.1026572, RECALL: 0.0372631, MAP: 0.0555751, MAP_MIN_DEN: 0.0586253, MRR: 0.2952565, NDCG: 0.0967650, F1: 0.0540718, HIT_RATE: 0.4973440, ARHR_ALL_HITS: 0.3977612, NOVELTY: 0.0313592, AVERAGE_POPULARITY: 0.2009188, DIVERSITY_MEAN_INTER_LIST: 0.8220619, DIVERSITY_HERFINDAHL: 0.9821925, COVERAGE_ITEM: 0.2106619, COVERAGE_ITEM_CORRECT: 0.1197528, CO

[None, None, None]
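
The `[None, None, None]` printed above is simply the list of return values collected by `pool.map`, one per recommender class: the search function writes its results to disk and returns nothing. The sketch below reproduces this behaviour with a hypothetical stand-in function. It uses `multiprocessing.pool.ThreadPool`, which exposes the same `map` interface as `multiprocessing.Pool` but runs threads instead of processes, only so that the example runs anywhere without pickling concerns.

```python
from multiprocessing.pool import ThreadPool

completed = []


def fake_search(recommender_name):
    # Hypothetical stand-in for the search function: it records its work
    # as a side effect and, like the real call above, returns nothing.
    completed.append(recommender_name)


# The context manager releases the pool's workers once the block ends
with ThreadPool(processes=3) as pool:
    results = pool.map(fake_search, ["Random", "TopPop", "P3alpha"])

print(results)
print(sorted(completed))
```

With a process pool the same pattern applies, but the function and its arguments must be picklable, which is one reason the notebook builds the callable with `partial` on a module-level function.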

### Or sequentially. Since resume_from_saved = True, this new call just loads the saved results instead of re-running the search

In [31]:
import traceback

for recommender_class in collaborative_algorithm_list:

    try:
        runHyperparameterSearch_Collaborative_partial(recommender_class)

    except Exception as e:
        # Report the failure and move on to the next recommender
        print("On recommender {} Exception {}".format(recommender_class, str(e)))
        traceback.print_exc()

SearchSingleCase: Resuming 'RandomRecommender'... Loaded 1 configurations.
RandomRecommender: URM Detected 212 ( 5.5%) items with no interactions.
SearchSingleCase: Resuming 'RandomRecommender'... Result on last already available.
SearchSingleCase: Resuming 'TopPopRecommender'... Loaded 1 configurations.
TopPopRecommender: URM Detected 212 ( 5.5%) items with no interactions.
SearchSingleCase: Resuming 'TopPopRecommender'... Result on last already available.
SearchBayesianSkopt: Resuming 'P3alphaRecommender'... Loaded 10 configurations.
P3alphaRecommender: URM Detected 212 ( 5.5%) items with no interactions.
SearchBayesianSkopt: Resuming 'P3alphaRecommender'... Result on last already available.


## How to load and save a model
### The models are saved in two files:
* RECOMMENDER_NAME + "_best_model.zip": contains the model trained on the URM_train
* RECOMMENDER_NAME + "_best_model_last.zip": contains the model trained on the URM_train_validation

#### You can load them using the "load_model" function of the Recommender class. To do so:
* Create an instance of the recommender you want to load and provide the needed URM data and ICM data, if needed. The URM you pass is the one that will be used as user profiles to generate the recommendations.
* Given the instance, call the load_model function
* You can now use the recommender to generate recommendations.

#### If you want to save your own models, use the "save_model" function. It takes the same parameters as "load_model"

In [32]:
from Recommenders.GraphBased.P3alphaRecommender import P3alphaRecommender

recommender_object = P3alphaRecommender(URM_all)
recommender_object

P3alphaRecommender: URM Detected 177 ( 4.6%) items with no interactions.


<Recommenders.GraphBased.P3alphaRecommender.P3alphaRecommender at 0x7f9eeb9412e0>

In [33]:
recommender_object.load_model(output_folder_path, 
                              file_name = recommender_object.RECOMMENDER_NAME + "_best_model_last.zip" )

P3alphaRecommender: Loading model from file 'result_experiments/P3alphaRecommender_best_model_last.zip'
P3alphaRecommender: Loading complete


In [34]:
recommender_object.W_sparse

<3883x3883 sparse matrix of type '<class 'numpy.float32'>'
	with 221647 stored elements in Compressed Sparse Row format>

In [35]:
user_id = 10
recommender_object.recommend(user_id, cutoff = 10)

[48, 244, 23, 22, 40, 335, 246, 537, 350, 515]

In [36]:
recommender_object.save_model(output_folder_path, 
                              file_name = recommender_object.RECOMMENDER_NAME + "_my_own_save.zip" )

P3alphaRecommender: Saving model in file 'result_experiments/P3alphaRecommender_my_own_save.zip'
P3alphaRecommender: Saving complete
