## Imports
These experiments were run on Python 3.8. The package versions used are listed in requirements.txt.
- tqdm: For showing progress in loops.
- numpy and pandas: For data manipulation.
- cornac: For obtaining the recommendations.
- tensorflow: Required by cornac.
- torch: Required for the VAECF implementation of Cornac.

In [278]:
from datetime import datetime
from pathlib import Path
from logging import Formatter, StreamHandler, getLogger, INFO

from tqdm import tqdm
from cornac import Experiment
from cornac.eval_methods import RatioSplit
from cornac.metrics import NDCG, Recall, Precision
from cornac.hyperopt import Discrete, Continuous
from cornac.hyperopt import GridSearch, RandomSearch
#from cornac.models import MF, WMF, SVD, VAECF
from cornac.models import VAECF, NeuMF, MCF, SVD, WMF, SKMeans, GMF
from cornac.exception import ScoreException
from numpy import array, nan
from pandas import read_csv, DataFrame, Series

## Logger setup
Here we set up a logger that prints progress information while the script runs.

In [279]:
logger = getLogger(__name__)
logger.setLevel(INFO)

ch = StreamHandler()
ch.setLevel(INFO)
ch.setFormatter(
    Formatter('%(asctime)s - %(levelname)s - %(message)s')
)

logger.addHandler(ch)
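
Re-running this cell in the same notebook session attaches another StreamHandler to the same logger each time, so every message is then printed once per attached handler. A minimal sketch of a guard against that, assuming a hypothetical helper named `get_notebook_logger` and a made-up logger name (neither is part of the original script):

```python
from logging import Formatter, StreamHandler, getLogger, INFO


def get_notebook_logger(name: str):
    """Return a logger with exactly one stream handler (hypothetical helper)."""
    log = getLogger(name)
    log.setLevel(INFO)
    # Only attach a handler the first time; later calls reuse the existing one.
    if not log.handlers:
        handler = StreamHandler()
        handler.setLevel(INFO)
        handler.setFormatter(
            Formatter('%(asctime)s - %(levelname)s - %(message)s')
        )
        log.addHandler(handler)
    # Do not forward records to the root logger, which may print them again.
    log.propagate = False
    return log


logger = get_notebook_logger('ambar_experiment')
logger = get_notebook_logger('ambar_experiment')  # simulates re-running the cell
```

Calling the helper twice leaves a single handler attached, so each message appears once.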

## Configuration variables
We define some variables used in the rest of the experiment.

### General config
We record the current timestamp and the name of the experiment.

In [280]:
now = f'{datetime.now():%Y%m%d%H%M}'
experiment_name = 'AMBAR'

### File and dir config
We resolve the working directory with pathlib, locate the csv file that cornac will use, and define a results directory.

In [281]:
work_dir = Path('.').resolve()
data_file = work_dir / 'data' / experiment_name / 'ratings_info.csv'
results_dir = work_dir / 'results' / experiment_name / now

Here we make sure the results directory exists by creating it if it doesn't.

In [282]:
# exist_ok=True makes this a no-op when the directory is already there
results_dir.mkdir(parents=True, exist_ok=True)

We also make sure the data file exists and is a regular file. Here we could also check that the file is an actual csv file.

In [283]:
# is_file() is False for missing paths, so it covers both checks
if not data_file.is_file():
    print("Bad data file")
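
The extra csv check mentioned above could look roughly like the following. This is only a heuristic sketch: the helper name `looks_like_ratings_csv` is hypothetical, and it assumes a comma-separated file with exactly three columns (user, item, rating) encoded as UTF-8.

```python
import csv
from pathlib import Path


def looks_like_ratings_csv(path: Path, n_columns: int = 3, n_rows: int = 5) -> bool:
    """Heuristic check: .csv suffix and a consistent column count in the first rows."""
    if not path.is_file() or path.suffix.lower() != '.csv':
        return False
    with path.open(newline='', encoding='utf-8') as fh:
        reader = csv.reader(fh)
        # Read at most n_rows rows; zip stops as soon as the range is exhausted.
        rows = [row for _, row in zip(range(n_rows), reader)]
    # An empty file, or any sampled row with the wrong width, fails the check.
    return bool(rows) and all(len(row) == n_columns for row in rows)
```

It only samples the first few rows, so it catches an obviously wrong file without reading the whole dataset.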

### Dataframe config
We define the header name of each column so pandas can identify it. We also define the data type of the user, item and rating values. If the columns have different types, val_dtype can instead be a list of pandas-compatible type strings, one per column.

In [284]:
col_names = {
    'user': 'user_id',
    'item': 'track_id',
    'rating': 'rating'
}
val_dtype = 'int'

### Cornac config
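Here we set the k value used for the metrics and the exported recommendations, the test set size and the validation set size. We also decide whether to exclude unknown users and items, that is, those that do not appear in the training set, from evaluation.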

In [285]:
k = 1000
test_size = 0.2
val_size = 0.1
exclude_unknown = True

## Function setup
We set up several utility functions to be used later, mostly for exporting data and for converting it into a format compatible with cornac.

set_data_to_tuple_list takes a dict of the form {user: [item_list, rating_list]}, processes it and returns a list of tuples of the form [(user, item, rating), ...].

In [286]:
def set_data_to_tuple_list(d: dict) -> list:
    result = []
    for user in d:
        transpose = array(d[user]).T
        for t in transpose:
            result.append((user,) + tuple(t))
    return result
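
To make the transformation concrete, here is a small numpy-free sketch of the same reshaping, with zip standing in for the array transpose. The function name `to_triples` and the sample data are made up for illustration.

```python
def to_triples(d: dict) -> list:
    """Flatten {user: [item_list, rating_list]} into (user, item, rating) tuples."""
    result = []
    for user, (items, ratings) in d.items():
        # zip pairs the i-th item with the i-th rating, like transposing the 2 x n array.
        for item, rating in zip(items, ratings):
            result.append((user, item, rating))
    return result


sample = {0: [[10, 11], [5, 3]], 1: [[11], [4]]}
triples = to_triples(sample)
print(triples)  # [(0, 10, 5), (0, 11, 3), (1, 11, 4)]
```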

list_to_dict converts a list into a dict using dict comprehension and enumerate.

In [287]:
def list_to_dict(l: list) -> dict:
    return {i: v for i, v in enumerate(l)}

get_set_dataframe processes the raw data ({user: [item_list, rating_list]}), together with the item ids and user ids, and converts it into a pandas DataFrame to be exported later.

In [288]:
# Converts from the format {user: [item_list, rating_list]}
# Resulting DataFrame:
#    user_id  item_id  rating  item_idx  user_idx
# 0        0        0       5         0         0
# 1        0        1       3         1         0
# 2        1        1       4         1         1
# 3        1        2       2         2         1
def get_set_dataframe(set_data: dict, i_ids: list, u_ids: list) -> DataFrame:
    data_list = set_data_to_tuple_list(set_data)
    i_map = list_to_dict(i_ids)
    u_map = list_to_dict(u_ids)

    set_df = DataFrame(data_list,
                       columns=list(col_names.values()),
                       dtype=val_dtype)
    set_df['item_idx'] = set_df[col_names['item']]
    set_df['item'] = set_df[col_names['item']].replace(to_replace=i_map)
    set_df['user_idx'] = set_df[col_names['user']]
    set_df['user'] = set_df[col_names['user']].replace(to_replace=u_map)
    return set_df

In [289]:
logger.info('Experiment start...')
logger.info(f'{experiment_name}')
logger.info(f'{k=}')
logger.info(f'{work_dir=}')
logger.info(f'{data_file=}')
logger.info(f'{results_dir=}')

2025-01-06 00:11:28,389 - INFO - Experiment start...
2025-01-06 00:11:28,396 - INFO - AMBAR

Here we build the dataset from the data file. The file is expected to contain exactly three columns: user, item and rating, in that order. The column names and data types are the ones defined in the configuration section.

To test the script before running the full experiment, we left a commented-out filter that samples 50 users and keeps only their rows. Use it only to check that the script runs correctly from start to finish.

In [290]:
# user, item, rating
keys = ['0', '1', '2']

# Build a dict that maps each column key to its data type
if isinstance(val_dtype, str):
    d_type = {key: val_dtype for key in keys}
elif isinstance(val_dtype, list):
    d_type = dict(zip(keys, val_dtype))
else:
    logger.error('Wrong type setup. Must be a type string or a list of type strings.')
    raise SystemExit(1)

logger.info('Loading data into triplets...')
df = read_csv(
    data_file,
    header=0,
    names=['0', '1', '2']
)[['0', '1', '2']].astype(d_type)

# FOR TESTING ONLY
# Randomly select 50 unique users from the dataframe
#user_filter = Series(df['0'].unique()).sample(50).to_list()
# Keep only the rows that belong to the sampled users
#df = df[df['0'].isin(user_filter)]

data = list(df.to_records(index=False, column_dtypes=d_type))

2025-01-06 00:11:28,439 - INFO - Loading data into triplets...
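
The dtype handling in the loading cell can be factored into a small function, which makes both accepted forms of val_dtype easy to check. This is only a sketch: `build_dtype_map` is a hypothetical name, and it raises an exception on invalid input so that it can be tested in isolation.

```python
def build_dtype_map(keys: list, val_dtype) -> dict:
    """Map each column key to a pandas-compatible type string."""
    if isinstance(val_dtype, str):
        # One type shared by every column.
        return {key: val_dtype for key in keys}
    if isinstance(val_dtype, list):
        if len(val_dtype) != len(keys):
            raise ValueError('val_dtype must have one entry per column')
        # One type per column, matched by position.
        return dict(zip(keys, val_dtype))
    raise TypeError('val_dtype must be a type string or a list of type strings')


print(build_dtype_map(['0', '1', '2'], 'int'))
# {'0': 'int', '1': 'int', '2': 'int'}
print(build_dtype_map(['0', '1', '2'], ['int', 'int', 'float']))
# {'0': 'int', '1': 'int', '2': 'float'}
```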


Here we create the RatioSplit that cornac will use. It randomly splits the data into three sets: train, validation and test.

In [291]:
logger.info('Creating ratio split...')
ratio_split = RatioSplit(
    data=data,
    test_size=test_size,
    val_size=val_size,
    exclude_unknowns=exclude_unknown,
    verbose=True,
    seed=123 # Added so that results can be reproduced
)

2025-01-06 00:11:28,591 - INFO - Creating ratio split...


rating_threshold = 1.0
exclude_unknowns = True
---
Training data:
Number of users = 1288
Number of items = 28940
Number of ratings = 74517
Max rating = 5.0
Min rating = 1.0
Global mean = 1.6
---
Test data:
Number of users = 1288
Number of items = 28940
Number of ratings = 15789
Number of unknown users = 0
Number of unknown items = 0
---
Validation data:
Number of users = 1288
Number of items = 28940
Number of ratings = 7875
---
Total users = 1288
Total items = 28940
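
As a rough mental model of a seeded ratio split (this is not cornac's actual implementation), the sketch below shuffles the row indices with a fixed seed and slices them into train, validation and test parts. The function name `ratio_split_indices` is hypothetical.

```python
import random


def ratio_split_indices(n: int, test_size: float, val_size: float, seed: int = 123):
    """Shuffle range(n) reproducibly and cut it into train/val/test index lists."""
    indices = list(range(n))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    n_test = int(n * test_size)
    n_val = int(n * val_size)
    n_train = n - n_test - n_val
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = ratio_split_indices(100, test_size=0.2, val_size=0.1)
print(len(train_idx), len(val_idx), len(test_idx))  # 70 10 20
```

Because unknown users and items are excluded from evaluation in this experiment, the test and validation sets can end up smaller than the nominal fractions.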


We define the metrics here. In this experiment we use NDCG, Recall and Precision, all computed at the k defined in the configuration.

In [292]:
metrics = [
    NDCG(k),
    Recall(k),
    Precision(k)
]
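
For intuition about what these three metrics measure, here is a plain-Python sketch for a single user with binary relevance. It follows common textbook definitions, which may differ in detail from cornac's implementation; the function name and the toy data are made up.

```python
from math import log2


def precision_recall_ndcg_at_k(ranked: list, relevant: set, k: int):
    """Binary-relevance Precision@k, Recall@k and NDCG@k for one user."""
    top_k = ranked[:k]
    hits = [1 if item in relevant else 0 for item in top_k]
    n_hits = sum(hits)
    precision = n_hits / k
    recall = n_hits / len(relevant) if relevant else 0.0
    # DCG discounts each hit by log2(rank + 1), with ranks starting at 1.
    dcg = sum(hit / log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    # Ideal DCG: all relevant items (at most k of them) placed at the top.
    idcg = sum(1 / log2(rank + 1) for rank in range(1, min(k, len(relevant)) + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return precision, recall, ndcg


p, r, n = precision_recall_ndcg_at_k(['a', 'b', 'c', 'd'], {'a', 'c', 'e'}, k=4)
print(round(p, 4), round(r, 4), round(n, 4))  # 0.5 0.6667 0.7039
```

Under this definition, Precision@k can never exceed the number of relevant items divided by k, so very small precision values are expected when k is much larger than the number of held-out ratings per user.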

We also define the models, using previously obtained parameters. A hyperparameter search could be set up here instead; in that case it is important to still leave a models variable holding that configuration, so cornac can pick up the list and run both the computation and the export of the recommendations.

Because this script assumes a list of models whose parameters are already fixed, any best parameters found by cornac have to be exported after running the experiment.

## Base models to compute

In [293]:
base_vaecf = VAECF(
    name='vaecf_default',
    k=k,
    autoencoder_structure=[20],
    act_fn="tanh",
    likelihood="mult",
    n_epochs=100,
    batch_size=100,
    learning_rate=0.001,
    beta=1.0,
    use_gpu=True,
    verbose=True)

base_vaecf_a32 = VAECF(
    name='vaecf_a32',
    k=k,
    autoencoder_structure=[32, 16],  # Deeper network for more complex patterns
    act_fn="relu",                   # ReLU often performs better in deep networks
    likelihood="mult",
    n_epochs=200,                    # More epochs for better convergence
    batch_size=256,                  # Larger batch size for faster training
    learning_rate=0.0005,           # Lower learning rate for stability
    beta=0.8,                       # Slightly lower beta to reduce KL impact
    use_gpu=True,
    verbose=True)

base_vaecf_a64 = VAECF(
    name='vaecf_a64',
    k=k,
    autoencoder_structure=[64],      # Wider single layer for more representation capacity
    act_fn="sigmoid",               # Sigmoid for smoother gradients
    likelihood="mult",
    n_epochs=150,                   # Balanced number of epochs
    batch_size=64,                  # Smaller batch size for better generalization
    learning_rate=0.002,           # Higher learning rate with sigmoid
    beta=1.2,                      # Higher beta for stronger regularization
    use_gpu=True,
    verbose=True)

# Do not use yet; still needs testing
base_mcf = MCF(
    k=k,
    max_iter=100,
    learning_rate=0.001,
    gamma=0.9,
    lamda=0.001,
    verbose=True,
    # init_params=any
    # context information is still missing
)

base_neumf = NeuMF(
    name='neumf_default',
    num_factors=8,
    layers=(64, 32, 16, 8),
    act_fn="relu",
    reg=0,
    num_epochs=20,
    batch_size=256,
    num_neg=4,
    lr=0.001,
    learner="adam",
    backend="tensorflow",
    verbose=True
)

base_neumf_deep = NeuMF(
    name='neumf_deep',
    num_factors=16,              # Increased to capture more latent factors
    layers=(128, 64, 32, 16),    # Larger network to capture more complex patterns
    act_fn="tanh",               # Tanh for better gradients in deep layers
    reg=0.01,                    # Regularization to avoid overfitting
    num_epochs=50,               # More epochs for better convergence
    batch_size=128,              # Smaller batch size for better generalization
    num_neg=8,                   # More negative samples for better discrimination
    lr=0.0005,                   # Lower learning rate for stability
    learner="rmsprop",           # RMSprop for better gradient handling
    backend="tensorflow",
    verbose=True
)

# Experiment 3: configuration balancing fairness and performance
base_neumf_fair = NeuMF(
    name='neumf_fair',
    num_factors=32,              # Higher dimensionality for a richer representation
    layers=(256, 128, 64),       # Wider but shallower layers
    act_fn="elu",                # ELU for better handling of biases
    reg=0.05,                    # Stronger regularization to reduce biases
    num_epochs=30,               # Balance between training and overfitting
    batch_size=512,              # Large batch size for better gradient estimates
    num_neg=6,                   # Balanced number of negative samples
    lr=0.001,                    # Standard learning rate
    learner="adam",              # Adam for automatic adaptation
    backend="tensorflow",
    verbose=True
)

base_svd = SVD(
    max_iter=5,
    k=k,
    early_stop=True,
    verbose=True,
    lambda_reg=0.0001,
    learning_rate=0.0001
)

base_skm = SKMeans(
    k=5,
    max_iter=100,
    tol=1e-6,
    verbose=True
)

base_gmf = GMF(
    num_factors=32,
    reg=0.1,
    num_epochs=50,
    batch_size=64,
    num_neg=8,
    lr=0.0005,
    learner='adagrad',
    verbose=True
)


In [294]:
# Models passed to cornac to run the experiments
models = [
   base_neumf,
   base_neumf_deep,
   base_neumf_fair
]

In [295]:
# Get the total number of users in the training split
total_users = ratio_split.train_set.num_users
# Get the total number of items in the training split
total_items = ratio_split.train_set.num_items
logger.info(f'{total_users=}')
logger.info(f'{total_items=}')

2025-01-06 00:11:29,244 - INFO - total_users=1288
2025-01-06 00:11:29,254 - INFO - total_items=28940

After setting up the metrics and models, we export the test, train and validation data into the results directory.

In [296]:
logger.info('Exporting test data...')
get_set_dataframe(
    dict(ratio_split.test_set.user_data),
    list(ratio_split.test_set.item_ids),
    list(ratio_split.test_set.user_ids),
).to_csv(results_dir / 'test_set.csv')

2025-01-06 00:11:29,275 - INFO - Exporting test data...


In [297]:
logger.info('Exporting train data...')
get_set_dataframe(
    dict(ratio_split.train_set.user_data),
    list(ratio_split.train_set.item_ids),
    list(ratio_split.train_set.user_ids),
).to_csv(results_dir / 'train_set.csv')

2025-01-06 00:11:31,240 - INFO - Exporting train data...


In [298]:
logger.info('Exporting validation data...')
get_set_dataframe(
    dict(ratio_split.val_set.user_data),
    list(ratio_split.val_set.item_ids),
    list(ratio_split.val_set.user_ids),
).to_csv(results_dir / 'val_set.csv')

2025-01-06 00:11:35,454 - INFO - Exporting validation data...


Then we run the experiment with the split, models and metrics defined above.

In [299]:
logger.info('Running experiment...')
exp = Experiment(
    eval_method=ratio_split,
    models=models,
    metrics=metrics,
    user_based=True,
)
exp.run()

#print(rs_mcf.best_params)

2025-01-06 00:11:37,553 - INFO - Running experiment...



[neumf_default] Training started!


100%|██████████| 20/20 [01:31<00:00,  4.57s/it, loss=0.0671]



[neumf_default] Evaluation started!


Ranking: 100%|██████████| 1262/1262 [00:10<00:00, 125.97it/s]
Ranking: 100%|██████████| 1219/1219 [00:09<00:00, 132.33it/s]



[neumf_deep] Training started!


100%|██████████| 50/50 [10:08<00:00, 12.17s/it, loss=0.349]



[neumf_deep] Evaluation started!


Ranking: 100%|██████████| 1262/1262 [00:17<00:00, 71.39it/s]
Ranking: 100%|██████████| 1219/1219 [00:17<00:00, 70.28it/s]



[neumf_fair] Training started!


100%|██████████| 30/30 [04:15<00:00,  8.50s/it, loss=0.41]



[neumf_fair] Evaluation started!


Ranking: 100%|██████████| 1262/1262 [00:37<00:00, 33.71it/s]
Ranking: 100%|██████████| 1219/1219 [00:36<00:00, 33.84it/s]


VALIDATION:
...
              | NDCG@1000 | Precision@1000 | Recall@1000 | Time (s)
------------- + --------- + -------------- + ----------- + --------
neumf_default |    0.1828 |         0.0038 |      0.5194 |   9.2150
neumf_deep    |    0.0060 |         0.0002 |      0.0279 |  17.3480
neumf_fair    |    0.0060 |         0.0002 |      0.0279 |  36.0240

TEST:
...
              | NDCG@1000 | Precision@1000 | Recall@1000 | Train (s) | Test (s)
------------- + --------- + -------------- + ----------- + --------- + --------
neumf_default |    0.2328 |         0.0075 |      0.5145 |   92.0966 |  10.0270
neumf_deep    |    0.0075 |         0.0003 |      0.0273 |  608.9439 |  17.6795
neumf_fair    |    0.0075 |         0.0003 |      0.0273 |  255.4582 |  37.4410






After running the experiment, we export the metrics obtained from the calculation into a csv using pandas.

In [300]:
logger.info('Exporting metrics...')
metric_results = {
    exp.models[i].name: dict(exp.result[i].metric_avg_results)
    for i in range(len(models))
}
(DataFrame(metric_results)
 .reset_index()
 .rename(columns={'index': 'metric'})
 .to_csv(results_dir / 'metric_results.csv'))

2025-01-06 00:29:41,842 - INFO - Exporting metrics...


Finally, we export the recommendations using a set of nested loops:
- We loop over the models of the experiment.
- We loop over cornac's user map to get both the original id and cornac's internal index of each user.
- We get the scores for the user, skipping users the model cannot score.
- We get the top k items by combining argsort with a reversal of the result.
- We loop over cornac's item map to get both the original id and cornac's internal index of each item, skipping items outside the top k.
- We read the score obtained from cornac, or nan in case of an IndexError.
- We append the user and item ids and indexes, together with the score, to the result list.
- After all the loops for a model have finished, we export its data to a csv file.

In [301]:
logger.info('Processing models...')
for model in exp.models:
    model_result = []
    logger.info(f'Getting scores for {model.name}...')

    for user_id, user_index in tqdm(exp.eval_method.train_set.uid_map.items()):
        try:
            scores = model.score(user_index)
        except ScoreException:
            logger.error(f"{model.name}: Couldn't predict for user {user_index} ({user_id=})")
            continue

        # Indices of the k highest scores, kept in a set for fast membership tests
        top_items = set(scores.argsort()[::-1][:k].tolist())

        for item_id, item_index in exp.eval_method.train_set.iid_map.items():
            if item_index not in top_items:
                continue

            try:
                score = scores[item_index]
            except IndexError:
                logger.error(
                    f"{model.name}: No score for item {item_index} ({item_id=}) in user {user_index} ({user_id=})"
                )
                score = nan

            model_result.append({
                'user_id': user_id,
                'user_index': user_index,
                'item_id': item_id,
                'item_index': item_index,
                'score': score
            })

    logger.info(f'Exporting {model.name}...')
    (DataFrame(model_result)
     .sort_values(by=['user_id', 'score'], ascending=[True, False])
     .to_csv(results_dir / f'{model.name}.csv'))

2025-01-06 00:29:41,873 - INFO - Processing models...
2025-01-06 00:29:42,206 - INFO - Getting scores for neumf_default...
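
The per-user part of the export (pick the top k scores, then map internal indexes back to original ids) can be sketched in plain Python as follows. `top_k_with_ids` and the toy data are hypothetical, and `sorted` stands in for the numpy argsort used in the script.

```python
def top_k_with_ids(scores: list, iid_map: dict, k: int) -> list:
    """Return (item_id, item_index, score) for the k highest-scoring items."""
    # Invert the id -> index map once so each lookup is O(1).
    index_to_id = {index: item_id for item_id, index in iid_map.items()}
    # Indices sorted by descending score, the plain-Python analogue of a reversed argsort.
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return [(index_to_id[i], i, scores[i]) for i in ranked[:k] if i in index_to_id]


iid_map = {'track_x': 0, 'track_y': 1, 'track_z': 2}
print(top_k_with_ids([0.1, 0.9, 0.5], iid_map, k=2))
# [('track_y', 1, 0.9), ('track_z', 2, 0.5)]
```

Inverting the map up front avoids scanning every item for every user, which matters when there are tens of thousands of items.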