n=5   | total interactions (user, item, label): 16021685  | unique users: 93971 | unique items: 55011 <br>
n=15 | total interactions (user, item, label): 15797582  | unique users: 86338 | unique items: 26937


[DeepFM](https://recbole.io/docs/user_guide/model/context/deepfm.html)

In [14]:
from recbole.config import Config
from recbole.data import create_dataset, data_preparation
from recbole.model.context_aware_recommender import DeepFM
from recbole.trainer import Trainer
from recbole.utils import init_seed, init_logger
import torch
import logging
from recbole.quick_start import load_data_and_model

# Sanity check: use Apple's MPS backend if it is available, otherwise fall back to CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")


Using device: mps


[custom model doc](https://recbole.io/docs/developer_guide/customize_models.html)

In [2]:
config_dict = {
    'epochs': 10,
    'data_path': '/Users/giulia/Desktop/tesi/',
    'dataset': 'mind_small15',
    #'load_col': {
    #    'inter': ['user_id', 'item_id', 'label', 'timestamp'],
    #    'useremb': ['uid', 'user_emb']
    #},
    'load_col': None,  # per the docs: if load_col is None, all existing atomic files are loaded
    'eval_args': {
        'split': {'RS': [0.8, 0.1, 0.1]},
        'group_by': 'user',
        'order': 'RO',
        'mode': 'labeled'},
    'model': 'DeepFM',
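    'learning_rate': 0.001,
    'device': device,  # has no effect (still runs on CPU); RecBole seems to choose the device itself from use_gpu/gpu_id
    'embedding_size': 32,  # 64 -> kernel dies
    'train_batch_size': 32,  # 64 -> kernel dies
    'eval_batch_size': 32,
    'l2_reg': 0.001,  # apparently ignored: the logged config shows weight_decay = 0.0
    'early_stopping_patience': 5,  # apparently ignored: the logged config shows stopping_step = 10
    'early_stopping_metric': 'AUC',  # the logged config reports this setting as valid_metric = AUC
    'checkpoint_dir': './saved',
    'log_level': 'DEBUG',  # apparently ignored: the logged config shows state = INFO
    'seed': 42,
    'reproducibility': True,
    # ranking metrics ("MRR", "NDCG", "Precision", "Recall", "F1") could not be used with DeepFM here
    'metrics': ["AUC", "MAE", "RMSE", "LogLoss"]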
}
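
As a rough illustration of what the `eval_args` above ask for (`RS` ratios 0.8/0.1/0.1, `group_by: user`, `order: RO`), the sketch below shuffles each user's interactions and cuts them by ratio. It is plain Python, not RecBole's implementation; the helper name `split_by_user` and its rounding rule (leftover rows go to test) are assumptions made for the example.

```python
import random
from collections import defaultdict


def split_by_user(interactions, ratios=(0.8, 0.1, 0.1), seed=42):
    """Hypothetical helper: per-user random ratio split (not RecBole's code).

    interactions: iterable of (user, item, label) tuples.
    Returns (train, valid, test) lists.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in interactions:
        by_user[row[0]].append(row)

    train, valid, test = [], [], []
    for user in sorted(by_user):  # fixed user order keeps the split deterministic
        rows = by_user[user]
        rng.shuffle(rows)  # 'RO': random ordering within each user
        n_train = int(len(rows) * ratios[0])
        n_valid = int(len(rows) * ratios[1])
        train.extend(rows[:n_train])
        valid.extend(rows[n_train:n_train + n_valid])
        test.extend(rows[n_train + n_valid:])  # remainder goes to test
    return train, valid, test


# Toy data: two users with ten interactions each.
toy = [(u, i, i % 2) for u in ("u1", "u2") for i in range(10)]
train, valid, test = split_by_user(toy)
print(len(train), len(valid), len(test))
```

Because the shuffle is seeded, calling the helper twice with the same seed gives the same three lists, which is the behaviour one would want from a fixed `seed` in the config.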


[handler doc](https://docs.python.org/3/library/logging.handlers.html)

In [3]:
config = Config(model='DeepFM', dataset=config_dict['dataset'], config_dict=config_dict)
init_seed(config['seed'], config['reproducibility'])
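# ------------ logger
init_logger(config)
logger = logging.getLogger()
# ------------ handler
# Extra console handler on the root logger. Records then print twice in later
# cells (once with a timestamp, once bare), which suggests init_logger already
# attaches a console handler of its own.
c_handler = logging.StreamHandler()
c_handler.setLevel(logging.DEBUG)
logger.addHandler(c_handler)
logger.info(config)
# ------------ data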
dataset = create_dataset(config)
logger.info(dataset)

21 Apr 19:38    INFO  
General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 42
state = INFO
reproducibility = True
data_path = /Users/giulia/Desktop/tesi/mind_small15
checkpoint_dir = ./saved
show_progress = True
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False

Training Hyper Parameters:
epochs = 10
train_batch_size = 32
learner = adam
learning_rate = 0.001
train_neg_sample_args = {'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}
eval_step = 1
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4

Evaluation Hyper Parameters:
eval_args = {'split': {'RS': [0.8, 0.1, 0.1]}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'labeled', 'test': 'labeled'}}
repeatable = False
metrics = ['AUC', 'MAE', 'RMSE', 'LogLoss']
topk = [10]
valid_metric = AUC
valid_metric_bigger = True
eval_batch_size = 32
metric_decimal_place = 4

Dataset Hype

In [4]:
train_data, valid_data, test_data = data_preparation(config, dataset)
#------------model
model = DeepFM(config, train_data.dataset).to(config['device'])
logger.info(model)


21 Apr 19:39    INFO  [Training]: train_batch_size = [32] train_neg_sample_args: [{'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}]
[Training]: train_batch_size = [32] train_neg_sample_args: [{'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}]
21 Apr 19:39    INFO  [Evaluation]: eval_batch_size = [32] eval_args: [{'split': {'RS': [0.8, 0.1, 0.1]}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'labeled', 'test': 'labeled'}}]
[Evaluation]: eval_batch_size = [32] eval_args: [{'split': {'RS': [0.8, 0.1, 0.1]}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'labeled', 'test': 'labeled'}}]
21 Apr 19:39    INFO  DeepFM(
  (token_embedding_table): FMEmbedding(
    (embedding): Embedding(113277, 32)
  )
  (first_order_linear): FMFirstOrderLinear(
    (token_embedding_table): FMEmbedding(
      (embedding): Embedding(113277, 1)
    )
  )
  (fm): BaseFactorizationMachine()
  (mlp_l
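
Each record in the output above is printed twice, once with a timestamp and once bare, which is what one would expect if two console handlers sit on the same logger. A minimal sketch of a guard against that, using only the standard `logging` module; `add_stream_handler_once` is a hypothetical helper name, and the assumption that `init_logger` already installs a console handler is inferred from the output, not checked against RecBole's source.

```python
import logging


def add_stream_handler_once(logger, level=logging.DEBUG):
    """Hypothetical helper: attach a console handler only if none is present."""
    for handler in logger.handlers:
        # FileHandler subclasses StreamHandler, so exclude it explicitly.
        if isinstance(handler, logging.StreamHandler) and not isinstance(
            handler, logging.FileHandler
        ):
            return handler
    handler = logging.StreamHandler()
    handler.setLevel(level)
    logger.addHandler(handler)
    return handler


demo_logger = logging.getLogger("handler_demo")
first = add_stream_handler_once(demo_logger)
second = add_stream_handler_once(demo_logger)  # reuses the existing handler
print(len(demo_logger.handlers))
```

In the notebook this would be called on `logging.getLogger()` after `init_logger(config)` instead of adding `c_handler` unconditionally.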

In [15]:
#LOAD MODEL
checkpoint_path = './saved/DeepFM-Apr-21-2024_16-02-34.pth'  # file name is reported at the end of the corresponding log file

config, model, dataset, train_data, valid_data, test_data = load_data_and_model(
    model_file=checkpoint_path,
)

21 Apr 20:56    INFO  
General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 42
state = INFO
reproducibility = True
data_path = /Users/giulia/Desktop/tesi/mind_small15
checkpoint_dir = ./saved
show_progress = True
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False

Training Hyper Parameters:
epochs = 10
train_batch_size = 32
learner = adam
learning_rate = 0.001
train_neg_sample_args = {'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}
eval_step = 1
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4

Evaluation Hyper Parameters:
eval_args = {'split': {'RS': [0.8, 0.1, 0.1]}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'labeled', 'test': 'labeled'}}
repeatable = False
metrics = ['AUC', 'MAE', 'RMSE', 'LogLoss']
topk = [10]
valid_metric = AUC
valid_metric_bigger = True
eval_batch_size = 32
metric_decimal_place = 4

Dataset Hype

In [16]:
trainer = Trainer(config, model)    

In [17]:
# training progress should now show up in the log output
best_valid_score, best_valid_result = trainer.fit(train_data, valid_data, saved=True) 

21 Apr 21:59    INFO  epoch 0 training [time: 3697.44s, train loss: 72543.7085]
epoch 0 training [time: 3697.44s, train loss: 72543.7085]
21 Apr 22:00    INFO  epoch 0 evaluating [time: 35.19s, valid_score: 0.985100]
epoch 0 evaluating [time: 35.19s, valid_score: 0.985100]
21 Apr 22:00    INFO  valid result: 
auc : 0.9851    mae : 0.0338    rmse : 0.166    logloss : 0.3216
valid result: 
auc : 0.9851    mae : 0.0338    rmse : 0.166    logloss : 0.3216
21 Apr 22:00    INFO  Saving current: ./saved/DeepFM-Apr-21-2024_20-58-12.pth
Saving current: ./saved/DeepFM-Apr-21-2024_20-58-12.pth
21 Apr 23:01    INFO  epoch 1 training [time: 3672.91s, train loss: 84947.7648]
epoch 1 training [time: 3672.91s, train loss: 84947.7648]
21 Apr 23:02    INFO  epoch 1 evaluating [time: 36.01s, valid_score: 0.984800]
epoch 1 evaluating [time: 36.01s, valid_score: 0.984800]
21 Apr 23:02    INFO  valid result: 
auc : 0.9848    mae : 0.0337    rmse : 0.1697    logloss : 0.3823
valid result: 
auc : 0.9848    ma

In [18]:
test_result = trainer.evaluate(test_data)
logger.info(test_result)

22 Apr 07:20    INFO  Loading model structure and parameters from ./saved/DeepFM-Apr-21-2024_20-58-12.pth
Loading model structure and parameters from ./saved/DeepFM-Apr-21-2024_20-58-12.pth
22 Apr 07:20    INFO  OrderedDict([('auc', 0.985), ('mae', 0.034), ('rmse', 0.1666), ('logloss', 0.3239)])
OrderedDict([('auc', 0.985), ('mae', 0.034), ('rmse', 0.1666), ('logloss', 0.3239)])
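
For reference, the four value-based metrics reported above can be computed from binary labels and predicted probabilities as in the sketch below. It is plain Python on made-up toy numbers, not RecBole's evaluator, and the pairwise AUC is only practical for small inputs.

```python
import math


def value_metrics(labels, scores, eps=1e-15):
    """Toy implementations of AUC, MAE, RMSE and LogLoss for binary labels."""
    n = len(labels)
    mae = sum(abs(y - p) for y, p in zip(labels, scores)) / n
    rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(labels, scores)) / n)
    # Clip probabilities away from 0 and 1 so the logarithm stays finite.
    logloss = -sum(
        y * math.log(min(max(p, eps), 1 - eps))
        + (1 - y) * math.log(1 - min(max(p, eps), 1 - eps))
        for y, p in zip(labels, scores)
    ) / n

    # AUC as the fraction of (positive, negative) pairs ranked correctly;
    # ties count as half. O(P * N), fine for a toy example only.
    pos = [p for y, p in zip(labels, scores) if y == 1]
    neg = [p for y, p in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))
    return {"auc": auc, "mae": mae, "rmse": rmse, "logloss": logloss}


labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7]
print(value_metrics(labels, scores))
```

AUC depends only on how the scores rank positives against negatives, while MAE, RMSE and LogLoss depend on the predicted values themselves, so the two kinds of number can move independently.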


My issues/questions:
- <s>[SPLIT DATA ISSUE](https://chat.openai.com/share/699f3143-fe80-4a2a-8e75-eb5392212b80) ---> [Modify/extend sampler?](https://recbole.io/docs/developer_guide/customize_samplers.html) // alternatively, merge train+val and let RecBole do the split --> [SEE THIS](https://recbole.io/docs/recbole/recbole.quick_start.quick_start.html#recbole.quick_start.quick_start.load_data_and_model) -> </s> [(LRS) to use pre-configured train/validation/test data splits](https://github.com/RUCAIBox/RecBole/pull/1950): doesn't work :(
- [Value-based metrics and ranking-based metrics cannot be used together](https://recbole.io/docs/user_guide/train_eval_intro.html) -> two runs per model?

- Can't get it to run on MPS, only on CPU
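
One way the "two runs per model" idea could be organised is to derive two config dicts from a shared base, one per metric family. The sketch below only builds the dicts; `make_run_configs` is a hypothetical helper, the metric names are the ones listed in the config cell above, and the `'full'` mode and `NDCG@10` choices for the ranking run are assumptions. Whether the ranking run actually works for DeepFM on this dataset is not verified here.

```python
import copy

VALUE_METRICS = ["AUC", "MAE", "RMSE", "LogLoss"]
RANKING_METRICS = ["MRR", "NDCG", "Precision", "Recall"]


def make_run_configs(base_config):
    """Hypothetical helper: derive one config per metric family."""
    value_cfg = copy.deepcopy(base_config)
    value_cfg["metrics"] = list(VALUE_METRICS)
    value_cfg["valid_metric"] = "AUC"
    value_cfg.setdefault("eval_args", {})["mode"] = "labeled"

    ranking_cfg = copy.deepcopy(base_config)
    ranking_cfg["metrics"] = list(RANKING_METRICS)
    ranking_cfg["topk"] = [10]
    ranking_cfg["valid_metric"] = "NDCG@10"  # assumption
    ranking_cfg.setdefault("eval_args", {})["mode"] = "full"  # assumption
    return value_cfg, ranking_cfg


base = {"model": "DeepFM", "epochs": 10, "eval_args": {"order": "RO"}}
value_cfg, ranking_cfg = make_run_configs(base)
print(value_cfg["metrics"], ranking_cfg["metrics"])
```

Deep-copying the base keeps the two runs from sharing the nested `eval_args` dict, so changing one run's mode cannot leak into the other.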

In [None]:
# LRS (kept inside a string so the cell does not execute it)
"""config_dict = {
    'data_path': './mind',
    'dataset': 'mind',
    'eval_args': {
        'split': {'LRS': None},
        'order': 'RO',    # not relevant
        'group_by': '-',  # not relevant
        'mode': 'full'    # Train, validation, test split
    }
}

config = Config(model='DMF', dataset=config_dict['dataset'], config_dict=config_dict)
dataset = create_dataset(config)
"""
#

-----------------------------------------------------------------------------------------


just personal notes & useful links:<br>
[ADD-> TF-IDF EMBEDDING](https://recbole.io/docs/user_guide/usage/load_pretrained_embedding.html)

[kaggle example to get prediction](https://www.kaggle.com/code/astrung/recbole-lstm-sequential-for-recomendation-tutorial#4.-Create-recommendation-result-from-trained-model)



[MODELS link](https://recbole.io/docs/user_guide/model_intro.html)

Higher FLOPs -> higher model complexity and more computation per forward pass



save_dataset (bool): Determines whether the processed dataset is saved to disk. This can be useful for large datasets that take a long time to preprocess, as it allows for quicker loading in subsequent runs.

[training hyperparam.](https://recbole.io/docs/user_guide/config/training_settings.html)

[clip_grad_norm](https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_norm_.html): This parameter guards against exploding gradients by clipping the parameters' gradients so that their overall norm does not exceed a specified value. If set to None (as in the logged config above), no clipping is applied. If you specify a number (e.g., 5.0), the gradients are rescaled whenever their norm exceeds it. Gradient clipping can be crucial for stabilizing the training of **deep learning models**.
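
A small worked example of the clipping rule, in plain Python on a flat list of gradient values. It follows the same idea as `torch.nn.utils.clip_grad_norm_` (scale by `max_norm / (total_norm + eps)`, never above 1), but `clip_by_global_norm` is a hypothetical stand-in written for illustration, not the PyTorch function.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Shrink only, never enlarge: the coefficient is capped at 1.
    coef = min(1.0, max_norm / (total_norm + eps))
    return [g * coef for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(norm_before, norm_after)
```

The direction of the gradient is preserved; only its length changes, and gradients whose norm is already below `max_norm` pass through untouched.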

[eval hyperparam](https://recbole.io/docs/user_guide/config/evaluation_settings.html)
`reproducibility` vs `repeatable` (args):

**Reproducibility**: as far as I can tell from `init_seed(seed, reproducibility)`, the seed is applied to the underlying libraries (Python's `random`, NumPy, PyTorch) in either case; setting `reproducibility` to True additionally asks PyTorch/cuDNN for deterministic behaviour, possibly at some cost in speed. Together with a fixed `seed`, this is meant to keep every random aspect of the computation, from the way data is split to the initialization of model parameters, consistent across runs.

**Repeatable**: according to the evaluation settings doc linked above, this decides whether evaluation uses a repeatable recommendation scene, i.e. (as I understand it) whether items a user has already interacted with may be recommended again. It is not about run-to-run variability of the metrics; that is what `seed` and `reproducibility` are for. The logged config above shows `repeatable = False`.


[data hyp](https://recbole.io/docs/user_guide/config/data_settings.html)