In [1]:
import os
import sys
import numpy as np
import pandas as pd

src_path = os.path.abspath(os.path.join(os.path.curdir, "src"))
sys.path.append(src_path)

from src.dataset import read_dataset

In [2]:
df = read_dataset()

For collaborative filtering, we need to choose a single type of "item" to recommend to users. Here, we choose to recommend artists.

In [3]:
# Keep one row per (user, artist) pair, with the timestamp of the latest interaction.
user_artists_df = df.groupby(['user_id', 'artist_id']).agg(
    timestamp=pd.NamedAgg(column='timestamp', aggfunc='max'),
).reset_index()
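
To make the effect of this aggregation concrete, here is a minimal sketch on a toy listening log. The column names mirror the cell above, but `toy_df`, `toy_pairs`, and all values are made up for illustration and are not taken from the real dataset.

```python
import pandas as pd

# Toy listening log: user 1 listened to artist 10 twice.
toy_df = pd.DataFrame({
    "user_id":   [1, 1, 1, 2],
    "artist_id": [10, 10, 20, 10],
    "timestamp": pd.to_datetime(
        ["2021-01-01", "2021-03-01", "2021-02-01", "2021-01-15"]
    ),
})

# Same aggregation as above: one row per (user, artist) pair,
# keeping the most recent timestamp of that pair.
toy_pairs = toy_df.groupby(["user_id", "artist_id"]).agg(
    timestamp=pd.NamedAgg(column="timestamp", aggfunc="max"),
).reset_index()

print(toy_pairs)
```

The four raw rows collapse to three user-artist pairs, so repeated listens count as a single implicit interaction rather than as separate training examples.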

In [4]:
from src.cf_mf import MatrixFactorization

mf = MatrixFactorization(df=user_artists_df,
                         batch_size=1024,
                         embedding_size=64,
                         learning_rate=0.001,
                         regularization=1e-5)
mf.fit(epochs=10, val_epoch=1)
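
The training log reports a BPR (Bayesian Personalized Ranking) loss. As a rough illustration of what such a loss measures, the sketch below computes a standard BPR loss with NumPy for a batch of (user, positive item, negative item) embedding triples. The function `bpr_loss` and its scaling of the regularization term are assumptions for this example; the implementation in `src.cf_mf` is not shown here and may normalize or sample differently.

```python
import numpy as np

def bpr_loss(user_emb, pos_emb, neg_emb, reg=1e-5):
    """Standard BPR loss for a batch of embedding triples (illustrative only).

    Each argument has shape (batch_size, embedding_size).
    """
    # Dot-product scores for the observed (positive) and sampled (negative) items.
    pos_scores = np.sum(user_emb * pos_emb, axis=1)
    neg_scores = np.sum(user_emb * neg_emb, axis=1)

    # -log(sigmoid(x)) equals log(1 + exp(-x)); logaddexp keeps this stable.
    ranking = np.mean(np.logaddexp(0.0, -(pos_scores - neg_scores)))

    # L2 penalty on the embeddings used in this batch (scaling is an assumption).
    l2 = reg * (
        np.sum(user_emb ** 2) + np.sum(pos_emb ** 2) + np.sum(neg_emb ** 2)
    ) / len(user_emb)
    return ranking + l2

rng = np.random.default_rng(0)
users = rng.normal(size=(4, 8))
loss_good = bpr_loss(users, users, -users, reg=0.0)  # positives score higher
loss_bad = bpr_loss(users, -users, users, reg=0.0)   # negatives score higher
print(loss_good < loss_bad)
```

The loss falls as positive items are scored above negative ones, which is why a steadily decreasing value across epochs indicates that the model is learning to rank observed artists ahead of unobserved ones.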

**************************************************************
Epoch 1/10


100%|██████████| 24/24 [00:02<00:00, 11.36it/s, BPR loss=0.02816]


Validation: Recall@20=0.0020	Precision@20=0.0100	nDCG@20=0.0093
Test: Recall@20=0.0009	Precision@20=0.0027	nDCG@20=0.0023
**************************************************************
Epoch 2/10


100%|██████████| 24/24 [00:02<00:00, 11.10it/s, BPR loss=0.02717]


Validation: Recall@20=0.0073	Precision@20=0.0250	nDCG@20=0.0262
Test: Recall@20=0.0052	Precision@20=0.0162	nDCG@20=0.0170
**************************************************************
Epoch 3/10


100%|██████████| 24/24 [00:02<00:00, 10.81it/s, BPR loss=0.02587]


Validation: Recall@20=0.0080	Precision@20=0.0283	nDCG@20=0.0299
Test: Recall@20=0.0345	Precision@20=0.0284	nDCG@20=0.0368
**************************************************************
Epoch 4/10


100%|██████████| 24/24 [00:02<00:00, 11.59it/s, BPR loss=0.02453]


Validation: Recall@20=0.0067	Precision@20=0.0250	nDCG@20=0.0268
Test: Recall@20=0.0089	Precision@20=0.0351	nDCG@20=0.0383
**************************************************************
Epoch 5/10


100%|██████████| 24/24 [00:02<00:00, 11.27it/s, BPR loss=0.02256]


Validation: Recall@20=0.0058	Precision@20=0.0233	nDCG@20=0.0256
Test: Recall@20=0.0363	Precision@20=0.0378	nDCG@20=0.0448
**************************************************************
Epoch 6/10


100%|██████████| 24/24 [00:02<00:00, 11.36it/s, BPR loss=0.02072]


Validation: Recall@20=0.0102	Precision@20=0.0317	nDCG@20=0.0340
Test: Recall@20=0.0117	Precision@20=0.0405	nDCG@20=0.0421
**************************************************************
Epoch 7/10


100%|██████████| 24/24 [00:02<00:00, 11.59it/s, BPR loss=0.01907]


Validation: Recall@20=0.0136	Precision@20=0.0417	nDCG@20=0.0470
Test: Recall@20=0.0135	Precision@20=0.0446	nDCG@20=0.0482
**************************************************************
Epoch 8/10


100%|██████████| 24/24 [00:02<00:00, 11.36it/s, BPR loss=0.01707]


Validation: Recall@20=0.0148	Precision@20=0.0467	nDCG@20=0.0471
Test: Recall@20=0.0156	Precision@20=0.0486	nDCG@20=0.0527
**************************************************************
Epoch 9/10


100%|██████████| 24/24 [00:02<00:00, 11.36it/s, BPR loss=0.01577]


Validation: Recall@20=0.0146	Precision@20=0.0500	nDCG@20=0.0501
Test: Recall@20=0.0172	Precision@20=0.0527	nDCG@20=0.0577
**************************************************************
Epoch 10/10


100%|██████████| 24/24 [00:02<00:00, 11.59it/s, BPR loss=0.01443]

Validation: Recall@20=0.0131	Precision@20=0.0517	nDCG@20=0.0490
Test: Recall@20=0.0182	Precision@20=0.0541	nDCG@20=0.0556

TRAINING COMPLETE!
Best epoch: 8
Test metrics: Recall@20=0.0156	Precision@20=0.0486	nDCG@20=0.0527
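
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Recall@K, Precision@K, and nDCG@K are commonly computed for a single user with binary relevance. `metrics_at_k` is a hypothetical helper written for this example; the evaluation code in `src.cf_mf` may differ in details such as how it averages over users or handles users with no held-out items.

```python
import math

def metrics_at_k(ranked_items, relevant_items, k=20):
    """Return (recall, precision, ndcg) at cutoff k for one user."""
    top_k = ranked_items[:k]
    relevant = set(relevant_items)

    # 1 where the recommended item is in the held-out set, else 0.
    hits = [1 if item in relevant else 0 for item in top_k]
    n_hits = sum(hits)

    recall = n_hits / len(relevant) if relevant else 0.0
    precision = n_hits / k

    # DCG discounts each hit by log2(position + 1), with positions starting at 1.
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits))
    # Ideal DCG: all relevant items (up to k) ranked first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return recall, precision, ndcg

# Toy example: two held-out artists, one of them recommended in second place.
recall, precision, ndcg = metrics_at_k([5, 3, 9, 1], {3, 7}, k=2)
print(recall, precision, round(ndcg, 4))
```

Because Precision@K divides by K while Recall@K divides by the number of held-out items, the two can move in different directions between epochs, as they do in the log above.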



