## Import

In [1]:
# standard library, third-party, and Recommenders imports
import sys

import itertools
import logging
import os

import numpy as np
import pandas as pd
import papermill as pm

from recommenders.datasets import movielens
from recommenders.datasets.python_splitters import python_stratified_split
from recommenders.evaluation.python_evaluation import map_at_k, ndcg_at_k, precision_at_k, recall_at_k, logloss
from recommenders.models.sar import SAR

print("System version: {}".format(sys.version))
print("Pandas version: {}".format(pd.__version__))

System version: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)]
Pandas version: 1.4.4


In [2]:
# top k items to recommend
TOP_K = 10

## Load data

In [3]:
os.getcwd()

'C:\\Users\\Administrator\\Documents\\GitHub\\recommender_systems\\KIS_recommenders'

In [4]:
# build the dataset folder path; the trailing "" keeps the final separator
data_path = os.path.join(os.getcwd(), "datasets", "")

In [5]:
sheet_1 = pd.read_excel(data_path + 'recommender_base.xlsx', sheet_name = 0)
sheet_2 = pd.read_excel(data_path + 'recommender_base.xlsx', sheet_name = 1)

reco = pd.concat([sheet_1, sheet_2])
del sheet_1, sheet_2

* Compute the rating: each category's share of the user's total count (`CNT` / `TOT`)

In [6]:
def portion(x):
    return x / x.sum()

In [7]:
sum_df = pd.DataFrame(reco.groupby(['FAKE_CANO'])['CNT'].sum()).reset_index()
sum_df.columns = ['FAKE_CANO', 'TOT']

In [8]:
reco = pd.merge(reco, sum_df, on = 'FAKE_CANO', how = 'inner')
reco['Rating'] = reco['CNT'] / reco['TOT']

In [9]:
reco['Rating'] = reco['Rating'].astype(np.float32)

In [10]:
reco.head()

Unnamed: 0,FAKE_CANO,PRDT_TYPE_CD,CATEGORY,CNT,TOT,Rating
0,753457,300,주식,64,77,0.831169
1,753457,512,해외주식-NASD,5,77,0.064935
2,753457,513,해외주식-NYSE,4,77,0.051948
3,753457,529,해외주식-AMEX,4,77,0.051948
4,754560,200,신탁,76,162,0.469136
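
The merge-based rating above can also be written in one step. The following is a minimal sketch, assuming the same column names; it reuses the `portion` helper with `groupby(...).transform`. The first user's rows are copied from the output above, while the second user (id 2) and the `toy` frame itself are made up for illustration.

```python
import pandas as pd

# toy rows: user 753457 comes from the head() output above,
# user 2 is hypothetical and only here to show the per-user grouping
toy = pd.DataFrame({
    "FAKE_CANO": [753457, 753457, 753457, 753457, 2, 2],
    "CATEGORY": ["주식", "해외주식-NASD", "해외주식-NYSE", "해외주식-AMEX", "신탁", "펀드"],
    "CNT": [64, 5, 4, 4, 30, 10],
})

def portion(x):
    # share of each value within its group
    return x / x.sum()

# apply portion() within each user, so ratings sum to 1 per user
toy["Rating"] = (
    toy.groupby("FAKE_CANO")["CNT"].transform(portion).astype("float32")
)
print(toy)
```

Because the rating is a share, each user's ratings add up to 1 regardless of how active that user is.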


### 3.2 Split the data using the Python stratified splitter provided in utilities

We split the full dataset into `train` and `test` sets so that the algorithm can be evaluated against held-out data not seen during training. Because SAR generates recommendations based on user preferences, every user in the test set must also exist in the training set. For this case, we use the provided `python_stratified_split` function, which holds out a percentage of items from each user (here 25%) while ensuring that all users appear in both `train` and `test`. Other splitters in the `recommenders.datasets.python_splitters` module give more control over how the split occurs.
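
To make the idea of a per-user hold-out concrete, here is a minimal sketch in plain pandas. It is not the implementation of `python_stratified_split`, only an illustration of the principle; the `toy` frame, its values, and the names `train_toy` and `test_toy` are made up.

```python
import pandas as pd

# two hypothetical users with four categories each
toy = pd.DataFrame({
    "FAKE_CANO": [1] * 4 + [7] * 4,
    "CATEGORY": ["주식", "해외주식-NASD", "해외주식-NYSE", "해외주식-AMEX"] * 2,
    "Rating": [0.5, 0.2, 0.2, 0.1, 0.4, 0.3, 0.2, 0.1],
})

# sample 75% of each user's rows for training ...
train_toy = toy.groupby("FAKE_CANO").sample(frac=0.75, random_state=42)
# ... and keep the remaining rows of every user as the test set
test_toy = toy.drop(train_toy.index)

print(len(train_toy), len(test_toy))
```

Sampling inside each user group, rather than over the whole frame, is what keeps every test user present in the training data.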


In [11]:
header = {
    "col_user": "FAKE_CANO",
    "col_item": "CATEGORY",
    "col_rating": "Rating",
    "col_timestamp": None,  # this dataset has no timestamp column
    "col_prediction": "Prediction",
}

In [12]:
train, test = python_stratified_split(reco, ratio=0.75, col_user=header["col_user"], col_item=header["col_item"], seed=42)

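For illustration, the model is configured with the parameter values below. Only `similarity_type` is passed explicitly in this notebook. The three time-decay parameters are commented out in the model cell, since the data has no timestamp column (`col_timestamp` is `None`); the values listed for them are the ones in the commented-out lines.

|Parameter|Value|Description|
|---------|---------|-------------|
|`similarity_type`|`jaccard`|Method used to calculate item similarity.|
|`time_decay_coefficient`|30|Period in days (the term $T$ in the formula of Section 1.2). Not set in this run.|
|`time_now`|`None`|Reference time for the time decay. Not set in this run.|
|`timedecay_formula`|`True`|Whether the time decay formula is used. Not set in this run.|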
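
As a rough illustration of what `similarity_type="jaccard"` means, the sketch below computes Jaccard similarity from an item co-occurrence matrix, $s_{ij} = c_{ij} / (c_{ii} + c_{jj} - c_{ij})$. The `interactions` matrix is made up, and the code is a simplified stand-in for the library's own computation.

```python
import numpy as np

# hypothetical data: rows = users, columns = items,
# 1 means the user interacted with the item
interactions = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
])

# c_ij = number of users who interacted with both item i and item j
cooccurrence = interactions.T @ interactions

# Jaccard: co-occurrence divided by the size of the union of the two user sets
diag = np.diag(cooccurrence)
jaccard = cooccurrence / (diag[:, None] + diag[None, :] - cooccurrence)
print(jaccard)
```

Jaccard normalizes by how popular both items are, so an item that nearly every user holds does not automatically look similar to everything else.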

In [13]:
train.head()

Unnamed: 0,FAKE_CANO,PRDT_TYPE_CD,CATEGORY,CNT,TOT,Rating
522711,1,512,해외주식-NASD,31,62,0.5
522713,1,529,해외주식-AMEX,13,62,0.209677
522710,1,300,주식,1,62,0.016129
1048481,7,512,해외주식-NASD,38,90,0.422222
1048483,7,529,해외주식-AMEX,8,90,0.088889


In [14]:
test.head()

Unnamed: 0,FAKE_CANO,PRDT_TYPE_CD,CATEGORY,CNT,TOT,Rating
522712,1,513,해외주식-NYSE,17,62,0.274194
1048482,7,513,해외주식-NYSE,38,90,0.422222
795469,20,512,해외주식-NASD,19,37,0.513514
1111746,51,513,해외주식-NYSE,5,83,0.060241
1055839,79,513,해외주식-NYSE,10,49,0.204082


In [15]:
# set log level to DEBUG (this also prints SAR's INFO progress messages)
logging.basicConfig(level=logging.DEBUG, 
                    format='%(asctime)s %(levelname)-8s %(message)s')

model = SAR(
    similarity_type="jaccard", 
#    time_decay_coefficient=30, 
#    time_now=None, 
#    timedecay_formula=True, 
    **header
)

In [16]:
model.fit(train)

2022-11-30 09:57:22,553 INFO     Collecting user affinity matrix
2022-11-30 09:57:22,572 INFO     Creating index columns
2022-11-30 09:57:23,499 INFO     Building user affinity sparse matrix
2022-11-30 09:57:23,523 INFO     Calculating item co-occurrence
2022-11-30 09:57:23,586 INFO     Calculating item similarity
2022-11-30 09:57:23,587 INFO     Using jaccard based similarity
2022-11-30 09:57:23,588 INFO     Done training


In [17]:
# recommend 5 items per user from here on (overrides TOP_K = 10 set earlier)
TOP_K = 5
top_k = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True)

2022-11-30 09:57:23,734 INFO     Calculating recommendation scores
2022-11-30 09:57:23,802 INFO     Removing seen items


The `recommend_k_items` method returns a recommendation score for each user-item pair, shown below.

In [18]:
top_k

Unnamed: 0,FAKE_CANO,CATEGORY,Prediction
0,1,펀드,0.036699
1,1,RP,0.028855
2,1,해외주식-NYSE,0.027780
3,1,해외주식-SEHK,0.021750
4,1,채권,0.018319
...,...,...,...
1273450,3719916,해외주식-AMEX,0.514809
1273451,3719916,해외주식-NASD,0.493284
1273452,3719916,RP,0.122207
1273453,3719916,해외주식-NYSE,0.090650
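
The scores in the `Prediction` column come from combining user affinities with item similarities. The following is a minimal sketch of that step, assuming a small hypothetical affinity matrix `A` and similarity matrix `S`; it mirrors the two log messages above (scoring, then removing seen items) and is not the library's actual code.

```python
import numpy as np

# hypothetical user affinity matrix A (2 users x 3 items)
A = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.0, 1.0]])
# hypothetical item similarity matrix S (3 items x 3 items)
S = np.array([[1.0, 0.5, 0.25],
              [0.5, 1.0, 0.0],
              [0.25, 0.0, 1.0]])

# recommendation scores: affinity weighted by item similarity
scores = A @ S

# mask items the user has already interacted with (the remove_seen idea)
scores[A > 0] = -np.inf

# indices of the k best remaining items per user
k = 1
top_items = np.argsort(-scores, axis=1)[:, :k]
print(top_items)
```

The resulting scores depend on the magnitudes in `A` and `S`, so they are useful for ranking items but are not on the same scale as the input ratings.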


### 3.3 Evaluate the results

Note that the recommendation scores, generated by multiplying the item similarity matrix $S$ by the user affinity matrix $A$, **do not** have the same scale as the original `Rating` values in this dataset. In other words, the SAR algorithm is meant for *recommending relevant items to users* rather than *predicting explicit ratings for user-item pairs*.

For this reason, ranking metrics such as precision@k and recall@k are more appropriate for evaluating SAR. The following cells show how to evaluate the SAR model with the evaluation functions provided in `recommenders`.

In [76]:
header

{'col_user': 'FAKE_CANO',
 'col_item': 'CATEGORY',
 'col_rating': 'Rating',
 'col_timestamp': None,
 'col_prediction': 'Prediction'}

In [77]:
# all ranking metrics have the same arguments
args = [test, top_k]
kwargs = dict(col_user=header['col_user'], 
              col_item=header['col_item'], 
              col_rating=header['col_rating'] ,
              col_prediction=header['col_prediction'] ,
              relevancy_method='top_k', 
              k=TOP_K)

eval_map = map_at_k(*args, **kwargs)
eval_ndcg = ndcg_at_k(*args, **kwargs)
eval_precision = precision_at_k(*args, **kwargs)
eval_recall = recall_at_k(*args, **kwargs)

In [78]:
print("Model:",
      f"Top K:\t\t {TOP_K}",
      f"MAP:\t\t {eval_map:f}",
      f"NDCG:\t\t {eval_ndcg:f}",
      f"Precision@K:\t {eval_precision:f}",
      f"Recall@K:\t {eval_recall:f}", sep='\n')

Model:
Top K:		 5
MAP:		 0.434180
NDCG:		 0.558410
Precision@K:	 0.197538
Recall@K:	 0.901373
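
To help read these numbers, here is a minimal pure-Python sketch of precision@k and recall@k for a single user. The function `precision_recall_at_k` is a hypothetical helper written for this illustration, not the `recommenders` implementation; the example uses user 1's five recommendations and single held-out item from the outputs above.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one user.

    recommended: items ordered by predicted score, best first
    relevant: set of held-out items the user actually interacted with
    """
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k, hits / len(relevant)

# user 1: top-5 recommendations and the one item in the test set
recommended = ["펀드", "RP", "해외주식-NYSE", "해외주식-SEHK", "채권"]
relevant = {"해외주식-NYSE"}

precision, recall = precision_recall_at_k(recommended, relevant, k=5)
print(precision, recall)
```

When a user has only one held-out item, precision@5 cannot exceed 1/5 even if that item is recommended, while recall@5 can reach 1. This may explain why precision looks low next to recall in the results above.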


In [19]:
# copy the test set so that overwriting Rating does not modify `test` itself
test_log = test.copy()
test_log['Rating'] = 1

args_log = [test_log, top_k]
kwargs_log = dict(col_user=header['col_user'], 
              col_item=header['col_item'], 
              col_rating=header['col_rating'] ,
              col_prediction=header['col_prediction'])

In [20]:
logloss(*args_log, **kwargs_log)

ValueError: y_true contains only one label (1). Please provide the true labels explicitly through the labels argument.
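
The error occurs because every `Rating` in `test_log` was set to 1, so the true labels contain a single class. Log loss compares predicted probabilities against both positive and negative labels. The sketch below computes binary log loss by hand on made-up labels and probabilities to show the kind of input it expects; how negative examples should be chosen for this dataset is not covered by the notebook, and `binary_log_loss` is a hypothetical helper.

```python
import math

def binary_log_loss(y_true, y_pred, eps=1e-15):
    """Average negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # clip probabilities so that log(0) is never evaluated
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# made-up example with both classes present
y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.2, 0.6, 0.4]

loss = binary_log_loss(y_true, y_pred)
print(loss)
```

The formula also assumes that predictions are probabilities in [0, 1]. As noted earlier, SAR scores are ranking scores rather than calibrated probabilities, so log loss may not be a natural fit for this model.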