# Evaluation of Recommender Systems

Using the same dataset as in previous weeks, let us evaluate the Collaborative Filtering (CF) models implemented last week.

## Exercise 1

1. Load the test set and the predictions made with both Collaborative Filtering models in the previous session. 
2. Identify the users who are in the training set but not in the test set. Remove their predictions before evaluating the systems.
3. Report the Root Mean Square Error (RMSE) for both CF models defined in the previous session.
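
Steps 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the expected solution: `Pred` is a hypothetical stand-in for the prediction objects, with field names that mirror the `uid`, `iid`, `r_ui`, and `est` columns used in this notebook, and the helper names are made up for the example.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for the prediction objects; field names mirror the notebook's columns.
Pred = namedtuple("Pred", ["uid", "iid", "r_ui", "est"])


def filter_to_test_users(preds, test_users):
    """Keep only the predictions whose user appears in the test set."""
    test_users = set(test_users)
    return [p for p in preds if p.uid in test_users]


def rmse(preds):
    """Root mean square error between observed (r_ui) and estimated (est) ratings."""
    if not preds:
        raise ValueError("no predictions to evaluate")
    return math.sqrt(sum((p.r_ui - p.est) ** 2 for p in preds) / len(preds))


toy_preds = [
    Pred("u1", "i1", 4.0, 3.0),
    Pred("u1", "i2", 2.0, 2.0),
    Pred("u2", "i1", 5.0, 4.0),  # u2 is not in the toy test set, so it is dropped
]
kept = filter_to_test_users(toy_preds, {"u1"})
toy_rmse = rmse(kept)
```

The same `rmse` helper would be called once per CF model, on that model's filtered prediction list.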

In [None]:
import os
import sys
sys.path.append('../')
import pickle
import numpy as np  # needed later for np.mean
import pandas as pd

# TEST
### YOUR CODE HERE ###

# PREDICTIONS
### YOUR CODE HERE ###

# Detect users that have predictions (i.e., come from the training set) but are not in the test set
test_users = set(df_test['reviewerID'])
nb_users = {pred.uid for pred in pred_nb_list}
lf_users = {pred.uid for pred in pred_lf_list}
nb_users_in_pred_but_not_in_test = list(nb_users - test_users)
lf_users_in_pred_but_not_in_test = list(lf_users - test_users)
# Compare as sets: the order of a list built from a set is not guaranteed to match
assert set(nb_users_in_pred_but_not_in_test) == set(lf_users_in_pred_but_not_in_test)
print(f"There are {len(lf_users_in_pred_but_not_in_test)} users in the training set that are not in the test set.")

# Remove these users' predictions for evaluation
### YOUR CODE HERE ###

## Exercise 2
Define a general method to get the top-k recommendations for each user. Print the top-k recommendations, with k in {5, 10}, for the user with ID 'ARARUVZ8RUF5T', together with their estimated ratings.
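
One possible shape for such a method is sketched below. It assumes the predictions are held in a pandas DataFrame with `uid`, `iid`, and `est` columns, as in the outputs shown in this notebook; the function name `top_k_per_user` and the toy data are hypothetical.

```python
import pandas as pd


def top_k_per_user(pred_df, k, user_col="uid", score_col="est"):
    """Return the k rows with the highest estimated rating for each user."""
    ranked = pred_df.sort_values([user_col, score_col], ascending=[True, False])
    return ranked.groupby(user_col, sort=False).head(k)


toy = pd.DataFrame({
    "uid": ["u1", "u1", "u1", "u2", "u2"],
    "iid": ["a", "b", "c", "a", "b"],
    "est": [3.0, 5.0, 4.0, 2.0, 4.5],
})
top2 = top_k_per_user(toy, 2)
# Recommendations and estimated ratings for a single user
u1_top2 = top2.loc[top2["uid"] == "u1", ["iid", "est"]]
```

Ties in `est` are broken by whatever order the sort leaves them in, so a secondary sort key may be worth adding when many items share the same estimate.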

## Exercise 3
Report Precision@k (P@k), MAP@k and the MRR@k with k={5, 10, 20} averaged across users for both CF systems. When computing precision, we consider as relevant items those with an observed rating >= 4.0 (i.e., those items from the test set with a rating >= 4.0). Reflect on the differences obtained. 
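
The three metrics can be sketched for a single user as below; averaging the per-user values across users gives P@k, MAP@k, and MRR@k. This is an illustrative sketch, not the reference solution: the function names are hypothetical, and the normalisation of AP@k by `min(k, number of relevant items)` is one common convention among several.

```python
def relevance_flags(ranked_items, observed, threshold=4.0):
    """1 if the item has an observed test rating >= threshold, else 0, in ranked order."""
    return [1 if observed.get(item, 0.0) >= threshold else 0 for item in ranked_items]


def precision_at_k(rels, k):
    """Fraction of the top-k positions that hold a relevant item."""
    return sum(rels[:k]) / k


def average_precision_at_k(rels, k):
    """Mean of P@i over the ranks i <= k that hold a relevant item."""
    denom = min(k, sum(rels))
    if denom == 0:
        return 0.0
    hits, score = 0, 0.0
    for rank, rel in enumerate(rels[:k], start=1):
        if rel:
            hits += 1
            score += hits / rank
    return score / denom


def reciprocal_rank_at_k(rels, k):
    """1 / rank of the first relevant item in the top k, or 0 if there is none."""
    for rank, rel in enumerate(rels[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


ranked = ["a", "b", "c", "d", "e"]          # one user's items, best estimate first
observed = {"b": 5.0, "d": 4.0, "e": 3.0}   # that user's test ratings
rels = relevance_flags(ranked, observed)    # [0, 1, 0, 1, 0]
p5 = precision_at_k(rels, 5)
ap5 = average_precision_at_k(rels, 5)
rr5 = reciprocal_rank_at_k(rels, 5)
```

In the toy case the relevant items sit at ranks 2 and 4, so P@5 is 2/5, AP@5 is (1/2 + 2/4) / 2, and the reciprocal rank is 1/2.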

## Exercise 4

Based on the top-5, top-10 and top-20 predictions from Exercise 2, compute the systems’ hit rate averaged over the total number of users in the test set.
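
A minimal sketch of the hit rate, assuming the top-k lists and the relevant test items are available as dictionaries keyed by user ID (both structures and the function name are hypothetical). Users in the test set without any recommendation count as misses, since the average runs over all test users.

```python
def hit_rate_at_k(top_k_by_user, relevant_by_user):
    """Fraction of test users with at least one relevant item in their top-k list."""
    users = list(relevant_by_user)
    if not users:
        return 0.0
    hits = sum(
        1 for user in users
        if set(top_k_by_user.get(user, [])) & set(relevant_by_user[user])
    )
    return hits / len(users)


toy_top_k = {"u1": ["a", "b"], "u2": ["c", "d"]}
toy_relevant = {"u1": {"b"}, "u2": {"x"}, "u3": {"a"}}  # u3 has no recommendations
toy_hit_rate = hit_rate_at_k(toy_top_k, toy_relevant)
```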

In [None]:
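# Top-k recommendations per user for the neighbourhood-based model.


def top_rec(df, k):
    # Sort by user, then by estimated rating (highest first), and keep the first k rows per user.
    df = df.sort_values(by= ['uid', 'est'], ascending= False)
    return df.groupby('uid').head(k)

top10_nb = top_rec(upd_pred_nbm_df, 10)

top10_nb.head(10)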


In [None]:
# Plan:
# 1. Set the r_ui column of the predictions df to zero.
# 2. Merge the predictions df with the test df. Attention: use only these test columns -> ['reviewerID', 'asin', 'overall'].
# 3. Copy the observed rating into r_ui of the matching prediction rows, then drop the rows that came from the test df.
# The resulting predictions df has, at best, one relevant ground-truth item per user. Why "at best"? Because the test item
# must appear in the top k predictions and its observed rating must reach the relevance threshold.



upd_pred_nbm_df = upd_pred_nbm_df.assign(r_ui=0)
# upd_pred_nbm_df[upd_pred_nbm_df['r_ui']] = 0 

In [None]:
def prepare(upd_pred_nbm_df, test):

    # upd_pred_nbm_df = top_rec(upd_pred_nbm_df, 10)
    # get r_ui column all to zeros
    upd_pred_nbm_df = upd_pred_nbm_df.assign(r_ui=0)

    #prepare test df for merging
    new_test = test[['reviewerID', 'asin', 'overall']]
    new_column_list = ['uid', 'iid', 'r_ui']
    new_test = new_test.set_axis(new_column_list, axis=1)

    #concat predictions df with test df
    joint = pd.concat([upd_pred_nbm_df, new_test]).sort_values(by = ['uid', 'iid'])

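    # keep only the (uid, iid) pairs that appear in both the predictions and the test set
    duplies = joint[joint.duplicated(subset = ['uid', 'iid'], keep= False)].sort_values(by=['uid', 'r_ui'])

    # shift up by 1 so each prediction row (r_ui == 0) takes the observed rating from the test row below it.
    # NOTE: this relies on every user having at most one test item; with several, the pairing breaks.
    duplies['r_ui'] = duplies['r_ui'].shift(-1)

    # drop the test rows (they have no 'est'); they are no longer needed
    no_duplies = duplies[duplies['est'].notna()]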


    final = pd.concat([joint, no_duplies]).sort_values(by = ['uid', 'iid', 'r_ui']).drop_duplicates(['uid', 'iid'], keep = 'last')#.reset_index(drop=True)

    final = final.sort_index(axis = 0)

    final = final.sort_values(by= ['uid', 'est'], ascending= False)

    #make new column with row index by group
    final['group_index'] = final.groupby(['uid']).cumcount()+1

    return final

def get_precision(df, test, k):

    final = prepare(df, test)
    # Select only the top k rows for each user (`tt` is used below, so this line must be active)
    tt = final.groupby(['uid']).head(k)

    # Count the number of non-zero elements PER USER. This returns a df with many columns; here only the ground truth ('r_ui') matters.
    gg = tt.groupby(['uid']).agg(lambda x: x.ne(0).sum())

    # gg = final.groupby(['uid']).agg(lambda x: x.ne(0).sum())

    # gg = tt.groupby('uid').count()


    #must account for when a user has no recommended items. -> precision = 0

    # np.mean(gg['r_ui']/5)

    # df.sort_values(by= ['uid', 'est'], ascending= False)

    # return np.mean(gg['r_ui'])

    return gg

def get_MAP(df, test):
    
    df = prepare(df, test)

    # df = df.groupby(['uid']).head(k)

    #make new column with row index by group
    # df['group_index'] = df.groupby(['uid']).cumcount()+1

    # get rows index that matter to calculate MAP@k
    df_new = df[df['r_ui'].apply(lambda x: x != 0)]

    return df_new
    # return sum(1/(df_new['group_index'] +1))/len(test)


def hit_rate(df, test, k):
    """Hit rate@k versus Precision@k.
    Hit rate@k:
    -> count the non-zero 'r_ui' values per user in the top k (Note: in this exercise the count is AT MOST 1 per user, because the test set was built with just 1 item per user).
    -> calculate the mean over all users.
    Precision@k:
    -> same count per user.
    -> divide the count of non-zeros for a given user by k.
    -> calculate the mean over all users."""
    
    final = prepare(df, test)

    # Select only top k rows for each user
    tt = final.groupby(['uid']).head(k)

    # Count number of non-zero elements PER USER. It returns df with many columns (only 'r_ui' matters)
    gg = tt.groupby(['uid']).agg(lambda x: x.ne(0).sum())
    

    # np.mean(gg['r_ui']/5)

    # df.sort_values(by= ['uid', 'est'], ascending= False)

    return np.mean(gg['r_ui'])

# oh1 = get_MAP(upd_pred_nbm_df, test,20)

# oh1

ui = prepare(upd_pred_nbm_df, test)

ui.head(10)

# ui.loc[ui['uid'] == 'AZ520NWW40I9B']


kj = get_precision(upd_pred_nbm_df, test, 57)


kj = kj.reset_index()

kj

kj.loc[kj['uid'] == 'AZ520NWW40I9B' ]

# kj['AZ520NWW40I9B']

# get_MAP(upd_pred_nbm_df, test).head(10)
    
# get_precision(upd_pred_nbm_df, test, 10)
# AZ520NWW40I9B

Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54696,AZRD4IZU6TBFV,B000X7ST9Y,0.0,5.0,"{'actual_k': 1, 'was_impossible': False}",6
54697,AZRD4IZU6TBFV,B00DY59MB6,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",7
54698,AZRD4IZU6TBFV,B019FWRG3C,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",8
54699,AZRD4IZU6TBFV,B00L1I1VMG,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",9
54700,AZRD4IZU6TBFV,B0010ZBORW,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",10


Unnamed: 0,uid,iid,r_ui,est,details
0,A105A034ZG9EHO,5,0,5,5
1,A10JB7YPWZGRF4,5,0,5,5
2,A10M2MLE2R0L6K,5,0,5,5
3,A10P0NAKKRYKTZ,5,0,5,5
4,A10ZJZNO4DAVB,5,0,5,5
...,...,...,...,...,...
944,AZCOSCQG73JZ1,5,0,5,5
945,AZD3ON9ZMEGL6,5,0,5,5
946,AZFYUPGEE6KLW,5,0,5,5
947,AZJMUP77WBQZQ,5,0,5,5


Unnamed: 0,uid,iid,r_ui,est,details
943,AZ520NWW40I9B,5,1,5,5
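
The concat-and-shift logic in `prepare` can also be expressed as a left merge. The sketch below is one possible alternative, not the method used in this notebook. It assumes the column names shown here (`uid`, `iid`, `est` for predictions; `reviewerID`, `asin`, `overall` for the test set) and at most one test row per (user, item) pair; the helper name `attach_ground_truth` is hypothetical. One behavioural difference: test pairs with no matching prediction are dropped, whereas `prepare` keeps them with a NaN estimate.

```python
import pandas as pd


def attach_ground_truth(pred_df, test_df):
    """Add the observed test rating as r_ui (0 where unknown) and a 1-based rank per user."""
    truth = test_df[["reviewerID", "asin", "overall"]].rename(
        columns={"reviewerID": "uid", "asin": "iid", "overall": "r_ui"}
    )
    # Drop any existing r_ui so the merged column comes only from the test set
    merged = pred_df.drop(columns=["r_ui"], errors="ignore").merge(
        truth, on=["uid", "iid"], how="left"
    )
    merged["r_ui"] = merged["r_ui"].fillna(0.0)
    # Rank each user's items by estimated rating, highest first
    merged = merged.sort_values(["uid", "est"], ascending=[True, False])
    merged["group_index"] = merged.groupby("uid").cumcount() + 1
    return merged


toy_preds = pd.DataFrame({
    "uid": ["u1", "u1", "u2"],
    "iid": ["a", "b", "a"],
    "est": [4.0, 5.0, 3.0],
})
toy_test = pd.DataFrame({
    "reviewerID": ["u1", "u2"],
    "asin": ["b", "z"],       # (u2, z) has no prediction and is dropped
    "overall": [5.0, 4.0],
})
toy_ranked = attach_ground_truth(toy_preds, toy_test)
```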


In [None]:
# ui.loc[ui['uid'] == 'AZ520NWW40I9B']

# ui.loc[ui[['r_ui', 'est']].where(ui[['r_ui', 'est']]!= 'NaN')]

# ui[(ui > 0).sum(axis=1) >= 3]
# 

In [None]:
# Understand why 'AZ520NWW40I9B' has a NaN estimate after prepare.

# What happened: the item 'B00QXW95Q4' was only ever rated by the user 'AZ520NWW40I9B',
# and that (user, item) pair is in the test dataset. The item therefore never appears in the train dataset,
# so the model cannot recommend it: from the model's point of view it does not exist.


upd_pred_nbm_df.head(10)

ww = upd_pred_nbm_df.loc[upd_pred_nbm_df['uid'] == 'AZ520NWW40I9B']

ww.loc[ww['iid'] == 'B00QXW95Q4']

upd_pred_nbm_df.loc[upd_pred_nbm_df['iid'] == 'B00QXW95Q4']



Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54696,AZRD4IZU6TBFV,B000X7ST9Y,0.0,5.0,"{'actual_k': 1, 'was_impossible': False}",6
54697,AZRD4IZU6TBFV,B00DY59MB6,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",7
54698,AZRD4IZU6TBFV,B019FWRG3C,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",8
54699,AZRD4IZU6TBFV,B00L1I1VMG,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",9
54700,AZRD4IZU6TBFV,B0010ZBORW,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",10


Unnamed: 0,uid,iid,r_ui,est,details


Unnamed: 0,uid,iid,r_ui,est,details
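
The empty results above are consistent with the explanation in the cell's comments: an item that occurs only in the test set has no predictions at all. A quick way to list every such test pair is sketched below, assuming train and test DataFrames that share an `asin` item column; the function name and the toy frames are hypothetical.

```python
import pandas as pd


def find_test_pairs_with_unseen_items(train_df, test_df, item_col="asin"):
    """Return the test rows whose item never appears in the training data."""
    seen_items = set(train_df[item_col])
    return test_df[~test_df[item_col].isin(seen_items)]


toy_train = pd.DataFrame({"reviewerID": ["u1", "u2"], "asin": ["a", "b"]})
toy_test = pd.DataFrame({"reviewerID": ["u1", "u3"], "asin": ["b", "z"]})
unseen = find_test_pairs_with_unseen_items(toy_train, toy_test)
```

Users whose only test item is listed here cannot score a hit at any k, which is worth keeping in mind when reading the averaged metrics.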


In [None]:
################## TEST with more than one Ground Truth per user #######################

In [None]:
# upd_pred_nbm_df

my_copy = ui.copy()

my_copy.iat[6, my_copy.columns.get_loc('r_ui')] = float(4)

my_copy.head(10)

Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54696,AZRD4IZU6TBFV,B000X7ST9Y,0.0,5.0,"{'actual_k': 1, 'was_impossible': False}",6
54697,AZRD4IZU6TBFV,B00DY59MB6,4.0,5.0,"{'actual_k': 0, 'was_impossible': False}",7
54698,AZRD4IZU6TBFV,B019FWRG3C,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",8
54699,AZRD4IZU6TBFV,B00L1I1VMG,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",9
54700,AZRD4IZU6TBFV,B0010ZBORW,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",10


In [None]:
def get_precision(df, k):

    # final = prepare(df, test)
    tt = df.groupby(['uid']).head(k)

    # Count number of non-zero elements PER USER. It returns df with many columns. Our case, we only care about ground truth.
    gg = tt.groupby(['uid']).agg(lambda x: x.ne(0).sum())

    # np.mean(gg['r_ui']/5)

    # df.sort_values(by= ['uid', 'est'], ascending= False)

    # return np.mean(gg['r_ui'])

    return gg

# my_copy.head(10)

# get_MAP(my_copy, test, 10).head(10)

get_precision(my_copy, 10)

Unnamed: 0_level_0,iid,r_ui,est,details
uid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
A105A034ZG9EHO,10,1,10,10
A10JB7YPWZGRF4,10,1,10,10
A10M2MLE2R0L6K,10,0,10,10
A10P0NAKKRYKTZ,10,1,10,10
A10ZJZNO4DAVB,10,1,10,10
...,...,...,...,...
AZCOSCQG73JZ1,10,1,10,10
AZD3ON9ZMEGL6,10,1,10,10
AZFYUPGEE6KLW,10,1,10,10
AZJMUP77WBQZQ,10,1,10,10


In [None]:

def get_MAP(df, k):
    
    # df = prepare(df, test)

    # df = df.groupby(['uid']).head(k)

   

    #make new column with row index by group
    df['group_index'] = df.groupby(['uid']).cumcount()+1


    # # get number of ground truths per user
    gg = df.groupby(['uid']).agg(lambda x: x.ne(0).sum())


    # df = df.groupby(['uid']).head(k)
    # get rows index that matter to calculate MAP@k
    df_new = df[df['r_ui'].apply(lambda x: x != 0)]

    
    try2 = df_new.groupby(["uid"])['uid'].count().reset_index(name="Ground_Truth")#.set_index('uid')
    result = pd.merge(df_new, try2, on=['uid'])

    # select only top k (group_index is 1-based, so rank k itself is included)
    result = result[result['group_index'] <= k]

    #now for top k recommended items get Average Precision for each user

    try3 = result.groupby(["uid"])['uid'].count().reset_index(name="Precision")#.set_index('uid')
    result_2 = pd.merge(result, try3, on=['uid'])


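    # NOTE: hits / (k * ground truth) is precision@k divided by the number of relevant items,
    # not the usual rank-weighted Average Precision.
    result_2['Average Precision'] = result_2['Precision'] /(k*result_2['Ground_Truth'])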


    # result['AP'] = 1/result['Ground_Truth']

    # df_new.groupby('uid').count()



    
    df_news = df_new[df_new['est'].isna()]
    

    # df_new.groupby('uid').count()

    # gets precision. Works for more than 1 hit in top k
    # ss = df_new.groupby('uid').count() / k

    # ss/ ground_thruth

    #maybe add colum with ground truth for each user

    return df_news
    # return (result_2).groupby('uid').mean()

    # return sum(result_2['Average Precision']) /len(test)
    
    # return sum(1/(df_new['group_index']))/len(test)

# len(get_MAP(my_copy,1000))

dd = get_MAP(my_copy,10)
# len(dd)

# dd.loc[dd['uid'] == 'AZ520NWW40I9B']

dd



# AZ520NWW40I9B	B00QXW95Q4


Unnamed: 0,uid,iid,r_ui,est,details,group_index
5221,AZ520NWW40I9B,B00QXW95Q4,5.0,,,56
5142,ATEX4XVEQYN7B,B004KEJ65C,5.0,,,56
5165,AT9SW0VOLAKQD,B00BSE3III,5.0,,,56
5194,AT2F2DJ7RELJI,B00IJHY54S,5.0,,,56
5207,AROYPRQ35VSAT,B00JF2GVWK,5.0,,,56
...,...,...,...,...,...,...
4139,A157AUOFPJQ46Q,B001E5PLCM,4.0,,,57
5174,A156IOMOA59X7N,B00CQ0LN80,5.0,,,56
5182,A12X146LZM6KM0,B00HLXEXDO,5.0,,,56
5170,A121C9UWQFW5W6,B00CQ0LN80,5.0,,,56


In [None]:
dd.groupby('uid').count()

Unnamed: 0_level_0,iid,r_ui,est,details,group_index,Ground_Truth,AP
uid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
A105A034ZG9EHO,1,1,1,1,1,1,1
A10JB7YPWZGRF4,1,1,1,1,1,1,1
A10P0NAKKRYKTZ,1,1,1,1,1,1,1
A10ZJZNO4DAVB,1,1,1,1,1,1,1
A115LE3GBAO8I6,1,1,1,1,1,1,1
...,...,...,...,...,...,...,...
AZCOSCQG73JZ1,1,1,1,1,1,1,1
AZD3ON9ZMEGL6,1,1,1,1,1,1,1
AZFYUPGEE6KLW,1,1,1,1,1,1,1
AZJMUP77WBQZQ,1,1,1,1,1,1,1


In [None]:
try2 = dd.groupby(["uid"])['uid'].count().reset_index(name="count")#.set_index('uid')

result = pd.merge(dd, try2, on=['uid'])
result

fun = result[result['group_index'] < 6]


result.groupby('uid')['r_ui'].count()

result['AP'] = 1/result['count']

result

Unnamed: 0,uid,iid,r_ui,est,details,group_index,count
0,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,2
1,AZRD4IZU6TBFV,B00DY59MB6,4.0,5.000000,"{'actual_k': 0, 'was_impossible': False}",7,2
2,AZJMUP77WBQZQ,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1
3,AZFYUPGEE6KLW,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1
4,AZD3ON9ZMEGL6,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1
...,...,...,...,...,...,...,...
945,A10ZJZNO4DAVB,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1
946,A10P0NAKKRYKTZ,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1
947,A10M2MLE2R0L6K,B019FWRG3C,5.0,4.445363,"{'actual_k': 23, 'was_impossible': False}",44,1
948,A10JB7YPWZGRF4,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1


uid
A105A034ZG9EHO    1
A10JB7YPWZGRF4    1
A10M2MLE2R0L6K    1
A10P0NAKKRYKTZ    1
A10ZJZNO4DAVB     1
                 ..
AZCOSCQG73JZ1     1
AZD3ON9ZMEGL6     1
AZFYUPGEE6KLW     1
AZJMUP77WBQZQ     1
AZRD4IZU6TBFV     2
Name: r_ui, Length: 949, dtype: int64

Unnamed: 0,uid,iid,r_ui,est,details,group_index,count,AP
0,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,2,0.5
1,AZRD4IZU6TBFV,B00DY59MB6,4.0,5.000000,"{'actual_k': 0, 'was_impossible': False}",7,2,0.5
2,AZJMUP77WBQZQ,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1,1.0
3,AZFYUPGEE6KLW,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1,1.0
4,AZD3ON9ZMEGL6,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1,1.0
...,...,...,...,...,...,...,...,...
945,A10ZJZNO4DAVB,B001OHV1H4,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",5,1,1.0
946,A10P0NAKKRYKTZ,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1,1.0
947,A10M2MLE2R0L6K,B019FWRG3C,5.0,4.445363,"{'actual_k': 23, 'was_impossible': False}",44,1,1.0
948,A10JB7YPWZGRF4,B0012Y0ZG2,5.0,5.000000,"{'actual_k': 40, 'was_impossible': False}",4,1,1.0


In [None]:
ff = result.groupby('uid').sum()

Unnamed: 0_level_0,r_ui,est,group_index,count,AP
uid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
A105A034ZG9EHO,5.0,5.000000,4,1,1.0
A10JB7YPWZGRF4,5.0,5.000000,4,1,1.0
A10M2MLE2R0L6K,5.0,4.445363,44,1,1.0
A10P0NAKKRYKTZ,5.0,5.000000,4,1,1.0
A10ZJZNO4DAVB,5.0,5.000000,5,1,1.0
...,...,...,...,...,...
AZCOSCQG73JZ1,5.0,5.000000,5,1,1.0
AZD3ON9ZMEGL6,5.0,5.000000,4,1,1.0
AZFYUPGEE6KLW,5.0,5.000000,5,1,1.0
AZJMUP77WBQZQ,5.0,5.000000,5,1,1.0


In [None]:
my_copy.groupby('uid').head(8)
my_copy.head(10)

Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
...,...,...,...,...,...,...
4,A105A034ZG9EHO,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
5,A105A034ZG9EHO,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
6,A105A034ZG9EHO,B000X7ST9Y,0.0,5.0,"{'actual_k': 1, 'was_impossible': False}",6
7,A105A034ZG9EHO,B00DY59MB6,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",7


Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54696,AZRD4IZU6TBFV,B000X7ST9Y,0.0,5.0,"{'actual_k': 1, 'was_impossible': False}",6
54697,AZRD4IZU6TBFV,B00DY59MB6,4.0,5.0,"{'actual_k': 0, 'was_impossible': False}",7
54698,AZRD4IZU6TBFV,B019FWRG3C,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",8
54699,AZRD4IZU6TBFV,B00L1I1VMG,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",9
54700,AZRD4IZU6TBFV,B0010ZBORW,0.0,5.0,"{'actual_k': 0, 'was_impossible': False}",10


In [None]:
class Metrics:

  def __init__(self,df,test,k):

    self.df = df
    self.test = test
    self.k = k

  def prepare(self):
    """Join the predictions df with the test df so that the final df carries the ground truth (in this case there is just 1 ground-truth item per user)."""

    # upd_pred_nbm_df = top_rec(upd_pred_nbm_df, 10)
    # get r_ui column all to zeros
    upd_pred_nbm_df = self.df.assign(r_ui=0)

    #prepare test df for merging
    new_test = self.test[['reviewerID', 'asin', 'overall']]
    new_column_list = ['uid', 'iid', 'r_ui']
    new_test = new_test.set_axis(new_column_list, axis=1)

    #concat predictions df with test df
    joint = pd.concat([upd_pred_nbm_df, new_test]).sort_values(by = ['uid', 'iid'])

    #make df with just duplicates
    duplies = joint[joint.duplicated(subset = ['uid', 'iid'], keep= False)].sort_values(by=['uid', 'r_ui'])

    #shift up by 1, so the predictions rows have the real value
    duplies['r_ui'] = duplies['r_ui'].shift(-1)

    #drop test rows. They no longer matter
    no_duplies = duplies[duplies['est'].notna()]


    final = pd.concat([joint, no_duplies]).sort_values(by = ['uid', 'iid', 'r_ui']).drop_duplicates(['uid', 'iid'], keep = 'last')#.reset_index(drop=True)

    final = final.sort_index(axis = 0)

    final = final.sort_values(by= ['uid', 'est'], ascending= False)

    #make new column with row index by group
    final['group_index'] = final.groupby(['uid']).cumcount()+1

    # get rows index that have ground_truth
    # final= final[final['r_ui'].apply(lambda x: x != 0)]

    return final


  def get_precision(self):

    final = self.prepare()
    tt = final.groupby(['uid']).head(self.k)

    gg = tt.groupby(['uid']).agg(lambda x: x.ne(0).sum())

    # return np.mean(gg['r_ui']/self.k)

    return gg
    
    # # select only top k
    # result = result[result['group_index'] < self.k]

    # #now for top k recommended items get Average Precision for each user
    # try3 = result.groupby(["uid"])['uid'].count().reset_index(name="Precision")#.set_index('uid')
    # result_2 = pd.merge(result, try3, on=['uid'])

    # # in case there's more than 1 hit in top k recommendation per user keep just 1st appearance.
    # # it already contains info needed to calculate precision
    # result_2 = result_2.drop_duplicates('uid', keep = 'first')

    # coisa = np.mean(result_2['Precision']/self.k)

    # # return np.mean(gg['r_ui']/self.k)
    # # return coisa/ len(self.test)
    # return coisa


  
  def hit_rate(self):
    """Hit rate@k versus Precision@k.
    Hit rate@k:
    -> count the non-zero 'r_ui' values per user in the top k (Note: in this exercise the count is AT MOST 1 per user, because the test set was built with just 1 item per user).
    -> calculate the mean over all users.
    Precision@k:
    -> same count per user.
    -> divide the count of non-zeros for a given user by k.
    -> calculate the mean over all users."""

    # Use the method on this instance; the module-level names `df` and `test` are not in scope here
    final = self.prepare()

    # Select only top k rows for each user
    tt = final.groupby(['uid']).head(self.k)

    # Count number of non-zero elements PER USER. It returns df with many columns (only 'r_ui' matters to us)
    gg = tt.groupby(['uid']).agg(lambda x: x.ne(0).sum())

    

    return np.mean(gg['r_ui'])


  # def get_MAP(self):
    
  #   df = self.prepare()

  #   df = df.groupby(['uid']).head(self.k)

  #   #make new column with row index by group
  #   df['group_index'] = df.groupby('uid').cumcount()

  #   # get rows index that matter to calculate MAP@k
  #   # df_new = df[df['r_ui'].apply(lambda x: x != 0)]

  #   return df


  
  def get_MAP(self):
      
      df = self.prepare()

      # keep only the rows with a ground-truth rating; these are the ones that matter for MAP@k
      df_new = df[df['r_ui'] != 0]

      # Count the ground-truth items for each user and merge the count back in
      try2 = df_new.groupby(["uid"])['uid'].count().reset_index(name="Ground_Truth")
      result = pd.merge(df_new, try2, on=['uid'])

      # select only top k (group_index is 1-based, so rank k itself is included)
      result = result[result['group_index'] <= self.k]

      #now for top k recommended items get Average Precision for each user
      try3 = result.groupby(["uid"])['uid'].count().reset_index(name="Precision")#.set_index('uid')
      result_2 = pd.merge(result, try3, on=['uid'])


      result_2['Average Precision'] = result_2['Precision'] /(self.k * result_2['Ground_Truth'])

      return (result_2)

      # return sum(result_2['Average Precision']) /len(self.test)




hola = Metrics(upd_pred_nbm_df, test, 5)
hola.prepare()
hola.get_precision().head(30)
# hola.get_MAP()

# my_final = hola.prepare()

Unnamed: 0,uid,iid,r_ui,est,details,group_index
54691,AZRD4IZU6TBFV,B000VV1YOY,0.0,5.00,"{'actual_k': 0, 'was_impossible': False}",1
54692,AZRD4IZU6TBFV,B001LNODUS,0.0,5.00,"{'actual_k': 0, 'was_impossible': False}",2
54693,AZRD4IZU6TBFV,B00006L9LC,0.0,5.00,"{'actual_k': 40, 'was_impossible': False}",3
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.00,"{'actual_k': 40, 'was_impossible': False}",4
54695,AZRD4IZU6TBFV,B001OHV1H4,0.0,5.00,"{'actual_k': 40, 'was_impossible': False}",5
...,...,...,...,...,...,...
22,A105A034ZG9EHO,B000PKKAGO,0.0,4.25,"{'actual_k': 1, 'was_impossible': False}",52
32,A105A034ZG9EHO,B00112DRHY,0.0,4.00,"{'actual_k': 1, 'was_impossible': False}",53
0,A105A034ZG9EHO,B00W259T7G,0.0,3.10,"{'actual_k': 2, 'was_impossible': False}",54
39,A105A034ZG9EHO,B000W0C07Y,0.0,2.00,"{'actual_k': 1, 'was_impossible': False}",55


Unnamed: 0_level_0,iid,r_ui,est,details,group_index
uid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
A105A034ZG9EHO,5,1,5,5,5
A10JB7YPWZGRF4,5,1,5,5,5
A10M2MLE2R0L6K,5,0,5,5,5
A10P0NAKKRYKTZ,5,1,5,5,5
A10ZJZNO4DAVB,5,1,5,5,5
A1118RD3AJD5KH,5,0,5,5,5
A115LE3GBAO8I6,5,1,5,5,5
A11AZPF8R6A1F8,5,1,5,5,5
A11L1TI883AOSV,5,1,5,5,5
A121C9UWQFW5W6,5,0,5,5,5


In [None]:
test['reviewerID'].nunique()

len(test)

949

949

In [None]:
hola.prepare().head(20)

Unnamed: 0,uid,iid,r_ui,est,details,group_index
54694,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54639,AZJMUP77WBQZQ,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54583,AZFYUPGEE6KLW,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54526,AZD3ON9ZMEGL6,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54471,AZCOSCQG73JZ1,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
5221,AZ520NWW40I9B,B00QXW95Q4,5.0,,,56
54360,AZ4T3DDT8L9EQ,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54304,AYY463Q7V3LTU,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5
54247,AYORX1AK30JMB,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4
54199,AYNTULRNAIPNY,B0010ZBORW,4.0,3.916667,"{'actual_k': 4, 'was_impossible': False}",54


In [None]:


hola = Metrics(upd_pred_nbm_df, test, 60)
ty = hola.get_MAP()

ty.head(10)

np.mean(hola.get_MAP()['Average Precision'])

Unnamed: 0,uid,iid,r_ui,est,details,group_index,Ground_Truth,Precision,Average Precision
0,AZRD4IZU6TBFV,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4,1,1,0.016667
1,AZJMUP77WBQZQ,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5,1,1,0.016667
2,AZFYUPGEE6KLW,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5,1,1,0.016667
3,AZD3ON9ZMEGL6,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4,1,1,0.016667
4,AZCOSCQG73JZ1,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5,1,1,0.016667
5,AZ520NWW40I9B,B00QXW95Q4,5.0,,,56,1,1,0.016667
6,AZ4T3DDT8L9EQ,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5,1,1,0.016667
7,AYY463Q7V3LTU,B001OHV1H4,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",5,1,1,0.016667
8,AYORX1AK30JMB,B0012Y0ZG2,5.0,5.0,"{'actual_k': 40, 'was_impossible': False}",4,1,1,0.016667
9,AYNTULRNAIPNY,B0010ZBORW,4.0,3.916667,"{'actual_k': 4, 'was_impossible': False}",54,1,1,0.016667


0.01666666666666705
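
The value above equals 1/60 because each user's score is computed as hits / (k × ground truth) with k = 60 and a single hit. A more common definition of AP@k averages the precision at each rank that holds a relevant item. The sketch below applies that definition to a frame shaped like the output of `prepare` (columns `uid`, `r_ui`, `group_index`); the function name, the 4.0 relevance threshold taken from Exercise 3, and the normalisation by `min(k, number of relevant items)` are assumptions of this example.

```python
import pandas as pd


def map_at_k(ranked, k, threshold=4.0):
    """Mean over users of AP@k, where AP@k averages P@i over ranks i <= k holding a relevant item."""
    top = ranked[ranked["group_index"] <= k].copy()
    top["rel"] = (top["r_ui"] >= threshold).astype(int)
    top = top.sort_values(["uid", "group_index"])
    # Number of relevant items seen up to and including each rank
    top["hits_so_far"] = top.groupby("uid")["rel"].cumsum()
    top["prec_at_rank"] = top["rel"] * top["hits_so_far"] / top["group_index"]
    # Relevant items per user over the whole ranking, capped at k
    n_rel = (ranked["r_ui"] >= threshold).groupby(ranked["uid"]).sum()
    denom = n_rel.clip(upper=k)
    ap_sum = top.groupby("uid")["prec_at_rank"].sum().reindex(n_rel.index, fill_value=0.0)
    # Users without relevant items get AP = 0
    ap = (ap_sum / denom.where(denom > 0)).fillna(0.0)
    return ap.mean()


toy_frame = pd.DataFrame({
    "uid": ["u1"] * 4 + ["u2"] * 2,
    "r_ui": [0.0, 5.0, 0.0, 4.0, 0.0, 0.0],
    "group_index": [1, 2, 3, 4, 1, 2],
})
map4 = map_at_k(toy_frame, 4)
map2 = map_at_k(toy_frame, 2)
```

For `u1` the relevant items sit at ranks 2 and 4, so AP@4 is (1/2 + 2/4) / 2 = 0.5; `u2` has no relevant item and contributes 0, giving a mean of 0.25.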