# Age Aware Recommendation

This notebook details the code of our age-aware recommendation experiment, included in the paper "Anonymous Authors, Soundtracks of Our Lives: How Age Influences Musical Preferences" (the paper is currently going through a double-blind review process, so the authors are omitted). The code is documented in cells such as this one and in inline comments, to support reuse and reproducibility of the experiment.

The code for this experiment is written in Python 3 utilizing the Jupyter Notebook format (https://jupyter-notebook.readthedocs.io). We also use the following external libraries:
- "Pandas" (https://pandas.pydata.org/) - data wrangling library,
- "NumPy" (https://numpy.org/) - linear algebra library,
- "LensKit" (https://lenskit.org/) - recommender system library,

as well as the native Python 3 libraries "collections" (data structures for Python) and "random" (random number generator).

For the purpose of the experiment we use a sample of users, tracks and listening events (LEs) taken from the LFM2b dataset ("http://www.cp.jku.at/datasets/LFM-2b/"). The samples of users, tracks and LEs necessary for the experiment are included in the repository with this notebook. For more information on the sample, we point the reader to the aforementioned paper.

## 0. Results folder
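Before beginning the experiment, a folder needs to be created to store the results. We store the path to the folder in the "Results_folder" variable. The per-user results are exported to a "CV_iters" subfolder of this folder, which also needs to exist.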
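
A minimal sketch of creating the folder programmatically, assuming the per-user results go to a "CV_iters" subfolder as in the export step of Section 4. The helper name `ensure_results_folders` is hypothetical and not part of the original experiment code.

```python
import os

def ensure_results_folders(results_folder, subfolders=("CV_iters",)):
    """Create the results folder and its subfolders if they do not exist yet."""
    os.makedirs(results_folder, exist_ok=True)
    for subfolder in subfolders:
        # exist_ok=True makes repeated calls harmless
        os.makedirs(os.path.join(results_folder, subfolder), exist_ok=True)
    return results_folder
```

Calling `ensure_results_folders('Experiment_Results/')` once before running the experiment would be sufficient under these assumptions.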

In [None]:
Results_folder='Experiment_Results/'

## 1. Necessary libraries and modules

We import the necessary libraries and modules for the experiment. An overview of the libraries used can be found above. We use inline comments to document each module.

In [None]:
#Load the pandas and numpy libraries
import pandas as pd
import numpy as np

#From LensKit load the necessary modules for batch recommendations ("batch") and top-n recommendation evaluation ("topn")
from lenskit import batch, topn

#From LensKit load the User KNN algorithm class ("user_knn").
from lenskit.algorithms import user_knn

#From LensKit load a wrapper for Top N recommendations ("TopN")
#The wrapper will generate a ranked list of recommendations using the provided algorithm
from lenskit.algorithms.ranking import TopN

#From LensKit load a selector for unrated (unseen) items, we use this to select candidate items
from lenskit.algorithms.basic import UnratedItemCandidateSelector

#From the base Python packages we load a double-ended queue class ('deque') and a random number generator ('random').
from collections import deque
import random


In [None]:
#Instruction for pandas not to give chained assignment warnings (this is for clearer output during execution)
#All instances with chained assignment warnings were checked for correctness prior to performing the experiment
pd.set_option('mode.chained_assignment', None)

## 2. Importing and preprocessing the necessary data

We import the necessary data for the experiment: the sample of users, tracks and listening events (LEs) taken from the LFM2b dataset ("http://www.cp.jku.at/datasets/LFM-2b/"), as described in the introduction. The sample is included in the repository with this notebook.

After loading the data, we select only those users with more than 20 interactions to be included in the experiment.
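
The filtering step can be illustrated on toy data. This is a minimal sketch: the data and the helper name `filter_active_users` are hypothetical, and the threshold is lowered to 2 so that the toy table stays small.

```python
import pandas as pd

# Toy interaction table (hypothetical data): user 1 has 3 interactions, user 2 has 1
toy_ratings = pd.DataFrame({
    "user": [1, 1, 1, 2],
    "item": [10, 11, 12, 10],
    "rating": [1, 1, 1, 1],
})

def filter_active_users(ratings, min_interactions):
    """Keep only the rows of users with strictly more than `min_interactions` rows."""
    interactions_per_user = ratings.groupby("user")["item"].transform("count")
    return ratings[interactions_per_user > min_interactions]

filtered = filter_active_users(toy_ratings, min_interactions=2)
print(filtered["user"].unique())  # only user 1 remains
```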

In [None]:
#Load the ratings data and rename columns to fit the format used in LensKit algorithms (user, item, rating)
ratings=pd.read_parquet("Data/Experiment_Data/full_training_subset_of_select_tracks_experiment.parquet")
ratings.columns=['user', 'item', 'rating']

In [None]:
#load the user age and track artist data
user_age_index_df=pd.read_parquet("Data/Experiment_Data/user_age_index_basic_for_experiment_updated.parquet")
user_age_index_df=user_age_index_df.reset_index()
artists_index_df=pd.read_parquet("Data/Experiment_Data/tracks_artist_index_for_experiment.parquet")

#Convert the user age and artist data into lookup dictionaries
user_age_index=dict(zip(list(user_age_index_df['user']), list(user_age_index_df['user_age'])))
artists_index=dict(zip(list(artists_index_df['item']), list(artists_index_df['artist'])))

In [None]:
#Select only users who have listened to more than 20 tracks in our sample
user_counts=ratings.groupby('user').count().reset_index()
users_admited=user_counts[user_counts['item']>20]
users_admited=pd.DataFrame({'user':users_admited.user.unique()})
ratings=ratings.merge(users_admited,how='inner',on='user')

## 3. Utility Functions

For the purpose of the experiment we define a number of utility functions. Primarily, these are a function that adjusts the diversity of each user's top-n list by a given % (to test our diversity adjustment hypothesis) and a function that splits the data into folds so that each user is equally represented in all folds.

These utility functions are:

**adjust_divesities(users, diversity, TopN_lists_persanolized, n, k):**

- Adjusts the diversity of the top-n list of each user. It takes as **input** (i) a set of users ("users"), (ii) a desired diversity adjustment % ("diversity"), (iii) the personalized recommendation lists ("TopN_lists_persanolized"), (iv) the number of recommendations per user before adjustment ("n") and (v) the desired top-n list size ("k").

- **Returns** the average intra-list artist diversity of the lists after adjustment, as well as the diversity-adjusted lists of the desired length "k" for all users.
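
The idea behind the diversity adjustment can be sketched on a toy list. This is a simplified variant written in plain Python, not the exact procedure used in the experiment: the helper `thin_repeated_artists`, the toy tracks and the artist mapping are all hypothetical.

```python
import random

def thin_repeated_artists(tracks, artist_of, removal_probability, k, rng):
    """Drop tracks by already-seen artists with the given probability, then keep the top k."""
    seen_artists = set()
    kept = []
    for track in tracks:
        artist = artist_of[track]
        if artist in seen_artists and rng.random() < removal_probability:
            continue  # repeated artist: removed to make room for other artists
        seen_artists.add(artist)
        kept.append(track)
    top_k = kept[:k]
    # Intra-list diversity: distinct artists divided by list length
    distinct_artists = len({artist_of[t] for t in top_k})
    diversity = distinct_artists / len(top_k) if top_k else 0
    return top_k, diversity

# Toy ranked list (hypothetical): tracks 1, 2 and 4 share artist 'A'
toy_tracks = [1, 2, 3, 4, 5, 6]
toy_artists = {1: "A", 2: "A", 3: "B", 4: "A", 5: "C", 6: "D"}

unchanged, diversity_before = thin_repeated_artists(toy_tracks, toy_artists, 0.0, 3, random.Random(0))
thinned, diversity_after = thin_repeated_artists(toy_tracks, toy_artists, 1.0, 3, random.Random(0))
```

With a removal probability of 0 the list is left untouched, while a probability of 1 removes every repeated artist, so lower-ranked tracks by new artists move up into the top k.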


**split_by_users(ratings, folds=5):**

- For a **given** (i) ratings dataframe ("ratings") and (ii) number of folds ("folds"),
- **returns** the original ratings dataframe split into the desired number of folds per user (the interactions of each user split equally among all folds).
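
The per-user fold assignment can be sketched as follows. The helper name `assign_folds` is hypothetical; it mirrors the idea of giving the first few folds one extra interaction when the number of interactions is not divisible by the number of folds.

```python
def assign_folds(n_interactions, folds=5):
    """Return one fold label (1..folds) per interaction, with fold sizes differing by at most one."""
    base_size, remainder = divmod(n_interactions, folds)
    labels = []
    for fold in range(1, folds + 1):
        # The first `remainder` folds receive one extra interaction
        size = base_size + 1 if fold <= remainder else base_size
        labels.extend([fold] * size)
    return labels

print(assign_folds(7, folds=5))  # [1, 1, 2, 2, 3, 4, 5]
```

Because the interactions of each user are shuffled before the labels are attached, which interaction ends up in which fold is random.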


**generate_recommendations(user_set, algorithm, n=100):** 

- For a **given** (i) set of users ("user_set"), (ii) recommendation algorithm ("algorithm") and (iii) a desired number of recommendations ("n"), 
- it **returns** a list of "n" recommended tracks for each user based on the given algorithm.



**evaluate_recomendations(recomendations, truth, k=100):**

- For **given** (i) recommendation lists per user ("recomendations"), (ii) true interactions per user ("truth"), and (iii) the length of each recommended list ("k"),

- it **returns** evaluations of the recommendations for each user using the following metrics: ndcg, precision, recall and hitrate (these metrics are calculated using the functions provided by LensKit).
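
For reference, the four metrics can be sketched for a single user with binary relevance. This sketch uses textbook definitions and a hypothetical helper `evaluate_single_list`; LensKit's own implementation may differ in details such as the rank discount used for NDCG.

```python
import math

def evaluate_single_list(recommended, relevant, k):
    """Compute precision@k, recall@k, hit and NDCG@k for one ranked list (binary relevance)."""
    top_k = recommended[:k]
    hits = [1 if item in relevant else 0 for item in top_k]
    n_hits = sum(hits)
    precision = n_hits / k
    recall = n_hits / len(relevant) if relevant else 0
    hit = 1 if n_hits > 0 else 0
    # Discounted cumulative gain: a hit at rank r contributes 1 / log2(r + 1)
    dcg = sum(h / math.log2(rank + 1) for rank, h in enumerate(hits, start=1))
    # Ideal DCG: all relevant items (at most k) placed at the top of the list
    ideal_hits = min(len(relevant), k)
    idcg = sum(1 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0
    return {"ndcg": ndcg, "precision": precision, "recall": recall, "hit": hit}

# Toy example (hypothetical data): items 3 and 7 are recommended and relevant
toy_scores = evaluate_single_list(recommended=[5, 3, 9, 7], relevant={3, 7, 8}, k=4)
```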

We use inline comments to document the procedure of each function further.

In [None]:

def ajdust_diversity_user(userID, diversity, topN_list, n=100,k=50):
    
    #Convert float probability to a whole number format (percentages)
    similar_track_removal_probability=int(np.floor(diversity*100))
    
    #Define a list to hold the artists that are already present in the topN_list,
    #and a list to hold the indices of the tracks we should remove to adjust the diversity
    artist_already_pressent=[]
    removed_tracks_indices=[]
    
    #For i in the range of 0 to the length of the Top n list (the first 'top' track to the last (lowest rated) track)
    for i in range(0,len(topN_list)):
        
        #Find the next track in the list, its rank and its artist
        candidate=topN_list.iloc[i]
        item=int(candidate['item'])
        rank=int(candidate['rank'])
        item_artist=artists_index[item]
        
        #If the artist has already been seen in the list, remove the track from the list with the probability of 'similar_track_removal_probability'
        if (item_artist in artist_already_pressent):
            #Note: randrange(1,100) draws an integer from 1 to 99 (100 is excluded)
            removal_chance=random.randrange(1,100)
            if(removal_chance<similar_track_removal_probability):
                removed_tracks_indices.append(i)
        #Else, if the artist has not yet been seen in the recommendation list, add it to the list of already seen artists
        else:
            artist_already_pressent.append(item_artist)

        #Stop examining the list once the track at index 'k' has been processed, i.e. after the first k+1 tracks
        if i == k:
            break
    
    #Remove the tracks that have been flagged for removal and reconstruct the (ranks in the) topN list
    topN_list_final=topN_list.drop(removed_tracks_indices).reset_index(drop=True)
    topN_list_final['rank']=list(range(1,len(topN_list_final)+1))
    
    #Select the top k tracks in the list, where k is the desired length of the recommendation list
    topN_list_final=topN_list_final.head(k)
    
    #If the final list is not empty, return the intra-list diversity of the list together with the list
    #Intra-list diversity in our case is defined as
    #the number of artists in the list divided by the number of recommendations in the topN_list
    if len(topN_list_final)>0:
        return len(artist_already_pressent)/len(topN_list_final), topN_list_final
    
    #In the case of no list, return 0 intra-list diversity and the list
    else:
        return 0, topN_list_final


#Adjust the diversity of each user's list by a given % ("diversity" parameter)
def adjust_divesities(users, diversity, TopN_lists_persanolized, n=100,k=50):
    
    #Set variables to record the adjusted lists and intra-list diversity of each user 
    adjusted_lists=[]
    diversities=[]
    
    #For each user "user"
    for user in users:
        
        #Select the list of recommendations for the user
        user_list=TopN_lists_persanolized[TopN_lists_persanolized['user']==user]
        user_list=user_list.reset_index(drop=True)
        
        #Adjust the diversity of the user's list by the % defined in the "diversity" parameter,
        #and retrieve the intra-list diversity of the final list
        intalistic_diversitiy, user_adujusted_list=ajdust_diversity_user(user, diversity, user_list, n=n,k=k)
        
        #Record the intra-list diversity and the adjusted list of the user
        diversities.append(intalistic_diversitiy)
        adjusted_lists.append(user_adujusted_list)
    
    #Combine the adjusted lists of all users into one dataframe
    #(pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    final_list=pd.concat(adjusted_lists) if adjusted_lists else None
    
    #Filter out the users with zero recommendations 
    diversities=list(filter(lambda x: x != 0, diversities))
    #Return the average intra-list diversity of the users and the adjusted lists of recommendations
    return sum(diversities)/len(diversities), final_list

In [None]:
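#Convert a pandas Series (e.g. the averaged metrics) into a single-row dataframe
def convert_to_DF(series):
    return pd.DataFrame(series).T

#Generate "n" recommendations for each user in "user_set" using the given algorithm
def generate_recommendations(user_set, algorithm, n=100):
    
    #The recommendations are generated in parallel using 25 jobs
    recommendations = batch.recommend(algorithm, user_set, n ,n_jobs=25)
        
    return recommendations

#Evaluate the recommendation lists against the true interactions ("truth") of each user
def evaluate_recomendations(recomendations, truth, k=100):
    
    #Register NDCG, precision and recall at "k", as well as the hit rate, and compute them per user
    analysis = topn.RecListAnalysis()
    analysis.add_metric(topn.ndcg,k=k)
    analysis.add_metric(topn.precision,k=k)
    analysis.add_metric(topn.recall,k=k)
    analysis.add_metric(topn.hit)
    results = analysis.compute(recomendations, truth)
    
    return results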


In [None]:
#Generate the fold labels (1 to "folds") for a user with "length" interactions
def generate_single_split(length, folds=5):
    
    #The first (length % folds) folds receive one extra interaction
    adjustment_for_remainder_after_split=length % folds 
    split_length=length // folds
    
    split=[]
    
    #Use the "folds" parameter instead of a hard-coded number of folds
    for i in range(1,folds+1):
        if adjustment_for_remainder_after_split >0:
            split=split+[i]*(split_length+1)
            adjustment_for_remainder_after_split=adjustment_for_remainder_after_split - 1
        else:
            split=split+[i]*(split_length)
    
    return split

def split_by_users(ratings, folds=5):
    
    #Sampling 100% of the dataset just returns the dataset shuffled 
    ratings=ratings.sample(frac=1)
    user_splits=[]
    
    #Assign a fold label to each (shuffled) interaction of each user
    for _,user in ratings.groupby('user', sort=False):
        user=user.copy()
        user['split']=generate_single_split(len(user),folds=folds)
        user_splits.append(user)
    
    #Combine the interactions of all users into one dataframe
    #(pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    splited_ratings=pd.concat(user_splits)
    return splited_ratings

## 4. Experiment Procedure

This section details the procedure of the experiment. We first split the data into 5 folds for cross-validation, and then train and cross-validate a user-based KNN recommendation algorithm for each combination of parameters:

- Age range: 10-20, 16-26, 26-36, 36-46, 46-56, 49-59, 49-55, 55-61 and 10-64 (all ages, used as a baseline).
- Number of neighbours: 6, 8, 12, 18, 24, 36, 50, 60, 70, 100, 110, 120, 150
- Diversity adjustment: 0 (no adjustment), 0.2, 0.4, 0.6

For each configuration we generate an initial list of 50 recommendations per user, reduce it to 10 recommendations after the diversity adjustment, and evaluate NDCG@10, Precision@10, Recall@10 and HitRate@10 (Recall and HitRate are not used in the final paper).

The final results of the procedure are exported and stored in the "Results_folder": the per-user evaluations and the summary of the best configurations as .csv files, and the cross-validation results as .parquet files. We use inline comments to document the steps further in the code below.
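
To give a sense of the size of the parameter grid, the combinations listed above can be enumerated as follows. This is an illustrative sketch only; the values mirror the lists in this section, and `n_folds` and `configurations` are names introduced here for illustration.

```python
from itertools import product

age_ranges = [(10, 64), (10, 20), (16, 26), (26, 36), (36, 46), (46, 56), (49, 59), (49, 55), (55, 61)]
neighbours_set = [6, 8, 12, 18, 24, 36, 50, 60, 70, 100, 110, 120, 150]
diversities = [0, 0.2, 0.4, 0.6]
n_folds = 5

# Every (neighbours, diversity, age range) combination is evaluated in every fold
configurations = list(product(neighbours_set, diversities, age_ranges))
print(len(configurations))            # 468 configurations per fold
print(len(configurations) * n_folds)  # 2340 evaluations in total
```

Only one model per number of neighbours and fold has to be trained (13 x 5 = 65 models), since the diversity adjustment and the age-range selection are applied to the recommendations after they have been generated.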

In [None]:
#Split the ratings into 5 folds per User

ratings=split_by_users(ratings, folds=5)

#EXTRA: Save and load ratings with split if needed
#ratings.to_parquet("Data/Experiment_Data/Ratings_with_Split.parquet")
#ratings=pd.read_parquet("Data/Experiment_Data/Ratings_with_Split.parquet")

In [None]:
# Crossvalidate sliding window

#Define the set of parameters and age ranges which we will test on
age_ranges=[(10,64),(10,20),(16,26),(26,36),(36,46),(46,56),(49,59),(49,55),(55,61)] 
neighbours_set=[6,8,12,18,24,36,50,60,70,100,110,120,150]
diversities=[0,0.2,0.4,0.6]
#To accommodate testing the performance at multiple list lengths, define a set of list lengths to be tested and the number of recommendations n before diversity adjustment
#In our case we only test for a list length of 10, and an initial list length before diversity adjustment of n=50
list_lengths=[10]

n=50


results=None
set_results=True
cv_iter=0

#For each of the 5 folds (CV iterations) "cv_iter"
for cv_iter in range(1,5+1):
    
    #Select the train and test split for the iteration
    test=ratings[ratings['split']==cv_iter]
    train=ratings.drop(test.index)
    
    #Drop the split column (the column which indicates to which split the rating belongs)
    test=test.drop('split', axis=1)
    train=train.drop('split', axis=1)
    
    print("Testing in CV iteration:",cv_iter)
    
    #For each configuration of the number of neighbours "neighbours"
    for neighbours in neighbours_set:
        print("Training recommender with",neighbours,"neighbours") #, save_nbrs=100
        
        #Build a KNN recommender with the selected number of neighbours
        predictor = user_knn.UserUser(neighbours,min_nbrs=neighbours,center=False,feedback='implicit',use_ratings=False)
        Unseen_item_selector = UnratedItemCandidateSelector()
        recommender = TopN(predictor, Unseen_item_selector)    
        predictor.fit(train)
        Unseen_item_selector.fit(train)
        
        #Generate a large list of recommendations for all users
        recomendations_all=generate_recommendations(test.user.unique(),recommender,n=n)
        
        #For each of the recommendation list lengths "k" (in the final paper this is only 10)
        for k in list_lengths:
            print("Testing at",k,"recommendations")
            
            #For each diversity configuration "diversity",
            for diversity in diversities:  
                
                #and for each age range
                for age_range_start,age_range_end in age_ranges:

                    #Find the users in the age range
                    age_range_str=str(age_range_start)+'-'+str(age_range_end)
                    users_in_age_range=user_age_index_df[(user_age_index_df['user_age']>=(age_range_start)) & 
                                            (user_age_index_df['user_age']<=(age_range_end))]                

                    info= "age range "+age_range_str+" with "+str(neighbours)+" neighbours and " + str(diversity) + " diversity" 
                    print('Processing',info,"in iteration",cv_iter)
                    
                    #Find the list of recommendations in the age range and the true list of consumed items in the age range
                    truth_in_age_range=test.merge(users_in_age_range[['user']], how='inner', on='user')
                    recs_in_age_range = recomendations_all.merge(users_in_age_range[['user']], how='inner', on='user')

                    #Adjust the diversity based on the diversity parameter, generate a list of k recommendations, and calculate the intra-list diversity
                    Intralist_diversity_avg, recs_in_age_range = adjust_divesities(users_in_age_range.user.unique(), diversity, 
                                                          recs_in_age_range, n=n,k=k)
                    
                    #Evaluate the recommendations
                    results_i=evaluate_recomendations(recs_in_age_range, truth_in_age_range, k=k)
                    
                    #Export the per-user evaluations in this configuration and age range to the 'CV_iters' subfolder
                    filename='User_Results_'+age_range_str+'_'+str(neighbours)+'_neig_'+str(int(10*diversity))+"_diver_CViter_"+str(cv_iter)+'.csv'
                    results_i=results_i.reset_index()
                    results_i.to_csv(Results_folder+'CV_iters/'+filename, index=False)
                    
                    #Average the results in this configuration and age range over all users
                    results_i=results_i[["ndcg","precision","recall","hit"]].mean()
                    
                    #Convert the results to a dataframe and add them to the overall table of results
                    results_i=convert_to_DF(results_i)
                    results_i['List_Len']=[k]*len(results_i)
                    results_i['Neighbours']=[neighbours]*len(results_i)
                    results_i['Diversity_adjustment']=[diversity]*len(results_i)
                    results_i['Intralist_Diversity_calculated']=[Intralist_diversity_avg]*len(results_i)
                    results_i['Age_range']=[age_range_str]*len(results_i)
                    results_i['CV_iter']=[cv_iter]*len(results_i)
                    if set_results:
                        results=results_i
                        set_results=False
                    else:
                        results=pd.concat([results,results_i])

                    print('Final Evaluation for',k, 'recommendations in', info, ", with intra-list diversity",Intralist_diversity_avg)
                    print(results_i)

    #Export intermediate results per fold as parquet
    sifix="_at_fold_"+str(cv_iter)
    results.to_parquet(Results_folder+"Results_Cross_validationNew_5_10_2"+sifix+".parquet")

#Export final results as parquet
results.to_parquet(Results_folder+"Results_Cross_validationNew_5_10_2_Final.parquet")    
results.head()                


In [None]:

#For each age group, configuration (neighbours, diversity adjustment) and list length, get the mean results.
results_means_all=results.groupby(['Age_range','Neighbours','Diversity_adjustment','List_Len']).mean().reset_index()

#For each list length, print the 4 best performing configurations of each age range and merge all of them in one table.
print_l=True
list_lengths=[10]
add=None
set_add=True
for list_len in list_lengths:
    
    results_means=results_means_all[results_means_all['List_Len']==list_len]
    for age_r in results.Age_range.unique():
        print("Age Range",age_r)
        results_age_r=results_means[results_means['Age_range']==age_r].sort_values(by='ndcg',ascending=False).head(4)
        print(results_age_r)

        if set_add:
            add=results_age_r
            set_add=False
        else:
            add=pd.concat([add,results_age_r])

#Export the table of best performing results into csv format.
filename=Results_folder+"Results_Cross_validation_AllAges_Summarized.csv"
add.to_csv(filename)