In [1]:
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3 has no cPickle module

with open("24OCT17_pipelinerun/als_trainpreds_tvsr2.pkl", 'rb') as f:
    trainpreds = pickle.load(f)
    
with open("24OCT17_pipelinerun/als_train_df_pd.pkl", 'rb') as f:
    train_df = pickle.load(f)

In [2]:
trainpreds.head()

Unnamed: 0,PersonID,EventID,Participated,Event_Date,prediction
0,148,31,1,1448064000000000000,0.788674
1,463,31,0,1448064000000000000,0.0
2,471,31,0,1448064000000000000,0.000785
3,496,31,1,1448064000000000000,0.361366
4,833,31,0,1448064000000000000,0.0


In [11]:
unique_personID = trainpreds['PersonID'].unique()

In [22]:
len(unique_personID)

10764

In [57]:
numerator_sum = 0
denominator_sum = 0

rank_list = []

for person in sorted(unique_personID):
    user = trainpreds[trainpreds['PersonID'] == person].copy()
    # Compute the user's top score once instead of once per row.
    max_pred = user['prediction'].max()
    if max_pred > 0:
        # 0 for the user's highest-scored event, 100 for a score of zero.
        user['rank_ui'] = (1 - user['prediction'] / max_pred) * 100
    else:
        # No positive prediction for this user: every event gets rank 0.
        user['rank_ui'] = 0
    weighted_rank = sum(user['Participated'] * user['rank_ui'])
    participated = sum(user['Participated'])
    numerator_sum += weighted_rank
    denominator_sum += participated
    rank_list.append((person, len(user), max_pred, weighted_rank, participated))

In [58]:
rank_bar = numerator_sum / denominator_sum

In [60]:
rank_bar

25.236794063623854

### From "Collaborative Filtering for Implicit Feedback Datasets" by Hu, Koren, and Volinsky:

Lower values of rank_bar are more desirable, as they indicate that actually attended events are ranked closer to the top of the recommendation lists. For random predictions, the expected value of rank_ui is 50% (placing event i in the middle of the sorted list). Thus, rank_bar >= 50% indicates an algorithm no better than random.
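In the paper's notation, writing $r^t_{ui}$ for the observed feedback of user $u$ on item $i$ in the test period (the `Participated` flag plays this role here) and $\mathrm{rank}_{ui}$ for the percentile rank of item $i$ in user $u$'s list, the metric can be written as:

$$
\overline{\mathrm{rank}} = \frac{\sum_{u,i} r^t_{ui} \, \mathrm{rank}_{ui}}{\sum_{u,i} r^t_{ui}}
$$

The `numerator_sum` and `denominator_sum` accumulated in the loop above correspond to the two sums in this fraction.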

So in our case, a rank_bar of about 25% is notably better than random but still may not be great. For comparison, the study in the referenced article reported expected percentile rankings below 12%, getting close to 8% as more factors were used.
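The loop above derives rank_ui from each score relative to the user's maximum score, while the paper defines rank_ui by position in the user's sorted list. As a rough sketch of the positional variant, one could write something like the following. The function name `positional_rank_bar` and its parameters are illustrative and not part of the original notebook, and the sketch assumes there are no NaN predictions.

```python
import pandas as pd


def positional_rank_bar(df, user_col="PersonID", score_col="prediction",
                        weight_col="Participated"):
    """Expected percentile rank based on list position (hypothetical helper)."""
    by_user = df.groupby(user_col)[score_col]
    # Position within each user's list: 1 = highest score, ties share the average.
    pos = by_user.rank(ascending=False, method="average")
    # Number of scored events per user, broadcast back to every row.
    n = by_user.transform("count")
    # Map positions onto 0..100; clip avoids dividing by zero for one-event users.
    rank_ui = (pos - 1) / (n - 1).clip(lower=1) * 100
    weights = df[weight_col]
    return float((weights * rank_ui).sum()) / float(weights.sum())
```

Calling `positional_rank_bar(trainpreds)` would give a figure that is directly comparable to the 50% random baseline, because tied scores (for example, a user whose predictions are all 0.0) land in the middle of the list rather than at the top.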

For another point of comparison, we can look at a naive popularity model:

In [97]:
group_counts = trainpreds[trainpreds['Participated']==1].groupby(by=['EventID']).count()['Participated'].copy()

In [98]:
group_counts.sort_values(ascending=False, inplace=True)

In [102]:
len(group_counts)

56

In [103]:
import pandas as pd

In [118]:
popularity = pd.DataFrame({'EventID': group_counts.index, 'attendance': group_counts.values})

In [119]:
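# Percentile position in the popularity ordering: 0 for the most attended event, 100 for the least.
rank_ui = list((popularity.index.values / float(len(popularity) - 1)) * 100)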

In [120]:
popularity['rank_ui'] = rank_ui

In [121]:
popularity.head()

Unnamed: 0,EventID,attendance,rank_ui
0,1003,592,0.0
1,1009,439,1.818182
2,37,407,3.636364
3,12,354,5.454545
4,38,353,7.272727


In [123]:
merged_df = pd.merge(trainpreds, popularity, how='left', on='EventID')

In [125]:
merged_df.head()

Unnamed: 0,PersonID,EventID,Participated,Event_Date,prediction,attendance,rank_ui
0,148,31,1,1448064000000000000,0.788674,268,29.090909
1,463,31,0,1448064000000000000,0.0,268,29.090909
2,471,31,0,1448064000000000000,0.000785,268,29.090909
3,496,31,1,1448064000000000000,0.361366,268,29.090909
4,833,31,0,1448064000000000000,0.0,268,29.090909


In [126]:
numerator_sum = 0
denominator_sum = 0

rank_list = []

for person in sorted(unique_personID):
    user = merged_df[merged_df['PersonID'] == person].copy()
    numerator_sum += sum(user['Participated'] * user['rank_ui'])
    denominator_sum += sum(user['Participated'])

In [127]:
popularity_rank_bar = numerator_sum / denominator_sum
popularity_rank_bar

31.912785438312564

The current model (rank_bar of about 25.2%) gives a better set of recommendations than the popularity baseline (about 31.9%), which simply recommends the most popular (i.e. most highly attended) events to every person with no personalization. This is an okay start, but we have room for improvement.
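Since the per-person loop for the popularity baseline only accumulates two global sums, the same quantity can be computed in a single pass over the frame. A minimal sketch, assuming a frame with `Participated` and `rank_ui` columns like `merged_df` (the helper name `weighted_rank_bar` is made up for illustration):

```python
import pandas as pd


def weighted_rank_bar(df, rank_col="rank_ui", weight_col="Participated"):
    """Participation-weighted mean of a precomputed rank column (hypothetical helper)."""
    weights = df[weight_col].astype(float)
    # Note: pandas .sum() skips NaN, whereas the built-in sum() used in the
    # loop would propagate it, so results only agree when rank_col has no NaN.
    return (weights * df[rank_col]).sum() / weights.sum()
```

Under that no-NaN assumption, `weighted_rank_bar(merged_df)` should agree with `popularity_rank_bar`.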

Consider:
- increasing alpha
- increasing the number of factors
- looking at both cold start strategies (nan and drop)

Note that both scores above were computed on the same training data the model was fit to, where we would expect performance to be very good. So taking a step back, a rank_bar of about 25% is still not great. Testing directly on the validation data won't work because of the event cold start problem: the model has no learned factors for events that never appear in the training data. As a next step, look at a model fit on something other than EventID so that it can be assessed on the validation data.
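One way to gauge how severe the event cold start problem is would be to count how many validation rows refer to events that are absent from the training data. The sketch below is only an illustration: the helper name `split_by_cold_start` is made up, and `valid_df` in the usage note stands for a validation frame that this notebook does not load.

```python
import pandas as pd


def split_by_cold_start(train, valid, key="EventID"):
    """Split validation rows by whether their key was seen in training (hypothetical helper)."""
    seen = valid[key].isin(train[key].unique())
    # First frame: rows the model can score; second: cold-start rows.
    return valid[seen], valid[~seen]
```

For example, `warm, cold = split_by_cold_start(train_df, valid_df)` followed by `len(cold) / float(len(valid_df))` would give the share of validation rows affected.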