<h2>LightFM: A hybrid recommender engine library</h2>

Understanding the underlying data representation:

Datatype docs:<br>
<a href="https://docs.python.org/3/tutorial/datastructures.html#dictionaries">dictionaries</a><br>
<a href="https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.ndarray.html">numpy.ndarray</a><br>
<a href="https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.sparse.csr_matrix.html">scipy.sparse.csr.csr_matrix</a><br>
<a href="https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.sparse.coo_matrix.html">scipy.sparse.coo.coo_matrix</a><br>
<a href="http://www.scipy-lectures.org/advanced/scipy_sparse/index.html">scipy sparse datatypes tutorial</a>

In [35]:
import numpy as np
from lightfm.datasets import fetch_movielens
from lightfm import LightFM
from lightfm.evaluation import precision_at_k
from lightfm.evaluation import auc_score

#fetch the MovieLens data (pass min_rating=... to apply a rating threshold)
user_item_rating_data = fetch_movielens()
print(type(user_item_rating_data))

for key, value in user_item_rating_data.items():
    print(key,type(value),value.shape)

<class 'dict'>
item_feature_labels <class 'numpy.ndarray'> (1682,)
item_features <class 'scipy.sparse.csr.csr_matrix'> (1682, 1682)
test <class 'scipy.sparse.coo.coo_matrix'> (943, 1682)
train <class 'scipy.sparse.coo.coo_matrix'> (943, 1682)
item_labels <class 'numpy.ndarray'> (1682,)
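
The two sparse formats above play different roles. As a minimal sketch on made-up toy data (not the MovieLens matrices), the snippet below builds a COO matrix from (row, column, value) triplets and converts it to CSR, which is the format that makes it cheap to look up one user's row:

```python
import numpy as np
import scipy.sparse as sp

# Toy interactions (hypothetical): 3 users x 4 items as (row, col, value) triplets.
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 3, 0, 3])
vals = np.array([5.0, 4.0, 5.0, 4.0], dtype=np.float32)

# COO stores the triplets as given, which makes it easy to construct.
coo = sp.coo_matrix((vals, (rows, cols)), shape=(3, 4))

# CSR groups entries by row, so a single user's items are cheap to read.
csr = coo.tocsr()
items_of_user_0 = csr[0].indices

print(coo.shape, coo.getnnz())
print(items_of_user_0)
```

This is the same `tocsr()[uid].indices` pattern that is used further down to look up a user's known positives.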


Loss functions available:
<br>
<ul><li>BPR: Bayesian Personalised Ranking [1] pairwise loss. Maximises the prediction difference between a positive example and a randomly chosen negative example. Useful when only positive interactions are present and optimising ROC AUC is desired.</li><li>
WARP: Weighted Approximate-Rank Pairwise [2] loss. Maximises the rank of positive examples by repeatedly sampling negative examples until a rank-violating one is found. Useful when only positive interactions are present and optimising the top of the recommendation list (precision@k) is desired.</li></ul>

This example shows how to estimate these models on the MovieLens dataset.

[1] Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.

[2] Weston, Jason, Samy Bengio, and Nicolas Usunier. "Wsabie: Scaling up to large vocabulary image annotation." IJCAI. Vol. 11. 2011.
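
To make the two losses more concrete, here is a rough NumPy sketch of the ideas described above. It is not LightFM's implementation: the function names, the margin of 1 and the `max_trials` cutoff are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def bpr_loss(pos_score, neg_score):
    # BPR: negative log-sigmoid of the score difference,
    # i.e. log(1 + exp(-(pos - neg))). Small when pos >> neg.
    return np.log1p(np.exp(-(pos_score - neg_score)))

def warp_sample(scores, pos_item, rng, max_trials=100):
    """Sample negatives until one violates the ranking.

    Returns (negative_item, trials_used), or (None, max_trials)
    if no violating negative was found.
    """
    n_items = scores.shape[0]
    for trial in range(1, max_trials + 1):
        neg_item = rng.integers(n_items)
        if neg_item == pos_item:
            continue
        # Violation (assumed margin of 1): the negative scores
        # within 1 of the positive, or above it.
        if scores[neg_item] > scores[pos_item] - 1.0:
            return int(neg_item), trial
    return None, max_trials

print(bpr_loss(2.0, 0.0), bpr_loss(0.0, 2.0))
print(warp_sample(np.array([0.0, 3.0, 3.0, 3.0]), pos_item=0, rng=rng))
```

The number of trials needed to find a violator gives a rough estimate of how badly the positive item is ranked: a violator found quickly suggests the positive sits low in the list, so the update should be larger.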

In [27]:
print(repr(user_item_rating_data['item_feature_labels']))

array(['Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)', ...,
       'Sliding Doors (1998)', 'You So Crazy (1994)',
       'Scream of Stone (Schrei aus Stein) (1991)'], dtype=object)


In [30]:
print(repr(user_item_rating_data['item_labels']))

array(['Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)', ...,
       'Sliding Doors (1998)', 'You So Crazy (1994)',
       'Scream of Stone (Schrei aus Stein) (1991)'], dtype=object)


In [21]:
print(repr(user_item_rating_data['item_features'].toarray()))

array([[ 1.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  1.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  1.]], dtype=float32)
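
The `item_features` matrix printed above is an identity matrix: every movie has exactly one feature, its own indicator. The sketch below (toy sizes and random embeddings, all hypothetical) illustrates why this reduces to plain collaborative filtering, assuming an item's representation is the sum of the embeddings of its features:

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n_items, n_components = 4, 3

# One embedding row per feature (hypothetical random values).
feature_embeddings = rng.normal(size=(n_items, n_components))

# Identity features: each item has only its own indicator feature.
indicator_features = sp.identity(n_items, format='csr', dtype=np.float32)

# Summing the embeddings of an item's features is a sparse matrix product.
# With identity features, every item simply gets its own embedding row back.
item_representations = indicator_features @ feature_embeddings

print(np.allclose(item_representations, feature_embeddings))
```

With richer features (tags, genres), several rows of `feature_embeddings` would be summed per item instead.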


In [25]:
#print training and testing data
print(repr(user_item_rating_data['train']))
print(repr(user_item_rating_data['test']))

<943x1682 sparse matrix of type '<class 'numpy.float32'>'
	with 49906 stored elements in COOrdinate format>
<943x1682 sparse matrix of type '<class 'numpy.int32'>'
	with 5469 stored elements in COOrdinate format>


Example usage:

In [36]:
#create model using the loss function of choice:
#weighted approximate-rank pairwise (WARP),
#trained with gradient descent,
#using each user's past rating history and neighborhood ratings
model = LightFM(learning_rate=0.05, loss='warp')  #'bpr' is also available

#train model
model.fit(user_item_rating_data['train'], epochs=30, num_threads=2)

def recommend(model, data, uids):
    
    #number of users and movies in training data
    n_users, n_items = data['train'].shape
    
    #generate recommendations for each user we input
    for uid in uids:
        
        #movies they already like
        known_positives = data['item_labels'][data['train'].tocsr()[uid].indices]
        
        #movies our model predicts they will like
        scores = model.predict(uid, np.arange(n_items))
        
        #rank them in order of most liked to least liked
        top_items = data['item_labels'][np.argsort(-scores)]

        #print out the results
        print("User %s" % uid)
        print("     Known Positives:")
        
        for kp in known_positives[:3]:
            print("          %s" % kp)
        
        print("     Recommended:")
        
        for ti in top_items[:3]:
            print("          %s" % ti)
            
    return

recommend(model, user_item_rating_data, [3,25,450])

User 3
     Known Positives:
          Seven (Se7en) (1995)
          Indiana Jones and the Last Crusade (1989)
          Contact (1997)
     Recommended:
          Scream (1996)
          Contact (1997)
          Air Force One (1997)
User 25
     Known Positives:
          Toy Story (1995)
          Twelve Monkeys (1995)
          Dead Man Walking (1995)
     Recommended:
          Scream (1996)
          Toy Story (1995)
          Contact (1997)
User 450
     Known Positives:
          Kolya (1996)
          Devil's Own, The (1997)
          Contact (1997)
     Recommended:
          Lost Highway (1997)
          Devil's Advocate, The (1997)
          Spawn (1997)
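
Note that some recommendations above repeat a known positive (for example Contact (1997) for user 3), because `recommend` ranks all items. One possible way to skip items the user has already rated is to mask their scores before sorting. The helper name `top_k_unseen` and the toy scores are made up for this sketch:

```python
import numpy as np

def top_k_unseen(scores, known_item_ids, k=3):
    # Work on a float copy so the caller's scores are not modified.
    masked = np.array(scores, dtype=np.float64)
    # Items the user already interacted with can never be selected.
    masked[known_item_ids] = -np.inf
    # Highest remaining scores first.
    return np.argsort(-masked)[:k]

toy_scores = np.array([0.9, 0.1, 0.8, 0.7, 0.3])
toy_known = np.array([0, 2])

print(top_k_unseen(toy_scores, toy_known, k=2))
```

In `recommend`, the known item ids would be `data['train'].tocsr()[uid].indices`.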


<h2>Evaluating performance</h2>

In [37]:
#compute precision@k and AUC on the train and test sets
train_precision = precision_at_k(model, user_item_rating_data['train'], k=10).mean()
test_precision = precision_at_k(model, user_item_rating_data['test'], k=10).mean()

train_auc = auc_score(model, user_item_rating_data['train']).mean()
test_auc = auc_score(model, user_item_rating_data['test']).mean()

print('Precision: train %.2f, test %.2f.' % (train_precision, test_precision))
print('AUC: train %.2f, test %.2f.' % (train_auc, test_auc))

Precision: train 0.64, test 0.11.
AUC: train 0.95, test 0.91.
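
The two metrics measure different things, which helps explain why precision drops so sharply on the test set while AUC stays high. As a simplified single-user sketch (not LightFM's code; function names and toy data are assumptions, and ties are ignored), precision@k looks only at the top k items, whereas AUC considers every positive/negative pair:

```python
import numpy as np

def precision_at_k_single(scores, positives, k):
    # Fraction of the k highest-scored items that are known positives.
    top_k = np.argsort(-scores)[:k]
    return np.isin(top_k, positives).mean()

def auc_single(scores, positives):
    # Probability that a random positive outscores a random negative.
    is_pos = np.zeros(scores.shape[0], dtype=bool)
    is_pos[positives] = True
    pos_scores = scores[is_pos]
    neg_scores = scores[~is_pos]
    return (pos_scores[:, None] > neg_scores[None, :]).mean()

toy_scores = np.array([0.9, 0.8, 0.2, 0.7, 0.1])
toy_positives = np.array([0, 3])

print(precision_at_k_single(toy_scores, toy_positives, k=2))
print(auc_single(toy_scores, toy_positives))
```

The library functions return one value per user, which is why the cells above call `.mean()` on the result.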


<h2>Working with features</h2>

A true hybrid model also uses the features of items and users, rather than relying only on neighborhood similarity or latent factors. The next example uses Stack Exchange data, which is openly available in the Stack Exchange archives.


In [38]:
from lightfm.datasets import fetch_stackexchange

stkex_data = fetch_stackexchange('crossvalidated',
                           test_set_fraction=0.1,
                           indicator_features=False,
                           tag_features=True)

train = stkex_data['train']
test = stkex_data['test']

print('The dataset has %s users and %s items, '
      'with %s interactions in the test and %s interactions in the training set.'
      % (train.shape[0], train.shape[1], test.getnnz(), train.getnnz()))

The dataset has 3221 users and 72360 items, with 4307 interactions in the test and 57830 interactions in the training set.


<h3>Pure collaborative filtering approach</h3>

In [39]:
# Set the number of threads; you can increase this
# if you have more physical cores available.
NUM_THREADS = 2
NUM_COMPONENTS = 30
NUM_EPOCHS = 3
ITEM_ALPHA = 1e-6
# Parameter reference: lyst.github.io/lightfm/docs/lightfm.html

# Let's fit a WARP model: these generally have the best performance.
stkex_cf_model = LightFM(loss='warp', item_alpha=ITEM_ALPHA, no_components=NUM_COMPONENTS)

# Run 3 epochs and time it.
%time stkex_cf_model = stkex_cf_model.fit(train, epochs=NUM_EPOCHS, num_threads=NUM_THREADS)


CPU times: user 448 ms, sys: 28 ms, total: 476 ms
Wall time: 650 ms


In [40]:
# Compute and print the AUC score
train_auc = auc_score(stkex_cf_model, train, num_threads=NUM_THREADS).mean()
print('Collaborative filtering train AUC: %s' % train_auc)
# We pass in the train interactions to exclude them from predictions.
# This is to simulate a recommender system where we do not
# re-recommend things the user has already interacted with in the train
# set.
test_auc = auc_score(stkex_cf_model, test, train_interactions=train, num_threads=NUM_THREADS).mean()
print('Collaborative filtering test AUC: %s' % test_auc)

Collaborative filtering train AUC: 0.888174
Collaborative filtering test AUC: 0.345964


Note that the test AUC above is below 0.5, which means the model ranks test items worse than chance. The estimated item biases cause part of this problem, which we can check by setting them to zero and re-evaluating.

In [56]:
# Set biases to zero
stkex_cf_model.item_biases *= 0.0

test_auc = auc_score(stkex_cf_model, test, train_interactions=train, num_threads=NUM_THREADS).mean()
print('Collaborative filtering test AUC: %s' % test_auc)
print(repr(train))
print(repr(test))

Collaborative filtering test AUC: 0.501792
<3221x72360 sparse matrix of type '<class 'numpy.float32'>'
	with 57830 stored elements in COOrdinate format>
<3221x72360 sparse matrix of type '<class 'numpy.float32'>'
	with 4307 stored elements in COOrdinate format>
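
A toy illustration of why biases can push the AUC below chance. The sketch assumes the usual factorization form, score = user embedding · item embedding + item bias (the user bias is left out because it shifts all of a user's scores equally and does not change the ranking). All numbers are invented: items 2 and 3 stand in for questions that never appeared in training and ended up with strongly negative biases.

```python
import numpy as np

user_embedding = np.array([1.0, 0.0])
item_embeddings = np.array([[0.10, 0.0],
                            [0.20, 0.0],
                            [0.30, 0.0],
                            [0.05, 0.0]])
# Hypothetical biases: items 2 and 3 were only ever seen as negatives.
item_biases = np.array([0.0, 0.0, -2.0, -2.0])

def rank_items(biases):
    # Score every item for this user and sort from best to worst.
    scores = item_embeddings @ user_embedding + biases
    return np.argsort(-scores)

with_biases = rank_items(item_biases)
without_biases = rank_items(np.zeros_like(item_biases))

print(with_biases)
print(without_biases)
```

With the biases in place, the unseen items sit at the bottom of the ranking regardless of their embeddings. In the real run above, zeroing the biases only brings the test AUC back to about 0.50, consistent with the embeddings of unseen questions carrying no useful information.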


<h2>Let's incorporate some features</h2><br>This is where things start to get really interesting.

In [45]:
item_features = stkex_data['item_features']
tag_labels = stkex_data['item_feature_labels']

print('There are %s distinct tags, with values like %s.' % (item_features.shape[1], tag_labels[:3].tolist()))

There are 1246 distinct tags, with values like ['bayesian', 'prior', 'elicitation'].
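
For intuition about what such an item-by-tag matrix looks like, here is a small sketch that builds one by hand with SciPy. The questions and tags are made up, and this is not necessarily how `fetch_stackexchange` constructs its matrix (for example, it may weight or normalise the entries differently):

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical questions, each with a list of tags.
toy_item_tags = [
    ['bayesian', 'prior'],
    ['regression'],
    ['bayesian', 'regression'],
]

# Assign each distinct tag a column index.
toy_tag_labels = sorted({tag for tags in toy_item_tags for tag in tags})
tag_index = {tag: j for j, tag in enumerate(toy_tag_labels)}

# Collect one (row, col) pair per (item, tag) occurrence.
rows, cols = [], []
for item_id, tags in enumerate(toy_item_tags):
    for tag in tags:
        rows.append(item_id)
        cols.append(tag_index[tag])

toy_item_features = sp.csr_matrix(
    (np.ones(len(rows), dtype=np.float32), (rows, cols)),
    shape=(len(toy_item_tags), len(toy_tag_labels)),
)

print(toy_tag_labels)
print(toy_item_features.toarray())
```

Each row is an item and each column a tag, so the matrix has the same layout as `item_features` above (72360 items by 1246 tags).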


In [46]:
# Define a new model instance
stkex_hyb_model = LightFM(loss='warp',
                item_alpha=ITEM_ALPHA,
                no_components=NUM_COMPONENTS)

# Fit the hybrid model. Note that this time, we pass
# in the item features matrix.
stkex_hyb_model = stkex_hyb_model.fit(train,
                item_features=item_features,
                epochs=NUM_EPOCHS,
                num_threads=NUM_THREADS)


In [47]:
# Don't forget to pass in the item features again!
train_auc = auc_score(stkex_hyb_model,
                      train,
                      item_features=item_features,
                      num_threads=NUM_THREADS).mean()
print('Hybrid training set AUC: %s' % train_auc)

Hybrid training set AUC: 0.856972


In [48]:
test_auc = auc_score(stkex_hyb_model,
                    test,
                    train_interactions=train,
                    item_features=item_features,
                    num_threads=NUM_THREADS).mean()
print('Hybrid test set AUC: %s' % test_auc)

Hybrid test set AUC: 0.708887


<h2>Byproduct</h2>
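Word2Vec-like tag similarity via item_embeddings. Because the hybrid model was fit with tag features, each row of item_embeddings is the embedding of one tag.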

In [54]:
def get_similar_tags(model, tag_id):
    # Define similarity as the cosine of the angle
    # between the tag latent vectors
    
    # Normalize the vectors to unit length
    tag_embeddings = (model.item_embeddings.T
                      / np.linalg.norm(model.item_embeddings, axis=1)).T
    
    query_embedding = tag_embeddings[tag_id]
    similarity = np.dot(tag_embeddings, query_embedding)
    most_similar = np.argsort(-similarity)[1:4]
    
    return most_similar


for tag in (u'events', u'prior', u'elicitation'):
    tag_id = tag_labels.tolist().index(tag)
    print('Most similar tags for %s: %s' % (tag_labels[tag_id],
                                            tag_labels[get_similar_tags(stkex_hyb_model, tag_id)]))

Most similar tags for events: ['ancillary-statistics' 'box-muller' 'bimodal']
Most similar tags for prior: ['mcmc' 'bayesian' 'range']
Most similar tags for elicitation: ['t-distribution' 'plyr' 'rule-of-thumb']
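
The `[1:4]` slice in `get_similar_tags` relies on the query tag being its own nearest neighbour. A quick check on toy vectors (hypothetical values, unrelated to the fitted model) shows the same normalise-then-dot-product computation and why position 0 is skipped:

```python
import numpy as np

# Four toy embeddings; row 1 points almost the same way as row 0,
# row 2 is orthogonal to it, and row 3 points the opposite way.
toy_embeddings = np.array([[1.0, 0.0],
                           [2.0, 0.1],
                           [0.0, 3.0],
                           [-1.0, 0.0]])

# Scale every row to unit length so that dot products are cosines.
unit = toy_embeddings / np.linalg.norm(toy_embeddings, axis=1, keepdims=True)

# Cosine similarity of every row to row 0.
similarity = unit @ unit[0]
order = np.argsort(-similarity)

print(order)
print(order[1:3])
```

The first entry of `order` is the query itself (cosine 1), so the neighbours start at index 1. Vector length plays no role: row 1 is twice as long as row 0 but still ranks as the closest neighbour.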
