## Useful Advice

- How can we store and update the model when new users, new items, or new interactions come in?
    - The most robust answer is to periodically recompute the model from scratch, which handles all three cases. For new interactions alone, you can run additional fitting iterations on the same model with the new data. (You can persist the model by pickling it.) Adding new users or items (known as fold-in) is somewhat tricky and not explicitly supported. For models that naturally take in new information (and new users), have a look at [sequence-based models](https://maciejkula.github.io/spotlight/index.html#sequential-models).
- Getting similar items
    - https://github.com/lyst/lightfm/issues/244
- bdicts library to convert UUIDs to 32-bit ints for recommender libraries
    - https://stackoverflow.com/questions/48068147/recommender-systems-convert-uuids-to-32-bit-ints-for-recommender-libraries
- Evaluation:
    - The scores themselves have no meaning in isolation; they are only meaningful because they define a ranking over items for a given user. The scale they take depends on the loss you specify, the learning rate, the regularization parameters, and the data itself. I recommend keeping an eye on the MRR/AUC scores of your model, and comparing them with what a random or a popularity model would achieve.
- Mini-batch training for large datasets:
    - https://github.com/lyst/lightfm/issues/234
- Item-Item Recommendation ???
    - https://github.com/lyst/lightfm/issues/239#issuecomment-352774493
- Sample weight for different interactions
    - https://github.com/lyst/lightfm/issues/260
- Handling non-categorical features
    - https://github.com/lyst/lightfm/issues/261
- Explaining Recommendation
    - Try using item similarity to present support for your recommendations. For example, you could compute the cosine similarity between the embedding of the item you are recommending and the embeddings of items the user has interacted with in the past. The items from the user's history that are most similar to the recommendation then serve as its explanation.
    - https://github.com/lyst/lightfm/issues/251#issuecomment-363314396
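
The cosine-similarity explanation described in the last bullet can be sketched as follows. This is a minimal illustration, not LightFM's own API: `explain_recommendation` and `toy_embeddings` are hypothetical names, and the toy matrix stands in for a trained model's item embeddings (for example the `model.item_embeddings` attribute of a fitted model without item features).

```python
import numpy as np


def explain_recommendation(item_embeddings, recommended_item, history_items, top_n=3):
    """Return (item, similarity) pairs for the history items closest to the recommendation.

    item_embeddings: (n_items, dim) array of item embeddings
    recommended_item: index of the item being recommended
    history_items: indices of items the user interacted with in the past
    """
    history_items = np.asarray(history_items)
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    unit = item_embeddings / np.maximum(norms, 1e-12)
    sims = unit[history_items] @ unit[recommended_item]
    # Sort history items from most to least similar and keep the top_n.
    order = np.argsort(-sims)[:top_n]
    return [(int(history_items[i]), float(sims[i])) for i in order]


# Toy embeddings standing in for a trained model's item embeddings (assumption).
toy_embeddings = np.array([
    [1.0, 0.0],   # item 0
    [0.9, 0.1],   # item 1
    [0.0, 1.0],   # item 2
    [-1.0, 0.0],  # item 3
])
support = explain_recommendation(
    toy_embeddings, recommended_item=0, history_items=[1, 2, 3], top_n=2
)
print(support)
```

Here item 1 comes out as the strongest support for recommending item 0, because its embedding points in almost the same direction.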

In [1]:
import numpy as np

from lightfm.datasets import fetch_movielens

movielens = fetch_movielens()



In [2]:
train = movielens['train']
test = movielens['test']

In [3]:
test

<943x1682 sparse matrix of type '<class 'numpy.int32'>'
	with 9430 stored elements in COOrdinate format>

In [4]:
from time import time
from lightfm import LightFM
from lightfm.evaluation import precision_at_k
from lightfm.evaluation import auc_score

model = LightFM(learning_rate=0.05, loss='bpr')

start = time()
model.fit(train, epochs=50)
print(time() - start)

train_precision = precision_at_k(model, train, k=10).mean()
test_precision = precision_at_k(model, test, k=10).mean()

train_auc = auc_score(model, train).mean()
test_auc = auc_score(model, test).mean()

print('Precision: train %.2f, test %.2f.' % (train_precision, test_precision))
print('AUC: train %.2f, test %.2f.' % (train_auc, test_auc))

3.4169018268585205
Precision: train 0.62, test 0.09.
AUC: train 0.92, test 0.87.
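
The evaluation advice above suggests comparing these numbers with what a popularity model would achieve. A rough sketch of such a baseline follows; `popularity_auc` is a hypothetical helper, it works on dense arrays for simplicity, and, like `auc_score` called without `train_interactions`, it treats every item outside a user's test set as a negative.

```python
import numpy as np


def popularity_auc(train, test):
    """Mean per-user AUC when every user is shown the same popularity ranking.

    train, test: dense (n_users, n_items) arrays; a nonzero entry is an interaction.
    """
    # One score per item: the number of users who interacted with it in train.
    popularity = (train > 0).sum(axis=0).astype(float)
    aucs = []
    for user_test in test:
        pos = popularity[user_test > 0]
        neg = popularity[user_test == 0]
        if len(pos) == 0 or len(neg) == 0:
            continue  # AUC is undefined for this user
        # Fraction of (positive, negative) pairs ordered correctly; ties count half.
        wins = (pos[:, None] > neg[None, :]).sum()
        ties = (pos[:, None] == neg[None, :]).sum()
        aucs.append((wins + 0.5 * ties) / (len(pos) * len(neg)))
    return float(np.mean(aucs))


# Tiny made-up example: 3 users, 3 items.
toy_train = np.array([[1, 1, 0], [1, 0, 0], [1, 1, 0]])
toy_test = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
baseline = popularity_auc(toy_train, toy_test)
print(baseline)  # 0.5
```

With the MovieLens matrices loaded earlier, the call would presumably be `popularity_auc(train.toarray(), test.toarray())`; a model whose test AUC is not clearly above that value has learned little beyond item popularity.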


In [7]:
?model.predict_rank

In [8]:
ranks = model.predict_rank(test)
ranks[0]

<1x1682 sparse matrix of type '<class 'numpy.float32'>'
	with 10 stored elements in Compressed Sparse Row format>

In [9]:
ranks[0].data

array([ 438.,  324.,  337.,   82.,  177.,  288.,  191.,  166.,   22.,   46.], dtype=float32)
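
To connect these ranks to the metrics used earlier, the sketch below recomputes precision at 10 and the reciprocal rank for this one user by hand. It assumes, as the LightFM documentation for `predict_rank` states, that ranks are 0-based (rank 0 is the top of the list); the variable names are illustrative only.

```python
import numpy as np

# Ranks of user 0's ten test items, copied from the output above.
# Ranks are 0-based: rank 0 is the highest-scored item.
user_ranks = np.array([438., 324., 337., 82., 177., 288., 191., 166., 22., 46.])

k = 10
# Precision at k: share of the top-k slots occupied by test items.
precision_at_10 = np.sum(user_ranks < k) / k
# Reciprocal rank: 1 / (1-based position of the best-ranked test item).
reciprocal_rank = 1.0 / (user_ranks.min() + 1.0)

print(precision_at_10)            # 0.0
print(round(reciprocal_rank, 4))  # 0.0435
```

None of this user's test items reaches the top 10, so the user contributes zero to the mean test precision even though the best test item sits at position 23 out of 1682.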

In [5]:
model = LightFM(learning_rate=0.05, loss='warp')

start = time()
model.fit_partial(train, epochs=50)
print(time() - start)

train_precision = precision_at_k(model, train, k=10).mean()
test_precision = precision_at_k(model, test, k=10).mean()

train_auc = auc_score(model, train).mean()
test_auc = auc_score(model, test).mean()

print('Precision: train %.2f, test %.2f.' % (train_precision, test_precision))
print('AUC: train %.2f, test %.2f.' % (train_auc, test_auc))

3.286494016647339
Precision: train 0.65, test 0.11.
AUC: train 0.95, test 0.91.
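
Because `fit_partial` resumes training from the model's current state, it pairs naturally with the pickling advice at the top of these notes. The helpers below are a minimal sketch; `save_model`, `load_model`, the file name, and `new_interactions` are all hypothetical, and the helpers work for any picklable object.

```python
import pickle


def save_model(model, path):
    """Serialise a fitted model (any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Intended use with the model above (not executed here):
#   save_model(model, "lightfm_warp.pkl")
#   model = load_model("lightfm_warp.pkl")
#   # new_interactions is a hypothetical matrix of newly observed interactions.
#   model.fit_partial(new_interactions, epochs=5)
```

As far as I can tell, `new_interactions` needs the same shape as the original training matrix, since the model's embeddings are sized at the first fit; new users or items still call for retraining from scratch.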
