Spotlight on Implicit Library
======
by Sanghyeon Lee

In this spotlight, I try out three recommendation algorithms that the Implicit library covers and test their results. I also check how much faster the library is than a simple implementation of the algorithm from the paper.

Implicit Feedback Datasets
------
An implicit feedback dataset records users' interactions with items (for example views, clicks, or purchases) without stating how much the user prefers each item. An explicit feedback dataset, on the other hand, records clear preferences that users give to items, such as star ratings.
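As a minimal sketch of the difference, the snippet below turns a made-up interaction log into the binary preference and confidence values that ALS works with later in this notebook. The event list, the variable names, and the choice of `alpha = 40` (the same value as `confidence` in section 1) are illustrative assumptions.

```python
import numpy as np

# Hypothetical interaction log: one (user, item) pair per click or play.
events = [(0, 1), (0, 1), (0, 2), (1, 0), (2, 2), (2, 2), (2, 2)]
n_users, n_items = 3, 3

# Count how often each user interacted with each item (r_ui).
counts = np.zeros((n_users, n_items))
for user, item in events:
    counts[user, item] += 1

alpha = 40  # confidence scaling (assumed, matches `confidence` in section 1)
preference = (counts > 0).astype(int)  # p_ui: 1 if any interaction was seen
confidence = 1 + alpha * counts        # c_ui: grows with repeated interactions

print(preference)
print(confidence)
```

Note that a zero in `preference` does not mean the user dislikes the item; it only means that no interaction was observed, which is why such entries get the lowest confidence.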

Implicit Library
-----
The Implicit library [1] provides fast Python implementations of popular recommendation algorithms for implicit feedback datasets.

Recommendation algorithms for implicit feedback datasets
-----
1. Alternating Least Squares (ALS), from the papers Collaborative Filtering for Implicit Feedback Datasets [2] and Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering [3].
2. Bayesian Personalized Ranking (BPR), from the paper BPR: Bayesian Personalized Ranking from Implicit Feedback [4].
3. Logistic Matrix Factorization (LMF), from the paper Logistic matrix factorization for implicit feedback data [5].
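All three algorithms learn latent factors for users and items, but they score a user/item pair against different objectives. The sketch below is only an illustration of the core term of each objective for a single user, as I understand it from the papers. The factor values, the bias values, and all names are made up, and regularization and confidence weights are left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 2-dimensional latent factors for one user and two items.
x_u = np.array([0.5, -0.2])
y_i = np.array([1.0, 0.3])    # item the user interacted with
y_j = np.array([-0.4, 0.8])   # item the user did not interact with
beta_u, beta_i = 0.1, -0.3    # hypothetical user and item biases (LMF only)

# ALS: the dot product is fitted to a binary preference by least squares.
als_score = x_u @ y_i

# BPR: log-sigmoid of the score difference between an observed item i
# and an unobserved item j (maximized during training).
bpr_term = np.log(sigmoid(x_u @ y_i - x_u @ y_j))

# LMF: probability of an interaction, a sigmoid of dot product plus biases.
lmf_prob = sigmoid(x_u @ y_i + beta_u + beta_i)

print(als_score, bpr_term, lmf_prob)
```

Because the three scores live on different scales, the similarity scores of the three models in section 3 should not be compared directly with each other.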


Index
-----
1. ALS implementation on simple rating matrix
2. ALS implementation on simple rating matrix using Implicit Library
3. Test based on MovieLens dataset using Implicit Library
    1. Alternating Least Squares model test
    2. Bayesian Personalized Ranking model test
    3. Logistic Matrix Factorization test


Installation
-----
``pip install implicit``





1 . Alternating Least Squares implementation on simple rating matrix
=======

In [1]:
import numpy as np

confidence = 40
dimLatentFactor = 200
regularization = 40
iteration = 15


R = np.array([[0, 0, 0, 0, 4, 0, 0, 0, 0, 2, 0],
              [2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
              [0, 3, 4, 0, 3, 0, 0, 2, 0, 0, 4],
              [0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 2],
              [3, 0, 0, 0, 0, 0, 5, 0, 0, 5, 0],
              [0, 0, 4, 0, 0, 0, 0, 2, 0, 0, 0],
              [0, 0, 0, 3, 0, 4, 0, 0, 0, 0, 4],
              [2, 0, 0, 0, 0, 0, 5, 0, 0, 5, 0],
              [0, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0]])

nu = R.shape[0]
ni = R.shape[1]

# initialize Latent Factor Matrix X and Y with very small values
X = np.random.rand(nu, dimLatentFactor) * 0.01
Y = np.random.rand(ni, dimLatentFactor) * 0.01

# initialize Binary Rating Matrix P
# Convert original rating matrix R into P
P = np.copy(R)
P[P > 0] = 1

# initialize Confidence Matrix C
C = 1 + confidence * R

1.1. Loss Function
-----
$$
\min_{x_{*}, y_{*}} \sum_{u,i} c_{ui} (p_{ui} - x_u^T y_i)^2 + \lambda \left( \sum_u \parallel x_u \parallel^2 + \sum_i \parallel y_i \parallel^2 \right)
$$

prediction error: $(p_{ui} - x_u^T y_i)^2$ (binary rating prediction error)

confidence error: $c_{ui} (p_{ui} - x_u^T y_i)^2$ (prediction error weighted by the confidence level)

regularization: $\lambda (\sum_u \parallel x_u \parallel^2 + \sum_i \parallel y_i \parallel^2)$

total loss: confidence error + regularization


In [2]:
def lossFunction(C, P, xTy, X, Y, regularization):
    # squared error between binary preferences and predictions
    predictError = np.square(P - xTy)
    # prediction error weighted by the confidence matrix
    confidenceError = np.sum(C * predictError)
    # L2 penalty on both latent factor matrices
    regularizations = regularization * (np.sum(np.square(X)) + np.sum(np.square(Y)))
    totalLoss = confidenceError + regularizations
    return np.sum(predictError), confidenceError, regularizations, totalLoss

1.2. Optimization Function
------
Latent Factor Optimizers

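$$
x_u = (Y^T C^u Y + \lambda I)^{-1} Y^T C^u p(u)
$$

$$
y_i = (X^T C^i X + \lambda I)^{-1} X^T C^i p(i)
$$

Here $C^u$ is the diagonal matrix of user $u$'s confidences and $p(u)$ is the vector of that user's binary preferences; $C^i$ and $p(i)$ are defined in the same way for item $i$.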
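As a short derivation sketch, the user update follows from fixing $Y$ and setting the gradient of the confidence-weighted loss with respect to $x_u$ to zero. Here $C^u$ denotes the diagonal matrix with entries $c_{ui}$ and $p(u)$ the vector with entries $p_{ui}$; the item update is obtained in the same way with the roles of users and items swapped.

$$
\begin{aligned}
L(x_u) &= \sum_i c_{ui} (p_{ui} - x_u^T y_i)^2 + \lambda \parallel x_u \parallel^2 \\
\frac{\partial L}{\partial x_u} &= -2 \sum_i c_{ui} (p_{ui} - x_u^T y_i) y_i + 2 \lambda x_u = 0 \\
(Y^T C^u Y + \lambda I) x_u &= Y^T C^u p(u) \\
x_u &= (Y^T C^u Y + \lambda I)^{-1} Y^T C^u p(u)
\end{aligned}
$$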

In [3]:
def optimizeUser(X, Y, C, P, nu, dimLatentFactor, regularization):
    yT = np.transpose(Y)
    for u in range(nu):
        Cu = np.diag(C[u])
        yT_Cu_y = np.matmul(np.matmul(yT, Cu), Y)
        lI = np.dot(regularization, np.identity(dimLatentFactor))
        yT_Cu_pu = np.matmul(np.matmul(yT, Cu), P[u])
        X[u] = np.linalg.solve(yT_Cu_y + lI, yT_Cu_pu)

def optimizeItem(X, Y, C, P, ni, dimLatentFactor, regularization):
    xT = np.transpose(X)
    for i in range(ni):
        Ci = np.diag(C[:, i])
        xT_Ci_x = np.matmul(np.matmul(xT, Ci), X)
        lI = np.dot(regularization, np.identity(dimLatentFactor))
        xT_Ci_pi = np.matmul(np.matmul(xT, Ci), P[:, i])
        Y[i] = np.linalg.solve(xT_Ci_x + lI, xT_Ci_pi)
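The functions above build a dense diagonal matrix for every user and item, which is simple but wasteful. The paper [2] points out that $Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y$, where $C^u - I$ is non-zero only for the items the user has rated. The snippet below is a small check of that identity on random factors. The sizes and names are illustrative, and it makes no claim about how the Implicit library is implemented internally.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_factors = 11, 4
Y = rng.random((n_items, n_factors))

# Confidence values for one user: 1 + 40 * ratings (row 0 of R in this notebook).
c_u = 1 + 40 * np.array([0, 0, 0, 0, 4, 0, 0, 0, 0, 2, 0])

# Direct form, as in optimizeUser: needs an n_items x n_items diagonal matrix.
direct = Y.T @ np.diag(c_u) @ Y

# Rewritten form: Y^T Y is shared by all users, only rated items add a correction.
YtY = Y.T @ Y
rated = np.flatnonzero(c_u > 1)
Y_rated = Y[rated]
shortcut = YtY + Y_rated.T @ ((c_u[rated] - 1)[:, None] * Y_rated)

print(np.allclose(direct, shortcut))
```

With this rewriting, `YtY` is computed once per sweep and each user only touches the rows of `Y` for the items they rated.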

1.3. Model Training
-----

In [4]:
%%time

for i in range(iteration):
    # the first pass (i == 0) is skipped, so the factors are updated iteration - 1 times
    if i != 0:
        optimizeUser(X, Y, C, P, nu, dimLatentFactor, regularization)
        optimizeItem(X, Y, C, P, ni, dimLatentFactor, regularization)

predict = np.matmul(X, np.transpose(Y))
print('Item recommendation for each users')
for i in range(len(predict)):
    indx = np.argmax(predict[i])
    print('For user ',i,', item',indx, 'is recommended with ',predict[i][indx],' of score')

Item recommendation for each users
For user  0 , item 9 is recommended with  0.8189000020922086  of score
For user  1 , item 3 is recommended with  0.8653270615814131  of score
For user  2 , item 1 is recommended with  0.6323422073162843  of score
For user  3 , item 1 is recommended with  0.9690189464115635  of score
For user  4 , item 1 is recommended with  0.8557531274158778  of score
For user  5 , item 6 is recommended with  0.9316228652695778  of score
For user  6 , item 1 is recommended with  0.8450908789112005  of score
For user  7 , item 3 is recommended with  0.9287112769416471  of score
For user  8 , item 6 is recommended with  0.9223089587055476  of score
For user  9 , item 3 is recommended with  0.8113946177451477  of score
CPU times: user 697 ms, sys: 0 ns, total: 697 ms
Wall time: 138 ms
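In the output above, the top-scoring item for most users is one they already rated (for example, user 0 rated item 9). If the goal is to suggest new items, the rated ones can be masked out before ranking. The helper below is a hypothetical sketch on made-up scores and is not part of the original notebook code.

```python
import numpy as np

def top_n_unseen(scores, rated_mask, n=1):
    """Return indices of the n highest-scoring items the user has not rated."""
    masked = np.where(rated_mask, -np.inf, scores)
    return np.argsort(-masked)[:n]

# Hypothetical scores for one user over 5 items; items 1 and 3 are already rated.
scores = np.array([0.2, 0.9, 0.5, 0.8, 0.1])
rated = np.array([False, True, False, True, False])

print(top_n_unseen(scores, rated, n=2))  # → [2 0]
```

For the matrices in this section, the call would be `top_n_unseen(predict[u], R[u] > 0)` for a user index `u`.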


In [5]:
print('User/Item Matrix with predicted recommendations')
print(predict)

User/Item Matrix with predicted recommendations
[[0.73001581 0.63084186 0.5852843  0.75621235 0.81081852 0.56075793
  0.74837494 0.55336634 0.         0.8189     0.76564602]
 [0.79462611 0.51352688 0.40615924 0.86532706 0.68816839 0.6494663
  0.82500854 0.38966112 0.         0.82531365 0.74732484]
 [0.2146761  0.63234221 0.60640689 0.29700636 0.504976   0.34399379
  0.17993764 0.58344294 0.         0.26203917 0.55863135]
 [0.52603874 0.96901895 0.91939471 0.63353651 0.88769542 0.6143943
  0.49286105 0.88125052 0.         0.6123354  0.93729686]
 [0.42787612 0.85575313 0.78765904 0.55589193 0.7247942  0.57151699
  0.39269657 0.75993603 0.         0.47744857 0.83058351]
 [0.87670699 0.35551365 0.27254749 0.88587351 0.67634803 0.58069699
  0.93162287 0.25463194 0.         0.92783773 0.64661234]
 [0.2544088  0.84509088 0.82944438 0.34782154 0.68135464 0.41529393
  0.20610399 0.79563596 0.         0.33146933 0.72123916]
 [0.78231665 0.73163666 0.58524231 0.92871128 0.76244247 0.78960521
  0.

2 . Alternating Least Squares implementation on simple rating matrix using Implicit Library
=======

In [6]:
#pip install implicit

In [7]:
%%time
import implicit
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, random

#best parameter described in " Y. Hu et al, Collaborative Filtering for Implicit Feedback Datasets"
confidence = 40
dimLatentFactor = 200
regularization = 40
iteration = 15

# # # sample rating matrix
raw =  [[0, 0, 0, 0, 4, 0, 0, 0, 0, 2, 0],
        [2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 1],
        [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
        [0, 3, 4, 0, 3, 0, 0, 2, 0, 0, 4],
        [0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 2],
        [3, 0, 0, 0, 0, 0, 5, 0, 0, 5, 0],
        [0, 0, 4, 0, 0, 0, 0, 2, 0, 0, 0],
        [0, 0, 0, 3, 0, 4, 0, 0, 0, 0, 4],
        [2, 0, 0, 0, 0, 0, 5, 0, 0, 5, 0],
        [0, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0]]

#number of users
nu = len(raw)

counts = csr_matrix(raw, dtype=np.float64)

user_items = counts * confidence
item_users = user_items.T

CPU times: user 584 ms, sys: 913 ms, total: 1.5 s
Wall time: 215 ms


2.1. Model select from Implicit Library
------

In [8]:
# initialize a model with ALS algorithm
model = implicit.als.AlternatingLeastSquares(factors=dimLatentFactor,
                                        regularization=regularization,
                                        use_native=False,
                                        use_cg=False,
                                        iterations=iteration)

'''
# initialize a model with BPR algorithm
model = implicit.bpr.BayesianPersonalizedRanking(factors=dimLatentFactor,
                                        regularization=regularization,
                                        iterations=iteration)
                                        
# initialize a model with Logistic Matrix Factorization algorithm                                        
model = implicit.lmf.LogisticMatrixFactorization(factors=dimLatentFactor,
                                        regularization=regularization,
                                        iterations=iteration)
                                                                       
'''




2.2. Model training
-------

In [9]:
%%time
# train the model on a sparse matrix of item/user/confidence weights
np.random.seed(2020)
model.fit(item_users)

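# recommend items for users
# `recommend` expects the user-by-item matrix; already-rated items are kept
# so that the result is comparable with the argmax used in section 1.3
print('Item recommendation for each users')
for i in range(nu):
    rec = model.recommend(i, user_items, N=1, filter_already_liked_items=False)
    item, score = rec[0]
    print('For user ',i,', item',item, 'is recommended with ',score,' of score')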



Item recommendation for each users
For user  0 , item 9 is recommended with  0.81682473  of score
For user  1 , item 3 is recommended with  0.86407113  of score
For user  2 , item 1 is recommended with  0.6257595  of score
For user  3 , item 1 is recommended with  0.96877205  of score
For user  4 , item 1 is recommended with  0.85487294  of score
For user  5 , item 6 is recommended with  0.9312121  of score
For user  6 , item 1 is recommended with  0.8436243  of score
For user  7 , item 3 is recommended with  0.9280933  of score
For user  8 , item 6 is recommended with  0.92179334  of score
For user  9 , item 3 is recommended with  0.80980325  of score
CPU times: user 960 ms, sys: 14.5 ms, total: 975 ms
Wall time: 188 ms


In [10]:
#reconstruct user_item_data with prediction
rows, cols = model.user_factors, model.item_factors
reconstructed = rows.dot(cols.T)
print('User/Item Matrix with predicted recommendations')
print(reconstructed)

User/Item Matrix with predicted recommendations
[[0.7274743  0.6258645  0.5780643  0.7580672  0.8096755  0.5659705
  0.7455008  0.5458418  0.         0.81682473 0.76590437]
 [0.7925495  0.50359607 0.3961622  0.8640711  0.6862477  0.6482832
  0.82320595 0.37948275 0.         0.8253248  0.7419009 ]
 [0.20252983 0.62575954 0.600812   0.28676173 0.49821347 0.33905017
  0.16691616 0.5766836  0.         0.24980061 0.55165   ]
 [0.51324373 0.968772   0.919315   0.62567884 0.8866189  0.61491966
  0.4783373  0.8793268  0.         0.6004365  0.9369913 ]
 [0.4165172  0.8548732  0.78825164 0.54732835 0.7246945  0.56920743
  0.37993887 0.7586232  0.         0.46772486 0.82880586]
 [0.8755797  0.34092456 0.25614685 0.8862569  0.6696711  0.58071303
  0.9312121  0.23906814 0.         0.9274596  0.63871104]
 [0.2391633  0.843624   0.8283986  0.3369742  0.67645466 0.4132946
  0.18910562 0.7928772  0.         0.31590137 0.71829796]
 [0.781733   0.72797704 0.58344996 0.9280933  0.76868844 0.78861237
  0.7

2.3. Model Evaluation using Item similarity
-------

In [11]:
print('Item similarity\n')
# item_factors has one row per item, so loop over items rather than users
for i in range(model.item_factors.shape[0]):
    for other, score in model.similar_items(i, 4):
        print("Item %i is similar to item %i with a score of %f" % (i, other, score))
    print('\n')

Item similarity

Item 0 is similar to item 0 with a score of 0.911322
Item 0 is similar to item 6 with a score of 0.908381
Item 0 is similar to item 9 with a score of 0.899549
Item 0 is similar to item 3 with a score of 0.885760


Item 1 is similar to item 1 with a score of 0.943069
Item 1 is similar to item 7 with a score of 0.928176
Item 1 is similar to item 2 with a score of 0.924139
Item 1 is similar to item 10 with a score of 0.863385


Item 2 is similar to item 2 with a score of 0.902895
Item 2 is similar to item 7 with a score of 0.902443
Item 2 is similar to item 1 with a score of 0.884772
Item 2 is similar to item 10 with a score of 0.759710


Item 3 is similar to item 3 with a score of 0.969173
Item 3 is similar to item 0 with a score of 0.941988
Item 3 is similar to item 6 with a score of 0.925512
Item 3 is similar to item 9 with a score of 0.908733


Item 4 is similar to item 4 with a score of 0.911109
Item 4 is similar to item 10 with a score of 0.843509
Item 4 is similar 

2.4. Assert explanation makes sense
-----
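`model.explain` provides an explanation for why an item is recommended to a user.

It returns a list of the top N (itemid, score) contributions for this user/item pair.

ex) In the output below, user 0 is recommended item 9, with contributions attributed to indices 5, 1, and 8.

Note that `explain` takes a user-by-item matrix and reports item IDs. Because the transposed `item_users` matrix is passed in this cell, the indices 5, 1, and 8 come from row 0 of `item_users` (the users who rated item 0), and the explained total (0.618) differs from the recommendation score (0.817). Passing `user_items` instead would be expected to attribute the score to the items user 0 actually rated (items 4 and 9).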

In [12]:
userid=0
recs = model.recommend(userid, item_users, N=1)
top_rec, score = recs[0]

score_explained, contributions, W = model.explain(userid, item_users, itemid=top_rec)
items = [i for i, _ in contributions]
scores = [s for _, s in contributions]
print('User 0 is recommended item',top_rec,', because item', top_rec,
      'got contributions from user',items,'with',scores,'of scores\n')

print('The total predicted score for this user/item pair is ',score_explained)

User 0 is recommended item 9 , because item 9 got contributions from user [5, 1, 8] with [0.5510207960592248, 0.06673046275865645, 0.0] of scores

The total predicted score for this user/item pair is  0.6177512588178813


3 . Test based on MovieLens dataset using Implicit Library
=====


This code automatically downloads an HDF5 version of the dataset when first run. The original dataset can be found here:

MovieLens [6] 
https://grouplens.org/datasets/movielens/

Since this dataset contains explicit 5-star ratings, the ratings are
filtered down to positive reviews (4+ stars) to construct an implicit
dataset.

The calculated similar movies are stored in the similar-movies.tsv file.

In this case, I printed and saved only the first 99 rows (the counter in the code stops at `i < 100`), starting with the most-rated movies.
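The filtering step described above can be sketched on a small dense matrix. The ratings below are made up, and the real code in the next cell works on a SciPy sparse matrix instead, so this is only meant to show the effect of the threshold.

```python
import numpy as np

# Hypothetical explicit ratings (0 = not rated), 3 users x 4 movies.
ratings = np.array([[5.0, 3.0, 0.0, 4.0],
                    [0.0, 4.5, 2.0, 0.0],
                    [1.0, 0.0, 0.0, 3.5]])
min_rating = 4.0

# Keep only positive reviews and treat them as binary preferences.
implicit_data = (ratings >= min_rating).astype(np.float64)

print(implicit_data)
```

After filtering, the last user and the third movie have no interactions left. The same thing happens in the real dataset, which is why the code below skips movies that have no remaining ratings.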

In [13]:
from __future__ import print_function

import argparse
import codecs
import logging
import time

import numpy as np
import tqdm

from implicit.als import AlternatingLeastSquares
from implicit.bpr import BayesianPersonalizedRanking
from implicit.datasets.movielens import get_movielens
from implicit.lmf import LogisticMatrixFactorization
from implicit.nearest_neighbours import (BM25Recommender, CosineRecommender,
                                         TFIDFRecommender, bm25_weight)

log = logging.getLogger("implicit")


def calculate_similar_movies(output_filename,
                             model_name="als", min_rating=4.0,
                             variant='20m'):
    # read in the input data file
    start = time.time()
    titles, ratings = get_movielens(variant)

    # remove things < min_rating, and convert to implicit dataset
    # by considering ratings as a binary preference only
    ratings.data[ratings.data < min_rating] = 0
    ratings.eliminate_zeros()
    ratings.data = np.ones(len(ratings.data))

    log.info("read data file in %s", time.time() - start)

    # generate a recommender model based off the input params
    if model_name == "als":
        model = AlternatingLeastSquares()

        # lets weight these models by bm25weight.
        log.debug("weighting matrix by bm25_weight")
        ratings = (bm25_weight(ratings, B=0.9) * 5).tocsr()

    elif model_name == "bpr":
        model = BayesianPersonalizedRanking()

    elif model_name == "lmf":
        model = LogisticMatrixFactorization()

    elif model_name == "tfidf":
        model = TFIDFRecommender()

    elif model_name == "cosine":
        model = CosineRecommender()

    elif model_name == "bm25":
        model = BM25Recommender(B=0.2)

    else:
        raise NotImplementedError("TODO: model %s" % model_name)

    # train the model
    log.debug("training model %s", model_name)
    start = time.time()
    model.fit(ratings)
    log.debug("trained model '%s' in %s", model_name, time.time() - start)
    log.debug("calculating top movies")

    user_count = np.ediff1d(ratings.indptr)
    to_generate = sorted(np.arange(len(titles)), key=lambda x: -user_count[x])

    log.debug("calculating similar movies")
    with tqdm.tqdm(total=len(to_generate)) as progress:
        with codecs.open(output_filename, "w", "utf8") as o:
            i = 0
            for movieid in to_generate:
                # if this movie has no ratings, skip over (for instance 'Graffiti Bridge' has
                # no ratings > 4 meaning we've filtered out all data for it.
                if ratings.indptr[movieid] != ratings.indptr[movieid + 1]:
                    title = titles[movieid]
                    for other, score in model.similar_items(movieid, 11):
                        i += 1
                        if i < 100:
                            print("%s\t%s\t%s" % (title, titles[other], score))
                            o.write("%s\t%s\t%s\n" % (title, titles[other], score))
                        else:
                            break
                #progress.update(1)

A. Alternating Least Squares model
------

In [14]:
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generates related movies from the MovieLens 20M "
                                     "dataset (https://grouplens.org/datasets/movielens/20m/)",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    parser.add_argument('--output', type=str, default='similar-movies.tsv',
                        dest='outputfile', help='output file name')
    parser.add_argument('--model', type=str, default='als',
                        dest='model', help='model to calculate (als/bpr/lmf/bm25/tfidf/cosine)')
    parser.add_argument('--variant', type=str, default='20m', dest='variant',
                        help='Whether to use the 20m, 10m, 1m or 100k movielens dataset')
    parser.add_argument('--min_rating', type=float, default=4.0, dest='min_rating',
                        help='Minimum rating to assume that a rating is positive')
    args, unknown = parser.parse_known_args()  #args = parser.parse_args()

    logging.basicConfig(level=logging.DEBUG)

    calculate_similar_movies(args.outputfile,
                             model_name=args.model,
                             min_rating=args.min_rating, variant=args.variant)



Shawshank Redemption, The (1994)	Shawshank Redemption, The (1994)	0.41175756
Shawshank Redemption, The (1994)	Silence of the Lambs, The (1991)	0.40154997
Shawshank Redemption, The (1994)	Good Will Hunting (1997)	0.4010231
Shawshank Redemption, The (1994)	Schindler's List (1993)	0.40032485
Shawshank Redemption, The (1994)	Forrest Gump (1994)	0.39940026
Shawshank Redemption, The (1994)	Seven (a.k.a. Se7en) (1995)	0.39861926
Shawshank Redemption, The (1994)	Usual Suspects, The (1995)	0.39752764
Shawshank Redemption, The (1994)	Pulp Fiction (1994)	0.3952949
Shawshank Redemption, The (1994)	Truman Show, The (1998)	0.3947969
Shawshank Redemption, The (1994)	Green Mile, The (1999)	0.39478654
Shawshank Redemption, The (1994)	Saving Private Ryan (1998)	0.39394292
Pulp Fiction (1994)	Pulp Fiction (1994)	0.47626424
Pulp Fiction (1994)	Reservoir Dogs (1992)	0.4687731
Pulp Fiction (1994)	Usual Suspects, The (1995)	0.46771082
Pulp Fiction (1994)	Seven (a.k.a. Se7en) (1995)	0.46470976
Pulp Fiction (




B. Bayesian Personalized Ranking model implementation on MovieLens dataset using Implicit Library
-----

In [15]:
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generates related movies from the MovieLens 20M "
                                     "dataset (https://grouplens.org/datasets/movielens/20m/)",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    parser.add_argument('--output', type=str, default='similar-movies.tsv',
                        dest='outputfile', help='output file name')
    parser.add_argument('--model', type=str, default='bpr',
                        dest='model', help='model to calculate (als/bpr/lmf/bm25/tfidf/cosine)')
    parser.add_argument('--variant', type=str, default='20m', dest='variant',
                        help='Whether to use the 20m, 10m, 1m or 100k movielens dataset')
    parser.add_argument('--min_rating', type=float, default=4.0, dest='min_rating',
                        help='Minimum rating to assume that a rating is positive')
    args, unknown = parser.parse_known_args()  #args = parser.parse_args()

    logging.basicConfig(level=logging.DEBUG)

    calculate_similar_movies(args.outputfile,
                             model_name=args.model,
                             min_rating=args.min_rating, variant=args.variant)



Shawshank Redemption, The (1994)	Shawshank Redemption, The (1994)	2.6484053
Shawshank Redemption, The (1994)	Schindler's List (1993)	2.071487
Shawshank Redemption, The (1994)	Silence of the Lambs, The (1991)	1.9213673
Shawshank Redemption, The (1994)	Bikes vs Cars (2015)	1.7971182
Shawshank Redemption, The (1994)	Usual Suspects, The (1995)	1.7845862
Shawshank Redemption, The (1994)	Forrest Gump (1994)	1.6885874
Shawshank Redemption, The (1994)	Pulp Fiction (1994)	1.6822878
Shawshank Redemption, The (1994)	Seven (a.k.a. Se7en) (1995)	1.6141193
Shawshank Redemption, The (1994)	Braveheart (1995)	1.564118
Shawshank Redemption, The (1994)	Boy Meets Girl (2015)	1.5237815
Shawshank Redemption, The (1994)	Petting Zoo (2015)	1.5064116
Pulp Fiction (1994)	Pulp Fiction (1994)	2.6284437
Pulp Fiction (1994)	Silence of the Lambs, The (1991)	1.9648343
Pulp Fiction (1994)	Reservoir Dogs (1992)	1.9462489
Pulp Fiction (1994)	Seven (a.k.a. Se7en) (1995)	1.9331075
Pulp Fiction (1994)	Usual Suspects, The 




C. Logistic Matrix Factorization model implementation on MovieLens dataset using Implicit Library
-------

In [16]:
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generates related movies from the MovieLens 20M "
                                     "dataset (https://grouplens.org/datasets/movielens/20m/)",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    parser.add_argument('--output', type=str, default='similar-movies.tsv',
                        dest='outputfile', help='output file name')
    parser.add_argument('--model', type=str, default='lmf',
                        dest='model', help='model to calculate (als/bpr/lmf/bm25/tfidf/cosine)')
    parser.add_argument('--variant', type=str, default='20m', dest='variant',
                        help='Whether to use the 20m, 10m, 1m or 100k movielens dataset')
    parser.add_argument('--min_rating', type=float, default=4.0, dest='min_rating',
                        help='Minimum rating to assume that a rating is positive')
    args, unknown = parser.parse_known_args()  #args = parser.parse_args()

    logging.basicConfig(level=logging.DEBUG)

    calculate_similar_movies(args.outputfile,
                             model_name=args.model,
                             min_rating=args.min_rating, variant=args.variant)



Shawshank Redemption, The (1994)	Shawshank Redemption, The (1994)	6.8404126
Shawshank Redemption, The (1994)	Schindler's List (1993)	6.5997467
Shawshank Redemption, The (1994)	Silence of the Lambs, The (1991)	6.4703774
Shawshank Redemption, The (1994)	Forrest Gump (1994)	6.469405
Shawshank Redemption, The (1994)	Pulp Fiction (1994)	6.1917257
Shawshank Redemption, The (1994)	Get Shorty (1995)	6.0944605
Shawshank Redemption, The (1994)	Misérables, Les (1995)	6.0934052
Shawshank Redemption, The (1994)	Seven (a.k.a. Se7en) (1995)	6.092912
Shawshank Redemption, The (1994)	Apollo 13 (1995)	6.0081043
Shawshank Redemption, The (1994)	Usual Suspects, The (1995)	5.969749
Shawshank Redemption, The (1994)	Dances with Wolves (1990)	5.957731
Pulp Fiction (1994)	Pulp Fiction (1994)	6.8870683
Pulp Fiction (1994)	Seven (a.k.a. Se7en) (1995)	6.4878306
Pulp Fiction (1994)	Silence of the Lambs, The (1991)	6.3163342
Pulp Fiction (1994)	Shawshank Redemption, The (1994)	6.233958
Pulp Fiction (1994)	Schindle




Conclusion
======

The Implicit library is good for comparing various recommendation algorithms. Since cutting-edge recommendation algorithms for implicit feedback datasets are implemented in the library, I could simply use them and test their results. In my environment, ALS training with the library (run with `use_native=False` and `use_cg=False`) was slower than my own ALS implementation without the library (188 ms versus 138 ms wall time on the small rating matrix), but the recommendations were the same, which was very impressive. The library also provides benchmarks comparing its ALS fitting time against Spark and QMF [7], but I couldn't cover that experiment in my spotlight. Recommendation algorithms using implicit feedback data are used in many research fields. Without a doubt, I would definitely use this library in the future for my research.

[1] [Implicit Library](https://implicit.readthedocs.io/en/latest/).  
[2] [Collaborative Filtering for Implicit Feedback Datasets](http://yifanhu.net/PUB/cf.pdf).    
[3] [Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering](https://www.semanticscholar.org/paper/Applications-of-the-conjugate-gradient-method-for-Tak%C3%A1cs-Pil%C3%A1szy/bfdf7af6cf7fd7bb5e6b6db5bbd91be11597eaf0).  
[4] [BPR: Bayesian Personalized Ranking from Implicit Feedback](https://arxiv.org/pdf/1205.2618.pdf).  
[5] [Logistic matrix factorization for implicit feedback data](https://web.stanford.edu/~rezab/nips2014workshop/submits/logmat.pdf)  
[6] [MovieLens](https://grouplens.org/datasets/movielens/)  
[7] [Spark and QMF](https://github.com/benfred/implicit/tree/master/benchmarks)


