# Indexable Representation Learning

As mentioned in the presentation, indexable representation learning refers to recommendation algorithms whose latent vector representations are immediately amenable to sublinear-time search. In this tutorial, we experiment with one such model, Indexable Bayesian Personalized Ranking, or IBPR for short.
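
To make "sublinear" concrete, it helps to look at the baseline that indexing tries to avoid: scoring every item for each user. The sketch below is a minimal brute-force top-K search by inner product; the function name `brute_force_topk` and the random toy data are assumptions for illustration and are not part of this tutorial's `utils` package.

```python
import numpy as np

def brute_force_topk(query, items, k):
    """Return the indices of the k items with the largest inner product with query."""
    scores = items @ query                      # one dot product per item: linear in the number of items
    top = np.argpartition(-scores, k - 1)[:k]   # unordered set of the k best items
    return top[np.argsort(-scores[top])]        # order those k by descending score

# Toy data (hypothetical): 1000 items and one user, both in 10 dimensions.
rng = np.random.default_rng(0)
items = rng.normal(size=(1000, 10))
query = rng.normal(size=10)
top10 = brute_force_topk(query, items, 10)
```

The cost of each query grows linearly with the number of items, which is what an index over the latent vectors aims to improve on.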

In [None]:
%matplotlib inline
import numpy as np
import time
import pickle
import matplotlib.pyplot as plt
import scipy.sparse as ss

from cornac.eval_methods import BaseMethod
from cornac.models import BPR, IBPR
from utils.lsh import *
from utils.load_data import *
from utils.pmf import *
from utils.evaluation import *

## Load the train/test data

In [None]:
# Use context managers so the file handles are closed after loading.
with open('train_data', 'rb') as f:
    train = pickle.load(f)
with open('test_data', 'rb') as f:
    test = pickle.load(f)

In [None]:
eval_method = BaseMethod.from_provided(train_data=train, test_data=test,
                                       exclude_unknowns=False, verbose=True)

## Bayesian Personalized Ranking - BPR

We first measure the performance of the BPR model, both after learning and after indexing. Details of the model can be found in the paper: [BPR-paper](https://arxiv.org/ftp/arxiv/papers/1205/1205.2618.pdf).


<img src='resources/images/bpr.png' width = 700>
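
For intuition, the pairwise objective behind BPR can be sketched in a few lines of NumPy. For a triple (u, i, j) in which user u prefers item i over item j, BPR increases ln σ(x_ui − x_uj), where the score x is the inner product of the user and item vectors. The function name `bpr_sgd_step`, the hyperparameter values, and the toy matrices below are assumptions for illustration; this is not Cornac's implementation.

```python
import numpy as np

def bpr_sgd_step(U, V, u, i, j, lr=0.05, reg=0.001):
    """One stochastic gradient ascent step on ln sigmoid(x_ui - x_uj) with L2 regularization."""
    u_vec, vi, vj = U[u].copy(), V[i].copy(), V[j].copy()
    x_uij = u_vec @ (vi - vj)            # score difference between preferred and non-preferred item
    g = 1.0 / (1.0 + np.exp(x_uij))      # derivative of ln sigmoid(x) with respect to x
    U[u] += lr * (g * (vi - vj) - reg * u_vec)
    V[i] += lr * (g * u_vec - reg * vi)
    V[j] += lr * (-g * u_vec - reg * vj)
    return x_uij

# Toy factors (hypothetical): 3 users, 5 items, 10 latent dimensions.
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(3, 10))
V = rng.normal(scale=0.1, size=(5, 10))
before = U[0] @ (V[1] - V[2])
for _ in range(200):
    bpr_sgd_step(U, V, u=0, i=1, j=2)
after = U[0] @ (V[1] - V[2])
```

After repeated updates on the same triple, the score gap between the preferred and the non-preferred item widens, which is the ranking behaviour the objective is designed to produce.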



In [None]:
# Uncomment the next three lines to retrain BPR; otherwise the pre-trained model is loaded from disk.
#rec_bpr = BPR(k = 10, max_iter=20, learning_rate=0.01, lamda=0.001, batch_size=10000, init_params={'U':None, 'V':None})
#rec_bpr.fit(eval_method.train_set)
#pickle.dump(rec_bpr, open('bpr.model', 'wb'))

rec_bpr = pickle.load(open('bpr.model', 'rb'))

In [None]:
#number of recommendations
topK = 10

bpr_queries = rec_bpr.U[1:1000,:]
bpr_data    = rec_bpr.V
        
bpr_prec, bpr_recall = evaluate_topK(test, bpr_data, bpr_queries, topK)
print('bpr_prec@{0} \t bpr_recall@{0}'.format(topK))
print('{0}\t{1}'.format(bpr_prec, bpr_recall))
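
The helper `evaluate_topK` comes from this tutorial's `utils` package and its source is not shown here. As a rough guide to what the two reported numbers mean, the sketch below computes precision@K and recall@K for a single user; the function name `precision_recall_at_k` and the toy lists are hypothetical, and the helper used above may aggregate or handle edge cases differently.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user.

    recommended: item ids ranked from best to worst
    relevant:    set of item ids the user interacted with in the test data
    """
    top_k = list(recommended)[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / k                                  # fraction of the k recommendations that are relevant
    recall = hits / len(relevant) if relevant else 0.0    # fraction of relevant items that were recommended
    return precision, recall

# One of the four recommendations (item 2) is relevant, out of two relevant items.
p, r = precision_recall_at_k([5, 2, 9, 1], relevant={2, 7}, k=4)
print(p, r)  # 0.25 0.5
```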

## Indexable Bayesian Personalized Ranking - IBPR

Next, we test the effectiveness of the indexable model. In this experiment, we train IBPR on the same data used for BPR above. Details of the model can be found in the paper: [IBPR-paper](https://ink.library.smu.edu.sg/sis_research/3884/)

<img src='resources/images/ibpr.png' width=700>
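
As we understand the IBPR paper, the key change from BPR is that preferences are modeled through the angle between user and item vectors rather than the raw inner product, so that the learned ranking agrees with what a cosine-based index retrieves. The toy sketch below shows how the two rankings can disagree when item norms differ; the helper `angular_distance` and the example vectors are assumptions for illustration only.

```python
import numpy as np

def angular_distance(x, y):
    """Angle between x and y, normalized to the range [0, 1]."""
    cos = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

user = np.array([1.0, 0.0])
item_a = np.array([1.0, 0.1])   # almost aligned with the user, small norm
item_b = np.array([3.0, 3.0])   # 45 degrees away from the user, large norm

# Inner product favours item_b because of its larger norm ...
print(user @ item_a, user @ item_b)
# ... whereas angular distance favours item_a, which points in nearly the same direction.
print(angular_distance(user, item_a), angular_distance(user, item_b))
```

An inner-product model can therefore place an item at the top that a cosine-based index is unlikely to return, which is the mismatch an angle-based objective is meant to avoid.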



In [None]:
# Uncomment the next three lines to retrain IBPR; otherwise the pre-trained model is loaded from disk.
#rec_ibpr = IBPR(k = 10, max_iter=20, learning_rate=0.01, lamda=0.001, batch_size=5000, init_params={'U':None, 'V':None})
#rec_ibpr.fit(eval_method.train_set)
#pickle.dump(rec_ibpr, open('ibpr.model', 'wb'))

rec_ibpr = pickle.load(open('ibpr.model', 'rb'))

In [None]:
#number of recommendations
topK = 10

ibpr_queries = rec_ibpr.U[1:1000, :]
ibpr_data    = rec_ibpr.V

ibpr_prec, ibpr_recall = evaluate_topK(test, ibpr_data, ibpr_queries, topK)
print('ibpr_prec@{0} \t ibpr_recall@{0}'.format(topK))
print('{0}\t{1}'.format(ibpr_prec, ibpr_recall))

In [None]:
topK = 10
b_vals = [4]    # number of hash bits per table
L_vals = [10]   # number of hash tables

print('#table\t #bit \t model \t prec@{0} \t recall@{0} \t touched'.format(topK))
for nt in L_vals:
    print('------------------------------------------------------------------------------')
    for b in b_vals: 
        lsh_index = LSHIndex(hash_family = CosineHashFamily(bpr_data.shape[1]), k = b, L=nt)
        
        lsh_bpr_prec, lsh_bpr_recall, touched_bpr = evaluate_LSHTopK(test, bpr_data, -bpr_queries, lsh_index, dot, topK)
        print("{0}\t{1}\t{2}\t{3}\t{4}\t{5}".format(nt, b, 'bpr',lsh_bpr_prec, lsh_bpr_recall, touched_bpr)) 

        lsh_ibpr_prec, lsh_ibpr_recall, touched_ibpr = evaluate_LSHTopK(test, ibpr_data, -ibpr_queries, lsh_index, dot, topK)
        print("{0}\t{1}\t{2}\t{3}\t{4}\t{5}".format(nt, b, 'ibpr',lsh_ibpr_prec, lsh_ibpr_recall, touched_ibpr)) 
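
The cell above relies on `LSHIndex` and `CosineHashFamily` from this tutorial's `utils.lsh` module, whose source is not shown here. To illustrate the general idea of a cosine LSH index with `b` bits per key and `L` tables, here is a minimal sketch based on sign random projections. The class `SignRandomProjectionIndex` and its methods are hypothetical and are not the API of `utils.lsh`.

```python
import numpy as np
from collections import defaultdict

class SignRandomProjectionIndex:
    """Toy cosine LSH: L hash tables, each keyed by a k-bit sign signature."""

    def __init__(self, dim, k, L, seed=0):
        rng = np.random.default_rng(seed)
        # One (k x dim) matrix of random hyperplanes per table.
        self.planes = [rng.normal(size=(k, dim)) for _ in range(L)]
        self.tables = [defaultdict(list) for _ in range(L)]

    def _key(self, planes, x):
        # Each bit records on which side of a hyperplane the vector falls.
        return tuple((planes @ x > 0).tolist())

    def index(self, data):
        for idx, x in enumerate(data):
            for planes, table in zip(self.planes, self.tables):
                table[self._key(planes, x)].append(idx)

    def candidates(self, q):
        # Union of the buckets that the query falls into, one bucket per table.
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(planes, q), []))
        return found

# Toy data (hypothetical): 500 item vectors in 10 dimensions.
rng = np.random.default_rng(1)
data = rng.normal(size=(500, 10))
lsh = SignRandomProjectionIndex(dim=10, k=4, L=10)
lsh.index(data)
cands = lsh.candidates(data[0])
```

Only the candidates need to be scored exactly, so a query typically inspects fewer items than a full scan. More bits per key shrink the buckets and reduce the number of candidates, while more tables raise the chance that a close item is retrieved.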
