Author: Tathagat Saha (Matricola no: 902046)

## Explaining a Recommender System

So far we have built a recommender system using the Neural Matrix Factorization (NeuMF) framework, which allowed us to combine the GMF layers with the MLP layers.

Let's focus on the MLP model:
<center>  <img src="https://drive.google.com/uc?export=view&id=1rL_8kkHIhSlQjWr8hNal4Tyog87-2kNP" width="550" height="350"> </center> 

It is easy to see that the prediction is not computed over the user or item indices per se, but over the learned embeddings. It is therefore useful to understand how these embeddings are produced and how they are used to make the prediction.

Let's try to apply some additive feature attribution method to such embeddings.
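For reference, additive feature attribution methods (the family that includes LIME, DeepLIFT and SHAP) explain a single prediction with a linear surrogate over simplified binary inputs. The notation below follows the SHAP paper; it is not used elsewhere in this notebook and is introduced here only for illustration:

```latex
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i, \qquad z' \in \{0, 1\}^M
```

Here $M$ is the number of simplified features, $z'_i$ indicates whether feature $i$ is present, and $\phi_i$ is the attribution assigned to it. When the method is applied to embeddings, each embedding dimension plays the role of one feature.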

- TASK 1: Apply LIME to the MLP model by keeping the embedding layers out of the forward function and using the embedding produced as input to the model. This can be done without modifying the forward function and using the predefined methods from the class [captum.attr.InterpretableEmbeddingBase](https://captum.ai/api/utilities.html#).

- TASK 2: Apply LIME to the NeuMF model and check the results.

- TASK 3 [OPTIONAL]: Choose another additive feature attribution method and apply it to the MLP model, following the implementation you prefer from Task 1.


Suggestion: the best way to work with embeddings is to use the InterpretableEmbeddingBase class. In particular, you can use the configure_interpretable_embedding_layer method to wrap an embedding layer so that attributions are computed over its output. Pay attention to the input and output dimensions of the surrogate model. [This tutorial](https://captum.ai/tutorials/Multimodal_VQA_Interpret) may help you better understand this concept and, in particular, how to use additive feature attribution methods with embeddings.


In [21]:
!pip install captum

import os
import time
import random
import argparse
from os import path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#import pandas_profiling as pdp

# scikit-learn related imports
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# pytorch related imports
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data
from torch.utils.data import Dataset, DataLoader, TensorDataset

# imports from captum library
from captum.attr import LayerConductance, LayerActivation, LayerIntegratedGradients
from captum.attr import DeepLift, KernelShap, DeepLiftShap, ShapleyValueSampling
from captum.attr import Lime, LimeBase
from captum.attr import (
    IntegratedGradients,
    TokenReferenceBase,
    configure_interpretable_embedding_layer,
    remove_interpretable_embedding_layer,
    visualization
)
from captum._utils.models.linear_model import SkLearnLinearRegression, SkLearnLasso
from captum.attr._core.lime import get_exp_kernel_similarity_function
from captum.attr._utils.input_layer_wrapper import ModelInputWrapper


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [22]:
torch.manual_seed(1234)
np.random.seed(1234)
num_sample_data = '100k'

MODEL_PATH_MLP = 'drive/MyDrive/Colab Notebooks/movielens_{}/MLP.pt'.format(num_sample_data) #change this with your directory 
MODEL_PATH_NeuMF = 'drive/MyDrive/Colab Notebooks/movielens_{}/neuMF.pt'.format(num_sample_data) #change this with your directory 

use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu")

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


First, we collect a sample batch of data to which LIME will be applied with the different models. We then select a single example from it by index.

In [23]:
input_user_indices = torch.tensor([124,  53,  41, 336, 706, 781, 794,  19,  23, 617,  11, 545, 421,  58,
        823, 453, 897, 587, 573,  37, 477, 239, 531, 570,  65,  19, 496,  22,
        739, 537, 602,  19,  26, 792,  70, 782, 837, 892, 837, 496, 598, 311,
        421, 388, 489, 865, 728, 402, 739,  38, 605, 883, 318,  65, 674,   3,
          0, 409,  92, 504, 263, 759,  45,   3, 797, 752, 757, 158, 749,  87,
        814, 278,  68,  85, 915, 317, 617, 365, 402, 701,  80, 694, 136, 751,
        505, 136, 839, 162, 321, 841, 217, 549, 817,  30, 933, 556, 202, 720,
        450, 845, 703, 796, 784, 616, 428,  91,  35,   4, 409, 939, 617, 517,
        264,  37, 536,   6, 928, 409,  99, 601, 117, 791, 402, 530, 660, 432,
        621, 648,  41, 390, 244, 205, 279, 196,  41, 797, 532, 531, 715, 627,
        499,  39, 177, 129,  37,  26,  35, 857, 477,  17, 195, 369, 117, 167,
        413, 428, 635,  12, 345, 752, 409,  58, 819, 137, 442, 366, 369, 455,
        202, 778,  37, 862,  65, 118, 793, 366,  58, 817, 625, 742, 778, 940,
        893, 602, 131, 887,   8, 376, 862, 532, 934, 467, 628,   3, 703, 329,
         16, 103, 862, 394,  63, 423, 747, 565, 229, 381, 421, 803,  96, 867,
        310, 864, 384, 786, 543, 695, 315, 145, 464, 166, 542, 430, 742, 223,
        745, 311,  58,  93, 655, 878, 912, 325, 689, 352, 491, 195,  23, 441,
        234, 899,  48, 873, 391,  90,  19, 476, 802, 162,  38,  20, 685, 180,
        600, 881, 925, 179])
input_label_indices = torch.tensor([1., 1., 0., 0., 1., 1., 0., 0., 1., 0., 1., 1., 1., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.,
        1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1.,
        0., 0., 1., 1., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 0., 1.,
        0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
        0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0.,
        0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 1.,
        1., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0., 0., 0., 1.,
        0., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1., 0., 0., 0., 1., 1., 0., 0.,
        0., 0., 0., 1.])
input_item_indices = torch.tensor([ 118,  405,  212,  819,   29,  222,  239, 1174,  238, 1443,  247,  725,
          31,  645, 1651,  327, 1039, 1116, 1601,  752,  564,   91, 1437, 1419,
        1348, 1484,   81,  300, 1446,  717, 1030, 1370, 1472,  198, 1584,   41,
         867,   99,  582,  524,  596,  404,   15, 1517, 1157,  789,  625,  595,
        1266,  802,   24, 1611,  851,  314, 1677,  532,  632,  149,  607,  774,
        1563,  133,  910,  423,  976,  520, 1242, 1205, 1601,   85,  721,  473,
         815,  155, 1215, 1387,   18,  428,  851,  373,  126,  283,  473,  649,
        1218,   58,  743,  683, 1184, 1243,  462,  479,  606,  244,  696,  245,
         441, 1624, 1339, 1142,    5, 1473, 1419,  560, 1401,   28,  589,   51,
        1662,  372, 1207, 1627,  907, 1597,  229,   47, 1146, 1160,  231, 1303,
        1432,  824,  852, 1318,  569,  267,  129,  967,  392,  456,  479,  862,
           7,  120, 1190,  623,  319,  867, 1405,  724,  375,   31,  919,  270,
        1477,  562,  548, 1401, 1262,  928,   79,  940, 1423,  269,  746, 1177,
        1568,  773,  266, 1537, 1420,  855,  577,   14,  437,  217,  927,  203,
         668,   42,   95, 1274,  469,  422, 1143, 1474, 1487, 1444,  423,  114,
          48,  802,  621, 1537, 1496, 1600, 1366,  344,  228,  455, 1126,  166,
         928, 1162, 1218,   30,   94, 1322, 1335,  198, 1044,  551, 1174, 1351,
         614,  367, 1171,  782,  408,  958,  762,  787, 1464, 1157,  579,   78,
        1041,  102,  240,  679, 1146, 1063, 1294, 1258,  430,  727,  593,   35,
        1318, 1302,  321,  625, 1178,  687,  293,  301,  127,  245,  723,  647,
         747,  696,  577,  135,  581,  251, 1368,  330, 1337,  751, 1345,  805,
        1543, 1227, 1356,  647])

In [24]:
# Task 1
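# Load the pre-trained model twice: modelMLP keeps its original embedding layers
# (so it accepts indices), while netMLP gets its embedding layers wrapped below.
modelMLP = torch.load(MODEL_PATH_MLP, map_location=device)
netMLP = torch.load(MODEL_PATH_MLP, map_location=device)

idx = 6 # index of the single example to explain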

interpretable_emb_user = configure_interpretable_embedding_layer(netMLP,'embedding_user')
interpretable_emb_item = configure_interpretable_embedding_layer(netMLP,'embedding_item')

prediction = modelMLP(input_user_indices[idx],input_item_indices[idx]) #Get the prediction of the model on the given single data

input_emb_user = interpretable_emb_user.indices_to_embeddings(input_user_indices[idx])
input_emb_item = interpretable_emb_item.indices_to_embeddings(input_item_indices[idx])

exp_eucl_distance = get_exp_kernel_similarity_function('euclidean', kernel_width=1000)

lr_lime = Lime(
    netMLP, 
    interpretable_model=SkLearnLinearRegression(),  # built-in wrapped sklearn Linear Regression
    similarity_func=exp_eucl_distance
)

attr = lr_lime.attribute((input_emb_user.view(1,-1),input_emb_item.view(1,-1)), n_samples=100)
print('Lime attribute for MLP are {}'.format(attr))
predicted_label = int(prediction.item() > 0.5)  # threshold the model output at 0.5
print('Prediction for {}th value using MLP model is {}'.format(idx, predicted_label))
print('Label for {}th value using MLP model is {}'.format(idx,input_label_indices[idx]))



Lime attribute for MLP are (tensor([[ 0.0121, -0.0208,  0.0584,  0.0350, -0.2202,  0.0256,  0.0184, -0.0500,
          0.0247, -0.0678, -0.0172,  0.0729, -0.0996,  0.0368, -0.1170, -0.0071,
         -0.1568, -0.0013, -0.0082,  0.0357, -0.0222, -0.0128, -0.0091, -0.0671,
         -0.1252, -0.0062,  0.0631,  0.0326, -0.0715,  0.0159, -0.1495, -0.0395]]), tensor([[-1.4448e-04, -3.7393e-01, -2.2949e-02,  5.7151e-02, -1.3964e-02,
          1.3760e-01,  4.1288e-02,  1.7292e-02, -9.3369e-02, -4.4838e-02,
         -7.4124e-02,  9.5524e-02,  2.5529e-02, -2.1355e-02,  5.1708e-02,
         -7.5598e-02,  5.3712e-02,  7.7821e-02,  6.6502e-02, -1.8127e-01,
          4.6816e-02,  1.1386e-02, -2.3367e-02,  4.8847e-02, -2.8622e-02,
          9.1473e-03,  6.1751e-02,  5.6433e-02, -5.2631e-02, -3.7514e-03,
          1.0204e-01,  2.1102e-02]]))
Prediction for 6th value using MLP model is 0
Label for 6th value using MLP model is 0.0
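To make the mechanics behind the `Lime` call more concrete, here is a minimal from-scratch sketch of a LIME-style local surrogate. It is not Captum's implementation. The black box is a hypothetical logistic model over a 4-dimensional input, and the perturbations are Gaussian noise rather than Captum's on/off feature masking. The kernel form exp(-d² / (2·width²)) is an assumption about what `get_exp_kernel_similarity_function` computes.

```python
import numpy as np


def black_box(x):
    """Hypothetical model: logistic regression over a 4-d input."""
    w = np.array([1.0, -2.0, 0.5, 0.0])
    return 1.0 / (1.0 + np.exp(-(x @ w)))


def exp_kernel(distance, kernel_width):
    """Exponential kernel; assumed form exp(-d^2 / (2 * width^2))."""
    return np.exp(-(distance ** 2) / (2.0 * kernel_width ** 2))


def lime_sketch(x0, n_samples=500, noise=0.1, kernel_width=1000.0, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Sample perturbed points around the instance being explained.
    samples = x0 + noise * rng.standard_normal((n_samples, x0.size))
    # 2. Query the black box on every perturbed point.
    targets = black_box(samples)
    # 3. Weight each sample by its similarity to the original instance.
    weights = exp_kernel(np.linalg.norm(samples - x0, axis=1), kernel_width)
    # 4. Fit a weighted linear surrogate (intercept in the last column).
    design = np.hstack([samples - x0, np.ones((n_samples, 1))])
    sqrt_w = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * sqrt_w, targets * sqrt_w[:, 0], rcond=None)
    return coef[:-1], weights


x0 = np.array([0.2, 0.1, -0.3, 0.4])
attributions, weights = lime_sketch(x0)
print(attributions)   # one surrogate coefficient per input dimension
print(weights.min())  # smallest sample weight under kernel_width=1000
```

Under the assumed kernel form, `kernel_width=1000` gives every sample a weight very close to 1 whenever the perturbation distances are small compared with 1000, so the surrogate is effectively an unweighted linear regression. A smaller width would concentrate the fit on samples nearest the explained instance.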


In [25]:
# Task 2
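# Load the pre-trained model twice: modelneuMF keeps its original embedding layers
# (so it accepts indices), while netneuMF gets embedding layers wrapped below.
modelneuMF = torch.load(MODEL_PATH_NeuMF, map_location=device)
netneuMF = torch.load(MODEL_PATH_NeuMF, map_location=device)

idx = 90 # index of the single example to explain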

interpretable_emb_user1 = configure_interpretable_embedding_layer(netneuMF,'embedding_user_GMF')
interpretable_emb_item1 = configure_interpretable_embedding_layer(netneuMF,'embedding_item_GMF')

prediction = modelneuMF(input_user_indices[idx],input_item_indices[idx]) #Get the prediction of the model on the given single data

input_emb_user1 = interpretable_emb_user1.indices_to_embeddings(input_user_indices[idx])
input_emb_item1 = interpretable_emb_item1.indices_to_embeddings(input_item_indices[idx])

exp_eucl_distance = get_exp_kernel_similarity_function('euclidean', kernel_width=1000)

lr_lime = Lime(
    netneuMF, 
    interpretable_model=SkLearnLinearRegression(),  # built-in wrapped sklearn Linear Regression
    similarity_func=exp_eucl_distance
)

attr = lr_lime.attribute((input_emb_user1.view(1,-1),input_emb_item1.view(1,-1)), n_samples=100)
remove_interpretable_embedding_layer(netneuMF, interpretable_emb_user1)
remove_interpretable_embedding_layer(netneuMF, interpretable_emb_item1)
print('Lime attribute for neuMF are {}'.format(attr))
predicted_label = int(prediction.item() > 0.5)  # threshold the model output at 0.5
print('Prediction for {}th value using neuMF model is {}'.format(idx, predicted_label))
print('Actual Label for {}th value using neuMF model is {}'.format(idx,input_label_indices[idx]))

Lime attribute for neuMF are (tensor([[ 0.0051, -0.0712, -0.0636, -0.0483, -0.1804, -0.0693,  0.0402,  0.0150,
         -0.1247,  0.0132,  0.0080, -0.0691, -0.0964,  0.0695, -0.0262,  0.0786,
         -0.0064,  0.0093, -0.0740, -0.0750, -0.0338,  0.0796,  0.0151,  0.0542,
         -0.1026,  0.0574, -0.0561,  0.0243,  0.0033, -0.0300, -0.0014, -0.0038]]), tensor([[-0.0200, -0.0278,  0.0148, -0.0180,  0.0165, -0.1284, -0.0075,  0.0214,
          0.0538,  0.0788,  0.0133,  0.0465, -0.0357,  0.0338,  0.1059, -0.0530,
         -0.0529, -0.0487, -0.0101, -0.0925,  0.0904, -0.0194,  0.0444,  0.0067,
          0.0339, -0.0849,  0.0310, -0.1009,  0.0801, -0.0679, -0.0188,  0.0069]]))
Prediction for 90th value using neuMF model is 0
Actual Label for 90th value using neuMF model is 0.0




In [26]:
# Task 3
# As an additive feature attribution method from Captum, we use the DeepLift algorithm on the embeddings already computed in Task 1.
# Warning: run this cell only after executing Task 1, since it reuses netMLP with its embedding layers still wrapped.

dl = DeepLift(netMLP)
dl_attr_test = dl.attribute((input_emb_user.view(1,-1),input_emb_item.view(1,-1)))
print('DeepLift attribute for MLP model are {}'.format(dl_attr_test))
remove_interpretable_embedding_layer(netMLP, interpretable_emb_user)
remove_interpretable_embedding_layer(netMLP, interpretable_emb_item)

DeepLift attribute for MLP model are (tensor([[-2.1367e-04,  8.0887e-03,  1.0362e-01,  2.2262e-02, -1.7562e-01,
         -6.9118e-04, -5.8024e-03, -5.2917e-03,  1.5857e-02, -1.2039e-01,
          1.4820e-03, -2.2773e-03, -1.4333e-01,  8.6720e-02, -1.2584e-01,
         -1.0729e-02, -2.5563e-01, -1.3054e-02,  6.3040e-04, -4.2994e-03,
         -1.3471e-03, -6.5906e-03, -4.6718e-02, -1.5959e-02, -1.3743e-01,
         -2.2429e-02,  9.4424e-02,  1.2648e-02, -7.0995e-02, -2.0196e-02,
         -7.1867e-02, -1.8032e-02]], grad_fn=<MulBackward0>), tensor([[ 5.8646e-02, -2.8480e-01, -2.4862e-02,  2.4976e-02, -5.3873e-03,
          7.4753e-02, -2.1247e-04,  2.2422e-03, -4.3256e-02, -3.6812e-02,
          3.8539e-03,  5.8308e-02,  7.5212e-03, -2.1494e-02, -2.0720e-04,
         -2.0040e-02,  5.1701e-03,  8.4333e-02,  3.8598e-03, -4.9603e-02,
          1.3901e-02, -1.6433e-04,  5.1911e-03,  9.6673e-03,  1.2261e-02,
          9.9057e-03, -4.0715e-03,  5.8596e-02, -2.6044e-02,  3.3372e-02,
          2.

               activations. The hooks and attributes will be removed
            after the attribution is finished
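
As a sanity check on DeepLift attributions, recall the summation-to-delta property from the DeepLIFT paper: the contributions assigned to the inputs sum to the difference between the output on the actual input and the output on the reference input. The symbols below follow that paper and are not used elsewhere in this notebook:

```latex
\sum_{i=1}^{n} C_{\Delta x_i \Delta t} = \Delta t, \qquad \Delta t = t - t^{0}
```

Here $t$ is the model output for the explained input, $t^{0}$ is the output for the reference (baseline) input, and $C_{\Delta x_i \Delta t}$ is the contribution of input $x_i$. In Captum, `DeepLift.attribute` uses an all-zero baseline when none is given, and passing `return_convergence_delta=True` additionally returns the approximation error with respect to this property.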
