In [None]:
%%bash

pip install pandas==0.23.0
pip install numpy==1.14.3
pip install matplotlib==3.0.3
pip install seaborn==0.8.1
pip install PyAthena==1.8.0

In [None]:
!pip install mxnet==1.5.1

In [None]:
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from scipy.sparse import lil_matrix

import boto3
import botocore
import sagemaker

In [None]:
session = boto3.session.Session()
region_name = session.region_name

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()

print(bucket)


## Extracting parameters from FM model

Now that the model has been created and stored in SageMaker, we can download it and extract its parameters.  The FM model is stored in MXNet format.

This section is reproduced with minor modifications from the blog cited above for the sake of completeness.

### Download model data

Skip the next cell block if you have already downloaded the model.

In [None]:
import mxnet as mx
import os

model_file_name = "model.tar.gz"
model_full_path = fm.output_path + "/" + fm.latest_training_job.job_name + "/output/" + model_file_name
print("Model Path: ", model_full_path)

In [None]:
#Download FM model 

!rm -rf ./model
!mkdir -p ./model/
!aws s3 cp $model_full_path ./model

In [None]:
%%bash 
# TODO:  Fix this

#Extract model file for loading to MXNet
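# model_full_path is a Python variable and is not visible inside this bash cell, so this echo prints an empty line
echo $model_full_path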
cd ./model/
ls -al
tar xzvf model.tar.gz
unzip -o model_algo-1
mv symbol.json model-symbol.json
mv params model-0000.params

ls -al
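The same unpacking steps can also be done from Python, which avoids passing variables between the notebook and a shell.  What follows is a minimal sketch, not part of the original blog: the function name `extract_fm_model` is hypothetical, and it assumes the archive layout implied by the bash cell above (a gzipped tarball containing a zip file named `model_algo-1`, which in turn holds `symbol.json` and `params`).

```python
import os
import tarfile
import zipfile


def extract_fm_model(model_dir, archive_name="model.tar.gz", inner_name="model_algo-1"):
    """Unpack a downloaded FM model archive and rename the files for the MXNet loader.

    Hypothetical helper; the file names mirror the bash cell above.
    """
    # Step 1: unpack the outer gzip-compressed tarball
    with tarfile.open(os.path.join(model_dir, archive_name), "r:gz") as tar:
        tar.extractall(model_dir)

    # Step 2: the inner artifact is itself a zip archive
    with zipfile.ZipFile(os.path.join(model_dir, inner_name)) as zf:
        zf.extractall(model_dir)

    # Step 3: rename to the "<prefix>-symbol.json" / "<prefix>-<epoch>.params" pattern
    os.replace(os.path.join(model_dir, "symbol.json"),
               os.path.join(model_dir, "model-symbol.json"))
    os.replace(os.path.join(model_dir, "params"),
               os.path.join(model_dir, "model-0000.params"))

    return sorted(os.listdir(model_dir))
```

Calling `extract_fm_model("./model")` after the download step should leave `model-symbol.json` and `model-0000.params` in `./model`, which is what the load call with prefix `./model/model` and epoch `0` looks for.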

### Extract model data to create item and user latent matrices

In [None]:
print('num_customers: {}'.format(num_customers))
print('num_products: {}'.format(num_products))

In [None]:
import mxnet as mx
#Extract model data
m = mx.module.Module.load('./model/model', 0, False, label_names=['out_label'])

In [None]:
V = m._arg_params['v'].asnumpy()
w = m._arg_params['w1_weight'].asnumpy()
b = m._arg_params['w0_weight'].asnumpy()

# user latent matrix - concat (V[u], 1) 
ones = np.ones(num_customers).reshape((num_customers, 1))
knn_user_matrix = np.concatenate((V[:num_customers], ones), axis=1)
print('knn_user_matrix.shape')
print(knn_user_matrix.shape)

# item latent matrix - concat(V[i], w[i]). 
# Note:  The +1 is not part of the original example
knn_item_matrix = np.concatenate((V[num_customers + 1:], w[num_customers + 1:]), axis=1)
print('knn_item_matrix.shape')
print(knn_item_matrix.shape)

knn_train_label = np.arange(1, num_products + 1)
print('knn_train_label')
print(knn_train_label.shape)
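Why append a column of ones to the user vectors and the linear weight to the item vectors?  With that augmentation, a plain dot product between a user row and an item row equals the FM interaction term plus the item's linear weight.  The sketch below checks this on synthetic numbers.  The sizes and random parameters are made up for illustration, and it does not use the `+ 1` offset from the cell above.  As far as I can tell, the global bias and the user's own linear weight are left out because they are constant for a given user and so do not change the ranking of items.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3  # illustrative sizes, not the notebook's real ones

# Synthetic stand-ins for the FM parameters: factors V and linear weights w
V = rng.normal(size=(n_users + n_items, k))
w = rng.normal(size=(n_users + n_items, 1))

# Same construction as above: users get a trailing 1, items get their linear weight
user_matrix = np.concatenate((V[:n_users], np.ones((n_users, 1))), axis=1)
item_matrix = np.concatenate((V[n_users:], w[n_users:]), axis=1)

# One dot product per (user, item) pair
scores = user_matrix @ item_matrix.T

# Compare one entry against the FM terms written out by hand
u, i = 1, 2
expected = V[u] @ V[n_users + i] + w[n_users + i, 0]
print(np.isclose(scores[u, i], expected))
```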

## Calculate Influence Matrix

Per the paper cited above, the influence matrix for user $j$ is calculated as:

$$J_j=U^T(U W_j U^T)^{-1}UW_j$$

Let's map those symbols to the variables in this notebook.

* $U$ is the embedding matrix for items.  In this formula, it is the transpose of the item matrix we extracted from the FM model.  So $U={knn\_item\_matrix}^{T}$
* $U^T={knn\_item\_matrix}$
* $W$ is a binary diagonal matrix with 1s in the positions corresponding to the known entries of $X$ for this user.  In other words, it's a matrix of size `num_products` by `num_products`, with a 1 on the diagonal in row and column $i$ wherever user $j$ rated item $i$.

Now let's confirm that our dimensions line up properly.

In [None]:
knn_item_matrix.shape

In [None]:
knn_user_matrix.shape

### Build the matrix $W$.

For the sake of an example, let's pick user `846`, just because that user was the first row in our training set.

In [None]:
print(num_products)

In [None]:
W = np.zeros([num_products, num_products])
W.shape

In [None]:
test_customer_ids = df[df.star_rating > 3].customer_id
print(test_customer_ids[0])

Find a `customer_id`

In [None]:
user_of_interest = test_customer_ids[0]

u1 = df[df.customer_id == user_of_interest]
u2 = df[df.customer_id == user_of_interest]

In [None]:
u1.head(5)

In [None]:
u2.head(5)

In [None]:
u_all = np.concatenate((np.array(u1['product_id']), np.array(u2['product_id'])), axis=0)
u_all

In [None]:
#print(product_index)
print(product_index[product_index.product_id == 'B00BWDH368'])

In [None]:
for u_rating in u_all:
    # Convert the user_of_interest <=> index using customer_index
    # Subtract the num_customers since the indexes are combined customers + products (perhaps we reset_index() above)
    u_rating_idx = product_index[product_index.product_id == u_rating].item - num_customers
    W[u_rating_idx, u_rating_idx] = 1
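The loop above boils down to "look up each rated product's position, then set that diagonal entry to 1".  Here is a small self-contained sketch of the same idea using a dictionary lookup and NumPy fancy indexing.  The product IDs and the names `position` and `W_demo` are invented for the example and are not the notebook's real data.

```python
import numpy as np

# Hypothetical catalogue and rating history, for illustration only
product_ids = ["B001", "B002", "B003", "B004", "B005"]
rated_ids = ["B002", "B005", "B002"]  # a repeated ID is harmless

# Map each product ID to its row/column position in W
position = {pid: idx for idx, pid in enumerate(product_ids)}
rated_idx = np.array([position[pid] for pid in rated_ids])

# Set the diagonal entries for all rated products in one step
W_demo = np.zeros((len(product_ids), len(product_ids)))
W_demo[rated_idx, rated_idx] = 1.0

print(np.diag(W_demo))
```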

### Calculate $J$ for user $j$

In [None]:
# influence matrix J = U^T (U W U^T)^-1 U W, where U = knn_item_matrix^T
J1 = np.matmul(np.transpose(knn_item_matrix), W) # U W
J2 = np.matmul(J1, knn_item_matrix) # U W U^T
J3 = np.linalg.inv(J2) # (U W U^T)^-1
J4 = np.matmul(knn_item_matrix, J3) # U^T (U W U^T)^-1
J5 = np.matmul(J4, np.transpose(knn_item_matrix)) # U^T (U W U^T)^-1 U
J = np.matmul(J5, W) # U^T (U W U^T)^-1 U W

In [None]:
J.shape
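To build some confidence in the calculation, here is the same formula on a small synthetic example.  The sizes, the random item matrix, and the list of rated items are all made up.  The sketch uses `np.linalg.pinv` rather than `np.linalg.inv`.  That is my substitution, not the paper's method: the two agree whenever $U W U^T$ is invertible, but the pseudo-inverse still returns a result when a user has rated fewer items than there are embedding columns, in which case $U W U^T$ is singular.

```python
import numpy as np

rng = np.random.default_rng(42)
n_items, d = 8, 3  # illustrative sizes

# Rows are item embeddings, so this plays the role of U^T
item_matrix = rng.normal(size=(n_items, d))

# Hypothetical set of items the user has rated
rated = np.array([0, 2, 3, 5, 6])
W = np.zeros((n_items, n_items))
W[rated, rated] = 1.0

U = item_matrix.T
gram = U @ W @ U.T                      # d x d
J = U.T @ np.linalg.pinv(gram) @ U @ W  # n_items x n_items

print(J.shape)
# Columns for unrated items are all zero: an unrated item cannot influence anything
print(np.allclose(J[:, [1, 4, 7]], 0.0))
# J behaves like a projection: applying it twice changes nothing
print(np.allclose(J @ J, J))
```

Both checks follow directly from the formula, so they make a cheap sanity test to run against the real `J` as well.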

## Explaining recommendations for a user

Now we can use the influence matrix to calculate the two metrics explained in the research paper:

_Influence_ of the actual rating that user $j$ assigned to item $k$ on the predicted rating for item $i$.  This is calculated as:

$${\beta}_k = J_{ik}^j$$

In other words, we just look up the element at row $i$ and column $k$ of the influence matrix $J$ for user $j$.

_Impact_ of the actual rating that user $j$ assigned to item $k$ on the predicted rating for item $i$.  This is calculated as:

$${\gamma}_k = {\beta}_{k}x_{kj}$$

In other words, we multiply the influence by the actual rating that user $j$ gave to item $k$.

In this example I'll use influence only, since we converted the ratings to a binary like/don't-like label.
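The two definitions translate into very little code.  Below is a minimal sketch with a made-up 2x2 influence matrix and made-up ratings.  The function name `influence_and_impact` is hypothetical.  Note the `.copy()`: slicing a row out of a NumPy array returns a view, so editing the row later would otherwise silently change `J` as well.

```python
import numpy as np


def influence_and_impact(J, ratings, item_i):
    """Return (beta, gamma) for the predicted rating of item_i.

    beta[k]  = J[item_i, k]          (influence of the rating on item k)
    gamma[k] = beta[k] * ratings[k]  (impact of the rating on item k)
    """
    beta = J[item_i].copy()  # copy so that J is never modified through beta
    gamma = beta * ratings
    return beta, gamma


# Made-up numbers, for illustration only
J_demo = np.array([[0.5, 0.2],
                   [0.1, 0.9]])
ratings_demo = np.array([4.0, 0.0])  # 0 means "not rated"

beta, gamma = influence_and_impact(J_demo, ratings_demo, item_i=0)
print(beta, gamma)
```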


### Look up influence for a test recommendation

For our selected user, let's find a movie in our test set that they rated.

In [None]:
u2.head(5)

In [None]:
movie_to_rate = 60

In [None]:
result = fm_predictor.predict(X_test[8451:8452].toarray()) # use the row number from the test set

In [None]:
result

In [None]:
influence_i = J[movie_to_rate-1,:].copy() # movies are indexed from 1, so we offset to 0; copy so the edit below does not change J

In [None]:
influence_i[movie_to_rate-1] = 0.0 # zero this out; it's the influence of the movie itself

In [None]:
# join with movie names
# u.item is not UTF-8 encoded, so the encoding is given explicitly
df_movies = pd.read_csv('ml-100k/u.item', sep='|', header=None, encoding='latin-1',
                        names=['movie_id', 'movie_name', 'c3','c4','c5','c6','c7',
                               'c8','c9','c10','c11','c12','c13','c14','c15','c16','c17',
                               'c18','c19','c20','c21','c22','c23','c24'])
df_movies.head(5)

In [None]:
df_influence = pd.DataFrame(data={'influence': influence_i, 'movie': df_movies['movie_name']})
df_influence.head(5)

This movie is 'Three Colors: Blue', a French drama that probably appeals to art-house moviegoers.

In [None]:
df_movies[df_movies['movie_id'] == movie_to_rate]

And which movies had the most influence on this prediction?

In [None]:
df_top_influence = df_influence.nlargest(20, 'influence')
df_top_influence

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

In [None]:
ax = df_top_influence.plot(x ='movie', y='influence', kind = 'barh', figsize=(20,20), title='Top 20 Influences', color='blue')
ax.set_ylabel("Movie")
ax.set_xlabel("Influence")

In [None]:
np.sort(u_all)[:5]

In [None]:
movie_to_rate = 9

In [None]:
rate_data = np.zeros((1, num_features))

In [None]:
rate_data[0, user_of_interest-1] = 1.0

In [None]:
rate_data[0, nb_users + movie_to_rate -1] = 1.0

In [None]:
result = fm_predictor.predict(rate_data) 
result

In [None]:
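influence_i = J[movie_to_rate-1,:].copy() # movies are indexed from 1, so we offset to 0; copy so the edit below does not change J
influence_i[movie_to_rate-1] = 0.0 # zero this out; it's the influence of the movie itself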

In [None]:
df_influence = pd.DataFrame(data={'influence': influence_i, 'movie': df_movies['movie_name']})
df_influence.head(5)

We're looking at the movie 'Dead Man Walking', which was an acclaimed movie about a prisoner on Death Row.

In [None]:
df_movies[df_movies['movie_id'] == movie_to_rate]

In [None]:
df_top_influence = df_influence.nlargest(20, 'influence')
df_top_influence

In [None]:
ax = df_top_influence.plot(x ='movie', y='influence', kind = 'barh', figsize=(20,20), title='Top 20 Influences', color='blue')
ax.set_ylabel("Movie")
ax.set_xlabel("Influence")

Are these results intuitively satisfying?  I'm not quite sure, but remember that we built this model with a relatively limited data set.

## Clean-up

In [None]:
fm_predictor.delete_endpoint()