![nafre](../data/books_dataset/images/nafre.png)


## Myself

<img src="https://media.licdn.com/dms/image/C5103AQFTTkPNQerCIA/profile-displayphoto-shrink_200_200/0?e=1559174400&v=beta&t=e46jBak0gH9thREsq94CitbkQxvqVUrNXFAe3WIuBBE" width="140" height="140" border-radius="50%" align="left"/>

        Avinash Mishra (Avi) | Data Scientist | Catalina marketing Japan

        Linkedin : https://www.linkedin.com/in/avinash-mishra-a0846360/
  
        Github : https://github.com/avinash-mishra

        Catalina Japan : https://catalina-jp.com/

## Outline
 - Brief Introduction of recSys
 - Dataset
 - Build Models using neural approaches
 - Embedding visualization
 - Book recommendation

## Introduction

Recommendation engines are one of the most popular applications of machine learning. Because of their widespread success, they are quickly becoming ubiquitous across many business units.

Types of Recommendation Systems:
- Popularity based
- Content based
- Collaborative filtering
	- Nearest Neighbor
	- Matrix Factorization

The methods above are common ways to build recommendation systems.
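As a point of reference for the simplest entry in the list, here is a minimal sketch of a popularity-based recommender. The toy ratings and the `most_popular` helper with its `min_ratings` threshold are assumptions made for illustration; only the column names mirror `ratings.csv`.

```python
import pandas as pd

# Toy ratings table with the same columns as ratings.csv (values are made up).
toy_ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 3],
    "book_id": [10, 20, 10, 30, 10, 20, 30],
    "rating":  [5, 3, 4, 2, 5, 4, 1],
})

def most_popular(ratings, min_ratings=2, top_n=2):
    """Rank books by mean rating, ignoring books with too few ratings."""
    stats = ratings.groupby("book_id")["rating"].agg(["count", "mean"])
    stats = stats[stats["count"] >= min_ratings]
    return stats.sort_values("mean", ascending=False).head(top_n)

popular = most_popular(toy_ratings)
print(popular)
```

Every user gets the same list here, which is why this approach is usually only a baseline for the personalized models built later.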

In the past couple of years this trend has been changing. Thanks to the success of training deep neural networks effectively, new approaches have been developed by leveraging the tools and modeling flexibility of the deep learning ecosystem.

This notebook gives a quick introduction to these concepts using neural network architectures.

## Building book Recommendation
Goal : <font color=blue>predict the rating or preference a user would give to a book, given that user's past book ratings or preferences. </font>

### Dataset : [goodbooks-10k](http://fastml.com/goodbooks-10k-a-new-dataset-for-book-recommendations/)

## Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import warnings

# Keras
from keras.layers import Input, Embedding, Flatten, Dot, Dense, Concatenate, Dropout, multiply, concatenate
from keras.models import Model
from keras.models import load_model
from keras.utils.vis_utils import model_to_dot


from sklearn.model_selection import train_test_split
from IPython.display import SVG

warnings.filterwarnings('ignore')
%matplotlib inline


## Loading data

In [None]:
ratings_df = pd.read_csv('../data/books_dataset/ratings.csv')


## EDA

In [None]:
ratings_df.head()

In [None]:
ratings_df.shape

In [None]:
ratings_df.describe().transpose()

In [None]:
# 5 ratings
ratings_df.rating.unique()

In [None]:
# total number of users
n_users = ratings_df.user_id.nunique()
n_users

In [None]:
# Total books
n_books = ratings_df.book_id.nunique()
n_books

## Create train & test set

In [None]:
train, test = train_test_split(ratings_df, test_size=0.2, random_state=42)

In [None]:
display(train.head())
display(test.head())

## Model building

### Creating dot product model
Most recommendation systems are built using a simple dot product, as shown below, but newer ones replace the simple dot product with a neural network.
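To make the idea concrete, here is a minimal NumPy sketch of a dot-product prediction. The two 5-dimensional vectors are made-up values standing in for learned embeddings; they are not taken from any trained model.

```python
import numpy as np

# Hypothetical 5-dimensional embeddings for one user and one book.
user_vec = np.array([0.5, 1.0, -0.5, 2.0, 0.0])
book_vec = np.array([1.0, 2.0, 1.0, 0.5, 3.0])

# The predicted rating is the sum of the elementwise products of the two vectors.
predicted_rating = float(np.dot(user_vec, book_vec))
print(predicted_rating)  # 3.0
```

Training adjusts both vectors so that this number moves closer to the rating the user actually gave.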


#### How the Keras functional API works
> Create the layers

> Combine them into a model

> Compile the model

> Fit the model

#### The Keras `Dot` layer is used like this
`Dot(name="Dot-Product", axes=1)([book_vec, user_vec])`

In [None]:
# creating book embedding path
book_input = Input(shape=[1], name="Book-Input")
book_embedding = Embedding(n_books+1, output_dim=5, name="Book-Embedding")(book_input)
book_vec = Flatten(name="Flatten-Books")(book_embedding)

# creating user embedding path
user_input = Input(shape=[1], name="User-Input")
user_embedding = Embedding(n_users+1, output_dim=5, name="User-Embedding")(user_input)
user_vec = Flatten(name="Flatten-Users")(user_embedding)

# performing dot product and creating model
prod = Dot(name="Dot-Product", axes=1)([book_vec, user_vec])
model = Model([user_input, book_input], prod)
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [None]:
# Model visualization
SVG(model_to_dot(model, show_shapes=True).create(prog='dot', format='svg'))


In [None]:
# model.summary()

In [None]:
def train_and_save(model, model_name):
    epoch = 5
    if os.path.exists(f'../models/{model_name}'):
        model = load_model(f'../models/{model_name}')
    else:
        history = model.fit([train.user_id, train.book_id], train.rating, epochs=epoch, verbose=1)
        model.save(f'../models/{model_name}')
        plt.plot(history.history['loss'])
        plt.xlabel("Epochs")
        plt.ylabel("Training Error")
    return model

In [None]:
model = train_and_save(model, 'dot_product.h5')

In [None]:
# Evaluate model
display(['mse, mae'])
display(model.evaluate([test.user_id, test.book_id], test.rating))


# make prediction
predictions = model.predict([test.user_id.head(10), test.book_id.head(10)])

print('predicted', 'actual')
for i in range(10):
    print(predictions[i], test.rating.iloc[i])


## Creating Neural Network

Neural networks have proved effective for many machine learning problems when enough data is available, and they often perform well for recommendation systems.

### Model 1 
Multiply the embeddings elementwise and feed the result to a fully connected Dense layer, so the model learns how much weight to give each embedding dimension instead of simply summing them.

<img src="../data/books_dataset/images/model1.png" alt="Model 1" style="width: 500px;" />
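A small NumPy sketch of why elementwise multiplication followed by a single-unit dense layer generalizes the dot product. The vectors, the weights, and the `dense_1` helper are assumptions for illustration and do not come from the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
user_vec = rng.normal(size=5)   # stand-in for a learned user embedding
book_vec = rng.normal(size=5)   # stand-in for a learned book embedding

# This is what the `multiply` layer computes: one product per dimension.
interaction = user_vec * book_vec

def dense_1(x, weights, bias=0.0):
    """Single-unit dense layer without activation: weights . x + bias."""
    return float(np.dot(weights, x) + bias)

# With all-ones weights and no bias, the dense layer is exactly the dot product.
as_dot = dense_1(interaction, np.ones(5))

# Other weights let the layer emphasize or ignore individual dimensions.
weighted = dense_1(interaction, np.array([2.0, 0.5, 1.0, 0.0, 1.5]), bias=0.1)
print(as_dot, weighted)
```

So the dot-product model is the special case where every weight is fixed at 1; Model 1 lets training choose the weights.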

In [None]:
# creating book embedding path
book_input = Input(shape=[1], name="Book-Input")
book_embedding = Embedding(n_books+1, output_dim=5, name="Book-Embedding")(book_input)

# creating user embedding path
user_input = Input(shape=[1], name="User-Input")
user_embedding = Embedding(n_users+1, output_dim=5, name="User-Embedding")(user_input)

# Combine features
o = multiply([book_embedding, user_embedding])
o = Dropout(0.5)(o)
o = Flatten(name="Flatten-Embeddings")(o)

# add fully-connected-layers
out = Dense(1)(o)

# Create model and compile it
model1 = Model([user_input, book_input], out)
model1.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [None]:
# Model visualization
SVG(model_to_dot(model1, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
model1 = train_and_save(model1, 'model1.h5')

In [None]:
# Evaluate model
display(['mse, mae'])
display(f'Dot_product : {[1.1797639321064701, 0.7913027472862053]}')
print('model 1')
display(model1.evaluate([test.user_id, test.book_id], test.rating))


# make prediction
predictions = model1.predict([test.user_id.head(10), test.book_id.head(10)])
print('predicted', 'actual')
for i in range(10):
    print(predictions[i], test.rating.iloc[i])


### Model 2 
Add bias terms to capture that a user may give consistently high scores to every book they read, or that a book may receive consistently low scores from all users.

<img src="../data/books_dataset/images/model2.png" alt="Model 2" style="width: 500px;" />
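The effect of the bias terms can be sketched in NumPy as follows. The interaction values, the all-ones dense weights, and the `predict` helper are assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np

# Stand-in for user_vec * book_vec (5 embedding dimensions).
interaction = np.array([0.2, -0.1, 0.4, 0.0, 0.3])
# One weight per feature: 5 interaction dimensions + book bias + user bias.
dense_weights = np.ones(7)

def predict(interaction, book_bias, user_bias, weights):
    """Concatenate the biases onto the features, then apply a linear layer."""
    features = np.concatenate([interaction, [book_bias], [user_bias]])
    return float(np.dot(weights, features))

neutral_user = predict(interaction, book_bias=0.5, user_bias=0.0, weights=dense_weights)
generous_user = predict(interaction, book_bias=0.5, user_bias=1.0, weights=dense_weights)
print(neutral_user, generous_user)
```

With identical taste vectors, the user with the larger bias gets a higher predicted rating, which is the behavior the bias embeddings are meant to capture.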

In [None]:
bias = 1

# creating book embedding path
book_input = Input(shape=[1], name="Book-Input")
book_embedding = Embedding(n_books+1, output_dim=5, name="Book-Embedding")(book_input)
book_bias = Embedding(n_books+1, bias, name="Book-Bias-Embedding")(book_input)

# creating user embedding path
user_input = Input(shape=[1], name="User-Input")
user_embedding = Embedding(n_users+1, output_dim=5, name="User-Embedding")(user_input)
user_bias = Embedding(n_users+1, bias, name="User-Bias-Embedding")(user_input)

# Combine features
o = multiply([book_embedding, user_embedding])
o = concatenate([o, book_bias, user_bias])
o = Dropout(0.5)(o)
o = Flatten(name="Flatten-Features")(o)

# add fully-connected-layers
out = Dense(1)(o)

# Create model and compile it
model2 = Model([user_input, book_input], out)
model2.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [None]:
# Model visualization
SVG(model_to_dot(model2, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
model2 = train_and_save(model2, 'model2.h5')

In [None]:
# Evaluate model
display(['mse, mae'])
display(f'Dot_product : {[1.1797639321064701, 0.7913027472862053]}')
display(f'Model 1 : {[0.8265591262168259, 0.7198698723681436]}')
print('model 2')
display(model2.evaluate([test.user_id, test.book_id], test.rating))


# make prediction
predictions = model2.predict([test.user_id.head(10), test.book_id.head(10)])
print('predicted', 'actual')
for i in range(10):
    print(predictions[i], test.rating.iloc[i])

### Model 3 
Add the `relu` activation function. A useful property of relu is that its gradient is exactly 1 for positive inputs (and 0 for negative inputs), so the error signal passes through active units unscaled during back-propagation.
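A minimal NumPy sketch of `relu` and its gradient. The helper names are illustrative, and taking the gradient at exactly 0 to be 0 is a convention assumed here.

```python
import numpy as np

def relu(x):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Gradient of relu: 1 where x > 0, otherwise 0 (0 is used at x == 0)."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```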

<img src="../data/books_dataset/images/model3.png" alt="Model 3" style="width: 500px;" />

In [None]:
bias = 1

# creating book embedding path
book_input = Input(shape=[1], name="Book-Input")
book_embedding = Embedding(n_books+1, output_dim=5, name="Book-Embedding")(book_input)
book_bias = Embedding(n_books+1, bias, name="Book-Bias-Embedding")(book_input)

# creating user embedding path
user_input = Input(shape=[1], name="User-Input")
user_embedding = Embedding(n_users+1, output_dim=5, name="User-Embedding")(user_input)
user_bias = Embedding(n_users+1, bias, name="User-Bias-Embedding")(user_input)

# Combine features
o = multiply([book_embedding, user_embedding])
o = concatenate([o, book_bias, user_bias])
o = Dropout(0.5)(o)
o = Flatten(name="Flatten-Features")(o)

# add fully-connected-layers
o = Dense(10, activation='relu')(o)
out = Dense(1, activation='relu')(o)

# Create model and compile it
model3 = Model([user_input, book_input], out)
model3.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [None]:
# Model visualization
SVG(model_to_dot(model3, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
model3 = train_and_save(model3, 'model3.h5')

In [None]:
# Evaluate model
display(['mse, mae'])
display(f'Dot_product : {[1.1797639321064701, 0.7913027472862053]}')
display(f'Model 1 : {[0.8265591262168259, 0.7198698723681436]}')
display(f'Model 2 : {[0.8215828870157219, 0.7241504425097942]}')

print('model 3')
display(model3.evaluate([test.user_id, test.book_id], test.rating))


# make prediction
predictions = model3.predict([test.user_id.head(10), test.book_id.head(10)])
print('predicted', 'actual')
for i in range(10):
    print(predictions[i], test.rating.iloc[i])

### Model 4 
Add more Dense layers to check whether the model improves.

In [None]:
bias = 1

# creating book embedding path
book_input = Input(shape=[1], name="Book-Input")
book_embedding = Embedding(n_books+1, output_dim=5, name="Book-Embedding")(book_input)
book_bias = Embedding(n_books+1, bias, name="Book-Bias-Embedding")(book_input)

# creating user embedding path
user_input = Input(shape=[1], name="User-Input")
user_embedding = Embedding(n_users+1, output_dim=5, name="User-Embedding")(user_input)
user_bias = Embedding(n_users+1, bias, name="User-Bias-Embedding")(user_input)

# Combine features
o = multiply([book_embedding, user_embedding])
o = concatenate([o, book_bias, user_bias])
o = Dropout(0.5)(o)
o = Flatten(name="Flatten-Features")(o)

# add fully-connected-layers
o = Dense(128, activation='relu')(o)
o = Dropout(0.2)(o)
o = Dense(32, activation='relu')(o)
o = Dropout(0.2)(o)
out = Dense(1, activation='relu')(o)

# Create model and compile it
model4 = Model([user_input, book_input], out)
model4.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [None]:
# Model visualization
SVG(model_to_dot(model4, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
model4 = train_and_save(model4, 'model4.h5')

In [None]:
# Evaluate model
display(f'Dot_product : {[1.1797639321064701, 0.7913027472862053]}')
display(f'Model 1 : {[0.8265591262168259, 0.7198698723681436]}')
display(f'Model 2 : {[0.8215828870157219, 0.7241504425097942]}')
display(f'Model 3 : {[0.7899722075320518, 0.7070955941659997]}')


display(model4.evaluate([test.user_id, test.book_id], test.rating))


# make prediction
predictions = model4.predict([test.user_id.head(10), test.book_id.head(10)])

for i in range(10):
    print(predictions[i], test.rating.iloc[i])

### Using extra information 
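Often, along with the user-interaction data, other information such as user metadata and item metadata is also available. With the networks above, adding this metadata is straightforward: define an extra input for each feature, concatenate it with the embedding features, and pass all inputs to the model. For example, with additional inputs such as `age_input` and `book_contextual_input` (not defined in this notebook):

`model = Model([user_input, book_input, age_input, book_contextual_input], out)`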
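A rough NumPy sketch of what the feature vector could look like once metadata is concatenated in. Everything here is hypothetical: the random embedding tables, the scaled `age` value, and the one-hot `book_context` vector are assumptions for illustration, and the ratings data loaded above does not include such columns.

```python
import numpy as np

rng = np.random.default_rng(42)

n_users, n_books, dim = 4, 6, 5
# Stand-ins for learned embedding tables; row i holds the vector for id i.
user_table = rng.normal(size=(n_users + 1, dim))
book_table = rng.normal(size=(n_books + 1, dim))

def build_features(user_id, book_id, age, book_context):
    """Concatenate the embedding interaction with the metadata features."""
    interaction = user_table[user_id] * book_table[book_id]
    return np.concatenate([interaction, [age], book_context])

features = build_features(
    user_id=1,
    book_id=3,
    age=0.35,                                 # hypothetical age, scaled to [0, 1]
    book_context=np.array([1.0, 0.0, 0.0]),   # hypothetical one-hot genre
)
print(features.shape)  # (9,)
```

The dense layers then operate on this longer vector exactly as they did on the embedding features alone.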


### Visualizing Embeddings
Embeddings are weights learned to represent specific variables, such as books and users in our case. We can therefore use them not only to get good results on the recommendation problem but also to extract insights about our data.

__[Tensorflow embedding projector](http://projector.tensorflow.org/)__
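One way to get such insights, sketched here under assumptions, is to look up the nearest neighbours of an item in embedding space with cosine similarity. The `most_similar` helper and the toy 2-dimensional matrix are illustrative; in the notebook the matrix would be the extracted book embedding weights, where the row index corresponds to the book id.

```python
import numpy as np

def most_similar(embeddings, item_id, top_n=3):
    """Return the row indices most similar to `item_id` by cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    scores = normalized @ normalized[item_id]
    scores[item_id] = -np.inf  # exclude the query item itself
    return np.argsort(-scores)[:top_n]

# Toy embedding matrix: row 1 points almost the same way as row 0.
toy = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
neighbours = most_similar(toy, item_id=0, top_n=2)
print(neighbours)  # [1 2]
```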

In [None]:
# Extract embeddings
book_em = model4.get_layer('Book-Embedding')   # Book-Embedding is same name that we provided to Embedding layer
book_em_weights = book_em.get_weights()[0]

In [None]:
book_em_weights[:5]

In [None]:
from sklearn.decomposition import PCA
import seaborn as sns

pca = PCA(n_components=2)   # We used output_dim=5, so 1 <= n_components <= 5
pca_result = pca.fit_transform(book_em_weights)
sns.scatterplot(x=pca_result[:, 0], y=pca_result[:,1])

### Making recommendations

In [None]:
# Creating dataset for making recommendations for the first user
book_data = ratings_df.book_id.unique()
book_data[:5]

In [None]:
np.sort(ratings_df.user_id.unique())

In [None]:
user = np.full(len(book_data), 1)  # user id 1, repeated once per book
user[:5]

In [None]:
predictions = model4.predict([user, book_data])

predictions = np.array([a[0] for a in predictions])

# argsort returns positions in book_data, not book ids, so map them back
top_idx = (-predictions).argsort()[:5]  # positions of the five highest predicted ratings
recommended_book_ids = book_data[top_idx]

recommended_book_ids

In [None]:
# print predicted scores
predictions[top_idx]

In [None]:
books = pd.read_csv('../data/books_dataset/books.csv')
books.iloc[:, :10].head()

In [None]:
books[books['id'].isin(recommended_book_ids)].iloc[:, :10]
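The ranking step can be wrapped in a small helper, sketched below. `top_k_books` and the toy ids and scores are assumptions for illustration; the point is that `argsort` yields positions in the prediction array, which must be mapped back to book ids because the id array is not necessarily sorted.

```python
import numpy as np

def top_k_books(predictions, book_ids, k=5):
    """Return the k book ids with the highest predicted rating, plus their scores."""
    predictions = np.asarray(predictions).ravel()   # flatten (n, 1) model output
    top_positions = np.argsort(-predictions)[:k]    # positions, not book ids
    return book_ids[top_positions], predictions[top_positions]

# Toy inputs: ids in arbitrary order and scores shaped like model.predict output.
book_ids = np.array([258, 4081, 260, 9296])
scores = np.array([[3.1], [4.7], [2.2], [4.9]])

ids, top_scores = top_k_books(scores, book_ids, k=2)
print(ids, top_scores)  # [9296 4081] [4.9 4.7]
```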

### YouTube recommendation engine
[Paper link](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf)

<img src="../data/books_dataset/images/YouTube.png" alt="YouTube RecSys" style="width: 600px;" />

### Summary

1. RecSys is a common use case in the data science domain.
2. A dot-product model is easy to build and is a good first step toward neural network models.
3. Built 4 Models:
	- Model 1 : Multiply(user_vec, book_vec) + Dense NN layer
	- Model 2 : Added bias inside concatenation layer concatenate([o, book_bias, user_bias])
	- Model 3 : Added 'relu' activation function inside Dense NN layer.
	- Model 4 : Added more Dense layers to check whether performance improves.
4. Embedding visualization
5. Book recommendation

### Credit where it's due
 - A brilliant [fast.ai](https://course.fast.ai/videos/?lesson=4) course by Jeremy and Rachel. Refer to Lesson 4 for Collaborative Filtering lecture.
 - Deep learning class. [link](https://m2dsupsdlclass.github.io/lectures-labs/)
 - Reference: Keras Model guide. [link](http://faroit.com/keras-docs/1.0.4/getting-started/sequential-model-guide/)

### For those who'd like to get deeper
 - Deep Recommender models using PyTorch - [Spotlight](https://github.com/maciejkula/spotlight). The [Keras implementation](https://github.com/maciejkula/triplet_recommendations_keras).
 - [YouTube Recommendation Engine](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf) (Combination of techniques)
 - [RecSys conference 2017 talks](https://towardsdatascience.com/recsys-2017-2d0879351097)


<img src="https://media.tenor.com/images/69d1d66198f1aac60ad244f6c004f372/tenor.gif" alt="Thank You" style="width: 200px;"/>