# Movie Recommendation System


A recommender system seeks to predict a user’s preferences, or to filter items, based on the user’s choices. Recommender systems are used in a variety of areas, including movies, music, news, books, research articles, search queries, social tags, and products in general.

This kernel makes recommendations on the basis of:
* Content Based Recommendations
* Similarity Based Recommendations
* Collaborative Filtering Based Recommendations
* Recommendations based on the Surprise Library
* Factors Based Recommendations
* Embedding Based Recommendations

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
import ast
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.metrics import pairwise_distances
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

from scipy.spatial.distance import cosine, correlation
from surprise import Reader, Dataset, SVD, NormalPredictor, BaselineOnly, KNNBasic, NMF
from surprise.model_selection import cross_validate, KFold ,GridSearchCV , RandomizedSearchCV

from keras.models import Sequential
from keras.callbacks import ReduceLROnPlateau, EarlyStopping
from keras.layers import  Input, dot, concatenate
from keras.models import Model
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.layers import Activation, Dense, Dropout, Embedding, Flatten, Conv1D, MaxPooling1D, LSTM

import gc
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
pd.set_option('display.max_rows',50)
pd.set_option('display.max_columns', 50)

In [None]:
def reduce_mem_usage(df):
    # iterate through all the columns of a dataframe and modify the data type
    #   to reduce memory usage.        
    
    start_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage of dataframe is {:.2f} MB'.format(start_mem))

    for col in df.columns:
        col_type = df[col].dtype

        if col_type != object:
            c_min = df[col].min()
            c_max = df[col].max()
            if str(col_type)[:3] == 'int':
                if c_min > np.iinfo(np.int8).min and c_max < np.iinfo(np.int8).max:
                    df[col] = df[col].astype(np.int8)
                elif c_min > np.iinfo(np.int16).min and c_max < np.iinfo(np.int16).max:
                    df[col] = df[col].astype(np.int16)
                elif c_min > np.iinfo(np.int32).min and c_max < np.iinfo(np.int32).max:
                    df[col] = df[col].astype(np.int32)
                elif c_min > np.iinfo(np.int64).min and c_max < np.iinfo(np.int64).max:
                    df[col] = df[col].astype(np.int64)  
            else:
                if c_min > np.finfo(np.float16).min and c_max < np.finfo(np.float16).max:
                    df[col] = df[col].astype(np.float16)
                elif c_min > np.finfo(np.float32).min and c_max < np.finfo(np.float32).max:
                    df[col] = df[col].astype(np.float32)
                else:
                    df[col] = df[col].astype(np.float64)

    end_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage after optimization is: {:.2f} MB'.format(end_mem))
    print('Decreased by {:.1f}%'.format(100 * (start_mem - end_mem) / start_mem))

    return df

# Content Based Recommendation System
Content-based filtering uses a set of discrete characteristics of an item to recommend additional items with similar properties. These methods rely on a description of the item and a profile of the user’s preferences, and they recommend items based on the user’s past preferences.
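
The idea can be illustrated with a small dependency-free sketch before the TMDB data is loaded. It builds TF-IDF vectors from item descriptions and ranks items by cosine similarity. The titles and overviews in `toy_overviews` are hypothetical, and the helper names (`tfidf_vectors`, `cosine`, `most_similar`) are made up for this illustration; the cells below use scikit-learn's `TfidfVectorizer` and `linear_kernel` instead.

```python
import math
from collections import Counter

# Hypothetical overviews, not taken from the TMDB data
toy_overviews = {
    "Space Quest": "astronaut crew explores distant planet",
    "Star Voyage": "astronaut crew travels to distant star",
    "Love in Paris": "two strangers fall in love in paris",
}

def tfidf_vectors(docs):
    """Return {title: {term: weight}} with L2-normalised TF-IDF weights."""
    tokenised = {title: text.lower().split() for title, text in docs.items()}
    n_docs = len(tokenised)
    # Document frequency: in how many overviews does each term occur?
    df = Counter(term for toks in tokenised.values() for term in set(toks))
    vectors = {}
    for title, toks in tokenised.items():
        tf = Counter(toks)
        # Smoothed idf: ln((1 + n) / (1 + df)) + 1
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[title] = {t: w / norm for t, w in vec.items()}
    return vectors

def cosine(u, v):
    """For unit-length sparse vectors the dot product is the cosine similarity."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def most_similar(title, vectors, k=2):
    """Rank every other title by similarity to `title`, best first."""
    scores = [(other, cosine(vectors[title], vec))
              for other, vec in vectors.items() if other != title]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

vectors = tfidf_vectors(toy_overviews)
ranking = most_similar("Space Quest", vectors)
print(ranking)
```

Because the vectors are normalised to unit length, a plain dot product is enough to obtain the cosine similarity, which is also why `linear_kernel` can be used on a TF-IDF matrix.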

In [None]:
credits = pd.read_csv("../input/tmdb-movie-metadata/tmdb_5000_credits.csv")
movies = pd.read_csv("../input/tmdb-movie-metadata/tmdb_5000_movies.csv")

In [None]:
display(credits.head(5))
display(movies.head(5))

In [None]:
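# The credits title column is renamed to 'tittle' so it does not clash with movies['title'] in the merge
credits.columns = ['id','tittle','cast','crew']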
movies= movies.merge(credits,on='id')
plots = movies['overview']
tfidf = TfidfVectorizer(stop_words = 'english' , max_df = 4 , min_df= 1)
plots = plots.fillna('')
tfidf_matrix = tfidf.fit_transform(plots)


In [None]:
cos_similar = linear_kernel(tfidf_matrix , tfidf_matrix)
cos_similar.shape

In [None]:
indices = pd.Series(movies.index , index = movies['title']).drop_duplicates()

In [None]:
def get_movies(title):
    idx = indices[title]
    similar = list(enumerate(cos_similar[idx]))
    similar = sorted(similar , key = lambda x: x[1] , reverse = True)
    similar = similar[:11]
    indic = []
    for i in similar:
        indic.append(i[0])
    return movies['title'].iloc[indic]


In [None]:
get_movies('Spider-Man 3')

In [None]:
get_movies('Toy Story')

# Exploring MovieLens 100K Datasets

In [None]:
print(os.listdir('../input/movielens-100k-dataset/ml-100k'))
# Use a context manager so the file handle is closed after reading
with open('../input/movielens-100k-dataset/ml-100k/README','r') as readme:
    print(readme.read())

In [None]:
info = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.info' , sep=" ", header = None)
info.columns = ['Counts' , 'Type']

occupation = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.occupation' , header = None)
occupation.columns = ['Occupations']

items = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.item' , header = None , sep = "|" , encoding='latin-1')
items.columns = ['movie id' , 'movie title' , 'release date' , 'video release date' ,
              'IMDb URL' , 'unknown' , 'Action' , 'Adventure' , 'Animation' ,
              'Childrens' , 'Comedy' , 'Crime' , 'Documentary' , 'Drama' , 'Fantasy' ,
              'Film_Noir' , 'Horror' , 'Musical' , 'Mystery' , 'Romance' , 'Sci_Fi' ,
              'Thriller' , 'War' , 'Western']

data = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.data', header= None , sep = '\t')
user = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.user', header= None , sep = '|')
genre = pd.read_csv('../input/movielens-100k-dataset/ml-100k/u.genre', header= None , sep = '|' )

genre.columns = ['Genre' , 'genre_id']
data.columns = ['user id' , 'movie id' , 'rating' , 'timestamp']
user.columns = ['user id' , 'age' , 'gender' , 'occupation' , 'zip code']


In [None]:
display(info)
display(user.shape)
display(items.shape)
display(data.shape)

In [None]:
# Merging the columns with data table to better visualise
data = data.merge(user , on='user id')
data = data.merge(items , on='movie id')

In [None]:
# Data Cleaning for Model Based Recommendation System
def convert_time(x):
    return datetime.utcfromtimestamp(x).strftime('%d-%m-%Y')
def date_diff(date):
    d1 = date['release date'].split('-')[2]
    d2 = date['rating time'].split('-')[2]
    return abs(int(d2) - int(d1))

# data.drop(columns = ['movie title' , 'video release date' , 'IMDb URL'] , inplace = True)
data.dropna(subset = ['release date'] , inplace = True)

user_details = data.groupby('user id').size().reset_index()
user_details.columns = ['user id' , 'number of user ratings']
data = data.merge(user_details , on='user id')

movie_details = data.groupby('movie id').size().reset_index()
movie_details.columns = ['movie id' , 'number of movie ratings']
data = data.merge(movie_details , on='movie id')

user_details = data.groupby('user id')['rating'].agg('mean').reset_index()
user_details.columns = ['user id' , 'average of user ratings']
data = data.merge(user_details , on='user id')

movie_details = data.groupby('movie id')['rating'].agg('mean').reset_index()
movie_details.columns = ['movie id' , 'average of movie ratings']
data = data.merge(movie_details , on='movie id')


user_details = data.groupby('user id')['rating'].agg('std').reset_index()
user_details.columns = ['user id' , 'std of user ratings']
data = data.merge(user_details , on='user id')

movie_details = data.groupby('movie id')['rating'].agg('std').reset_index()
movie_details.columns = ['movie id' , 'std of movie ratings']
data = data.merge(movie_details , on='movie id')

data['age_group'] = data['age']//10
data['rating time'] = data.timestamp.apply(convert_time)
data['time difference'] = data[['release date' , 'rating time']].apply(date_diff, axis =1)

data['total rating'] = (data['number of user ratings']*data['average of user ratings'] + data['number of movie ratings']*data['average of movie ratings'])/(data['number of movie ratings']+data['number of user ratings'])
data['rating_new'] = data['rating'] - data['total rating']

del movie_details
del user_details
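
The `total rating` column computed in the cell above is a count-weighted average of the user's mean rating and the movie's mean rating, and `rating_new` is the observed rating minus that baseline. Writing $n_u$ and $\bar r_u$ for the number and mean of user $u$'s ratings, and $n_m$ and $\bar r_m$ for the same quantities of movie $m$ (symbols introduced here only for illustration):

$$
b_{u,m} = \frac{n_u\,\bar r_u + n_m\,\bar r_m}{n_u + n_m},
\qquad
\tilde r_{u,m} = r_{u,m} - b_{u,m}
$$

For example, with hypothetical values $n_u = 20$, $\bar r_u = 4.0$, $n_m = 80$, $\bar r_m = 3.0$, the baseline is $b_{u,m} = (80 + 240)/100 = 3.2$, so an observed rating of $4$ gives $\tilde r_{u,m} = 0.8$.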

# Collaborative Filtering Based Recommendation System
Collaborative filtering approaches build a model from a user’s past behavior (e.g. items purchased or searched for) as well as similar decisions made by other users. This model is then used to predict items (or ratings for items) that the user may be interested in.
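
As a rough, dependency-free illustration of the user-based variant, the sketch below computes cosine similarity between users and predicts an unseen rating as a similarity-weighted average of other users' ratings. The data in `toy_ratings` and the helper names are hypothetical. The weighted average is one common choice and is an assumption here; the notebook's own `user_rating` function takes a plain mean over the most similar users.

```python
import math

# Hypothetical ratings: user -> {movie: rating}; missing entries are unrated
toy_ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 4, "B": 5, "D": 4},
    "carol": {"A": 1, "C": 5, "D": 2},
}

def cosine_sim(r1, r2):
    """Cosine similarity, treating unrated movies as 0 (like a pivot table filled with 0)."""
    dot = sum(r1[m] * r2[m] for m in r1.keys() & r2.keys())
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def predict(user, movie):
    """Similarity-weighted average of the ratings other users gave to `movie`."""
    num = den = 0.0
    for other, other_ratings in toy_ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine_sim(toy_ratings[user], other_ratings)
        num += sim * other_ratings[movie]
        den += sim
    return num / den if den else None

pred = predict("alice", "D")
print(round(pred, 2))
```

Alice's tastes are closer to Bob's than to Carol's, so the prediction for movie D lands nearer to Bob's rating than to Carol's.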

In [None]:
pivot_table_user = pd.pivot_table(data=data,values='rating_new',index='user id',columns='movie id')
pivot_table_user = pivot_table_user.fillna(0)
pivot_table_movie = pd.pivot_table(data=data,values='rating',index='user id',columns='movie id')
pivot_table_movie = pivot_table_movie.fillna(0)

In [None]:
user_based_similarity = 1 - pairwise_distances( pivot_table_user.values, metric="cosine" )
movie_based_similarity = 1 - pairwise_distances( pivot_table_movie.T.values, metric="cosine" )

In [None]:
# Label rows and columns with the actual ids from the pivot tables;
# ids are not guaranteed to be contiguous after rows were dropped during cleaning
user_based_similarity = pd.DataFrame(user_based_similarity , index = pivot_table_user.index , columns = pivot_table_user.index)

movie_based_similarity = pd.DataFrame(movie_based_similarity , index = pivot_table_movie.columns , columns = pivot_table_movie.columns)

In [None]:
# Testing movie based Recommendation
# DataFrame.append was removed in pandas 2.0, so the rows are collected and concatenated instead

def rec_movie(movie_id):
    # 11 most similar movies; the first entry is normally the query movie itself
    top_movies = movie_based_similarity[movie_id].sort_values(ascending = False).index.tolist()[:11]
    rows = [items[items['movie id'] == mov] for mov in top_movies]
    return pd.concat(rows , ignore_index = True)
def rec_user(user_id):
    # 101 most similar users; the first entry is normally the query user itself
    top_users = user_based_similarity[user_id].sort_values(ascending = False).index.tolist()[:101]
    rows = [user[user['user id'] == u] for u in top_users]
    return pd.concat(rows , ignore_index = True)

In [None]:
display(rec_movie(176))
display(rec_movie(11))

In [None]:
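def user_rating(x):
    similar_user = rec_user(x)
    similar_user.drop(columns= ['age' , 'gender' , 'occupation' , 'zip code'] , inplace = True)
    similar_user = similar_user.merge(pivot_table_movie , on= 'user id')
    similar_user = similar_user.set_index('user id')
    # 0 marks "not rated" in the pivot table, so turn it back into NaN before averaging
    similar_user.replace(0, np.nan, inplace=True)
    u_ratings = similar_user[similar_user.index==x]
    # Drop the first row, which is expected to be the query user
    similar_user.drop(similar_user.index[0] , inplace = True)
    # Row 0: the user's own ratings; row 1: mean rating of the similar users per movie
    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    neighbour_mean = similar_user.mean(axis = 0 , skipna = True).to_frame().T
    return pd.concat([u_ratings , neighbour_mean] , ignore_index = True)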

In [None]:
display(user_rating(771))
display(user_rating(900))

# Surprise Library Exploration

In [None]:
reader = Reader(rating_scale=(1, 5))
sup_data = Dataset.load_from_df(data[['user id', 'movie title', 'rating']], reader)

### Different Prediction Algorithms

In [None]:
algo = NormalPredictor()
cross_validate(algo, sup_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

In [None]:
algo = SVD()
cross_validate(algo, sup_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

In [None]:
algo = KNNBasic(k=20)
cross_validate(algo, sup_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

In [None]:
algo = KNNBasic(sim_options={'user_based': False} , k=20) # https://surprise.readthedocs.io/en/stable/prediction_algorithms.html#similarity-measure-configuration
cross_validate(algo, sup_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

In [None]:
algo = NMF()
cross_validate(algo, sup_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

Key parameters of `SVD`:

* n_factors – The number of factors. Default is 100.
* n_epochs – The number of iterations of the SGD procedure. Default is 20.
* init_mean – The mean of the normal distribution for factor vectors initialization. Default is 0.
* init_std_dev – The standard deviation of the normal distribution for factor vectors initialization. Default is 0.1.
* lr_all – The learning rate for all parameters. Default is 0.005.
* reg_all – The regularization term for all parameters. Default is 0.02.
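
To make the role of these parameters concrete, here is a minimal pure-Python sketch of biased matrix factorization trained with SGD. It is an illustrative approximation written for this notebook, not Surprise's actual implementation; the function `fit_svd` and the data in `toy` are hypothetical. The prediction is assumed to be the global mean plus user and item biases plus the dot product of the factor vectors.

```python
import random

def fit_svd(ratings, n_factors=100, n_epochs=20, init_mean=0.0,
            init_std_dev=0.1, lr_all=0.005, reg_all=0.02, seed=0):
    """ratings: list of (user, item, rating). Returns a predict(user, item) function."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    bu = {u: 0.0 for u in users}  # user biases start at zero
    bi = {i: 0.0 for i in items}  # item biases start at zero
    # Factor vectors are drawn from N(init_mean, init_std_dev)
    p = {u: [rng.gauss(init_mean, init_std_dev) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(init_mean, init_std_dev) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(pf * qf for pf, qf in zip(p[u], q[i]))

    for _ in range(n_epochs):  # one epoch = one pass over all ratings
        for u, i, r in ratings:
            err = r - predict(u, i)
            # lr_all scales every update; reg_all shrinks parameters towards zero
            bu[u] += lr_all * (err - reg_all * bu[u])
            bi[i] += lr_all * (err - reg_all * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr_all * (err * qif - reg_all * puf)
                q[i][f] += lr_all * (err * puf - reg_all * qif)
    return predict

# Hypothetical toy data: (user id, movie, rating)
toy = [(1, "A", 5), (1, "B", 3), (2, "A", 4), (2, "C", 1), (3, "B", 2), (3, "C", 1)]
predict = fit_svd(toy, n_factors=2, n_epochs=200)
rmse = (sum((r - predict(u, i)) ** 2 for u, i, r in toy) / len(toy)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
```

More factors and more epochs let the model fit the training ratings more closely, while a larger `reg_all` pulls the biases and factors back towards zero to limit overfitting.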


## Predictions

In [None]:
sup_train = sup_data.build_full_trainset()
algo = SVD(n_factors = 200 , lr_all = 0.005 , reg_all = 0.02 , n_epochs = 40 , init_std_dev = 0.05)
algo.fit(sup_train)

In [None]:
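def prediction_algo(uid = None , iid = None):
    # algo.predict() expects raw ids, so inner ids from the trainset are converted back first
    predictions = []
    if uid is None:
        for ui in sup_train.all_users():
            ui = sup_train.to_raw_uid(ui)
            predictions.append(algo.predict(ui, iid, verbose = False))
        return predictions
    
    if iid is None:
        for ii in sup_train.all_items():
            ii = sup_train.to_raw_iid(ii)
            predictions.append(algo.predict(uid, ii, verbose = False))
        return predictions
    # Both ids given: return a single-element list for a consistent return type
    predictions.append(algo.predict(uid,iid,verbose = False))
    return predictions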

In [None]:
predictions = prediction_algo(uid = 112)
predictions.sort(key=lambda x: x.est, reverse=True)
print('#### Best Recommended Movies are ####')
for pred in predictions[:21]:
#     print('Movie -> {} with Score-> {}'.format(sup_train.to_raw_iid(pred.iid) , pred.est))
    print('Movie -> {} with Score-> {}'.format(pred.iid , pred.est))

# Factors Based Recommendations
#### This method works by finding similar movies from their metadata rather than from user ratings, but it works well
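
The core of this approach can be sketched without any libraries. Each movie is described by a set of prefixed feature tokens (genre, cast, binned release year, and so on), and two movies are compared by the cosine similarity of their 0/1 feature vectors. The movies and tokens in `toy_movies` are hypothetical, and the sketch assumes every feature is simply present or absent, whereas the cells below build count vectors with `CountVectorizer`.

```python
import math

# Hypothetical movies described by sets of prefixed feature tokens
toy_movies = {
    "Movie A": {"genres_Action", "genres_SciFi", "cast_JaneDoe", "release_year_2010s"},
    "Movie B": {"genres_Action", "genres_SciFi", "cast_JohnRoe", "release_year_2010s"},
    "Movie C": {"genres_Romance", "cast_JaneDoe", "release_year_1990s"},
}

def binary_cosine(a, b):
    """Cosine similarity of two 0/1 vectors given as sets of active features."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def similar_movies(title, k=2):
    """Return the k movies whose feature sets are closest to `title`, best first."""
    scores = {other: binary_cosine(toy_movies[title], feats)
              for other, feats in toy_movies.items() if other != title}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

result = similar_movies("Movie A")
print(result)
```

For binary features the dot product is just the number of shared features, so movies that share many attributes rank highest regardless of how users rated them.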

In [None]:
meta_data = pd.read_csv('/kaggle/input/the-movies-dataset/movies_metadata.csv')
keywords = pd.read_csv('/kaggle/input/the-movies-dataset/keywords.csv')
credits = pd.read_csv('/kaggle/input/the-movies-dataset/credits.csv')

meta_data = meta_data[meta_data.id!='1997-08-20']
meta_data = meta_data[meta_data.id!='2012-09-29']
meta_data = meta_data[meta_data.id!='2014-01-01']
meta_data = meta_data.astype({'id':'int64'})

meta_data = meta_data.merge(keywords , on = 'id')
meta_data = meta_data.merge(credits , on = 'id')

In [None]:
def null_values(df):
    for col in df.columns:
        if df[col].isnull().sum() != 0:
            print('Total values missing in {} are {}'.format(col , df[col].isnull().sum()))
null_values(meta_data)

In [None]:
meta_data[meta_data['production_companies'].isnull()]
meta_data.dropna(subset=['production_companies'] , inplace = True)

In [None]:
def btc_function(data):
    if type(data) == str:
        return ast.literal_eval(data)['name'].replace(" ","")
    return data
# https://www.kaggle.com/hadasik/movies-analysis-visualization-newbie
def get_values(data_str):
    if isinstance(data_str, float):
        pass
    else:
        values = []
        data_str = ast.literal_eval(data_str)
        if isinstance(data_str, list):
            for k_v in data_str:
                values.append(k_v['name'].replace(" ",""))
            return str(values)[1:-1]
        else:
            return None


In [None]:
meta_data['btc_name'] = meta_data.belongs_to_collection.apply(btc_function)
meta_data[['genres', 'production_companies', 'production_countries', 'spoken_languages' ,'keywords','cast', 'crew']] = meta_data[['genres', 'production_companies', 'production_countries', 'spoken_languages' ,'keywords' ,'cast' , 'crew']].applymap(get_values)
meta_data['is_homepage'] = meta_data['homepage'].isnull()

In [None]:
meta_data['status'] = meta_data['status'].fillna('')
meta_data['original_language'] = meta_data['original_language'].fillna('')
meta_data['btc_name'] = meta_data['btc_name'].fillna('')

In [None]:
meta_data.drop_duplicates(inplace = True)
meta_data.drop(index = [2584 , 201 , 963 , 5769 , 5931 , 5175, 5587 , 845, 9661 ,11448 , 4145 , 4394 , 11254 , 10511 , 13335 , 13334 , 13329 , 16345 , 16348 , 16349 , 9658 , 9662 , 4391 , 4395 , 846 , 849 , 850 , 5927 , 5932 , 24363 , 33395 , 14101] , inplace = True)

In [None]:
def vector_values(df , columns , min_df_value):
    c_vector = CountVectorizer(min_df = min_df_value)
    df_1 = pd.DataFrame(index = df.index)
    for col in columns:
        print(col)
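        # get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() is available from 1.0
        df_1 = df_1.join(pd.DataFrame(c_vector.fit_transform(df[col]).toarray(),columns =c_vector.get_feature_names_out(),index= df.index).add_prefix(col+'_'))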
    return df_1
meta_data_addon_1 = vector_values(meta_data , columns = ['status','original_language','genres', 'production_companies' ,'production_countries' , 'spoken_languages' , 'keywords' , 'cast' ,'crew'] ,min_df_value = 20)
meta_data_addon_2 = vector_values(meta_data , columns = ['btc_name'] , min_df_value = 2)

In [None]:
col = ['belongs_to_collection', 'genres' , 'homepage' , 'id' , 'imdb_id' , 'overview' ,'poster_path' , 'status' , 'original_language' , 
'production_companies', 'production_countries', 'spoken_languages', 'keywords',  'cast',  'crew', 'tagline','adult'  ]
meta_data.drop(columns = col , inplace = True)
col = [ 'video', 'is_homepage']
for c in col:
    meta_data[c] = meta_data[c].astype(bool)
    meta_data[c] = meta_data[c].astype(int)

In [None]:
def get_year(date):
    return str(date).split('-')[0]
meta_data['popularity'] = meta_data['popularity'].astype(float)
meta_data['budget'] = meta_data['budget'].astype(float)
meta_data['vote_average_group'] = pd.qcut(meta_data['vote_average'], q=10, precision=2,duplicates = 'drop')
meta_data['popularity_group'] = pd.qcut(meta_data['popularity'], q=10, precision=2,duplicates = 'drop')
meta_data['runtime_group'] = pd.qcut(meta_data['runtime'], q=10, precision=2,duplicates = 'drop')
meta_data['budget_group'] = pd.qcut(meta_data['budget'], q=10, precision=2,duplicates = 'drop')
meta_data['revenue_group'] = pd.qcut(meta_data['revenue'], q=10, precision=2,duplicates = 'drop')
meta_data['vote_count_group'] = pd.qcut(meta_data['vote_count'], q=10, precision=2,duplicates = 'drop')
meta_data['release_year'] = meta_data['release_date'].apply(get_year)
meta_data['release_year'] = meta_data['release_year'].fillna('')
meta_data['release_year'] = meta_data['release_year'].astype(float)
meta_data['release_year_group'] = pd.qcut(meta_data['release_year'], q=10, precision=2,duplicates = 'drop')
meta_data['title_new'] = meta_data.apply(lambda x: str(x['title'])+' ('+str(x['release_date'])+')' , axis =1)

In [None]:
meta_data_addon_3 = pd.get_dummies(meta_data[['vote_average_group' , 'popularity_group' , 'runtime_group' , 'budget_group' , 'revenue_group' , 'vote_count_group' , 'release_year_group']])
meta_data_train = pd.concat([meta_data_addon_1,meta_data_addon_2,meta_data_addon_3 , meta_data[['video' , 'is_homepage']]] , axis = 1)

In [None]:
meta_data_train.index = meta_data['title_new']

In [None]:
del meta_data_addon_1,meta_data_addon_2,meta_data_addon_3
gc.collect()

In [None]:
def get_similar_movies(movie_title , num_rec = 10):
    try:
        sample_1 = 1 - pairwise_distances([meta_data_train.loc[movie_title].values] , meta_data_train.values , metric = 'cosine')
        sample_1 = pd.DataFrame(sample_1.T , index = meta_data_train.index )
        return sample_1.sort_values(by = 0 , ascending  = False).head(num_rec).index
    except ValueError as e:
        print(e)
#         sample_1 = 1 - pairwise_distances(meta_data_train.loc[movie_title].values, meta_data_train.values , metric = 'cosine')
#         sample_1 = pd.DataFrame(sample_1.T , index = meta_data_train.index )
#         return sample_1.sort_values(by = 0 , ascending  = False).head(20).index.names

In [None]:
print(get_similar_movies('Undisputed III : Redemption (2010-05-22)'))
print(get_similar_movies('Finding Nemo (2003-05-30)'))
print(get_similar_movies('Mindhunters (2004-05-07)'))
print(get_similar_movies('Thor (2011-04-21)'))
print(get_similar_movies('Kong: Skull Island (2017-03-08)'))

As the output above shows, the recommendations look sensible for these titles.

**Taking it to the next level**

In [None]:
def multi_rec(seen_movies , num_rec = 5):
    # Collect the num_rec most similar titles for every movie the user has seen
    rec_movies = []
    for mov in seen_movies:
        rec_movies.append(get_similar_movies(mov , num_rec).values)
    return rec_movies
multi_rec(['Star Wars: The Clone Wars (2008-08-05)' , 'Marvel One-Shot: Item 47 (2012-09-13)'])

# Matrix Factorization with Keras

https://www.kaggle.com/fuzzywizard/rec-sys-collaborative-filtering-dl-techniques#4)-Matrix-Factorization-using-Deep-Learning-(Keras)

In [None]:
data = data.sample(frac = 1)
data_train_x = np.array(data[['user id' , 'movie id']].values)
data_train_y = np.array(data['rating'].values)
x_train, x_test, y_train, y_test = train_test_split(data_train_x, data_train_y, test_size = 0.2, random_state = 98)
n_factors = 50
n_users = data['user id'].max()
n_movies = data['movie id'].max()

In [None]:
user_input = Input(shape=(1,), name='User_Input')
user_embeddings = Embedding(input_dim = n_users+1, output_dim=n_factors, input_length=1,name='User_Embedding')(user_input)
user_vector = Flatten(name='User_Vector') (user_embeddings)

movie_input = Input(shape = (1,) , name = 'Movie_input')
movie_embeddings = Embedding(input_dim = n_movies+1 , output_dim = n_factors , input_length = 1 , name = 'Movie_Embedding')(movie_input)
movie_vector = Flatten(name = 'Movie_Vector')(movie_embeddings)

merged_vectors = concatenate([user_vector, movie_vector], name='Concatenation')
dense_layer_1 = Dense(100 , activation = 'relu')(merged_vectors)
dense_layer_3 = Dropout(.5)(dense_layer_1)
dense_layer_2 = Dense(1)(dense_layer_3)
model = Model([user_input, movie_input], dense_layer_2)

In [None]:
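# Ratings are predicted as a regression target, so the MSE loss is the meaningful quantity here;
# 'accuracy' carries little information for continuous outputs
model.compile(loss='mean_squared_error', optimizer='adam' ,metrics = ['accuracy'] )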
model.summary()

In [None]:
SVG(model_to_dot( model,  show_shapes=True, show_layer_names=True).create(prog='dot', format='svg'))

In [None]:
history = model.fit(x = [x_train[:,0] , x_train[:,1]] , y =y_train , batch_size = 128 , epochs = 30 , validation_data = ([x_test[:,0] , x_test[:,1]] , y_test))

In [None]:
loss , val_loss , accuracy , val_accuracy = history.history['loss'],history.history['val_loss'],history.history['accuracy'],history.history['val_accuracy']

In [None]:
plt.figure(figsize = (12,10))
plt.plot( loss, 'r--')
plt.plot(val_loss, 'b-')
plt.plot( accuracy, 'g--')
plt.plot(val_accuracy,'-')
plt.legend(['Training Loss', 'Validation Loss' , 'Training Accuracy' , 'Validation Accuracy'])
plt.xlabel('Epoch')
plt.ylabel('Loss / Accuracy')
plt.show()

In [None]:
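# evaluate() returns [loss, accuracy]; the loss is the MSE, so its square root is the test RMSE
score = model.evaluate([x_test[:,0], x_test[:,1]], y_test)
print('Test RMSE:', np.sqrt(score[0]))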

# Recommendation Systems Paper List
## Survey papers
* Recommender systems survey [Knowledge-based systems 2013]
* Deep Learning based Recommender System: A Survey and New Perspectives [2017]
* A Survey on Session-based Recommender System [2019] [[__pdf__](https://arxiv.org/pdf/1902.04864.pdf)]

## Recommendation Systems with Social Information 
* SoRec: Social Recommendation Using Probabilistic Matrix Factorization [CIKM 2008]
* A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks [RecSys 2010]
* Recommender systems with social regularization [WSDM 2011]
* On Deep Learning for Trust-Aware Recommendations in Social Networks [IEEE 2017]
* Learning to Rank with Trust and Distrust in Recommender Systems [RecSys 2017]
* Social Attentional Memory Network: Modeling Aspect- and Friend-level Differences in Recommendation [WSDM 2019]
    - code : https://github.com/chenchongthu/SAMN
* Session-based Social Recommendation via Dynamic Graph Attention Networks [WSDM 2019]
  - code : https://github.com/DeepGraphLearning/RecommenderSystems/tree/master/socialRec
* Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems [WWW 2019]
* Heterogeneous Graph Attention Network [WWW 2019]
* Graph Neural Networks for Social Recommendation [WWW 2019]
* GhostLink: Latent Network Inference for Influence-aware Recommendation [WWW 2019]
* SamWalker: Social Recommendation with Informative Sampling Strategy [WWW 2019]
* Social Recommendation with Optimal Limited Attention [KDD 2019]

## Recommendation Systems with Text Information
  ### Topic-based approach
  * Collaborative topic modeling for recommending scientific articles [KDD 2011]
    - code : https://github.com/blei-lab/ctr
  * Hidden factors and hidden topics: understanding rating dimensions with review text [RecSys 2013]
    - code : https://github.com/lipiji/HFT
  * Jointly modeling aspects, ratings and sentiments for movie recommendation [KDD 2014]
    - code : https://github.com/nihalb/JMARS
  * Ratings meet reviews, a combined approach to recommend [RecSys 2014]
  * Exploring User-Specific Information in Music Retrieval [SIGIR 2018]
  * Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews [WWW 2018]
    - code : https://github.com/hustlingchen/ALFM
  * Exploiting Ratings, Reviews and Relationships for Item Recommendations in Topic Based Social Networks [WWW 2019]
  
  ### Deep learning-based approach
  * Collaborative deep learning for recommender systems [KDD 2015]
    - code : https://github.com/js05212/CDL
  * Convolutional Matrix Factorization for Document Context-Aware Recommendation [RecSys 2016]
    - code : https://github.com/cartopy/ConvMF
  * Joint Deep Modeling of Users and Items Using Reviews for Recommendation [WSDM 2017]
    - code : https://github.com/chenchongthu/DeepCoNN
  * Transnets: Learning to transform for recommendation [RecSys 2017]
    - code : https://github.com/rosecatherinek/TransNets
  * Latent Cross: Making Use of Context in Recurrent Recommender Systems [WSDM 2018]
  * Coevolutionary Recommendation Model: Mutual Learning between Ratings and Reviews [WWW 2018]
  * Neural Attentional Rating Regression with Review-level Explanations [WWW 2018]
    - code : https://github.com/chenchongthu/NARRE
  * Learning Personalized Topical Compositions with Item Response Theory [WSDM 2019]
  * Uncovering Hidden Structure in Sequence Data via Threading Recurrent Models [WSDM 2019]
  * Gated Attentive-Autoencoder for Content-Aware Recommendation [WSDM 2019]
    - code : https://github.com/allenjack/GATE
  * DAML: Dual Attention Mutual Learning between Ratings and Reviews for Item Recommendation [KDD 2019]   
    
## Explainable Recommendation Systems
* Social Collaborative Viewpoint Regression with Explainable Recommendations [WSDM 2017]
* Explainable Recommendation via Multi-Task Learning in Opinionated Text Data [SIGIR 2018]
* TEM: Tree-enhanced Embedding Model for Explainable Recommendation [WWW 2018]

## Session-Based Recommendation Systems
### Markov-chain based approach
* Factorizing Personalized Markov Chains for Next-Basket Recommendation [WWW 2010]
* Where You Like to Go Next: Successive Point-of-Interest Recommendation [IJCAI 2013]
* Learning Hierarchical Representation Model for NextBasket Recommendation [SIGIR 2015]
* Fusing Similarity Models with Markov Chains for Sparse Sequential Recommendation [ICDM 2016]
* Translation-based Recommendation [RecSys 2017]
    - code : https://drive.google.com/file/d/0B9Ck8jw-TZUEVmdROWZKTy1fcEE/view
    
### RNN based approach
* Session-based Recommendations with Recurrent Neural Networks [ICLR 2016]
  - code : https://github.com/hidasib/GRU4Rec
* Neural Attentive Session-based Recommendation [CIKM 2017]
  - code : https://github.com/lijingsdu/sessionRec_NARM
* Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks [RecSys 2017]
* When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation [RecSys 2017]
* Modeling User Session and Intent with an Attention-based Encoder-Decoder Architecture [RecSys 2017]
* Learning from History and Present: Next-item Recommendation via Discriminatively Exploiting User Behaviors [KDD 2018]
* Recurrent Neural Networks with Top-k Gains for Session-based Recommendations [CIKM 2018]
* Hierarchical Context enabled Recurrent Neural Network for Recommendation [AAAI 2019]
* RepeatNet: A Repeat Aware Neural Recommendation Machine for Session-based Recommendation [AAAI 2019]
  - code : https://github.com/PengjieRen/RepeatNet
* Time is of the Essence: a Joint Hierarchical RNN and Point Process Model for Time and Item Predictions [WSDM 2019]
  - code : https://github.com/BjornarVass/Recsys
* Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks [KDD 2019]
* AIR: Attentional Intention-Aware Recommender Systems [ICDE 2019]

### CNN based approach 
* 3D Convolutional Networks for Session-based Recommendation with Content Features [RecSys 2017]
* Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding [WSDM 2018]
  - code : https://github.com/graytowne/caser_pytorch [Pytorch]
  - code : https://github.com/graytowne/caser [Matlab]
* Hierarchical Temporal Convolutional Networks for Dynamic Recommender Systems [WWW 2019]
* A Simple Convolutional Generative Network for Next Item Recommendation [WSDM 2019]
  - code : https://github.com/graytowne/caser_pytorch
  
### Graph based approach
* Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba [KDD 2018]
* Graph Convolutional Neural Networks for Web-Scale Recommender Systems [KDD 2018]
* Session-based Recommendation with Graph Neural Networks [AAAI 2019]
  - code : https://github.com/CRIPAC-DIG/SR-GNN
* Session-based Social Recommendation via Dynamic Graph Attention Networks [WSDM 2019]
  - code : https://github.com/DeepGraphLearning/RecommenderSystems/tree/master/socialRec  
* Graph Contextualized Self-Attention Network for Session-based Recommendation [IJCAI 2019]

### Other approach
* Diversifying Personalized Recommendation with User-session Context [IJCAI 2017]
* Translation-based Factorization Machines for Sequential Recommendation [RecSys 2018]
* Attention-Based Transactional Context Embedding for Next-Item Recommendation [AAAI 2018]
* STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation [KDD 2018]
  - code : https://github.com/uestcnlp/STAMP
* Self-Attentive Sequential Recommendation [ICDM 2018]
  - code : https://github.com/kang205/SASRec
* Taxonomy-aware Multi-hop Reasoning Networks for Sequential Recommendation [WSDM 2019]
  - code : https://github.com/RUCDM/TMRN
* Hierarchical Neural Variational Model for Personalized Sequential Recommendation [WWW 2019]
* BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer [CIKM 2019]
* Hierarchical Gating Networks for Sequential Recommendation [KDD 2019]
* Online Purchase Prediction via Multi-Scale Modeling of Behavior Dynamics [KDD 2019]
* Streaming Session-based Recommendation [KDD 2019]
* Log2Intent: Towards Interpretable User Modeling via Recurrent Semantics Memory Unit [KDD 2019]

### News Recommendation
* Google news personalization: scalable online collaborative filtering [WWW 2007]
* Personalized News Recommendation Based on Click Behavior [IUI 2009]
* Personalized News Recommendation Using Twitter [IEEE 2013]
* Recommending Personalized News in Short User Sessions [RecSys 2017]
* Embedding-based News Recommendation for Millions of Users [KDD 2017]
* DKN: Deep Knowledge-Aware Network for News Recommendation [WWW 2018] 
* NPA: Neural News Recommendation with Personalized Attention [KDD 2019]

### Video Recommendation
* Video suggestion and discovery for youtube: taking random walks through the view graph [WWW 2008]
* The YouTube Video Recommendation System [RecSys 2010]
* Deep Neural Networks for YouTube Recommendations [RecSys 2016]
* Wide & Deep Learning for Recommender Systems [DLRS 2016]
* Content-based Related Video Recommendations [NIPS 2016]

### Music Recommendation
* Playlist prediction via metric embedding [KDD 2012]
* Deep content-based music recommendation [NIPS 2013]
* Improving Content-based and Hybrid Music Recommendation using Deep Learning [MM 2014]
* Content-aware collaborative music recommendation using pre-trained neural networks [ISMIR 2015] 

### Image Recommendation
* Pagerank for product image search [WWW 2008]
* Related Pins at Pinterest: The Evolution of a Real-World Recommender System [WWW 2017]
* Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time [WWW 2018]

## Time-aware Recommendation (Temporal Dynamics)
* Time Weight Collaborative Filtering [CIKM 2005]
* Collaborative Filtering with Temporal Dynamics [KDD 2009]
* Opportunity Models for E-commerce Recommendation: Right Product, Right Time [SIGIR 2013] 
* Multi-rate deep learning for temporal recommendation [SIGIR 2016]
* Recurrent Recommender Networks [WSDM 2017]
* Recurrent Recommendation with Local Coherence [WSDM 2019]

## Multi-Armed Bandit
* A Contextual-Bandit Approach to Personalized News Article Recommendation [WWW 2010]
* A survey of online experiment design with the stochastic multi-armed bandit [2015] [[__pdf__](https://arxiv.org/pdf/1510.00757.pdf)]
* Collaborative filtering as a multi-armed bandit [NIPS 2015]
* Online Context-Aware Recommendation with Time Varying Multi-Arm Bandit [KDD 2016]
* Collaborative Filtering Bandits [SIGIR 2016]

## Out of Category
* Learning Multiple Similarities of Users and Items in Recommender Systems [ICDM 2017]
* Neural Collaborative Filtering [WWW 2017]
* MRNet-Product2Vec: A Multi-task Recurrent Neural Network for Product Embeddings [ECML-PKDD 2017]
* A Gradient-based Adaptive Learning Framework for Efficient Personal Recommendation [RecSys 2017]
* IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models [SIGIR 2017]
  - code : https://github.com/geek-ai/irgan
* Collaborative Memory Network for Recommendation Systems [SIGIR 2018]
  - code : https://github.com/tebesu/CollaborativeMemoryNetwork
* Variational Autoencoders for Collaborative Filtering [WWW 2018]
* Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking [WWW 2018]
* Causal Embeddings for Recommendation [RecSys 2018]
  - code : https://github.com/criteo-research/CausE
* Linked Variational AutoEncoders for Inferring Substitutable and Supplementary Items [WSDM 2019]
  - code : https://github.com/VRM1/WSDM19
* RecWalk: Nearly Uncoupled Random Walks for Top-N Recommendation [WSDM 2019]
  - code : https://github.com/nikolakopoulos/RecWalk

