# Recommender

## Load User Models data

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import os
from ast import literal_eval
import pandas as pd
import numpy as np

In [3]:
os.chdir("../Dataset")

In [4]:
with open("ModelStorage/idToTitle.json", "r") as file:
    idToTitle = file.read()

In [5]:
ratings = pd.read_csv("ratings_small.csv")

In [6]:
idToTitle = literal_eval(idToTitle)

In [7]:
idToTitle = { int(k):v for k, v in idToTitle.items()}

In [8]:
idToTitle[949]

'Heat'

In [9]:
dataset = pd.read_csv("FeatureExtracted/dataset.csv")

In [10]:
dataset.drop('id', axis = 1, inplace = True)

In [11]:
with open("ModelStorage/indexToId.txt", "r") as file:
    indexToId = file.read()

In [12]:
indexToId = literal_eval(indexToId)

In [13]:
with open("ModelStorage/idToIndex.txt", "r") as file:
    idToIndex = file.read()

In [14]:
idToIndex = literal_eval(idToIndex)

In [15]:
U = np.fromfile("ModelStorage/U.txt").reshape((671,38))
b = np.fromfile("ModelStorage/b.txt").reshape((671,1))
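
`np.fromfile` reads raw binary values and stores neither shape nor dtype, which is why the arrays above are reshaped by hand. A minimal round-trip sketch, assuming a temporary file and a small made-up array (`weights` and `weights.bin` are illustrative names, not the project's real files):

```python
import os
import tempfile

import numpy as np

# Hypothetical small array standing in for the user-weight matrix U.
weights = np.arange(6, dtype=np.float64).reshape((2, 3))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.bin")
    weights.tofile(path)       # writes the raw float64 values only
    flat = np.fromfile(path)   # default dtype is float64; the result is 1-D

# The caller has to supply the shape again.
restored = flat.reshape(weights.shape)
```

If the shape passed to `reshape` does not match the number of stored values, NumPy raises a `ValueError`, so a wrong hard-coded shape fails loudly rather than silently.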

## Prediction

First we need to compute the matrix of predicted rating probabilities. This is the same step as in the machine learning phase of the classification model; see `Content_Based_Classification.ipynb` or `Content_Based_Classification_Colab.ipynb`.

In [16]:
from sklearn import preprocessing
dataset = preprocessing.scale(dataset)

In [17]:
# Append a column of ones so the bias can be folded into the matrix product
dataset = np.concatenate((dataset, np.ones((dataset.shape[0], 1))), axis = 1)

In [18]:
probs_matrix = np.matmul(np.concatenate((U,b), axis = 1), dataset.T)

In [19]:
def sigmoid(x):
    # Logistic function: maps any real score to a probability in (0, 1)
    return 1 / (1 + np.exp(-x))

In [20]:
probs_matrix = sigmoid(probs_matrix)
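
Appending a column of ones to the features and the column `b` to `U` folds the per-user bias into a single matrix product. The sketch below checks that equivalence on toy data; the sizes (3 users, 4 movies, 2 features) and the `*_toy` names are assumptions made for illustration, not the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical sizes): 3 users, 4 movies, 2 features.
U_toy = rng.normal(size=(3, 2))   # per-user feature weights
b_toy = rng.normal(size=(3, 1))   # per-user bias
X_toy = rng.normal(size=(4, 2))   # per-movie scaled features

def sigmoid(z):
    # Standard logistic function.
    return 1.0 / (1.0 + np.exp(-z))

# Augmented form: [U | b] @ [X | 1].T
X_aug = np.concatenate((X_toy, np.ones((X_toy.shape[0], 1))), axis=1)
probs_aug = sigmoid(np.concatenate((U_toy, b_toy), axis=1) @ X_aug.T)

# Explicit form: U @ X.T with the bias broadcast across movies.
probs_explicit = sigmoid(U_toy @ X_toy.T + b_toy)
```

Both forms give one row per user and one column per movie, so `probs_aug[u, m]` is the predicted probability for user `u` and movie `m`.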

## Recommendation

Specify the user ID (from 1 to 671). Row `user - 1` of `probs_matrix` holds that user's predictions.

In [21]:
user = 22

In [22]:
probs_matrix[user - 1]

array([0.47994519, 0.46313222, 0.41660993, ..., 0.46990951, 0.46955222,
       0.49098539])

In [23]:
decision_user_df = pd.DataFrame()

In [24]:
decision_user_df['movie_Id'] = list(range(probs_matrix[user - 1].size))
decision_user_df['movie_Id'] = decision_user_df['movie_Id'].apply(lambda x: indexToId[x])
decision_user_df['shouldRecommend'] = probs_matrix[user - 1]

Transform the probability into a binary decision: 1 if p >= 0.5, else 0.

1 means the user is predicted to rate the movie 4.0 or higher.

0 means the predicted rating is lower than 4.0.

In [25]:
decision_user_df['shouldRecommend'] = decision_user_df['shouldRecommend'].apply(lambda x: 1 if x>=0.5 else 0)
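
The same threshold can also be applied to a whole array at once instead of element by element. A minimal sketch with made-up probabilities (`probs` and `should_recommend` are illustrative names):

```python
import numpy as np

# Hypothetical probabilities for five movies.
probs = np.array([0.48, 0.50, 0.73, 0.12, 0.51])

# 1 where the probability reaches the 0.5 threshold, 0 otherwise.
should_recommend = (probs >= 0.5).astype(int)
```

A value of exactly 0.5 maps to 1 here, matching the `>=` comparison used in the cell above.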

In [35]:
decision_user_df

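|      | movie_Id | shouldRecommend |
|------|----------|-----------------|
| 0    | 949      | 0               |
| 1    | 710      | 0               |
| 2    | 1408     | 0               |
| 3    | 524      | 0               |
| 4    | 4584     | 0               |
| ...  | ...      | ...             |
| 2825 | 80831    | 0               |
| 2826 | 3104     | 0               |
| 2827 | 64197    | 0               |
| 2828 | 98604    | 0               |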


In [33]:
decision_user_df['shouldRecommend'].value_counts()

0    2669
1      71
Name: shouldRecommend, dtype: int64

From the user's predictions (`shouldRecommend` = 1 means a predicted rating >= 4), drop the movies they have already seen.

In [27]:
movies_have_seen = ratings[ratings['userId'] == user]['movieId'].values

In [28]:
decision_user_df[decision_user_df['movie_Id'].isin(movies_have_seen)].index

Int64Index([  37,   71,  102,  134,  174,  178,  249,  304,  351,  360,  395,
             415,  416,  429,  448,  452,  456,  474,  475,  476,  491,  507,
             537,  548,  572,  601,  609,  640,  645,  650,  652,  675,  685,
             783,  802,  804,  810,  821,  824,  833,  861,  885,  898,  925,
             950,  970, 1041, 1045, 1069, 1093, 1121, 1128, 1179, 1277, 1289,
            1326, 1327, 1393, 1403, 1454, 1457, 1480, 1503, 1548, 1572, 1586,
            1609, 1623, 1645, 1695, 1719, 1763, 1798, 1803, 1816, 1823, 1854,
            1957, 1969, 1991, 2148, 2204, 2275, 2399, 2612, 2614, 2635, 2710,
            2764, 2778],
           dtype='int64')

In [29]:
decision_user_df.drop(decision_user_df[decision_user_df['movie_Id'].isin(movies_have_seen)].index, inplace = True)

In [30]:
indexToRecommend = decision_user_df['shouldRecommend'].sort_values(ascending = False)[0:20].index
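
Once the probabilities are binarized, every recommended movie carries the same value 1, so sorting cannot order them by how strong the prediction is. One possible variation (not what the cells above do) keeps the probability in its own column and ranks by it after dropping seen movies. The data and the names `scores`, `prob`, and `seen` below are hypothetical.

```python
import pandas as pd

# Hypothetical scores for six movies; 'prob' is an illustrative column name.
scores = pd.DataFrame({
    "movie_Id": [10, 20, 30, 40, 50, 60],
    "prob": [0.91, 0.40, 0.77, 0.66, 0.52, 0.83],
})
seen = [10, 40]  # hypothetical IDs the user has already rated

# Keep unseen movies whose probability passes the 0.5 threshold.
candidates = scores[~scores["movie_Id"].isin(seen) & (scores["prob"] >= 0.5)]

# Rank by probability, highest first, and keep the top 2.
top = candidates.nlargest(2, "prob")
top_ids = top["movie_Id"].tolist()
```

Here `top_ids` is `[60, 30]`: movie 10 has the highest probability but is excluded because the user has already seen it.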

Movies the user has seen:

In [31]:
# Look up each rated movieId in indexToId, then map the result to a title
for x in movies_have_seen:
    if x in indexToId and indexToId[x] in idToTitle:
        print(idToTitle[indexToId[x]])

Strange Days
Mary Shelley's Frankenstein
Natural Born Killers
Once Were Warriors
Timecop
Sabrina
Citizen Kane
Spellbound
Bringing Up Baby
One Flew Over the Cuckoo's Nest
Amadeus
Das Boot
The Manchurian Candidate
Young Frankenstein
Ben-Hur
Sneakers
Addicted to Love
Welcome to Woop Woop
Wild Things
Rush Hour
Elizabeth
Armed and Dangerous
Office Space
The Thomas Crown Affair
Only Angels Have Wings
American Beauty
Jakob the Liar
Sleepy Hollow
Babes in Toyland
Maurice
Backdraft
The Fisher King
El Dorado
Gentlemen Prefer Blondes
Lara Croft: Tomb Raider
Ocean's Eleven
All the President's Men
Mystic River
Man of Marble
In My Skin
Girl with a Pearl Earring
Van Helsing
The Rolling Stones: Gimme Shelter
Mercy
Fail-Safe
The Miracle Worker
The Wages of Fear
Same Old Song
Nostalgia
Spring, Summer, Fall, Winter... and Spring
Hiroshima Mon Amour
Zatoichi
Samsara
The Leopard
The Polar Express
Un chien andalou
The Last Hurrah
Marlowe
Claire's Knee
Zabriskie Point
Duel
The Emigrants
The Mad Adventures of

Movies to recommend to the user:

In [32]:
for x in indexToRecommend:
    print(idToTitle[indexToId[x]])

Frankenstein 90
Buck Rogers in the 25th Century
At Risk
Duel
The Dark Knight
Godzilla vs. Mechagodzilla II
Godzilla vs. The Sea Monster
Ghost Rider
The Count of Monte-Cristo
The Bodyguard
Tuya's Marriage
Read It and Weep
Shark Kill
Titanic
The Day After
Hulk
Pirates of the Caribbean: The Curse of the Black Pearl
Dollman vs. Demonic Toys
Pirates of the Caribbean: On Stranger Tides
Performance
