# Recommender

## Load User Models data

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
import warnings
warnings.filterwarnings('ignore')

In [3]:
import os
from ast import literal_eval
import pandas as pd
import numpy as np

In [4]:
os.chdir("/content/drive/MyDrive/MovieRSystem")

In [5]:
with open("ModelStorage/idToTitle.json", "r") as file:
    idToTitle = file.read()

In [6]:
ratings = pd.read_csv("ratings_small.csv")

In [7]:
idToTitle = literal_eval(idToTitle)

In [8]:
idToTitle = { int(k):v for k, v in idToTitle.items()}

In [9]:
idToTitle[949]

'Heat'

In [10]:
dataset = pd.read_csv("FeatureExtracted/dataset.csv")

In [11]:
dataset.drop('id', axis = 1, inplace = True)

In [12]:
with open("ModelStorage/indexToId.txt", "r") as file:
    indexToId = file.read()

In [13]:
indexToId = literal_eval(indexToId)

In [14]:
with open("ModelStorage/idToIndex.txt", "r") as file:
    idToIndex = file.read()

In [15]:
idToIndex = literal_eval(idToIndex)

In [16]:
U = np.fromfile("ModelStorage/U.txt").reshape((671,38))
b = np.fromfile("ModelStorage/b.txt").reshape((671,1))

## Prediction

First, we compute the matrix of predicted rating probabilities, with one row per user and one column per movie. This is the same step as in the machine learning phase of the classification model, found in:

`Content_Based_Classification.ipynb` or `Content_Based_Classification_Colab.ipynb`
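The prediction step can be sketched on toy data to make the shapes explicit. This is a minimal illustration, not the notebook's actual model: the sizes and random values below are made up, and the names `X`, `X_aug` and `W` are introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumption): the notebook itself uses 671 users, 2830 movies, 38 features.
n_users, n_movies, n_features = 3, 5, 4

U = rng.normal(size=(n_users, n_features))   # per-user feature weights
b = rng.normal(size=(n_users, 1))            # per-user bias
X = rng.normal(size=(n_movies, n_features))  # stand-in for the scaled movie features

# Append a column of ones so the bias folds into a single matrix product.
X_aug = np.concatenate((X, np.ones((n_movies, 1))), axis=1)
W = np.concatenate((U, b), axis=1)

logits = W @ X_aug.T                     # shape (n_users, n_movies)
probs = 1.0 / (1.0 + np.exp(-logits))    # element-wise sigmoid

# Folding the bias in gives the same result as adding it explicitly.
assert np.allclose(logits, U @ X.T + b)
print(probs.shape)  # (3, 5)
```

Each row of `probs` then holds one user's predicted probabilities across all movies.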

In [17]:
from sklearn import preprocessing
dataset = preprocessing.scale(dataset)

In [18]:
# Append a bias column of ones; use the actual row count instead of a hard-coded 2830
dataset = np.concatenate((dataset, np.ones((dataset.shape[0], 1))), axis = 1)

In [19]:
probs_matrix = np.matmul(np.concatenate((U,b), axis = 1), dataset.T)

In [20]:
def sigmoid(x):
    # Logistic function: maps any real value into (0, 1)
    return 1 / (1 + np.exp(-x))

In [21]:
probs_matrix = sigmoid(probs_matrix)
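For extreme inputs, a direct sigmoid formula can overflow inside `np.exp` and only reaches the right answer through intermediate `inf` values (with runtime warnings, which this notebook suppresses). The sketch below shows one common way to avoid that by never exponentiating a large positive number. The name `stable_sigmoid` is hypothetical and is not part of the original notebook.

```python
import numpy as np

def stable_sigmoid(x):
    """Element-wise sigmoid for arrays that avoids overflow in np.exp."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, exp(x) is below 1, so it cannot overflow either.
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)  # [0.  0.5 1. ]
```

If SciPy is available, `scipy.special.expit` computes the same function and could be used instead.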

## Recommendation

Let's specify the user id (from 1 to 671; row `user - 1` of `probs_matrix` holds that user's predictions)

In [22]:
user = 22

In [23]:
probs_matrix[user - 1]

array([0.4705338 , 0.47612069, 0.47391367, ..., 0.47825316, 0.47688227,
       0.48114768])
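The indexing `probs_matrix[user - 1]` relies on user ids starting at 1 and mapping to consecutive rows. A small guard can make that assumption explicit. The helper name `user_row` and its default `n_users=671` are hypothetical, chosen here to match the shape of `U`.

```python
def user_row(user_id, n_users=671):
    """Convert a 1-based user id into a 0-based row index, with a range check."""
    if not 1 <= user_id <= n_users:
        raise ValueError(f"user_id must be between 1 and {n_users}, got {user_id}")
    return user_id - 1

print(user_row(22))  # 21
```

Without such a check, `user = 0` would silently select the last row through negative indexing.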

In [24]:
decision_user_df = pd.DataFrame()

In [25]:
decision_user_df['movie_Id'] = list(range(probs_matrix[user - 1].size))
decision_user_df['movie_Id'] = decision_user_df['movie_Id'].apply(lambda x: indexToId[x])
decision_user_df['shouldRecommend'] = probs_matrix[user - 1]

Transform the probability into a binary decision: 1 if p >= 0.5, else 0.

1 means the user is predicted to rate the movie 4.0 or higher.

0 means a predicted rating lower than 4.0.

In [26]:
decision_user_df['shouldRecommend'] = decision_user_df['shouldRecommend'].apply(lambda x: 1 if x>=0.5 else 0)
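The same thresholding can be written as a vectorized comparison rather than a per-element `apply`. This is a minimal sketch on made-up probabilities; the variable names are illustrative only.

```python
import pandas as pd

probs = pd.Series([0.47, 0.50, 0.81, 0.12])  # made-up probabilities

# The comparison yields booleans; astype(int) turns them into 1 / 0 flags.
should_recommend = (probs >= 0.5).astype(int)
print(should_recommend.tolist())  # [0, 1, 1, 0]
```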

In [27]:
decision_user_df

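|      | movie_Id | shouldRecommend |
|------|----------|-----------------|
| 0    | 949      | 0               |
| 1    | 710      | 0               |
| 2    | 1408     | 0               |
| 3    | 524      | 0               |
| 4    | 4584     | 0               |
| ...  | ...      | ...             |
| 2825 | 80831    | 0               |
| 2826 | 3104     | 0               |
| 2827 | 64197    | 0               |
| 2828 | 98604    | 0               |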


In [None]:
decision_user_df['shouldRecommend'].value_counts()

From the `shouldRecommend` predictions (predicted rating >= 4) for the user, drop the movies they have already seen.

In [28]:
movies_have_seen = ratings[ratings['userId'] == user]['movieId'].values

In [29]:
decision_user_df[decision_user_df['movie_Id'].isin(movies_have_seen)].index

Int64Index([  37,   71,  102,  134,  174,  178,  249,  304,  351,  360,  395,
             415,  416,  429,  448,  452,  456,  474,  475,  476,  491,  507,
             537,  548,  572,  601,  609,  640,  645,  650,  652,  675,  685,
             783,  802,  804,  810,  821,  824,  833,  861,  885,  898,  925,
             950,  970, 1041, 1045, 1069, 1093, 1121, 1128, 1179, 1277, 1289,
            1326, 1327, 1393, 1403, 1454, 1457, 1480, 1503, 1548, 1572, 1586,
            1609, 1623, 1645, 1695, 1719, 1763, 1798, 1803, 1816, 1823, 1854,
            1957, 1969, 1991, 2148, 2204, 2275, 2399, 2612, 2614, 2635, 2710,
            2764, 2778],
           dtype='int64')

In [30]:
decision_user_df.drop(decision_user_df[decision_user_df['movie_Id'].isin(movies_have_seen)].index, inplace = True)
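An equivalent way to exclude seen movies is a boolean mask, which avoids looking up the index and mutating the frame in place. The data below is a toy sketch with made-up flags, and `decisions`, `seen` and `unseen` are names introduced only for this example.

```python
import numpy as np
import pandas as pd

decisions = pd.DataFrame({
    "movie_Id": [949, 710, 1408, 524],
    "shouldRecommend": [1, 0, 1, 1],
})
seen = np.array([710, 524])  # ids the user has already rated

# Keep only the rows whose movie id is NOT among the seen ids.
unseen = decisions[~decisions["movie_Id"].isin(seen)]
print(unseen["movie_Id"].tolist())  # [949, 1408]
```

The original row labels are preserved, so they can still be used as keys into a mapping such as `indexToId`.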

In [31]:
indexToRecommend = decision_user_df['shouldRecommend'].sort_values(ascending = False)[0:20].index
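Because `shouldRecommend` has already been reduced to 0 or 1 at this point, sorting it cannot rank movies that share the same flag, and the order among equal values is not guaranteed by the default sort. One possible alternative, sketched here on made-up data, is to keep the raw probability in its own column and take the top entries by that value. The column name `prob` and the variable `top_k` are assumptions for this example.

```python
import pandas as pd

scores = pd.DataFrame({
    "movie_Id": [949, 710, 1408, 524, 4584],
    "prob": [0.62, 0.48, 0.91, 0.55, 0.30],  # made-up probabilities
})

top_k = 3
# nlargest returns the rows with the highest values, in descending order.
top = scores.nlargest(top_k, "prob")
print(top["movie_Id"].tolist())  # [1408, 949, 524]
```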

Movies the user has seen:

In [32]:
for x in movies_have_seen:
    # Note: x is a movieId from ratings, but here it is looked up as a key of indexToId (a row index)
    if x in indexToId and indexToId[x] in idToTitle:
        print(idToTitle[indexToId[x]])

Strange Days
Mary Shelley's Frankenstein
Natural Born Killers
Once Were Warriors
Timecop
Sabrina
Citizen Kane
Spellbound
Bringing Up Baby
One Flew Over the Cuckoo's Nest
Amadeus
Das Boot
The Manchurian Candidate
Young Frankenstein
Ben-Hur
Sneakers
Addicted to Love
Welcome to Woop Woop
Wild Things
Rush Hour
Elizabeth
Armed and Dangerous
Office Space
The Thomas Crown Affair
Only Angels Have Wings
American Beauty
Jakob the Liar
Sleepy Hollow
Babes in Toyland
Maurice
Backdraft
The Fisher King
El Dorado
Gentlemen Prefer Blondes
Lara Croft: Tomb Raider
Ocean's Eleven
All the President's Men
Mystic River
Man of Marble
In My Skin
Girl with a Pearl Earring
Van Helsing
The Rolling Stones: Gimme Shelter
Mercy
Fail-Safe
The Miracle Worker
The Wages of Fear
Same Old Song
Nostalgia
Spring, Summer, Fall, Winter... and Spring
Hiroshima Mon Amour
Zatoichi
Samsara
The Leopard
The Polar Express
Un chien andalou
The Last Hurrah
Marlowe
Claire's Knee
Zabriskie Point
Duel
The Emigrants
The Mad Adventures of

Movies to recommend to the user:

In [33]:
for x in indexToRecommend:
    print(idToTitle[indexToId[x]])

Scooby-Doo! and the Reluctant Werewolf
At Risk
Duel of Hearts
I Spy Returns
The Even Stevens Movie
Heat
The Dark Knight
The Bank Job
Hallam Foe
City of Men
The Naked Island
Downhill Racer
Mala Noche
The Promised Land
Phar Lap
The Marksman
Khadak
Marketa Lazarová
Stop-Loss
Shine a Light
