# Contents

* [Import Required Libraries](#Import-Required-Libraries) <br> <br>
* [importing Data](#Importing-Data) <br> <br>
* [Recommender System](#Recommender-System) <br>

    - [Selecting Target User](#Selecting-Target-User) <br>
    - [Filtering Dataset](#Filtering-Dataset) <br>
    - [Finding Similar Users](#Finding-Similar-Users) <br>
    - [Calculating New Rate of Each Movie](#Calculating-New-Rate-of-Each-Movie) <br>
    - [Recommending Top 5 Scored Movies](#Recommending-Top-5-Scored-Movies)

## Import Required Libraries

In [52]:
# Loading required Libraries
import pandas as pd
import numpy as np
from scipy import spatial

## Importing Data

In [53]:
# Importing Data
rating = pd.read_csv('rating.csv')
movies = pd.read_csv('movies.csv')

In [54]:
# Selecting Intended Columns
rating = rating.iloc[:, :3]

In [55]:
# Looking at Top 5 rows of rating dataset
rating.head()

Unnamed: 0,user id,item id,rating
0,196,242,3
1,186,302,3
2,22,377,1
3,244,51,2
4,166,346,1


In [56]:
# Looking at Top 5 rows of moving dataset
movies.head()

Unnamed: 0,movie id,movie name
0,1,Toy Story (1995)
1,2,GoldenEye (1995)
2,3,Four Rooms (1995)
3,4,Get Shorty (1995)
4,5,Copycat (1995)


## Recommender System

- ### Selecting Target User

In [57]:
# Selecting Target User Randomly
selected_user = rating['item id'].sample(1).iloc[0]
selected_data = rating[rating['user id'] == selected_user]

- ### Filtering Dataset

In [58]:
# Droping Selected User's Data from our Rating Data -> Because we want to find simmilar user by using this Rating dataset.
rating = rating[rating['user id'] != selected_user]
# Fiding users who has watched one of selected user's movies.
rating = rating[rating['item id'].isin(selected_data['item id'].tolist())]

In [59]:
# Grouping Rating data using user ids 
group_data = rating.groupby('user id')

- ### Finding Similar Users

In [60]:
# Creating an empty dictinary to store similarity rate of each user.
dist = {}
# Finding Degree of Similarity by calculating similarity rate
for user_id, data in group_data:
    
    data = data.sort_values('item id')
    selected_data = selected_data.sort_values('item id')
    
    temp_data = selected_data[selected_data['item id'].isin(data['item id'].tolist())]
    
    x = temp_data[' rating']
    y = data[' rating']
    
    cosine = spatial.distance.cosine(x, y)
    
    dist[user_id] = cosine

In [61]:
# Storing Similarity Rate in similarityRate column in rating dataset
rating['similarityRate'] = rating['user id'].apply(lambda x: dist[x])

- ### Calculating New Rate of Each Movie

In [62]:
# Multiplying Similarity Rate of each user in Rate of each movies the user watched
rating['simXrate'] = rating['similarityRate']*rating[' rating']

In [63]:
# Calculating Weigthed Average to Predict the rate of Target User to each movie:

## Step 1: Grouping data by item id and Sum similarityRate and simXrate columns of the dataset
recommendation = rating.groupby('item id').sum(['similarityRate', 'simXrate']).reset_index()
## Step 2: Deviding Sum of simXrate by Sum of SimilarityRate
recommendation['Predicted_Rate'] = recommendation['simXrate'] / recommendation['similarityRate']

- ### Recommending Top 5 Scored Movies

In [64]:
# Selecting Top 5 movie ids based on calculated rate
suggestions_ids = recommendation.sort_values('Predicted_Rate', ascending=False).nlargest(5, 'Predicted_Rate')['item id']

In [65]:
# Showing suggested Movies
movies[movies['movie id'].isin(suggestions_ids)]

Unnamed: 0,movie id,movie name
11,12,"Usual Suspects, The (1995)"
49,50,Star Wars (1977)
113,114,Wallace & Gromit: The Best of Aardman Animatio...
479,480,North by Northwest (1959)
602,603,Rear Window (1954)
