<h1 align='center'>Movie Recommendation System using Cosine Similarity</h1>

There are three common types of recommendation systems:
 - `Content Based Recommendation system`: 
   - Utilizes user preferences and item characteristics to suggest items similar to those the user has liked. 
   - It relies on item features, such as keywords or genres, to make personalized recommendations.
 - `Popularity Based Recommendation system`: 
   - Suggests items based on their overall popularity or frequency of interaction by users.
   - It recommends items that are widely liked or frequently accessed, without considering individual user preferences.
 - `Collaborative Based Recommendation system`: 
   - Recommends items based on user behavior and preferences by analyzing user interactions. 
   - It can be user-user collaborative filtering (similar users' preferences) or item-item collaborative filtering (similar items liked by a user).
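
For contrast with the content-based approach built below, a popularity-based recommender needs no per-user or per-item text features at all. Here is a minimal sketch, assuming a toy table with made-up titles and `vote_count` values (the helper name `most_popular` is hypothetical):

```python
import pandas as pd

# Toy data; the titles and counts are illustrative, not taken from the dataset.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "vote_count": [120, 950, 40, 510],
})

def most_popular(frame, n=2):
    """Return the n titles with the highest vote_count, the same for every user."""
    return frame.nlargest(n, "vote_count")["title"].tolist()

top_titles = most_popular(toy)
print(top_titles)
```

Every user receives the same list, which is the main limitation the content-based approach addresses.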

Here we will build a **Content Based Recommendation system**.

## 1). Importing Dependencies

In [1]:
import numpy as np
import pandas as pd
import difflib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [2]:
df=pd.read_csv('../Datasets/movies.csv')
df.head()

Unnamed: 0,index,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,...,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew,director
0,0,237000000,Action Adventure Fantasy Science Fiction,http://www.avatarmovie.com/,19995,culture clash future space war space colony so...,en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,Sam Worthington Zoe Saldana Sigourney Weaver S...,"[{'name': 'Stephen E. Rivkin', 'gender': 0, 'd...",James Cameron
1,1,300000000,Adventure Fantasy Action,http://disney.go.com/disneypictures/pirates/,285,ocean drug abuse exotic island east india trad...,en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,...,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,Johnny Depp Orlando Bloom Keira Knightley Stel...,"[{'name': 'Dariusz Wolski', 'gender': 2, 'depa...",Gore Verbinski
2,2,245000000,Action Adventure Crime,http://www.sonypictures.com/movies/spectre/,206647,spy based on novel secret agent sequel mi6,en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,...,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,Daniel Craig Christoph Waltz L\u00e9a Seydoux ...,"[{'name': 'Thomas Newman', 'gender': 2, 'depar...",Sam Mendes
3,3,250000000,Action Crime Drama Thriller,http://www.thedarkknightrises.com/,49026,dc comics crime fighter terrorist secret ident...,en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,...,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,Christian Bale Michael Caine Gary Oldman Anne ...,"[{'name': 'Hans Zimmer', 'gender': 2, 'departm...",Christopher Nolan
4,4,260000000,Action Adventure Science Fiction,http://movies.disney.com/john-carter,49529,based on novel mars medallion space travel pri...,en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,...,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,Taylor Kitsch Lynn Collins Samantha Morton Wil...,"[{'name': 'Andrew Stanton', 'gender': 2, 'depa...",Andrew Stanton


## 2). Data Preprocessing

In [3]:
df.shape

(4803, 24)

In [4]:
df.isnull().sum()

index                      0
budget                     0
genres                    28
homepage                3091
id                         0
keywords                 412
original_language          0
original_title             0
overview                   3
popularity                 0
production_companies       0
production_countries       0
release_date               1
revenue                    0
runtime                    2
spoken_languages           0
status                     0
tagline                  844
title                      0
vote_average               0
vote_count                 0
cast                      43
crew                       0
director                  30
dtype: int64

### 2.1) Selecting Relevant Features

In [5]:
selected_features=['genres','keywords','tagline','cast','director']

In [6]:
selected_features

['genres', 'keywords', 'tagline', 'cast', 'director']

### 2.2) Replacing null values with empty strings

In [7]:
for feature in selected_features:
    df[feature]=df[feature].fillna('')
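
The fill step matters because string concatenation in pandas propagates missing values: a single `NaN` in any selected column would turn the whole combined string for that row into `NaN`. A small sketch on made-up data (the column values are illustrative only):

```python
import numpy as np
import pandas as pd

# Two toy rows; the second one has a missing tagline.
toy = pd.DataFrame({
    "genres": ["Action", "Comedy"],
    "tagline": ["Enter the world", np.nan],
})

# Without filling, the missing value propagates through the concatenation.
raw = toy["genres"] + " " + toy["tagline"]

# After filling with an empty string, every row yields a usable string.
filled = toy["genres"] + " " + toy["tagline"].fillna("")

print(raw.isna().tolist())
print(filled.tolist())
```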

### 2.3) Combining All Features

In [8]:
combined_features=df['genres']+' '+df['keywords']+' '+df['tagline']+' '+df['cast']+' '+df['director']

In [9]:
combined_features

0       Action Adventure Fantasy Science Fiction cultu...
1       Adventure Fantasy Action ocean drug abuse exot...
2       Action Adventure Crime spy based on novel secr...
3       Action Crime Drama Thriller dc comics crime fi...
4       Action Adventure Science Fiction based on nove...
                              ...                        
4798    Action Crime Thriller united states\u2013mexic...
4799    Comedy Romance  A newlywed couple's honeymoon ...
4800    Comedy Drama Romance TV Movie date love at fir...
4801      A New Yorker in Shanghai Daniel Henney Eliza...
4802    Documentary obsession camcorder crush dream gi...
Length: 4803, dtype: object

### 2.4) Convert text data to feature vectors

In [10]:
vectorizer=TfidfVectorizer()

In [11]:
feature_vectors=vectorizer.fit_transform(combined_features)

In [12]:
# print(feature_vectors)

### 2.5). Cosine Similarity

In [13]:
similarity=cosine_similarity(feature_vectors)

In [14]:
print(similarity)

[[1.         0.07219487 0.037733   ... 0.         0.         0.        ]
 [0.07219487 1.         0.03281499 ... 0.03575545 0.         0.        ]
 [0.037733   0.03281499 1.         ... 0.         0.05389661 0.        ]
 ...
 [0.         0.03575545 0.         ... 1.         0.         0.02651502]
 [0.         0.         0.05389661 ... 0.         1.         0.        ]
 [0.         0.         0.         ... 0.02651502 0.         1.        ]]


In [15]:
similarity.shape

(4803, 4803)
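
Cosine similarity between two vectors is their dot product divided by the product of their lengths, so it depends on direction rather than magnitude. A minimal NumPy sketch on made-up vectors (the helper name `cosine` is hypothetical):

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 2.0, 0.0])
b = np.array([2.0, 4.0, 0.0])   # same direction as a, twice the length
c = np.array([0.0, 0.0, 3.0])   # shares no non-zero component with a

print(cosine(a, b))
print(cosine(a, c))
```

This is why the diagonal of `similarity` is 1 for any movie with at least one term (each movie compared with itself), and why two movies that share no terms score 0.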

### 2.6). Creating a list of all movie titles

In [16]:
list_of_all_movies=df['title'].tolist()

In [17]:
# print(list_of_all_movies)

## 3). Recommender System
### 3.1). Taking movie name as input

In [18]:
movie_name=input("Enter your Favourite Movie : ")

Enter your Favourite Movie : sunshine


In [19]:
movie_name

'sunshine'

### 3.2). Finding the closest match for the movie name given by the user

In [20]:
find_close_match=difflib.get_close_matches(movie_name,list_of_all_movies)

In [21]:
find_close_match

['Sunshine', 'Sunshine State', 'Shine']

In [22]:
close_match=find_close_match[0]
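
`difflib.get_close_matches` returns at most `n` candidates (3 by default) whose similarity ratio is at least `cutoff` (0.6 by default), best match first, and an empty list when nothing qualifies. The comparison is case-sensitive. Indexing the result with `[0]` therefore raises an `IndexError` for an unrecognizable title. A small sketch on a made-up title list, with a hypothetical helper `best_match` that guards against the empty case:

```python
import difflib

# Made-up titles for illustration.
titles = ["Sunshine", "Sunshine State", "Shine", "Avatar"]

def best_match(query, candidates, cutoff=0.6):
    """Return the closest title, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(best_match("sunshine", titles))
print(best_match("zzzz", titles))
```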

### 3.3). Finding the index of close_match

In [23]:
index_of_movie=df[df['title']==close_match]['index'].values[0]

In [24]:
index_of_movie

1275

### 3.4). List of similar movies

In [25]:
similarity_score=list(enumerate(similarity[index_of_movie]))

In [26]:
# similarity_score

In [27]:
len(similarity_score)

4803

### 3.5). Sorting movies based on similarity score

In [28]:
sorted_similar_movies=sorted(similarity_score,key=lambda x:x[1],reverse=True)

In [29]:
# sorted_similar_movies
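
An equivalent way to rank a row of scores is `numpy.argsort`, which returns the indices directly instead of `(index, score)` tuples. A sketch on a made-up score row (the values are illustrative):

```python
import numpy as np

# Made-up row of a similarity matrix for the item at position 1.
scores = np.array([0.10, 1.00, 0.00, 0.45, 0.30])

# Ranking with sorted() over (index, score) pairs, highest score first.
ranked_pairs = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
indices_from_pairs = [index for index, _ in ranked_pairs]

# argsort sorts ascending, so negate the scores to get descending order.
ranked_indices = np.argsort(-scores, kind="stable").tolist()

print(indices_from_pairs)
print(ranked_indices)
```

Both approaches give the same order here. For a single query the difference is negligible, but `argsort` avoids building a Python list of 4803 tuples.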

### 3.6). Finding movies to be Suggested

In [30]:
print('Recommended Movies :')
# Only the 20 highest-scoring entries are needed; enumerate supplies the rank.
for i,(index,_) in enumerate(sorted_similar_movies[:20],start=1):
    movie_title=df.loc[index,'title']
    print(f"{i} : {movie_title}")

Recommended Movies :
1 : Sunshine
2 : Men in Black II
3 : Men in Black
4 : Silent Running
5 : Cellular
6 : AVP: Alien vs. Predator
7 : Do the Right Thing
8 : Wicker Park
9 : Collateral Damage
10 : Alien
11 : Mars Attacks!
12 : Avatar
13 : X-Men: First Class
14 : 28 Days Later
15 : Transformers: Revenge of the Fallen
16 : Space Chimps
17 : Moonraker
18 : The Shadow
19 : The Helix... Loaded
20 : Star Trek IV: The Voyage Home


## 4). Final Predictive System

In [33]:
movie_name=input("Enter your Favourite Movie : ")
find_close_match=difflib.get_close_matches(movie_name,list_of_all_movies)
if not find_close_match:
    # get_close_matches returns an empty list when no title is similar enough
    print('No close match found for :',movie_name)
else:
    close_match=find_close_match[0]
    index_of_movie=df[df['title']==close_match]['index'].values[0]
    similarity_score=list(enumerate(similarity[index_of_movie]))
    sorted_similar_movies=sorted(similarity_score,key=lambda x:x[1],reverse=True)

    print('Recommended Movies :')
    # Print only the 20 highest-scoring titles
    for i,(index,_) in enumerate(sorted_similar_movies[:20],start=1):
        print(f"{i} : {df.loc[index,'title']}")

Enter your Favourite Movie : hercules
Recommended Movies :
1 : Hercules
2 : The Hustler
3 : The Scorpion King
4 : The Young Messiah
5 : Light Sleeper
6 : A Knight's Tale
7 : The Greatest Movie Ever Sold
8 : Red Dragon
9 : Back to the Future
10 : Dark City
11 : Gandhi, My Father
12 : Alexander
13 : The Last Emperor
14 : Illuminata
15 : X-Men: The Last Stand
16 : Sholem Aleichem: Laughing In The Darkness
17 : Extreme Ops
18 : Dragonslayer
19 : Splash
20 : Restless
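
The steps above can also be wrapped in a reusable function. The following is a sketch under stated assumptions, not the notebook's own code: the name `recommend` and its parameters are hypothetical, it takes the title list and a precomputed similarity matrix as arguments, it skips the query movie itself, and it returns an empty list when no title is close enough. It is shown on a tiny hand-made similarity matrix so that it runs on its own.

```python
import difflib
import numpy as np

def recommend(movie_name, titles, similarity, top_n=20):
    """Return up to top_n titles most similar to the closest match of movie_name."""
    matches = difflib.get_close_matches(movie_name, titles, n=1)
    if not matches:
        return []  # nothing close enough to the query
    query_index = titles.index(matches[0])  # first occurrence if titles repeat
    ranked = np.argsort(-similarity[query_index], kind="stable")
    ranked = [i for i in ranked if i != query_index]  # drop the query movie itself
    return [titles[i] for i in ranked[:top_n]]

# Hand-made symmetric similarity matrix for four made-up entries.
toy_titles = ["Sunshine", "Alien", "Splash", "Moonraker"]
toy_similarity = np.array([
    [1.0, 0.6, 0.1, 0.4],
    [0.6, 1.0, 0.0, 0.3],
    [0.1, 0.0, 1.0, 0.2],
    [0.4, 0.3, 0.2, 1.0],
])

print(recommend("sunshine", toy_titles, toy_similarity, top_n=2))
print(recommend("qqqq", toy_titles, toy_similarity))
```

With the notebook's variables, a call would look like `recommend(movie_name, list_of_all_movies, similarity)`. This relies on the title list and the rows of the similarity matrix being in the same order, which holds here because both are derived from `df` without reordering.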
