<h1>Movie Recommendation System</h1>

A <strong>Recommender System</strong> is a system that seeks to predict or filter preferences according to the user's choices. Recommender systems are
used in a variety of areas including movies, music, news, books, research articles, search queries, social tags, and products in general.
Recommender systems typically produce a list of recommendations in one of two ways:

<strong>Collaborative filtering:</strong> Collaborative filtering approaches build a model from the user's past behavior (for example, items purchased or searched for by the
user) as well as similar decisions made by other users. This model is then used to predict items (or ratings for items) that the user may have an
interest in.

<strong>Content-based filtering:</strong> Content-based filtering approaches use a series of discrete characteristics of an item in order to recommend
additional items with similar properties. These methods rely entirely on a description of the item and a profile of the user's
preferences, so they recommend items based on what the user has liked in the past.
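
To make the collaborative filtering idea above concrete, here is a minimal sketch that predicts one user's rating for an unseen movie from the ratings of similar users. The users, the ratings, and the helper names `user_similarity` and `predict_rating` are all hypothetical and are not part of the dataset used later. The similarity-weighted average is only one of several possible prediction rules.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {movie: rating}); not taken from movies.csv.
ratings = {
    "ann": {"Avatar": 5, "Spectre": 4, "John Carter": 1},
    "bob": {"Avatar": 4, "Spectre": 5, "The Dark Knight Rises": 5},
    "cat": {"Avatar": 1, "John Carter": 5, "The Dark Knight Rises": 2},
}

def user_similarity(a, b):
    """Cosine similarity between two users over the movies both have rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in shared)
    norm_a = sqrt(sum(ratings[a][m] ** 2 for m in shared))
    norm_b = sqrt(sum(ratings[b][m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict_rating(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        weight = user_similarity(user, other)
        weighted_sum += weight * other_ratings[movie]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None

predicted = predict_rating("ann", "The Dark Knight Rises")
print(predicted)
```

In this toy data "ann" rates movies much like "bob" and unlike "cat", so the prediction is pulled towards bob's high rating.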

Let's develop a basic recommendation system using Python and Pandas that suggests the items most similar to a particular item, in this case movies. It simply reports
which movies are most similar to the movie the user chooses.
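
Before working with the full dataset, the "most similar item" idea can be sketched on a handful of titles. The example below uses Jaccard overlap of genre tags, which is a simpler measure than the TF-IDF and cosine similarity used in the rest of this notebook. The `catalogue` dictionary and the function names are hypothetical; the genre tags follow the first rows of the dataset preview.

```python
# Hypothetical mini-catalogue: title -> set of genre tags (illustrative only).
catalogue = {
    "Avatar": {"action", "adventure", "fantasy", "science fiction"},
    "John Carter": {"action", "adventure", "science fiction"},
    "Spectre": {"action", "adventure", "crime"},
    "The Dark Knight Rises": {"action", "crime", "drama", "thriller"},
}

def jaccard(a, b):
    """Share of tags two items have in common: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)

def most_similar(title, n=2):
    """Return the n other titles whose tags overlap most with `title`."""
    scores = [
        (other, jaccard(catalogue[title], tags))
        for other, tags in catalogue.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]

similar = most_similar("Avatar")
print(similar)
```

The same pattern (score every other item against the chosen one, sort, keep the top few) is what the final cell of this notebook does with cosine similarity.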

<h1>Import Libraries</h1>

In [1]:
import pandas as pd
import numpy as np
import difflib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

<h1>Import Data</h1>

In [2]:
movies = pd.read_csv('movies.csv')

<h1>Data Preprocessing</h1>

In [3]:
movies.head()

Unnamed: 0,index,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,...,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew,director
0,0,237000000,Action Adventure Fantasy Science Fiction,http://www.avatarmovie.com/,19995,culture clash future space war space colony so...,en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,Sam Worthington Zoe Saldana Sigourney Weaver S...,"[{'name': 'Stephen E. Rivkin', 'gender': 0, 'd...",James Cameron
1,1,300000000,Adventure Fantasy Action,http://disney.go.com/disneypictures/pirates/,285,ocean drug abuse exotic island east india trad...,en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,...,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,Johnny Depp Orlando Bloom Keira Knightley Stel...,"[{'name': 'Dariusz Wolski', 'gender': 2, 'depa...",Gore Verbinski
2,2,245000000,Action Adventure Crime,http://www.sonypictures.com/movies/spectre/,206647,spy based on novel secret agent sequel mi6,en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,...,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,Daniel Craig Christoph Waltz L\u00e9a Seydoux ...,"[{'name': 'Thomas Newman', 'gender': 2, 'depar...",Sam Mendes
3,3,250000000,Action Crime Drama Thriller,http://www.thedarkknightrises.com/,49026,dc comics crime fighter terrorist secret ident...,en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,...,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,Christian Bale Michael Caine Gary Oldman Anne ...,"[{'name': 'Hans Zimmer', 'gender': 2, 'departm...",Christopher Nolan
4,4,260000000,Action Adventure Science Fiction,http://movies.disney.com/john-carter,49529,based on novel mars medallion space travel pri...,en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,...,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,Taylor Kitsch Lynn Collins Samantha Morton Wil...,"[{'name': 'Andrew Stanton', 'gender': 2, 'depa...",Andrew Stanton


In [4]:
movies.shape

(4803, 24)

In [5]:
movies.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4803 entries, 0 to 4802
Data columns (total 24 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   index                 4803 non-null   int64  
 1   budget                4803 non-null   int64  
 2   genres                4775 non-null   object 
 3   homepage              1712 non-null   object 
 4   id                    4803 non-null   int64  
 5   keywords              4391 non-null   object 
 6   original_language     4803 non-null   object 
 7   original_title        4803 non-null   object 
 8   overview              4800 non-null   object 
 9   popularity            4803 non-null   float64
 10  production_companies  4803 non-null   object 
 11  production_countries  4803 non-null   object 
 12  release_date          4802 non-null   object 
 13  revenue               4803 non-null   int64  
 14  runtime               4801 non-null   float64
 15  spoken_languages     

<h1>Defining features </h1>

In [6]:
selected_features = ['genres', 'tagline', 'cast', 'director']

<h1>Handling missing values of selected features</h1>

In [7]:
movies[selected_features].isnull().sum()

genres       28
tagline     844
cast         43
director     30
dtype: int64

In [8]:
for feature in selected_features:
    movies[feature] = movies[feature].fillna('')
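
The reason for filling missing values with empty strings is that string concatenation in pandas propagates missing values: if any one field is missing, the combined text for that movie becomes missing too. The sketch below shows this on a hypothetical two-row frame (`toy` is not the real dataset).

```python
import pandas as pd

# Hypothetical two-row frame standing in for movies.csv.
toy = pd.DataFrame({
    "genres": ["Action", "Drama"],
    "tagline": ["The Legend Ends", None],  # second tagline is missing
})

# Without filling, a missing field makes the whole combined value missing.
combined_raw = toy["genres"] + " " + toy["tagline"]

# After filling with '', the row keeps whatever fields it does have.
combined_filled = toy["genres"] + " " + toy["tagline"].fillna("")

print(combined_raw.isna().tolist())
print(combined_filled.tolist())
```

A missing combined value would also cause trouble later, since the vectorizer expects a string for every row.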

In [9]:
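# Combine all selected features into one string per movie.
# Note: the '' separator joins the fields with no space between them, so the
# last word of one field fuses with the first word of the next (see output).
all_features = movies['genres']+''+movies['tagline']+''+movies['cast']+''+movies['director']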

In [10]:
all_features

0       Action Adventure Fantasy Science FictionEnter ...
1       Adventure Fantasy ActionAt the end of the worl...
2       Action Adventure CrimeA Plan No One EscapesDan...
3       Action Crime Drama ThrillerThe Legend EndsChri...
4       Action Adventure Science FictionLost in our wo...
                              ...                        
4798    Action Crime ThrillerHe didn't come looking fo...
4799    Comedy RomanceA newlywed couple's honeymoon is...
4800    Comedy Drama Romance TV MovieEric Mabius Krist...
4801    A New Yorker in ShanghaiDaniel Henney Eliza Co...
4802    DocumentaryDrew Barrymore Brian Herzlinger Cor...
Length: 4803, dtype: object
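
The output above shows fields running into each other ("Science FictionEnter ..."), because the features are joined with an empty string. The sketch below illustrates how that affects tokenisation. The regular expression is assumed to mirror the default `token_pattern` of `TfidfVectorizer` (lowercased text, tokens of two or more word characters), and the space-separated variant is a hypothetical alternative, not what the notebook runs.

```python
import re

# Assumed to match TfidfVectorizer's default tokenisation: lowercase the
# text, then keep runs of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def tokenize(text):
    return TOKEN_PATTERN.findall(text.lower())

genres = "Action Adventure Fantasy Science Fiction"
tagline = "Enter the World of Pandora."

fused = tokenize(genres + "" + tagline)    # separator used in the cell above
spaced = tokenize(genres + " " + tagline)  # hypothetical alternative

print(fused)
print(spaced)
```

With the empty separator, "fiction" and "enter" become the single token "fictionenter", so this movie no longer shares the term "fiction" with other science fiction titles. Joining with `' '` would keep the two words distinct, at the cost of producing different feature vectors and scores from those shown in this notebook.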

In [11]:
# Convert text values to numerical values (vectors)
vectorizer = TfidfVectorizer()

In [12]:
feature_vectors = vectorizer.fit_transform(all_features)

In [13]:
feature_vectors

<4803x20854 sparse matrix of type '<class 'numpy.float64'>'
	with 89316 stored elements in Compressed Sparse Row format>
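
Each row of this matrix is one movie and each column is one term, weighted by TF-IDF. As a rough illustration of how such weights arise, the sketch below computes them by hand for a hypothetical three-document corpus. It assumes raw term counts, smoothed IDF of the form ln((1 + n) / (1 + df)) + 1, and L2 normalisation, which are believed to correspond to `TfidfVectorizer`'s default settings; treat the exact formula as an assumption rather than a specification.

```python
import math
from collections import Counter

# Hypothetical three-document corpus (already tokenised and lowercased).
docs = [
    ["action", "adventure", "space"],
    ["action", "crime", "spy"],
    ["comedy", "romance"],
]

n_docs = len(docs)
# Document frequency: in how many documents each term appears.
doc_freq = Counter(term for doc in docs for term in set(doc))

def idf(term):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    return math.log((1 + n_docs) / (1 + doc_freq[term])) + 1

def tfidf_vector(doc):
    """Term count times IDF, scaled so the vector has unit Euclidean length."""
    counts = Counter(doc)
    weights = {term: count * idf(term) for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}

vector = tfidf_vector(docs[0])
print(vector)
```

Terms that occur in few documents (here "space") receive a higher weight than terms that occur in many (here "action"), which is why rare cast or director names tend to dominate the similarity between two movies.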

In [14]:
print(feature_vectors)

  (0, 2711)	0.20742747671670228
  (0, 15686)	0.3352969281083964
  (0, 12746)	0.19217431423971001
  (0, 11043)	0.27386294744796086
  (0, 17571)	0.19798363994787105
  (0, 19994)	0.24044536167848338
  (0, 17037)	0.24740477210837714
  (0, 16299)	0.2638482205518432
  (0, 20825)	0.24206990805515324
  (0, 20512)	0.28747761467973654
  (0, 16328)	0.19128657473178384
  (0, 14183)	0.31981988769723463
  (0, 13879)	0.12181177765323377
  (0, 20498)	0.17479927668171105
  (0, 18182)	0.09478913985239082
  (0, 6953)	0.3352969281083964
  (0, 16570)	0.12224067068628999
  (0, 6661)	0.14559379524857047
  (0, 318)	0.1115843979600231
  (0, 176)	0.09628903841708969
  (1, 19578)	0.2482812501615493
  (1, 6793)	0.29480760059336464
  (1, 20709)	0.23467316501253754
  (1, 3253)	0.2294996111482978
  (1, 19077)	0.22106507986352583
  :	:
  (4800, 5321)	0.08148493695154697
  (4801, 9443)	0.31328451191966095
  (4801, 16942)	0.31328451191966095
  (4801, 16151)	0.31328451191966095
  (4801, 4050)	0.31328451191966095
  (4801

In [15]:
# find similarity score using cosine similarity
similarity_score = cosine_similarity(feature_vectors)

In [16]:
similarity_score

array([[1.        , 0.10414422, 0.02093683, ..., 0.        , 0.        ,
        0.        ],
       [0.10414422, 1.        , 0.02110299, ..., 0.        , 0.        ,
        0.        ],
       [0.02093683, 0.02110299, 1.        , ..., 0.        , 0.        ,
        0.        ],
       ...,
       [0.        , 0.        , 0.        , ..., 1.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 1.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        1.        ]])
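
The matrix above has 1 on the diagonal (every movie is identical to itself) and is symmetric (the similarity of A to B equals that of B to A). A minimal pure-Python sketch of the underlying formula, cos(u, v) = (u · v) / (|u| |v|), shows both properties. The three vectors are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen for this sketch.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical 3-term feature vectors for three movies.
vectors = [
    [0.6, 0.8, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]

similarity_matrix = [[cosine(u, v) for v in vectors] for u in vectors]
for row in similarity_matrix:
    print([round(x, 2) for x in row])
```

Movies that share no terms at all, like the third vector here, get a score of exactly 0, which matches the many zeros in the real matrix.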

<h2><em>Movie Recommendation System</em></h2>

In [17]:
# Getting movie name as input from user
user_input = input('Enter movie name : ')

# Creating a list of movies
movies_name_list = movies['title'].tolist()

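# find the closest matches for the title given by the user
find_close_match = difflib.get_close_matches(user_input, movies_name_list)

# stop early with a clear message if no title is close enough
if not find_close_match:
    raise ValueError(f'No movie title close to {user_input!r} was found')

# take the best of the close matches
first_close_match = find_close_match[0]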

#find index of first close matched movie
index_of_the_movie = movies[movies.title == first_close_match]['index'].values[0]

similarity = list(enumerate(similarity_score[index_of_the_movie]))

# sorting movies based on similarity score
sorted_similar_movies = sorted(similarity, key = lambda x:x[1], reverse = True)

# print the titles of the 10 most similar movies
print(" Movies suggested for you : \n")
for i, (index, score) in enumerate(sorted_similar_movies[:10], start=1):
    title_from_index = movies[movies.index == index]['title'].values[0]
    print(i, ".", title_from_index)

Enter movie name :  Rambo


 Movies suggested for you : 

1 . Rambo III
2 . First Blood
3 . Patriot Games
4 . Rambo: First Blood Part II
5 . The Four Feathers
6 . Zookeeper
7 . The Expendables
8 . The Expendables 2
9 . Death Race 2000
10 . Me You and Five Bucks
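
The title lookup relies on `difflib.get_close_matches`, which returns an empty list when no candidate scores at or above its `cutoff` (0.6 by default). The sketch below wraps that call so the "no match" case is handled explicitly. The `titles` list is a hypothetical stand-in for `movies['title'].tolist()`, and `closest_title` is an illustrative helper, not part of the notebook.

```python
import difflib

# Hypothetical short title list standing in for movies['title'].tolist().
titles = ["Rambo III", "Rambo: First Blood Part II", "First Blood", "Zookeeper"]

def closest_title(query, candidates, cutoff=0.6):
    """Return the best fuzzy match for `query`, or None if nothing is close."""
    matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(closest_title("Rambo", titles))
print(closest_title("qwerty", titles))
```

Note that the match is case-sensitive and favours titles of similar length, so a short query resolves to the shortest title that contains it rather than to every title in a series.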
