Movie Recommendation System

Recommendation System: The objective of this project is to develop a movie recommendation system that gives each user personalized recommendations based on their past viewing history and preferences. The system is intended to combine two widely used machine learning techniques: Collaborative Filtering and Content-Based Filtering.

Collaborative Filtering: This approach assumes that users who had similar movie preferences in the past will have similar preferences in the future. It uses past movie ratings to identify users similar to the active user, then recommends movies that those users liked and the active user has not yet watched.
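
As a rough illustration of the idea above, the sketch below scores one unseen movie for an active user from a tiny, made-up rating matrix. The ratings, the variable names, and the choice of cosine similarity over raw ratings (with 0 standing in for "not rated") are all assumptions for illustration; the dataset loaded in this notebook has no per-user ratings.

```python
import numpy as np

# Toy user x movie rating matrix; 0 means "not rated". All values are made up.
ratings = np.array([
    [5, 4, 0, 1],  # active user, has not rated movie 2
    [5, 5, 4, 1],  # user with similar taste
    [1, 0, 2, 5],  # user with different taste
], dtype=float)

def cosine(u, v):
    """Cosine similarity between two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

active = 0
sims = np.array([cosine(ratings[active], ratings[u]) for u in range(len(ratings))])
sims[active] = 0.0  # do not let the active user vote for themselves

# Similarity-weighted average rating per movie, counting only users who rated it.
rated = ratings > 0
weights = sims[:, None] * rated
scores = (weights * ratings).sum(axis=0) / np.maximum(weights.sum(axis=0), 1e-12)

# Recommend only movies the active user has not rated, best score first.
unseen = ~rated[active]
recommended = [int(m) for m in np.argsort(-scores) if unseen[m]]
print(recommended, scores[recommended])
```

The similar user's rating dominates the predicted score because their similarity weight is larger, which is the core of the user-based approach.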

Content-Based Filtering: This approach assumes that movies with similar attributes will be liked by the same users. It uses movie metadata, such as genre, actors, directors, and plot summaries, to recommend movies similar to the ones the user has already watched.
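
A minimal sketch of the content-based idea, using the genre strings of the first five rows shown by df.head() in this notebook. Jaccard overlap between genre sets is an assumption chosen for brevity, and the helper names are hypothetical; the cells in this notebook vectorize titles with TF-IDF instead.

```python
# Genre strings copied from the df.head() output in this notebook.
genres = {
    "Cobra Kai": "Action, Comedy, Drama",
    "The Crown": "Biography, Drama, History",
    "Better Call Saul": "Crime, Drama",
    "Devil in Ohio": "Drama, Horror, Mystery",
    "Cyberpunk: Edgerunners": "Animation, Action, Adventure",
}

def genre_set(text):
    """Split a comma-separated genre string into a set of genre names."""
    return {g.strip() for g in text.split(",")}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_titles(title, top_n=3):
    """Rank the other titles by genre overlap with the given title."""
    base = genre_set(genres[title])
    scored = [
        (other, jaccard(base, genre_set(text)))
        for other, text in genres.items()
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

ranked = similar_titles("Cobra Kai")
print(ranked)
```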

In [8]:
import pandas as pd
import numpy as np

In [9]:
df = pd.read_csv("/movie.csv")

In [10]:
df.head()

,title,year,certificate,duration,genre,rating,description,stars,votes
0,Cobra Kai,(2018– ),TV-14,30 min,"Action, Comedy, Drama",8.5,Decades after their 1984 All Valley Karate Tou...,"['Ralph Macchio, ', 'William Zabka, ', 'Courtn...",177031
1,The Crown,(2016– ),TV-MA,58 min,"Biography, Drama, History",8.7,Follows the political rivalries and romance of...,"['Claire Foy, ', 'Olivia Colman, ', 'Imelda St...",199885
2,Better Call Saul,(2015–2022),TV-MA,46 min,"Crime, Drama",8.9,The trials and tribulations of criminal lawyer...,"['Bob Odenkirk, ', 'Rhea Seehorn, ', 'Jonathan...",501384
3,Devil in Ohio,-2022,TV-MA,356 min,"Drama, Horror, Mystery",5.9,When a psychiatrist shelters a mysterious cult...,"['Emily Deschanel, ', 'Sam Jaeger, ', 'Gerardo...",9773
4,Cyberpunk: Edgerunners,(2022– ),TV-MA,24 min,"Animation, Action, Adventure",8.6,A Street Kid trying to survive in a technology...,"['Zach Aguilar, ', 'Kenichiro Ohashi, ', 'Emi ...",15413


In [11]:
df.shape

(9957, 9)

In [12]:
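df.info  # without parentheses this shows the bound method's repr (the whole frame); call df.info() for the dtype and non-null summary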

<bound method DataFrame.info of                        title           year certificate duration  \
0                  Cobra Kai       (2018– )       TV-14   30 min   
1                  The Crown       (2016– )       TV-MA   58 min   
2           Better Call Saul    (2015–2022)       TV-MA   46 min   
3              Devil in Ohio          -2022       TV-MA  356 min   
4     Cyberpunk: Edgerunners       (2022– )       TV-MA   24 min   
...                      ...            ...         ...      ...   
9952          The Imperfects       (2022– )       TV-MA   45 min   
9953        The Walking Dead    (2010–2022)       TV-MA   44 min   
9954               The Crown       (2016– )       TV-MA   58 min   
9955            Supernatural    (2005–2020)       TV-14   44 min   
9956           Devil in Ohio          -2022       TV-MA  356 min   

                             genre  rating  \
0            Action, Comedy, Drama     8.5   
1        Biography, Drama, History     8.7   
2            

In [13]:
df.columns

Index(['title', 'year', 'certificate', 'duration', 'genre', 'rating',
       'description', 'stars', 'votes'],
      dtype='object')

Feature Selection

In [14]:
df_features = df[['title', 'year', 'certificate', 'duration', 'genre', 'rating', 'description', 'stars', 'votes']]

In [15]:
df_features.shape

(9957, 9)

In [51]:
df_features


,title,year,certificate,duration,genre,rating,description,stars,votes
0,Cobra Kai,(2018– ),TV-14,30 min,"Action, Comedy, Drama",8.5,Decades after their 1984 All Valley Karate Tou...,"['Ralph Macchio, ', 'William Zabka, ', 'Courtn...",177031
1,The Crown,(2016– ),TV-MA,58 min,"Biography, Drama, History",8.7,Follows the political rivalries and romance of...,"['Claire Foy, ', 'Olivia Colman, ', 'Imelda St...",199885
2,Better Call Saul,(2015–2022),TV-MA,46 min,"Crime, Drama",8.9,The trials and tribulations of criminal lawyer...,"['Bob Odenkirk, ', 'Rhea Seehorn, ', 'Jonathan...",501384
3,Devil in Ohio,-2022,TV-MA,356 min,"Drama, Horror, Mystery",5.9,When a psychiatrist shelters a mysterious cult...,"['Emily Deschanel, ', 'Sam Jaeger, ', 'Gerardo...",9773
4,Cyberpunk: Edgerunners,(2022– ),TV-MA,24 min,"Animation, Action, Adventure",8.6,A Street Kid trying to survive in a technology...,"['Zach Aguilar, ', 'Kenichiro Ohashi, ', 'Emi ...",15413
...,...,...,...,...,...,...,...,...,...
9952,The Imperfects,(2022– ),TV-MA,45 min,"Action, Adventure, Drama",6.3,After an experimental gene therapy turns them ...,"['Morgan Taylor Campbell, ', 'Italia Ricci, ',...",3130
9953,The Walking Dead,(2010–2022),TV-MA,44 min,"Drama, Horror, Thriller",8.1,Sheriff Deputy Rick Grimes wakes up from a com...,"['Andrew Lincoln, ', 'Norman Reedus, ', 'Melis...",970067
9954,The Crown,(2016– ),TV-MA,58 min,"Biography, Drama, History",8.7,Follows the political rivalries and romance of...,"['Claire Foy, ', 'Olivia Colman, ', 'Imelda St...",199898
9955,Supernatural,(2005–2020),TV-14,44 min,"Drama, Fantasy, Horror",8.4,Two brothers follow their father's footsteps a...,"['Jared Padalecki, ', 'Jensen Ackles, ', 'Jim ...",439601


In [61]:
X = df_features['title']

In [62]:
X.head()

0                 Cobra Kai
1                 The Crown
2          Better Call Saul
3             Devil in Ohio
4    Cyberpunk: Edgerunners
Name: title, dtype: object

In [63]:
X.shape

(9957,)

Convert the title text to TF-IDF features

In [64]:
from sklearn.feature_extraction.text import TfidfVectorizer 

In [65]:
tfidf = TfidfVectorizer()

In [None]:
X = tfidf.fit_transform(X)  # X is rebound from the title Series to a sparse TF-IDF matrix
type(X)

In [70]:
X.shape

(9957, 8670)
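
The shape above reads as one row per title and one column per distinct token in the fitted vocabulary. The sketch below repeats the same step on three titles taken from this notebook so the matrix is small enough to inspect; the variable names are assumptions, and default TfidfVectorizer settings are used as in the cell above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Three titles from the notebook's data, small enough to inspect by hand.
sample_titles = ["Cobra Kai", "The Crown", "Better Call Saul"]

vectorizer = TfidfVectorizer()  # default settings: lowercases, keeps tokens of 2+ characters
matrix = vectorizer.fit_transform(sample_titles)

# vocabulary_ maps each token to its column index in the matrix.
vocab = sorted(vectorizer.vocabulary_)
print(vocab)
print(matrix.shape)  # (number of titles, number of distinct tokens)

# Each row is L2-normalised by default, so a two-token title whose tokens are
# equally rare gets two equal weights.
row0 = matrix[0].toarray().ravel()
nonzero = row0[row0 > 0]
print(nonzero)
```

This is consistent with the printed sparse matrix in the notebook, where a two-token title such as row 4 carries two weights of about 0.7071.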

In [71]:
print(X)

  (0, 4128)	0.7167784832688866
  (0, 1616)	0.6973009435837259
  (1, 1860)	0.9614181752085453
  (1, 7683)	0.2750910619752497
  (2, 6727)	0.6334294400878693
  (2, 1261)	0.5038595037500173
  (2, 853)	0.5872756975311976
  (3, 5550)	0.7459627623404198
  (3, 3762)	0.37619569385405416
  (3, 2122)	0.5495601487800196
  (4, 2419)	0.7071067811865476
  (4, 1909)	0.7071067811865476
  (5, 6688)	0.9537758905016662
  (5, 7683)	0.3005188025694122
  (6, 5175)	0.6565492635686828
  (6, 381)	0.3713302155963619
  (6, 6452)	0.6565492635686828
  (7, 656)	0.6552070931374833
  (7, 1098)	0.7554493133906003
  (8, 3754)	0.9667893483053073
  (8, 7683)	0.25557456055601313
  (9, 952)	1.0
  (10, 7703)	0.6910232228332032
  (10, 7381)	0.7228325570318572
  (11, 6493)	0.6556754872617221
  :	:
  (9946, 4742)	0.5450398079667372
  (9946, 3240)	0.8088165998909551
  (9946, 7683)	0.22078794231664173
  (9947, 5055)	0.810590026329203
  (9947, 920)	0.5856140445853583
  (9948, 7703)	0.6910232228332032
  (9948, 7381)	0.7228325570318

Get similarity score using cosine similarity

In [75]:
from sklearn.metrics.pairwise import cosine_similarity
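
The cell above only imports cosine_similarity. One way the scores could then be computed and used for ranking is sketched below on a four-title sample; the sample list, the variable names, and the top_similar helper are hypothetical and not taken from the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Small sample with one duplicated title, as in the real data.
titles = ["Cobra Kai", "Cobra Kai", "The Crown", "The Walking Dead"]

tfidf_matrix = TfidfVectorizer().fit_transform(titles)

# Pairwise similarity between every pair of rows: an n x n dense array.
similarity = cosine_similarity(tfidf_matrix)

def top_similar(index, n=2):
    """Return the n titles most similar to titles[index], excluding itself."""
    ranked = sorted(enumerate(similarity[index]), key=lambda p: p[1], reverse=True)
    return [(titles[i], float(score)) for i, score in ranked if i != index][:n]

print(similarity.shape)
print(top_similar(2))
```

For the full 9957-row matrix the dense result holds about 99 million float64 values, roughly 0.8 GB, so computing only the row for the selected title, for example cosine_similarity(X[i], X), may be the lighter option.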

Get the movie name from the user and find the closest matching title

In [76]:
Favourite_Movie_Name = input('Enter your favourite movie name:')

Enter your favourite movie name:Cobra Kai


In [77]:
All_movie_title_list = df['title'].tolist()

In [78]:
import difflib

In [79]:
Movie_Recommendation = difflib.get_close_matches(Favourite_Movie_Name, All_movie_title_list)
print(Movie_Recommendation)

['Cobra Kai', 'Cobra Kai', 'Cobra Kai']
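
The three identical matches above arise because the title list contains the same title many times. A possible refinement, sketched below under the assumption that one match per distinct title is wanted, removes duplicates first and guards against the case where nothing is close enough; the sample list and the closest_title helper are hypothetical.

```python
import difflib

# Hypothetical title list with repeats, mimicking a title that appears on many rows.
all_titles = ["Cobra Kai", "The Crown", "Cobra Kai", "Cobra Kai", "Better Call Saul"]

# dict.fromkeys keeps the first occurrence of each title and preserves order.
unique_titles = list(dict.fromkeys(all_titles))

def closest_title(query, candidates, cutoff=0.6):
    """Return the best fuzzy match for query, or None when nothing is close enough."""
    found = difflib.get_close_matches(query, candidates, n=3, cutoff=cutoff)
    return found[0] if found else None

# A misspelled query still resolves to a single distinct title.
matches = difflib.get_close_matches("Cobra Kia", unique_titles, n=3, cutoff=0.6)
best = closest_title("Cobra Kia", unique_titles)

# With no usable match the helper returns None instead of raising IndexError on [0].
missing = closest_title("zzzz", unique_titles)
print(matches, best, missing)
```

Matching is case-sensitive, so lowercasing both the query and the candidates before comparing may help with user input.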


In [80]:
Close_match = Movie_Recommendation[0]
print(Close_match)

Cobra Kai


In [82]:
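# Despite the name, this holds every row whose title equals the match (a DataFrame), not a single index.
Index_of_close_match_movie = df[df.title == Close_match]
print(Index_of_close_match_movie)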

          title        year certificate duration                  genre  \
0     Cobra Kai    (2018– )       TV-14   30 min  Action, Comedy, Drama   
8609  Cobra Kai    (2018– )       TV-14   32 min  Action, Comedy, Drama   
8610  Cobra Kai    (2018– )       TV-14   36 min  Action, Comedy, Drama   
8611  Cobra Kai    (2018– )       TV-14   29 min  Action, Comedy, Drama   
8612  Cobra Kai    (2018– )       TV-14   38 min  Action, Comedy, Drama   
8613  Cobra Kai    (2018– )       TV-14   39 min  Action, Comedy, Drama   
8614  Cobra Kai    (2018– )       TV-14   32 min  Action, Comedy, Drama   
8615  Cobra Kai    (2018– )       TV-14   34 min  Action, Comedy, Drama   
8616  Cobra Kai    (2018– )       TV-14   30 min  Action, Comedy, Drama   
8617  Cobra Kai    (2018– )       TV-14   28 min  Action, Comedy, Drama   
8618  Cobra Kai    (2018– )       TV-14   41 min  Action, Comedy, Drama   

      rating                                        description  \
0        8.5  Decades after thei