# **Movie Recommendation System**

## **Objective**

The objective of a movie recommendation system is to enhance user experience by providing personalized movie suggestions that align with individual preferences. By analyzing a user's viewing history, ratings, and interactions, the system predicts movies that the user is likely to enjoy. This personalization helps users discover new content effortlessly, increasing their engagement and satisfaction with the platform.

In addition to improving user experience, the recommendation system aims to optimize content discovery and reduce the time users spend searching for movies. By delivering relevant suggestions, the system not only keeps users engaged but also boosts user retention, ensuring they return to the platform for more personalized entertainment options.

## **Data Source**

## **Import Library**

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import difflib

## **Import Data**

In [2]:
movie = pd.read_csv('https://raw.githubusercontent.com/YBI-Foundation/Dataset/main/Movies%20Recommendation.csv')
movie.head()

Unnamed: 0,Movie_ID,Movie_Title,Movie_Genre,Movie_Language,Movie_Budget,Movie_Popularity,Movie_Release_Date,Movie_Revenue,Movie_Runtime,Movie_Vote,...,Movie_Homepage,Movie_Keywords,Movie_Overview,Movie_Production_House,Movie_Production_Country,Movie_Spoken_Language,Movie_Tagline,Movie_Cast,Movie_Crew,Movie_Director
0,1,Four Rooms,Crime Comedy,en,4000000,22.87623,09-12-1995,4300000,98.0,6.5,...,,hotel new year's eve witch bet hotel room,It's Ted the Bellhop's first night on the job....,"[{""name"": ""Miramax Films"", ""id"": 14}, {""name"":...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...","[{""iso_639_1"": ""en"", ""name"": ""English""}]",Twelve outrageous guests. Four scandalous requ...,Tim Roth Antonio Banderas Jennifer Beals Madon...,"[{'name': 'Allison Anders', 'gender': 1, 'depa...",Allison Anders
1,2,Star Wars,Adventure Action Science Fiction,en,11000000,126.393695,25-05-1977,775398007,121.0,8.1,...,http://www.starwars.com/films/star-wars-episod...,android galaxy hermit death star lightsaber,Princess Leia is captured and held hostage by ...,"[{""name"": ""Lucasfilm"", ""id"": 1}, {""name"": ""Twe...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...","[{""iso_639_1"": ""en"", ""name"": ""English""}]","A long time ago in a galaxy far, far away...",Mark Hamill Harrison Ford Carrie Fisher Peter ...,"[{'name': 'George Lucas', 'gender': 2, 'depart...",George Lucas
2,3,Finding Nemo,Animation Family,en,94000000,85.688789,30-05-2003,940335536,100.0,7.6,...,http://movies.disney.com/finding-nemo,father son relationship harbor underwater fish...,"Nemo, an adventurous young clownfish, is unexp...","[{""name"": ""Pixar Animation Studios"", ""id"": 3}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...","[{""iso_639_1"": ""en"", ""name"": ""English""}]","There are 3.7 trillion fish in the ocean, they...",Albert Brooks Ellen DeGeneres Alexander Gould ...,"[{'name': 'Andrew Stanton', 'gender': 2, 'depa...",Andrew Stanton
3,4,Forrest Gump,Comedy Drama Romance,en,55000000,138.133331,06-07-1994,677945399,142.0,8.2,...,,vietnam veteran hippie mentally disabled runni...,A man with a low IQ has accomplished great thi...,"[{""name"": ""Paramount Pictures"", ""id"": 4}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...","[{""iso_639_1"": ""en"", ""name"": ""English""}]","The world will never be the same, once you've ...",Tom Hanks Robin Wright Gary Sinise Mykelti Wil...,"[{'name': 'Alan Silvestri', 'gender': 2, 'depa...",Robert Zemeckis
4,5,American Beauty,Drama,en,15000000,80.878605,15-09-1999,356296601,122.0,7.9,...,http://www.dreamworks.com/ab/,male nudity female nudity adultery midlife cri...,"Lester Burnham, a depressed suburban father in...","[{""name"": ""DreamWorks SKG"", ""id"": 27}, {""name""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...","[{""iso_639_1"": ""en"", ""name"": ""English""}]",Look closer.,Kevin Spacey Annette Bening Thora Birch Wes Be...,"[{'name': 'Thomas Newman', 'gender': 2, 'depar...",Sam Mendes


## **Describe Data**

In [3]:
movie_features = movie[['Movie_Genre', 'Movie_Keywords', 'Movie_Tagline', 'Movie_Cast', 'Movie_Director']].fillna('')

## **Data Visualization**

## **Data Preprocessing**

## **Define Target Variable (y) and Feature Variables (X)**

In [4]:
x = movie_features['Movie_Genre'] + ' ' + movie_features['Movie_Keywords'] + ' ' + movie_features['Movie_Tagline'] + ' ' + movie_features['Movie_Cast'] + ' ' + movie_features['Movie_Director']
tfidf = TfidfVectorizer()
x = tfidf.fit_transform(x)

## **Train Test Split**

In [5]:
Similarity_Score = cosine_similarity(x)

## **Modeling**

In [6]:
Favourite_Movie_Name = input('Enter your favourite movie name : ')
All_Movies_Title_List = movie['Movie_Title'].tolist()

Enter your favourite movie name : avtar


## **Model Evaluation**

In [11]:
Movie_Recommendation = difflib.get_close_matches(Favourite_Movie_Name, All_Movies_Title_List)
if len(Movie_Recommendation) > 0:
    Close_Match = Movie_Recommendation[0]
    print('Close Match Found:', Close_Match)
    Index_of_Close_Match_Movie = movie[movie.Movie_Title == Close_Match]['Movie_ID'].values[0]
    Recommendation_Score = list(enumerate(Similarity_Score[Index_of_Close_Match_Movie]))
    Index_of_Close_Match_Movie = movie[movie.Movie_Title == Close_Match]['Movie_ID'].values[0]

Movie_Recommendation = difflib.get_close_matches(Favourite_Movie_Name, All_Movies_Title_List)
if len(Movie_Recommendation) > 0:
    Close_Match = Movie_Recommendation[0]

Close Match Found: Avatar


## **Prediction**

In [15]:
    Sorted_Similar_Movies = sorted(Recommendation_Score, key=lambda x: x[1], reverse=True)
print('Top 30 Movies Suggested for You:\n')
    i = 1
    for movie_index, score in Sorted_Similar_Movies:
        title_from_index = movie.iloc[movie_index]['Movie_Title']  # Access title by index
        if i <= 30:
            print(i, '.', title_from_index)
            i += 1


Top 30 Movies Suggested for You:

1 . Niagara
2 . Caravans
3 . My Week with Marilyn
4 . Brokeback Mountain
5 . Harry Brown
6 . Night of the Living Dead
7 . The Curse of Downers Grove
8 . The Boy Next Door
9 . Back to the Future
10 . The Juror
11 . Some Like It Hot
12 . Enough
13 . The Kentucky Fried Movie
14 . Eye for an Eye
15 . Welcome to the Sticks
16 . Alice Through the Looking Glass
17 . Superman III
18 . The Misfits
19 . Premium Rush
20 . Duel in the Sun
21 . Sabotage
22 . Small Soldiers
23 . All That Jazz
24 . Camping Sauvage
25 . The Raid
26 . Beyond the Black Rainbow
27 . To Kill a Mockingbird
28 . World Trade Center
29 . The Dark Knight Rises
30 . Tora! Tora! Tora!


## **Explaination**

This code implements a simple movie recommendation system that suggests movies based on content similarity. Here's a detailed explanation:

1. **Library Imports and Data Loading:**
The code begins by importing the necessary libraries: pandas, numpy TfidfVectorizer from sklearn for text processing, and cosine_similarity for measuring similarity between movies. difflib is used to find close matches for movie names.
A dataset of movies is loaded using pd.read_csv() from a URL containing various features like movie genres, keywords, taglines, cast, and director.
The movie.head() function previews the first five rows of the dataset.
2. **Feature Extraction and TF-IDF Transformation:**
The important features for recommendation are selected: Movie_Genre, Movie_Keywords, Movie_Tagline, Movie_Cast, and Movie_Director. These columns are combined into a single feature vector x after handling missing values using .fillna('').
The TfidfVectorizer() is applied to transform this combined textual data into numerical vectors (TF-IDF), which represent the importance of each word within the context of a document.
The cosine similarity is computed between the TF-IDF vectors, resulting in a similarity matrix (Similarity_Score), where each entry represents the similarity score between two movies.
3. **User Input and Closest Match:**
The user is prompted to input their favorite movie name, and all movie titles from the dataset are stored in a list All_Movies_Title_List.
Using difflib.get_close_matches(), the code finds the closest movie title match to the user's input. If a close match is found, it identifies the matching movie and retrieves its corresponding index (Index_of_Close_Match_Movie).
4. **Recommendation Based on Similarity:**
Once the closest match is found, the recommendation scores of other movies in relation to this matched movie are calculated by retrieving the similarity scores from Similarity_Score.
The movies are then sorted in descending order based on their similarity to the matched movie (Sorted_Similar_Movies).
5. **Displaying the Recommendations:**
Finally, the top 30 most similar movies are displayed to the user. The code iterates through the sorted movie list, retrieving the title of each recommended movie by its index and printing the recommendations.
This system allows users to input a movie they like, and based on the features of that movie, it recommends similar ones from the dataset. The use of cosine similarity ensures that movies with similar content, keywords, and cast are suggested to the user.