# **Title of Project**

Movies Recommendation

-------------

## **Objective**

The objective of this project is to develop a content-based movie recommendation system that suggests movies similar to a user's favorite movie. The system compares text features (genre, keywords, tagline, cast, and director) to identify the movies closest to the user's input, with the aim of providing recommendations that match the user's preferences.

## **Data Source**

https://github.com/YBI-Foundation/Dataset

## **Import Library**

In [None]:
import pandas as pd


In [None]:

import numpy as np


In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer


In [None]:
from sklearn.metrics.pairwise import cosine_similarity


In [None]:
import difflib


## **Import Data**

In [None]:
df = pd.read_csv(r'https://raw.githubusercontent.com/YBI-Foundation/Dataset/main/Movies%20Recommendation.csv')

## **Describe Data**

In [None]:
print(df.head())


In [None]:
print(df.info())


In [None]:
print(df.shape)


In [None]:
print(df.columns)


## **Data Visualization**

This step is skipped. The recommender relies only on text similarity, so no plots are needed to build it.

## **Data Preprocessing**

In [None]:
df_features = df[['Movie_Genre', 'Movie_Keywords', 'Movie_Tagline', 'Movie_Cast', 'Movie_Director']].fillna('')


In [None]:
print(df_features.shape)


In [None]:
print(df_features)


In [None]:
x = df_features['Movie_Genre'] + ' ' + df_features['Movie_Keywords'] + ' ' + df_features['Movie_Tagline'] + ' ' + df_features['Movie_Cast'] + ' ' + df_features['Movie_Director']


In [None]:
print(x.shape)
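
The `fillna('')` call matters because concatenating a string with a missing value yields a missing value, so a single empty field would wipe out the whole combined text for that row. A minimal sketch of the effect, using a hypothetical two-row frame rather than the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame (not the real dataset); the second row has no tagline.
toy = pd.DataFrame({
    "Movie_Genre": ["Action", "Drama"],
    "Movie_Tagline": ["Enter the world", np.nan],
})

# Without fillna, a missing part turns the whole combined string into NaN.
raw = toy["Movie_Genre"] + " " + toy["Movie_Tagline"]

# With fillna(''), the row keeps the fields it does have.
filled = toy.fillna("")
combined = filled["Movie_Genre"] + " " + filled["Movie_Tagline"]

print(raw.isna().tolist())
print(combined.tolist())
```

The second row of `raw` is missing, while the same row of `combined` still carries its genre.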


## **Define Target Variable (y) and Feature Variables (X)**

Not applicable: this is not a supervised learning task, so there is no target variable y. The combined text feature `x`, built during preprocessing, is the only input to the model.

## **Train Test Split**

In [None]:
# Not applicable: this is not a supervised learning task. The recommender
# is built from similarity between movies, so no train/test split is needed.

## **Modeling**

In [None]:
tfidf = TfidfVectorizer()


In [None]:
x = tfidf.fit_transform(x)


In [None]:
print(x.shape)


In [None]:
print(x)


In [None]:
similarity_score = cosine_similarity(x)


In [None]:
print(similarity_score)


In [None]:
print(similarity_score.shape)
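
To make the shape and contents of `similarity_score` easier to interpret, here is a small sketch on three hypothetical documents (not taken from the dataset). It relies on `TfidfVectorizer` L2-normalizing each row by default, which is why cosine similarity should coincide with a plain dot product (`linear_kernel`) in this setting:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical combined-feature strings for three movies.
docs = [
    "action space alien",
    "action space robot",
    "romance paris love",
]

vectors = TfidfVectorizer().fit_transform(docs)  # sparse matrix, one row per document
sim = cosine_similarity(vectors)                 # dense square matrix

print(sim.shape)                                      # one row and one column per document
print(np.allclose(np.diag(sim), 1.0))                 # each document matches itself
print(np.allclose(sim, sim.T))                        # similarity is symmetric
print(np.allclose(sim, linear_kernel(vectors)))       # same as dot product on normalized rows
print(sim[0, 1] > sim[0, 2])                          # shared words raise the score
```

Row `i` of the matrix holds the similarity of movie `i` to every movie, which is what the prediction step reads from.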


## **Model Evaluation**

In [None]:
# Not applicable here: this notebook has no labelled target or user feedback
# to score the recommendations against, so their quality is judged
# subjectively by inspecting the suggested titles.

## **Prediction**

In [None]:
# Print the 30 movies most similar to the closest title match for a favourite movie
def get_movie_recommendations(favourite_movie_name, df, similarity_score):
    all_movies_title_list = df['Movie_Title'].tolist()
    close_matches = difflib.get_close_matches(favourite_movie_name, all_movies_title_list)
    if not close_matches:
        print("No match found.")
        return

    close_match = close_matches[0]
    # Use the row position, not Movie_ID: row i of similarity_score corresponds to row i of df
    index_of_close_match = np.flatnonzero((df['Movie_Title'] == close_match).to_numpy())[0]

    recommendation_score = list(enumerate(similarity_score[index_of_close_match]))
    sorted_similar_movies = sorted(recommendation_score, key=lambda item: item[1], reverse=True)

    print(f"Top 30 Movies Suggested for you based on '{close_match}':\n")

    # The matched movie itself normally ranks first, since its similarity to itself is 1
    for rank, (index, _score) in enumerate(sorted_similar_movies[:30], start=1):
        print(f"{rank}. {df['Movie_Title'].iloc[index]}")




In [None]:
favourite_movie_name = input('Enter your favourite movie name: ')
get_movie_recommendations(favourite_movie_name, df, similarity_score)
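
`difflib.get_close_matches` compares strings case-sensitively and, by default, returns at most three candidates scoring at least 0.6, so input typed in a different case can score lower than expected. One possible way to make the lookup case-insensitive is sketched below; the helper name `find_title` and the sample titles are hypothetical and not part of the notebook:

```python
import difflib

def find_title(query, titles, cutoff=0.6):
    """Return the best fuzzy match for query among titles, or None."""
    # Map lowercased titles back to their original spelling.
    # If two titles differ only by case, the later one wins.
    lookup = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(query.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None

sample_titles = ["Avatar", "Titanic", "The Dark Knight"]
print(find_title("avatar", sample_titles))
print(find_title("TITANIC", sample_titles))
print(find_title("zzzz", sample_titles))
```

Returning `None` when nothing clears the cutoff lets the caller decide how to report a failed lookup.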

## **Explanation**

This project builds a content-based movie recommendation system that uses the TF-IDF vectorizer to convert movie-related text features into numerical vectors and then calculates the cosine similarity between these vectors to recommend movies. Given a user's favorite movie, the system finds the closest match from the dataset and suggests the top 30 similar movies based on the combined features of genre, keywords, tagline, cast, and director.

This approach is an instance of content-based filtering, which assumes that users will like items whose features resemble those of items they have liked in the past.
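
The ranking idea can be seen end to end on a tiny hypothetical catalogue. The titles, texts, and the `recommend` helper below are illustrative assumptions, not part of the dataset or the notebook's function; the sketch returns a list instead of printing and leaves the query movie out of its own results:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical catalogue: two space adventures and two romances.
titles = ["Star Quest", "Robot Dawn", "Paris Nights", "The Wedding"]
texts = [
    "space adventure alien",
    "space adventure robot",
    "romance paris love",
    "romance wedding love",
]

similarity = cosine_similarity(TfidfVectorizer().fit_transform(texts))

def recommend(title, k=1):
    """Return the k titles most similar to the given title, excluding itself."""
    row = titles.index(title)
    # Sort positions by descending similarity to the query row.
    order = np.argsort(-similarity[row], kind="stable")
    return [titles[i] for i in order if i != row][:k]

print(recommend("Star Quest"))
print(recommend("The Wedding"))
```

Each movie is paired with the one that shares its descriptive words, which is the behavior content-based filtering relies on.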