<a href="https://colab.research.google.com/github/nnm2602/MoviesSuggestion/blob/main/MovieSuggestion.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

First, we'll import all of the necessary libraries.

In [6]:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import difflib

Next, we read the data file and select the relevant features.

Specifically, we will choose the following features: genres, keywords, tagline, cast, and director.

We then combine all of these features into a single column by joining their values with spaces.

In [7]:
# Load the data from the CSV file into a pandas dataframe
movies_data = pd.read_csv('/content/movies.csv')

# Select relevant features for recommendation
selected_features = ['genres', 'keywords', 'tagline', 'cast', 'director']

# Replace null values with an empty string for selected features
movies_data[selected_features] = movies_data[selected_features].fillna('')

# Combine all selected features into a single column
movies_data['combined_features'] = movies_data[selected_features].agg(' '.join, axis=1)
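
To make the fill-and-join step concrete, here is a minimal sketch on a made-up two-row table. The names `toy_movies` and `toy_features` and all of the values are hypothetical and are not taken from `movies.csv`; only three of the five feature columns are used to keep it short.

```python
import pandas as pd

# Hypothetical toy data with some missing values (not from movies.csv)
toy_movies = pd.DataFrame({
    'genres': ['Action', 'Drama'],
    'tagline': [None, 'A story'],
    'director': ['A. Smith', None],
})
toy_features = ['genres', 'tagline', 'director']

# Missing values become empty strings so that str.join does not fail on them
toy_movies[toy_features] = toy_movies[toy_features].fillna('')

# Join the values of each row with a single space
toy_movies['combined_features'] = toy_movies[toy_features].agg(' '.join, axis=1)

print(toy_movies['combined_features'].tolist())
```

An empty field leaves a doubled or trailing space in the combined string. That should be harmless here, because the default `TfidfVectorizer` tokenizer extracts word tokens and ignores the whitespace between them.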

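Next, we will convert the text data to feature vectors using TF-IDF. TF-IDF weights each term by how often it appears in a movie's combined text and how rare it is across all movies, so movies that share distinctive terms end up with similar vectors. Note that this captures word overlap rather than semantic meaning.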

In [8]:
vectorizer = TfidfVectorizer()
feature_vectors = vectorizer.fit_transform(movies_data['combined_features'])
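
As a rough illustration of what the vectorizer produces, the sketch below fits it on three made-up strings. The corpus `toy_docs` and the names `toy_vectorizer` and `toy_vectors` are hypothetical; the settings are the library defaults, as in the cell above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the combined_features column
toy_docs = [
    'action superhero marvel',
    'action spy thriller',
    'romance drama',
]

toy_vectorizer = TfidfVectorizer()
toy_vectors = toy_vectorizer.fit_transform(toy_docs)

# One row per document, one column per distinct term
print(toy_vectors.shape)

# Weights of the first document, looked up by term
vocab = toy_vectorizer.vocabulary_
row0 = toy_vectors[0].toarray().ravel()
print('action:', row0[vocab['action']])
print('marvel:', row0[vocab['marvel']])
```

In the first document, `marvel` gets a higher weight than `action`, because `action` also appears in the second document and is therefore less distinctive.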

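Next, we compute the pairwise cosine similarity between all feature vectors. Entry `[i, j]` of the resulting matrix is the similarity score between movie `i` and movie `j`.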

In [9]:
similarity = cosine_similarity(feature_vectors)
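
For reference, the cosine similarity of two vectors is their dot product divided by the product of their lengths. The function below is a plain-Python sketch of that formula for a single pair of vectors. The name `cosine` is hypothetical, and returning 0.0 for an all-zero vector is a choice made for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption of this sketch: an empty vector is similar to nothing
        return 0.0
    return dot / (norm_u * norm_v)

print(cosine([1, 0], [1, 0]))  # same direction
print(cosine([1, 0], [0, 1]))  # no shared terms
print(cosine([1, 1], [1, 0]))  # partial overlap
```

Because TF-IDF weights are never negative, the scores in this notebook fall between 0 (no shared terms) and 1 (same direction).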

We'll move on to the recommendation process:
+ First, we prompt the user for a movie name.
+ Then we find the closest match to that name among the titles in the dataset.
+ Finally, we use the matched title to locate the movie's index and rank all movies by their similarity score to it.

In [10]:
# Get the movie name from the user
movie_name = input('Enter your favorite movie name: ')

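# Find the closest match for the movie name given by the user
find_close_match = difflib.get_close_matches(movie_name, movies_data['title'])
if not find_close_match:
    # get_close_matches returns an empty list when nothing is similar enough
    raise ValueError(f'No close match found for "{movie_name}"')
close_match = find_close_match[0]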

# Find the index of the movie with the title
index_of_the_movie = movies_data[movies_data['title'] == close_match].index[0]

# Get a list of similar movies
similarity_score = list(enumerate(similarity[index_of_the_movie]))

# Sort the movies based on their similarity score
sorted_similar_movies = sorted(similarity_score, key=lambda x: x[1], reverse=True)

Enter your favorite movie name: iron mna
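
`difflib.get_close_matches` compares strings case-sensitively and only returns candidates whose similarity ratio reaches its `cutoff` (0.6 by default). One possible way to make the lookup more forgiving is to compare lowercased strings and map the result back to the original title. The helper `closest_title` and the short title list below are hypothetical and only sketch the idea.

```python
import difflib

def closest_title(query, titles, cutoff=0.6):
    """Return the title closest to query, ignoring case, or None if none qualifies."""
    # Map each lowercased title back to its original spelling
    lowered = {}
    for title in titles:
        lowered.setdefault(title.lower(), title)
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

# Hypothetical title list for illustration
toy_titles = ['Iron Man', 'Iron Man 2', 'Ant-Man']
print(closest_title('iron mna', toy_titles))
print(closest_title('zzzz', toy_titles))
```

Returning `None` instead of indexing into an empty list lets the caller decide how to handle an unrecognized name.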


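Finally, we present the 30 highest-scoring movies to the user. The matched movie itself tops the list, since its similarity with itself is 1, the maximum possible score.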

In [11]:
print('Movies suggested for you:\n')

for i, movie in enumerate(sorted_similar_movies[:30], 1):
    index = movie[0]
    title_from_index = movies_data.loc[index, 'title']
    print(i, '.', title_from_index)

Movies suggested for you:

1 . Iron Man
2 . Iron Man 2
3 . Iron Man 3
4 . Avengers: Age of Ultron
5 . The Avengers
6 . Captain America: Civil War
7 . Captain America: The Winter Soldier
8 . Ant-Man
9 . X-Men
10 . Made
11 . X-Men: Apocalypse
12 . X2
13 . The Incredible Hulk
14 . The Helix... Loaded
15 . X-Men: First Class
16 . X-Men: Days of Future Past
17 . Captain America: The First Avenger
18 . Kick-Ass 2
19 . Guardians of the Galaxy
20 . Deadpool
21 . Thor: The Dark World
22 . G-Force
23 . X-Men: The Last Stand
24 . Duets
25 . Mortdecai
26 . The Last Airbender
27 . Southland Tales
28 . Zathura: A Space Adventure
29 . Sky Captain and the World of Tomorrow
30 . The Amazing Spider-Man 2
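
If the queried movie should not be recommended back to the user, it can be filtered out before sorting. The function below is a small sketch of that variation on one row of similarity scores. The name `top_similar` and the example scores are hypothetical.

```python
def top_similar(similarity_row, query_index, k=5):
    """Return the k highest (index, score) pairs, skipping the query movie."""
    scored = [
        (i, score)
        for i, score in enumerate(similarity_row)
        if i != query_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical scores for a query movie at index 0
toy_row = [1.0, 0.2, 0.9, 0.5]
print(top_similar(toy_row, query_index=0, k=2))
```

In the notebook, the row would be `similarity[index_of_the_movie]` and the query index would be `index_of_the_movie`.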
