
**#Step 1: importing libraries and dataset**

In [1]:
import pandas as pd 
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


In [2]:
url = 'https://raw.githubusercontent.com/WidhyaOrg/datasets/master/movie_dataset.csv'
dataset = pd.read_csv(url)

In [3]:
dataset.head()

Unnamed: 0,index,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew,director
0,0,237000000,Action Adventure Fantasy Science Fiction,http://www.avatarmovie.com/,19995,culture clash future space war space colony so...,en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,Sam Worthington Zoe Saldana Sigourney Weaver S...,"[{'name': 'Stephen E. Rivkin', 'gender': 0, 'd...",James Cameron
1,1,300000000,Adventure Fantasy Action,http://disney.go.com/disneypictures/pirates/,285,ocean drug abuse exotic island east india trad...,en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-05-19,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,Johnny Depp Orlando Bloom Keira Knightley Stel...,"[{'name': 'Dariusz Wolski', 'gender': 2, 'depa...",Gore Verbinski
2,2,245000000,Action Adventure Crime,http://www.sonypictures.com/movies/spectre/,206647,spy based on novel secret agent sequel mi6,en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2015-10-26,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,Daniel Craig Christoph Waltz L\u00e9a Seydoux ...,"[{'name': 'Thomas Newman', 'gender': 2, 'depar...",Sam Mendes
3,3,250000000,Action Crime Drama Thriller,http://www.thedarkknightrises.com/,49026,dc comics crime fighter terrorist secret ident...,en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-07-16,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,Christian Bale Michael Caine Gary Oldman Anne ...,"[{'name': 'Hans Zimmer', 'gender': 2, 'departm...",Christopher Nolan
4,4,260000000,Action Adventure Science Fiction,http://movies.disney.com/john-carter,49529,based on novel mars medallion space travel pri...,en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-03-07,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,Taylor Kitsch Lynn Collins Samantha Morton Wil...,"[{'name': 'Andrew Stanton', 'gender': 2, 'depa...",Andrew Stanton


**#Step 2: feature selection and data cleaning**

In [4]:
df = dataset[['title','keywords','cast','genres','director']]
df.isnull().sum()

title         0
keywords    412
cast         43
genres       28
director     30
dtype: int64

In [5]:
# Replace missing values with empty strings so the text columns can be concatenated
df = df.fillna('')
df.isnull().sum()

title       0
keywords    0
cast        0
genres      0
director    0
dtype: int64

**#Step 3: combining the features into a new column, combine_feature, to compute similarity**

In [6]:
columns = ['title','genres','director','cast','keywords']
# Join the selected text columns row by row into one space-separated string.
# Assigning to `row` inside iterrows() is not guaranteed to write back to df,
# so the column is built in a single vectorized step instead.
df['combine_feature'] = df[columns].agg(' '.join, axis=1)
df2 = df[['title','combine_feature']]

**#Step 4: Find similarity**

In [7]:
count = CountVectorizer()
count_matrix = count.fit_transform(df2['combine_feature'])
cosine_sim = cosine_similarity(count_matrix, count_matrix)
print(cosine_sim)

[[1.         0.09078413 0.11572751 ... 0.         0.         0.        ]
 [0.09078413 1.         0.06537205 ... 0.06052275 0.         0.        ]
 [0.11572751 0.06537205 1.         ... 0.         0.10206207 0.        ]
 ...
 [0.         0.06052275 0.         ... 1.         0.         0.07142857]
 [0.         0.         0.10206207 ... 0.         1.         0.        ]
 [0.         0.         0.         ... 0.07142857 0.         1.        ]]
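Each entry of the matrix above is the cosine similarity between the word-count vectors of two movies. The following is a minimal pure-Python sketch of that calculation on three made-up rows (not taken from the dataset). The helper names `bag_of_words` and `cosine` are hypothetical, and the tokenizer only approximates `CountVectorizer`'s defaults (lowercasing, tokens of two or more word characters).

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count tokens of two or more word characters."""
    return Counter(re.findall(r"\b\w\w+\b", text.lower()))

def cosine(a, b):
    """Cosine similarity between two token-count mappings."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty feature string is similar to nothing
    return dot / (norm_a * norm_b)

# Toy feature strings, invented for illustration only
toy = {
    "Movie A": "space war alien action",
    "Movie B": "space alien horror",
    "Movie C": "romance comedy paris",
}
bags = {name: bag_of_words(text) for name, text in toy.items()}

sim_ab = cosine(bags["Movie A"], bags["Movie B"])  # two shared words
sim_ac = cosine(bags["Movie A"], bags["Movie C"])  # no shared words
print(round(sim_ab, 4), round(sim_ac, 4))  # 0.5774 0.0
```

Movies A and B share two of their words, so their score is 2 / (sqrt(4) * sqrt(3)), roughly 0.577, while A and C share none and score 0. A movie compared with itself always scores 1, which is why the diagonal of the matrix is all ones.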


**Build a Series of movie titles whose index matches the row and column indices of the similarity matrix**

In [8]:
indices = pd.Series(df['title'])

**#Step 5: run and test the recommender model**

In [9]:
def recommend(title, cosine_sim=cosine_sim):
    """Return the 10 titles most similar to `title` by cosine similarity."""
    # Row position of the requested movie (first exact match on the title)
    idx = indices[indices == title].index[0]
    # Similarity of this movie to every movie, highest first
    score_series = pd.Series(cosine_sim[idx]).sort_values(ascending=False)
    # Skip position 0, which is normally the movie itself (similarity 1.0)
    top_10_indices = list(score_series.iloc[1:11].index)
    # Map row positions back to titles
    return df['title'].iloc[top_10_indices].tolist()
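The ranking step inside `recommend` sorts one row of the similarity matrix and drops the first entry on the assumption that it is the queried movie. As a rough sketch of the same idea on plain lists, the hypothetical helper `top_k_similar` below excludes the queried movie by its index instead, which stays correct even if another movie ties at similarity 1.0. The scores are made up for illustration.

```python
def top_k_similar(sim_row, self_index, k=3):
    """Return the indices of the k highest scores in sim_row, excluding self_index."""
    candidates = [(score, i) for i, score in enumerate(sim_row) if i != self_index]
    # Highest score first; ties are broken by the lower index so the result is deterministic
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in candidates[:k]]

# Made-up similarity row for the movie at index 0
toy_row = [1.0, 0.09, 0.12, 0.0, 0.31]
top = top_k_similar(toy_row, self_index=0, k=3)
print(top)  # [4, 2, 1]
```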

**#Step 6: Checking the similar movies**

In [10]:
recommend('Avatar')

['Guardians of the Galaxy',
 'Aliens',
 'Alien',
 'Zathura: A Space Adventure',
 'Star Trek Into Darkness',
 'Star Trek Beyond',
 'Lockout',
 'Jason X',
 'Star Wars: Clone Wars: Volume 1',
 'Lost in Space']
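`recommend` expects the title exactly as it is spelled in the dataset, so a query such as `'avatar'` or `'Avatr'` finds no matching row. One possible way to soften this, sketched here with the standard-library `difflib` module, is to resolve the user's input to a known title first. The helper `resolve_title` and the short title list are hypothetical and not part of the original notebook.

```python
import difflib

def resolve_title(query, titles, cutoff=0.6):
    """Map a user query to a known title, or return None if nothing is close enough."""
    lowered = {t.lower(): t for t in titles}
    key = query.strip().lower()
    # Case-insensitive exact match first
    if key in lowered:
        return lowered[key]
    # Otherwise fall back to the closest title by string similarity
    close = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[close[0]] if close else None

# Small stand-in for the title column
known_titles = ["Avatar", "Aliens", "Alien", "Spectre"]

print(resolve_title("avatar", known_titles))  # Avatar
print(resolve_title("Avatr", known_titles))   # Avatar
print(resolve_title("zzzz", known_titles))    # None
```

In the notebook, the resolved title could then be passed to `recommend`, with a `None` result handled explicitly rather than looked up.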