## Recommendation
- Recommendation systems can be categorized into three types:
1. Content-Based: This approach recommends movies based on the content you have previously watched. For instance, if you watch a comedy or action movie, it will suggest other movies in the same genre.
2. Collaborative Filtering: Used by platforms like Netflix, this method leverages the preferences of similar users to make recommendations. For example, if you and Kevin both watched "BMB," the algorithm recognizes your similar tastes. Therefore, if you watch "ZNMD," it will likely recommend the same movie to Kevin.
3. Hybrid: This technique combines content-based and collaborative filtering. YouTube, for example, uses this approach by recommending videos based on both the content of the videos you watch and the viewing habits of others with similar interests.
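
To make the content-based idea concrete, here is a minimal sketch that ranks movies by how many genres they share with one you watched. The tiny catalogue and the `jaccard` and `recommend` helpers are made up for illustration; a real system would use richer features than genre labels.

```python
# Toy catalogue: title -> set of genres (illustrative data, not from TMDB).
catalogue = {
    "Movie A": {"Action", "Adventure"},
    "Movie B": {"Action", "Adventure", "Fantasy"},
    "Movie C": {"Comedy", "Romance"},
}

def jaccard(a, b):
    """Share of genres two movies have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(watched, catalogue):
    """Rank every other title by genre overlap with the watched one."""
    target = catalogue[watched]
    scores = {t: jaccard(target, g) for t, g in catalogue.items() if t != watched}
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("Movie A", catalogue))  # ['Movie B', 'Movie C']
```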

### Project flow
- It's very important to get your project flow right. Here's a structured approach:
1. Data: First, we collect the data.
2. Preprocessing: In this step, we perform data cleaning and other necessary preprocessing tasks.
3. Model: Next, we train the model. Once we are satisfied with its performance, we proceed to the next step.
4. Save: We save the trained model.
5. Testing: We conduct dry run testing to ensure everything works as expected.
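
The save and dry-run steps can be sketched with the standard library alone. The `model` dictionary below is a stand-in for whatever object the training step produces, and the file name is arbitrary; both are assumptions for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model (hypothetical; any picklable object works).
model = {"Movie A": ["Movie B"], "Movie C": ["Movie D"]}

# Step 4, Save: write the model to disk.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Step 5, Testing: reload it and check that a known query still answers.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored["Movie A"])  # ['Movie B']
```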

In [1]:
#### 1. Import libraries or packages
import pandas as pd
import numpy as np

In [2]:
### 2. Load the data (from the 'TMDB 5000 movie dataset')
movies = pd.read_csv('movie_recomm_Dataset/tmdb_5000_movies.csv')
credits = pd.read_csv("movie_recomm_Dataset/tmdb_5000_credits.csv")


In [3]:
movies.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-05-19,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2015-10-26,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-07-16,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-03-07,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124


In [4]:
movies.shape

(4803, 20)

In [5]:
credits.head()

Unnamed: 0,movie_id,title,cast,crew
0,19995,Avatar,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,Pirates of the Caribbean: At World's End,"[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,Spectre,"[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,The Dark Knight Rises,"[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,John Carter,"[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."


In [6]:
credits.shape

(4803, 4)

In [7]:
### Inspect the data in a particular column

credits.iloc[0].crew

'[{"credit_id": "52fe48009251416c750aca23", "department": "Editing", "gender": 0, "id": 1721, "job": "Editor", "name": "Stephen E. Rivkin"}, {"credit_id": "539c47ecc3a36810e3001f87", "department": "Art", "gender": 2, "id": 496, "job": "Production Design", "name": "Rick Carter"}, {"credit_id": "54491c89c3a3680fb4001cf7", "department": "Sound", "gender": 0, "id": 900, "job": "Sound Designer", "name": "Christopher Boyes"}, {"credit_id": "54491cb70e0a267480001bd0", "department": "Sound", "gender": 0, "id": 900, "job": "Supervising Sound Editor", "name": "Christopher Boyes"}, {"credit_id": "539c4a4cc3a36810c9002101", "department": "Production", "gender": 1, "id": 1262, "job": "Casting", "name": "Mali Finn"}, {"credit_id": "5544ee3b925141499f0008fc", "department": "Sound", "gender": 2, "id": 1729, "job": "Original Music Composer", "name": "James Horner"}, {"credit_id": "52fe48009251416c750ac9c3", "department": "Directing", "gender": 2, "id": 2710, "job": "Director", "name": "James Cameron"},

In [8]:
### now that we have some idea of the data in both datasets, we need to combine them
### the logic cannot be applied to two separate tables, so we merge them on a common column

In [8]:
credits.columns

Index(['movie_id', 'title', 'cast', 'crew'], dtype='object')

In [9]:
movies.columns

Index(['budget', 'genres', 'homepage', 'id', 'keywords', 'original_language',
       'original_title', 'overview', 'popularity', 'production_companies',
       'production_countries', 'release_date', 'revenue', 'runtime',
       'spoken_languages', 'status', 'tagline', 'title', 'vote_average',
       'vote_count'],
      dtype='object')

In [11]:
### movies has 'id' and 'title'; credits has 'movie_id' and 'title'
### either pair (the ID columns or the title columns) can serve as the merge key

In [10]:
### merge the data

movies = movies.merge(credits, on = 'title')

In [11]:
movies.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,movie_id,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,19995,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...",...,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,285,"[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...",...,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,206647,"[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...",...,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,49026,"[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]",...,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,49529,"[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."


In [12]:
movies.shape

(4809, 23)
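
The row count grows from 4803 to 4809 because `merge` pairs every matching row on the left with every matching row on the right, so a title that appears more than once is multiplied. That is presumably what happens here, since different movies can share a title. The toy frames below (made-up data) reproduce the effect and show that joining on the unique ID avoids it.

```python
import pandas as pd

# Two different movies share the title "Twin" (illustrative data).
movies_toy = pd.DataFrame({"id": [1, 2, 3], "title": ["Solo", "Twin", "Twin"]})
credits_toy = pd.DataFrame({"movie_id": [1, 2, 3],
                            "title": ["Solo", "Twin", "Twin"],
                            "cast": ["a", "b", "c"]})

# Merging on the non-unique title creates 2 x 2 rows for "Twin".
by_title = movies_toy.merge(credits_toy, on="title")
print(by_title.shape[0])  # 5

# Merging on the unique ID keeps one row per movie.
by_id = movies_toy.merge(credits_toy, left_on="id", right_on="movie_id")
print(by_id.shape[0])  # 3
```

The rest of this notebook keeps the title-based merge, so the outputs that follow reflect 4809 rows before nulls are dropped.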

In [13]:
movies.columns

Index(['budget', 'genres', 'homepage', 'id', 'keywords', 'original_language',
       'original_title', 'overview', 'popularity', 'production_companies',
       'production_countries', 'release_date', 'revenue', 'runtime',
       'spoken_languages', 'status', 'tagline', 'title', 'vote_average',
       'vote_count', 'movie_id', 'cast', 'crew'],
      dtype='object')

In [16]:
### after merging we have 23 columns; most are not needed, so we drop the unwanted ones
### let's prepare the list of columns required for creating the recommendation model


In [14]:
# Required columns
columns_list = ['movie_id','title','overview','genres','keywords','cast','crew']

movies = movies[columns_list]

movies

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di...","[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...","[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...","[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...","[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...","[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...","[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...","[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...","[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...","[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...","[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,John Carter,"John Carter is a war-weary, former military ca...","[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...","[{""id"": 818, ""name"": ""based on novel""}, {""id"":...","[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."
...,...,...,...,...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...","[{""id"": 5616, ""name"": ""united states\u2013mexi...","[{""cast_id"": 1, ""character"": ""El Mariachi"", ""c...","[{""credit_id"": ""52fe44eec3a36847f80b280b"", ""de..."
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...,"[{""id"": 35, ""name"": ""Comedy""}, {""id"": 10749, ""...",[],"[{""cast_id"": 1, ""character"": ""Buzzy"", ""credit_...","[{""credit_id"": ""52fe487dc3a368484e0fb013"", ""de..."
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic...","[{""id"": 35, ""name"": ""Comedy""}, {""id"": 18, ""nam...","[{""id"": 248, ""name"": ""date""}, {""id"": 699, ""nam...","[{""cast_id"": 8, ""character"": ""Oliver O\u2019To...","[{""credit_id"": ""52fe4df3c3a36847f8275ecf"", ""de..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...,[],[],"[{""cast_id"": 3, ""character"": ""Sam"", ""credit_id...","[{""credit_id"": ""52fe4ad9c3a368484e16a36b"", ""de..."


In [None]:
### now we have to transform the data

### dataset --> movie_id, title, tag (built from 'overview', 'genres', 'keywords', 'cast', 'crew')

### we compress these 5 columns into one column by appending their contents together

In [15]:
### check for null values
movies.isnull().sum()

movie_id    0
title       0
overview    3
genres      0
keywords    0
cast        0
crew        0
dtype: int64

In [16]:
### overview has 3 null values, so we remove the rows that contain them
### (assigning the result avoids modifying a slice of the merged frame in place)

movies = movies.dropna()

In [17]:
movies.isnull().sum()

# now we don't have any missing data

movie_id    0
title       0
overview    0
genres      0
keywords    0
cast        0
crew        0
dtype: int64

In [18]:
### check for duplicate

movies.duplicated().sum()

# so there are 0 duplicates in data

0

### analyse column by column
## Genre

In [19]:

movies.iloc[0].genres

'[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]'

In [20]:
### we only want the names from genres, e.g. [Action, Adventure, Fantasy, Science Fiction]

def context(obj):
    L = []
    for i in obj:
        L.append(i['name'])
    return L


In [21]:
movies['genres'].apply(context)

### looking closely, each genres value is a string, not a list, so indexing it raises the error below
### to fix this, we first have to convert the string into a list

TypeError: string indices must be integers, not 'str'
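
The error arises because iterating over a string yields single characters, and a character cannot be indexed with `'name'`. A minimal reproduction, using a shortened copy of the genres value:

```python
raw = '[{"id": 28, "name": "Action"}]'

# Iterating a str gives one-character strings, not dictionaries.
first = next(iter(raw))
print(repr(first))  # '['

# Indexing that character with a string key raises the same TypeError.
try:
    first["name"]
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)
```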

In [22]:
import ast
### ast.literal_eval parses the string and returns the list it describes

ast.literal_eval('[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]')

[{'id': 28, 'name': 'Action'},
 {'id': 12, 'name': 'Adventure'},
 {'id': 14, 'name': 'Fantasy'},
 {'id': 878, 'name': 'Science Fiction'}]
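
Because these columns hold JSON text, `json.loads` is an alternative parser. `ast.literal_eval` works on the rows shown here, but it would reject JSON-only literals such as `null` or `true`. The sketch below uses a made-up string to show the difference; whether the dataset actually contains such values is not checked here.

```python
import ast
import json

text = '[{"id": 28, "name": "Action", "extra": null}]'  # made-up example

# json.loads understands JSON's null and maps it to None.
parsed = json.loads(text)
print(parsed[0]["name"], parsed[0]["extra"])  # Action None

# ast.literal_eval only accepts Python literals, so `null` is rejected.
try:
    ast.literal_eval(text)
    literal_ok = True
except ValueError:
    literal_ok = False
print(literal_ok)  # False
```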

In [23]:
### let's try it inside the function
def context(obj):
    L = []
    for i in ast.literal_eval(obj):
        L.append(i['name'])
    return L

In [24]:
movies['genres'] = movies['genres'].apply(context)
movies['genres']

0       [Action, Adventure, Fantasy, Science Fiction]
1                        [Adventure, Fantasy, Action]
2                          [Action, Adventure, Crime]
3                    [Action, Crime, Drama, Thriller]
4                [Action, Adventure, Science Fiction]
                            ...                      
4804                        [Action, Crime, Thriller]
4805                                [Comedy, Romance]
4806               [Comedy, Drama, Romance, TV Movie]
4807                                               []
4808                                    [Documentary]
Name: genres, Length: 4806, dtype: object

## Keywords

In [25]:
movies.iloc[0].keywords

### same scenario for keywords: the value is a string (object dtype) and we want a list, so we follow the same approach as for genres


'[{"id": 1463, "name": "culture clash"}, {"id": 2964, "name": "future"}, {"id": 3386, "name": "space war"}, {"id": 3388, "name": "space colony"}, {"id": 3679, "name": "society"}, {"id": 3801, "name": "space travel"}, {"id": 9685, "name": "futuristic"}, {"id": 9840, "name": "romance"}, {"id": 9882, "name": "space"}, {"id": 9951, "name": "alien"}, {"id": 10148, "name": "tribe"}, {"id": 10158, "name": "alien planet"}, {"id": 10987, "name": "cgi"}, {"id": 11399, "name": "marine"}, {"id": 13065, "name": "soldier"}, {"id": 14643, "name": "battle"}, {"id": 14720, "name": "love affair"}, {"id": 165431, "name": "anti war"}, {"id": 193554, "name": "power relations"}, {"id": 206690, "name": "mind and soul"}, {"id": 209714, "name": "3d"}]'

In [26]:
def key_context(obj):
    k=[]
    for i in ast.literal_eval(obj):
        k.append(i['name'])
    return k

In [27]:
movies['keywords'] = movies['keywords'].apply(key_context)

In [28]:
movies['keywords']

0       [culture clash, future, space war, space colon...
1       [ocean, drug abuse, exotic island, east india ...
2       [spy, based on novel, secret agent, sequel, mi...
3       [dc comics, crime fighter, terrorist, secret i...
4       [based on novel, mars, medallion, space travel...
                              ...                        
4804    [united states–mexico barrier, legs, arms, pap...
4805                                                   []
4806    [date, love at first sight, narration, investi...
4807                                                   []
4808            [obsession, camcorder, crush, dream girl]
Name: keywords, Length: 4806, dtype: object

In [29]:
movies

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di...","[Action, Adventure, Fantasy, Science Fiction]","[culture clash, future, space war, space colon...","[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...","[Adventure, Fantasy, Action]","[ocean, drug abuse, exotic island, east india ...","[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...,"[Action, Adventure, Crime]","[spy, based on novel, secret agent, sequel, mi...","[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...,"[Action, Crime, Drama, Thriller]","[dc comics, crime fighter, terrorist, secret i...","[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,John Carter,"John Carter is a war-weary, former military ca...","[Action, Adventure, Science Fiction]","[based on novel, mars, medallion, space travel...","[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."
...,...,...,...,...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...,"[Action, Crime, Thriller]","[united states–mexico barrier, legs, arms, pap...","[{""cast_id"": 1, ""character"": ""El Mariachi"", ""c...","[{""credit_id"": ""52fe44eec3a36847f80b280b"", ""de..."
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...,"[Comedy, Romance]",[],"[{""cast_id"": 1, ""character"": ""Buzzy"", ""credit_...","[{""credit_id"": ""52fe487dc3a368484e0fb013"", ""de..."
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic...","[Comedy, Drama, Romance, TV Movie]","[date, love at first sight, narration, investi...","[{""cast_id"": 8, ""character"": ""Oliver O\u2019To...","[{""credit_id"": ""52fe4df3c3a36847f8275ecf"", ""de..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...,[],[],"[{""cast_id"": 3, ""character"": ""Sam"", ""credit_id...","[{""credit_id"": ""52fe4ad9c3a368484e16a36b"", ""de..."


## Cast

In [30]:
movies.iloc[0].cast

'[{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri", "credit_id": "52fe48009251416c750ac9cb", "gender": 1, "id": 8691, "name": "Zoe Saldana", "order": 1}, {"cast_id": 25, "character": "Dr. Grace Augustine", "credit_id": "52fe48009251416c750aca39", "gender": 1, "id": 10205, "name": "Sigourney Weaver", "order": 2}, {"cast_id": 4, "character": "Col. Quaritch", "credit_id": "52fe48009251416c750ac9cf", "gender": 2, "id": 32747, "name": "Stephen Lang", "order": 3}, {"cast_id": 5, "character": "Trudy Chacon", "credit_id": "52fe48009251416c750ac9d3", "gender": 1, "id": 17647, "name": "Michelle Rodriguez", "order": 4}, {"cast_id": 8, "character": "Selfridge", "credit_id": "52fe48009251416c750ac9e1", "gender": 2, "id": 1771, "name": "Giovanni Ribisi", "order": 5}, {"cast_id": 7, "character": "Norm Spellman", "credit_id": "52fe48009251416c750ac9dd", "gender": 

In [31]:
def cast_context(obj):
    c=[]
    counter = 0
    for i in ast.literal_eval(obj):
        if counter < 3:
            c.append(i['name'])
            counter+=1
        else:
            break
    return c
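
The three helpers so far (`context`, `key_context`, `cast_context`) differ only in how many names they keep, so they could be folded into one function. `extract_names` and its `limit` parameter are hypothetical names introduced here, not part of the notebook.

```python
import ast

def extract_names(text, limit=None):
    """Parse a stringified list of dicts and return up to `limit` names."""
    items = ast.literal_eval(text)
    # Slicing with None keeps everything; slicing with 3 keeps the first three.
    return [item["name"] for item in items[:limit]]

sample = ('[{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, '
          '{"id": 3, "name": "C"}, {"id": 4, "name": "D"}]')
print(extract_names(sample))           # ['A', 'B', 'C', 'D']
print(extract_names(sample, limit=3))  # ['A', 'B', 'C']
```

With pandas, the limit can be passed through as a keyword argument, for example `movies['cast'].apply(extract_names, limit=3)`.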

In [32]:
movies['cast']=movies['cast'].apply(cast_context)

In [33]:
movies['cast']

0        [Sam Worthington, Zoe Saldana, Sigourney Weaver]
1           [Johnny Depp, Orlando Bloom, Keira Knightley]
2            [Daniel Craig, Christoph Waltz, Léa Seydoux]
3            [Christian Bale, Michael Caine, Gary Oldman]
4          [Taylor Kitsch, Lynn Collins, Samantha Morton]
                              ...                        
4804    [Carlos Gallardo, Jaime de Hoyos, Peter Marqua...
4805         [Edward Burns, Kerry Bishé, Marsha Dietlein]
4806           [Eric Mabius, Kristin Booth, Crystal Lowe]
4807            [Daniel Henney, Eliza Coupe, Bill Paxton]
4808    [Drew Barrymore, Brian Herzlinger, Corey Feldman]
Name: cast, Length: 4806, dtype: object

In [34]:
movies

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di...","[Action, Adventure, Fantasy, Science Fiction]","[culture clash, future, space war, space colon...","[Sam Worthington, Zoe Saldana, Sigourney Weaver]","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...","[Adventure, Fantasy, Action]","[ocean, drug abuse, exotic island, east india ...","[Johnny Depp, Orlando Bloom, Keira Knightley]","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...,"[Action, Adventure, Crime]","[spy, based on novel, secret agent, sequel, mi...","[Daniel Craig, Christoph Waltz, Léa Seydoux]","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...,"[Action, Crime, Drama, Thriller]","[dc comics, crime fighter, terrorist, secret i...","[Christian Bale, Michael Caine, Gary Oldman]","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,John Carter,"John Carter is a war-weary, former military ca...","[Action, Adventure, Science Fiction]","[based on novel, mars, medallion, space travel...","[Taylor Kitsch, Lynn Collins, Samantha Morton]","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."
...,...,...,...,...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...,"[Action, Crime, Thriller]","[united states–mexico barrier, legs, arms, pap...","[Carlos Gallardo, Jaime de Hoyos, Peter Marqua...","[{""credit_id"": ""52fe44eec3a36847f80b280b"", ""de..."
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...,"[Comedy, Romance]",[],"[Edward Burns, Kerry Bishé, Marsha Dietlein]","[{""credit_id"": ""52fe487dc3a368484e0fb013"", ""de..."
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic...","[Comedy, Drama, Romance, TV Movie]","[date, love at first sight, narration, investi...","[Eric Mabius, Kristin Booth, Crystal Lowe]","[{""credit_id"": ""52fe4df3c3a36847f8275ecf"", ""de..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...,[],[],"[Daniel Henney, Eliza Coupe, Bill Paxton]","[{""credit_id"": ""52fe4ad9c3a368484e16a36b"", ""de..."


## Crew

In [35]:
movies.iloc[0].crew

'[{"credit_id": "52fe48009251416c750aca23", "department": "Editing", "gender": 0, "id": 1721, "job": "Editor", "name": "Stephen E. Rivkin"}, {"credit_id": "539c47ecc3a36810e3001f87", "department": "Art", "gender": 2, "id": 496, "job": "Production Design", "name": "Rick Carter"}, {"credit_id": "54491c89c3a3680fb4001cf7", "department": "Sound", "gender": 0, "id": 900, "job": "Sound Designer", "name": "Christopher Boyes"}, {"credit_id": "54491cb70e0a267480001bd0", "department": "Sound", "gender": 0, "id": 900, "job": "Supervising Sound Editor", "name": "Christopher Boyes"}, {"credit_id": "539c4a4cc3a36810c9002101", "department": "Production", "gender": 1, "id": 1262, "job": "Casting", "name": "Mali Finn"}, {"credit_id": "5544ee3b925141499f0008fc", "department": "Sound", "gender": 2, "id": 1729, "job": "Original Music Composer", "name": "James Horner"}, {"credit_id": "52fe48009251416c750ac9c3", "department": "Directing", "gender": 2, "id": 2710, "job": "Director", "name": "James Cameron"},

In [36]:
def crew_context(obj):
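    cr = []

    # An earlier attempt kept only the director, but its `return cr` sat
    # inside the for loop, so the function returned after the first crew entry:
    # for i in ast.literal_eval(obj):
    #     if i['job'] == "Director":
    #         cr.append(i['name'])
    #     return cr

    # This version keeps the name of every crew member instead.
    for i in ast.literal_eval(obj):
        cr.append(i['name'])
    return cr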
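
If only the director is wanted, which is what the abandoned attempt in `crew_context` was aiming for, the `return` has to come after the loop finishes. The sketch below is one way to write it; `director_context` is a hypothetical name, and the rest of this notebook continues with the all-names version above.

```python
import ast

def director_context(text):
    """Return the names of crew members whose job is 'Director'."""
    directors = []
    for member in ast.literal_eval(text):
        if member["job"] == "Director":
            directors.append(member["name"])
    return directors  # after the loop, so every crew entry is checked

sample = ('[{"job": "Editor", "name": "Stephen E. Rivkin"}, '
          '{"job": "Director", "name": "James Cameron"}]')
print(director_context(sample))  # ['James Cameron']
```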

In [37]:
movies['crew']=movies['crew'].apply(crew_context)


In [38]:
movies['crew']

0       [Stephen E. Rivkin, Rick Carter, Christopher B...
1       [Dariusz Wolski, Gore Verbinski, Jerry Bruckhe...
2       [Thomas Newman, Sam Mendes, Anna Pinnock, John...
3       [Hans Zimmer, Charles Roven, Christopher Nolan...
4       [Andrew Stanton, Andrew Stanton, John Lasseter...
                              ...                        
4804    [Robert Rodriguez, Robert Rodriguez, Robert Ro...
4805    [Edward Burns, Edward Burns, Edward Burns, Wil...
4806    [Carla Hetland, Harvey Kahn, Adam Sliwinski, M...
4807                           [Daniel Hsia, Daniel Hsia]
4808    [Clark Peterson, Andrew Reimer, Brian Herzling...
Name: crew, Length: 4806, dtype: object

In [39]:
movies

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di...","[Action, Adventure, Fantasy, Science Fiction]","[culture clash, future, space war, space colon...","[Sam Worthington, Zoe Saldana, Sigourney Weaver]","[Stephen E. Rivkin, Rick Carter, Christopher B..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...","[Adventure, Fantasy, Action]","[ocean, drug abuse, exotic island, east india ...","[Johnny Depp, Orlando Bloom, Keira Knightley]","[Dariusz Wolski, Gore Verbinski, Jerry Bruckhe..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...,"[Action, Adventure, Crime]","[spy, based on novel, secret agent, sequel, mi...","[Daniel Craig, Christoph Waltz, Léa Seydoux]","[Thomas Newman, Sam Mendes, Anna Pinnock, John..."
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...,"[Action, Crime, Drama, Thriller]","[dc comics, crime fighter, terrorist, secret i...","[Christian Bale, Michael Caine, Gary Oldman]","[Hans Zimmer, Charles Roven, Christopher Nolan..."
4,49529,John Carter,"John Carter is a war-weary, former military ca...","[Action, Adventure, Science Fiction]","[based on novel, mars, medallion, space travel...","[Taylor Kitsch, Lynn Collins, Samantha Morton]","[Andrew Stanton, Andrew Stanton, John Lasseter..."
...,...,...,...,...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...,"[Action, Crime, Thriller]","[united states–mexico barrier, legs, arms, pap...","[Carlos Gallardo, Jaime de Hoyos, Peter Marqua...","[Robert Rodriguez, Robert Rodriguez, Robert Ro..."
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...,"[Comedy, Romance]",[],"[Edward Burns, Kerry Bishé, Marsha Dietlein]","[Edward Burns, Edward Burns, Edward Burns, Wil..."
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic...","[Comedy, Drama, Romance, TV Movie]","[date, love at first sight, narration, investi...","[Eric Mabius, Kristin Booth, Crystal Lowe]","[Carla Hetland, Harvey Kahn, Adam Sliwinski, M..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...,[],[],"[Daniel Henney, Eliza Coupe, Bill Paxton]","[Daniel Hsia, Daniel Hsia]"


In [None]:
### Now we remove the spaces inside each entry of the genres, keywords, cast and crew columns,
### because many first or last names are shared between different people, which can create confusion
### while running the model if each word is treated as a separate token.

In [40]:
def merg_name(obj):
    n = []
    for i in obj:
        n.append(i.replace(" ",""))
    return n
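
A quick illustration of the problem being solved: if names are later split on whitespace, "Sam Worthington" and "Sam Mendes" both contribute the token "Sam" and look related. Joining each name into one token prevents that. The whitespace tokenization is an assumption about the later vectorizing step.

```python
def merg_name(obj):
    # Same behavior as the notebook's helper: strip spaces inside each entry.
    return [i.replace(" ", "") for i in obj]

cast_a = ["Sam Worthington"]
cast_b = ["Sam Mendes"]

# Tokens if the names are split on spaces.
tokens_a = set(" ".join(cast_a).split())
tokens_b = set(" ".join(cast_b).split())
print(tokens_a & tokens_b)  # {'Sam'}

# Tokens after removing the spaces inside each name.
merged_a = set(" ".join(merg_name(cast_a)).split())
merged_b = set(" ".join(merg_name(cast_b)).split())
print(merged_a & merged_b)  # set()
```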

In [41]:
movies['keywords']=movies['keywords'].apply(merg_name)
movies['genres']=movies['genres'].apply(merg_name)
movies['cast']=movies['cast'].apply(merg_name)
movies['crew']=movies['crew'].apply(merg_name)

In [42]:
movies

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di...","[Action, Adventure, Fantasy, ScienceFiction]","[cultureclash, future, spacewar, spacecolony, ...","[SamWorthington, ZoeSaldana, SigourneyWeaver]","[StephenE.Rivkin, RickCarter, ChristopherBoyes..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...","[Adventure, Fantasy, Action]","[ocean, drugabuse, exoticisland, eastindiatrad...","[JohnnyDepp, OrlandoBloom, KeiraKnightley]","[DariuszWolski, GoreVerbinski, JerryBruckheime..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...,"[Action, Adventure, Crime]","[spy, basedonnovel, secretagent, sequel, mi6, ...","[DanielCraig, ChristophWaltz, LéaSeydoux]","[ThomasNewman, SamMendes, AnnaPinnock, JohnLog..."
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...,"[Action, Crime, Drama, Thriller]","[dccomics, crimefighter, terrorist, secretiden...","[ChristianBale, MichaelCaine, GaryOldman]","[HansZimmer, CharlesRoven, ChristopherNolan, C..."
4,49529,John Carter,"John Carter is a war-weary, former military ca...","[Action, Adventure, ScienceFiction]","[basedonnovel, mars, medallion, spacetravel, p...","[TaylorKitsch, LynnCollins, SamanthaMorton]","[AndrewStanton, AndrewStanton, JohnLasseter, C..."
...,...,...,...,...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...,"[Action, Crime, Thriller]","[unitedstates–mexicobarrier, legs, arms, paper...","[CarlosGallardo, JaimedeHoyos, PeterMarquardt]","[RobertRodriguez, RobertRodriguez, RobertRodri..."
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...,"[Comedy, Romance]",[],"[EdwardBurns, KerryBishé, MarshaDietlein]","[EdwardBurns, EdwardBurns, EdwardBurns, Willia..."
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic...","[Comedy, Drama, Romance, TVMovie]","[date, loveatfirstsight, narration, investigat...","[EricMabius, KristinBooth, CrystalLowe]","[CarlaHetland, HarveyKahn, AdamSliwinski, Mart..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...,[],[],"[DanielHenney, ElizaCoupe, BillPaxton]","[DanielHsia, DanielHsia]"


In [None]:
### Now we will add a 'Tag' column that merges overview, keywords, genres, cast and crew.
### The other columns are already lists, but overview is still a string, so we first have to convert it into a list of words.

In [43]:
movies['overview'] = movies['overview'].apply(lambda x : x.split())

In [44]:
movies.head()

### split() has converted each overview from a string into a list of words

Unnamed: 0,movie_id,title,overview,genres,keywords,cast,crew
0,19995,Avatar,"[In, the, 22nd, century,, a, paraplegic, Marin...","[Action, Adventure, Fantasy, ScienceFiction]","[cultureclash, future, spacewar, spacecolony, ...","[SamWorthington, ZoeSaldana, SigourneyWeaver]","[StephenE.Rivkin, RickCarter, ChristopherBoyes..."
1,285,Pirates of the Caribbean: At World's End,"[Captain, Barbossa,, long, believed, to, be, d...","[Adventure, Fantasy, Action]","[ocean, drugabuse, exoticisland, eastindiatrad...","[JohnnyDepp, OrlandoBloom, KeiraKnightley]","[DariuszWolski, GoreVerbinski, JerryBruckheime..."
2,206647,Spectre,"[A, cryptic, message, from, Bond’s, past, send...","[Action, Adventure, Crime]","[spy, basedonnovel, secretagent, sequel, mi6, ...","[DanielCraig, ChristophWaltz, LéaSeydoux]","[ThomasNewman, SamMendes, AnnaPinnock, JohnLog..."
3,49026,The Dark Knight Rises,"[Following, the, death, of, District, Attorney...","[Action, Crime, Drama, Thriller]","[dccomics, crimefighter, terrorist, secretiden...","[ChristianBale, MichaelCaine, GaryOldman]","[HansZimmer, CharlesRoven, ChristopherNolan, C..."
4,49529,John Carter,"[John, Carter, is, a, war-weary,, former, mili...","[Action, Adventure, ScienceFiction]","[basedonnovel, mars, medallion, spacetravel, p...","[TaylorKitsch, LynnCollins, SamanthaMorton]","[AndrewStanton, AndrewStanton, JohnLasseter, C..."


In [45]:
movies['Tag'] = movies['overview']+movies['keywords']+movies['genres']+movies['cast']+movies['crew']

In [46]:
movies.iloc[0].Tag
### Tag now contains all the words below, so we no longer need the columns that were merged into it
### and we will drop them

['In',
 'the',
 '22nd',
 'century,',
 'a',
 'paraplegic',
 'Marine',
 'is',
 'dispatched',
 'to',
 'the',
 'moon',
 'Pandora',
 'on',
 'a',
 'unique',
 'mission,',
 'but',
 'becomes',
 'torn',
 'between',
 'following',
 'orders',
 'and',
 'protecting',
 'an',
 'alien',
 'civilization.',
 'cultureclash',
 'future',
 'spacewar',
 'spacecolony',
 'society',
 'spacetravel',
 'futuristic',
 'romance',
 'space',
 'alien',
 'tribe',
 'alienplanet',
 'cgi',
 'marine',
 'soldier',
 'battle',
 'loveaffair',
 'antiwar',
 'powerrelations',
 'mindandsoul',
 '3d',
 'Action',
 'Adventure',
 'Fantasy',
 'ScienceFiction',
 'SamWorthington',
 'ZoeSaldana',
 'SigourneyWeaver',
 'StephenE.Rivkin',
 'RickCarter',
 'ChristopherBoyes',
 'ChristopherBoyes',
 'MaliFinn',
 'JamesHorner',
 'JamesCameron',
 'JamesCameron',
 'JamesCameron',
 'JamesCameron',
 'JamesCameron',
 'AndrewMenzies',
 'JillBrooks',
 'MargerySimkin',
 'KevinIshioka',
 'DickBernstein',
 'ShannonMills',
 'DennieThorpe',
 'JanaVance',
 'Debora

In [47]:
new_movie_data = movies.drop(columns=['overview',"keywords",'cast','crew','genres'])

In [48]:
new_movie_data

Unnamed: 0,movie_id,title,Tag
0,19995,Avatar,"[In, the, 22nd, century,, a, paraplegic, Marin..."
1,285,Pirates of the Caribbean: At World's End,"[Captain, Barbossa,, long, believed, to, be, d..."
2,206647,Spectre,"[A, cryptic, message, from, Bond’s, past, send..."
3,49026,The Dark Knight Rises,"[Following, the, death, of, District, Attorney..."
4,49529,John Carter,"[John, Carter, is, a, war-weary,, former, mili..."
...,...,...,...
4804,9367,El Mariachi,"[El, Mariachi, just, wants, to, play, his, gui..."
4805,72766,Newlyweds,"[A, newlywed, couple's, honeymoon, is, upended..."
4806,231617,"Signed, Sealed, Delivered","[""Signed,, Sealed,, Delivered"", introduces, a,..."
4807,126186,Shanghai Calling,"[When, ambitious, New, York, attorney, Sam, is..."


In [49]:
### Tag is a list of words, so we join it back into a single space-separated string

new_movie_data['Tag'] = new_movie_data["Tag"].apply(lambda x : " ".join(x))
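The split, concatenate and join steps above can be tried on a tiny frame first. This is only a sketch: the two rows and the `toy` name are made up for illustration and are not part of the TMDB data.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the real `movies` data
toy = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "overview": ["A marine travels to space", "A spy returns"],
    "genres": [["Action", "ScienceFiction"], ["Action"]],
    "cast": [["SamWorthington"], ["DanielCraig"]],
})

# 1. split the overview string into a list of words
toy["overview"] = toy["overview"].apply(lambda text: text.split())

# 2. adding list columns concatenates the lists row by row
toy["Tag"] = toy["overview"] + toy["genres"] + toy["cast"]

# 3. join each list back into one space-separated string
toy["Tag"] = toy["Tag"].apply(lambda words: " ".join(words))

print(toy["Tag"].tolist())
```

Each row ends up with one string that contains the overview words followed by the genre and cast tokens, which is the shape the vectorizer expects later.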

In [50]:
new_movie_data # it has 2 commas in between

Unnamed: 0,movie_id,title,Tag
0,19995,Avatar,"In the 22nd century, a paraplegic Marine is di..."
1,285,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha..."
2,206647,Spectre,A cryptic message from Bond’s past sends him o...
3,49026,The Dark Knight Rises,Following the death of District Attorney Harve...
4,49529,John Carter,"John Carter is a war-weary, former military ca..."
...,...,...,...
4804,9367,El Mariachi,El Mariachi just wants to play his guitar and ...
4805,72766,Newlyweds,A newlywed couple's honeymoon is upended by th...
4806,231617,"Signed, Sealed, Delivered","""Signed, Sealed, Delivered"" introduces a dedic..."
4807,126186,Shanghai Calling,When ambitious New York attorney Sam is sent t...


In [51]:
new_movie_data.iloc[0].Tag

"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. cultureclash future spacewar spacecolony society spacetravel futuristic romance space alien tribe alienplanet cgi marine soldier battle loveaffair antiwar powerrelations mindandsoul 3d Action Adventure Fantasy ScienceFiction SamWorthington ZoeSaldana SigourneyWeaver StephenE.Rivkin RickCarter ChristopherBoyes ChristopherBoyes MaliFinn JamesHorner JamesCameron JamesCameron JamesCameron JamesCameron JamesCameron AndrewMenzies JillBrooks MargerySimkin KevinIshioka DickBernstein ShannonMills DennieThorpe JanaVance DeborahLynnScott JonLandau SeanHaworth KimSinclair KimSinclair RichardF.Mays LaetaKalogridis MayesC.Rubeo MauroFiore ScottHerbertson WoodySchultz LindaDeVetta LindaDeVetta RichardBluck SimonBright RichardMartin SteveR.Moore JohnRefoua KarlJ.Martin ChilingLin IlramChoi StevenQuale CarlaMeyer NickBassett Jil

## Word to vector
#### Embedding
- For example:
- Given the words Man, Woman and Dog, which two are the most similar?
- Man and Woman.
- A person can spot this easily, but we have to make the computer able to do it too.
- This is where **embedding** comes into the picture: each word is represented by a vector of numbers.
- A computer works much better when its input is given as numbers.

In [None]:
# for example, let's assign some made-up vectors to man, women and dog
man = [1, 1, 1, 0, 2]
women = [1, 1, 1, 3, 2]
dog = [10, 34, 12, 0, 0]

# if we ask the machine which two words are similar, it can compare the three vectors and answer man and women
# representing words as vectors like this is the idea behind word to vector
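One way the machine could make that comparison is to measure how far apart each pair of vectors is. The sketch below uses Euclidean distance as an assumption for illustration; the notebook itself switches to cosine similarity later on.

```python
import itertools
import numpy as np

# the same made-up vectors as above
vectors = {
    "man": np.array([1, 1, 1, 0, 2]),
    "women": np.array([1, 1, 1, 3, 2]),
    "dog": np.array([10, 34, 12, 0, 0]),
}

# Euclidean distance for every pair of words
pair_distance = {
    (a, b): float(np.linalg.norm(vectors[a] - vectors[b]))
    for a, b in itertools.combinations(vectors, 2)
}

# the pair with the smallest distance is the most similar one
closest_pair = min(pair_distance, key=pair_distance.get)
print(closest_pair, pair_distance[closest_pair])
```

The smallest distance belongs to the (man, women) pair, while both pairs that involve dog are far apart.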

### How will we do this? --> BOW (bag of words)

In [None]:
# let's say we want to classify the sentiment of three sentences

# i loved it | positive
# i hated it | negative
# i like it  | positive

In [None]:
# first it extracts the vocabulary, i.e. the unique words (BOW)

['i', 'loved', 'it', 'hated', 'like']
# this vocabulary now becomes the features

In [None]:
# features
# a word that is present in the sentence gets 1, otherwise 0
#              i   loved   it   hated   like
# sentence_1   1     1     1      0      0
# sentence_2   1     0     1      1      0
# sentence_3   1     0     1      0      1

# each sentence has now been converted into a vector, which is exactly what we are trying to do
# we will convert the words in the Tag column into vectors in the same way
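The table above can be reproduced with `CountVectorizer`. Treat this as a sketch: `token_pattern` and `binary=True` are settings chosen here so the output matches the 0/1 table, and they are not what the notebook uses on the real data. The columns come out in alphabetical order rather than in the order written above.

```python
from sklearn.feature_extraction.text import CountVectorizer

sentences = ["i loved it", "i hated it", "i like it"]

# The default token pattern ignores one-letter words such as "i",
# so a pattern that keeps single characters is passed here.
bow = CountVectorizer(token_pattern=r"(?u)\b\w+\b", binary=True)
matrix = bow.fit_transform(sentences).toarray()

# vocabulary_ maps each word to its column number; sort the words by column
columns = sorted(bow.vocabulary_, key=bow.vocabulary_.get)
print(columns)
print(matrix)
```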

In [52]:
# now we will import the vectorizer (CountVectorizer) from scikit-learn

from sklearn.feature_extraction.text import CountVectorizer

In [53]:
cv = CountVectorizer(max_features = 10000)

# max_features: the 3 example sentences above produced only 5 features, but our data contains far more unique words,
# so we cap the vocabulary at the 10000 most frequent words and ignore the rest.
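To see what the cap does, here is a small sketch on three made-up documents (the `docs` list is an assumption for illustration). With `max_features=2` only the two most frequent words survive.

```python
from sklearn.feature_extraction.text import CountVectorizer

# hypothetical mini corpus: "space" appears 3 times, "alien" 2 times, the rest once
docs = ["space alien space", "alien battle", "space romance"]

full = CountVectorizer().fit(docs)
capped = CountVectorizer(max_features=2).fit(docs)

print(sorted(full.vocabulary_))    # every unique word
print(sorted(capped.vocabulary_))  # only the two most frequent words
```

Rare words are dropped first, so a cap like 10000 mostly removes tokens that appear in very few movies.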

In [54]:
vector = cv.fit_transform(new_movie_data['Tag']).toarray()

In [55]:
vector  # this is what the vector data looks like

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int64)

In [56]:
vector.shape

(4806, 10000)

### How will it find similarity? --> cosine similarity

In [57]:
from sklearn.metrics.pairwise import cosine_similarity

In [58]:
# a small example to test
# cosine_similarity expects 2D input (one row per sample), so each vector is written inside double brackets

man = np.array([[10,20,10,10]])
women = np.array([[10,15,10,10]])
dog = np.array([[100,15,22,20]])
boy = np.array([[10,20,10,10]])

In [59]:
cosine_similarity(man, women)

# about 99% similarity

array([[0.98974332]])

In [60]:
cosine_similarity(women, dog)
# 68% similarity

array([[0.68115942]])

In [61]:
cosine_similarity(man, dog)
# about 62% similarity

array([[0.61679656]])

In [62]:
# if the two vectors are identical, the similarity is 1 (100%)
cosine_similarity(man, boy)


array([[1.]])
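The scores above can be checked by hand: cosine similarity is the dot product of the two vectors divided by the product of their lengths. A minimal sketch, where the helper name `cosine` is made up for this example:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two 1D vectors: dot product over the product of their lengths."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# the same example vectors, written as 1D arrays
man = np.array([10, 20, 10, 10])
women = np.array([10, 15, 10, 10])
dog = np.array([100, 15, 22, 20])

print(round(cosine(man, women), 4))
print(round(cosine(women, dog), 4))
print(round(cosine(man, dog), 4))
```

Because the score depends on the angle between the vectors and not on their length, a vector and a scaled copy of it would still score 1.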

In [63]:
# now we apply it to the actual data

similarity = cosine_similarity(vector)

In [64]:
similarity

array([[1.        , 0.15365907, 0.09130823, ..., 0.14935187, 0.11651035,
        0.08738276],
       [0.15365907, 1.        , 0.17771762, ..., 0.22224556, 0.25064021,
        0.15515822],
       [0.09130823, 0.17771762, 1.        , ..., 0.20335959, 0.15071001,
        0.14066268],
       ...,
       [0.14935187, 0.22224556, 0.20335959, ..., 1.        , 0.3425257 ,
        0.20759133],
       [0.11651035, 0.25064021, 0.15071001, ..., 0.3425257 , 1.        ,
        0.28205128],
       [0.08738276, 0.15515822, 0.14066268, ..., 0.20759133, 0.28205128,
        1.        ]])

In [65]:
similarity.shape

(4806, 4806)

In [None]:
# the machine builds a square grid that compares every sentence with every other sentence
#            sent_1   sent_2   sent_3
# sent_1       1        0        0
# sent_2       0        1        0
# sent_3       0        0        1
# the diagonal is always 1 because each sentence is compared with itself
# if sentence_1 has anything in common with another sentence, their cell shows how similar they are instead of 0
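For a concrete version of this grid, the sketch below feeds the three bag-of-words rows from the sentiment example into `cosine_similarity`. The rows are typed in by hand here, which is an assumption made to keep the example short.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# bag-of-words rows for "i loved it", "i hated it", "i like it"
# columns: i, loved, it, hated, like
sentences = np.array([
    [1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1],
])

# one row and one column per sentence
matrix = cosine_similarity(sentences)
print(matrix.round(2))
```

The diagonal is 1, the matrix is symmetric, and every off-diagonal cell is about 0.67 because each pair of sentences shares the words "i" and "it".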

In [66]:
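def recommend(movie):
    # index of the row whose title matches the requested movie
    index = new_movie_data[new_movie_data['title'] == movie].index[0]
    # (movie index, similarity) pairs, most similar first
    distances = sorted(list(enumerate(similarity[index])), reverse=True, key=lambda x: x[1])
    print(distances)  # printed only so we can inspect the pairs
    for i in distances[1:6]:
        # print(new_movie_data.iloc[i[0]].title)
        pass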

In [67]:
# let's understand it line by line

#### index = new_movie_data[new_movie_data['title'] == movie].index[0]

# this line filters the data down to the row whose title matches the name passed in movie and returns the index of that row
# we need this index to pick the movie's row in the similarity matrix
new_movie_data[new_movie_data['title'] == 'Gandhi']  # this is the row the filter returns

Unnamed: 0,movie_id,title,Tag
2030,783,Gandhi,"In the early years of the 20th century, Mohand..."


In [None]:
#### distances = sorted(list(enumerate(similarity[index])),reverse=True, key = lambda x:x[1])

# similarity is a square matrix; similarity[index] selects the row of scores for the index
# we got in the first line (2030 for 'Gandhi')
# enumerate pairs every score with its position, so we get (movie index, score) tuples
# the lambda tells sorted to use the score (x[1]) as the sort key, and reverse=True puts the
# highest scores, i.e. the most similar movies, first
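The same three steps on a tiny list make the behaviour easier to see. The `scores` values are made up for illustration; position 1 plays the role of the movie itself.

```python
scores = [0.2, 1.0, 0.5, 0.9]  # hypothetical similarity row; position 1 is the movie itself

pairs = list(enumerate(scores))                           # attach a position to every score
ranked = sorted(pairs, reverse=True, key=lambda x: x[1])  # highest score first
top_two = ranked[1:3]                                     # skip the movie itself, keep the next two

print(pairs)
print(ranked)
print(top_two)
```

Keeping the position next to each score is what lets us look up the title afterwards, since sorting the bare scores would lose track of which movie each one belongs to.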

In [68]:
sorted(similarity[2030],reverse = True)

[1.0,
 0.5493274114646123,
 0.5104812758612466,
 0.5030977485864635,
 0.4994391348697552,
 0.4928053803045812,
 0.49253182747733903,
 0.48763326991479866,
 0.4860169569321385,
 0.4832852663179642,
 0.48135668755261835,
 0.4740978342319576,
 0.4738445616552978,
 0.47216152737861644,
 0.47216152737861644,
 0.471719105090836,
 0.4716502359605923,
 0.4697538642833272,
 0.46636389230979947,
 0.46609159969939906,
 0.46584809351110307,
 0.4652672527414932,
 0.4650088757552127,
 0.4647728745148459,
 0.4642437193886896,
 0.4626245672504971,
 0.4620051660412181,
 0.4618027971989033,
 0.4604027941678177,
 0.46013216766665177,
 0.4598709300997809,
 0.4590300531075913,
 0.4583570501370459,
 0.45658616498845617,
 0.45610862410814645,
 0.45610152809768234,
 0.45573271518765,
 0.4555610056468035,
 0.45525553465709245,
 0.4547940268270977,
 0.45325311587896006,
 0.45261539906029336,
 0.4519439697862639,
 0.4514983745154931,
 0.45135378373869667,
 0.45117557952464976,
 0.4507489358552089,
 0.45023455826

In [None]:
#### for i in distances[1:6]
   # print(new_movie_data.iloc[i[0]].title)

# distances[0] is the movie itself (similarity 1.0), so the slice [1:6] skips it and keeps the next five most similar movies
# the loop goes through these five entries one by one and prints the title stored at each index

In [69]:
recommend('Gandhi')

# just for observation: each similarity score is paired with a movie index
# looking up that index in new_movie_data gives us the title of the movie

[(2030, 1.0), (4388, 0.5493274114646123), (3808, 0.5104812758612466), (3518, 0.5030977485864635), (599, 0.4994391348697552), (3047, 0.4928053803045812), (4719, 0.49253182747733903), (2904, 0.48763326991479866), (3436, 0.4860169569321385), (4387, 0.4832852663179642), (4471, 0.48135668755261835), (1991, 0.4740978342319576), (4068, 0.4738445616552978), (2400, 0.47216152737861644), (4273, 0.47216152737861644), (3231, 0.471719105090836), (3520, 0.4716502359605923), (1793, 0.4697538642833272), (4325, 0.46636389230979947), (4500, 0.46609159969939906), (3547, 0.46584809351110307), (3233, 0.4652672527414932), (610, 0.4650088757552127), (1352, 0.4647728745148459), (3807, 0.4642437193886896), (274, 0.4626245672504971), (1959, 0.4620051660412181), (1007, 0.4618027971989033), (860, 0.4604027941678177), (3909, 0.46013216766665177), (4244, 0.4598709300997809), (2663, 0.4590300531075913), (4600, 0.4583570501370459), (3583, 0.45658616498845617), (2026, 0.45610862410814645), (1309, 0.45610152809768234),

In [71]:
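def recommend(movie):
    # .index[0] is a row label; it equals the row position in similarity only if the index runs 0..n-1
    index = new_movie_data[new_movie_data['title'] == movie].index[0]
    distances = sorted(list(enumerate(similarity[index])), reverse=True, key=lambda x: x[1])
    # skip distances[0] (the movie itself) and print the five closest titles
    for i in distances[1:6]:
        print(new_movie_data.iloc[i[0]].title)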

In [72]:
recommend('Gandhi')

Guiana 1838
1776
Winter in Wartime
Hart's War
End of the Spear


In [73]:
recommend('Batman Begins')

The Dark Knight
The Dark Knight Rises
Ironclad
The Midnight Meat Train
Gladiator
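A slightly more defensive variant of `recommend` is sketched below. It is not the notebook's function: `recommend_titles`, `toy_movies` and `toy_similarity` are hypothetical names, and the four-movie catalogue is made up so the block runs on its own. It looks the movie up by row position, so it still works if rows were dropped without resetting the index, and it returns an empty list for an unknown title where `.index[0]` would raise an error.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mini catalogue; the index deliberately has a gap and does not start at 0
toy_movies = pd.DataFrame(
    {
        "title": ["Space War", "Alien Planet", "Love Story", "Space Romance"],
        "Tag": [
            "space war alien battle",
            "alien planet space battle",
            "love romance wedding",
            "space love romance",
        ],
    },
    index=[10, 11, 13, 14],
)

toy_similarity = cosine_similarity(CountVectorizer().fit_transform(toy_movies["Tag"]))


def recommend_titles(movie, data, sim, top_n=5):
    """Return up to top_n titles most similar to movie, or [] if the title is unknown."""
    positions = np.flatnonzero((data["title"] == movie).to_numpy())
    if positions.size == 0:
        return []
    pos = positions[0]  # row position, independent of the index labels
    order = np.argsort(-sim[pos], kind="stable")  # most similar first
    order = order[order != pos][:top_n]  # drop the movie itself
    return data["title"].iloc[order].tolist()


print(recommend_titles("Space War", toy_movies, toy_similarity, top_n=2))
print(recommend_titles("Not A Movie", toy_movies, toy_similarity))
```

Returning a list, not printing inside the function, also makes the result easier to reuse, for example in the save and testing steps of the project flow.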
