# Project: Investigate a Dataset (TMDb_Movies Dataset)

## Table of Contents
<ul>
<li><a href="#intro">Introduction</a></li>
<li><a href="#wrangling">Data Wrangling</a></li>
<li><a href="#similarity">Similarity Matrix</a></li>
<li><a href="#conclusions">Conclusions</a></li>
</ul>

<a id='intro'></a>
## Introduction

>### **Overview**
>To complete my Data Analysis project, I am using the TMDb movies dataset.

>This data set contains information about roughly 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue. It consists of 21 columns, such as imdb_id, revenue, budget, and vote_count.

>#### **Questions that can be analysed from this data set**
> 1. Movies released in 2015 only
> 2. Successful genres
> 3. Similarity Matrix from features vote_average and Successful Genre without normalization
> 4. Similarity Matrix from features vote_average and Successful Genre with normalization
> 5. Get the top 10 movies similar to each movie.


In [1]:
#importing the required libraries
import pandas as pd
import numpy as np
import csv
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline


<a id='wrangling'></a>
## Data Wrangling

> After observing the dataset and the proposed questions for the analysis, we will keep only the relevant data and delete the unused data so that our calculations stay easy and understandable.

### General Properties

In [10]:
#loading the csv file and storing it in the variable "tmdb_data"
tmdb_data = pd.read_csv('tmdb-movies.csv')

#printing the first five rows of the tmdb-movies dataset
tmdb_data.head()

Unnamed: 0,id,imdb_id,popularity,budget,revenue,original_title,cast,homepage,director,tagline,...,overview,runtime,genres,production_companies,release_date,vote_count,vote_average,release_year,budget_adj,revenue_adj
0,135397,tt0369610,32.985763,150000000,1513528810,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,http://www.jurassicworld.com/,Colin Trevorrow,The park is open.,...,Twenty-two years after the events of Jurassic ...,124,Action|Adventure|Science Fiction|Thriller,Universal Studios|Amblin Entertainment|Legenda...,6/9/15,5562,6.5,2015,137999900.0,1392446000.0
1,76341,tt1392190,28.419936,150000000,378436354,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,http://www.madmaxmovie.com/,George Miller,What a Lovely Day.,...,An apocalyptic story set in the furthest reach...,120,Action|Adventure|Science Fiction|Thriller,Village Roadshow Pictures|Kennedy Miller Produ...,5/13/15,6185,7.1,2015,137999900.0,348161300.0
2,262500,tt2908446,13.112507,110000000,295238201,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,http://www.thedivergentseries.movie/#insurgent,Robert Schwentke,One Choice Can Destroy You,...,Beatrice Prior must confront her inner demons ...,119,Adventure|Science Fiction|Thriller,Summit Entertainment|Mandeville Films|Red Wago...,3/18/15,2480,6.3,2015,101200000.0,271619000.0
3,140607,tt2488496,11.173104,200000000,2068178225,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,http://www.starwars.com/films/star-wars-episod...,J.J. Abrams,Every generation has a story.,...,Thirty years after defeating the Galactic Empi...,136,Action|Adventure|Science Fiction|Fantasy,Lucasfilm|Truenorth Productions|Bad Robot,12/15/15,5292,7.5,2015,183999900.0,1902723000.0
4,168259,tt2820852,9.335014,190000000,1506249360,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,http://www.furious7.com/,James Wan,Vengeance Hits Home,...,Deckard Shaw seeks revenge against Dominic Tor...,137,Action|Crime|Thriller,Universal Pictures|Original Film|Media Rights ...,4/1/15,2947,7.3,2015,174799900.0,1385749000.0


### Research Question 1 : Movies released in 2015 only

In [11]:
# keep only movies released in 2015
tmdb_data=tmdb_data[(tmdb_data.release_year==2015)]
tmdb_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 629 entries, 0 to 628
Data columns (total 21 columns):
id                      629 non-null int64
imdb_id                 628 non-null object
popularity              629 non-null float64
budget                  629 non-null int64
revenue                 629 non-null int64
original_title          629 non-null object
cast                    621 non-null object
homepage                264 non-null object
director                626 non-null object
tagline                 435 non-null object
keywords                480 non-null object
overview                628 non-null object
runtime                 629 non-null int64
genres                  627 non-null object
production_companies    565 non-null object
release_date            629 non-null object
vote_count              629 non-null int64
vote_average            629 non-null float64
release_year            629 non-null int64
budget_adj              629 non-null float64
revenue_adj       

### Data Cleaning (Removing the unused information from the dataset)

> **Important observations regarding this process**
>
> 1. We need to remove unused columns such as id, imdb_id, vote_count, production_companies, keywords, homepage, etc.
> 2. We need to remove duplicate rows (if any).


> **1. Removing unused columns**
>
> **Columns that we need to delete are**  -  id, imdb_id, popularity, budget_adj, revenue_adj, homepage, keywords, overview, production_companies, vote_count, tagline, release_date, release_year, budget and revenue. vote_average is kept because the similarity matrix uses it.

In [12]:
#creating a list of columns to be deleted
del_col=[ 'id', 'imdb_id', 'popularity', 'budget_adj', 'revenue_adj', 'homepage', 'keywords', 'overview', 'production_companies', 'vote_count', 'tagline','release_date','release_year','budget','revenue']

#deleting the columns (the keyword form works across pandas versions; a positional axis argument does not)
tmdb_data = tmdb_data.drop(columns=del_col)

#previewing the new dataset
tmdb_data.head(4)

Unnamed: 0,original_title,cast,director,runtime,genres,vote_average
0,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,Colin Trevorrow,124,Action|Adventure|Science Fiction|Thriller,6.5
1,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,George Miller,120,Action|Adventure|Science Fiction|Thriller,7.1
2,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,Robert Schwentke,119,Adventure|Science Fiction|Thriller,6.3
3,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,J.J. Abrams,136,Action|Adventure|Science Fiction|Fantasy,7.5


> **2. Removing duplicate rows (if any).**
>
> Let's see how many entries we have in the dataset.

In [13]:
rows, col = tmdb_data.shape
# shape counts data rows only; the header row is not included, so no adjustment is needed.
print('There are {} total entries of movies and {} columns.'.format(rows, col))

There are 629 total entries of movies and 6 columns.


> Now removing the duplicate rows, if any!

In [14]:
tmdb_data.drop_duplicates(keep='first', inplace=True)
rows, col = tmdb_data.shape

print('There are now {} total entries of movies and {} columns.'.format(rows, col))

There are now 629 total entries of movies and 6 columns.


> The row count is unchanged, so the 2015 subset contained no duplicate rows.
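
> Comparing row counts before and after `drop_duplicates` is one way to check for duplicates. A minimal sketch of a more direct check, using a small made-up frame (the `toy` data below is hypothetical, not taken from the TMDb file):

```python
import pandas as pd

# Hypothetical toy frame with one exact duplicate row (rows 1 and 2 match).
toy = pd.DataFrame({
    "original_title": ["A", "B", "B", "C"],
    "runtime": [100, 90, 90, 120],
})

# duplicated() marks every repeat of an earlier row; sum() counts the marks.
n_duplicates = int(toy.duplicated().sum())

# Dropping duplicates should remove exactly that many rows.
deduped = toy.drop_duplicates(keep="first")
n_removed = len(toy) - len(deduped)

print(n_duplicates, n_removed)
```

> If the first number is 0, `drop_duplicates` will leave the data unchanged.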

In [15]:
#printing the data type of the data set
tmdb_data.dtypes

original_title     object
cast               object
director           object
runtime             int64
genres             object
vote_average      float64
dtype: object

### Research Question 2 : Successful Genres

In [16]:
#function that takes a column name and counts how often each '|'-separated value occurs in it
def data(column):
    #join every row of the column into one long string, using '|' as the separator
    data = tmdb_data[column].str.cat(sep = '|')
    
    #split the string on '|' and store the individual values in a pandas Series
    data = pd.Series(data.split('|'))
    
    #count each value; the counts are sorted in descending order
    count = data.value_counts(ascending = False)
    
    return count

In [17]:
#variable to store the returned value
count = data('genres')
#printing the top 5 values
count.head()

Drama       260
Thriller    171
Comedy      162
Horror      125
Action      107
dtype: int64
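
> The same kind of count can be obtained without building one long string. A minimal sketch, assuming a pandas version that provides `Series.explode` (0.25 or later); the `genres` series below is made-up sample data, not the real column:

```python
import pandas as pd

# Hypothetical sample of a '|'-separated genres column, including a missing value.
genres = pd.Series(["Action|Thriller", "Drama", None, "Drama|Thriller"])

# Split each entry into a list, give every genre its own row, then count.
# Missing entries stay NaN after the split and are ignored by value_counts.
counts = genres.str.split("|").explode().value_counts()

print(counts)
```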

In [18]:
#separate the genres of each movie into indicator (0/1) columns

df = tmdb_data['genres'].str.get_dummies('|')
print(df)



     Action  Adventure  Animation  Comedy  Crime  Documentary  Drama  Family  \
0         1          1          0       0      0            0      0       0   
1         1          1          0       0      0            0      0       0   
2         0          1          0       0      0            0      0       0   
3         1          1          0       0      0            0      0       0   
4         1          0          0       0      1            0      0       0   
5         0          1          0       0      0            0      1       0   
6         1          1          0       0      0            0      0       0   
7         0          1          0       0      0            0      1       0   
8         0          1          1       1      0            0      0       1   
9         0          0          1       1      0            0      0       1   
10        1          1          0       0      1            0      0       0   
11        1          1          0       

In [19]:
tmdb_data=pd.concat([tmdb_data, df], axis=1)
tmdb_data.head()

Unnamed: 0,original_title,cast,director,runtime,genres,vote_average,Action,Adventure,Animation,Comedy,...,History,Horror,Music,Mystery,Romance,Science Fiction,TV Movie,Thriller,War,Western
0,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,Colin Trevorrow,124,Action|Adventure|Science Fiction|Thriller,6.5,1,1,0,0,...,0,0,0,0,0,1,0,1,0,0
1,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,George Miller,120,Action|Adventure|Science Fiction|Thriller,7.1,1,1,0,0,...,0,0,0,0,0,1,0,1,0,0
2,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,Robert Schwentke,119,Adventure|Science Fiction|Thriller,6.3,0,1,0,0,...,0,0,0,0,0,1,0,1,0,0
3,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,J.J. Abrams,136,Action|Adventure|Science Fiction|Fantasy,7.5,1,1,0,0,...,0,0,0,0,0,1,0,0,0,0
4,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,James Wan,137,Action|Crime|Thriller,7.3,1,0,0,0,...,0,0,0,0,0,0,0,1,0,0


In [20]:
# keep only movies that belong to at least one successful genre (Drama/Comedy/Thriller/Action/Horror).

tmdb_data_new=tmdb_data[(tmdb_data.Drama==1) | (tmdb_data.Comedy==1) | (tmdb_data.Thriller==1 )| (tmdb_data.Action==1) | (tmdb_data.Horror==1)]
tmdb_data_new

Unnamed: 0,original_title,cast,director,runtime,genres,vote_average,Action,Adventure,Animation,Comedy,...,History,Horror,Music,Mystery,Romance,Science Fiction,TV Movie,Thriller,War,Western
0,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,Colin Trevorrow,124,Action|Adventure|Science Fiction|Thriller,6.5,1,1,0,0,...,0,0,0,0,0,1,0,1,0,0
1,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,George Miller,120,Action|Adventure|Science Fiction|Thriller,7.1,1,1,0,0,...,0,0,0,0,0,1,0,1,0,0
2,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,Robert Schwentke,119,Adventure|Science Fiction|Thriller,6.3,0,1,0,0,...,0,0,0,0,0,1,0,1,0,0
3,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,J.J. Abrams,136,Action|Adventure|Science Fiction|Fantasy,7.5,1,1,0,0,...,0,0,0,0,0,1,0,0,0,0
4,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,James Wan,137,Action|Crime|Thriller,7.3,1,0,0,0,...,0,0,0,0,0,0,0,1,0,0
5,The Revenant,Leonardo DiCaprio|Tom Hardy|Will Poulter|Domhn...,Alejandro González Iñárritu,156,Western|Drama|Adventure|Thriller,7.2,0,1,0,0,...,0,0,0,0,0,0,0,1,0,1
6,Terminator Genisys,Arnold Schwarzenegger|Jason Clarke|Emilia Clar...,Alan Taylor,125,Science Fiction|Action|Thriller|Adventure,5.8,1,1,0,0,...,0,0,0,0,0,1,0,1,0,0
7,The Martian,Matt Damon|Jessica Chastain|Kristen Wiig|Jeff ...,Ridley Scott,141,Drama|Adventure|Science Fiction,7.6,0,1,0,0,...,0,0,0,0,0,1,0,0,0,0
8,Minions,Sandra Bullock|Jon Hamm|Michael Keaton|Allison...,Kyle Balda|Pierre Coffin,91,Family|Animation|Adventure|Comedy,6.5,0,1,1,1,...,0,0,0,0,0,0,0,0,0,0
9,Inside Out,Amy Poehler|Phyllis Smith|Richard Kind|Bill Ha...,Pete Docter,94,Comedy|Animation|Family,8.0,0,0,1,1,...,0,0,0,0,0,0,0,0,0,0
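
> The chained `|` conditions can also be written against a list of genre names, which is easier to change later. A minimal sketch on a made-up frame (the `toy` data and the `top_genres` name are illustrative assumptions):

```python
import pandas as pd

# Hypothetical list of the genres we want to keep.
top_genres = ["Drama", "Comedy", "Thriller", "Action", "Horror"]

# Hypothetical indicator columns for three movies; "B" is Animation only.
toy = pd.DataFrame({
    "original_title": ["A", "B", "C"],
    "Drama": [1, 0, 0],
    "Comedy": [0, 0, 0],
    "Thriller": [0, 0, 1],
    "Action": [0, 0, 0],
    "Horror": [0, 0, 0],
    "Animation": [0, 1, 0],
})

# A row is kept when at least one of the listed genre columns equals 1.
mask = toy[top_genres].eq(1).any(axis=1)
kept = toy[mask]

print(kept["original_title"].tolist())
```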


In [21]:
# drop the other genre columns.
# We only explore the Drama/Comedy/Thriller/Action/Horror genres.
tmdb_data_new=tmdb_data_new[['original_title','cast','director','runtime','vote_average','Drama','Comedy','Thriller','Action','Horror']]
tmdb_data_new.head()

Unnamed: 0,original_title,cast,director,runtime,vote_average,Drama,Comedy,Thriller,Action,Horror
0,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,Colin Trevorrow,124,6.5,0,0,1,1,0
1,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,George Miller,120,7.1,0,0,1,1,0
2,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,Robert Schwentke,119,6.3,0,0,1,0,0
3,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,J.J. Abrams,136,7.5,0,0,0,1,0
4,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,James Wan,137,7.3,0,0,1,1,0


In [82]:
tmdb_data_new.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 558 entries, 0 to 625
Data columns (total 10 columns):
original_title    558 non-null object
cast              557 non-null object
director          556 non-null object
runtime           558 non-null int64
vote_average      558 non-null float64
Drama             558 non-null int64
Comedy            558 non-null int64
Thriller          558 non-null int64
Action            558 non-null int64
Horror            558 non-null int64
dtypes: float64(1), int64(6), object(3)
memory usage: 48.0+ KB


<a id='similarity'></a>
## Similarity Matrix

### Research Question 3 : Similarity Matrix from features vote_average and Successful Genre without normalization

In [22]:
tmdb_data_new.index=tmdb_data_new['original_title']
tmdb_data_new.head()

original_title,original_title,cast,director,runtime,vote_average,Drama,Comedy,Thriller,Action,Horror
Jurassic World,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,Colin Trevorrow,124,6.5,0,0,1,1,0
Mad Max: Fury Road,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,George Miller,120,7.1,0,0,1,1,0
Insurgent,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,Robert Schwentke,119,6.3,0,0,1,0,0
Star Wars: The Force Awakens,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,J.J. Abrams,136,7.5,0,0,0,1,0
Furious 7,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,James Wan,137,7.3,0,0,1,1,0


In [23]:
similarity=tmdb_data_new.iloc[:,4:10]

In [25]:
similarity.head(10)

original_title,vote_average,Drama,Comedy,Thriller,Action,Horror
Jurassic World,6.5,0,0,1,1,0
Mad Max: Fury Road,7.1,0,0,1,1,0
Insurgent,6.3,0,0,1,0,0
Star Wars: The Force Awakens,7.5,0,0,0,1,0
Furious 7,7.3,0,0,1,1,0
The Revenant,7.2,1,0,1,0,0
Terminator Genisys,5.8,0,0,1,1,0
The Martian,7.6,1,0,0,0,0
Minions,6.5,0,1,0,0,0
Inside Out,8.0,0,1,0,0,0


In [26]:
from sklearn.metrics.pairwise import cosine_similarity
similarity_matrix=pd.DataFrame(cosine_similarity(similarity,similarity), columns=similarity.index, index=similarity.index)

In [27]:
similarity_matrix

original_title,Jurassic World,Mad Max: Fury Road,Insurgent,Star Wars: The Force Awakens,Furious 7,The Revenant,Terminator Genisys,The Martian,Minions,Inside Out,...,King Jack,The Suicide Theory,Stalked By My Neighbor,The Outfield,Buzzard,The Gamechangers,Residue,Always Watching: A Marble Hornets Story,Once I Was a Beehive,John Mulaney: The Comeback Kid
Jurassic World,1.000000,0.999845,0.988625,0.988436,0.999738,0.979307,0.999689,0.968789,0.965777,0.969594,...,0.966742,0.977401,0.988270,0.955452,0.939982,0.967319,0.964059,0.950568,0.954802,0.966435
Mad Max: Fury Road,0.999845,1.000000,0.990262,0.990387,0.999986,0.981171,0.999095,0.972353,0.969330,0.973161,...,0.970298,0.979080,0.989624,0.958966,0.943440,0.970877,0.964902,0.954064,0.958314,0.969990
Insurgent,0.988625,0.990262,1.000000,0.978972,0.990691,0.990483,0.985784,0.979195,0.976151,0.980009,...,0.977126,0.988625,0.999748,0.965715,0.950079,0.977709,0.975437,0.960778,0.965058,0.976815
Star Wars: The Force Awakens,0.988436,0.990387,0.978972,1.000000,0.990909,0.972643,0.985152,0.982757,0.979702,0.983574,...,0.980680,0.968568,0.975239,0.969227,0.953534,0.981266,0.945630,0.964273,0.968568,0.980368
Furious 7,0.999738,0.999986,0.990691,0.990909,1.000000,0.981668,0.998858,0.973357,0.970331,0.974166,...,0.971300,0.979521,0.989969,0.959957,0.944414,0.971880,0.965096,0.955050,0.959304,0.970991
The Revenant,0.979307,0.981171,0.990483,0.972643,0.981668,1.000000,0.976149,0.990644,0.969840,0.973673,...,0.990638,0.999794,0.989802,0.979662,0.965429,0.990662,0.965004,0.954567,0.979307,0.970500
Terminator Genisys,0.999689,0.999095,0.985784,0.985152,0.998858,0.976149,1.000000,0.963234,0.960239,0.964034,...,0.961199,0.974508,0.985830,0.949973,0.934592,0.961772,0.962355,0.945117,0.949327,0.960893
The Martian,0.968789,0.972353,0.979195,0.982757,0.973357,0.990644,0.963234,1.000000,0.979925,0.983798,...,0.999885,0.988401,0.975462,0.988776,0.974325,0.999939,0.945846,0.964493,0.988401,0.980592
Minions,0.965777,0.969330,0.976151,0.979702,0.970331,0.969840,0.960239,0.979925,1.000000,0.999600,...,0.977855,0.965777,0.972429,0.988962,0.974766,0.978438,0.942905,0.961494,0.988636,0.999990
Inside Out,0.969594,0.973161,0.980009,0.983574,0.974166,0.973673,0.964034,0.983798,0.999600,1.000000,...,0.981719,0.969594,0.976272,0.988630,0.974105,0.982305,0.946631,0.965294,0.988240,0.999717
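
> Every value in this matrix is close to 1 because vote_average (roughly 6 to 8 for these movies) is much larger than the 0/1 genre flags, so it dominates each vector. A minimal sketch that recomputes one entry by hand from the feature rows shown earlier (the `cosine` helper is written here for illustration and is not the scikit-learn function):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Feature order: vote_average, Drama, Comedy, Thriller, Action, Horror
jurassic_world = [6.5, 0, 0, 1, 1, 0]
minions = [6.5, 0, 1, 0, 0, 0]

# The two movies share no genre, yet the score stays near 1
# because 6.5 * 6.5 dwarfs the genre terms.
score = cosine(jurassic_world, minions)
print(round(score, 4))
```

> The result agrees with the Jurassic World / Minions entry above (about 0.9658), even though the two movies have no genre in common.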


### Research Question 4 : Similarity Matrix from features vote_average and Successful Genre with normalization

In [29]:
#Normalizing the data

from sklearn import preprocessing

x = similarity.values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
#reuse the original column order; a set literal is unordered and can mislabel the columns
datanorm = pd.DataFrame(x_scaled, index=similarity.index, columns=similarity.columns)
datanorm

original_title,vote_average,Drama,Comedy,Thriller,Action,Horror
Jurassic World,0.706897,0.0,0.0,1.0,1.0,0.0
Mad Max: Fury Road,0.810345,0.0,0.0,1.0,1.0,0.0
Insurgent,0.672414,0.0,0.0,1.0,0.0,0.0
Star Wars: The Force Awakens,0.879310,0.0,0.0,0.0,1.0,0.0
Furious 7,0.844828,0.0,0.0,1.0,1.0,0.0
The Revenant,0.827586,1.0,0.0,1.0,0.0,0.0
Terminator Genisys,0.586207,0.0,0.0,1.0,1.0,0.0
The Martian,0.896552,1.0,0.0,0.0,0.0,0.0
Minions,0.706897,0.0,1.0,0.0,0.0,0.0
Inside Out,0.965517,0.0,1.0,0.0,0.0,0.0
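
> Min-max scaling maps each column to the range 0 to 1 with (x - min) / (max - min). The genre flags are already 0/1, so only vote_average changes. A minimal sketch of the formula in plain Python; the minimum 2.4 and maximum 8.2 used here are assumptions inferred from the scaled values in the output, not numbers stated in the dataset description:

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Assumed vote_average extremes (2.4 and 8.2) plus two ratings from the table.
votes = [2.4, 6.5, 7.1, 8.2]
scaled = min_max_scale(votes)

# 6.5 should land near 0.7069 and 7.1 near 0.8103, as in the output table.
print([round(s, 4) for s in scaled])
```

> After scaling, a genre flag of 1 weighs at least as much as the rating, which is why the normalized similarity scores spread out far more than the unnormalized ones.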


In [30]:
#cosine_similarity was already imported from sklearn.metrics.pairwise above
similarity_matrix_norm=pd.DataFrame(cosine_similarity(datanorm,datanorm), columns=datanorm.index, index=datanorm.index)

In [32]:
similarity_matrix_norm.head(10)

original_title,Jurassic World,Mad Max: Fury Road,Insurgent,Star Wars: The Force Awakens,Furious 7,The Revenant,Terminator Genisys,The Martian,Minions,Inside Out,...,King Jack,The Suicide Theory,Stalked By My Neighbor,The Outfield,Buzzard,The Gamechangers,Residue,Always Watching: A Marble Hornets Story,Once I Was a Beehive,John Mulaney: The Comeback Kid
Jurassic World,1.0,0.998387,0.774354,0.770225,0.997191,0.611823,0.997511,0.298464,0.258086,0.310558,...,0.270226,0.599952,0.768572,0.203777,0.154526,0.27783,0.544158,0.132522,0.199905,0.266279
Mad Max: Fury Road,0.998387,1.0,0.786547,0.789036,0.999835,0.62553,0.991898,0.331881,0.286982,0.345329,...,0.300481,0.610337,0.77544,0.226593,0.171827,0.308937,0.543581,0.14736,0.222287,0.296092
Insurgent,0.774354,0.786547,1.0,0.368466,0.789912,0.788271,0.755731,0.372489,0.322096,0.387582,...,0.337246,0.774354,0.994892,0.254318,0.192851,0.346737,0.706848,0.16539,0.249485,0.33232
Star Wars: The Force Awakens,0.770225,0.789036,0.368466,1.0,0.794517,0.333514,0.743398,0.440804,0.38117,0.458666,...,0.399098,0.295241,0.311267,0.30096,0.228221,0.410329,0.163778,0.195723,0.295241,0.393269
Furious 7,0.997191,0.999835,0.789912,0.794517,1.0,0.62949,0.989428,0.342346,0.296031,0.356218,...,0.309955,0.613244,0.777109,0.233738,0.177245,0.318678,0.543026,0.152006,0.229296,0.305428
The Revenant,0.611823,0.62553,0.788271,0.333514,0.62949,1.0,0.592048,0.791558,0.291543,0.350817,...,0.791468,0.997827,0.776311,0.614307,0.505196,0.792007,0.543324,0.149702,0.611823,0.300797
Terminator Genisys,0.997511,0.991898,0.755731,0.743398,0.989428,0.592048,1.0,0.255615,0.221034,0.265973,...,0.23143,0.584358,0.756588,0.174522,0.132342,0.237943,0.542431,0.113497,0.171205,0.22805
The Martian,0.298464,0.331881,0.372489,0.440804,0.342346,0.791558,0.255615,1.0,0.385331,0.463673,...,0.996647,0.769399,0.314666,0.772874,0.634099,0.998177,0.165566,0.19786,0.769399,0.397562
Minions,0.258086,0.286982,0.322096,0.38117,0.296031,0.291543,0.221034,0.385331,1.0,0.988392,...,0.348874,0.258086,0.272096,0.777035,0.641899,0.358691,0.143167,0.171092,0.774566,0.999744
Inside Out,0.310558,0.345329,0.387582,0.458666,0.356218,0.350817,0.265973,0.463673,0.988392,1.0,...,0.419804,0.310558,0.327416,0.769362,0.629813,0.431617,0.172275,0.205877,0.765575,0.991575


### Research Question 5 : Get Top 10 movies similar to each movie. 

In [None]:
# for each row, take the 10 entries with the highest similarity values
# output :
# Jurassic World is most similar to Mad Max: Fury Road, Furious 7, Terminator Genisys, ... , .... 
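
> One possible way to do this is to read a movie's row from the similarity matrix, drop the movie itself, and keep the largest values. A minimal sketch on a made-up 4x4 matrix; the `top_similar` helper and the `toy` data are hypothetical, and the sketch assumes every title appears only once in the index:

```python
import pandas as pd

def top_similar(sim_matrix, title, n=10):
    """Return the n titles most similar to `title`, excluding the title itself."""
    # Take the row for this title and remove its self-similarity entry.
    scores = sim_matrix.loc[title].drop(labels=title)
    # nlargest keeps the n highest scores, sorted from most to least similar.
    return scores.nlargest(n)

# Hypothetical symmetric similarity matrix for four movies.
titles = ["A", "B", "C", "D"]
toy = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.5],
     [0.9, 1.0, 0.3, 0.4],
     [0.2, 0.3, 1.0, 0.8],
     [0.5, 0.4, 0.8, 1.0]],
    index=titles, columns=titles,
)

result = top_similar(toy, "A", n=2)
print(result)
```

> On the real data the call would look like `top_similar(similarity_matrix_norm, 'Jurassic World')`, provided the titles in the index are unique.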


<a id='conclusions'></a>
## Conclusions
> 
