# Top Earners in the Movie Industry

## Table of Contents

<ul>
    <li><a href="#intro">Introduction</a></li>
    <li><a href="#eda">Exploratory Data Analysis</a></li>
    <li><a href="#conclusions">Conclusions</a></li>
</ul>

<a id="intro"></a>
## Introduction

> This project analyzes the IMDb movie data. When the analysis is complete, you should be able to identify the top 5 highest-grossing directors and the top 5 highest-grossing movie genres of all time, compare the revenue of the highest-grossing movies, and determine which companies released the most movies.

> There are 10 columns that will not be needed for the analysis. Use pandas to drop these columns. HINT: Only the columns pertaining to revenue will be needed.

> To get you started, I've already included the code that installs the packages and loads the data file you will use for the project.

In [93]:
!pip install numpy
!pip install pandas
import csv
import pandas as pd
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plt



In [94]:
def open_csv(filename, d = ','):
    data = []
    with open(filename, encoding = 'utf-8') as mData:
        info = csv.reader(mData, delimiter = d)
        for row in info:
            data.append(row)
    return data
csv_data = open_csv('moviecsv/imdb-movies.csv')
print(csv_data[5:6])

[['168259', 'tt2820852', '9.335014', '190000000', '1506249360', 'Furious 7', 'Vin Diesel|Paul Walker|Jason Statham|Michelle Rodriguez|Dwayne Johnson', 'http://www.furious7.com/', 'James Wan', 'Vengeance Hits Home', 'car race|speed|revenge|suspense|car', 'Deckard Shaw seeks revenge against Dominic Toretto and his family for his comatose brother.', '137', 'Action|Crime|Thriller', 'Universal Pictures|Original Film|Media Rights Capital|Dentsu|One Race Films', '4/1/2015', '2947', '7.3', '2015', '174799923.1', '1385748801']]


In [95]:
movielist = pd.read_csv('moviecsv/imdb-movies.csv', sep=',')
movielist.head(5)

Unnamed: 0,id,imdb_id,popularity,budget,revenue,original_title,cast,homepage,director,tagline,...,overview,runtime,genres,production_companies,release_date,vote_count,vote_average,release_year,budget_adj,revenue_adj
0,135397,tt0369610,32.985763,150000000,1513528810,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,http://www.jurassicworld.com/,Colin Trevorrow,The park is open.,...,Twenty-two years after the events of Jurassic ...,124,Action|Adventure|Science Fiction|Thriller,Universal Studios|Amblin Entertainment|Legenda...,6/9/2015,5562,6.5,2015,137999939.3,1392446000.0
1,76341,tt1392190,28.419936,150000000,378436354,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,http://www.madmaxmovie.com/,George Miller,What a Lovely Day.,...,An apocalyptic story set in the furthest reach...,120,Action|Adventure|Science Fiction|Thriller,Village Roadshow Pictures|Kennedy Miller Produ...,5/13/2015,6185,7.1,2015,137999939.3,348161300.0
2,262500,tt2908446,13.112507,110000000,295238201,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,http://www.thedivergentseries.movie/#insurgent,Robert Schwentke,One Choice Can Destroy You,...,Beatrice Prior must confront her inner demons ...,119,Adventure|Science Fiction|Thriller,Summit Entertainment|Mandeville Films|Red Wago...,3/18/2015,2480,6.3,2015,101199955.5,271619000.0
3,140607,tt2488496,11.173104,200000000,2068178225,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,http://www.starwars.com/films/star-wars-episod...,J.J. Abrams,Every generation has a story.,...,Thirty years after defeating the Galactic Empi...,136,Action|Adventure|Science Fiction|Fantasy,Lucasfilm|Truenorth Productions|Bad Robot,12/15/2015,5292,7.5,2015,183999919.0,1902723000.0
4,168259,tt2820852,9.335014,190000000,1506249360,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,http://www.furious7.com/,James Wan,Vengeance Hits Home,...,Deckard Shaw seeks revenge against Dominic Tor...,137,Action|Crime|Thriller,Universal Pictures|Original Film|Media Rights ...,4/1/2015,2947,7.3,2015,174799923.1,1385749000.0


### Drop columns without necessary information and remove all records with no financial information. Pay close attention to columns that tell you nothing about the financial data.

In [96]:
film_financial_data = movielist[['revenue_adj','original_title']].sort_values(['revenue_adj'], ascending=True).reset_index(drop=True)
film_financial_data

Unnamed: 0,revenue_adj,original_title
0,0.000000e+00,Manos: The Hands of Fate
1,0.000000e+00,Kontroll
2,0.000000e+00,Dead End
3,0.000000e+00,The Statement
4,0.000000e+00,Scooby-Doo! And the Legend of the Vampire
...,...,...
10861,1.907006e+09,Jaws
10862,2.167325e+09,The Exorcist
10863,2.506406e+09,Titanic
10864,2.789712e+09,Star Wars


### Data Cleaning

In [97]:
movielist_wipe = movielist[movielist['revenue'] >= 1]
movielist_wipe

Unnamed: 0,id,imdb_id,popularity,budget,revenue,original_title,cast,homepage,director,tagline,...,overview,runtime,genres,production_companies,release_date,vote_count,vote_average,release_year,budget_adj,revenue_adj
0,135397,tt0369610,32.985763,150000000,1513528810,Jurassic World,Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi...,http://www.jurassicworld.com/,Colin Trevorrow,The park is open.,...,Twenty-two years after the events of Jurassic ...,124,Action|Adventure|Science Fiction|Thriller,Universal Studios|Amblin Entertainment|Legenda...,6/9/2015,5562,6.5,2015,1.379999e+08,1.392446e+09
1,76341,tt1392190,28.419936,150000000,378436354,Mad Max: Fury Road,Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic...,http://www.madmaxmovie.com/,George Miller,What a Lovely Day.,...,An apocalyptic story set in the furthest reach...,120,Action|Adventure|Science Fiction|Thriller,Village Roadshow Pictures|Kennedy Miller Produ...,5/13/2015,6185,7.1,2015,1.379999e+08,3.481613e+08
2,262500,tt2908446,13.112507,110000000,295238201,Insurgent,Shailene Woodley|Theo James|Kate Winslet|Ansel...,http://www.thedivergentseries.movie/#insurgent,Robert Schwentke,One Choice Can Destroy You,...,Beatrice Prior must confront her inner demons ...,119,Adventure|Science Fiction|Thriller,Summit Entertainment|Mandeville Films|Red Wago...,3/18/2015,2480,6.3,2015,1.012000e+08,2.716190e+08
3,140607,tt2488496,11.173104,200000000,2068178225,Star Wars: The Force Awakens,Harrison Ford|Mark Hamill|Carrie Fisher|Adam D...,http://www.starwars.com/films/star-wars-episod...,J.J. Abrams,Every generation has a story.,...,Thirty years after defeating the Galactic Empi...,136,Action|Adventure|Science Fiction|Fantasy,Lucasfilm|Truenorth Productions|Bad Robot,12/15/2015,5292,7.5,2015,1.839999e+08,1.902723e+09
4,168259,tt2820852,9.335014,190000000,1506249360,Furious 7,Vin Diesel|Paul Walker|Jason Statham|Michelle ...,http://www.furious7.com/,James Wan,Vengeance Hits Home,...,Deckard Shaw seeks revenge against Dominic Tor...,137,Action|Crime|Thriller,Universal Pictures|Original Film|Media Rights ...,4/1/2015,2947,7.3,2015,1.747999e+08,1.385749e+09
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10822,396,tt0061184,0.670274,7500000,33736689,Who's Afraid of Virginia Woolf?,Elizabeth Taylor|Richard Burton|George Segal|S...,,Mike Nichols,You are cordially invited to George and Martha...,...,Mike Nichols’ film from Edward Albee's play ...,131,Drama,Chenault Productions,6/21/1966,74,7.5,1966,5.038511e+07,2.266436e+08
10828,5780,tt0061107,0.402730,3000000,13000000,Torn Curtain,Paul Newman|Julie Andrews|Lila Kedrova|Hansjö...,,Alfred Hitchcock,It tears you apart with suspense!,...,An American scientist publicly defects to East...,128,Mystery|Thriller,Universal Pictures,7/13/1966,46,6.3,1966,2.015404e+07,8.733419e+07
10829,6644,tt0061619,0.395668,4653000,6000000,El Dorado,John Wayne|Robert Mitchum|James Caan|Charlene ...,,Howard Hawks,It's the Big One with the Big Two,...,"Cole Thornton, a gunfighter for hire, joins fo...",120,Action|Western,Paramount Pictures|Laurel Productions,12/17/1966,36,6.9,1966,3.125892e+07,4.030809e+07
10835,5923,tt0060934,0.299911,12000000,20000000,The Sand Pebbles,Steve McQueen|Richard Attenborough|Richard Cre...,,Robert Wise,This is the heroic story of the men on the U.S...,...,Engineer Jake Holman arrives aboard the gunboa...,182,Action|Adventure|Drama|War|Romance,Twentieth Century Fox Film Corporation|Solar P...,12/20/1966,28,7.0,1966,8.061618e+07,1.343603e+08
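
The two cleaning steps described above (dropping columns that carry no financial information, then removing records without revenue) could be sketched as follows. This is a minimal illustration on a made-up stand-in frame, not the notebook's own solution: the choice of `homepage` and `tagline` as the unneeded columns, the placeholder values, and the names `movies`, `unneeded`, and `cleaned` are all assumptions.

```python
import pandas as pd

# Toy stand-in for the movie data; the placeholder homepage/tagline values are made up.
movies = pd.DataFrame({
    "original_title": ["Avatar", "Jaws", "Kontroll"],
    "homepage": ["placeholder-a", None, None],
    "tagline": ["placeholder-1", "placeholder-2", "placeholder-3"],
    "revenue_adj": [2827124000.0, 1907006000.0, 0.0],
})

# Assumed list of columns with no bearing on revenue; adjust to your own reading of the data.
unneeded = ["homepage", "tagline"]
trimmed = movies.drop(columns=unneeded)

# Keep only records that actually report revenue, then renumber the rows.
cleaned = trimmed[trimmed["revenue_adj"] > 0].reset_index(drop=True)
print(cleaned)
```

Filtering on `revenue_adj > 0` rather than checking for missing values reflects how the sorted output earlier in the notebook shows missing revenue recorded as `0.0`.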


#### Here's a helpful hint from my own analysis when I ran this the first time. This may help shed light on what your data set should look like.

#### If I created one record for each of the `production_companies` a movie was released under and one record for each of its `genres`<br>and then tried to run calculations, it wouldn't work, because for many records the number of `production_companies`<br>and the number of `genres` aren't the same. So I'll create 2 dataframes: one without a `production_companies` column and one without a `genres` column.

In [98]:
films = movielist['production_companies']
films

#genres_movies = movielist['genres']
#genres_movies

0        Universal Studios|Amblin Entertainment|Legenda...
1        Village Roadshow Pictures|Kennedy Miller Produ...
2        Summit Entertainment|Mandeville Films|Red Wago...
3                Lucasfilm|Truenorth Productions|Bad Robot
4        Universal Pictures|Original Film|Media Rights ...
                               ...                        
10861                                    Bruce Brown Films
10862    Cherokee Productions|Joel Productions|Douglas ...
10863                                              Mosfilm
10864                              Benedict Pictures Corp.
10865                                            Norm-Iris
Name: production_companies, Length: 10866, dtype: object
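
One way to build the two dataframes described in the hint is to split each pipe-delimited column into a list and expand it to one row per value with `explode`. The sketch below uses two rows taken from the notebook's own output; the helper name `one_row_per_value` and the variable names are hypothetical, and this is only one possible approach.

```python
import pandas as pd

# Two sample rows, using values that appear in the notebook output.
movies = pd.DataFrame({
    "original_title": ["Star Wars", "The Exorcist"],
    "genres": ["Adventure|Action|Science Fiction", "Drama|Horror|Thriller"],
    "production_companies": [
        "Lucasfilm|Twentieth Century Fox Film Corporation",
        "Warner Bros.|Hoya Productions",
    ],
    "revenue_adj": [2789712000.0, 2167325000.0],
})

def one_row_per_value(df, column, drop):
    """Drop the other multi-valued column, then expand `column` to one row per value."""
    out = df.drop(columns=[drop]).copy()
    out[column] = out[column].str.split("|")
    return out.explode(column).reset_index(drop=True)

# One frame without `genres`, one without `production_companies`.
by_company = one_row_per_value(movies, "production_companies", drop="genres")
by_genre = one_row_per_value(movies, "genres", drop="production_companies")

print(by_company)
print(by_genre)
```

Keeping the two expansions in separate frames avoids pairing every company with every genre, which would multiply rows and distort any sums.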

<a id="eda"></a>
## Exploratory Data Analysis

> Use Matplotlib to display your data analysis

### Which production companies released the most movies in the last 10 years? Display the top 5 production companies.

<ol>
    <li>Ingenious Film Partners|Twentieth Century Fox Film Corporation|Dune Entertainment|Lightstorm Entertainment</li>
    <li>Lucasfilm|Twentieth Century Fox Film Corporation</li>
    <li>Paramount Pictures|Twentieth Century Fox Film Corporation|Lightstorm Entertainment</li>
    <li>Warner Bros.|Hoya Productions</li>
    <li>Universal Pictures|Zanuck/Brown Productions</li>
</ol>

In [99]:
highestfilm = movielist[['revenue_adj', 'production_companies','original_title']].sort_values(['revenue_adj'], ascending=False).reset_index(drop=True)
highestfilm.head(5)

Unnamed: 0,revenue_adj,production_companies,original_title
0,2827124000.0,Ingenious Film Partners|Twentieth Century Fox ...,Avatar
1,2789712000.0,Lucasfilm|Twentieth Century Fox Film Corporation,Star Wars
2,2506406000.0,Paramount Pictures|Twentieth Century Fox Film ...,Titanic
3,2167325000.0,Warner Bros.|Hoya Productions,The Exorcist
4,1907006000.0,Universal Pictures|Zanuck/Brown Productions,Jaws
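
The cell above ranks films by `revenue_adj`, whereas the question asks how many movies each company released. A release count could be sketched as below. The rows are made-up placeholders (`Film A`, `Studio X`, and so on), and reading "the last 10 years" as the ten most recent release years present in the data is an assumption.

```python
import pandas as pd

# Made-up placeholder rows; only the column names match the real data.
movies = pd.DataFrame({
    "original_title": ["Film A", "Film B", "Film C", "Film D", "Film E"],
    "release_year": [2015, 2012, 2008, 1990, 2006],
    "production_companies": [
        "Studio X|Studio Y",
        "Studio X",
        "Studio Y|Studio Z",
        "Studio Z",
        "Studio X|Studio Z",
    ],
})

# Assumption: the window is the 10 most recent release years in the data.
latest_year = movies["release_year"].max()
recent = movies[movies["release_year"] > latest_year - 10]

# One entry per (movie, company) pair, then count how often each company appears.
company_counts = (
    recent["production_companies"]
    .str.split("|")
    .explode()
    .value_counts()
)
top5_companies = company_counts.head(5)
print(top5_companies)
```

Because `value_counts` ignores missing values by default, movies with no listed production company would simply not contribute to the tally.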


### What 5 movie genres grossed the highest all-time?

<ol>
    <li>Action|Adventure|Fantasy|Science Fiction</li>
    <li>Adventure|Action|Science Fiction</li>
    <li>Drama|Romance|Thriller</li>
    <li>Drama|Horror|Thriller</li>
    <li>Horror|Thriller|Adventure</li>
</ol>

In [100]:
bestgenre = movielist[['revenue_adj', 'genres', 'imdb_id']].sort_values(['revenue_adj', 'genres'], ascending=False).reset_index(drop=True)
bestgenre.head(5)

Unnamed: 0,revenue_adj,genres,imdb_id
0,2827124000.0,Action|Adventure|Fantasy|Science Fiction,tt0499549
1,2789712000.0,Adventure|Action|Science Fiction,tt0076759
2,2506406000.0,Drama|Romance|Thriller,tt0120338
3,2167325000.0,Drama|Horror|Thriller,tt0070047
4,1907006000.0,Horror|Thriller|Adventure,tt0073195
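
The list above treats each film's full genre string as a single genre. An alternative reading of the question is to total revenue per individual genre. The sketch below does that for only the five rows shown in the output, so its ranking is illustrative and says nothing about the full data set; the variable names are hypothetical.

```python
import pandas as pd

# The five rows from the output above (revenue_adj as displayed).
top_films = pd.DataFrame({
    "imdb_id": ["tt0499549", "tt0076759", "tt0120338", "tt0070047", "tt0073195"],
    "genres": [
        "Action|Adventure|Fantasy|Science Fiction",
        "Adventure|Action|Science Fiction",
        "Drama|Romance|Thriller",
        "Drama|Horror|Thriller",
        "Horror|Thriller|Adventure",
    ],
    "revenue_adj": [2827124000.0, 2789712000.0, 2506406000.0, 2167325000.0, 1907006000.0],
})

# Expand to one row per (film, genre), then total the revenue for each genre.
per_genre = top_films.assign(genres=top_films["genres"].str.split("|")).explode("genres")
genre_totals = (
    per_genre.groupby("genres")["revenue_adj"]
    .sum()
    .sort_values(ascending=False)
)
print(genre_totals.head(5))
```

Note that a film's revenue is credited in full to every genre it lists, so the totals overlap and do not add up to the combined revenue of the films.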


### Who are the top 5 grossing directors?

<ol>
    <li>James Cameron</li>
    <li>George Lucas</li>
    <li>William Friedkin</li>
    <li>Steven Spielberg</li>
    <li>J.J. Abrams</li>
</ol>

In [101]:
bestdirector = movielist[['revenue_adj', 'director', 'imdb_id']].sort_values(['revenue_adj'], ascending=False).reset_index(drop=True)
bestdirector.head(10)

Unnamed: 0,revenue_adj,director,imdb_id
0,2827124000.0,James Cameron,tt0499549
1,2789712000.0,George Lucas,tt0076759
2,2506406000.0,James Cameron,tt0120338
3,2167325000.0,William Friedkin,tt0070047
4,1907006000.0,Steven Spielberg,tt0073195
5,1902723000.0,J.J. Abrams,tt2488496
6,1791694000.0,Steven Spielberg,tt0083866
7,1583050000.0,Irwin Winkler,tt0113957
8,1574815000.0,Clyde Geronimi|Hamilton Luske|Wolfgang Reitherman,tt0055254
9,1443191000.0,Joss Whedon,tt0848228
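
Sorting individual films lists a director once per film, so James Cameron and Steven Spielberg each appear twice above. Totaling revenue per director is another way to read the question. The sketch below applies a `groupby` to just the ten rows shown, treating a co-directed credit as a single entry (an assumption); on the full data set the totals and possibly the ranking would differ.

```python
import pandas as pd

# The ten rows from the output above (revenue_adj as displayed).
films = pd.DataFrame({
    "director": [
        "James Cameron", "George Lucas", "James Cameron", "William Friedkin",
        "Steven Spielberg", "J.J. Abrams", "Steven Spielberg", "Irwin Winkler",
        "Clyde Geronimi|Hamilton Luske|Wolfgang Reitherman", "Joss Whedon",
    ],
    "revenue_adj": [
        2827124000.0, 2789712000.0, 2506406000.0, 2167325000.0, 1907006000.0,
        1902723000.0, 1791694000.0, 1583050000.0, 1574815000.0, 1443191000.0,
    ],
})

# Sum adjusted revenue across each director's films and rank the totals.
director_totals = (
    films.groupby("director")["revenue_adj"]
    .sum()
    .sort_values(ascending=False)
)
top5_directors = director_totals.head(5)
print(top5_directors)
```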


### Compare the revenue of the highest grossing movies of all time.

In [102]:
# Sort by adjusted revenue, highest first, and show the top 5 films.
revcompare = movielist[['original_title','revenue_adj','imdb_id']].sort_values(['revenue_adj'], ascending=False).reset_index(drop=True)
revcompare.head(5)

Unnamed: 0,original_title,revenue_adj,imdb_id
0,Avatar,2827124000.0,tt0499549
1,Star Wars,2789712000.0,tt0076759
2,Titanic,2506406000.0,tt0120338
3,The Exorcist,2167325000.0,tt0070047
4,Jaws,1907006000.0,tt0073195
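
Since this section asks for Matplotlib, the comparison above could also be shown as a horizontal bar chart. This is a minimal sketch that rebuilds the five rows by hand so it runs on its own; the `Agg` backend line is only there so it also works outside a notebook, and the axis label wording and figure size are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; unnecessary inside a notebook with %matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd

# The five rows from the output above (revenue_adj as displayed).
top_films = pd.DataFrame({
    "original_title": ["Avatar", "Star Wars", "Titanic", "The Exorcist", "Jaws"],
    "revenue_adj": [2827124000.0, 2789712000.0, 2506406000.0, 2167325000.0, 1907006000.0],
})

fig, ax = plt.subplots(figsize=(8, 4))
# Plot in billions so the axis ticks stay readable.
ax.barh(top_films["original_title"], top_films["revenue_adj"] / 1e9)
ax.invert_yaxis()  # put the highest-grossing film at the top
ax.set_xlabel("revenue_adj (billions)")
ax.set_title("Top 5 movies by adjusted revenue")
fig.tight_layout()
```

In the notebook itself, the same calls could take `revcompare.head(5)` in place of the hand-built frame.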


<a id="conclusions"></a>
## Conclusions

> Using the cell below, write a brief conclusion of what you have found from the analysis of the data. The cell below will allow you to write plain text instead of code.

In conclusion, the highest-grossing director is James Cameron, whose 'Avatar' is the highest-grossing movie by adjusted revenue. Its genres are Action, Adventure, Fantasy, and Science Fiction.

In [103]:
# Sort by adjusted revenue, highest first, and show the top 5 films with director and genres.
revcompare = movielist[['original_title','revenue_adj','director', 'genres','imdb_id']].sort_values(['revenue_adj'], ascending=False).reset_index(drop=True)
revcompare.head(5)

Unnamed: 0,original_title,revenue_adj,director,genres,imdb_id
0,Avatar,2827124000.0,James Cameron,Action|Adventure|Fantasy|Science Fiction,tt0499549
1,Star Wars,2789712000.0,George Lucas,Adventure|Action|Science Fiction,tt0076759
2,Titanic,2506406000.0,James Cameron,Drama|Romance|Thriller,tt0120338
3,The Exorcist,2167325000.0,William Friedkin,Drama|Horror|Thriller,tt0070047
4,Jaws,1907006000.0,Steven Spielberg,Horror|Thriller|Adventure,tt0073195
