# Project 1: Explanatory Data Analysis & Data Presentation (Movies Dataset)

# Project Brief for Self-Coders

Here you'll have the opportunity to code major parts of Project 1 on your own. If you need any help or inspiration, have a look at the videos or the Jupyter Notebook with the full code. <br> <br>
Keep in mind that it's all about __getting the right results/conclusions__, not about reproducing identical code. Things can be coded in many different ways, so even if you come to the same conclusions, it's very unlikely that your code will match ours exactly.

## Data Import and first Inspection

1. __Import__ the movies dataset from the CSV file "movies_complete.csv". __Inspect__ the data.

__Some additional information on Features/Columns__:

* **id:** The ID of the movie (clear/unique identifier).
* **title:** The Official Title of the movie.
* **tagline:** The tagline of the movie.
* **release_date:** Theatrical Release Date of the movie.
* **genres:** Genres associated with the movie.
* **belongs_to_collection:** Gives information on the movie series/franchise the particular film belongs to.
* **original_language:** The language in which the movie was originally shot.
* **budget_musd:** The budget of the movie in million dollars.
* **revenue_musd:** The total revenue of the movie in million dollars.
* **production_companies:** Production companies involved with the making of the movie.
* **production_countries:** Countries where the movie was shot/produced.
* **vote_count:** The number of votes by users, as counted by TMDB.
* **vote_average:** The average rating of the movie.
* **popularity:** The Popularity Score assigned by TMDB.
* **runtime:** The runtime of the movie in minutes.
* **overview:** A brief blurb of the movie.
* **spoken_languages:** Spoken languages in the film.
* **poster_path:** The URL of the poster image.
* **cast:** (Main) Actors appearing in the movie.
* **cast_size:** Number of actors appearing in the movie.
* **director:** Director of the movie.
* **crew_size:** Size of the film crew (incl. director, excl. actors).
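The import and first inspection can be sketched as follows. This is a minimal sketch under stated assumptions: it reads a tiny in-memory stand-in (`sample_csv`, a hypothetical three-row sample with made-up values) instead of `movies_complete.csv`, and it parses `release_date` on import, which the brief does not require.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for movies_complete.csv (values are illustrative only).
sample_csv = io.StringIO(
    "id,title,release_date,budget_musd,revenue_musd\n"
    "1,Movie A,1995-10-30,30.0,100.0\n"
    "2,Movie B,1995-12-15,,50.0\n"
    "3,Movie C,,,\n"
)

# parse_dates converts release_date from text to a datetime column on import.
sample = pd.read_csv(sample_csv, parse_dates=["release_date"])

# Share of missing values per column, largest first.
missing_share = sample.isna().mean().sort_values(ascending=False)
print(sample.dtypes)
print(missing_share)
```

Looking at the share of missing values early helps explain why columns such as `budget_musd` and `revenue_musd` have far fewer non-null entries than `title` in the `info()` output.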

In [32]:
import pandas as pd
movies = pd.read_csv('movies_complete.csv')

In [33]:
movies.head()


Unnamed: 0,id,title,tagline,release_date,genres,belongs_to_collection,original_language,budget_musd,revenue_musd,production_companies,...,vote_average,popularity,runtime,overview,spoken_languages,poster_path,cast,cast_size,crew_size,director
0,862,Toy Story,,1995-10-30,Animation|Comedy|Family,Toy Story Collection,en,30.0,373.554033,Pixar Animation Studios,...,7.7,21.946943,81.0,"Led by Woody, Andy's toys live happily in his ...",English,<img src='http://image.tmdb.org/t/p/w185//uXDf...,Tom Hanks|Tim Allen|Don Rickles|Jim Varney|Wal...,13,106,John Lasseter
1,8844,Jumanji,Roll the dice and unleash the excitement!,1995-12-15,Adventure|Fantasy|Family,,en,65.0,262.797249,TriStar Pictures|Teitler Film|Interscope Commu...,...,6.9,17.015539,104.0,When siblings Judy and Peter discover an encha...,English|Français,<img src='http://image.tmdb.org/t/p/w185//vgpX...,Robin Williams|Jonathan Hyde|Kirsten Dunst|Bra...,26,16,Joe Johnston
2,15602,Grumpier Old Men,Still Yelling. Still Fighting. Still Ready for...,1995-12-22,Romance|Comedy,Grumpy Old Men Collection,en,,,Warner Bros.|Lancaster Gate,...,6.5,11.7129,101.0,A family wedding reignites the ancient feud be...,English,<img src='http://image.tmdb.org/t/p/w185//1FSX...,Walter Matthau|Jack Lemmon|Ann-Margret|Sophia ...,7,4,Howard Deutch
3,31357,Waiting to Exhale,Friends are the people who let you be yourself...,1995-12-22,Comedy|Drama|Romance,,en,16.0,81.452156,Twentieth Century Fox Film Corporation,...,6.1,3.859495,127.0,"Cheated on, mistreated and stepped on, the wom...",English,<img src='http://image.tmdb.org/t/p/w185//4wjG...,Whitney Houston|Angela Bassett|Loretta Devine|...,10,10,Forest Whitaker
4,11862,Father of the Bride Part II,Just When His World Is Back To Normal... He's ...,1995-02-10,Comedy,Father of the Bride Collection,en,,76.578911,Sandollar Productions|Touchstone Pictures,...,5.7,8.387519,106.0,Just when George Banks has recovered from his ...,English,<img src='http://image.tmdb.org/t/p/w185//lf9R...,Steve Martin|Diane Keaton|Martin Short|Kimberl...,12,7,Charles Shyer


In [34]:
movies.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 44691 entries, 0 to 44690
Data columns (total 22 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   id                     44691 non-null  int64  
 1   title                  44691 non-null  object 
 2   tagline                20284 non-null  object 
 3   release_date           44657 non-null  object 
 4   genres                 42586 non-null  object 
 5   belongs_to_collection  4463 non-null   object 
 6   original_language      44681 non-null  object 
 7   budget_musd            8854 non-null   float64
 8   revenue_musd           7385 non-null   float64
 9   production_companies   33356 non-null  object 
 10  production_countries   38835 non-null  object 
 11  vote_count             44691 non-null  float64
 12  vote_average           42077 non-null  float64
 13  popularity             44691 non-null  float64
 14  runtime                43179 non-null  float64
 15  ov

## The best and the worst movies...

2. __Filter__ the Dataset and __find the best/worst n Movies__ with the

- Highest Revenue
- Highest Budget
- Highest Profit (=Revenue - Budget)
- Lowest Profit (=Revenue - Budget)
- Highest Return on Investment (=Revenue / Budget) (only movies with Budget >= 10) 
- Lowest Return on Investment (=Revenue / Budget) (only movies with Budget >= 10)
- Highest number of Votes
- Highest Rating (only movies with 10 or more Ratings)
- Lowest Rating (only movies with 10 or more Ratings)
- Highest Popularity

__Define__ an appropriate __user-defined function__ to reuse code.
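One possible shape for such a reusable function is sketched below. The name `best_worst`, its parameters, and the toy data are all hypothetical, and ROI is computed as Revenue / Budget as stated in the list above; treat it as one of many valid designs rather than the reference solution.

```python
import pandas as pd

def best_worst(df, n, by, ascending=False, min_bud=0, min_votes=0):
    """Return the n best (or worst) rows of df ranked by column `by`."""
    # A missing budget counts as 0 for the threshold test only, so rows
    # without a budget survive when min_bud is left at 0.
    mask = (df["budget_musd"].fillna(0) >= min_bud) & (df["vote_count"] >= min_votes)
    ranked = df.loc[mask, ["title", by]].dropna(subset=[by])
    return ranked.sort_values(by, ascending=ascending).head(n)

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "budget_musd": [5.0, 20.0, 50.0, None],
    "revenue_musd": [50.0, 40.0, 300.0, 10.0],
    "vote_count": [100, 5, 2000, 30],
})
toy["ROI"] = toy["revenue_musd"] / toy["budget_musd"]

print(best_worst(toy, 2, "revenue_musd"))         # highest revenue
print(best_worst(toy, 1, "ROI", min_bud=10))      # highest ROI, budget >= 10
print(best_worst(toy, 1, "ROI", ascending=True, min_bud=10))  # lowest ROI
```

With a helper like this, each ranking in the list becomes a single call that differs only in `by`, `ascending`, and the two thresholds.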

__Movies Top 5 - Highest Revenue__

In [35]:
top5revenue = movies.nlargest(5, 'revenue_musd')
top5revenue[['title','revenue_musd']]

Unnamed: 0,title,revenue_musd
14448,Avatar,2787.965087
26265,Star Wars: The Force Awakens,2068.223624
1620,Titanic,1845.034188
17669,The Avengers,1519.55791
24812,Jurassic World,1513.52881


__Movies Top 5 - Highest Budget__

In [36]:
top5budget = movies.nlargest(5, 'budget_musd')
top5budget[['title','budget_musd']]

Unnamed: 0,title,budget_musd
16986,Pirates of the Caribbean: On Stranger Tides,380.0
11743,Pirates of the Caribbean: At World's End,300.0
26268,Avengers: Age of Ultron,280.0
10985,Superman Returns,270.0
16006,Tangled,260.0


__Movies Top 5 - Highest Profit__

In [37]:
movies['profit'] = movies['revenue_musd'] - movies['budget_musd']

In [38]:
top5profit = movies.nlargest(5, 'profit')
top5profit[['title','profit']]

Unnamed: 0,title,profit
14448,Avatar,2550.965087
26265,Star Wars: The Force Awakens,1823.223624
1620,Titanic,1645.034188
24812,Jurassic World,1363.52881
28501,Furious 7,1316.24936


__Movies Top 5 - Lowest Profit__

In [39]:
top5lowprofit = movies.nsmallest(5, 'profit')
top5lowprofit[['title','profit']]

Unnamed: 0,title,profit
20959,The Lone Ranger,-165.71009
7164,The Alamo,-119.180039
16659,Mars Needs Moms,-111.007242
43611,Valerian and the City of a Thousand Planets,-107.447384
2684,The 13th Warrior,-98.301101


__Movies Top 5 - Highest ROI__

In [84]:

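# ROI here is Profit / Budget, i.e. the brief's Revenue / Budget minus 1; the constant offset leaves the ranking unchanged.
movies['ROI'] = movies['profit'] / movies['budget_musd']
# Keep only movies with a budget of at least 10 MUSD, as the brief requires.
moviesROI = movies[movies['budget_musd'] >= 10]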


In [85]:
top5ROI = moviesROI.nlargest(5, 'ROI')
top5ROI[['title','ROI','budget_musd']]

Unnamed: 0,title,ROI,budget_musd
1055,E.T. the Extra-Terrestrial,74.520507,10.5
255,Star Wars,69.490728,11.0
588,Pretty Woman,32.071429,14.0
18300,The Intouchables,31.806221,13.0
1144,The Empire Strikes Back,28.911111,18.0
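The brief defines ROI as Revenue / Budget, while the cell above divides profit by budget. The following sketch, on made-up toy numbers, illustrates that the two conventions differ by a constant 1 and therefore rank movies identically; the variable names are hypothetical.

```python
import pandas as pd

# Toy budgets and revenues in MUSD (illustrative values only).
toy = pd.DataFrame({
    "budget_musd": [10.0, 20.0, 40.0],
    "revenue_musd": [50.0, 10.0, 40.0],
})

# Convention from the brief: Revenue / Budget.
roi_ratio = toy["revenue_musd"] / toy["budget_musd"]
# Convention used in the cell above: (Revenue - Budget) / Budget.
roi_net = (toy["revenue_musd"] - toy["budget_musd"]) / toy["budget_musd"]

# The difference is the same for every row, so the ordering is identical.
print((roi_ratio - roi_net).tolist())
print(roi_ratio.rank().equals(roi_net.rank()))
```

Either convention is fine for the top and bottom lists, as long as the reported values are read with the right definition in mind.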


__Movies Top 5 - Lowest ROI__

In [86]:
top5lowROI = moviesROI.nsmallest(5, 'ROI')
top5lowROI[['title','ROI','budget_musd']]

Unnamed: 0,title,ROI,budget_musd
6955,Chasing Liberty,-0.999999,23.0
8041,The Cookout,-0.999999,16.0
17381,Deadfall,-0.999998,10.0
6678,In the Cut,-0.999998,12.0
20015,The Samaritan,-0.99979,12.0


__Movies Top 5 - Most Votes__

In [87]:
top5votes = movies.nlargest(5, 'vote_count')
top5votes[['title','vote_count']]

Unnamed: 0,title,vote_count
15368,Inception,14075.0
12396,The Dark Knight,12269.0
14448,Avatar,12114.0
17669,The Avengers,12000.0
26272,Deadpool,11444.0


__Movies Top 5 - Highest Rating__

In [44]:
movies_vote = movies[movies['vote_count']>=10]


In [45]:
top5rating = movies_vote.nlargest(5, 'vote_average')
top5rating[['title','vote_average','vote_count']]

Unnamed: 0,title,vote_average,vote_count
20787,As I Was Moving Ahead Occasionally I Saw Brief...,9.5,10.0
42626,Planet Earth II,9.5,50.0
18462,The Civil War,9.2,15.0
10233,Dilwale Dulhania Le Jayenge,9.1,661.0
42822,Cosmos,9.1,41.0


__Movies Top 5 - Lowest Rating__

In [46]:
top5lowrating = movies_vote.nsmallest(5, 'vote_average')
top5lowrating[['title','vote_average','vote_count']]

Unnamed: 0,title,vote_average,vote_count
30618,Extinction: Nature Has Evolved,0.0,10.0
41602,Call Me by Your Name,0.0,18.0
43576,How to Talk to Girls at Parties,0.0,10.0
7030,The Beast of Yucca Flats,1.6,18.0
25418,Santa Claus,1.6,12.0


__Movies Top 5 - Most Popular__

In [47]:
top5popular = movies.nlargest(5, 'popularity')
top5popular[['title','popularity']]

Unnamed: 0,title,popularity
30330,Minions,547.488298
32927,Wonder Woman,294.337037
41556,Beauty and the Beast,287.253654
42940,Baby Driver,228.032744
24187,Big Hero 6,213.849907


## Find your next Movie

3. __Filter__ the Dataset for movies that meet the following conditions:

__Search 1: Science Fiction Action Movie with Bruce Willis (sorted from high to low Rating)__

__Search 2: Movies with Uma Thurman and directed by Quentin Tarantino (sorted from short to long runtime)__

__Search 3: Most Successful Pixar Studio Movies between 2010 and 2015 (sorted from high to low Revenue)__

__Search 4: Action or Thriller Movie with original language English and minimum Rating of 7.5 (most recent movies first)__
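Searches like these come down to combining boolean masks. The sketch below shows one way to do that for Search 1 on a hypothetical toy frame; it uses the `na=False` option of `str.contains` so that missing strings simply count as "no match", which is an alternative to casting the column to `str` first.

```python
import pandas as pd

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genres": ["Action|Science Fiction", "Drama", None, "Action|Thriller"],
    "cast": ["Bruce Willis|X", "Bruce Willis", "Bruce Willis|Y", None],
    "vote_average": [6.5, 7.0, 5.0, 8.0],
})

# na=False treats missing values as non-matching; regex=False does a plain substring test.
mask_actor = toy["cast"].str.contains("Bruce Willis", na=False, regex=False)
mask_action = toy["genres"].str.contains("Action", na=False, regex=False)
mask_scifi = toy["genres"].str.contains("Science Fiction", na=False, regex=False)

hits = toy.loc[mask_actor & mask_action & mask_scifi, ["title", "vote_average"]]
print(hits.sort_values("vote_average", ascending=False))
```

Naming each mask separately keeps long conditions readable and makes it easy to reuse a mask in another search.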

In [48]:
# Search 1:
movie1 = movies[(movies['cast'].astype(str).str.contains('Bruce Willis')) & (movies['genres'].astype(str).str.contains('Action'))& (movies['genres'].astype(str).str.contains('Science Fiction'))]

movie1.sort_values('vote_average',ascending=False)[['title','vote_average','genres','cast']]

Unnamed: 0,title,vote_average,genres,cast
1448,The Fifth Element,7.3,Adventure|Fantasy|Action|Thriller|Science Fiction,Bruce Willis|Gary Oldman|Ian Holm|Milla Jovovi...
19218,Looper,6.6,Action|Thriller|Science Fiction,Joseph Gordon-Levitt|Bruce Willis|Emily Blunt|...
1786,Armageddon,6.5,Action|Thriller|Science Fiction|Adventure,Bruce Willis|Billy Bob Thornton|Ben Affleck|Li...
14135,Surrogates,5.9,Action|Science Fiction|Thriller,Bruce Willis|Radha Mitchell|Rosamund Pike|Jame...
20333,G.I. Joe: Retaliation,5.4,Adventure|Action|Science Fiction|Thriller,Dwayne Johnson|D.J. Cotrona|Adrianne Palicki|B...
27619,Vice,4.1,Thriller|Science Fiction|Action|Adventure,Ambyr Childers|Thomas Jane|Bryan Greenberg|Bru...


In [49]:
# Search 2:
movie2 = movies[movies['cast'].astype(str).str.contains('Uma Thurman') & (movies['director'].astype(str).str.contains('Quentin Tarantino'))]

movie2.sort_values('runtime',ascending=True)[['title','runtime','director','cast']]

Unnamed: 0,title,runtime,director,cast
6667,Kill Bill: Vol. 1,111.0,Quentin Tarantino,Uma Thurman|Lucy Liu|Vivica A. Fox|Daryl Hanna...
7208,Kill Bill: Vol. 2,136.0,Quentin Tarantino,Uma Thurman|David Carradine|Daryl Hannah|Micha...
291,Pulp Fiction,154.0,Quentin Tarantino,John Travolta|Samuel L. Jackson|Uma Thurman|Br...


In [50]:
# Search 3:
movie3 = movies[movies['production_companies'].astype(str).str.contains('Pixar') & (movies['release_date'].astype(str).str.contains('2010|2011|2012|2013|2014|2015'))]

movie3.sort_values('revenue_musd',ascending=False)[['title','release_date','production_companies','revenue_musd']]

Unnamed: 0,title,release_date,production_companies,revenue_musd
15236,Toy Story 3,2010-06-16,Walt Disney Pictures|Pixar Animation Studios,1066.969703
29957,Inside Out,2015-06-09,Walt Disney Pictures|Pixar Animation Studios,857.611174
20888,Monsters University,2013-06-20,Walt Disney Pictures|Pixar Animation Studios,743.559607
17220,Cars 2,2011-06-11,Walt Disney Pictures|Pixar Animation Studios,559.852396
18900,Brave,2012-06-21,Walt Disney Pictures|Pixar Animation Studios,538.983207
30388,The Good Dinosaur,2015-11-14,Walt Disney Pictures|Pixar Animation Studios,331.926147
16392,Day & Night,2010-06-17,Walt Disney Pictures|Pixar Animation Studios,
21694,The Blue Umbrella,2013-02-12,Pixar Animation Studios,
21697,Toy Story of Terror!,2013-10-15,Walt Disney Pictures|Pixar Animation Studios,
22489,La luna,2011-01-01,Pixar Animation Studios,
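Matching years as text works here because `release_date` is stored as a `YYYY-MM-DD` string. An alternative, sketched below on hypothetical toy data, is to convert the column to datetimes and filter on the year directly; this is an assumption about how one might restructure the filter, not the approach used in the cell above.

```python
import pandas as pd

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "title": ["Old", "In range", "Edge", "Undated"],
    "release_date": ["2009-12-31", "2012-06-21", "2015-11-14", None],
})

# errors="coerce" turns unparseable or missing dates into NaT instead of raising.
release = pd.to_datetime(toy["release_date"], errors="coerce")

# between() is inclusive on both ends by default; rows without a date evaluate to False.
in_window = release.dt.year.between(2010, 2015)
print(toy.loc[in_window, "title"].tolist())
```

A numeric year range avoids listing every year by hand and extends naturally to longer periods.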


In [51]:
# Search 4:
movie4 = movies[(movies['original_language'] == 'en') & (movies['genres'].astype(str).str.contains('Action|Thriller')) & (movies['vote_average'] >= 7.5)]

movie4.sort_values('release_date',ascending=False)[['title','vote_average','genres','release_date','original_language']]

Unnamed: 0,title,vote_average,genres,release_date,original_language
44490,Descendants 2,7.5,TV Movie|Family|Action|Comedy|Music|Adventure,2017-07-21,en
43941,Dunkirk,7.5,Action|Drama|History|Thriller|War,2017-07-19,en
42624,The Book of Henry,7.6,Thriller|Drama|Crime,2017-06-16,en
26273,Guardians of the Galaxy Vol. 2,7.6,Action|Adventure|Comedy|Science Fiction,2017-04-19,en
43467,Revengeance,8.0,Comedy|Action|Animation,2017-04-05,en
...,...,...,...,...,...
11135,The Music Box,7.5,Action|Comedy,1932-04-16,en
8268,Scarface,7.5,Action|Adventure|Crime|Drama|Thriller,1932-04-09,en
8255,"Steamboat Bill, Jr.",7.9,Action|Comedy,1928-02-14,en
2879,The General,8.0,Action|Adventure|Comedy|Drama,1926-12-31,en


## Are Franchises more successful?

4. __Analyze__ the Dataset and __find out whether Franchises (Movies that belong to a collection) are more successful than stand-alone movies__ in terms of:

- mean revenue
- median Return on Investment
- mean budget raised
- mean popularity
- mean rating

hint: use groupby()
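Following the hint, the whole comparison can be produced in one `groupby()` call. The sketch below uses a hypothetical toy frame and a helper column named `kind` (an assumption, not part of the dataset) to label each movie as franchise or stand-alone.

```python
import pandas as pd

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "belongs_to_collection": ["Saga X", "Saga X", None, None],
    "revenue_musd": [100.0, 300.0, 40.0, 60.0],
    "budget_musd": [50.0, 100.0, 20.0, 20.0],
    "popularity": [8.0, 10.0, 2.0, 4.0],
    "vote_average": [6.0, 7.0, 6.5, 7.5],
})
toy["ROI"] = toy["revenue_musd"] / toy["budget_musd"]

# Label each movie: a non-missing collection name means it belongs to a franchise.
toy["kind"] = toy["belongs_to_collection"].notna().map(
    {True: "Franchise", False: "Stand-alone"}
)

# Named aggregation: one output column per (input column, statistic) pair.
comparison = toy.groupby("kind").agg(
    mean_revenue=("revenue_musd", "mean"),
    median_roi=("ROI", "median"),
    mean_budget=("budget_musd", "mean"),
    mean_popularity=("popularity", "mean"),
    mean_rating=("vote_average", "mean"),
)
print(comparison)
```

This yields a two-row table with all five metrics side by side, which is easier to compare than separate scalar results.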

__Franchise vs. Stand-alone: Average Revenue__

In [52]:
alone= movies[movies['belongs_to_collection'].isnull()]


In [53]:
alone['revenue_musd'].mean()

44.7428140060954

In [54]:
fran= movies[movies['belongs_to_collection'].notnull()]


In [55]:
fran['revenue_musd'].mean()

165.70819260175796

==> Franchise movies have a higher average revenue than stand-alone movies (165.7 vs 44.7 MUSD), roughly 3.7 times as much.

__Franchise vs. Stand-alone: Return on Investment / Profitability (median)__

In [56]:

fran= movies[movies['belongs_to_collection'].notnull()]
alone= movies[movies['belongs_to_collection'].isnull()]


In [57]:
fran['ROI'].median() 

3.7091950783422467

In [58]:
alone['ROI'].median() 

1.6196993333333334

==> Franchise movies have a higher median ROI than stand-alone movies (3.71 vs 1.62).

In [59]:
fran['profit'].median()

64.234017

In [60]:
alone['profit'].median()

5.0

==> Franchise movies also have a higher median profit than stand-alone movies (64.2 vs 5.0 MUSD).

__Franchise vs. Stand-alone: Average Budget__

In [61]:
fran['budget_musd'].mean()

38.31984712958281

In [62]:
alone['budget_musd'].mean()

18.047741174779947

==> Franchise movies have a higher average budget than stand-alone movies (38.3 vs 18.0 MUSD), about twice as much.


__Franchise vs. Stand-alone: Average Popularity__

In [63]:
fran['popularity'].mean()

6.245051188662314

In [64]:
alone['popularity'].mean()

2.592726062543528

==> Franchise movies have a higher average popularity than stand-alone movies (6.2 vs 2.6), about 2.4 times as high.

__Franchise vs. Stand-alone: Average Rating__

In [65]:
fran['vote_average'].mean()

5.956805807622486

In [66]:
alone['vote_average'].mean()

6.008787066287642

==> Franchise movies have a marginally lower average rating than stand-alone movies (5.96 vs 6.01).

## Most Successful Franchises

5. __Find__ the __most successful Franchises__ in terms of

- __total number of movies__
- __total & mean budget__
- __total & mean revenue__
- __mean rating__
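All of these franchise metrics can be collected in a single table with named aggregation. The sketch below runs on a hypothetical toy frame; the output column names (`n_movies`, `total_budget`, and so on) are assumptions chosen for readability.

```python
import pandas as pd

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "belongs_to_collection": ["Saga X", "Saga X", "Saga Y", None],
    "title": ["X1", "X2", "Y1", "Solo"],
    "budget_musd": [50.0, 150.0, 120.0, 10.0],
    "revenue_musd": [100.0, 500.0, 700.0, 30.0],
    "vote_average": [6.0, 8.0, 7.5, 9.0],
})

# groupby drops rows whose key is missing, so stand-alone movies are excluded.
franchises = toy.groupby("belongs_to_collection").agg(
    n_movies=("title", "count"),
    total_budget=("budget_musd", "sum"),
    mean_budget=("budget_musd", "mean"),
    total_revenue=("revenue_musd", "sum"),
    mean_revenue=("revenue_musd", "mean"),
    mean_rating=("vote_average", "mean"),
)

# nlargest returns the top rows for a metric; idxmax returns just the winning label.
print(franchises.nlargest(1, "n_movies"))
print(franchises["total_revenue"].idxmax())
```

Once the summary table exists, each "most successful" question is a one-line `nlargest` or `idxmax` lookup on the relevant column.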

__Franchise with the highest total number of movies__

In [67]:
fran_num_movies = fran.groupby('belongs_to_collection').count()


In [68]:
fran_num_most_movies = fran_num_movies[fran_num_movies['title']==fran_num_movies['title'].max()]
fran_num_most_movies

belongs_to_collection,id,title,tagline,release_date,genres,original_language,budget_musd,revenue_musd,production_companies,production_countries,...,runtime,overview,spoken_languages,poster_path,cast,cast_size,crew_size,director,profit,ROI
The Bowery Boys,29,29,24,29,29,29,0,0,25,28,...,28,29,29,29,29,29,29,29,0,0


==> The most successful franchise in terms of total number of movies is The Bowery Boys, with 29 movies.

__Franchises with the highest total & mean budget__

In [69]:
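# sum() does not take a column name; numeric_only=True restricts the totals to numeric columns.
fran_total_movies = fran.groupby('belongs_to_collection').sum(numeric_only=True)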


In [70]:
fran_most_totalbudget_movies = fran_total_movies[fran_total_movies['budget_musd']==fran_total_movies['budget_musd'].max()]
fran_most_totalbudget_movies

belongs_to_collection,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
James Bond Collection,425790,1539.65,7106.970239,33392.0,164.8,349.791063,3309.0,888,743,5567.320239,342.147299


__Franchise with the highest mean budget__


In [71]:
fran_meanbudget_movies = fran.groupby('belongs_to_collection').mean(numeric_only=True)


In [72]:
fran_mostmeanbudget_movies = fran_meanbudget_movies[fran_meanbudget_movies['budget_musd']==fran_meanbudget_movies['budget_musd'].max()]
fran_mostmeanbudget_movies

belongs_to_collection,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Tangled Collection,60819.0,260.0,591.794936,1901.0,7.25,12.319364,53.0,12.0,45.5,331.794936,2.276134


==> The franchise with the highest total budget is the James Bond Collection, with 1,539.65 MUSD.

The franchise with the highest mean budget is the Tangled Collection, with 260 MUSD per movie.

__Franchises with the highest total & mean revenue__

In [73]:
fran_most_totalrevenue_movies = fran_total_movies[fran_total_movies['revenue_musd']==fran_total_movies['revenue_musd'].max()]
fran_most_totalrevenue_movies

belongs_to_collection,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Harry Potter Collection,29021,1280.0,7707.367425,47866.0,60.3,210.031146,1178.0,550,533,6427.367425,53.170728


__Franchise with the highest mean revenue__

In [74]:

fran_meanrevenue_movies = fran.groupby('belongs_to_collection').mean(numeric_only=True)


In [75]:
fran_mostmeanrevenue_movies = fran_meanrevenue_movies[fran_meanrevenue_movies['revenue_musd']==fran_meanrevenue_movies['revenue_musd'].max()]
fran_mostmeanrevenue_movies

belongs_to_collection,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Avatar Collection,19995.0,237.0,2787.965087,12114.0,7.2,185.070892,162.0,83.0,153.0,2550.965087,11.763566


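==> The most successful franchise in terms of total revenue is the Harry Potter Collection, with 7,707.37 MUSD.

The most successful franchise in terms of mean revenue is the Avatar Collection, with 2,787.97 MUSD. This mean equals the revenue of Avatar itself, so it rests on a single film with revenue data.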

__Franchise with the highest mean rating__


In [76]:
fran_meanrating_movies = fran.groupby('belongs_to_collection').mean(numeric_only=True)

In [77]:
fran_mostmeanrating_movies = fran_meanrating_movies[fran_meanrating_movies['vote_average']==fran_meanrating_movies['vote_average'].max()]
fran_mostmeanrating_movies

belongs_to_collection,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Argo Collection,338312.0,,,2.0,9.3,0.4994,,12.0,8.0,,


==> The franchise with the highest mean rating is the Argo Collection with 9.3, but its movies average only 2 votes each, so this rating is not reliable.
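A top mean rating can rest on very few votes (the winning row above shows a mean `vote_count` of 2.0). One way to guard against that is sketched below on hypothetical toy data: aggregate the votes per franchise and keep only franchises above a cut-off. The threshold `MIN_VOTES` is an arbitrary illustrative value, not one taken from the brief.

```python
import pandas as pd

# Toy data standing in for the franchise subset (values are illustrative only).
toy = pd.DataFrame({
    "belongs_to_collection": ["Obscure", "Saga X", "Saga X"],
    "vote_average": [9.3, 7.0, 8.0],
    "vote_count": [2.0, 1500.0, 2500.0],
})

per_franchise = toy.groupby("belongs_to_collection").agg(
    mean_rating=("vote_average", "mean"),
    total_votes=("vote_count", "sum"),
)

# MIN_VOTES is a hypothetical cut-off chosen for illustration only.
MIN_VOTES = 1000
reliable = per_franchise[per_franchise["total_votes"] >= MIN_VOTES]
print(reliable.nlargest(1, "mean_rating"))
```

Without the filter, the barely rated franchise wins; with it, only franchises backed by enough votes compete.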

## Most Successful Directors

6. __Find__ the __most successful Directors__ in terms of

- __total number of movies__
- __total revenue__
- __mean rating__
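The director metrics can be computed the same way as the franchise metrics, grouping by `director` instead. The sketch below is a minimal example on hypothetical toy data; note that it groups the full frame rather than a franchise-only subset, so stand-alone films count toward each director as well.

```python
import pandas as pd

# Toy data standing in for the movies DataFrame (values are illustrative only).
toy = pd.DataFrame({
    "director": ["Dir A", "Dir A", "Dir B", None],
    "title": ["A1", "A2", "B1", "Unknown"],
    "belongs_to_collection": ["Saga X", None, None, None],
    "revenue_musd": [100.0, 50.0, 400.0, 5.0],
    "vote_average": [6.0, 7.0, 8.0, 9.0],
})

# Rows with a missing director are dropped by groupby automatically.
directors = toy.groupby("director").agg(
    n_movies=("title", "count"),
    total_revenue=("revenue_musd", "sum"),
    mean_rating=("vote_average", "mean"),
)

print(directors.sort_values("n_movies", ascending=False).head(1))
print(directors["total_revenue"].idxmax())
```

Which frame is grouped matters: restricting the input to franchise movies answers a narrower question than the one posed in the task.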

__Find the most successful Directors in terms of total number of movies__

In [78]:
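# Note: `fran` contains only franchise movies, so directors are ranked on their franchise films only.
fran_di_totalnum_movies = fran.groupby('director').count()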

In [79]:
fran_di_mosttotalnum_movies = fran_di_totalnum_movies[fran_di_totalnum_movies['id']==fran_di_totalnum_movies['id'].max()]
fran_di_mosttotalnum_movies

director,id,title,tagline,release_date,genres,belongs_to_collection,original_language,budget_musd,revenue_musd,production_companies,...,popularity,runtime,overview,spoken_languages,poster_path,cast,cast_size,crew_size,profit,ROI
Gerald Thomas,25,25,18,25,25,25,25,0,0,23,...,25,25,25,25,25,25,25,25,0,0


==> Among franchise movies (the `fran` subset grouped above), the most successful director in terms of total number of movies is Gerald Thomas, with 25 movies.

__Find the most successful Directors in terms of total revenue__

In [80]:
fran_di_totalrevenue_movies = fran.groupby('director').sum(numeric_only=True)

In [81]:
fran_di_mosttotalrevenue_movies = fran_di_totalrevenue_movies[fran_di_totalrevenue_movies['revenue_musd']==fran_di_totalrevenue_movies['revenue_musd'].max()]
fran_di_mosttotalrevenue_movies 

director,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Peter Jackson,229489,1016.0,5852.068099,42703.0,45.8,166.435462,1032.0,193,380,4836.068099,44.739877


==> Among franchise movies, the most successful director in terms of total revenue is Peter Jackson, with 5,852.07 MUSD.

__Find the most successful Directors in terms of mean rating__

In [82]:
fran_di_meanrating_movies = fran.groupby('director').mean(numeric_only=True)

In [83]:
fran_di_mostmeanrating_movies = fran_di_meanrating_movies[fran_di_meanrating_movies['vote_average']==fran_di_meanrating_movies['vote_average'].max()]
fran_di_mostmeanrating_movies

director,id,budget_musd,revenue_musd,vote_count,vote_average,popularity,runtime,cast_size,crew_size,profit,ROI
Todd Grimes,460135.0,,,2.0,10.0,8.413734,,17.0,6.0,,


==> Among franchise movies, the director with the highest mean rating is Todd Grimes with 10.0, but his movies average only 2 votes each, so this rating is not reliable.
