# Project 1: Explanatory Data Analysis & Data Presentation (Movies Dataset)

# Project Brief for Self-Coders

Here you'll have the opportunity to code major parts of Project 1 on your own. If you need help or inspiration, have a look at the videos or the Jupyter Notebook with the full code.

Keep in mind that it's all about __getting the right results/conclusions__, not about writing identical code. Things can be coded in many different ways, so even if you come to the same conclusions, it's very unlikely that your code will match ours.

## Data Import and first Inspection

1. __Import__ the movies dataset from the CSV file "movies_complete.csv". __Inspect__ the data.

__Some additional information on Features/Columns__:

* **id:** The ID of the movie (clear/unique identifier).
* **title:** The Official Title of the movie.
* **tagline:** The tagline of the movie.
* **release_date:** Theatrical Release Date of the movie.
* **genres:** Genres associated with the movie.
* **belongs_to_collection:** Gives information on the movie series/franchise the particular film belongs to.
* **original_language:** The language in which the movie was originally shot.
* **budget_musd:** The budget of the movie in million dollars.
* **revenue_musd:** The total revenue of the movie in million dollars.
* **production_companies:** Production companies involved with the making of the movie.
* **production_countries:** Countries where the movie was shot/produced.
* **vote_count:** The number of votes by users, as counted by TMDB.
* **vote_average:** The average rating of the movie.
* **popularity:** The Popularity Score assigned by TMDB.
* **runtime:** The runtime of the movie in minutes.
* **overview:** A brief blurb of the movie.
* **spoken_languages:** Spoken languages in the film.
* **poster_path:** The URL of the poster image.
* **cast:** (Main) Actors appearing in the movie.
* **cast_size:** Number of actors appearing in the movie.
* **director:** Director of the movie.
* **crew_size:** Size of the film crew (incl. director, excl. actors).

In [1]:
import numpy as np
import pandas as pd

In [2]:
movie_df = pd.read_csv('movies_complete.csv')

In [6]:
movie_df.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,44681,44682,44683,44684,44685,44686,44687,44688,44689,44690
id,862,8844,15602,31357,11862,949,11860,45325,9091,710,...,84419,390959,289923,222848,30840,439050,111109,67758,227506,461257
title,Toy Story,Jumanji,Grumpier Old Men,Waiting to Exhale,Father of the Bride Part II,Heat,Sabrina,Tom and Huck,Sudden Death,GoldenEye,...,House of Horrors,Shadow of the Blair Witch,The Burkittsville 7,Caged Heat 3000,Robin Hood,Subdue,Century of Birthing,Betrayal,Satan Triumphant,Queerama
tagline,,Roll the dice and unleash the excitement!,Still Yelling. Still Fighting. Still Ready for...,Friends are the people who let you be yourself...,Just When His World Is Back To Normal... He's ...,A Los Angeles Crime Saga,You are cordially invited to the most surprisi...,The Original Bad Boys.,Terror goes into overtime.,No limits. No fears. No substitutes.,...,Meet...The CREEPER!,,"Do you know what happened 50 years before ""The...",,,Rising and falling between a man and woman,,A deadly game of wits.,,
release_date,1995-10-30,1995-12-15,1995-12-22,1995-12-22,1995-02-10,1995-12-15,1995-12-15,1995-12-22,1995-12-22,1995-11-16,...,1946-03-29,2000-10-22,2000-10-03,1995-01-01,1991-05-13,,2011-11-17,2003-08-01,1917-10-21,2017-06-09
genres,Animation|Comedy|Family,Adventure|Fantasy|Family,Romance|Comedy,Comedy|Drama|Romance,Comedy,Action|Crime|Drama|Thriller,Comedy|Romance,Action|Adventure|Drama|Family,Action|Adventure|Thriller,Adventure|Action|Thriller,...,Horror|Mystery|Thriller,Mystery|Horror,Horror,Science Fiction,Drama|Action|Romance,Drama|Family,Drama,Action|Drama|Thriller,,
belongs_to_collection,Toy Story Collection,,Grumpy Old Men Collection,,Father of the Bride Collection,,,,,James Bond Collection,...,,,,,,,,,,
original_language,en,en,en,en,en,en,en,en,en,en,...,en,en,en,en,en,fa,tl,en,en,en
budget_musd,30.0,65.0,,16.0,,60.0,58.0,,35.0,58.0,...,,,,,,,,,,
revenue_musd,373.554033,262.797249,,81.452156,76.578911,187.436818,,,64.350171,352.194034,...,,,,,,,,,,
production_companies,Pixar Animation Studios,TriStar Pictures|Teitler Film|Interscope Commu...,Warner Bros.|Lancaster Gate,Twentieth Century Fox Film Corporation,Sandollar Productions|Touchstone Pictures,Regency Enterprises|Forward Pass|Warner Bros.,Paramount Pictures|Scott Rudin Productions|Mir...,Walt Disney Pictures,Universal Pictures|Imperial Entertainment|Sign...,United Artists|Eon Productions,...,Universal Pictures,,Neptune Salad Entertainment|Pirie Productions,Concorde-New Horizons,Westdeutscher Rundfunk (WDR)|Working Title Fil...,,Sine Olivia,American World Pictures,Yermoliev,


In [13]:
movie_df.director.value_counts().head()

John Ford           66
Michael Curtiz      65
Werner Herzog       54
Alfred Hitchcock    53
Georges Méliès      49
Name: director, dtype: int64

## The best and the worst movies...

2. __Filter__ the Dataset and __find the best/worst n Movies__ with the

- Highest Revenue
- Highest Budget
- Highest Profit (=Revenue - Budget)
- Lowest Profit (=Revenue - Budget)
- Highest Return on Investment (=Revenue / Budget) (only movies with Budget >= 10) 
- Lowest Return on Investment (=Revenue / Budget) (only movies with Budget >= 10)
- Highest number of Votes
- Highest Rating (only movies with 10 or more Ratings)
- Lowest Rating (only movies with 10 or more Ratings)
- Highest Popularity

__Define__ an appropriate __user-defined function__ to reuse code.
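
One possible shape for such a helper is sketched below. The function name `best_worst`, its parameters, and the toy data are assumptions made for illustration; they are not part of the dataset or of the full-code notebook.

```python
import pandas as pd

def best_worst(df, by, n=5, ascending=False, min_budget=0, min_votes=0):
    """Return the n best (or worst) movies ranked by the column `by`."""
    # Missing budgets/vote counts are treated as 0 for the filter only, so such
    # rows drop out as soon as a positive threshold is requested.
    mask = (df['budget_musd'].fillna(0) >= min_budget) & (df['vote_count'].fillna(0) >= min_votes)
    return df.loc[mask, ['title', by]].sort_values(by=by, ascending=ascending).head(n)

# Toy data for illustration only (made-up values, not taken from the dataset).
toy = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'budget_musd': [5.0, 20.0, 50.0, None],
    'revenue_musd': [50.0, 10.0, 200.0, 30.0],
    'vote_count': [5, 100, 2000, 50],
    'vote_average': [9.0, 4.0, 7.5, 6.0],
})
toy['ROI'] = toy['revenue_musd'] / toy['budget_musd']

# Highest ROI among movies with a budget of at least 10 (million USD).
top_roi = best_worst(toy, by='ROI', n=2, min_budget=10)

# Lowest rating among movies with at least 10 votes.
lowest_rated = best_worst(toy, by='vote_average', n=2, ascending=True, min_votes=10)
```

With the real data, the same call pattern would be applied to `movie_df` once the `profit_musd` and `ROI` columns exist.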

__Movies Top 5 - Highest Revenue__

In [19]:
movie_df.sort_values(by='revenue_musd',ascending=False)[['title','revenue_musd','release_date']].head(5)

Unnamed: 0,title,revenue_musd,release_date
14448,Avatar,2787.965087,2009-12-10
26265,Star Wars: The Force Awakens,2068.223624,2015-12-15
1620,Titanic,1845.034188,1997-11-18
17669,The Avengers,1519.55791,2012-04-25
24812,Jurassic World,1513.52881,2015-06-09


__Movies Top 5 - Highest Budget__

In [17]:
movie_df.sort_values(by='budget_musd',ascending=False)[['title','budget_musd']].head(5)

Unnamed: 0,title,budget_musd
16986,Pirates of the Caribbean: On Stranger Tides,380.0
11743,Pirates of the Caribbean: At World's End,300.0
26268,Avengers: Age of Ultron,280.0
10985,Superman Returns,270.0
18517,John Carter,260.0


__Movies Top 5 - Highest Profit__

In [22]:
movie_df['profit_musd'] = movie_df.revenue_musd - movie_df.budget_musd

In [23]:
movie_df.sort_values(by='profit_musd',ascending=False)[['title','profit_musd','release_date']].head(5)

Unnamed: 0,title,profit_musd,release_date
14448,Avatar,2550.965087,2009-12-10
26265,Star Wars: The Force Awakens,1823.223624,2015-12-15
1620,Titanic,1645.034188,1997-11-18
24812,Jurassic World,1363.52881,2015-06-09
28501,Furious 7,1316.24936,2015-04-01


__Movies Top 5 - Lowest Profit__

In [24]:
movie_df.sort_values(by='profit_musd',ascending=True)[['title','profit_musd','release_date']].head(5)

Unnamed: 0,title,profit_musd,release_date
20959,The Lone Ranger,-165.71009,2013-07-03
7164,The Alamo,-119.180039,2004-04-07
16659,Mars Needs Moms,-111.007242,2011-03-09
43611,Valerian and the City of a Thousand Planets,-107.447384,2017-07-20
2684,The 13th Warrior,-98.301101,1999-08-27


__Movies Top 5 - Highest ROI__

In [25]:
movie_df['ROI'] = movie_df.revenue_musd / movie_df.budget_musd

In [30]:
movie_df[movie_df['budget_musd']>=10].sort_values(by='ROI',ascending=False)[['title','ROI','budget_musd','release_date']].head(5)

Unnamed: 0,title,ROI,budget_musd,release_date
1055,E.T. the Extra-Terrestrial,75.520507,10.5,1982-04-03
255,Star Wars,70.490728,11.0,1977-05-25
588,Pretty Woman,33.071429,14.0,1990-03-23
18300,The Intouchables,32.806221,13.0,2011-11-02
1144,The Empire Strikes Back,29.911111,18.0,1980-05-17


__Movies Top 5 - Lowest ROI__

In [31]:
movie_df[movie_df['budget_musd']>=10].sort_values(by='ROI',ascending=True)[['title','ROI','budget_musd','release_date']].head(5)

Unnamed: 0,title,ROI,budget_musd,release_date
6955,Chasing Liberty,5.217391e-07,23.0,2004-01-09
8041,The Cookout,7.5e-07,16.0,2004-09-03
17381,Deadfall,1.8e-06,10.0,1993-10-08
6678,In the Cut,1.916667e-06,12.0,2003-09-09
20015,The Samaritan,0.0002100833,12.0,2012-03-02


__Movies Top 5 - Most Votes__

In [28]:
movie_df.sort_values(by='vote_count',ascending=False)[['title','vote_count','release_date']].head(5)

Unnamed: 0,title,vote_count,release_date
15368,Inception,14075.0,2010-07-14
12396,The Dark Knight,12269.0,2008-07-16
14448,Avatar,12114.0,2009-12-10
17669,The Avengers,12000.0,2012-04-25
26272,Deadpool,11444.0,2016-02-09


__Movies Top 5 - Highest Rating__

In [35]:
# Filters on vote_count >= 1000, a stricter cutoff than the 10 ratings named in the task list.
movie_df[movie_df.vote_count >= 1000].sort_values(by='vote_average',ascending=False)[['title','vote_average','vote_count','release_date']].head(5)

Unnamed: 0,title,vote_average,vote_count,release_date
826,The Godfather,8.5,6024.0,1972-03-14
39639,Your Name.,8.5,1030.0,2016-08-26
313,The Shawshank Redemption,8.5,8358.0,1994-09-23
291,Pulp Fiction,8.3,8670.0,1994-09-10
1174,Once Upon a Time in America,8.3,1104.0,1984-02-16


__Movies Top 5 - Lowest Rating__

In [36]:
movie_df[movie_df.vote_count >= 10].sort_values(by='vote_average',ascending=True)[['title','vote_average','vote_count','release_date']].head(5)

Unnamed: 0,title,vote_average,vote_count,release_date
41602,Call Me by Your Name,0.0,18.0,2017-10-27
30618,Extinction: Nature Has Evolved,0.0,10.0,2017-03-10
43576,How to Talk to Girls at Parties,0.0,10.0,2017-12-27
25418,Santa Claus,1.6,12.0,1959-11-26
7030,The Beast of Yucca Flats,1.6,18.0,1961-06-02


__Movies Top 5 - Most Popular__

In [39]:
movie_df.sort_values(by='popularity',ascending=False)[['title','popularity','release_date']].head(5)

Unnamed: 0,title,popularity,release_date
30330,Minions,547.488298,2015-06-17
32927,Wonder Woman,294.337037,2017-05-30
41556,Beauty and the Beast,287.253654,2017-03-16
42940,Baby Driver,228.032744,2017-06-28
24187,Big Hero 6,213.849907,2014-10-24


## Find your next Movie

3. __Filter__ the Dataset for movies that meet the following conditions:

__Search 1: Science Fiction Action Movie with Bruce Willis (sorted from high to low Rating)__

__Search 2: Movies with Uma Thurman and directed by Quentin Tarantino (sorted from short to long runtime)__

__Search 3: Most Successful Pixar Studio Movies between 2010 and 2015 (sorted from high to low Revenue)__

__Search 4: Action or Thriller Movie with original language English and minimum Rating of 7.5 (most recent movies first)__

## Search 1: Science Fiction Action Movie with Bruce Willis (sorted from high to low Rating)

In [97]:
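def search_movie(search_params, df):
    """Return rows where every {'col': ..., 'val': ...} criterion matches (logical AND)."""
    mask = pd.Series(True, index=df.index)
    for search_param in search_params:
        col_name = search_param['col']
        search_str = search_param['val']
        # Literal substring test (regex=False), so characters such as '.' or '(' in the
        # search text are not interpreted as a pattern; missing values count as no match.
        mask &= df[col_name].str.contains(search_str, regex=False, na=False)
    return df[mask]

def search_movie_or(search_params, df):
    """Return rows where at least one {'col': ..., 'val': ...} criterion matches (logical OR)."""
    mask = pd.Series(False, index=df.index)
    for search_param in search_params:
        col_name = search_param['col']
        search_str = search_param['val']
        mask |= df[col_name].str.contains(search_str, regex=False, na=False)
    return df[mask]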

In [72]:
(search_movie([{'col':'genres','val':'Science Fiction'},{'col':'cast','val':'Bruce Willis'}],movie_df)
    .sort_values(by='vote_average',ascending=False)).head(5)

Unnamed: 0,id,title,tagline,release_date,genres,belongs_to_collection,original_language,budget_musd,revenue_musd,production_companies,...,runtime,overview,spoken_languages,poster_path,cast,cast_size,crew_size,director,profit_musd,ROI
31,63,Twelve Monkeys,The future is history.,1995-12-29,Science Fiction|Thriller|Mystery,,en,29.5,168.84,Universal Pictures|Atlas Entertainment|Classico,...,129.0,"In the year 2035, convict James Cole reluctant...",English|Français,<img src='http://image.tmdb.org/t/p/w185//2F9K...,Bruce Willis|Madeleine Stowe|Brad Pitt|Christo...,65,151,Terry Gilliam,139.34,5.72339
1448,18,The Fifth Element,There is no future without it.,1997-05-07,Adventure|Fantasy|Action|Thriller|Science Fiction,,en,90.0,263.92018,Columbia Pictures|Gaumont,...,126.0,"In 2257, a taxi driver is unintentionally give...",English|svenska|Deutsch,<img src='http://image.tmdb.org/t/p/w185//fPtl...,Bruce Willis|Gary Oldman|Ian Holm|Milla Jovovi...,114,134,Luc Besson,173.92018,2.932446
3836,9741,Unbreakable,Some things are only revealed by accident.,2000-11-13,Science Fiction|Thriller|Drama,,en,75.0,248.118121,Limited Edition Productions Inc.|Touchstone Pi...,...,106.0,An ordinary man makes an extraordinary discove...,English,<img src='http://image.tmdb.org/t/p/w185//kXkV...,Bruce Willis|Samuel L. Jackson|Robin Wright|Sp...,40,56,M. Night Shyamalan,173.118121,3.308242
19218,59967,Looper,"Hunted By Your Future, Haunted By Your Past",2012-09-26,Action|Thriller|Science Fiction,,en,30.0,47.042,Endgame Entertainment|FilmDistrict|DMG Enterta...,...,118.0,"In the futuristic action thriller Looper, time...",English,<img src='http://image.tmdb.org/t/p/w185//sNjL...,Joseph Gordon-Levitt|Bruce Willis|Emily Blunt|...,34,42,Rian Johnson,17.042,1.568067
1786,95,Armageddon,The Earth's Darkest Day Will Be Man's Finest Hour,1998-07-01,Action|Thriller|Science Fiction|Adventure,,en,140.0,553.799566,Jerry Bruckheimer Films|Touchstone Pictures|Va...,...,151.0,When an asteroid threatens to collide with Ear...,English|Pусский,<img src='http://image.tmdb.org/t/p/w185//fMtO...,Bruce Willis|Billy Bob Thornton|Ben Affleck|Li...,67,108,Michael Bay,413.799566,3.955711
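
The search above filters on Science Fiction and Bruce Willis only, so titles without the Action genre also appear. A minimal sketch of adding Action as a third condition follows; it uses made-up toy rows rather than `movie_df`, and the mask names are illustrative.

```python
import pandas as pd

# Toy rows for illustration only; in the notebook the frame would be movie_df.
toy = pd.DataFrame({
    'title': ['M1', 'M2', 'M3', 'M4'],
    'genres': ['Action|Science Fiction', 'Science Fiction|Thriller', 'Action|Comedy', None],
    'cast': ['Bruce Willis|X', 'Bruce Willis|Y', 'Bruce Willis|Z', 'Bruce Willis'],
    'vote_average': [6.5, 7.4, 5.0, 8.0],
})

# regex=False treats the search text literally; na=False counts missing values as "no match".
is_scifi = toy['genres'].str.contains('Science Fiction', regex=False, na=False)
is_action = toy['genres'].str.contains('Action', regex=False, na=False)
has_willis = toy['cast'].str.contains('Bruce Willis', regex=False, na=False)

# All three conditions must hold, then sort from high to low rating.
scifi_action_willis = (
    toy[is_scifi & is_action & has_willis]
    .sort_values('vote_average', ascending=False)
)
```

Only the row that carries both genres and the actor survives the combined mask.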


## Search 2: Movies with Uma Thurman and directed by Quentin Tarantino (sorted from short to long runtime)

In [80]:
(search_movie([{'col':'director','val':'Quentin Tarantino'},{'col':'cast','val':'Uma Thurman'}],movie_df)
    .sort_values(by='runtime',ascending=True))[['title','director','cast','runtime']]

Unnamed: 0,title,director,cast,runtime
6667,Kill Bill: Vol. 1,Quentin Tarantino,Uma Thurman|Lucy Liu|Vivica A. Fox|Daryl Hanna...,111.0
7208,Kill Bill: Vol. 2,Quentin Tarantino,Uma Thurman|David Carradine|Daryl Hannah|Micha...,136.0
291,Pulp Fiction,Quentin Tarantino,John Travolta|Samuel L. Jackson|Uma Thurman|Br...,154.0


## Search 3: Most Successful Pixar Studio Movies between 2010 and 2015 (sorted from high to low Revenue)

In [101]:
pixar_movies = search_movie([{'col':'production_companies','val':'Pixar Animation Studios'}],movie_df)

In [105]:
(search_movie_or([{'col':'release_date','val':'2010'},{'col':'release_date','val':'2015'}],pixar_movies)
.sort_values(by='revenue_musd',ascending=False))[['title','production_companies','release_date','revenue_musd']]

Unnamed: 0,title,production_companies,release_date,revenue_musd
15236,Toy Story 3,Walt Disney Pictures|Pixar Animation Studios,2010-06-16,1066.969703
29957,Inside Out,Walt Disney Pictures|Pixar Animation Studios,2015-06-09,857.611174
30388,The Good Dinosaur,Walt Disney Pictures|Pixar Animation Studios,2015-11-14,331.926147
16392,Day & Night,Walt Disney Pictures|Pixar Animation Studios,2010-06-17,
31803,Lava,Pixar Animation Studios,2015-06-21,
34560,Sanjay's Super Team,Pixar Animation Studios,2015-11-25,
40675,Riley's First Date?,Walt Disney Pictures|Pixar Animation Studios,2015-11-03,
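
Matching the strings '2010' and '2015' keeps only releases from those two years, so 2011 through 2014 are left out. A minimal sketch of an inclusive date-range filter follows, assuming `release_date` holds ISO-formatted date strings as in the output above; the toy rows and values are made up for illustration.

```python
import pandas as pd

# Toy rows for illustration only (made-up titles and revenues).
toy = pd.DataFrame({
    'title': ['P1', 'P2', 'P3', 'P4'],
    'production_companies': [
        'Pixar Animation Studios',
        'Walt Disney Pictures|Pixar Animation Studios',
        'Pixar Animation Studios',
        'Other Studio',
    ],
    'release_date': ['2009-05-29', '2012-06-22', '2015-11-14', '2013-01-01'],
    'revenue_musd': [700.0, 500.0, 300.0, 100.0],
})

# Parse the strings once so the comparison is on real dates, not on text.
release = pd.to_datetime(toy['release_date'])

is_pixar = toy['production_companies'].str.contains('Pixar', regex=False, na=False)
# between() is inclusive on both ends by default.
in_range = release.between('2010-01-01', '2015-12-31')

pixar_2010_2015 = toy[is_pixar & in_range].sort_values('revenue_musd', ascending=False)
```

Rows with a missing release date compare as False in `between` and are therefore excluded.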


## Search 4: Action or Thriller Movie with original language English and minimum Rating of 7.5 (most recent movies first)

In [114]:
movie_df['release_ts'] = pd.to_datetime(movie_df['release_date'])

In [116]:
action_thriller_df = search_movie_or([{'col':'genres','val':'Action'},{'col':'genres','val':'Thriller'}],movie_df)

In [118]:
(action_thriller_df[(action_thriller_df['original_language'] == 'en') & (action_thriller_df['vote_average'] >= 7.5)]
    .sort_values(by='release_ts',ascending=False))[['title','genres','original_language','vote_average','release_date']]

Unnamed: 0,title,genres,original_language,vote_average,release_date
44490,Descendants 2,TV Movie|Family|Action|Comedy|Music|Adventure,en,7.5,2017-07-21
43941,Dunkirk,Action|Drama|History|Thriller|War,en,7.5,2017-07-19
42624,The Book of Henry,Thriller|Drama|Crime,en,7.6,2017-06-16
26273,Guardians of the Galaxy Vol. 2,Action|Adventure|Comedy|Science Fiction,en,7.6,2017-04-19
43467,Revengeance,Comedy|Action|Animation,en,8.0,2017-04-05
...,...,...,...,...,...
11135,The Music Box,Action|Comedy,en,7.5,1932-04-16
8268,Scarface,Action|Adventure|Crime|Drama|Thriller,en,7.5,1932-04-09
8255,"Steamboat Bill, Jr.",Action|Comedy,en,7.9,1928-02-14
2879,The General,Action|Adventure|Comedy|Drama,en,8.0,1926-12-31


## Are Franchises more successful?

4. __Analyze__ the Dataset and __find out whether Franchises (Movies that belong to a collection) are more successful than stand-alone movies__ in terms of:

- mean revenue
- median Return on Investment
- mean budget raised
- mean popularity
- mean rating

hint: use groupby()

__Franchise vs. Stand-alone: Average Revenue__

In [121]:
movie_df['has_franchise'] = movie_df.belongs_to_collection.notna()

In [123]:
movie_df.groupby('has_franchise').agg({'revenue_musd':['mean']})

Unnamed: 0_level_0,revenue_musd
Unnamed: 0_level_1,mean
has_franchise,Unnamed: 1_level_2
False,44.742814
True,165.708193


__Franchise vs. Stand-alone: Return on Investment / Profitability (median)__

In [124]:
movie_df.groupby('has_franchise').agg({'ROI':['median']})

Unnamed: 0_level_0,ROI
Unnamed: 0_level_1,median
has_franchise,Unnamed: 1_level_2
False,1.619699
True,3.709195


__Franchise vs. Stand-alone: Average Budget__

In [125]:
movie_df.groupby('has_franchise').agg({'budget_musd':['mean']})

Unnamed: 0_level_0,budget_musd
Unnamed: 0_level_1,mean
has_franchise,Unnamed: 1_level_2
False,18.047741
True,38.319847


__Franchise vs. Stand-alone: Average Popularity__

In [126]:
movie_df.groupby('has_franchise').agg({'popularity':['mean']})

Unnamed: 0_level_0,popularity
Unnamed: 0_level_1,mean
has_franchise,Unnamed: 1_level_2
False,2.592726
True,6.245051


__Franchise vs. Stand-alone: Average Rating__

In [131]:
movie_df.groupby('has_franchise').agg({'vote_average':['mean']})

Unnamed: 0_level_0,vote_average
Unnamed: 0_level_1,mean
has_franchise,Unnamed: 1_level_2
False,6.008787
True,5.956806
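
The five comparisons above can also be collected in a single `groupby` call. The sketch below uses pandas named aggregation on made-up toy rows; the output column names (`mean_revenue`, `median_roi`, and so on) are illustrative choices, not names from the notebook.

```python
import pandas as pd

# Toy rows for illustration only (made-up values).
toy = pd.DataFrame({
    'belongs_to_collection': ['Coll A', None, 'Coll A', None],
    'revenue_musd': [100.0, 10.0, 300.0, 30.0],
    'ROI': [2.0, 1.0, 4.0, 3.0],
    'budget_musd': [50.0, 10.0, 75.0, 10.0],
    'popularity': [8.0, 2.0, 6.0, 4.0],
    'vote_average': [6.0, 7.0, 5.0, 6.0],
})
toy['has_franchise'] = toy['belongs_to_collection'].notna()

# One row per group (False = stand-alone, True = franchise), one column per metric.
summary = toy.groupby('has_franchise').agg(
    mean_revenue=('revenue_musd', 'mean'),
    median_roi=('ROI', 'median'),
    mean_budget=('budget_musd', 'mean'),
    mean_popularity=('popularity', 'mean'),
    mean_rating=('vote_average', 'mean'),
)
```

Named aggregation yields flat column names, which avoids the multi-level headers seen in the outputs above.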


## Most Successful Franchises

5. __Find__ the __most successful Franchises__ in terms of

- __total number of movies__
- __total & mean budget__
- __total & mean revenue__
- __mean rating__

### total number of movies

In [144]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'title':['count']})['title']).sort_values(by='count',ascending=False).head(10)

Unnamed: 0_level_0,count
belongs_to_collection,Unnamed: 1_level_1
The Bowery Boys,29
Totò Collection,27
James Bond Collection,26
Zatôichi: The Blind Swordsman,26
The Carry On Collection,25
Charlie Chan (Sidney Toler) Collection,21
Pokémon Collection,20
Godzilla (Showa) Collection,16
Dragon Ball Z (Movie) Collection,15
Charlie Chan (Warner Oland) Collection,15


### total budget

In [145]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'budget_musd':['sum']})['budget_musd']).sort_values(by='sum',ascending=False).head(10)

Unnamed: 0_level_0,sum
belongs_to_collection,Unnamed: 1_level_1
James Bond Collection,1539.65
Harry Potter Collection,1280.0
Pirates of the Caribbean Collection,1250.0
The Fast and the Furious Collection,1009.0
X-Men Collection,983.0
Transformers Collection,965.0
Star Wars Collection,854.35
The Hobbit Collection,750.0
The Terminator Collection,661.4
Mission: Impossible Collection,650.0


### mean budget

In [151]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'budget_musd':['mean','sum','count']})['budget_musd']).sort_values(by='mean',ascending=False).head(10)

Unnamed: 0_level_0,mean,sum,count
belongs_to_collection,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Tangled Collection,260.0,260.0,1
The Avengers Collection,250.0,500.0,2
Pirates of the Caribbean Collection,250.0,1250.0,5
The Hobbit Collection,250.0,750.0,3
Man of Steel Collection,237.5,475.0,2
Avatar Collection,237.0,237.0,1
The Amazing Spider-Man Collection,207.5,415.0,2
World War Z Collection,200.0,200.0,1
Spider-Man Collection,199.0,597.0,3
The Dark Knight Collection,195.0,585.0,3


### total & mean revenue

In [153]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'revenue_musd':['sum','mean','count']})['revenue_musd']).sort_values(by='sum',ascending=False).head(10)

Unnamed: 0_level_0,sum,mean,count
belongs_to_collection,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Harry Potter Collection,7707.367425,963.420928,8
Star Wars Collection,7434.49479,929.311849,8
James Bond Collection,7106.970239,273.345009,26
The Fast and the Furious Collection,5125.098793,640.637349,8
Pirates of the Caribbean Collection,4521.576826,904.315365,5
Transformers Collection,4366.101244,873.220249,5
Despicable Me Collection,3691.070216,922.767554,4
The Twilight Collection,3342.10729,668.421458,5
Ice Age Collection,3216.708553,643.341711,5
Jurassic Park Collection,3031.484143,757.871036,4


In [154]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'revenue_musd':['mean','sum','count']})['revenue_musd']).sort_values(by='mean',ascending=False).head(10)

Unnamed: 0_level_0,mean,sum,count
belongs_to_collection,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Avatar Collection,2787.965087,2787.965087,1
The Avengers Collection,1462.480802,2924.961604,2
Frozen Collection,1274.219009,1274.219009,1
Finding Nemo Collection,984.453213,1968.906425,2
The Hobbit Collection,978.507785,2935.523356,3
The Lord of the Rings Collection,972.181581,2916.544743,3
Harry Potter Collection,963.420928,7707.367425,8
Star Wars Collection,929.311849,7434.49479,8
Despicable Me Collection,922.767554,3691.070216,4
Pirates of the Caribbean Collection,904.315365,4521.576826,5


### mean rating

In [157]:
(movie_df[movie_df.has_franchise == True].groupby('belongs_to_collection').agg({'vote_average':['mean','sum','count']})['vote_average']).sort_values(by='mean',ascending=False).head(10)

Unnamed: 0_level_0,mean,sum,count
belongs_to_collection,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Argo Collection,9.3,9.3,1
Kenji Misumi's Trilogy of the Sword,9.0,9.0,1
Dreileben,9.0,9.0,1
Bloodfight,9.0,9.0,1
Алиса в стране чудес (Коллекция),8.7,8.7,1
We Were Here,8.65,17.3,2
Kizumonogatari,8.633333,25.9,3
Glass Tiger collection,8.5,8.5,1
Spirits' Homecoming Collection,8.5,8.5,1
The Nigger Charley Collection,8.5,8.5,1


## Most Successful Directors

6. __Find__ the __most successful Directors__ in terms of

- __total number of movies__
- __total revenue__
- __mean rating__

In [159]:
dir_df = movie_df.groupby('director').agg({'id':['count'],'revenue_musd':['sum'],'vote_average':['mean']})
dir_df

Unnamed: 0_level_0,id,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
director,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
Dale Trevillion\t,2,0.000000,4.0
Davide Manuli,1,0.000000,6.9
E.W. Swackhamer,1,0.000000,5.9
Vitaliy Vorobyov,1,0.000000,5.5
Yeon Sang-Ho,4,2.129768,6.6
...,...,...,...
Ярополк Лапшин,1,0.000000,10.0
پیمان معادی,1,0.000000,6.0
塩谷 直義,1,0.000000,7.2
杰森·莫玛,1,0.000000,5.8


In [164]:
dir_df.nlargest(10,('id','count'))

Unnamed: 0_level_0,id,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
director,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
John Ford,66,85.170757,6.381818
Michael Curtiz,65,37.8175,5.998246
Werner Herzog,54,24.57258,6.805556
Alfred Hitchcock,53,250.107584,6.639623
Georges Méliès,49,0.0,5.934694
Woody Allen,49,993.970588,6.691837
Jean-Luc Godard,46,0.867433,6.804348
Sidney Lumet,46,294.522734,6.576744
Charlie Chaplin,44,26.519181,6.540909
Raoul Walsh,43,1.21388,6.004762


In [161]:
dir_df.nlargest(10,('revenue_musd','sum'))

Unnamed: 0_level_0,id,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
director,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
Steven Spielberg,33,9256.621422,6.893939
Peter Jackson,13,6528.244659,7.138462
Michael Bay,13,6437.466781,6.392308
James Cameron,11,5900.61031,6.927273
David Yates,9,5334.563196,6.7
Christopher Nolan,11,4747.408665,7.618182
Robert Zemeckis,19,4138.233542,6.794737
Tim Burton,21,4032.916124,6.733333
Ridley Scott,24,3917.52924,6.604167
Chris Columbus,15,3866.836869,6.44


In [162]:
dir_df.nlargest(10,('vote_average','mean'))

Unnamed: 0_level_0,id,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
director,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
A.W. Vidmer,1,0.0,10.0
Amy Schatz,1,0.0,10.0
Ana Poliak,1,0.0,10.0
Andrew Bowser,1,0.0,10.0
Andrew Napier,1,0.0,10.0
Antonis Sotiropoulos,1,0.0,10.0
Barry Bruce,1,0.0,10.0
Brandon Chesbro,1,0.0,10.0
Brett M. Butler,1,0.0,10.0
Brian Skeet,1,0.0,10.0
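
The mean-rating ranking above is dominated by directors with a single movie rated 10.0. One way to get a more robust ranking is to require a minimum number of votes per movie and a minimum number of movies per director. The sketch below shows the idea on made-up toy rows; the thresholds `MIN_VOTES` and `MIN_MOVIES` are hypothetical values chosen for illustration.

```python
import pandas as pd

# Toy rows for illustration only (made-up values).
toy = pd.DataFrame({
    'director': ['Dir A', 'Dir A', 'Dir B', 'Dir C', 'Dir C'],
    'vote_count': [500, 300, 1, 200, 2],
    'vote_average': [7.0, 8.0, 10.0, 6.0, 9.0],
})

MIN_VOTES = 10   # hypothetical: ignore movies with fewer votes than this
MIN_MOVIES = 2   # hypothetical: ignore directors with fewer qualifying movies

# Step 1: keep only movies whose rating rests on enough votes.
rated = toy[toy['vote_count'] >= MIN_VOTES]

# Step 2: aggregate per director, keeping the movie count next to the mean.
by_director = rated.groupby('director').agg(
    n_movies=('vote_average', 'count'),
    mean_rating=('vote_average', 'mean'),
)

# Step 3: rank only directors with enough qualifying movies.
reliable = by_director[by_director['n_movies'] >= MIN_MOVIES].nlargest(10, 'mean_rating')
```

The same two-step filter would apply to the franchise and actor rankings by mean rating.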


## Most Successful Actors

7. __Find__ the __most successful Actors__ in terms of

- __total number of movies__
- __total revenue__
- __mean rating__

In [213]:
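# Index by movie id so the per-actor rows can later be merged back onto movie_df by index.
# Run this once only: afterwards 'id' is the index and no longer a column.
movie_df = movie_df.set_index('id')
# One row per (movie, actor): split the pipe-separated cast, stack to long form, strip whitespace.
actors_df = movie_df.cast.dropna().str.split('|',expand=True).stack().reset_index(level=1,drop=True).str.strip().to_frame()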

In [214]:
actors_df.columns = ['actor_name']
actors_df

Unnamed: 0_level_0,actor_name
id,Unnamed: 1_level_1
862,Tom Hanks
862,Tim Allen
862,Don Rickles
862,Jim Varney
862,Wallace Shawn
...,...
227506,Iwan Mosschuchin
227506,Nathalie Lissenko
227506,Pavel Pavlov
227506,Aleksandr Chabrov


In [215]:
actors_df = actors_df.merge(movie_df,how='left',left_index=True,right_index=True)
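
The split, stack, and merge steps above can also be sketched with `DataFrame.explode`, which keeps the movie columns attached to every actor row so that no merge is needed. This assumes pandas 1.1 or later (for `ignore_index`); the toy rows and the output names `n_movies` and `total_revenue` are made up for illustration.

```python
import pandas as pd

# Toy rows for illustration only (made-up values).
toy = pd.DataFrame({
    'title': ['M1', 'M2', 'M3'],
    'cast': ['Actor A| Actor B', 'Actor A', None],
    'revenue_musd': [100.0, 50.0, 10.0],
})

long_df = (
    toy.dropna(subset=['cast'])                                     # movies without cast info drop out
       .assign(actor_name=lambda d: d['cast'].str.split('|'))       # list of names per movie
       .explode('actor_name', ignore_index=True)                    # one row per (movie, actor)
)
long_df['actor_name'] = long_df['actor_name'].str.strip()

per_actor = long_df.groupby('actor_name').agg(
    n_movies=('title', 'count'),
    total_revenue=('revenue_musd', 'sum'),
)
```

The same pattern would work for the pipe-separated `genres` column.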

In [223]:
actors_df.groupby('actor_name').agg({'title':['count']})['title'].sort_values(by='count',ascending=False)

Unnamed: 0_level_0,count
actor_name,Unnamed: 1_level_1
Bess Flowers,240
Christopher Lee,148
John Wayne,125
Samuel L. Jackson,122
Michael Caine,110
...,...
James Weicht,1
James Welch,1
James Wellington,1
James Welsh,1


In [225]:
actors_df.groupby('actor_name').agg({'revenue_musd':['sum']})['revenue_musd'].sort_values(by='sum',ascending=False)

Unnamed: 0_level_0,sum
actor_name,Unnamed: 1_level_1
Stan Lee,19414.957555
Samuel L. Jackson,17109.620672
Warwick Davis,13256.032188
Frank Welker,13044.152470
John Ratzenberger,12596.126073
...,...
Jankidas,0.000000
Janko Brett,0.000000
Janko Cekic,0.000000
Janko Rakoš,0.000000


In [226]:
actors_df.groupby('actor_name').agg({'vote_average':['mean']})['vote_average'].sort_values(by='mean',ascending=False)

Unnamed: 0_level_0,mean
actor_name,Unnamed: 1_level_1
József Ruszt,10.0
Jamie Zevallos,10.0
Sarah Pickering,10.0
Judy Foster,10.0
Mario Cunha,10.0
...,...
Дмитрий Писаренко,
Манолис Сорманн,
Манос Вакусис,
Теодосис Заннис,


In [228]:
actor_eval = actors_df.groupby('actor_name').agg({'title':['count'],'revenue_musd':['sum'],'vote_average':['mean']})
actor_eval

Unnamed: 0_level_0,title,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
actor_name,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
"""Elliot""",1,0.000000,6.6000
"""Freeway"" Ricky Ross",1,0.000000,6.8000
"""Jamez""",1,0.000000,6.6000
"""Lil' Mikey"" Davis",1,0.282448,5.7000
"""Weird Al"" Yankovic",10,205.402057,6.0400
...,...,...,...
长泽雅美,11,0.346485,6.4000
陳美貞,1,83.061158,7.0000
高桥一生,8,333.108461,6.7375
강계열,1,0.000000,6.0000


In [229]:
actor_eval.nlargest(10,('title','count'))

Unnamed: 0_level_0,title,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
actor_name,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
Bess Flowers,240,368.913259,6.184186
Christopher Lee,148,9417.047887,5.910204
John Wayne,125,236.094,5.712097
Samuel L. Jackson,122,17109.620672,6.266116
Michael Caine,110,8053.404838,6.269444
Gérard Depardieu,109,1247.608953,6.053211
John Carradine,109,255.839586,5.546667
Donald Sutherland,108,5390.766679,6.233962
Jackie Chan,108,4699.185933,6.275701
Frank Welker,107,13044.15247,6.310377


In [230]:
actor_eval.nlargest(10,('revenue_musd','sum'))

Unnamed: 0_level_0,title,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
actor_name,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
Stan Lee,48,19414.957555,6.513043
Samuel L. Jackson,122,17109.620672,6.266116
Warwick Davis,34,13256.032188,6.294118
Frank Welker,107,13044.15247,6.310377
John Ratzenberger,46,12596.126073,6.484444
Jess Harnell,35,12234.608163,6.435294
Hugo Weaving,40,11027.578473,6.473684
Ian McKellen,44,11015.592318,6.353488
Johnny Depp,69,10653.760641,6.44058
Alan Rickman,45,10612.625348,6.715556


In [231]:
actor_eval.nlargest(10,('vote_average','mean'))

Unnamed: 0_level_0,title,revenue_musd,vote_average
Unnamed: 0_level_1,count,sum,mean
actor_name,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
A.B. Imeson,1,0.0,10.0
Aakar Kaushik,1,0.0,10.0
Adam Saleh,1,0.0,10.0
Adnan Koç,1,0.0,10.0
Agnese Civle,1,0.0,10.0
Agoritsa Oikonomou,1,0.0,10.0
Ahuva Keren,1,0.0,10.0
Aigars Apinis,1,0.0,10.0
Al Mackenzie,1,0.0,10.0
Alan Gibson,1,0.0,10.0


## Most Successful Genres

8. __Find__ the __most successful Genres__ in terms of

- __total profit__
- __total revenue__
- __mean ROI__
- __mean rating__

In [236]:
genre_df = movie_df.genres.dropna().str.split('|',expand=True)
genre_df

Unnamed: 0_level_0,0,1,2,3,4,5,6,7
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
862,Animation,Comedy,Family,,,,,
8844,Adventure,Fantasy,Family,,,,,
15602,Romance,Comedy,,,,,,
31357,Comedy,Drama,Romance,,,,,
11862,Comedy,,,,,,,
...,...,...,...,...,...,...,...,...
222848,Science Fiction,,,,,,,
30840,Drama,Action,Romance,,,,,
439050,Drama,Family,,,,,,
111109,Drama,,,,,,,


In [242]:
genre_df = genre_df.stack().reset_index(level=1,drop=True).to_frame()
genre_df

Unnamed: 0_level_0,0
id,Unnamed: 1_level_1
862,Animation
862,Comedy
862,Family
8844,Adventure
8844,Fantasy
...,...
439050,Family
111109,Drama
67758,Action
67758,Drama


In [243]:
genre_df.columns = ['genre_name']

In [244]:
genre_df

Unnamed: 0_level_0,genre_name
id,Unnamed: 1_level_1
862,Animation
862,Comedy
862,Family
8844,Adventure
8844,Fantasy
...,...
439050,Family
111109,Drama
67758,Action
67758,Drama


In [245]:
genre_df = genre_df.merge(movie_df,how='left',left_index=True,right_index=True)

In [246]:
genre_df

Unnamed: 0_level_0,genre_name,title,tagline,release_date,genres,belongs_to_collection,original_language,budget_musd,revenue_musd,production_companies,...,spoken_languages,poster_path,cast,cast_size,crew_size,director,profit_musd,ROI,release_ts,has_franchise
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2,Drama,Ariel,,1988-10-21,Drama|Crime,,fi,,,Villealfa Filmproduction Oy|Finnish Film Found...,...,suomi|Deutsch,<img src='http://image.tmdb.org/t/p/w185//ojDg...,Turo Pajala|Susanna Haavisto|Matti Pellonpää|E...,4,6,Aki Kaurismäki,,,1988-10-21,False
2,Crime,Ariel,,1988-10-21,Drama|Crime,,fi,,,Villealfa Filmproduction Oy|Finnish Film Found...,...,suomi|Deutsch,<img src='http://image.tmdb.org/t/p/w185//ojDg...,Turo Pajala|Susanna Haavisto|Matti Pellonpää|E...,4,6,Aki Kaurismäki,,,1988-10-21,False
3,Drama,Shadows in Paradise,,1986-10-16,Drama|Comedy,,fi,,,Villealfa Filmproduction Oy,...,English|suomi|svenska,<img src='http://image.tmdb.org/t/p/w185//nj01...,Matti Pellonpää|Kati Outinen|Sakari Kuosmanen|...,7,11,Aki Kaurismäki,,,1986-10-16,False
3,Comedy,Shadows in Paradise,,1986-10-16,Drama|Comedy,,fi,,,Villealfa Filmproduction Oy,...,English|suomi|svenska,<img src='http://image.tmdb.org/t/p/w185//nj01...,Matti Pellonpää|Kati Outinen|Sakari Kuosmanen|...,7,11,Aki Kaurismäki,,,1986-10-16,False
5,Crime,Four Rooms,Twelve outrageous guests. Four scandalous requ...,1995-12-09,Crime|Comedy,,en,4.00000,4.3,Miramax Films|A Band Apart,...,English,<img src='http://image.tmdb.org/t/p/w185//xhU6...,Tim Roth|Antonio Banderas|Jennifer Beals|Madon...,24,88,Allison Anders,0.3,1.075,1995-12-09,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
468343,Romance,Silja - nuorena nukkunut,,1956-01-01,Drama|Romance,,fi,,,,...,,<img src='http://image.tmdb.org/t/p/w185//q9mc...,,0,1,Jack Witikka,,,1956-01-01,False
468707,Romance,Thick Lashes of Lauri Mäntyvaara,,2017-07-28,Romance|Comedy,,fi,1.25404,,Elokuvayhtiö Oy Aamu,...,suomi,<img src='http://image.tmdb.org/t/p/w185//9M8x...,Inka Haapamäki|Rosa Honkonen|Tiitus Rantala|Sa...,6,2,Hannaleena Hauru,,,2017-07-28,False
468707,Comedy,Thick Lashes of Lauri Mäntyvaara,,2017-07-28,Romance|Comedy,,fi,1.25404,,Elokuvayhtiö Oy Aamu,...,suomi,<img src='http://image.tmdb.org/t/p/w185//9M8x...,Inka Haapamäki|Rosa Honkonen|Tiitus Rantala|Sa...,6,2,Hannaleena Hauru,,,2017-07-28,False
469172,Fantasy,Manuel on the Island of Wonders,,1984-08-02,Fantasy|Drama,,pt,,,Institut National de l'Audiovisuel (INA)|Radio...,...,Português|Français,<img src='http://image.tmdb.org/t/p/w185//mipl...,Ruben de Freitas|Teresa Madruga|Fernando Heito...,20,27,Raúl Ruiz,,,1984-08-02,False


In [247]:
genre_df.columns.to_list()

['genre_name',
 'title',
 'tagline',
 'release_date',
 'genres',
 'belongs_to_collection',
 'original_language',
 'budget_musd',
 'revenue_musd',
 'production_companies',
 'production_countries',
 'vote_count',
 'vote_average',
 'popularity',
 'runtime',
 'overview',
 'spoken_languages',
 'poster_path',
 'cast',
 'cast_size',
 'crew_size',
 'director',
 'profit_musd',
 'ROI',
 'release_ts',
 'has_franchise']

In [257]:
# Aggregate budget, profit, revenue and ROI per genre (one row per genre)
genre_wise_performence = genre_df.groupby('genre_name').agg({
    'budget_musd': ['mean', 'sum', 'count'],
    'profit_musd': ['mean', 'sum'],
    'revenue_musd': ['mean', 'sum'],
    'ROI': ['mean'],
})
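
Passing a dict of lists to `agg` yields two-level column labels such as `('budget_musd', 'mean')`, which is why the `nlargest` calls below take tuples. As an alternative sketch, pandas named aggregation produces flat column names instead. The toy frame and the output names (`budget_mean`, `budget_count`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Toy long-format data (hypothetical values), one row per (movie, genre)
toy = pd.DataFrame(
    {
        "genre_name": ["Drama", "Drama", "Comedy", "Comedy", "Comedy"],
        "budget_musd": [12.0, np.nan, 4.0, 6.0, 20.0],
        "revenue_musd": [30.0, np.nan, 4.3, np.nan, 50.0],
    }
)

# Named aggregation: output_name=(input_column, aggregation function)
summary = toy.groupby("genre_name").agg(
    budget_mean=("budget_musd", "mean"),
    budget_sum=("budget_musd", "sum"),
    budget_count=("budget_musd", "count"),  # counts non-missing budgets only
    revenue_mean=("revenue_musd", "mean"),
)

# With flat columns, nlargest takes a plain string instead of a tuple
top = summary.nlargest(1, "budget_mean")
print(summary)
```

Note that `count` ignores missing values, so it reports the number of films with a known budget in each genre, not the number of films.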

In [253]:
genre_wise_performence.nlargest(20,('budget_musd','mean'))

,budget_musd,budget_musd,profit_musd,profit_musd,revenue_musd,revenue_musd,ROI
genre_name,mean,sum,mean,sum,mean,sum,mean
Adventure,51.093255,65705.926364,141.137342,135068.436282,179.192356,199978.66936,1071.170962
Fantasy,46.991335,34256.683311,137.24265,69856.509099,166.006633,103920.152145,5.958486
Animation,46.738843,19957.486079,160.542398,46878.380145,176.989426,67432.97132,6.765834
Family,46.620167,33519.899995,137.569937,72912.066603,159.103683,107076.778493,1927.31528
Action,36.766719,77688.077617,89.277186,126237.940703,116.073804,201388.050019,726.056558
Science Fiction,35.450581,35344.22913,99.956942,63372.701088,131.516076,97847.960421,5.281461
War,24.393692,7805.981356,45.462982,9228.985278,65.475137,15910.458263,20682.075826
Thriller,23.370518,55504.980272,51.565089,77399.199214,69.520125,129724.55297,672.383836
History,23.274574,8914.161694,31.88172,7492.204166,50.515927,14902.19842,17874.906652
Western,22.542124,3178.439427,24.937032,2219.395832,43.782042,5122.498884,4.20655


In [254]:
genre_wise_performence.nlargest(20,('profit_musd','mean'))

,budget_musd,budget_musd,profit_musd,profit_musd,revenue_musd,revenue_musd,ROI
genre_name,mean,sum,mean,sum,mean,sum,mean
Animation,46.738843,19957.486079,160.542398,46878.380145,176.989426,67432.97132,6.765834
Adventure,51.093255,65705.926364,141.137342,135068.436282,179.192356,199978.66936,1071.170962
Family,46.620167,33519.899995,137.569937,72912.066603,159.103683,107076.778493,1927.31528
Fantasy,46.991335,34256.683311,137.24265,69856.509099,166.006633,103920.152145,5.958486
Science Fiction,35.450581,35344.22913,99.956942,63372.701088,131.516076,97847.960421,5.281461
Action,36.766719,77688.077617,89.277186,126237.940703,116.073804,201388.050019,726.056558
Comedy,21.601565,60095.55337,55.292873,102291.814344,64.097213,166845.04597,8373.127236
Thriller,23.370518,55504.980272,51.565089,77399.199214,69.520125,129724.55297,672.383836
Music,14.78868,4466.181321,46.692652,8964.989208,50.076001,13370.292367,5.970629
Mystery,22.434132,14425.14693,45.831507,20303.357592,63.190209,34754.614989,42.379596


In [255]:
genre_wise_performence.nlargest(20,('revenue_musd','mean'))

,budget_musd,budget_musd,profit_musd,profit_musd,revenue_musd,revenue_musd,ROI
genre_name,mean,sum,mean,sum,mean,sum,mean
Adventure,51.093255,65705.926364,141.137342,135068.436282,179.192356,199978.66936,1071.170962
Animation,46.738843,19957.486079,160.542398,46878.380145,176.989426,67432.97132,6.765834
Fantasy,46.991335,34256.683311,137.24265,69856.509099,166.006633,103920.152145,5.958486
Family,46.620167,33519.899995,137.569937,72912.066603,159.103683,107076.778493,1927.31528
Science Fiction,35.450581,35344.22913,99.956942,63372.701088,131.516076,97847.960421,5.281461
Action,36.766719,77688.077617,89.277186,126237.940703,116.073804,201388.050019,726.056558
Thriller,23.370518,55504.980272,51.565089,77399.199214,69.520125,129724.55297,672.383836
War,24.393692,7805.981356,45.462982,9228.985278,65.475137,15910.458263,20682.075826
Comedy,21.601565,60095.55337,55.292873,102291.814344,64.097213,166845.04597,8373.127236
Mystery,22.434132,14425.14693,45.831507,20303.357592,63.190209,34754.614989,42.379596


In [256]:
genre_wise_performence.nlargest(20,('ROI','mean'))

,budget_musd,budget_musd,profit_musd,profit_musd,revenue_musd,revenue_musd,ROI
genre_name,mean,sum,mean,sum,mean,sum,mean
War,24.393692,7805.981356,45.462982,9228.985278,65.475137,15910.458263,20682.075826
History,23.274574,8914.161694,31.88172,7492.204166,50.515927,14902.19842,17874.906652
Crime,21.958441,27601.759972,41.942148,36112.189363,58.51868,63375.730411,14402.467253
Romance,17.484859,26034.954973,45.498836,46090.320974,51.272289,73473.190426,13275.649591
Drama,16.878877,70013.580668,35.622201,91940.901552,43.826162,160754.363574,10134.654732
Comedy,21.601565,60095.55337,55.292873,102291.814344,64.097213,166845.04597,8373.127236
Family,46.620167,33519.899995,137.569937,72912.066603,159.103683,107076.778493,1927.31528
Horror,9.336621,11820.161791,34.483605,20207.392515,41.955231,30837.094673,1746.946979
Adventure,51.093255,65705.926364,141.137342,135068.436282,179.192356,199978.66936,1071.170962
Action,36.766719,77688.077617,89.277186,126237.940703,116.073804,201388.050019,726.056558
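
In the ROI ranking above, the mean ROI for War (about 20682) and History (about 17875) is several orders of magnitude larger than for Fantasy (about 5.96), even though the mean budget and revenue columns do not differ nearly as much. One plausible explanation, not verified here, is that a few films with very small reported budgets dominate the mean. The sketch below uses hypothetical toy values to show how the median, or a minimum-budget filter, reacts to such outliers. The threshold `min_budget = 5` is an arbitrary assumption, and ROI is taken as revenue divided by budget, which matches the Four Rooms row (4.3 / 4.0 = 1.075).

```python
import pandas as pd

# Hypothetical toy data: one War film has a tiny reported budget
toy = pd.DataFrame(
    {
        "genre_name": ["War", "War", "War", "Action", "Action", "Action"],
        "budget_musd": [0.0001, 20.0, 40.0, 30.0, 50.0, 100.0],
        "revenue_musd": [2.0, 30.0, 80.0, 60.0, 150.0, 250.0],
    }
)
toy["ROI"] = toy["revenue_musd"] / toy["budget_musd"]

# Mean vs. median ROI per genre: the single outlier inflates only the mean
roi_stats = toy.groupby("genre_name")["ROI"].agg(["mean", "median"])

# Alternative: keep only films above an (arbitrary) minimum budget, then average
min_budget = 5
roi_filtered = (
    toy[toy["budget_musd"] >= min_budget].groupby("genre_name")["ROI"].mean()
)

print(roi_stats)
print(roi_filtered)
```

On the toy data, War leads by mean ROI but falls behind Action once the median or the budget filter is used, so it may be worth checking the real dataset the same way before drawing conclusions from the ranking.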


In [258]:
genre_wise_performence.nlargest(20,('budget_musd','count'))

,budget_musd,budget_musd,budget_musd,profit_musd,profit_musd,revenue_musd,revenue_musd,ROI
genre_name,mean,sum,count,mean,sum,mean,sum,mean
Drama,16.878877,70013.580668,4148,35.622201,91940.901552,43.826162,160754.363574,10134.654732
Comedy,21.601565,60095.55337,2782,55.292873,102291.814344,64.097213,166845.04597,8373.127236
Thriller,23.370518,55504.980272,2375,51.565089,77399.199214,69.520125,129724.55297,672.383836
Action,36.766719,77688.077617,2113,89.277186,126237.940703,116.073804,201388.050019,726.056558
Romance,17.484859,26034.954973,1489,45.498836,46090.320974,51.272289,73473.190426,13275.649591
Adventure,51.093255,65705.926364,1286,141.137342,135068.436282,179.192356,199978.66936,1071.170962
Horror,9.336621,11820.161791,1266,34.483605,20207.392515,41.955231,30837.094673,1746.946979
Crime,21.958441,27601.759972,1257,41.942148,36112.189363,58.51868,63375.730411,14402.467253
Science Fiction,35.450581,35344.22913,997,99.956942,63372.701088,131.516076,97847.960421,5.281461
Fantasy,46.991335,34256.683311,729,137.24265,69856.509099,166.006633,103920.152145,5.958486
