# TMDB Movie KPI Analysis & Performance Metrics

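This notebook computes key performance indicators (KPIs) on the cleaned TMDB movie data: rankings by revenue, budget, profit, ROI, votes, popularity, and rating, plus franchise and director aggregates.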

## Objectives
1. Identify the best- and worst-performing movies by revenue, budget, profit, and ROI
2. Find the most voted, highest- and lowest-rated, and most popular movies
3. Execute advanced search queries (Sci-Fi Action movies with Bruce Willis; Uma Thurman movies directed by Quentin Tarantino)
4. Compare franchise vs standalone movie performance
5. Identify most successful franchises and directors

## Setup

In [1]:
# Import required libraries
import sys
import os
from pathlib import Path
import pandas as pd
import numpy as np

# Add project root to path and set working directory
project_root = Path.cwd().parent
sys.path.append(str(project_root))
os.chdir(str(project_root))

from src.analytics.kpi_calculator import *
from src.analytics.filters import *
from src.analytics.aggregators import *
from src.utils.helpers import load_config, setup_logging

# Setup logger for notebook
logger = setup_logging(module_name='kpi_notebook')
logger.info("✓ Imports successful")

2025-12-10 08:47:26 - kpi_notebook - INFO - ✓ Imports successful


## 1. Load Configuration and Data

Load the cleaned data from Step 2 (Data Cleaning & Preprocessing).

In [2]:
# Load configuration
config = load_config('config/config.yaml')
processed_path = Path(config['paths']['processed_data'])

# Load cleaned data from Step 2
df = pd.read_parquet(processed_path / 'movies_cleaned.parquet')

logger.info(f"Loaded {len(df)} movies")
logger.info(f"Columns: {list(df.columns)}")
logger.info(f"Memory usage: {df.memory_usage(deep=True).sum() / 1024**2:.2f} MB")

# Display first few rows
df.head()

2025-12-10 08:47:26 - kpi_notebook - INFO - Loaded 18 movies
2025-12-10 08:47:26 - kpi_notebook - INFO - Columns: ['id', 'title', 'tagline', 'release_date', 'genres', 'collection_name', 'original_language', 'budget_musd', 'revenue_musd', 'production_companies', 'production_countries', 'vote_count', 'vote_average', 'popularity', 'runtime', 'overview', 'spoken_languages', 'poster_path', 'cast', 'cast_size', 'director', 'crew_size']
2025-12-10 08:47:26 - kpi_notebook - INFO - Memory usage: 0.02 MB


| | id | title | tagline | release_date | genres | collection_name | original_language | budget_musd | revenue_musd | production_companies | ... | vote_average | popularity | runtime | overview | spoken_languages | poster_path | cast | cast_size | director | crew_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 109445 | Frozen | Only the act of true love will thaw a frozen h... | 2013-11-20 | Animation\|Family\|Adventure\|Fantasy | Frozen Collection | en | 150.0 | 1274.219009 | Walt Disney Animation Studios | ... | 7.25 | 18.277 | 102 | Young princess Anna of Arendelle dreams about ... | English | /itAKcobTYGpYT8Phwjd8c9hleTo.jpg | Idina Menzel\|Kristen Bell\|Jonathan Groff\|Josh ... | 60 | Chris Buck | 285 |
| 1 | 12445 | Harry Potter and the Deathly Hallows: Part 2 | It all ends. | 2011-07-12 | Adventure\|Fantasy | Harry Potter Collection | en | 125.0 | 1341.511219 | Warner Bros. Pictures\|Heyday Films | ... | 8.084 | 17.3221 | 130 | Harry, Ron and Hermione continue their quest t... | English | /c54HpQmuwXjHq2C9wmoACjxoom3.jpg | Daniel Radcliffe\|Emma Watson\|Rupert Grint\|Ralp... | 105 | David Yates | 159 |
| 2 | 135397 | Jurassic World | The park is open. | 2015-06-06 | Action\|Adventure\|Science Fiction\|Thriller | Jurassic Park Collection | en | 150.0 | 1671.537444 | Amblin Entertainment\|Universal Pictures\|Legend... | ... | 6.699 | 9.3758 | 124 | Twenty-two years after the events of Jurassic ... | English | /rhr4y79GpxQF9IsfJItRXVaoGs4.jpg | Chris Pratt\|Bryce Dallas Howard\|Irrfan Khan\|Vi... | 53 | Colin Trevorrow | 426 |
| 3 | 140607 | Star Wars: The Force Awakens | Every generation has a story. | 2015-12-15 | Adventure\|Action\|Science Fiction | Star Wars Collection | en | 245.0 | 2068.223624 | Lucasfilm Ltd.\|Bad Robot | ... | 7.3 | 7.5842 | 136 | Thirty years after defeating the Galactic Empi... | English | /wqnLdwVXoBjKibFRR5U3y0aDUhs.jpg | Harrison Ford\|Mark Hamill\|Carrie Fisher\|Adam D... | 183 | J.J. Abrams | 262 |
| 4 | 168259 | Furious 7 | Vengeance hits home. | 2015-04-01 | Action\|Crime\|Thriller | The Fast and the Furious Collection | en | 190.0 | 1515.4 | Original Film\|One Race\|Universal Pictures | ... | 7.223 | 17.1859 | 139 | Deckard Shaw seeks revenge against Dominic Tor... | العربية\|English\|Español\|ภาษาไทย | /ktofZ9Htrjiy0P6LEowsDaxd3Ri.jpg | Vin Diesel\|Paul Walker\|Jason Statham\|Michelle ... | 49 | James Wan | 227 |


## 2. Best/Worst Performing Movies - Revenue

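Identify the movies with the highest and lowest revenue. Because only 18 movies are loaded, the top-10 and bottom-10 lists overlap by two titles (Frozen II and Furious 7).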
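
The `get_top_by_revenue` and `get_bottom_by_revenue` helpers come from `src.analytics` and their source is not shown in this notebook. As a rough sketch of the kind of ranking they appear to perform, assuming a plain sort on `revenue_musd` and a 1-based `rank` index, the hypothetical `rank_by` function below applies the idea to four rows copied from the tables in this section:

```python
import pandas as pd


def rank_by(df: pd.DataFrame, column: str, top_n: int = 10,
            ascending: bool = False) -> pd.DataFrame:
    """Hypothetical helper: first `top_n` rows ordered by `column`, indexed by rank."""
    ranked = (
        df.dropna(subset=[column])                      # ignore rows without a value
          .sort_values(column, ascending=ascending, kind="stable")
          .head(top_n)
          .reset_index(drop=True)
    )
    ranked.index = pd.RangeIndex(1, len(ranked) + 1, name="rank")
    return ranked


# Four rows copied from this section's tables (values in million USD)
sample = pd.DataFrame({
    "title": ["Avatar", "Titanic", "Frozen", "Incredibles 2"],
    "revenue_musd": [2923.706026, 2264.162353, 1274.219009, 1243.225667],
})

top2 = rank_by(sample, "revenue_musd", top_n=2)                     # highest first
bottom2 = rank_by(sample, "revenue_musd", top_n=2, ascending=True)  # lowest first
print(top2)
print(bottom2)
```

The same function would cover the budget, profit, vote-count, and popularity rankings by changing `column`; whether the project helpers share one implementation this way is an assumption.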

In [3]:
# Highest revenue movies
logger.info("="*60)
logger.info("TOP 10 MOVIES BY REVENUE")
logger.info("="*60)

top_revenue = get_top_by_revenue(df, top_n=10)
top_revenue

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOVIES BY REVENUE


| rank | title | revenue_musd | release_year | budget_musd |
|---|---|---|---|---|
| 1 | Avatar | 2923.706026 | 2009 | 237.0 |
| 2 | Avengers: Endgame | 2799.4391 | 2019 | 356.0 |
| 3 | Titanic | 2264.162353 | 1997 | 200.0 |
| 4 | Star Wars: The Force Awakens | 2068.223624 | 2015 | 245.0 |
| 5 | Avengers: Infinity War | 2052.415039 | 2018 | 300.0 |
| 6 | Jurassic World | 1671.537444 | 2015 | 150.0 |
| 7 | The Lion King | 1662.020819 | 2019 | 260.0 |
| 8 | The Avengers | 1518.815515 | 2012 | 220.0 |
| 9 | Furious 7 | 1515.4 | 2015 | 190.0 |
| 10 | Frozen II | 1453.683476 | 2019 | 150.0 |


In [4]:
# Lowest revenue movies
logger.info("\n" + "="*60)
logger.info("BOTTOM 10 MOVIES BY REVENUE")
logger.info("="*60)

bottom_revenue = get_bottom_by_revenue(df, top_n=10)
bottom_revenue

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - BOTTOM 10 MOVIES BY REVENUE


| rank | title | revenue_musd | release_year | budget_musd |
|---|---|---|---|---|
| 1 | Incredibles 2 | 1243.225667 | 2018 | 200.0 |
| 2 | Beauty and the Beast | 1266.115964 | 2017 | 160.0 |
| 3 | Frozen | 1274.219009 | 2013 | 150.0 |
| 4 | Jurassic World: Fallen Kingdom | 1310.469037 | 2018 | 170.0 |
| 5 | Star Wars: The Last Jedi | 1332.69883 | 2017 | 300.0 |
| 6 | Harry Potter and the Deathly Hallows: Part 2 | 1341.511219 | 2011 | 125.0 |
| 7 | Black Panther | 1349.926083 | 2018 | 200.0 |
| 8 | Avengers: Age of Ultron | 1405.403694 | 2015 | 235.0 |
| 9 | Frozen II | 1453.683476 | 2019 | 150.0 |
| 10 | Furious 7 | 1515.4 | 2015 | 190.0 |


## 3. Best/Worst Performing Movies - Budget

Identify movies with the highest and lowest production budgets.

In [5]:
# Highest budget movies
logger.info("="*60)
logger.info("TOP 10 MOVIES BY BUDGET")
logger.info("="*60)

top_budget = get_top_by_budget(df, top_n=10)
top_budget

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOVIES BY BUDGET


| rank | title | budget_musd | release_year | revenue_musd |
|---|---|---|---|---|
| 1 | Avengers: Endgame | 356.0 | 2019 | 2799.4391 |
| 2 | Star Wars: The Last Jedi | 300.0 | 2017 | 1332.69883 |
| 3 | Avengers: Infinity War | 300.0 | 2018 | 2052.415039 |
| 4 | The Lion King | 260.0 | 2019 | 1662.020819 |
| 5 | Star Wars: The Force Awakens | 245.0 | 2015 | 2068.223624 |
| 6 | Avatar | 237.0 | 2009 | 2923.706026 |
| 7 | Avengers: Age of Ultron | 235.0 | 2015 | 1405.403694 |
| 8 | The Avengers | 220.0 | 2012 | 1518.815515 |
| 9 | Titanic | 200.0 | 1997 | 2264.162353 |
| 10 | Incredibles 2 | 200.0 | 2018 | 1243.225667 |


In [6]:
# Lowest budget movies
logger.info("\n" + "="*60)
logger.info("BOTTOM 10 MOVIES BY BUDGET")
logger.info("="*60)

bottom_budget = get_bottom_by_budget(df, top_n=10)
bottom_budget

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - BOTTOM 10 MOVIES BY BUDGET


| rank | title | budget_musd | release_year | revenue_musd |
|---|---|---|---|---|
| 1 | Harry Potter and the Deathly Hallows: Part 2 | 125.0 | 2011 | 1341.511219 |
| 2 | Frozen | 150.0 | 2013 | 1274.219009 |
| 3 | Jurassic World | 150.0 | 2015 | 1671.537444 |
| 4 | Frozen II | 150.0 | 2019 | 1453.683476 |
| 5 | Beauty and the Beast | 160.0 | 2017 | 1266.115964 |
| 6 | Jurassic World: Fallen Kingdom | 170.0 | 2018 | 1310.469037 |
| 7 | Furious 7 | 190.0 | 2015 | 1515.4 |
| 8 | Incredibles 2 | 200.0 | 2018 | 1243.225667 |
| 9 | Black Panther | 200.0 | 2018 | 1349.926083 |
| 10 | Titanic | 200.0 | 1997 | 2264.162353 |


## 4. Best/Worst Performing Movies - Profit

Identify movies with the highest and lowest profit (Revenue - Budget).

In [7]:
# Highest profit movies
logger.info("="*60)
logger.info("TOP 10 MOVIES BY PROFIT")
logger.info("="*60)

top_profit = get_top_by_profit(df, top_n=10)
top_profit

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOVIES BY PROFIT


| rank | title | profit_musd | release_year | budget_musd | revenue_musd |
|---|---|---|---|---|---|
| 1 | Avatar | 2686.706026 | 2009 | 237.0 | 2923.706026 |
| 2 | Avengers: Endgame | 2443.4391 | 2019 | 356.0 | 2799.4391 |
| 3 | Titanic | 2064.162353 | 1997 | 200.0 | 2264.162353 |
| 4 | Star Wars: The Force Awakens | 1823.223624 | 2015 | 245.0 | 2068.223624 |
| 5 | Avengers: Infinity War | 1752.415039 | 2018 | 300.0 | 2052.415039 |
| 6 | Jurassic World | 1521.537444 | 2015 | 150.0 | 1671.537444 |
| 7 | The Lion King | 1402.020819 | 2019 | 260.0 | 1662.020819 |
| 8 | Furious 7 | 1325.4 | 2015 | 190.0 | 1515.4 |
| 9 | Frozen II | 1303.683476 | 2019 | 150.0 | 1453.683476 |
| 10 | The Avengers | 1298.815515 | 2012 | 220.0 | 1518.815515 |


In [8]:
# Lowest profit movies
logger.info("\n" + "="*60)
logger.info("BOTTOM 10 MOVIES BY PROFIT")
logger.info("="*60)

bottom_profit = get_bottom_by_profit(df, top_n=10)
bottom_profit

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - BOTTOM 10 MOVIES BY PROFIT


| rank | title | profit_musd | release_year | budget_musd | revenue_musd |
|---|---|---|---|---|---|
| 1 | Star Wars: The Last Jedi | 1032.69883 | 2017 | 300.0 | 1332.69883 |
| 2 | Incredibles 2 | 1043.225667 | 2018 | 200.0 | 1243.225667 |
| 3 | Beauty and the Beast | 1106.115964 | 2017 | 160.0 | 1266.115964 |
| 4 | Frozen | 1124.219009 | 2013 | 150.0 | 1274.219009 |
| 5 | Jurassic World: Fallen Kingdom | 1140.469037 | 2018 | 170.0 | 1310.469037 |
| 6 | Black Panther | 1149.926083 | 2018 | 200.0 | 1349.926083 |
| 7 | Avengers: Age of Ultron | 1170.403694 | 2015 | 235.0 | 1405.403694 |
| 8 | Harry Potter and the Deathly Hallows: Part 2 | 1216.511219 | 2011 | 125.0 | 1341.511219 |
| 9 | The Avengers | 1298.815515 | 2012 | 220.0 | 1518.815515 |
| 10 | Frozen II | 1303.683476 | 2019 | 150.0 | 1453.683476 |


## 5. Best/Worst Performing Movies - ROI

Calculate Return on Investment (ROI) for movies with budget >= $10M.

ROI = (Revenue - Budget) / Budget × 100
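
As a worked check of the formula, the sketch below recomputes ROI for two rows copied from the tables in this section. It assumes `budget_musd` and `revenue_musd` are in million USD, so the $10M threshold becomes `10.0`; the constant name `MIN_BUDGET_MUSD` is illustrative and not taken from the project code.

```python
import pandas as pd

MIN_BUDGET_MUSD = 10.0  # assumed encoding of the "$10M" budget threshold

movies = pd.DataFrame({
    "title": ["Avatar", "Star Wars: The Last Jedi"],
    "budget_musd": [237.0, 300.0],
    "revenue_musd": [2923.706026, 1332.69883],
})

# Keep only movies that meet the budget threshold, then apply the formula
eligible = movies[movies["budget_musd"] >= MIN_BUDGET_MUSD].copy()
eligible["profit_musd"] = eligible["revenue_musd"] - eligible["budget_musd"]
eligible["roi"] = eligible["profit_musd"] / eligible["budget_musd"] * 100

# Avatar: about 1133.63 %, The Last Jedi: about 344.23 %
print(eligible[["title", "roi"]].round(2))
```

For Avatar this is (2923.706026 - 237) / 237 × 100 ≈ 1133.63, which agrees with the `roi` column reported by `get_top_by_roi`.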

In [9]:
# Highest ROI (budget >= $10M)
logger.info("="*60)
logger.info("TOP 10 MOVIES BY ROI (Budget >= $10M)")
logger.info("="*60)

top_roi = get_top_by_roi(df, top_n=10)
top_roi

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOVIES BY ROI (Budget >= $10M)


| rank | title | roi | release_year | budget_musd | revenue_musd |
|---|---|---|---|---|---|
| 1 | Avatar | 1133.631235 | 2009 | 237.0 | 2923.706026 |
| 2 | Titanic | 1032.081177 | 1997 | 200.0 | 2264.162353 |
| 3 | Jurassic World | 1014.358296 | 2015 | 150.0 | 1671.537444 |
| 4 | Harry Potter and the Deathly Hallows: Part 2 | 973.208975 | 2011 | 125.0 | 1341.511219 |
| 5 | Frozen II | 869.122317 | 2019 | 150.0 | 1453.683476 |
| 6 | Frozen | 749.479339 | 2013 | 150.0 | 1274.219009 |
| 7 | Star Wars: The Force Awakens | 744.172908 | 2015 | 245.0 | 2068.223624 |
| 8 | Furious 7 | 697.578947 | 2015 | 190.0 | 1515.4 |
| 9 | Beauty and the Beast | 691.322478 | 2017 | 160.0 | 1266.115964 |
| 10 | Avengers: Endgame | 686.359298 | 2019 | 356.0 | 2799.4391 |


In [10]:
# Lowest ROI (budget >= $10M)
logger.info("\n" + "="*60)
logger.info("BOTTOM 10 MOVIES BY ROI (Budget >= $10M)")
logger.info("="*60)

bottom_roi = get_bottom_by_roi(df, top_n=10)
bottom_roi

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - BOTTOM 10 MOVIES BY ROI (Budget >= $10M)


| rank | title | roi | release_year | budget_musd | revenue_musd |
|---|---|---|---|---|---|
| 1 | Star Wars: The Last Jedi | 344.232943 | 2017 | 300.0 | 1332.69883 |
| 2 | Avengers: Age of Ultron | 498.044125 | 2015 | 235.0 | 1405.403694 |
| 3 | Incredibles 2 | 521.612833 | 2018 | 200.0 | 1243.225667 |
| 4 | The Lion King | 539.238777 | 2019 | 260.0 | 1662.020819 |
| 5 | Black Panther | 574.963042 | 2018 | 200.0 | 1349.926083 |
| 6 | Avengers: Infinity War | 584.138346 | 2018 | 300.0 | 2052.415039 |
| 7 | The Avengers | 590.370689 | 2012 | 220.0 | 1518.815515 |
| 8 | Jurassic World: Fallen Kingdom | 670.864139 | 2018 | 170.0 | 1310.469037 |
| 9 | Avengers: Endgame | 686.359298 | 2019 | 356.0 | 2799.4391 |
| 10 | Beauty and the Beast | 691.322478 | 2017 | 160.0 | 1266.115964 |


## 6. Most Voted and Most Popular Movies

Identify movies with the highest number of votes and highest popularity scores.

In [11]:
# Most voted movies
logger.info("="*60)
logger.info("TOP 10 MOST VOTED MOVIES")
logger.info("="*60)

most_voted = get_most_voted(df, top_n=10)
most_voted

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOST VOTED MOVIES


| rank | title | vote_count | release_year | vote_average |
|---|---|---|---|---|
| 1 | The Avengers | 34329 | 2012 | 7.87 |
| 2 | Avatar | 32883 | 2009 | 7.594 |
| 3 | Avengers: Infinity War | 31188 | 2018 | 8.235 |
| 4 | Avengers: Endgame | 26978 | 2019 | 8.237 |
| 5 | Titanic | 26519 | 1997 | 7.903 |
| 6 | Avengers: Age of Ultron | 23882 | 2015 | 7.271 |
| 7 | Black Panther | 22980 | 2018 | 7.366 |
| 8 | Harry Potter and the Deathly Hallows: Part 2 | 21464 | 2011 | 8.084 |
| 9 | Jurassic World | 21127 | 2015 | 6.699 |
| 10 | Star Wars: The Force Awakens | 20104 | 2015 | 7.3 |


In [12]:
# Most popular movies
logger.info("\n" + "="*60)
logger.info("TOP 10 MOST POPULAR MOVIES")
logger.info("="*60)

most_popular = get_most_popular(df, top_n=10)
most_popular

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 MOST POPULAR MOVIES


| rank | title | popularity | release_year | vote_average |
|---|---|---|---|---|
| 1 | The Avengers | 40.3021 | 2012 | 7.87 |
| 2 | Avatar | 38.2316 | 2009 | 7.594 |
| 3 | Titanic | 23.4289 | 1997 | 7.903 |
| 4 | Avengers: Infinity War | 20.7267 | 2018 | 8.235 |
| 5 | Frozen | 18.277 | 2013 | 7.25 |
| 6 | Harry Potter and the Deathly Hallows: Part 2 | 17.3221 | 2011 | 8.084 |
| 7 | Furious 7 | 17.1859 | 2015 | 7.223 |
| 8 | Avengers: Endgame | 12.0878 | 2019 | 8.237 |
| 9 | Avengers: Age of Ultron | 11.1058 | 2015 | 7.271 |
| 10 | Beauty and the Beast | 9.8219 | 2017 | 6.969 |


## 7. Highest/Lowest Rated Movies

Restrict the ranking to movies with at least 10 votes, so that averages based on very few votes do not distort it.
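
A minimal sketch of such a vote-count filter is shown below. The first two rows are copied from the tables in this section; the third row is invented purely to show a movie being excluded, and `MIN_VOTES` is an illustrative name rather than the project's own constant.

```python
import pandas as pd

MIN_VOTES = 10  # threshold stated above

ratings = pd.DataFrame({
    "title": [
        "Avengers: Endgame",
        "Jurassic World: Fallen Kingdom",
        "Hypothetical Obscure Film",  # invented row, not part of the dataset
    ],
    "vote_average": [8.237, 6.537, 10.0],
    "vote_count": [26978, 12413, 3],
})

# Drop movies with too few votes before ranking by average rating
reliable = ratings[ratings["vote_count"] >= MIN_VOTES]
top_rated_sketch = reliable.sort_values("vote_average", ascending=False)
print(top_rated_sketch)
```

Without the filter, the invented three-vote entry would rank first with a perfect 10.0.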

In [13]:
# Highest rated movies (vote_count >= 10)
logger.info("="*60)
logger.info("TOP 10 HIGHEST RATED MOVIES (Votes >= 10)")
logger.info("="*60)

top_rated = get_top_rated(df, top_n=10)
top_rated

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 HIGHEST RATED MOVIES (Votes >= 10)


| rank | title | vote_average | release_year | vote_count |
|---|---|---|---|---|
| 1 | Avengers: Endgame | 8.237 | 2019 | 26978 |
| 2 | Avengers: Infinity War | 8.235 | 2018 | 31188 |
| 3 | Harry Potter and the Deathly Hallows: Part 2 | 8.084 | 2011 | 21464 |
| 4 | Titanic | 7.903 | 1997 | 26519 |
| 5 | The Avengers | 7.87 | 2012 | 34329 |
| 6 | Avatar | 7.594 | 2009 | 32883 |
| 7 | Incredibles 2 | 7.455 | 2018 | 13373 |
| 8 | Black Panther | 7.366 | 2018 | 22980 |
| 9 | Star Wars: The Force Awakens | 7.3 | 2015 | 20104 |
| 10 | Avengers: Age of Ultron | 7.271 | 2015 | 23882 |


In [14]:
# Lowest rated movies (vote_count >= 10)
logger.info("\n" + "="*60)
logger.info("BOTTOM 10 LOWEST RATED MOVIES (Votes >= 10)")
logger.info("="*60)

bottom_rated = get_bottom_rated(df, top_n=10)
bottom_rated

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - BOTTOM 10 LOWEST RATED MOVIES (Votes >= 10)


| rank | title | vote_average | release_year | vote_count |
|---|---|---|---|---|
| 1 | Jurassic World: Fallen Kingdom | 6.537 | 2018 | 12413 |
| 2 | Jurassic World | 6.699 | 2015 | 21127 |
| 3 | Star Wars: The Last Jedi | 6.8 | 2017 | 15928 |
| 4 | Beauty and the Beast | 6.969 | 2017 | 15835 |
| 5 | The Lion King | 7.102 | 2019 | 10571 |
| 6 | Furious 7 | 7.223 | 2015 | 11035 |
| 7 | Frozen II | 7.241 | 2019 | 10078 |
| 8 | Frozen | 7.25 | 2013 | 17188 |
| 9 | Avengers: Age of Ultron | 7.271 | 2015 | 23882 |
| 10 | Star Wars: The Force Awakens | 7.3 | 2015 | 20104 |


## 8. Advanced Movie Search Queries

Execute complex multi-criteria searches.
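
The search helpers used in this section are imported from `src.analytics` and their source is not shown. One way such multi-criteria filters could be written, assuming `genres` and `cast` are pipe-delimited strings as in the `df.head()` output, is sketched below. The helper name `has_all` is hypothetical, and the cast lists are shortened to the first two names.

```python
import pandas as pd


def has_all(series: pd.Series, wanted: list, sep: str = "|") -> pd.Series:
    """Flag rows whose delimited string contains every wanted item as a whole token."""
    wanted_set = set(wanted)
    return (
        series.fillna("")
              .map(lambda s: wanted_set.issubset(s.split(sep)))
              .astype(bool)
    )


# Three rows based on the data shown earlier (cast shortened for brevity)
sample = pd.DataFrame({
    "title": ["Jurassic World", "Furious 7", "Frozen"],
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Crime|Thriller",
        "Animation|Family|Adventure|Fantasy",
    ],
    "cast": ["Chris Pratt|Bryce Dallas Howard", "Vin Diesel|Paul Walker",
             "Idina Menzel|Kristen Bell"],
    "director": ["Colin Trevorrow", "James Wan", "Chris Buck"],
    "vote_average": [6.699, 7.223, 7.25],
    "runtime": [124, 139, 102],
})

# Search 1: Sci-Fi Action with Bruce Willis, best rated first
scifi_action = has_all(sample["genres"], ["Science Fiction", "Action"])
search1 = sample[scifi_action & has_all(sample["cast"], ["Bruce Willis"])]
search1 = search1.sort_values("vote_average", ascending=False)

# Search 2: Uma Thurman directed by Quentin Tarantino, shortest runtime first
by_tarantino = sample["director"] == "Quentin Tarantino"
search2 = sample[has_all(sample["cast"], ["Uma Thurman"]) & by_tarantino]
search2 = search2.sort_values("runtime")

print(len(search1), len(search2))
```

Splitting on the delimiter and comparing whole tokens avoids the partial-name matches that a plain substring test could produce. On this sample both searches come back empty, which is consistent with the `Found 0 movies` log lines in this section.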

### Search 1: Best-Rated Science Fiction Action Movies with Bruce Willis

Find Sci-Fi Action movies starring Bruce Willis, sorted by rating (highest to lowest).

In [15]:
# Search 1: Bruce Willis in Sci-Fi Action
logger.info("="*60)
logger.info("SEARCH 1: Sci-Fi Action Movies with Bruce Willis")
logger.info("="*60)

search1_results = search_scifi_action_bruce_willis(df)
logger.info(f"Found {len(search1_results)} movies")
search1_results

2025-12-10 08:47:26 - kpi_notebook - INFO - SEARCH 1: Sci-Fi Action Movies with Bruce Willis
2025-12-10 08:47:26 - kpi_notebook - INFO - Found 0 movies


| title | vote_average | vote_count | genres | director | revenue_musd |
|---|---|---|---|---|---|


### Search 2: Uma Thurman + Quentin Tarantino Movies

Find movies starring Uma Thurman, directed by Quentin Tarantino (sorted by runtime - shortest to longest).

In [16]:
# Search 2: Uma Thurman directed by Quentin Tarantino
logger.info("\n" + "="*60)
logger.info("SEARCH 2: Uma Thurman + Quentin Tarantino Movies")
logger.info("="*60)

search2_results = search_uma_tarantino(df)
logger.info(f"Found {len(search2_results)} movies")
search2_results

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - SEARCH 2: Uma Thurman + Quentin Tarantino Movies
2025-12-10 08:47:26 - kpi_notebook - INFO - Found 0 movies


| title | runtime | vote_average | genres | revenue_musd | budget_musd |
|---|---|---|---|---|---|


## 9. Franchise vs Standalone Movie Performance

Compare movies that belong to franchises/collections vs standalone films.

**Metrics:**
- Mean Revenue
- Median ROI
- Mean Budget
- Mean Popularity
- Mean Rating
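
The metrics above can be sketched with a pandas `groupby`. The snippet assumes a movie counts as a franchise entry when `collection_name` is not null, and that Titanic and Beauty and the Beast are the two standalone titles; the latter is inferred from the reported figures, not confirmed by the source code. The implementation of `compare_franchise_vs_standalone` may differ.

```python
import pandas as pd

# Four rows using values from earlier tables; None marks an assumed missing collection
movies = pd.DataFrame({
    "title": ["Titanic", "Beauty and the Beast", "Avatar", "Frozen"],
    "collection_name": [None, None, "Avatar Collection", "Frozen Collection"],
    "budget_musd": [200.0, 160.0, 237.0, 150.0],
    "revenue_musd": [2264.162353, 1266.115964, 2923.706026, 1274.219009],
    "popularity": [23.4289, 9.8219, 38.2316, 18.277],
    "vote_average": [7.903, 6.969, 7.594, 7.25],
})

# Label each movie, then derive ROI in percent
movies["movie_type"] = movies["collection_name"].notna().map(
    {True: "Franchise", False: "Standalone"}
)
movies["roi"] = (movies["revenue_musd"] - movies["budget_musd"]) / movies["budget_musd"] * 100

comparison_sketch = movies.groupby("movie_type").agg(
    Count=("title", "count"),
    Mean_Revenue_MUSD=("revenue_musd", "mean"),
    Median_ROI_Percent=("roi", "median"),
    Mean_Budget_MUSD=("budget_musd", "mean"),
    Mean_Popularity=("popularity", "mean"),
    Mean_Rating=("vote_average", "mean"),
).round(2)
print(comparison_sketch)
```

Under these assumptions the `Standalone` row matches the output of `compare_franchise_vs_standalone` in this section. The `Franchise` row does not, because the sketch includes only two of the 16 franchise movies.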

In [17]:
# Compare franchise vs standalone
logger.info("="*60)
logger.info("FRANCHISE VS STANDALONE COMPARISON")
logger.info("="*60)

comparison = compare_franchise_vs_standalone(df)
comparison

2025-12-10 08:47:26 - kpi_notebook - INFO - FRANCHISE VS STANDALONE COMPARISON


| Movie_Type | Count | Mean_Revenue_MUSD | Median_ROI_Percent | Mean_Budget_MUSD | Mean_Popularity | Mean_Rating |
|---|---|---|---|---|---|---|
| Standalone | 2 | 1765.14 | 861.7 | 180.0 | 16.63 | 7.44 |
| Franchise | 16 | 1682.67 | 678.61 | 218.0 | 15.0 | 7.39 |


## 10. Most Successful Franchises

Rank movie franchises by total revenue, mean rating, and movie count.
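
A per-franchise summary of this kind can be sketched with named aggregations. The output column names mirror the tables in this section, but the grouping key (`collection_name`) and the rounding to two decimals are assumptions about how `get_top_franchises` works, not its actual implementation.

```python
import pandas as pd

# Rows copied from earlier tables; Titanic is assumed to have no collection
movies = pd.DataFrame({
    "title": ["Frozen", "Frozen II", "Avatar", "Titanic"],
    "collection_name": ["Frozen Collection", "Frozen Collection",
                        "Avatar Collection", None],
    "budget_musd": [150.0, 150.0, 237.0, 200.0],
    "revenue_musd": [1274.219009, 1453.683476, 2923.706026, 2264.162353],
    "vote_average": [7.25, 7.241, 7.594, 7.903],
})

franchises = (
    movies.dropna(subset=["collection_name"])   # standalone movies are excluded
          .groupby("collection_name")
          .agg(
              Movie_Count=("title", "count"),
              Total_Budget_MUSD=("budget_musd", "sum"),
              Mean_Budget_MUSD=("budget_musd", "mean"),
              Total_Revenue_MUSD=("revenue_musd", "sum"),
              Mean_Revenue_MUSD=("revenue_musd", "mean"),
              Mean_Rating=("vote_average", "mean"),
          )
          .round(2)
          .sort_values("Total_Revenue_MUSD", ascending=False)
)
print(franchises)
```

Sorting the same frame by `Mean_Rating` or `Movie_Count` instead would give the other two rankings.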

In [18]:
# Top franchises by total revenue
logger.info("="*60)
logger.info("TOP 10 FRANCHISES BY TOTAL REVENUE")
logger.info("="*60)

top_franchises_revenue = get_top_franchises(df, sort_by='total_revenue', top_n=10)
top_franchises_revenue

2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 FRANCHISES BY TOTAL REVENUE


| Rank | Franchise | Movie_Count | Total_Budget_MUSD | Mean_Budget_MUSD | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|---|---|
| 1 | The Avengers Collection | 4 | 1111.0 | 277.75 | 7776.07 | 1944.02 | 7.9 |
| 2 | Star Wars Collection | 2 | 545.0 | 272.5 | 3400.92 | 1700.46 | 7.05 |
| 3 | Jurassic Park Collection | 2 | 320.0 | 160.0 | 2982.01 | 1491.0 | 6.62 |
| 4 | Avatar Collection | 1 | 237.0 | 237.0 | 2923.71 | 2923.71 | 7.59 |
| 5 | Frozen Collection | 2 | 300.0 | 150.0 | 2727.9 | 1363.95 | 7.25 |
| 6 | The Lion King (Reboot) Collection | 1 | 260.0 | 260.0 | 1662.02 | 1662.02 | 7.1 |
| 7 | The Fast and the Furious Collection | 1 | 190.0 | 190.0 | 1515.4 | 1515.4 | 7.22 |
| 8 | Black Panther Collection | 1 | 200.0 | 200.0 | 1349.93 | 1349.93 | 7.37 |
| 9 | Harry Potter Collection | 1 | 125.0 | 125.0 | 1341.51 | 1341.51 | 8.08 |
| 10 | The Incredibles Collection | 1 | 200.0 | 200.0 | 1243.23 | 1243.23 | 7.46 |


In [19]:
# Top franchises by mean rating
logger.info("\n" + "="*60)
logger.info("TOP 10 FRANCHISES BY MEAN RATING")
logger.info("="*60)

top_franchises_rating = get_top_franchises(df, sort_by='mean_rating', top_n=10)
top_franchises_rating

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 FRANCHISES BY MEAN RATING


| Rank | Franchise | Movie_Count | Total_Budget_MUSD | Mean_Budget_MUSD | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|---|---|
| 1 | Harry Potter Collection | 1 | 125.0 | 125.0 | 1341.51 | 1341.51 | 8.08 |
| 2 | The Avengers Collection | 4 | 1111.0 | 277.75 | 7776.07 | 1944.02 | 7.9 |
| 3 | Avatar Collection | 1 | 237.0 | 237.0 | 2923.71 | 2923.71 | 7.59 |
| 4 | The Incredibles Collection | 1 | 200.0 | 200.0 | 1243.23 | 1243.23 | 7.46 |
| 5 | Black Panther Collection | 1 | 200.0 | 200.0 | 1349.93 | 1349.93 | 7.37 |
| 6 | Frozen Collection | 2 | 300.0 | 150.0 | 2727.9 | 1363.95 | 7.25 |
| 7 | The Fast and the Furious Collection | 1 | 190.0 | 190.0 | 1515.4 | 1515.4 | 7.22 |
| 8 | The Lion King (Reboot) Collection | 1 | 260.0 | 260.0 | 1662.02 | 1662.02 | 7.1 |
| 9 | Star Wars Collection | 2 | 545.0 | 272.5 | 3400.92 | 1700.46 | 7.05 |
| 10 | Jurassic Park Collection | 2 | 320.0 | 160.0 | 2982.01 | 1491.0 | 6.62 |


In [20]:
# Top franchises by movie count
logger.info("\n" + "="*60)
logger.info("TOP 10 FRANCHISES BY MOVIE COUNT")
logger.info("="*60)

top_franchises_count = get_top_franchises(df, sort_by='movie_count', top_n=10)
top_franchises_count

2025-12-10 08:47:26 - kpi_notebook - INFO - 
2025-12-10 08:47:26 - kpi_notebook - INFO - TOP 10 FRANCHISES BY MOVIE COUNT


| Rank | Franchise | Movie_Count | Total_Budget_MUSD | Mean_Budget_MUSD | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|---|---|
| 1 | The Avengers Collection | 4 | 1111.0 | 277.75 | 7776.07 | 1944.02 | 7.9 |
| 2 | Star Wars Collection | 2 | 545.0 | 272.5 | 3400.92 | 1700.46 | 7.05 |
| 3 | Jurassic Park Collection | 2 | 320.0 | 160.0 | 2982.01 | 1491.0 | 6.62 |
| 4 | Frozen Collection | 2 | 300.0 | 150.0 | 2727.9 | 1363.95 | 7.25 |
| 5 | Avatar Collection | 1 | 237.0 | 237.0 | 2923.71 | 2923.71 | 7.59 |
| 6 | The Lion King (Reboot) Collection | 1 | 260.0 | 260.0 | 1662.02 | 1662.02 | 7.1 |
| 7 | The Fast and the Furious Collection | 1 | 190.0 | 190.0 | 1515.4 | 1515.4 | 7.22 |
| 8 | Black Panther Collection | 1 | 200.0 | 200.0 | 1349.93 | 1349.93 | 7.37 |
| 9 | Harry Potter Collection | 1 | 125.0 | 125.0 | 1341.51 | 1341.51 | 8.08 |
| 10 | The Incredibles Collection | 1 | 200.0 | 200.0 | 1243.23 | 1243.23 | 7.46 |


## 11. Most Successful Directors

Rank directors by total revenue, mean rating, and movie count.

In [21]:
# Top directors by total revenue
logger.info("="*60)
logger.info("TOP 10 DIRECTORS BY TOTAL REVENUE")
logger.info("="*60)

top_directors_revenue = get_top_directors(df, sort_by='total_revenue', top_n=10)
top_directors_revenue

2025-12-10 08:47:27 - kpi_notebook - INFO - TOP 10 DIRECTORS BY TOTAL REVENUE


| Rank | Director | Movie_Count | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|
| 1 | James Cameron | 2 | 5187.87 | 2593.93 | 7.75 |
| 2 | Joss Whedon | 2 | 2924.22 | 1462.11 | 7.57 |
| 3 | Anthony Russo | 1 | 2799.44 | 2799.44 | 8.24 |
| 4 | J.J. Abrams | 1 | 2068.22 | 2068.22 | 7.3 |
| 5 | Joe Russo | 1 | 2052.42 | 2052.42 | 8.24 |
| 6 | Colin Trevorrow | 1 | 1671.54 | 1671.54 | 6.7 |
| 7 | Jon Favreau | 1 | 1662.02 | 1662.02 | 7.1 |
| 8 | James Wan | 1 | 1515.4 | 1515.4 | 7.22 |
| 9 | Jennifer Lee | 1 | 1453.68 | 1453.68 | 7.24 |
| 10 | Ryan Coogler | 1 | 1349.93 | 1349.93 | 7.37 |


In [22]:
# Top directors by mean rating
logger.info("\n" + "="*60)
logger.info("TOP 10 DIRECTORS BY MEAN RATING")
logger.info("="*60)

top_directors_rating = get_top_directors(df, sort_by='mean_rating', top_n=10)
top_directors_rating

2025-12-10 08:47:27 - kpi_notebook - INFO - 
2025-12-10 08:47:27 - kpi_notebook - INFO - TOP 10 DIRECTORS BY MEAN RATING


| Rank | Director | Movie_Count | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|
| 1 | Anthony Russo | 1 | 2799.44 | 2799.44 | 8.24 |
| 2 | Joe Russo | 1 | 2052.42 | 2052.42 | 8.24 |
| 3 | David Yates | 1 | 1341.51 | 1341.51 | 8.08 |
| 4 | James Cameron | 2 | 5187.87 | 2593.93 | 7.75 |
| 5 | Joss Whedon | 2 | 2924.22 | 1462.11 | 7.57 |
| 6 | Brad Bird | 1 | 1243.23 | 1243.23 | 7.46 |
| 7 | Ryan Coogler | 1 | 1349.93 | 1349.93 | 7.37 |
| 8 | J.J. Abrams | 1 | 2068.22 | 2068.22 | 7.3 |
| 9 | Chris Buck | 1 | 1274.22 | 1274.22 | 7.25 |
| 10 | Jennifer Lee | 1 | 1453.68 | 1453.68 | 7.24 |


In [23]:
# Top directors by movie count
logger.info("\n" + "="*60)
logger.info("TOP 10 DIRECTORS BY MOVIE COUNT")
logger.info("="*60)

top_directors_count = get_top_directors(df, sort_by='movie_count', top_n=10)
top_directors_count

2025-12-10 08:47:27 - kpi_notebook - INFO - 
2025-12-10 08:47:27 - kpi_notebook - INFO - TOP 10 DIRECTORS BY MOVIE COUNT


| Rank | Director | Movie_Count | Total_Revenue_MUSD | Mean_Revenue_MUSD | Mean_Rating |
|---|---|---|---|---|---|
| 1 | James Cameron | 2 | 5187.87 | 2593.93 | 7.75 |
| 2 | Joss Whedon | 2 | 2924.22 | 1462.11 | 7.57 |
| 3 | Anthony Russo | 1 | 2799.44 | 2799.44 | 8.24 |
| 4 | J.J. Abrams | 1 | 2068.22 | 2068.22 | 7.3 |
| 5 | Joe Russo | 1 | 2052.42 | 2052.42 | 8.24 |
| 6 | Colin Trevorrow | 1 | 1671.54 | 1671.54 | 6.7 |
| 7 | Jon Favreau | 1 | 1662.02 | 1662.02 | 7.1 |
| 8 | James Wan | 1 | 1515.4 | 1515.4 | 7.22 |
| 9 | Jennifer Lee | 1 | 1453.68 | 1453.68 | 7.24 |
| 10 | Ryan Coogler | 1 | 1349.93 | 1349.93 | 7.37 |
