# Phase 2 Project (Group 5) 


<h1><center>From Scripts to Screens: Analyzing Revenue, Genre Evolution and Industry Trends in Movie Production</center></h1>


![Movie Theater Night Out](https://www.fromfrugaltofree.com/wp-content/uploads/2023/12/movie-theater-night-out-e1691641716226-1170x550.jpg.webp)

## Project Problem

Your company sees the big companies creating original video content and wants to get in on the fun. It has decided to create a new movie studio, but it has no experience creating movies. You are charged with exploring what types of films are currently doing best at the box office. You must then translate those findings into actionable insights that the head of the company's new movie studio can use to help decide what types of films to create.

## 1. Project Introduction

In recent years, the film industry has experienced rapid growth, with streaming services and traditional studios alike competing for viewers' attention. For companies entering this space, understanding the dynamics of profitability is essential. One of the most influential factors in a movie's financial success is its genre, which can heavily influence production costs, audience appeal, and box office revenue. This report explores the profitability of different film genres, helping new studios make data-informed decisions about which genres to focus on in future productions.

As we analyze the trends and performance metrics associated with various genres, our goal is to identify patterns and factors contributing to profitability. This analysis not only focuses on historical box office success but also considers seasonal trends, budget requirements, and audience preferences to create a comprehensive view of genre-driven profitability.

---

*Key Questions Addressed in This Report:*

Q1. What is the relationship between production budgets and profitability, in terms of both domestic and worldwide earnings?

Q2. Which genres are most profitable, and what key trends exist in box office performance?

Q3. How do directors influence box office performance, and which directors consistently contribute to higher success?

Q4. How do critic ratings and audience ratings correlate with movies' box office performance metrics?

Q5. What market trends, whether seasonal or annual, influence box office success?

This report combines historical data with analytical insights to help studios navigate the complex landscape of genre profitability. By addressing these questions, it aims to offer practical guidance on selecting film genres that align with both audience demand and financial objectives.


## 2. Data Collection

In this section, we check that the data covers a reasonable time range for analyzing recent trends and that it is consistent, especially with regard to revenue figures.
The data sources are:
- Rotten Tomatoes
- The Numbers

In [172]:
# import necessary libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import statsmodels.api as sm
import requests
import zipfile
import os
import warnings
warnings.filterwarnings('ignore')


In [173]:
# Loading the rt.movie_info dataframe

df_movie_info = pd.read_csv('zippedData/rt.movie_info.tsv.gz', compression='gzip', sep='\t')
df_movie_info.head()


Unnamed: 0,id,synopsis,rating,genre,director,writer,theater_date,dvd_date,currency,box_office,runtime,studio
0,1,"This gritty, fast-paced, and innovative police...",R,Action and Adventure|Classics|Drama,William Friedkin,Ernest Tidyman,"Oct 9, 1971","Sep 25, 2001",,,104 minutes,
1,3,"New York City, not-too-distant-future: Eric Pa...",R,Drama|Science Fiction and Fantasy,David Cronenberg,David Cronenberg|Don DeLillo,"Aug 17, 2012","Jan 1, 2013",$,600000.0,108 minutes,Entertainment One
2,5,Illeana Douglas delivers a superb performance ...,R,Drama|Musical and Performing Arts,Allison Anders,Allison Anders,"Sep 13, 1996","Apr 18, 2000",,,116 minutes,
3,6,Michael Douglas runs afoul of a treacherous su...,R,Drama|Mystery and Suspense,Barry Levinson,Paul Attanasio|Michael Crichton,"Dec 9, 1994","Aug 27, 1997",,,128 minutes,
4,7,,NR,Drama|Romance,Rodney Bennett,Giles Cooper,,,,,200 minutes,
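
The `genre` column above packs several genres into one pipe-separated string, and `runtime` is stored as text such as `104 minutes`. Answering the genre question (Q2) will likely require one row per movie-genre pair. The following is a minimal sketch of one way to do that, using a few rows copied from the output above; the names `sample_info`, `tidy_movie_info`, `tidy_info`, and `runtime_minutes` are hypothetical and are not part of the original notebook.

```python
import pandas as pd

# A few rows copied from the df_movie_info.head() output above.
sample_info = pd.DataFrame({
    "id": [1, 3, 7],
    "genre": [
        "Action and Adventure|Classics|Drama",
        "Drama|Science Fiction and Fantasy",
        "Drama|Romance",
    ],
    "runtime": ["104 minutes", "108 minutes", "200 minutes"],
})


def tidy_movie_info(df):
    """Return a copy with numeric runtime and one row per (movie, genre)."""
    out = df.copy()
    # Pull the leading number out of strings like "104 minutes".
    minutes = out["runtime"].str.extract(r"(\d+)", expand=False)
    out["runtime_minutes"] = pd.to_numeric(minutes, errors="coerce")
    # Split the pipe-separated genres, then give each genre its own row.
    out["genre"] = out["genre"].str.split("|")
    return out.explode("genre").reset_index(drop=True)


tidy_info = tidy_movie_info(sample_info)
print(tidy_info["genre"].value_counts())
```

Because a multi-genre movie then appears once per genre, any per-genre total computed from this shape counts that movie in every genre it belongs to, which should be kept in mind when comparing genres.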


In [174]:
# Loading the tn.movie_budgets dataframe

df_movie_budgets = pd.read_csv('zippedData/tn.movie_budgets.csv.gz', compression='gzip')
df_movie_budgets.head()

Unnamed: 0,id,release_date,movie,production_budget,domestic_gross,worldwide_gross
0,1,"Dec 18, 2009",Avatar,"$425,000,000","$760,507,625","$2,776,345,279"
1,2,"May 20, 2011",Pirates of the Caribbean: On Stranger Tides,"$410,600,000","$241,063,875","$1,045,663,875"
2,3,"Jun 7, 2019",Dark Phoenix,"$350,000,000","$42,762,350","$149,762,350"
3,4,"May 1, 2015",Avengers: Age of Ultron,"$330,600,000","$459,005,868","$1,403,013,963"
4,5,"Dec 15, 2017",Star Wars Ep. VIII: The Last Jedi,"$317,000,000","$620,181,382","$1,316,721,747"
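
The monetary columns above are stored as strings such as `$425,000,000`, and `release_date` is text, so neither the time-range check nor any profit calculation for Q1 can run on them directly. Below is a minimal sketch of one possible cleaning step, applied to three rows copied from the output above. The names `sample_budgets`, `clean_budgets`, `cleaned_budgets`, and `worldwide_profit` are hypothetical, and defining profit as worldwide gross minus production budget is an assumption rather than the notebook's stated definition.

```python
import pandas as pd

# Three rows copied from the df_movie_budgets.head() output above.
sample_budgets = pd.DataFrame({
    "release_date": ["Dec 18, 2009", "May 20, 2011", "Jun 7, 2019"],
    "movie": [
        "Avatar",
        "Pirates of the Caribbean: On Stranger Tides",
        "Dark Phoenix",
    ],
    "production_budget": ["$425,000,000", "$410,600,000", "$350,000,000"],
    "domestic_gross": ["$760,507,625", "$241,063,875", "$42,762,350"],
    "worldwide_gross": ["$2,776,345,279", "$1,045,663,875", "$149,762,350"],
})

MONEY_COLS = ["production_budget", "domestic_gross", "worldwide_gross"]


def clean_budgets(df):
    """Return a copy with integer dollar amounts, parsed dates, and profit."""
    out = df.copy()
    for col in MONEY_COLS:
        # Strip "$" and "," and store as 64-bit integers (values exceed 2**31).
        out[col] = out[col].str.replace(r"[$,]", "", regex=True).astype("int64")
    out["release_date"] = pd.to_datetime(out["release_date"], format="%b %d, %Y")
    # Assumed definition: profit = worldwide gross - production budget.
    out["worldwide_profit"] = out["worldwide_gross"] - out["production_budget"]
    return out


cleaned_budgets = clean_budgets(sample_budgets)
print(cleaned_budgets[["movie", "release_date", "worldwide_profit"]])

# Year span covered by the sample, as a quick time-range check.
years = cleaned_budgets["release_date"].dt.year
print(years.min(), years.max())
```

On the full dataset, the same function could be called as `clean_budgets(df_movie_budgets)`, assuming every row follows the formats shown in the preview.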


In [175]:
# Loading the rt.reviews dataframe

df_reviews = pd.read_csv('zippedData/rt.reviews.tsv.gz', compression='gzip', sep='\t', encoding='ISO-8859-1')
df_reviews.head()

Unnamed: 0,id,review,rating,fresh,critic,top_critic,publisher,date
0,3,A distinctly gallows take on contemporary fina...,3/5,fresh,PJ Nabarro,0,Patrick Nabarro,"November 10, 2018"
1,3,It's an allegory in search of a meaning that n...,,rotten,Annalee Newitz,0,io9.com,"May 23, 2018"
2,3,... life lived in a bubble in financial dealin...,,fresh,Sean Axmaker,0,Stream on Demand,"January 4, 2018"
3,3,Continuing along a line introduced in last yea...,,fresh,Daniel Kasman,0,MUBI,"November 16, 2017"
4,3,... a perverse twist on neorealism...,,fresh,,0,Cinema Scope,"October 12, 2017"
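
In the preview above, `rating` is a fraction stored as text (for example `3/5`) and is often missing, while `fresh` is a text label. To relate critic ratings to box office performance (Q4), both would need a numeric form. The sketch below shows one possible conversion on values copied from the output above; `sample_reviews`, `rating_to_fraction`, `scored`, `rating_score`, `is_fresh`, and `fresh_share` are hypothetical names, and treating any value that is not an `x/y` fraction as missing is an assumption.

```python
import pandas as pd

# Values copied from the df_reviews.head() output above.
sample_reviews = pd.DataFrame({
    "id": [3, 3, 3],
    "rating": ["3/5", None, None],
    "fresh": ["fresh", "rotten", "fresh"],
})


def rating_to_fraction(value):
    """Convert a string like '3/5' to 0.6; return NaN for anything else."""
    if not isinstance(value, str) or "/" not in value:
        return float("nan")
    numerator, _, denominator = value.partition("/")
    try:
        numerator, denominator = float(numerator), float(denominator)
    except ValueError:
        return float("nan")
    # Guard against a zero denominator.
    return numerator / denominator if denominator else float("nan")


scored = sample_reviews.copy()
scored["rating_score"] = scored["rating"].map(rating_to_fraction)
scored["is_fresh"] = (scored["fresh"] == "fresh").astype(int)

# Share of "fresh" reviews per movie id.
fresh_share = scored.groupby("id")["is_fresh"].mean()
print(fresh_share)
```

Scaling every rating to the 0 to 1 range makes fractions with different denominators comparable, at the cost of discarding any rating that is not written as a fraction.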


Based on the above results, the model provides reasonably close estimates to actual values, especially in terms of mean and maximum profits. 
However, the presence of negative predictions like the minimum value suggests the model may need further refinement to improve accuracy, particularly for lower-grossing films.