# Questions to Answer
- **Does the MPAA rating of a movie (G/PG/PG-13/R) affect how much revenue the movie generates?**

    - Perform a statistical test to get a mathematically supported answer.
    - Report whether you found a significant difference between ratings.
        - If so, what was the p-value of your analysis?
        - And which rating earns the most revenue?
    - Prepare a visualization that supports your finding.
    

- **Think of 2 additional hypotheses to test that your stakeholder may want to know.**
- Some example hypotheses you could test:
    - Do movies that are over 2.5 hours long earn more revenue than movies that are 1.5 hours long (or less)?
    - Do movies released in 2020 earn less revenue than movies released in 2018?
        - How do the years compare for movie ratings?
    - Do some movie genres earn more revenue than others?
    - Are some genres higher rated than others?
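
The first question compares revenue across more than two rating groups, so one possible approach is a Kruskal-Wallis test, which the imports below already pull in from `scipy.stats`. The sketch that follows is only an illustration under stated assumptions: the toy values are invented, the helper name `compare_revenue_by_rating` is hypothetical, and using the group median to decide which rating "earns the most" is one choice among several.

```python
import pandas as pd
from scipy.stats import kruskal

# Hypothetical toy data: these revenue values are invented for illustration
# and are NOT taken from the TMDB results.
toy = pd.DataFrame({
    "certification": ["G", "G", "G", "PG", "PG", "PG",
                      "PG-13", "PG-13", "PG-13", "R", "R", "R"],
    "revenue": [10.0, 12.0, 11.0, 20.0, 22.0, 21.0,
                40.0, 42.0, 41.0, 15.0, 17.0, 16.0],
})

def compare_revenue_by_rating(data, group_col="certification",
                              value_col="revenue", alpha=0.05):
    """Run a Kruskal-Wallis test across rating groups (hypothetical helper)."""
    # One array of revenues per rating, with missing values dropped.
    groups = {
        name: grp[value_col].dropna().to_numpy()
        for name, grp in data.groupby(group_col)
    }
    stat, p_value = kruskal(*groups.values())
    # Assumption: "earns the most" is judged by the median revenue per rating.
    medians = data.groupby(group_col)[value_col].median()
    return {
        "statistic": float(stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
        "top_rating": medians.idxmax(),
    }

result = compare_revenue_by_rating(toy)
print(result)
```

A non-parametric test is used here because movie revenue is typically heavily skewed; if the groups turn out to satisfy the normality and equal-variance assumptions, a one-way ANOVA (`scipy.stats.f_oneway`) would be the parametric alternative.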

In [2]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stats
from scipy.stats import kruskal
import json
import pymysql
import glob
from sqlalchemy import create_engine
from urllib.parse import quote_plus
from sqlalchemy_utils import create_database, database_exists
pd.set_option('display.max_columns', None)

# Load Data

## Get movie data from 2010-2019

In [10]:
file = 'C:\\Users\\Chris Palisoc\\Documents\\Coding Dojo\\Coding Dojo Project 3\\Project3\\Data\\tmdb_results_combined.csv.gz'
df = pd.read_csv(file)
df.info()
df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4507 entries, 0 to 4506
Data columns (total 26 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   imdb_id                4507 non-null   object 
 1   adult                  4505 non-null   float64
 2   backdrop_path          2041 non-null   object 
 3   belongs_to_collection  253 non-null    object 
 4   budget                 4505 non-null   float64
 5   genres                 4505 non-null   object 
 6   homepage               230 non-null    object 
 7   id                     4505 non-null   float64
 8   original_language      4505 non-null   object 
 9   original_title         4505 non-null   object 
 10  overview               4157 non-null   object 
 11  popularity             4505 non-null   float64
 12  poster_path            3870 non-null   object 
 13  production_companies   4505 non-null   object 
 14  production_countries   4505 non-null   object 
 15  rele

Unnamed: 0,imdb_id,adult,backdrop_path,belongs_to_collection,budget,genres,homepage,id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count,certification
0,0,,,,,,,,,,,,,,,,,,,,,,,,,
1,tt0113026,0.0,/vMFs7nw6P0bIV1jDsQpxAieAVnH.jpg,,10000000.0,"[{'id': 35, 'name': 'Comedy'}, {'id': 10402, '...",,62127.0,en,The Fantasticks,Two rural teens sing and dance their way throu...,2.623,/hfO64mXz3DgUxkBVU7no2UWRP7x.jpg,"[{'id': 51207, 'logo_path': None, 'name': 'Sul...","[{'iso_3166_1': 'US', 'name': 'United States o...",2000-09-22,0.0,86.0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Try to remember the first time magic happened,The Fantasticks,0.0,5.5,22.0,
2,tt0113092,0.0,,,0.0,"[{'id': 878, 'name': 'Science Fiction'}]",,110977.0,en,For the Cause,Earth is in a state of constant war and two co...,2.806,/h9bWO13nWRGZJo4XVPiElXyrRMU.jpg,"[{'id': 7405, 'logo_path': '/rfnws0uY8rsNAsrLb...","[{'iso_3166_1': 'US', 'name': 'United States o...",2000-11-15,0.0,100.0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,The ultimate showdown on a forbidden planet.,For the Cause,0.0,5.1,8.0,
3,tt0116391,0.0,,,0.0,"[{'id': 18, 'name': 'Drama'}, {'id': 28, 'name...",,442869.0,hi,Gang,"After falling prey to underworld, four friends...",0.824,/yB5wRu4uyXXwZA3PEj8cITu0xt3.jpg,[],"[{'iso_3166_1': 'IN', 'name': 'India'}]",2000-04-14,0.0,152.0,"[{'english_name': 'Hindi', 'iso_639_1': 'hi', ...",Released,,Gang,0.0,4.0,1.0,
4,tt0116748,0.0,/wr0hTHwkYIRC82MwNbhOvqrw27N.jpg,,0.0,"[{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n...",,579396.0,hi,Karobaar,Wealthy Rajiv Sinha and middle-classed Amar Sa...,0.921,/wFSOXXrJklY2ngjIJCus9c2DfJW.jpg,[],[],2000-09-15,0.0,180.0,"[{'english_name': 'Hindi', 'iso_639_1': 'hi', ...",Released,The Business of Love,Karobaar,0.0,5.5,2.0,
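
The preview above shows a placeholder row whose `imdb_id` is `0` with every other field empty, several movies with a `revenue` of `0.0`, and missing `certification` values. Before testing the rating hypothesis, the frame could be narrowed along these lines. This is a minimal sketch, not the notebook's actual cleaning step: the helper name `prepare_for_rating_test` and the miniature `sample` frame are hypothetical, and treating a revenue of 0 as "not reported" is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the loaded frame; the ids and values are made up
# and only mimic the column names seen in the preview.
sample = pd.DataFrame({
    "imdb_id": ["0", "tt0000001", "tt0000002", "tt0000003"],
    "revenue": [np.nan, 0.0, 5_000_000.0, 12_000_000.0],
    "certification": [np.nan, np.nan, "PG-13", "NR"],
})

def prepare_for_rating_test(data, ratings=("G", "PG", "PG-13", "R")):
    """Keep only rows usable for the rating-vs-revenue test (hypothetical helper)."""
    # Drop the placeholder row whose imdb_id is "0".
    cleaned = data[data["imdb_id"].astype(str) != "0"]
    # Assumption: a revenue of 0 means the figure was not reported.
    cleaned = cleaned[cleaned["revenue"] > 0]
    # Keep only the four ratings named in the question.
    cleaned = cleaned[cleaned["certification"].isin(ratings)]
    return cleaned.reset_index(drop=True)

prepared = prepare_for_rating_test(sample)
print(prepared)
```

Applied to the notebook's own frame, the call would be `prepare_for_rating_test(df)`; how many rows survive depends on how complete the `revenue` and `certification` columns are.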
