<a href="https://colab.research.google.com/github/drusho/data_analysis/blob/main/fandango_movie_reviews.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# __Fandango Movie Reviews__

The purpose of this notebook is to review research originally performed in 2015 by FiveThirtyEight into potential bias in Fandango's movie ratings.

The data used in this notebook was taken from [FiveThirtyEight's GitHub](https://github.com/fivethirtyeight/data/tree/master/fandango#fandango).

<br>

#### __Summary of Original Study by FiveThirtyEight__
The original study by FiveThirtyEight compiled data for 146 films from 2015 that had substantive reviews from both critics and consumers. When a movie is released, review websites such as Metacritic, Fandango, Rotten Tomatoes, and IMDB collect ratings for it from critics, from the users in their communities, or from both. Each site then computes an average rating from those reviews and displays it on its website.

The purpose of the investigation was to look for potential bias in Fandango's movie ratings. Fandango is unique among movie review websites because it also sells movie theater tickets, so it could have a financial incentive to inflate movie ratings in order to increase ticket sales. FiveThirtyEight [published](http://fivethirtyeight.com/features/fandango-movies-ratings/) its final analysis after discovering that movies reviewed poorly on other sites were still rated highly by Fandango.


#### Import CSV from FiveThirtyEight's GitHub

In [1]:
# Fetch a single <1MB file using the raw GitHub URL.
!curl --remote-name \
     -H 'Accept: application/vnd.github.v3.raw' \
     --location https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_score_comparison.csv

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15144  100 15144    0     0  65843      0 --:--:-- --:--:-- --:--:-- 65843


In [10]:
import pandas as pd

reviews = pd.read_csv("fandango_score_comparison.csv")
reviews.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 146 entries, 0 to 145
Data columns (total 22 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   FILM                        146 non-null    object 
 1   RottenTomatoes              146 non-null    int64  
 2   RottenTomatoes_User         146 non-null    int64  
 3   Metacritic                  146 non-null    int64  
 4   Metacritic_User             146 non-null    float64
 5   IMDB                        146 non-null    float64
 6   Fandango_Stars              146 non-null    float64
 7   Fandango_Ratingvalue        146 non-null    float64
 8   RT_norm                     146 non-null    float64
 9   RT_user_norm                146 non-null    float64
 10  Metacritic_norm             146 non-null    float64
 11  Metacritic_user_nom         146 non-null    float64
 12  IMDB_norm                   146 non-null    float64
 13  RT_norm_round               146 non

In [12]:
reviews.head(3)

Unnamed: 0,FILM,RottenTomatoes,RottenTomatoes_User,Metacritic,Metacritic_User,IMDB,Fandango_Stars,Fandango_Ratingvalue,RT_norm,RT_user_norm,Metacritic_norm,Metacritic_user_nom,IMDB_norm,RT_norm_round,RT_user_norm_round,Metacritic_norm_round,Metacritic_user_norm_round,IMDB_norm_round,Metacritic_user_vote_count,IMDB_user_vote_count,Fandango_votes,Fandango_Difference
0,Avengers: Age of Ultron (2015),74,86,66,7.1,7.8,5.0,4.5,3.7,4.3,3.3,3.55,3.9,3.5,4.5,3.5,3.5,4.0,1330,271107,14846,0.5
1,Cinderella (2015),85,80,67,7.5,7.1,5.0,4.5,4.25,4.0,3.35,3.75,3.55,4.5,4.0,3.5,4.0,3.5,249,65709,12640,0.5
2,Ant-Man (2015),80,90,64,8.1,7.8,5.0,4.5,4.0,4.5,3.2,4.05,3.9,4.0,4.5,3.0,4.0,4.0,627,103660,12055,0.5


#### __Note about Ratings__

Not all the sites use the same metric to determine their movie ratings. For example, Metacritic and Rotten Tomatoes aggregate scores from both users and film critics, while IMDB and Fandango aggregate only from their users. 

Since not all the sites have scores from critics, the rest of the analysis focuses on user-generated ratings (_normalized to a 0 to 5 point scale_) from these columns:


|Column Name|Description|
|:----------|:----------|
|FILM|film name|
|RT_user_norm|average user rating from Rotten Tomatoes, normalized to a 0 to 5 point scale|
|Metacritic_user_nom|average user rating from Metacritic, normalized to a 0 to 5 point scale|
|IMDB_norm|average user rating from IMDB, normalized to a 0 to 5 point scale|
|Fandango_Ratingvalue|average user rating from Fandango, on a 0 to 5 point scale|
|Fandango_Stars|the star rating displayed on the Fandango website (shown in half-star steps, 0 to 5 point scale)|
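
The normalized columns appear to be simple rescalings of the raw scores. In the first three rows of the dataset shown earlier, dividing the Rotten Tomatoes user score by 20 and the Metacritic and IMDB user scores by 2 reproduces the normalized values. The sketch below checks this on those three rows only. The divisors are inferred from the sample values rather than taken from FiveThirtyEight's documentation, and the names `sample`, `rescaled`, and `max_gap` are illustrative.

```python
import pandas as pd

# First three rows of fandango_score_comparison.csv, copied from reviews.head(3).
sample = pd.DataFrame({
    "FILM": ["Avengers: Age of Ultron (2015)", "Cinderella (2015)", "Ant-Man (2015)"],
    "RottenTomatoes_User": [86, 80, 90],
    "Metacritic_User": [7.1, 7.5, 8.1],
    "IMDB": [7.8, 7.1, 7.8],
    "RT_user_norm": [4.3, 4.0, 4.5],
    "Metacritic_user_nom": [3.55, 3.75, 4.05],
    "IMDB_norm": [3.9, 3.55, 3.9],
})

# Assumed divisors: a 0-100 score is divided by 20, a 0-10 score by 2.
rescaled = pd.DataFrame({
    "RT_user_norm": sample["RottenTomatoes_User"] / 20,
    "Metacritic_user_nom": sample["Metacritic_User"] / 2,
    "IMDB_norm": sample["IMDB"] / 2,
})

# Largest absolute gap between the rescaled and the published normalized values.
max_gap = (rescaled - sample[rescaled.columns]).abs().max().max()
print(f"largest difference on the sample rows: {max_gap:.6f}")
```

Running the same comparison against the full `reviews` DataFrame would show whether the assumed divisors hold for every film.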


In [11]:
# Keep only the film name and the user-generated rating columns.
norm_reviews = reviews[['FILM', 'RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']]
norm_reviews

Unnamed: 0,FILM,RT_user_norm,Metacritic_user_nom,IMDB_norm,Fandango_Ratingvalue,Fandango_Stars
0,Avengers: Age of Ultron (2015),4.30,3.55,3.90,4.5,5.0
1,Cinderella (2015),4.00,3.75,3.55,4.5,5.0
2,Ant-Man (2015),4.50,4.05,3.90,4.5,5.0
3,Do You Believe? (2015),4.20,2.35,2.70,4.5,5.0
4,Hot Tub Time Machine 2 (2015),1.40,1.70,2.55,3.0,3.5
...,...,...,...,...,...,...
141,Mr. Holmes (2015),3.90,3.95,3.70,4.0,4.0
142,'71 (2015),4.10,3.75,3.60,3.5,3.5
143,"Two Days, One Night (2014)",3.90,4.40,3.70,3.5,3.5
144,Gett: The Trial of Viviane Amsalem (2015),4.05,3.65,3.90,3.5,3.5
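
`Fandango_Stars` is the rating displayed on the site and `Fandango_Ratingvalue` is the underlying average, so their difference shows how far the display departs from the average. Here is a minimal sketch using only the nine rows visible in the output above. The names `shown` and `gap` are illustrative, and nine rows are not enough to draw conclusions about the full dataset.

```python
import pandas as pd

# The nine rows visible in the truncated norm_reviews output, copied by hand.
shown = pd.DataFrame({
    "FILM": [
        "Avengers: Age of Ultron (2015)", "Cinderella (2015)", "Ant-Man (2015)",
        "Do You Believe? (2015)", "Hot Tub Time Machine 2 (2015)",
        "Mr. Holmes (2015)", "'71 (2015)", "Two Days, One Night (2014)",
        "Gett: The Trial of Viviane Amsalem (2015)",
    ],
    "Fandango_Ratingvalue": [4.5, 4.5, 4.5, 4.5, 3.0, 4.0, 3.5, 3.5, 3.5],
    "Fandango_Stars": [5.0, 5.0, 5.0, 5.0, 3.5, 4.0, 3.5, 3.5, 3.5],
})

# Positive gap: the displayed stars are higher than the underlying average.
gap = shown["Fandango_Stars"] - shown["Fandango_Ratingvalue"]
print(shown.assign(gap=gap)[["FILM", "gap"]])
print("rows where stars are below the rating value:", int((gap < 0).sum()))
```

The same subtraction on the full `norm_reviews` DataFrame would cover all films in the dataset.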


In [13]:
norm_reviews.describe()

Unnamed: 0,RT_user_norm,Metacritic_user_nom,IMDB_norm,Fandango_Ratingvalue,Fandango_Stars
count,146.0,146.0,146.0,146.0,146.0
mean,3.193836,3.259589,3.368493,3.845205,4.089041
std,1.001222,0.755356,0.479368,0.502831,0.540386
min,1.0,1.2,2.0,2.7,3.0
25%,2.5,2.85,3.15,3.5,3.5
50%,3.325,3.425,3.45,3.9,4.0
75%,4.05,3.75,3.7,4.2,4.5
max,4.7,4.8,4.3,4.8,5.0


### __Norm Reviews__

- Fandango and Metacritic share the highest maximum rating (4.8 each).
- Fandango has the highest average rating (3.8), about 0.5 higher than the next website, IMDB (3.4).
- Rotten Tomatoes has both the lowest minimum rating (1.0) and the lowest average (3.2).
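
The comparison of averages above can be reproduced from the `describe()` output. This small sketch uses means copied by hand from that table; in the notebook itself, `norm_reviews.mean(numeric_only=True)` would supply them directly. `Fandango_Stars` is left out because it is the displayed rating rather than a user average, and the names `means`, `ranked`, and `lead` are illustrative.

```python
import pandas as pd

# Mean user ratings, copied from the norm_reviews.describe() output.
means = pd.Series({
    "RT_user_norm": 3.193836,
    "Metacritic_user_nom": 3.259589,
    "IMDB_norm": 3.368493,
    "Fandango_Ratingvalue": 3.845205,
})

# Rank the sites from highest to lowest average rating.
ranked = means.sort_values(ascending=False)

# Gap between the top-ranked site and the runner-up.
lead = ranked.iloc[0] - ranked.iloc[1]

print(ranked.round(2))
print(f"{ranked.index[0]} leads {ranked.index[1]} by {lead:.2f}")
```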