# Investigating Fandango Movie Ratings

In October 2015, Walt Hickey published [an article](https://fivethirtyeight.com/features/fandango-movies-ratings/) on FiveThirtyEight showing that Fandango was artificially inflating movie ratings by rounding up and possibly by other methods. Fandango claimed this was a bug, but there were legitimate questions as to whether it was intentional and designed to drive up profits. The method originally used to prove the rating inflation is no longer available to us, because the underlying data has been removed from the HTML of Fandango's website. Can we use other methods to determine whether these or similar issues are still occurring?

## Understanding the Data

We have access to Hickey's original research as well as a dataset used in writing [an article](https://www.freecodecamp.org/news/whose-reviews-should-you-trust-imdb-rotten-tomatoes-metacritic-or-fandango-7d1010c6cf19) examining the most reliable rating site. We'll read these datasets into dataframes and explore what information they contain.

In [1]:
import pandas as pd

ratings = pd.read_csv('fandango_score_comparison.csv')
ratings.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 146 entries, 0 to 145
Data columns (total 22 columns):
FILM                          146 non-null object
RottenTomatoes                146 non-null int64
RottenTomatoes_User           146 non-null int64
Metacritic                    146 non-null int64
Metacritic_User               146 non-null float64
IMDB                          146 non-null float64
Fandango_Stars                146 non-null float64
Fandango_Ratingvalue          146 non-null float64
RT_norm                       146 non-null float64
RT_user_norm                  146 non-null float64
Metacritic_norm               146 non-null float64
Metacritic_user_nom           146 non-null float64
IMDB_norm                     146 non-null float64
RT_norm_round                 146 non-null float64
RT_user_norm_round            146 non-null float64
Metacritic_norm_round         146 non-null float64
Metacritic_user_norm_round    146 non-null float64
IMDB_norm_round               146 n

In [2]:
ratings.head()

| | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | ... | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.7 | 4.3 | ... | 3.9 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
| 1 | Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | ... | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
| 2 | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.0 | 4.5 | ... | 3.9 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
| 3 | Do You Believe? (2015) | 18 | 84 | 22 | 4.7 | 5.4 | 5.0 | 4.5 | 0.9 | 4.2 | ... | 2.7 | 1.0 | 4.0 | 1.0 | 2.5 | 2.5 | 31 | 3136 | 1793 | 0.5 |
| 4 | Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.7 | 1.4 | ... | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |


In [3]:
new_ratings = pd.read_csv('movie_ratings_16_17.csv')
new_ratings.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214 entries, 0 to 213
Data columns (total 15 columns):
movie           214 non-null object
year            214 non-null int64
metascore       214 non-null int64
imdb            214 non-null float64
tmeter          214 non-null int64
audience        214 non-null int64
fandango        214 non-null float64
n_metascore     214 non-null float64
n_imdb          214 non-null float64
n_tmeter        214 non-null float64
n_audience      214 non-null float64
nr_metascore    214 non-null float64
nr_imdb         214 non-null float64
nr_tmeter       214 non-null float64
nr_audience     214 non-null float64
dtypes: float64(10), int64(4), object(1)
memory usage: 25.2+ KB


In [4]:
new_ratings.head()

| | movie | year | metascore | imdb | tmeter | audience | fandango | n_metascore | n_imdb | n_tmeter | n_audience | nr_metascore | nr_imdb | nr_tmeter | nr_audience |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10 Cloverfield Lane | 2016 | 76 | 7.2 | 90 | 79 | 3.5 | 3.8 | 3.6 | 4.5 | 3.95 | 4.0 | 3.5 | 4.5 | 4.0 |
| 1 | 13 Hours | 2016 | 48 | 7.3 | 50 | 83 | 4.5 | 2.4 | 3.65 | 2.5 | 4.15 | 2.5 | 3.5 | 2.5 | 4.0 |
| 2 | A Cure for Wellness | 2016 | 47 | 6.6 | 40 | 47 | 3.0 | 2.35 | 3.3 | 2.0 | 2.35 | 2.5 | 3.5 | 2.0 | 2.5 |
| 3 | A Dog's Purpose | 2017 | 43 | 5.2 | 33 | 76 | 4.5 | 2.15 | 2.6 | 1.65 | 3.8 | 2.0 | 2.5 | 1.5 | 4.0 |
| 4 | A Hologram for the King | 2016 | 58 | 6.1 | 70 | 57 | 3.0 | 2.9 | 3.05 | 3.5 | 2.85 | 3.0 | 3.0 | 3.5 | 3.0 |


There is a lot of information in these datasets. Let's isolate the columns containing the information that is relevant to our research before proceeding.

In [5]:
ratings_2015 = ratings[['FILM', 'Fandango_Stars', 'Fandango_Ratingvalue', 'Fandango_votes', 'Fandango_Difference']].copy()
ratings_2015.head()

| | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference |
|---|---|---|---|---|---|
| 0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 |
| 1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 |
| 2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 |
| 3 | Do You Believe? (2015) | 5.0 | 4.5 | 1793 | 0.5 |
| 4 | Hot Tub Time Machine 2 (2015) | 3.5 | 3.0 | 1021 | 0.5 |


In [6]:
ratings_2016_17 = new_ratings[['movie', 'year', 'fandango']].copy()
ratings_2016_17.head()

| | movie | year | fandango |
|---|---|---|---|
| 0 | 10 Cloverfield Lane | 2016 | 3.5 |
| 1 | 13 Hours | 2016 | 4.5 |
| 2 | A Cure for Wellness | 2016 | 3.0 |
| 3 | A Dog's Purpose | 2017 | 4.5 |
| 4 | A Hologram for the King | 2016 | 3.0 |


## Examining the Data

Is this sampling sufficiently random for our purposes? Are these two samples representative of the population we are trying to describe?

In the case of Hickey's data, the sample is not completely random. The movie had to receive at least 30 fan ratings on Fandango, and the movie had to be released in 2015.

The other data is also not random. It contains "214 of the most popular movies (with a significant number of votes) released in 2016 and 2017". No specific criteria are given for what counts as popular, but to be included a movie had to be considered popular by the researcher and had to be released in 2016 or 2017.
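The stated sampling criteria can be checked against the data itself. The sketch below is only an illustration: it uses the five rows shown by each `head()` call above rather than the full files, the names `sample_2015` and `sample_2016_17` are hypothetical, and it assumes that every `FILM` title ends with a release year in parentheses, as the visible rows do.

```python
import pandas as pd

# Only the rows visible in the head() outputs above, not the full datasets.
sample_2015 = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)',
             'Ant-Man (2015)', 'Do You Believe? (2015)',
             'Hot Tub Time Machine 2 (2015)'],
    'Fandango_votes': [14846, 12640, 12055, 1793, 1021],
})
sample_2016_17 = pd.DataFrame({
    'movie': ['10 Cloverfield Lane', '13 Hours', 'A Cure for Wellness',
              "A Dog's Purpose", 'A Hologram for the King'],
    'year': [2016, 2016, 2016, 2017, 2016],
})

# Assumption: each title ends with "(YYYY)"; pull that year out as an integer.
sample_2015['year'] = (
    sample_2015['FILM']
    .str.extract(r'\((\d{4})\)\s*$', expand=False)
    .astype(int)
)

# Criterion 1: every movie has at least 30 fan ratings.
meets_vote_floor = (sample_2015['Fandango_votes'] >= 30).all()

# Criterion 2: count movies per release year in each sample.
years_2015 = sample_2015['year'].value_counts().sort_index()
years_2016_17 = sample_2016_17['year'].value_counts().sort_index()

print('All movies have >= 30 votes:', meets_vote_floor)
print(years_2015)
print(years_2016_17)
```

Running the same checks on the full `ratings_2015` and `ratings_2016_17` dataframes would show whether any movie falls outside the stated year or vote criteria.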

Hickey's data was sufficient for his work because he was able to compare the calculated rating with the published rating. We have no such ability. The actual, calculated rating was removed from the HTML of Fandango's website, so we no longer have access to it. This leaves us two possible ways to proceed with this analysis:

1. Compare Fandango's ratings before and after Hickey's article to see if there is a difference in Fandango's ratings and rating distribution, indicating that the ratings are no longer inflated. The problem with this method is that we only have access to some movies that were released in a single year. Perhaps one year had an abnormally high number of "good" or "bad" movies. This method seems highly unlikely to show us anything of value.

2. Compare Fandango's ratings to the ratings of other sites. While we know that Fandango's ratings are already higher than other sites', if we can show that the differences are smaller, perhaps we can get some indication as to whether or not the rating inflation is still occurring.
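The first option could be sketched roughly as follows. This is a minimal illustration, not the analysis itself: `star_distribution` is a hypothetical helper, and the two series hold only the five Fandango star ratings visible in each `head()` output above, so the printed numbers say nothing about the full samples.

```python
import pandas as pd

def star_distribution(stars: pd.Series) -> pd.Series:
    """Return the percentage of movies at each half-star step from 0.0 to 5.0."""
    half_stars = [step / 2 for step in range(11)]  # 0.0, 0.5, ..., 5.0
    shares = stars.value_counts(normalize=True) * 100
    # Steps that never occur in the sample get a share of 0.
    return shares.reindex(half_stars, fill_value=0.0)

# Illustrative values only: the rows shown by head() above.
stars_2015 = pd.Series([5.0, 5.0, 5.0, 5.0, 3.5])
stars_2016_17 = pd.Series([3.5, 4.5, 3.0, 4.5, 3.0])

comparison = pd.DataFrame({
    '2015': star_distribution(stars_2015),
    '2016-17': star_distribution(stars_2016_17),
})

# Show only the star values that occur in at least one sample.
print(comparison[(comparison > 0).any(axis=1)])

# Difference in mean displayed rating between the two samples.
mean_shift = stars_2016_17.mean() - stars_2015.mean()
print('Mean shift:', mean_shift)
```

With the real data, `ratings_2015['Fandango_Stars']` and `ratings_2016_17['fandango']` would take the place of the two hand-typed series. A shift in the distribution would still be open to the objections raised here, since it cannot separate a change in Fandango's behavior from a change in the movies themselves.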

The problem with both of these methods is that they attempt to use statistical analysis to make assumptions about what is going on behind the scenes. Hickey had access to two different sets of data: the calculated ratings and the displayed ratings. His research focused primarily on the discrepancy between them, and we can no longer check for such a discrepancy. If we show that Fandango's ratings are, on average, higher or lower after Hickey's article than before, this doesn't necessarily tell us anything of value. Perhaps viewers didn't like the movies released in 2016 as much as those released in 2015. Hickey's data includes movies with as few as 30 reviews, and the criteria used for the other dataset are unclear. This makes us potential victims of small sample size. A movie's rating on Fandango may not be very representative if it is based on a small number of reviews, and the relatively small number of movies means that a few movies can skew our results.
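To make the discrepancy Hickey measured concrete, the sketch below recomputes it for the five 2015 rows shown earlier. It assumes that `Fandango_Stars` is the star rating displayed on the site, that `Fandango_Ratingvalue` is the underlying value taken from the page HTML, and that `Fandango_Difference` is the gap between them. The column names come from the dataset, but this reading of them is an assumption, and `sample_2015` is a hypothetical name.

```python
import pandas as pd

# The five rows shown by ratings_2015.head() above.
sample_2015 = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)',
             'Ant-Man (2015)', 'Do You Believe? (2015)',
             'Hot Tub Time Machine 2 (2015)'],
    'Fandango_Stars': [5.0, 5.0, 5.0, 5.0, 3.5],
    'Fandango_Ratingvalue': [4.5, 4.5, 4.5, 4.5, 3.0],
    'Fandango_Difference': [0.5, 0.5, 0.5, 0.5, 0.5],
})

# Displayed stars minus the underlying value; rounded to avoid float noise.
recomputed = (
    sample_2015['Fandango_Stars'] - sample_2015['Fandango_Ratingvalue']
).round(1)

# Does the recomputed gap agree with the column shipped in the dataset?
matches = recomputed.eq(sample_2015['Fandango_Difference']).all()

# Share of movies whose displayed rating exceeds the underlying value.
share_inflated = (recomputed > 0).mean() * 100

print('Recomputed gap matches Fandango_Difference:', matches)
print('Percent of movies displayed above their rating value:', share_inflated)
```

No equivalent of `Fandango_Ratingvalue` exists in the 2016-17 data, which is why this check cannot be repeated for the later movies.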

Looking at the distribution of ratings for movies on Fandango and comparing it to the distributions on other sites is also not particularly helpful. Ratings can differ across sites because of differences in how ratings are collected, how movies are presented, and which groups of people are actually doing the rating.
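For reference, a comparison with other sites might look like the sketch below. It assumes that the `n_` columns hold each site's score rescaled to Fandango's 0 to 5 range, which is consistent with the rows shown above (a metascore of 76 appears as 3.8) but is not documented here. Only the five rows from `new_ratings.head()` are used, and `sample` and `mean_gap` are hypothetical names.

```python
import pandas as pd

# The five rows shown by new_ratings.head() above.
sample = pd.DataFrame({
    'movie': ['10 Cloverfield Lane', '13 Hours', 'A Cure for Wellness',
              "A Dog's Purpose", 'A Hologram for the King'],
    'fandango': [3.5, 4.5, 3.0, 4.5, 3.0],
    'n_metascore': [3.8, 2.4, 2.35, 2.15, 2.9],
    'n_imdb': [3.6, 3.65, 3.3, 2.6, 3.05],
    'n_tmeter': [4.5, 2.5, 2.0, 1.65, 3.5],
    'n_audience': [3.95, 4.15, 2.35, 3.8, 2.85],
})

site_columns = ['n_metascore', 'n_imdb', 'n_tmeter', 'n_audience']

# Average amount by which Fandango's rating exceeds each site's rescaled score.
mean_gap = pd.Series({
    column: (sample['fandango'] - sample[column]).mean()
    for column in site_columns
})

print(mean_gap.sort_values(ascending=False))
```

A positive gap here only shows that Fandango's numbers are higher than another site's. For the reasons given above, it cannot show that they are inflated.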

## Conclusions

Hickey's initial analysis dealt with the question of fraudulent ratings and misinformation. He used the information on Fandango's website to show that, whether due to a bug or for more nefarious reasons, Fandango's published ratings were consistently higher than the actual ratings the site had aggregated.

The other dataset is from a very flawed article. It uses the writer's personal experiences and opinions about movies to draw conclusions about the population at large, assuming that most people do things the same way. It classes as the best site for reviews the one whose ratings have the most normal distribution. The flaws in this article are a reflection of the flaws in any attempt to analyze the current question, namely:


>*When you use subjective criteria to make objective determinations, you run a great risk of drawing flawed conclusions.*


[This article](https://nofilmschool.com/rotten-tomatoes-scores) analyzes how scores given by movie critics have been steadily rising, even questioning whether the purchase of Rotten Tomatoes, first by Warner Bros. and then by Comcast, could have influenced the rise in the average rating on the site. It also describes the methodology used to produce a Rotten Tomatoes score and calls its objectivity into question.

[This article](https://www.bloomberg.com/news/newsletters/2022-08-28/critics-and-fans-have-never-disagreed-more-about-movies?leadSource=uverify%20wall) indicates that critics generally give a slightly lower score than fans to the top movies but that the divide was much larger than normal in 2022. It looks at different years and different types of movies in an attempt to explain the similarities and differences in the ratings between these two groups of movie reviewers.

In short, movie reviews can be incredibly unreliable as a metric because of the wide variance in sites, reviewers, and methodology. Ratings are also rumored to be influenced by production companies and others with a vested interest in a movie's success, because all that is required to submit a rating on many sites is an email address. This is before taking into account the incredible subjectivity involved in determining a movie's quality and giving a numerical score to how someone felt about watching it.

Hickey's research was valuable because it analyzed a measurable data point: the difference between Fandango's published ratings and the underlying values used to calculate them. While we may be able to determine that Fandango's ratings are lower in 2016 than in 2015, we still can't determine whether they are accurate. It is possible that Fandango is manipulating the ratings to a lesser degree. As we no longer have access to the data Hickey used to investigate these types of issues, further analysis is pointless.