# Investigating Fandango Movie Ratings

In October 2015, data journalist Walt Hickey analyzed movie ratings data and found strong evidence that Fandango's rating system was biased and dishonest (Fandango is an online movie ratings aggregator). He published his analysis in this [article](https://fivethirtyeight.com/features/fandango-movies-ratings/), a great piece of data journalism that's worth reading.

Fandango displays a 5-star rating system on its website, where the minimum rating is 0 stars and the maximum is 5 stars.
Hickey found a significant discrepancy between the number of stars displayed to users and the actual rating, which he located in the HTML of the page. Specifically, he found that:

1. The actual rating was almost always rounded up to the nearest half-star. For instance, a 4.1 movie would be rounded off to 4.5 stars, not to 4 stars, as you may expect.
2. In the case of 8% of the ratings analyzed, the rounding up was done to the nearest whole star. For instance, a 4.5 rating would be "rounded off" to 5 stars.
3. For one movie rating, the rounding off was completely bizarre: from a rating of 4 in the HTML of the page to a displayed rating of 5 stars.
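
To make the first finding concrete, here is a minimal sketch contrasting conventional rounding to the nearest half-star with always rounding up. The helper names `round_nearest_half` and `round_up_half` are hypothetical illustrations; they are not taken from Fandango's code or from Hickey's analysis.

```python
import math

def round_nearest_half(rating):
    """Conventional rounding to the nearest half-star (hypothetical helper)."""
    return math.floor(rating * 2 + 0.5) / 2

def round_up_half(rating):
    """Always round up to the next half-star (hypothetical helper)."""
    return math.ceil(rating * 2) / 2

# Compare the two schemes on a few example ratings.
for rating in (4.1, 4.3, 4.5):
    print(rating, round_nearest_half(rating), round_up_half(rating))
```

For a 4.1 rating, conventional rounding gives 4.0 stars while rounding up gives 4.5 stars, which matches the behavior described in the first point.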

Fandango's officials replied that the biased rounding off was caused by a bug in their system rather than being intentional, and they promised to fix the bug as soon as possible. Presumably, this has already happened, although we can't tell for sure since the actual rating value doesn't seem to be displayed anymore in the pages' HTML.

In this project, we'll analyze more recent movie ratings data to determine whether there has been any change in Fandango's rating system after Hickey's analysis.

Here is the data dictionary for FiveThirtyEight's data set:

![fandango_scores](img/fandango_score_ddict.png)

And here is the corresponding data dictionary for the follow-up data set of film ratings from 2016-2017:

![ratings_16_17](img/movie_ratings_16_17_ddict.png)

## Data exploration

In [8]:
# Basic import statements
import pandas as pd

In [9]:
original_scores = pd.read_csv('fandango_score_comparison.csv')
updated_scores = pd.read_csv('movie_ratings_16_17.csv')

original_scores.head(3)

| | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | ... | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.7 | 4.3 | ... | 3.9 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
| 1 | Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | ... | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
| 2 | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.0 | 4.5 | ... | 3.9 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |


In [6]:
updated_scores.head(3)

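| | movie | year | metascore | imdb | tmeter | audience | fandango | n_metascore | n_imdb | n_tmeter | n_audience | nr_metascore | nr_imdb | nr_tmeter | nr_audience |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10 Cloverfield Lane | 2016 | 76 | 7.2 | 90 | 79 | 3.5 | 3.8 | 3.6 | 4.5 | 3.95 | 4.0 | 3.5 | 4.5 | 4.0 |
| 1 | 13 Hours | 2016 | 48 | 7.3 | 50 | 83 | 4.5 | 2.4 | 3.65 | 2.5 | 4.15 | 2.5 | 3.5 | 2.5 | 4.0 |
| 2 | A Cure for Wellness | 2016 | 47 | 6.6 | 40 | 47 | 3.0 | 2.35 | 3.3 | 2.0 | 2.35 | 2.5 | 3.5 | 2.0 | 2.5 |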


In [14]:
orig_fandango_cols = ['FILM', 'Fandango_Stars', 'Fandango_Ratingvalue', 'Fandango_votes', 'Fandango_Difference']
fandango_original = original_scores[orig_fandango_cols].copy()

updated_fandango_cols = ['movie', 'year', 'fandango']
fandango_updated = updated_scores[updated_fandango_cols].copy()

print(fandango_original.info())
print(fandango_updated.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 146 entries, 0 to 145
Data columns (total 5 columns):
FILM                    146 non-null object
Fandango_Stars          146 non-null float64
Fandango_Ratingvalue    146 non-null float64
Fandango_votes          146 non-null int64
Fandango_Difference     146 non-null float64
dtypes: float64(3), int64(1), object(1)
memory usage: 5.8+ KB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214 entries, 0 to 213
Data columns (total 3 columns):
movie       214 non-null object
year        214 non-null int64
fandango    214 non-null float64
dtypes: float64(1), int64(1), object(1)
memory usage: 5.1+ KB
None


The population of interest for our analysis is the set of all films rated on Fandango over the years. Our objective is to assess whether the population parameters (namely the displayed score) have changed since the FiveThirtyEight article was published.

The sample of films included in the FiveThirtyEight data set was obtained on the sampling date of Aug. 24, 2015. To be included, a film had to have:

1. A Rotten Tomatoes rating and a Rotten Tomatoes User rating
2. A Metacritic score and a Metacritic User score
3. An IMDb score
4. At least 30 fan ratings on Fandango's website at the time of sampling

The sampling is not random, since it explicitly excludes films that did not have enough Fandango fan ratings at the time the sample was taken. While this indicates that the sample might not be *truly* representative of the total population of Fandango-rated films, I would argue that this shifts the population of interest from all films listed towards films with a degree of certainty about their rating (as measured by the number of fan ratings).
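
The vote threshold is simple to check against the data. The sketch below is illustrative only: so that it runs on its own, it uses a three-row stand-in built from the `head(3)` output shown earlier rather than the full `fandango_original` frame, and the names `sample`, `MIN_VOTES`, and `below_threshold` are assumptions introduced here.

```python
import pandas as pd

# Stand-in for `fandango_original`: the three rows displayed by head(3) above.
sample = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)', 'Ant-Man (2015)'],
    'Fandango_votes': [14846, 12640, 12055],
})

MIN_VOTES = 30  # threshold stated in the sampling criteria

# Any rows returned here would violate the stated sampling condition.
below_threshold = sample[sample['Fandango_votes'] < MIN_VOTES]
print(len(below_threshold))
```

Running the same filter on the full `fandango_original` frame should return no rows if the data set follows its stated criteria.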

On the other hand, the `README` file for the updated Fandango data set specifies that films were sampled on Mar. 22, 2017 according to:
1. A release date between 2016 and 2017
2. A "significant number" of votes/reviews

This sampling is also not randomized, and it is vague about its selection criteria: it is unclear what qualifies as a "significant number" of votes or reviews. Nevertheless, it still matches the intent of the FiveThirtyEight sample, which is to maintain some certainty about the rating of the films included.

With these factors in mind, we conclude that these data sets are unlikely to represent our original population of every film scored on Fandango. To address this, we will slightly adjust the target population of interest and our stated objective. We will specifically look at popular films released in 2015 and 2016, and try to assess if there is any significant difference between the two groups of ratings. Although not as meaningful as the first objective, this adjusted goal is a reasonable proxy for the original.
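
One possible way to isolate the two groups is sketched below. It assumes that every title in the `FILM` column ends with the release year in parentheses, as the displayed rows do, and it uses three-row stand-ins copied from the `head(3)` outputs so that it runs on its own. The names `original`, `updated`, `films_2015`, `films_2016`, and the derived `Year` column are hypothetical.

```python
import pandas as pd

# Stand-ins built from the head(3) outputs shown earlier.
original = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)', 'Ant-Man (2015)'],
    'Fandango_Stars': [5.0, 5.0, 5.0],
})
updated = pd.DataFrame({
    'movie': ['10 Cloverfield Lane', '13 Hours', 'A Cure for Wellness'],
    'year': [2016, 2016, 2016],
    'fandango': [3.5, 4.5, 3.0],
})

# Assumption: each FILM title ends with "(YYYY)"; pull that year into its own column.
original['Year'] = (
    original['FILM'].str.extract(r'\((\d{4})\)\s*$', expand=False).astype(int)
)

# Keep only the release years of interest in each data set.
films_2015 = original[original['Year'] == 2015].copy()
films_2016 = updated[updated['year'] == 2016].copy()

print(len(films_2015), len(films_2016))
```

Applied to the full data sets, the same two filters would give the 2015 and 2016 groups whose Fandango ratings can then be compared.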