# Fandango Movie Rating Analysis

In October 2015, Walt Hickey of FiveThirtyEight [published an article](https://fivethirtyeight.com/features/fandango-movies-ratings/) in which he analyzed data from Fandango's website to show that the movie rating website was artificially inflating its star ratings. His analysis compared the star rating displayed on the website to the actual rating that he was able to find in the website's HTML. He found the following:

 - For 32% of movies, Fandango increased the rating by 0.3 or 0.4 stars by rounding up when you would normally round down.
 - For 8% of movies, Fandango added an entire half-star to the rating, showing a 4.5 star rating as 5 stars.
 - In one case, a movie's rating was inflated by an entire star, increasing its actual rating of 4 to 5 stars.
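
The first finding above is easier to see with a little arithmetic. The following is a minimal sketch, assuming that stars are displayed in half-star steps; the two helper functions are hypothetical names introduced for illustration and are not taken from Hickey's article or from Fandango.

```python
import math

def round_to_nearest_half(rating):
    """Conventional rounding to the nearest half-star (ties round up)."""
    return math.floor(rating * 2 + 0.5) / 2

def round_up_to_half(rating):
    """Always round up to the next half-star, as the inflated display appears to do."""
    return math.ceil(rating * 2) / 2

# Ratings of 4.1 and 4.2 would normally be shown as 4.0 stars,
# but rounding up shows them as 4.5 stars instead.
for rating in (4.1, 4.2):
    expected = round_to_nearest_half(rating)
    shown = round_up_to_half(rating)
    print(rating, expected, shown, round(shown - rating, 1))
```

Under this assumption, the gap between the displayed and actual rating is 0.4 or 0.3 stars, which matches the size of the inflation described in the first bullet. It does not explain the cases where a 4.5 was shown as 5 stars.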

In response to questions from Hickey, Fandango replied that the overly generous ratings were the result of a "software glitch" that would be fixed "as soon as possible". Three years after this article was published, this fix has presumably already been implemented. However, we can't tell for sure since the actual rating value is no longer included in the HTML for Fandango's pages.

The goal of this project is to analyze more recent movie ratings data to determine whether there has been any change in Fandango's rating system after Hickey's analysis.

## Data sources

Walt Hickey released the [data that he used](https://github.com/fivethirtyeight/data/tree/master/fandango) in his analysis on GitHub. This data source will be used to examine the Fandango rating system prior to his article being published.

Alex Olteanu [wrote an article](https://medium.freecodecamp.org/whose-reviews-should-you-trust-imdb-rotten-tomatoes-metacritic-or-fandango-7d1010c6cf19) in April 2017 that analyzed the rating systems of multiple movie rating websites. The [data for this article](https://github.com/mircealex/Movie_ratings_2016_17) includes movie ratings for 2016 and 2017 and will be used to examine the Fandango rating system after Hickey's article was published.

In [1]:
# Import pandas
import pandas as pd

# Read the two data sources into DataFrames
ratings_15 = pd.read_csv('fandango_score_comparison.csv')
ratings_16_17 = pd.read_csv('movie_ratings_16_17.csv')

In [2]:
# Explore the two data sets
ratings_15.shape

(146, 22)

In [3]:
ratings_15.head()

Unnamed: 0,FILM,RottenTomatoes,RottenTomatoes_User,Metacritic,Metacritic_User,IMDB,Fandango_Stars,Fandango_Ratingvalue,RT_norm,RT_user_norm,...,IMDB_norm,RT_norm_round,RT_user_norm_round,Metacritic_norm_round,Metacritic_user_norm_round,IMDB_norm_round,Metacritic_user_vote_count,IMDB_user_vote_count,Fandango_votes,Fandango_Difference
0,Avengers: Age of Ultron (2015),74,86,66,7.1,7.8,5.0,4.5,3.7,4.3,...,3.9,3.5,4.5,3.5,3.5,4.0,1330,271107,14846,0.5
1,Cinderella (2015),85,80,67,7.5,7.1,5.0,4.5,4.25,4.0,...,3.55,4.5,4.0,3.5,4.0,3.5,249,65709,12640,0.5
2,Ant-Man (2015),80,90,64,8.1,7.8,5.0,4.5,4.0,4.5,...,3.9,4.0,4.5,3.0,4.0,4.0,627,103660,12055,0.5
3,Do You Believe? (2015),18,84,22,4.7,5.4,5.0,4.5,0.9,4.2,...,2.7,1.0,4.0,1.0,2.5,2.5,31,3136,1793,0.5
4,Hot Tub Time Machine 2 (2015),14,28,29,3.4,5.1,3.5,3.0,0.7,1.4,...,2.55,0.5,1.5,1.5,1.5,2.5,88,19560,1021,0.5


In [5]:
ratings_16_17.shape

(214, 15)

In [6]:
ratings_16_17.head()

Unnamed: 0,movie,year,metascore,imdb,tmeter,audience,fandango,n_metascore,n_imdb,n_tmeter,n_audience,nr_metascore,nr_imdb,nr_tmeter,nr_audience
0,10 Cloverfield Lane,2016,76,7.2,90,79,3.5,3.8,3.6,4.5,3.95,4.0,3.5,4.5,4.0
1,13 Hours,2016,48,7.3,50,83,4.5,2.4,3.65,2.5,4.15,2.5,3.5,2.5,4.0
2,A Cure for Wellness,2016,47,6.6,40,47,3.0,2.35,3.3,2.0,2.35,2.5,3.5,2.0,2.5
3,A Dog's Purpose,2017,43,5.2,33,76,4.5,2.15,2.6,1.65,3.8,2.0,2.5,1.5,4.0
4,A Hologram for the King,2016,58,6.1,70,57,3.0,2.9,3.05,3.5,2.85,3.0,3.0,3.5,3.0


Since these DataFrames are large and contain many columns that aren't needed for this analysis, we'll select only the relevant columns. To avoid [potential issues with the `SettingWithCopyWarning`](https://www.dataquest.io/blog/settingwithcopywarning/) in Pandas, we'll make sure that these new, smaller DataFrames are copies.

In [7]:
# Pull out only the columns that are relevant for 2015
ratings_15 = ratings_15[['FILM', 'Fandango_Stars', 
                         'Fandango_Ratingvalue', 
                         'Fandango_votes', 
                         'Fandango_Difference']].copy()
ratings_15.head()

Unnamed: 0,FILM,Fandango_Stars,Fandango_Ratingvalue,Fandango_votes,Fandango_Difference
0,Avengers: Age of Ultron (2015),5.0,4.5,14846,0.5
1,Cinderella (2015),5.0,4.5,12640,0.5
2,Ant-Man (2015),5.0,4.5,12055,0.5
3,Do You Believe? (2015),5.0,4.5,1793,0.5
4,Hot Tub Time Machine 2 (2015),3.5,3.0,1021,0.5


In [8]:
# Pull out only the columns that are relevant for 2016-17
ratings_16_17 = ratings_16_17[['movie', 'year', 
                               'fandango']].copy()
ratings_16_17.head()

Unnamed: 0,movie,year,fandango
0,10 Cloverfield Lane,2016,3.5
1,13 Hours,2016,4.5
2,A Cure for Wellness,2016,3.0
3,A Dog's Purpose,2017,4.5
4,A Hologram for the King,2016,3.0


To determine whether there has been any change in Fandango's rating system since Hickey's analysis, the population of interest is every movie that has been rated by Fandango prior to Hickey's analysis and in the time since. 

According to the article, the data from Hickey includes all movies that had at least 30 fan reviews on Fandango and had tickets on sale in 2015. However, there appears to be a discrepancy in the data: the article states that 209 films fit these sampling parameters, whereas the DataFrame above contains only 146 rows. The data from Olteanu covers the 214 most reviewed movies released in 2016 and 2017. The actual parameters used to determine which movies were "most reviewed" are not given.

This sampling clearly isn't random. Part of the analysis for both of the articles was to compare movie ratings from multiple websites. As a result, this sampling method was used to select movies that have stable reviews from a sufficiently large number of users, omitting movies that have only a few ratings that are therefore likely to be highly skewed. Since the goal of the current analysis is to compare published Fandango ratings to the internal Fandango ratings, there is no need to limit the sample based on the number of user reviews.

Furthermore, the temporal bounds on the sampling were most likely selected for ease of use and relevancy. Olteanu specifically addressed the temporal bounds on his sampling in [his article](https://medium.freecodecamp.org/whose-reviews-should-you-trust-imdb-rotten-tomatoes-metacritic-or-fandango-7d1010c6cf19): "I haven’t collected ratings for movies released before 2016, simply because a slight change has occurred in Fandango’s rating system soon after Walt Hickey’s analysis..."

Because these samples do not fully encompass the desired population for the goal stated above, we can either collect new data or alter the goal of this analysis; the latter is a much faster option and the one we'll select here. Based on the available data, **the new goal of this analysis is to examine any differences between Fandango ratings for popular movies released in 2015 (prior to Hickey's article) and those for popular movies released in 2016 (after Hickey's article)**. We'll omit any movies that were released in 2017 due to their unreliability based on the following note in the [README](https://github.com/mircealex/Movie_ratings_2016_17) for Olteanu's data: "As of March 22, 2017, the ratings were up to date. Significant changes should be expected mostly for movies released in 2017."

## Cleaning the data

Based on our new goal, we want to compare two populations:
 - all Fandango ratings for popular movies released in 2015
 - all Fandango ratings for popular movies released in 2016
 
However, the term "popular" is quite vague and should be precisely defined before moving further. In this case, we'll borrow Hickey's definition of movies having at least 30 user ratings on Fandango's website.

Because this was the criterion used by Hickey, his data should already meet this requirement.

In [10]:
# Count the movies with fewer than 30 fan ratings (we expect none)
(ratings_15['Fandango_votes'] < 30).sum()

0

Now that we have confirmed that our 2015 data meets our "popular" definition, we can move on to the 2016-17 dataset. Confirming that these movies all have at least 30 user reviews is much more challenging as this parameter was not included in the data. As a proxy, we will randomly select 10 movies and manually check the website to find the number of user reviews for each of the randomly selected movies.

In [12]:
ratings_16_17.sample(10, random_state=0)

Unnamed: 0,movie,year,fandango
197,The Take (Bastille Day),2016,4.0
37,Come and Find Me,2016,4.0
89,Kickboxer,2016,4.0
176,The Founder,2016,4.0
170,The Darkness,2016,2.5
75,Ice Age: Collision Course,2016,4.0
96,Lion,2016,4.0
137,Ride Along 2,2016,4.0
5,A Monster Calls,2016,4.0
83,Jane Got a Gun,2016,3.5


| Movie | Number of Fandango User Ratings |
|------|------|
| The Take (Bastille Day)  | **29** |
| Come and Find Me | **2** |
| Kickboxer | **13** |
| The Founder | 1033 |
| The Darkness | 911 |
| Ice Age: Collision Course | 2242 |
| Lion | 3706 |
| Ride Along 2 | 6662 |
| A Monster Calls | 499 |
| Jane Got a Gun | 365 |

In the above movie list, *Kickboxer* was assumed to be *Kickboxer: Vengeance*, which was released in 2016, as opposed to the original *Kickboxer* movie, which was released in 1989.

In this sample, 3 out of the 10 movies have fewer than 30 Fandango user ratings, so only 70% of the sample meets the intended requirements.

In [13]:
ratings_16_17.sample(10, random_state=5)

Unnamed: 0,movie,year,fandango
21,Before the Flood,2016,3.5
54,Fifty Shades of Black,2016,2.5
84,Jason Bourne,2016,4.0
102,Manchester by the Sea,2016,3.5
26,Blood Father,2016,4.0
202,Under the Shadow,2016,4.0
208,Why Him?,2016,4.0
28,Busanhaeng,2016,4.5
6,A Street Cat Named Bob,2016,4.5
161,The Autopsy of Jane Doe,2016,4.5


| Movie | Number of Fandango User Ratings |
|------|------|
| Before the Flood  | **7** |
| Fifty Shades of Black | 1500 |
| Jason Bourne | 16304 |
| Manchester by the Sea | 3486 |
| Blood Father | 45 |
| Under the Shadow | **8** |
| Why Him? | 2736 |
| Busanhaeng | 276 |
| A Street Cat Named Bob | 40 |
| The Autopsy of Jane Doe | 41 |


In the above movie list, *Busanhaeng* was not found on the Fandango website. The American title of this movie, *Train to Busan*, was found instead.

In this sample, 2 out of the 10 movies have fewer than 30 Fandango user ratings, so 80% of the sample meets the intended requirements.

## Next Steps

Next, we need to remove any movies that were not released in 2015 from the first data set and any movies that were not released in 2016 from the second data set.
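
The year filter described above could be sketched as follows. This is only an illustration under stated assumptions: the two small DataFrames are stand-ins built from rows shown earlier in this notebook (the real inputs would be `ratings_15` and `ratings_16_17`), and the code assumes that every `FILM` title ends with its release year in parentheses, which should be checked against the full data. The names `fandango_2015` and `fandango_2016` are hypothetical.

```python
import pandas as pd

# Stand-in data copied from the .head() outputs above (not the full data sets)
ratings_15_demo = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)',
             'Hot Tub Time Machine 2 (2015)'],
    'Fandango_Stars': [5.0, 5.0, 3.5],
})
ratings_16_17_demo = pd.DataFrame({
    'movie': ['10 Cloverfield Lane', '13 Hours', "A Dog's Purpose"],
    'year': [2016, 2016, 2017],
    'fandango': [3.5, 4.5, 4.5],
})

# Assumption: the release year is the four digits in the trailing parentheses
year_text = ratings_15_demo['FILM'].str.extract(r'\((\d{4})\)\s*$', expand=False)
ratings_15_demo['year'] = pd.to_numeric(year_text, errors='coerce')

# Keep only 2015 releases in the first set and 2016 releases in the second
fandango_2015 = ratings_15_demo[ratings_15_demo['year'] == 2015].copy()
fandango_2016 = ratings_16_17_demo[ratings_16_17_demo['year'] == 2016].copy()
```

Titles that do not match the pattern end up with a missing year and are dropped by the comparison, so it is worth counting the rows before and after filtering on the real data.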

Generate plots:

Generate two kernel density plots on the same figure for the distribution of movie ratings of each sample. Customize the graph such that:

 - It has a title with an increased font size.
 - It has labels for both the x-axis and the y-axis.
 - It has a legend which explains which distribution is for 2015 and which is for 2016.
 - The x-axis starts at 0 and ends at 5 because movie ratings on Fandango start at 0 and end at 5.
 - The tick labels of the x-axis are: [0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0].
 - It has the `fivethirtyeight` style (this is optional). You can change to this style by using `plt.style.use('fivethirtyeight')`. This line of code must be placed before the code that generates the kernel density plots.
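
One possible way to meet the plotting requirements above is sketched here. It assumes matplotlib and SciPy are installed (pandas relies on SciPy for kernel density estimates). The two Series are tiny stand-ins taken from the rows shown earlier in this notebook, so the resulting curves are only a smoke test and say nothing about the real distributions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in ratings from the .head() outputs above; replace with the
# full 2015 'Fandango_Stars' and 2016 'fandango' columns.
stars_2015 = pd.Series([5.0, 5.0, 5.0, 5.0, 3.5])
stars_2016 = pd.Series([3.5, 4.5, 3.0, 3.0])

# The style must be set before the plots are drawn
plt.style.use('fivethirtyeight')

fig, ax = plt.subplots(figsize=(8, 5.5))
stars_2015.plot.kde(ax=ax, label='2015')
stars_2016.plot.kde(ax=ax, label='2016')

ax.set_title("Distribution of Fandango ratings, 2015 vs. 2016", fontsize=18)
ax.set_xlabel('Stars')
ax.set_ylabel('Density')
ax.set_xlim(0, 5)                       # Fandango ratings run from 0 to 5
ax.set_xticks(np.arange(0, 5.1, 0.5))   # ticks at every half-star
ax.legend()
```

Drawing both curves on one `Axes` keeps them on a shared scale, which makes a shift between the two years easier to judge by eye.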



Analyze the two kernel density plots. Try to answer the following questions:

 - What is the shape of each distribution?
 - How do their shapes compare?
 - If their shapes are similar, is there anything that clearly differentiates them?
 - Can we see any evidence on the graph that suggests that there is indeed a change between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016?
 - Provided there's a difference, can we tell anything about the direction of the difference? In other words, were movies in 2016 rated lower or higher compared to 2015?

The kernel density plots showed that there's a clear difference between the two distributions. They also provided us with information about the direction of the difference: movies in 2016 were rated slightly lower than those in 2015.

While comparing the distributions with the help of the kernel density plots was a great start, we now need to analyze more granular information.



 - Examine the frequency distribution tables of the two distributions.
    - The samples have different numbers of movies. Does it make sense to compare the two tables using absolute frequencies?
    - If absolute frequencies are not useful here, would relative frequencies be of more help? If so, what would be better for readability: proportions or percentages?
 - Analyze the two tables and try to answer the following questions:
    - Is it still clear that there is a difference between the two distributions?
    - What can you tell about the direction of the difference just from the tables? Is the direction still as clear?
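
Because the two samples differ in size, relative frequencies are the natural basis for the tables discussed above. The sketch below shows one way to build them as percentages; the input Series are stand-ins taken from rows shown earlier in this notebook, and `percent_table` is a hypothetical helper name.

```python
import pandas as pd

# Stand-in ratings from the .head() outputs above, not the full samples
stars_2015 = pd.Series([5.0, 5.0, 5.0, 5.0, 3.5])
stars_2016 = pd.Series([3.5, 4.5, 3.0, 3.0])

def percent_table(stars):
    """Relative frequency of each star rating, as a percentage."""
    return (stars.value_counts(normalize=True).sort_index() * 100).round(1)

# Align both tables on the union of star values; a rating that never
# occurs in one sample gets 0%.
freq = pd.DataFrame({
    '2015': percent_table(stars_2015),
    '2016': percent_table(stars_2016),
}).fillna(0)
print(freq)
```

Placing both years side by side in one DataFrame makes it straightforward to compare the share of movies at each star level.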


The two tables confirmed that there is indeed a clear difference between the two distributions. However, the direction of the difference is not as clear as it was on the kernel density plots.

We'll take a couple of summary statistics (remember the distinction between sample statistics and population parameters) to get a more precise picture of the direction of the difference. We'll take each distribution of movie ratings and compute its mean, median, and mode, and then compare these statistics to determine what they tell us about the direction of the difference.

The pandas methods for computing these summary metrics are:

 - `Series.mean()`
 - `Series.median()`
 - `Series.mode()`

Instructions:

 - Compute the mean, median, and mode for each distribution.
 - Compare these metrics and determine what they tell about the direction of the difference.
 - What is the magnitude of the difference? Is there a big difference or just a slight difference?
 - Generate a grouped bar plot to show comparatively how the mean, median, and mode varied for 2015 and 2016.
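
The summary statistics could be collected into a single table along the following lines. Again, the input Series are small stand-ins taken from rows shown earlier in this notebook, so the numbers produced here are not the results of the analysis. When a distribution has several modes, `Series.mode()` returns all of them in ascending order; this sketch keeps the smallest, which is a simplification.

```python
import pandas as pd

# Stand-in ratings from the .head() outputs above, not the full samples
stars_2015 = pd.Series([5.0, 5.0, 5.0, 5.0, 3.5])
stars_2016 = pd.Series([3.5, 4.5, 3.0, 3.0])

def summarize(stars):
    """Mean, median, and (first) mode of a rating distribution."""
    return [stars.mean(), stars.median(), stars.mode()[0]]

summary = pd.DataFrame({
    '2015': summarize(stars_2015),
    '2016': summarize(stars_2016),
}, index=['mean', 'median', 'mode'])
print(summary)
```

With matplotlib installed, `summary.plot.bar(rot=0)` draws the grouped bar plot directly from this table, with one group per statistic and one bar per year.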

    

Our analysis showed that there's indeed a slight difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. We also determined that, on average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.

Try to wrap up your work by writing a conclusion that's no more than two paragraphs. In one of the paragraphs, try to answer what caused the change revealed by our analysis.

These are a few next steps to consider:

 - Customize your graphs more by reproducing almost completely the FiveThirtyEight style. You can take a look at this tutorial if you want to do that.
 - Improve your project from a stylistic point of view by following the guidelines discussed in this style guide.
 - Use the two samples to compare the ratings of different movie rating aggregators and recommend the best website to check for a movie rating. There are many approaches you can take here; you can take some inspiration from this article.
 - Collect recent movie ratings data and formulate your own research questions. You can take a look at this blog post to learn how to scrape movie ratings for IMDB and Metacritic.
