In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
movies = pd.read_csv('fandango_score_comparison.csv')
movies

In [None]:
plt.hist(movies['Metacritic_norm_round'])

In [None]:
plt.hist(movies['Fandango_Stars'])

## Fandango vs Metacritic Scores

- No movie scored below 3 stars on Fandango.
- The Fandango ratings also tend to center around 4.0 and 4.5, whereas the Metacritic ratings center around 3.0 and 3.5.


In [None]:
fandango_mean = movies['Fandango_Stars'].mean()
metacritic_mean = movies['Metacritic_norm_round'].mean()

fandango_median = movies['Fandango_Stars'].median()
metacritic_median = movies['Metacritic_norm_round'].median()

fandango_std = np.std(movies['Fandango_Stars'])
metacritic_std = np.std(movies['Metacritic_norm_round'])

In [None]:
print(fandango_mean)
print(fandango_median)
print(fandango_std)

print(metacritic_mean)
print(metacritic_median)
print(metacritic_std)
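
One caveat on the standard deviations printed above: `np.std` defaults to the population formula (`ddof=0`), while pandas' `Series.std()` defaults to the sample formula (`ddof=1`), so the two can give slightly different numbers for the same column. A minimal sketch with made-up ratings (the `toy_ratings` values are hypothetical, not taken from the dataset):

```python
import numpy as np

# Hypothetical ratings, for illustration only (not from the CSV).
toy_ratings = np.array([3.0, 3.5, 4.0, 4.5, 5.0])

# Population standard deviation: divides by n (np.std default, ddof=0).
pop_std = np.std(toy_ratings)

# Sample standard deviation: divides by n - 1 (the pandas Series.std default).
sample_std = np.std(toy_ratings, ddof=1)

print(pop_std, sample_std)
```

With a dataset of this size the gap is small, but it is worth knowing which formula produced a given number.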

## Metacritic vs Fandango Review Methodologies
Fandango's ratings are higher, and its methodology is not transparent. Metacritic is transparent about how it aggregates reviews into a final rating.

## Fandango vs Metacritic number differences
- The median Metacritic score is higher than the mean Metacritic score because a few very low ratings "drag down" the mean. The median Fandango score is lower than the mean Fandango score because a few very high ratings "drag up" the mean.
- Fandango ratings are clustered between 3 and 5 and have a much narrower range than Metacritic ratings, which go from 0 to 5.
- Fandango ratings in general appear to be higher than Metacritic ratings.
- These differences may be due to movie studio influence on Fandango ratings and to the fact that Fandango calculates its ratings in a hidden way.
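
The mean-versus-median point in the first bullet can be checked on a toy example. This is only a sketch with made-up numbers (`low_tail` and `high_tail` are hypothetical, not drawn from the dataset): a single extreme rating moves the mean toward it while leaving the median where it was.

```python
import numpy as np

# Hypothetical ratings, for illustration only.
low_tail = np.array([0.5, 3.0, 3.5, 3.5, 4.0])   # one very low rating
high_tail = np.array([3.0, 3.5, 3.5, 4.0, 5.0])  # one very high rating

# Both arrays share the same median, but their means move in opposite directions.
print(np.mean(low_tail), np.median(low_tail))    # mean falls below the median
print(np.mean(high_tail), np.median(high_tail))  # mean rises above the median
```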

In [None]:
plt.scatter(movies['Metacritic_norm_round'], movies['Fandango_Stars'])

In [None]:
fm_diff = np.abs(movies['Fandango_Stars'] - movies['Metacritic_norm_round'])

In [None]:
movies['fm_diff'] = fm_diff

In [None]:
movies.sort_values(by='fm_diff', ascending=False).head(5)

In [None]:
from scipy.stats import pearsonr

In [None]:
# IPython help lookup: shows the docstring for pearsonr
pearsonr?

In [None]:
r_value, p_value = pearsonr(movies['Fandango_Stars'], movies['Metacritic_norm_round'])

In [None]:
r_value

## Fandango and Metacritic correlation
The low correlation between Fandango and Metacritic scores indicates that Fandango scores aren't just inflated; they are fundamentally different. For whatever reason, Fandango appears both to inflate scores overall and to inflate them by different amounts depending on the movie.
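
The reasoning here rests on a property of Pearson's r: adding the same constant to every score leaves r unchanged, so uniform inflation alone would not lower the correlation. A minimal sketch with made-up numbers (the `critic` scores and the offsets are hypothetical), using `np.corrcoef` rather than `pearsonr` to keep it NumPy-only:

```python
import numpy as np

# Hypothetical critic scores, for illustration only.
critic = np.array([1.0, 2.0, 2.5, 3.0, 4.0, 4.5])

# Case 1: every score inflated by the same amount.
uniform = critic + 1.0

# Case 2: inflation that varies from movie to movie (made-up offsets).
uneven = critic + np.array([3.0, 0.5, 2.0, 0.0, 1.0, 0.0])

# np.corrcoef returns a 2x2 correlation matrix; [0, 1] is r between the two inputs.
r_uniform = np.corrcoef(critic, uniform)[0, 1]
r_uneven = np.corrcoef(critic, uneven)[0, 1]
print(r_uniform, r_uneven)
```

The uniformly shifted scores stay perfectly correlated with the originals, while the unevenly shifted ones do not.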

In [None]:
from scipy.stats import linregress

In [None]:
slope, intercept, r_value, p_value, stderr_slope = linregress(movies["Metacritic_norm_round"], movies["Fandango_Stars"])

In [None]:
# Predicting Fandango score for movie with score of 3 on Metacritic
pred_3 = 3 * slope + intercept
pred_3

## Finding Residuals

In [None]:
pred_1 = 1 * slope + intercept
pred_5 = 5 * slope + intercept

In [None]:
plt.scatter(movies["Metacritic_norm_round"], movies["Fandango_Stars"])
plt.plot([1,5],[pred_1,pred_5])
plt.xlim(1,5)
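
A residual is the difference between an actual value and the value the fitted line predicts for it. The sketch below computes residuals on made-up data (`x`, `y`, `toy_slope`, and `toy_intercept` are hypothetical names, not part of the notebook), and it uses `np.polyfit` in place of `linregress` so that it depends only on NumPy.

```python
import numpy as np

# Hypothetical (Metacritic, Fandango) pairs, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 3.5, 4.5, 4.0, 5.0])

# Degree-1 least-squares fit; np.polyfit returns the slope first, then the intercept.
toy_slope, toy_intercept = np.polyfit(x, y, 1)

predicted = toy_slope * x + toy_intercept
residuals = y - predicted  # positive: actual score is above the fitted line

print(residuals)
```

With the notebook's own variables, the equivalent would be `movies['Fandango_Stars'] - (slope * movies['Metacritic_norm_round'] + intercept)`.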