### Statistical Analysis: Relationship Between Rating and Number of Votes

To examine whether there is a statistical relationship between the **number of votes** (`numVotes`)  
and the **average IMDb rating** (`averageRating`), the **Pearson correlation coefficient** was calculated.
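
For reference, the standard sample definition of the coefficient is shown below, where $x_i$ stands for the `numVotes` and $y_i$ for the `averageRating` of title $i$, and $\bar{x}$, $\bar{y}$ are the sample means. The symbols $x_i$, $y_i$, and $n$ are notation introduced here for illustration:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The value of $r$ lies between $-1$ and $1$; values near zero indicate that there is little linear association between the two variables.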


In [1]:
import sqlite3
import pandas as pd
from scipy.stats import pearsonr

# Query the database
conn = sqlite3.connect("../data/raw/imdb.sqlite") 
df = pd.read_sql_query("""
    SELECT averageRating, numVotes
    FROM basics
    JOIN ratings USING(tconst)
    WHERE averageRating IS NOT NULL AND numVotes IS NOT NULL
""", conn)
conn.close()

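# Convert to numeric values; errors="coerce" turns unparseable entries into NaN
df["numVotes"] = pd.to_numeric(df["numVotes"], errors="coerce")
df["averageRating"] = pd.to_numeric(df["averageRating"], errors="coerce")
# Drop any rows that became NaN so they cannot propagate into the correlation
df = df.dropna(subset=["numVotes", "averageRating"])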

# Compute the correlation
x = df["numVotes"]
y = df["averageRating"]

corr, pval = pearsonr(x, y)
print(f"Correlation coefficient: {corr:.3f}")
print(f"p-value: {pval:.5f}")


Correlation coefficient: 0.061
p-value: 0.00000


### Conclusion of the Statistical Analysis

The analysis of the relationship between the number of votes (`numVotes`)  
and the average IMDb rating (`averageRating`) yields the following result:

- The calculated **correlation of 0.061** is close to zero, so there is **no meaningful linear relationship** between the two variables. With r² ≈ 0.004, the number of votes explains less than 1% of the variance in ratings.
- The extremely low **p-value (< 0.00001)** indicates statistical significance, but the effect is **not practically relevant**.
- The significance is most likely due to the **large sample size**, not the strength of the effect.
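
The gap between statistical significance and practical relevance can be illustrated with a small simulation. The sketch below does not use the IMDb data: it draws synthetic pairs with an assumed true correlation of 0.06 (the helper `weak_correlation_sample` and its parameters are hypothetical names chosen for this example) and shows how the p-value reacts to the sample size while the effect size stays small.

```python
import numpy as np
from scipy.stats import pearsonr


def weak_correlation_sample(n, rho=0.06, seed=0):
    """Draw n synthetic pairs (x, y) whose true correlation is rho."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Mixing x with independent noise gives corr(x, y) = rho in expectation
    y = rho * x + np.sqrt(1 - rho**2) * noise
    return x, y


# Same weak effect, increasingly large samples
for n in (100, 10_000, 1_000_000):
    x, y = weak_correlation_sample(n)
    r, p = pearsonr(x, y)
    print(f"n={n:>9,}  r={r:+.3f}  r^2={r * r:.4f}  p={p:.2e}")
```

With a fixed weak effect, the p-value generally shrinks as `n` grows, whereas r² stays well below 1%. This is why the effect size, not the p-value, should drive the interpretation for a dataset of this size.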

**Conclusion:**

The average rating of a movie on IMDb shows **almost no linear dependence** on how many votes it has received.  
In this dataset, widely known and lesser-known films tend to receive similar average ratings.  
This challenges the common assumption that popular films are automatically rated higher, at least in a linear sense.

For further analysis, it may be worth exploring **nonlinear relationships**, **outlier behavior**, or **genre-specific correlations**.
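
One possible starting point for the nonlinear follow-up is to compare Pearson's r with Spearman's rank correlation and with Pearson's r on log-transformed vote counts. The sketch below is an assumption about how such a check could look, not part of the original analysis: the helper `rank_and_log_correlations` is a hypothetical name, and the demo frame is synthetic stand-in data rather than the IMDb tables.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr, spearmanr


def rank_and_log_correlations(df):
    """Return Spearman's rho and Pearson's r on log10(numVotes).

    Expects the columns `numVotes` and `averageRating`.
    """
    clean = df.dropna(subset=["numVotes", "averageRating"])
    clean = clean[clean["numVotes"] > 0]  # log10 needs positive counts
    rho, rho_p = spearmanr(clean["numVotes"], clean["averageRating"])
    r_log, r_log_p = pearsonr(np.log10(clean["numVotes"]), clean["averageRating"])
    return {
        "spearman_rho": rho,
        "spearman_p": rho_p,
        "pearson_log_r": r_log,
        "pearson_log_p": r_log_p,
    }


# Synthetic stand-in data (NOT IMDb results): rating rises with log10(votes)
rng = np.random.default_rng(42)
votes = np.round(10 ** rng.uniform(1, 6, size=5_000))
rating = np.clip(5.0 + 0.4 * np.log10(votes) + rng.normal(0, 1.0, size=5_000), 1, 10)
demo = pd.DataFrame({"numVotes": votes, "averageRating": rating})

results = rank_and_log_correlations(demo)
plain_r, _ = pearsonr(demo["numVotes"], demo["averageRating"])
print(results)
print(f"plain Pearson r: {plain_r:.3f}")
```

If the rank-based or log-scale coefficient on the real data turns out to be noticeably larger than the plain Pearson value, that would hint at a monotonic but nonlinear relationship that is masked by the heavy skew of `numVotes`.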
