# 05 - Dashboard: The Money Trail
**Project**: Premier League Competitiveness Analysis (2000-2024)  
**Purpose**: Reveal the financial inequality driving competitive decline

## The Story We're Telling
> Notebook 04 showed the Premier League is becoming less competitive. 
> This notebook reveals WHY — the financial forces reshaping English football.

## The Paradox
- Financial Gini is DECLINING (money spreading out relatively)
- Market Value Std Dev is EXPLODING (absolute gaps widening)
- Result: A threshold effect where only mega-spending clubs can compete

## Charts We'll Create
1. Financial Gini over time
2. Market Value Standard Deviation over time  
3. The Paradox — dual axis: Financial Gini vs Points Gini
4. Squad Value vs Position scatter (single season)
5. Money-Position Correlation trend

In [0]:
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Load metrics table (`spark` is the SparkSession predefined by the notebook environment)
df_metrics = spark.table("competitiveness_metrics").toPandas()

# Load unified data for scatter plots
df_unified = spark.table("unified_season_data").toPandas()

## Chart 1: Financial Gini Coefficient (2004-2024)
The Gini coefficient applied to squad market values measures how unequally money is distributed across the 20 clubs each season.

**What to expect:** If financial inequality were driving competitive decline, we'd expect this to be RISING. But look at the trendline...

**The surprise:** Financial Gini is actually DECLINING — from ~0.36 in 2004 to ~0.27 in 2024. Money appears to be spreading out more evenly across clubs. 

So if money is becoming more equal, why is the league becoming LESS competitive? Chart 2 reveals the answer.
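
For reference, here is a minimal sketch of how a Gini coefficient can be computed from one season's squad values. The `gini` helper is hypothetical: the notebook only reads a precomputed `financial_gini` column, so the exact formula used upstream is an assumption here.

```python
import numpy as np

def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly equal,
    values approaching 1 mean one club holds almost everything."""
    x = np.sort(np.asarray(values, dtype=float))  # ascending order
    n = x.size
    if n == 0 or x.sum() == 0:
        return float("nan")  # undefined for empty or all-zero input
    ranks = np.arange(1, n + 1)
    # Rank-weighted form of the mean absolute difference definition
    return float(2 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)
```

Assuming `df_unified` holds one row per club per season, something like `df_unified.groupby('season_start_year')['total_market_value_eur'].apply(gini)` would give one value per season.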

In [0]:
# trendline='ols' requires the statsmodels package
gini_fig = px.scatter(
    df_metrics,
    x='season_start_year',
    y='financial_gini',
    trendline='ols',
    title='Financial Gini: Relative Inequality Is Falling',
)

gini_fig.update_layout(xaxis_title='Season', yaxis_title='Financial Gini Coefficient', xaxis_range=[2003, 2025])

# Shade each ownership/financing era and label it at the midpoint of its band.
# Note: the 2000-2003 band lies outside the visible x-range set above.
eras = [
    (2000, 2003, "lightblue", "Pre-Abra"),
    (2003, 2007, "grey", "Abramovich"),
    (2007, 2010, "lightgreen", "Man City"),
    (2010, 2015, "purple", "FFP"),
    (2015, 2019, "lightyellow", "TV Money"),
    (2019, 2024, "red", "State Ownership"),
]

for x0, x1, color, label in eras:
    gini_fig.add_vrect(x0=x0, x1=x1, fillcolor=color, opacity=0.15, layer="below", line_width=0)
    gini_fig.add_annotation(
        x=(x0 + x1) / 2, y=0.17, textangle=-45, text=label,
        showarrow=False, font=dict(size=15),
    )

# Last expression in the cell, so the notebook renders the figure
gini_fig

## Chart 2: Market Value Standard Deviation (2004-2024)
While the Gini coefficient measures *relative* inequality (how the pie is divided), standard deviation measures the *absolute* spread in euros.

**The reveal:** The standard deviation has nearly QUADRUPLED — from €86m in 2004 to €324m in 2024.

**What this means:** Even though money is more "evenly distributed" in relative terms, the actual euro gap between top and bottom clubs has exploded. A club that was €50m behind in 2004 might now be €300m behind — an insurmountable chasm.
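
The difference between relative and absolute inequality can be illustrated with a small sketch. The squad values below are hypothetical numbers chosen for illustration, not data from this project, and the `gini` helper is an assumed implementation. When every club's value is multiplied by the same factor, the Gini coefficient does not move at all, while the standard deviation and the top-to-bottom gap grow by that factor.

```python
import numpy as np

def gini(values):
    """Gini coefficient via the rank-weighted formula (assumed implementation)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)

# Hypothetical squad values in EUR millions (illustration only)
early = np.array([50.0, 80.0, 120.0, 200.0, 350.0])
late = early * 4  # every club's value grows fourfold

gini_early, gini_late = gini(early), gini(late)   # identical: Gini is scale-invariant
std_early, std_late = early.std(), late.std()     # std scales with the values
gap_early = early.max() - early.min()             # 300
gap_late = late.max() - late.min()                # 1200

print(f"Gini: {gini_early:.2f} -> {gini_late:.2f}")
print(f"Std:  {std_early:.1f} -> {std_late:.1f}")
print(f"Gap:  {gap_early:.0f} -> {gap_late:.0f}")
```

In the real data the Gini falls rather than staying flat, but the same logic applies: a modest drop in relative inequality can be swamped by growth in the overall scale of squad values.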

In [0]:
df_mv_std = df_metrics[df_metrics['mv_std'].notnull()]

mv_std_fig = px.bar(
    df_mv_std,
    x='season_start_year',
    y='mv_std',
    title='The Absolute Gap Explodes'
)

mv_std_fig.update_layout(
    xaxis_title='Season',
    yaxis_title='Market Value Std Dev (EUR Millions)',
    xaxis_range=[2003, 2025]
)

## Chart 3: The Paradox — Financial Gini vs Points Gini
This is the heart of the story. Two trends that seem like they should move together are actually moving in OPPOSITE directions.

- **Financial Gini (declining):** Money is spreading out across clubs
- **Points Gini (rising):** The league is becoming less competitive

**The paradox:** If money is becoming more equal, why are results becoming LESS equal?

**The answer:** It's a threshold effect. The baseline cost of competing has risen so dramatically that even though money is "spreading out," most clubs still can't cross the spending threshold required to compete for titles. The absolute gap (Chart 2) matters more than the relative distribution (Chart 1).

In [0]:
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add Financial Gini on left y-axis
fig.add_trace(
    go.Scatter(x=df_metrics['season_start_year'], y=df_metrics['financial_gini'], name="Financial Gini", line=dict(color="green")),
    secondary_y=False
)

# Add Points Gini on right y-axis  
fig.add_trace(
    go.Scatter(x=df_metrics['season_start_year'], y=df_metrics['gini'], name="Points Gini", line=dict(color="red")),
    secondary_y=True
)

# Update layout with titles for both y-axes
fig.update_yaxes(title_text="Financial Gini", secondary_y=False)
fig.update_yaxes(title_text="Points Gini", secondary_y=True)

fig.update_layout(
    title="The Paradox: Money Spreads, Competition Shrinks<br><sup>Financial Gini only starts in 2004 (green line)</sup>"
)

## Chart 4: Squad Value vs Final Position (2023/24)
Does money actually buy league position? Let's look at the most recent complete season.

Each dot represents one club. The x-axis shows their squad market value, the y-axis shows where they finished.

**What to expect:** A clear negative correlation — richer clubs finish higher (lower position number = better).

In [0]:
df_2023 = df_unified[df_unified['season_start_year'] == 2023].copy()
df_2023['market_value_million'] = df_2023['total_market_value_eur'] / 1_000_000

fig = px.scatter(
    df_2023,
    x='market_value_million',
    y='position',
    text='club',
    title='2023/24: Market Value vs. League Position'
)
fig.update_traces(textposition='top center')
fig.update_yaxes(autorange="reversed", title='Position')
fig.update_xaxes(title='Total Market Value (Million EUR)')

## Chart 5: Money-Position Correlation Over Time (2004-2024)
How reliably does squad value predict final league position each season?

A correlation of -1.0 means money perfectly predicts position (negative because higher value = lower position number). A correlation of 0 means no relationship.

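**What we see:** The correlation is consistently strong (between -0.6 and -0.9), meaning squad value reliably predicts league position year after year. Squaring those values gives the share of variance in position that squad value accounts for: roughly 36% to 81%. There's no escape from financial determinism.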
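
A per-season correlation like this could be computed along the following lines. This is a sketch under assumptions: the notebook reads `financial_corr` from a precomputed table, so the use of Pearson correlation, the helper name `season_value_position_corr`, and the toy rows are all hypothetical. Only the column names come from `df_unified`.

```python
import pandas as pd

def season_value_position_corr(df, method="pearson"):
    """Correlation between squad value and final position, one value per season."""
    corr_by_season = {
        season: group["total_market_value_eur"].corr(group["position"], method=method)
        for season, group in df.groupby("season_start_year")
    }
    return pd.Series(corr_by_season, name="financial_corr")

# Hypothetical four-club seasons (illustration only)
toy = pd.DataFrame({
    "season_start_year": [2022] * 4 + [2023] * 4,
    "total_market_value_eur": [400e6, 300e6, 200e6, 100e6] * 2,
    "position": [1, 2, 3, 4, 2, 1, 4, 3],
})

corr = season_value_position_corr(toy)
r_squared = corr ** 2  # share of variance in position associated with squad value
print(corr)
print(r_squared)
```

Because `position` is a rank, passing `method="spearman"` may be a more natural choice than Pearson; which one the metrics table actually uses is not shown here.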

In [0]:
fig = px.line(
    df_metrics,
    x='season_start_year',
    y='financial_corr',
    title='Financial Correlation Over Time'
)

fig.add_hline(
    y=-0.7,
    line_dash="dash",
    line_color="red",
    annotation_text="Strong Correlation Threshold (-0.7)",
    annotation_position="top left"
)

fig.update_layout(
    xaxis_title='Season',
    yaxis_title='Financial Correlation',
    xaxis_range=[2003, 2025]
)

## Summary: The Money Trail

### Key Findings
1. **The Misleading Good News:** Financial Gini has declined from 0.36 to 0.27 — money appears to be spreading out
2. **The Hidden Reality:** Market Value Standard Deviation has nearly quadrupled (€86m → €324m), so the absolute gap is enormous
3. **The Paradox:** Relative equality is improving while competitive inequality worsens
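4. **The Mechanism:** Squad value correlates with final position at between -0.6 and -0.9 every season, which corresponds to roughly 36-81% of the variance in league position (r²). There's no escape from financial determinism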
5. **The Threshold Effect:** The cost of competing has risen so high that even "spreading" money can't close the gap

### The Bottom Line
The Premier League hasn't become uncompetitive because money is concentrated — it's become uncompetitive because the *scale* of money required to compete has grown beyond what most clubs can achieve. The barrier to entry isn't inequality; it's magnitude.