# Day 32 Analysis

I came across an interesting [article](https://www.washingtonpost.com/climate-environment/2022/11/14/dolphins-hard-rock-sun-shade/) the other day about the design of Miami's football stadium. When the stadium was renovated between 2015 and 2016, engineers deliberately designed it so that the sun falls directly on the opponent’s sideline for the entire game while the home team’s sideline sits in the shade. During the warmest months of the NFL season, the temperature difference between the home and away sidelines can reach 30 degrees F!

Naturally, I want to use data to see if this "advantage" really makes a difference. I'll start by finding every team's winning percentage at home since 1999, because that's the data I have available.

In [1]:
import pandas as pd
import sqlite3

# Create database connection
conn = sqlite3.connect('../../data/db/database.db')

In [9]:
query = """
WITH home_games AS (
    SELECT
        home_team,
        home_score,
        away_score,
        result,
        CASE
            WHEN result > 0 THEN 'win'
            ELSE 'loss'
        END AS win_loss,
        total AS total_score,
        temp,
        wind,
        COUNT(*) OVER (PARTITION BY home_team) AS num_games
    FROM schedules 
), aggregations AS (
    SELECT
        home_team,
        win_loss,
        ROUND(AVG(result),2) AS avg_result,
        ROUND(AVG(home_score),2) AS avg_home_score,
        COUNT(*) AS num_win_loss,
        ROUND((1.0 * COUNT(*) / num_games) * 100,2) AS win_loss_pct
    FROM home_games
    GROUP BY home_team, win_loss
)
SELECT *
FROM aggregations
WHERE win_loss = 'win'
ORDER BY win_loss_pct DESC
"""

pd.read_sql(query, conn).head(20)

| | home_team | win_loss | avg_result | avg_home_score | num_win_loss | win_loss_pct |
|---|---|---|---|---|---|---|
| 0 | NE | win | 14.95 | 30.05 | 169 | 76.13 |
| 1 | GB | win | 13.79 | 30.1 | 146 | 69.86 |
| 2 | BAL | win | 14.26 | 27.54 | 138 | 69.0 |
| 3 | PIT | win | 12.72 | 27.54 | 139 | 66.83 |
| 4 | IND | win | 12.32 | 29.04 | 136 | 66.02 |
| 5 | SEA | win | 13.11 | 28.51 | 136 | 65.7 |
| 6 | MIN | win | 11.94 | 28.69 | 128 | 64.32 |
| 7 | KC | win | 13.19 | 29.33 | 126 | 60.87 |
| 8 | DEN | win | 11.88 | 28.14 | 123 | 60.29 |
| 9 | PHI | win | 14.38 | 29.12 | 120 | 57.69 |


Historically, Miami doesn't do well at home...they've won only 54% of their home games since 1999 – not much of an advantage!  
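
The ranking query only displays the top rows, so here is a minimal sketch of how a single team's home record could be pulled directly. It runs against a tiny in-memory table with made-up rows rather than my real database, and `home_win_pct` is a hypothetical helper name. It also assumes that unplayed games have a NULL `result`, which the `CASE` expression above would otherwise count as losses. Ties are still counted as non-wins, same as in the query above.

```python
import sqlite3

# Toy stand-in for the `schedules` table; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules (season INTEGER, home_team TEXT, result INTEGER)"
)
conn.executemany(
    "INSERT INTO schedules VALUES (?, ?, ?)",
    [
        (2015, "MIA", 7),
        (2015, "MIA", -3),
        (2016, "MIA", 10),
        (2016, "MIA", 0),    # tie: counted as a non-win
        (2016, "NE", 14),
        (2016, "NE", None),  # assumed "not played yet": excluded below
    ],
)


def home_win_pct(conn, team):
    """Return (games, wins, win_pct) for one team's completed home games."""
    games, wins = conn.execute(
        """
        SELECT
            COUNT(*) AS games,
            SUM(CASE WHEN result > 0 THEN 1 ELSE 0 END) AS wins
        FROM schedules
        WHERE home_team = ?
            AND result IS NOT NULL
        """,
        (team,),
    ).fetchone()
    # SUM() is NULL when no rows match, so guard against an empty result.
    if not games:
        return 0, 0, None
    return games, wins, round(100.0 * wins / games, 2)


print(home_win_pct(conn, "MIA"))
```

Using one row per team with conditional aggregation avoids the window function and the `win_loss = 'win'` filter, at the cost of dropping the per-outcome averages.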

The article mentioned that the stadium was renovated between the 2015 and 2016 seasons. I'll update the query and see if Miami has enjoyed a stronger home-field advantage from 2016 onwards.

In [14]:
query = """
WITH home_games AS (
    SELECT
        home_team,
        home_score,
        away_score,
        result,
        CASE
            WHEN result > 0 THEN 'win'
            ELSE 'loss'
        END AS win_loss,
        total AS total_score,
        temp,
        wind,
        COUNT(*) OVER (PARTITION BY home_team) AS num_games
    FROM schedules 
    -- Get home-field record starting when stadium renovation completed
    WHERE season >= 2016
), aggregations AS (
    SELECT
        home_team,
        win_loss,
        ROUND(AVG(result),2) AS avg_result,
        ROUND(AVG(home_score),2) AS avg_home_score,
        COUNT(*) AS num_win_loss,
        ROUND((1.0 * COUNT(*) / num_games) * 100,2) AS win_loss_pct
    FROM home_games
    GROUP BY home_team, win_loss
)
SELECT *
FROM aggregations
WHERE win_loss = 'win'
ORDER BY win_loss_pct DESC
"""

pd.read_sql(query, conn).head(15)

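| | home_team | win_loss | avg_result | avg_home_score | num_win_loss | win_loss_pct |
|---|---|---|---|---|---|---|
| 0 | KC | win | 12.55 | 29.9 | 49 | 71.01 |
| 1 | GB | win | 11.93 | 29.44 | 43 | 69.35 |
| 2 | NE | win | 17.35 | 31.19 | 43 | 67.19 |
| 3 | BUF | win | 14.18 | 29.97 | 38 | 63.33 |
| 4 | PIT | win | 10.92 | 28.32 | 38 | 63.33 |
| 5 | BAL | win | 14.27 | 28.76 | 37 | 62.71 |
| 6 | MIN | win | 11.58 | 28.31 | 36 | 62.07 |
| 7 | TEN | win | 10.0 | 28.36 | 36 | 61.02 |
| 8 | NO | win | 12.0 | 31.74 | 38 | 60.32 |
| 9 | DAL | win | 15.03 | 33.22 | 36 | 60.0 |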


There's an improvement – winning percentage at home goes from 54% to ~60%. However, they are still outside the top 10 when ranking teams by their win percentage at home.  
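
To make the before/after comparison explicit, both periods could be computed side by side in one query instead of running two. This is only a sketch on invented rows: the in-memory table, the `period` labels, and the sample results are all assumptions for illustration, not numbers from my database.

```python
import sqlite3

# Toy data only: results here are made up to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules (season INTEGER, home_team TEXT, result INTEGER)"
)
conn.executemany(
    "INSERT INTO schedules VALUES (?, ?, ?)",
    [
        (2014, "MIA", -7), (2014, "MIA", 3),
        (2015, "MIA", -10), (2015, "MIA", -1),
        (2016, "MIA", 6), (2017, "MIA", 3),
        (2018, "MIA", -4), (2019, "MIA", 10),
    ],
)

query = """
SELECT
    CASE
        WHEN season >= 2016 THEN 'post-renovation'
        ELSE 'pre-renovation'
    END AS period,
    COUNT(*) AS games,
    SUM(CASE WHEN result > 0 THEN 1 ELSE 0 END) AS wins,
    ROUND(100.0 * SUM(CASE WHEN result > 0 THEN 1 ELSE 0 END) / COUNT(*), 2) AS win_pct
FROM schedules
WHERE home_team = 'MIA'
    AND result IS NOT NULL
GROUP BY period
ORDER BY period DESC  -- 'pre-renovation' sorts after 'post-renovation'
"""

rows = conn.execute(query).fetchall()
for period, games, wins, win_pct in rows:
    print(period, games, wins, win_pct)
```

Bucketing seasons with a `CASE` expression keeps both win percentages in the same result set, which makes the size of the change easier to read off.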

I'll try to narrow things down further...how is Miami's win percentage at home when **temperature is high**? I'll start by filtering for games where the temperature is >= 80 degrees F.

In [13]:
query = """
WITH home_games AS (
    SELECT
        home_team,
        home_score,
        away_score,
        result,
        CASE
            WHEN result > 0 THEN 'win'
            ELSE 'loss'
        END AS win_loss,
        total AS total_score,
        temp,
        wind,
        COUNT(*) OVER (PARTITION BY home_team) AS num_games
    FROM schedules 
    -- Get home-field record starting when stadium renovation completed
    WHERE season >= 2016
        AND temp >= 80
), aggregations AS (
    SELECT
        home_team,
        win_loss,
        ROUND(AVG(temp),2) AS avg_temp,
        ROUND(AVG(result),2) AS avg_result,
        ROUND(AVG(home_score),2) AS avg_home_score,
        COUNT(*) AS num_win_loss,
        ROUND((1.0 * COUNT(*) / num_games) * 100,2) AS win_loss_pct
    FROM home_games
    GROUP BY home_team, win_loss
)
SELECT *
FROM aggregations
WHERE win_loss = 'win'
ORDER BY win_loss_pct DESC
"""

pd.read_sql(query, conn).head(15)

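| | home_team | win_loss | avg_temp | avg_result | avg_home_score | num_win_loss | win_loss_pct |
|---|---|---|---|---|---|---|---|
| 0 | NO | win | 86.0 | 35.0 | 38.0 | 1 | 100.0 |
| 1 | BAL | win | 82.0 | 6.75 | 24.0 | 4 | 80.0 |
| 2 | LA | win | 85.17 | 18.17 | 30.0 | 6 | 75.0 |
| 3 | PHI | win | 85.0 | 9.33 | 24.67 | 3 | 75.0 |
| 4 | DEN | win | 85.6 | 9.0 | 25.6 | 5 | 71.43 |
| 5 | BUF | win | 83.0 | 7.0 | 23.5 | 2 | 66.67 |
| 6 | CHI | win | 84.5 | 4.5 | 21.5 | 2 | 66.67 |
| 7 | NE | win | 82.5 | 9.5 | 33.0 | 2 | 66.67 |
| 8 | TB | win | 84.44 | 14.75 | 30.81 | 16 | 66.67 |
| 9 | CAR | win | 81.88 | 9.88 | 26.25 | 8 | 61.54 |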


Miami's win percentage at home actually **drops when temperature >= 80**! That's a little surprising. Maybe it's not hot enough...I'll increase the threshold to 85 and check the results. I'll also require at least five wins at that temperature to reduce the noise from small samples. Teams outside Florida don't play enough hot home games to clear that bar.
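
One caveat on the noise filter: the query below thresholds on the number of *wins* (`num_win_loss`), not on the number of games played. If the intent is a minimum sample size, a `HAVING` clause on total games is one way to express it. The sketch below assumes that intent and uses invented rows; `hot_home_records` and its parameters are hypothetical names.

```python
import sqlite3

# Toy data only: teams, temperatures, and results are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules "
    "(season INTEGER, home_team TEXT, result INTEGER, temp INTEGER)"
)
conn.executemany(
    "INSERT INTO schedules VALUES (?, ?, ?, ?)",
    [
        (2016, "MIA", 7, 88), (2017, "MIA", -3, 90),
        (2018, "MIA", 10, 86), (2019, "MIA", -6, 87),
        (2017, "MIA", 20, 70),  # too cool: dropped by the temp filter
        (2016, "NO", 35, 86),   # one hot game: a 100% rate from a single sample
        (2016, "TB", 3, 85), (2017, "TB", -1, 89), (2018, "TB", 14, 91),
    ],
)


def hot_home_records(conn, min_temp, min_games):
    """Home win % in hot games, keeping only teams with enough games."""
    return conn.execute(
        """
        SELECT
            home_team,
            COUNT(*) AS games,
            SUM(CASE WHEN result > 0 THEN 1 ELSE 0 END) AS wins,
            ROUND(100.0 * SUM(CASE WHEN result > 0 THEN 1 ELSE 0 END) / COUNT(*), 2) AS win_pct
        FROM schedules
        WHERE season >= 2016
            AND temp >= ?
            AND result IS NOT NULL
        GROUP BY home_team
        HAVING COUNT(*) >= ?
        ORDER BY win_pct DESC
        """,
        (min_temp, min_games),
    ).fetchall()


records = hot_home_records(conn, min_temp=85, min_games=3)
for row in records:
    print(row)
```

Passing the thresholds as parameters also makes it easy to rerun the same query at 80, 85, or any other cutoff without copying the SQL.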

In [19]:
query = """
WITH home_games AS (
    SELECT
        home_team,
        home_score,
        away_score,
        result,
        CASE
            WHEN result > 0 THEN 'win'
            ELSE 'loss'
        END AS win_loss,
        total AS total_score,
        temp,
        wind,
        COUNT(*) OVER (PARTITION BY home_team) AS num_games
    FROM schedules 
    -- Get home-field record starting when stadium renovation completed
    WHERE season >= 2016
        AND temp >= 85
), aggregations AS (
    SELECT
        home_team,
        win_loss,
        ROUND(AVG(temp),2) AS avg_temp,
        ROUND(AVG(result),2) AS avg_result,
        ROUND(AVG(home_score),2) AS avg_home_score,
        COUNT(*) AS num_win_loss,
        ROUND((1.0 * COUNT(*) / num_games) * 100,2) AS win_loss_pct
    FROM home_games
    GROUP BY home_team, win_loss
)
SELECT *
FROM aggregations
WHERE win_loss = 'win'
    AND num_win_loss >= 5
ORDER BY win_loss_pct DESC
"""

pd.read_sql(query, conn).head(15)

| | home_team | win_loss | avg_temp | avg_result | avg_home_score | num_win_loss | win_loss_pct |
|---|---|---|---|---|---|---|---|
| 0 | TB | win | 88.0 | 14.0 | 30.57 | 7 | 58.33 |
| 1 | MIA | win | 86.92 | 9.08 | 26.17 | 12 | 54.55 |


So not much of a home-field advantage after all! To be fair, Miami hasn't been a great team in recent years. The temperature of the opposing team's sideline is not going to make a difference when you don't have good players or coaching. 