# NBA Data :: Home vs Away Analysis

## Trevor Rowland, Abhishek Menothu, Johnathen Wigfall, Scott Campbell :: 2-18-2025

This notebook covers data transformation, visualization, and hypothesis testing on NBA game data from 2004 to 2024. The goal is to determine whether a team's results differ meaningfully when it plays at home versus on the road.

## Ideas for Testing

- Distributions, means, and variances of home vs away win percentages
  - Group by home vs away
  - Check the home and away distributions with a Q-Q plot, then with Anderson-Darling (AD) and Shapiro-Wilk (SW) tests
  - If the distributions are normal, compare means with a t-test and variances with an F-test
  - Report results
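
The plan above can be sketched end to end on synthetic data. The two samples below are randomly generated stand-ins, not values from the NBA dataset, and `variance_f_test` is a hypothetical helper: SciPy has no single-call two-sample F-test, so the variance ratio is computed by hand. `stats.probplot` supplies the Q-Q correlation without drawing a plot. Only the Shapiro-Wilk check is shown; `scipy.stats.anderson` could be used at the same step for the Anderson-Darling check.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-ins for win percentages (NOT real NBA data).
home_pct = rng.normal(loc=0.60, scale=0.12, size=200)
away_pct = rng.normal(loc=0.40, scale=0.12, size=200)


def variance_f_test(a, b):
    """Two-sided F-test for equal variances (hypothetical helper)."""
    f_stat = np.var(a, ddof=1) / np.var(b, ddof=1)
    dfn, dfd = len(a) - 1, len(b) - 1
    # Take the smaller tail and double it for a two-sided p-value.
    tail = min(stats.f.cdf(f_stat, dfn, dfd), stats.f.sf(f_stat, dfn, dfd))
    return f_stat, min(1.0, 2 * tail)


# Q-Q check: a correlation r close to 1 suggests an approximately normal sample.
(_, _), (_, _, qq_r_home) = stats.probplot(home_pct, dist="norm")

# Shapiro-Wilk: a small p-value is evidence against normality.
sw_home = stats.shapiro(home_pct)
sw_away = stats.shapiro(away_pct)

# Student's t-test on the means (equal variances assumed by default).
t_stat, t_p = stats.ttest_ind(home_pct, away_pct)
# F-test on the variances.
f_stat, f_p = variance_f_test(home_pct, away_pct)

print(f"Q-Q r (home): {qq_r_home:.3f}")
print(f"Shapiro p-values: home={sw_home.pvalue:.3f}, away={sw_away.pvalue:.3f}")
print(f"t-test: t={t_stat:.2f}, p={t_p:.3g}")
print(f"F-test: F={f_stat:.2f}, p={f_p:.3f}")
```

If the F-test suggests unequal variances, passing `equal_var=False` to `stats.ttest_ind` switches to Welch's t-test.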


## 1. Importing Packages and Data

In [2]:
import numpy as np
import pandas as pd

from scipy import stats
from scipy.stats import shapiro

import statsmodels.api as sm
import matplotlib.pyplot as plt

In [3]:
file_id = '1U2UaHWRSkUXfJBn4kBHPYttd3dvw_CZF'
url = f'https://drive.google.com/uc?id={file_id}'
df = pd.read_csv(url, encoding='utf-8')

In [4]:
df.head(10)

Unnamed: 0,game_id,season,team_id,team_name,tri_code,team_slug,minutes,field_goals_made,field_goals_attempted,field_goals_percentage,...,uncontested_field_goals_percentage,field_goal_percentage,defended_at_rim_field_goals_made,defended_at_rim_field_goals_attempted,defended_at_rim_field_goal_percentage,opponent_points,is_home_team,won_game,is_playoff_game,is_regular_game
0,40400407,2004-05,1610612759,Spurs,SAS,spurs,240:00,29.0,68.0,0.426,...,0.0,0.426,0.0,0.0,0.0,74.0,1,1,1,0
1,40400406,2004-05,1610612759,Spurs,SAS,spurs,240:00,31.0,75.0,0.413,...,0.0,0.413,0.0,0.0,0.0,95.0,1,0,1,0
2,40400405,2004-05,1610612765,Pistons,DET,pistons,265:00,37.0,84.0,0.44,...,0.0,0.44,0.0,0.0,0.0,96.0,1,0,1,0
3,40400404,2004-05,1610612765,Pistons,DET,pistons,240:00,41.0,90.0,0.456,...,0.0,0.456,0.0,0.0,0.0,71.0,1,1,1,0
4,40400403,2004-05,1610612765,Pistons,DET,pistons,240:00,40.0,85.0,0.471,...,0.0,0.471,0.0,0.0,0.0,79.0,1,1,1,0
5,40400402,2004-05,1610612759,Spurs,SAS,spurs,240:00,29.0,62.0,0.468,...,0.0,0.468,0.0,0.0,0.0,76.0,1,1,1,0
6,40400401,2004-05,1610612759,Spurs,SAS,spurs,240:00,34.0,79.0,0.43,...,0.0,0.43,0.0,0.0,0.0,69.0,1,1,1,0
7,40400307,2004-05,1610612748,Heat,MIA,heat,240:00,32.0,69.0,0.464,...,0.0,0.464,0.0,0.0,0.0,88.0,1,0,1,0
8,40400306,2004-05,1610612765,Pistons,DET,pistons,240:00,36.0,86.0,0.419,...,0.0,0.419,0.0,0.0,0.0,66.0,1,1,1,0
9,40400305,2004-05,1610612748,Heat,MIA,heat,240:00,36.0,69.0,0.522,...,0.0,0.522,0.0,0.0,0.0,76.0,1,1,1,0
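
The `Unnamed: 0` column in the preview above usually means a DataFrame index was written to the CSV when the file was saved. A minimal sketch of two ways to handle it follows, using a small in-memory CSV as a stand-in for the real file. The three rows are copied from the preview and trimmed to a few columns; the variable names are illustrative only.

```python
import io

import pandas as pd

# Stand-in for the real file: a blank first header cell, as a saved index produces.
csv_text = (
    ",game_id,team_name,is_home_team,won_game\n"
    "0,40400407,Spurs,1,1\n"
    "1,40400406,Spurs,1,0\n"
    "2,40400405,Pistons,1,0\n"
)

# Default read: the blank header becomes a column named 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))

# Option 1: treat the first column as the index while reading.
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Option 2: drop the column after reading.
dropped = raw.drop(columns=["Unnamed: 0"])

print(raw.columns.tolist())
print(as_index.columns.tolist())
```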


In [7]:
print(df.columns.tolist())

['game_id', 'season', 'team_id', 'team_name', 'tri_code', 'team_slug', 'minutes', 'field_goals_made', 'field_goals_attempted', 'field_goals_percentage', 'three_pointers_made', 'three_pointers_attempted', 'three_pointers_percentage', 'free_throws_made', 'free_throws_attempted', 'free_throws_percentage', 'rebounds_offensive', 'rebounds_defensive', 'rebounds_total', 'steals', 'blocks', 'turnovers', 'fouls_personal', 'points', 'plus_minus_points', 'estimated_offensive_rating', 'offensive_rating', 'estimated_defensive_rating', 'defensive_rating', 'estimated_net_rating', 'net_rating', 'assist_percentage', 'assist_to_turnover', 'assist_ratio', 'offensive_rebound_percentage', 'defensive_rebound_percentage', 'rebound_percentage', 'estimated_team_turnover_percentage', 'turnover_ratio', 'effective_field_goal_percentage', 'true_shooting_percentage', 'usage_percentage', 'estimated_usage_percentage', 'estimated_pace', 'pace', 'pace_per40', 'possessions', 'p_i_e', 'distance', 'rebound_chances_offensi

## 2. Grouping Data by Home and Away Games

We group the data by the `is_home_team` column, which is 1 for the home team's row and 0 for the away team's row.

In [5]:
grouped = df.groupby('is_home_team')
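
As a sketch of what this grouping makes possible, the mean of `won_game` within each group is that group's win rate. The toy frame below is invented for illustration and is not taken from the dataset; it assumes `is_home_team` and `won_game` are 0/1 flags, as the preview suggests.

```python
import pandas as pd

# Invented toy rows: one row per team per game.
toy = pd.DataFrame({
    "is_home_team": [1, 0, 1, 0, 1, 0, 1, 0],
    "won_game":     [1, 0, 1, 0, 0, 1, 1, 0],
})

# The mean of a 0/1 flag is the share of rows where the flag is 1.
win_rate = toy.groupby("is_home_team")["won_game"].mean()
print(win_rate)
```

In this toy data the home rows win 3 of 4 games and the away rows win 1 of 4.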

## 3. Examining Home vs Away Distributions and Testing for Normality

In [8]:
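# Create separate home and away win flags on the DataFrame itself
# (a GroupBy object does not support column assignment)
df["home_win"] = df["won_game"] * df["is_home_team"]
df["away_win"] = df["won_game"] * (1 - df["is_home_team"])

# Compute cumulative home and away wins per team.
# Group by team_id; the dataset has no 'team' column.
# cumsum follows the current row order, so sort first if chronological totals are needed.
df["rolling_home_wins"] = df.groupby("team_id")["home_win"].cumsum()
df["rolling_away_wins"] = df.groupby("team_id")["away_win"].cumsum()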
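
One way to get from per-game flags to the win percentages that the normality tests need is to aggregate by team, season, and venue. The sketch below runs on invented toy rows. It assumes the column names `team_id`, `season`, `is_home_team`, and `won_game` from the column listing; the output names `home_pct` and `away_pct` are made up here.

```python
import pandas as pd

# Invented toy rows: two teams, four games each, one season.
toy = pd.DataFrame({
    "team_id":      [1, 1, 1, 1, 2, 2, 2, 2],
    "season":       ["2004-05"] * 8,
    "is_home_team": [1, 1, 0, 0, 1, 1, 0, 0],
    "won_game":     [1, 1, 1, 0, 1, 0, 0, 0],
})

# Win rate per (team, season, venue), then one column per venue.
win_pct = (
    toy.groupby(["team_id", "season", "is_home_team"])["won_game"]
    .mean()
    .unstack("is_home_team")
    .rename(columns={0: "away_pct", 1: "home_pct"})
)
print(win_pct)
```

Each row of `win_pct` is one team-season, so the `home_pct` and `away_pct` columns can serve as the two samples to compare.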