#### Business Objective: What was the impact of the COVID-19 pandemic on the level of home court advantage?
- Question 1: How has the difference in win-loss ratio between home and away teams changed from the 2019-20 season to the 2022-23 season? (wl_home and wl_away columns in game table)
- Question 2: How has the difference in average points scored between home and away teams changed from the 2019-20 season to the 2022-23 season? (pts_home and pts_away columns in game table)
- Question 3: How has the difference in offensive and defensive rebounds between home and away teams changed from the 2019-20 season to the 2022-23 season? (oreb_home, oreb_away, dreb_home, and dreb_away columns in game table)
- Question 4: How does the average three-point field goal percentage of home teams compare to that of away teams from the 2019-20 season to the 2022-23 season? (fg3_pct_home and fg3_pct_away columns in game table)
- Question 5: How does the free throw percentage of home teams compare to that of away teams from the 2019-20 season to the 2022-23 season? (ft_pct_home and ft_pct_away columns in game table)

In [None]:
import pandas as pd
import sqlite3
import matplotlib.pyplot as plt  # Used by the plotting cells in every question

# Connect to the SQLite database
con = sqlite3.connect("data/nba.sqlite")

# Define relevant season IDs for analysis
seasons = ['22019', '22020', '22021', '22022']
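
The queries in this notebook build the `IN (...)` list by joining the season IDs directly into the SQL string. An alternative is to pass them as bound parameters. The block below is a minimal sketch of that approach, not the notebook's own method: it uses an in-memory table with made-up rows as a stand-in for `data/nba.sqlite`, and the names `demo_con`, `placeholders`, and `demo` are assumptions introduced for illustration.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for data/nba.sqlite; the rows are made up for illustration.
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE game (season_id TEXT, team_name_home TEXT, pts_home REAL)"
)
demo_con.executemany(
    "INSERT INTO game VALUES (?, ?, ?)",
    [
        ("22019", "Team A", 110.0),
        ("22020", "Team A", 104.0),
        ("22018", "Team A", 99.0),  # Outside the analysis window
    ],
)

seasons = ["22019", "22020", "22021", "22022"]

# One "?" per season; the values are sent separately from the SQL text.
placeholders = ", ".join("?" for _ in seasons)
query = (
    "SELECT team_name_home AS team, pts_home, season_id "
    f"FROM game WHERE season_id IN ({placeholders})"
)

demo = pd.read_sql_query(query, demo_con, params=seasons)
print(demo)
```

With bound parameters the season IDs are compared as the strings they are, so the result does not depend on how SQLite coerces an unquoted number against the column.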

# Question 1

## Introduction
#### Objective of the script: calculating the win-loss ratio difference between home and away games by season

In [None]:
print("Question 1: Win-Loss Ratio Difference (Home vs. Away) by Season")

## Rising Action
#### Gather data for home and away win-loss ratios.

In [None]:
# Rising Action: Fetching and preparing the home win-loss data
wl_home = pd.read_sql_query(
    f"SELECT team_name_home AS team, wl_home AS wl, season_id FROM game WHERE season_id IN ({', '.join(seasons)})", 
    con
)
wl_home['wl'] = wl_home['wl'].map({'W': 1, 'L': 0})  # Convert 'W' and 'L' to numerical values
wl_home_grouped = wl_home.groupby(['team', 'season_id'])['wl'].mean().reset_index(name='home_win_ratio')

# Fetching and preparing the away win-loss data
wl_away = pd.read_sql_query(
    f"SELECT team_name_away AS team, wl_away AS wl, season_id FROM game WHERE season_id IN ({', '.join(seasons)})", 
    con
)
wl_away['wl'] = wl_away['wl'].map({'W': 1, 'L': 0})  # Convert 'W' and 'L' to numerical values
wl_away_grouped = wl_away.groupby(['team', 'season_id'])['wl'].mean().reset_index(name='away_win_ratio')
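
To see why the mean of the mapped column works as a win measure, here is a small worked example on made-up results for two hypothetical teams. Note that the value is the share of games won (wins divided by games played), which is what this notebook refers to as the win-loss ratio.

```python
import pandas as pd

# Made-up results for two hypothetical teams, for illustration only.
toy = pd.DataFrame({
    "team": ["Team A", "Team A", "Team A", "Team A", "Team B", "Team B"],
    "season_id": ["22019"] * 6,
    "wl": ["W", "W", "L", "W", "L", "L"],
})

# Map outcomes to 1/0 so that the mean equals the fraction of games won.
toy["wl"] = toy["wl"].map({"W": 1, "L": 0})
toy_ratio = (
    toy.groupby(["team", "season_id"])["wl"]
    .mean()
    .reset_index(name="home_win_ratio")
)
print(toy_ratio)
```

Team A wins three of four games and Team B wins none, so the grouped means are 0.75 and 0.0.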


## Climax
#### Combine the home and away data and compute the win-loss ratio difference.

In [None]:
# Climax: Combining data and calculating win-loss ratio difference
win_loss_diff = pd.merge(wl_home_grouped, wl_away_grouped, on=['team', 'season_id'])
win_loss_diff['win_loss_diff'] = win_loss_diff['home_win_ratio'] - win_loss_diff['away_win_ratio']
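
The table above holds one row per team and season, while the business question asks how home court advantage changed from season to season. One possible next step is to summarize the per-team differences by season. The following is a sketch under that assumption, using made-up values shaped like `win_loss_diff`; the names `toy_diff` and `season_summary` are hypothetical.

```python
import pandas as pd

# Made-up per-team differences shaped like win_loss_diff; values are illustrative only.
toy_diff = pd.DataFrame({
    "team": ["Team A", "Team B", "Team A", "Team B"],
    "season_id": ["22019", "22019", "22020", "22020"],
    "win_loss_diff": [0.20, 0.10, 0.05, 0.01],
})

# Collapse the team dimension to get one summary row per season.
season_summary = (
    toy_diff.groupby("season_id")["win_loss_diff"]
    .agg(["mean", "median"])
    .reset_index()
)
print(season_summary)
```

A season-level mean that shrinks toward zero would suggest a weaker home advantage in that season, though an unweighted mean across teams treats every team equally regardless of games played.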

## Falling Action
#### Inspect a sample of the results to check the calculations, then plot the differences

In [None]:
print("Calculated Win-Loss Ratio Differences (Sample):")
print(win_loss_diff.head(), "\n")

# Plot the win-loss ratio difference for each team, one line per season
plt.figure(figsize=(12, 6))
for season in win_loss_diff['season_id'].unique():
    data = win_loss_diff[win_loss_diff['season_id'] == season]
    plt.plot(data['team'], data['win_loss_diff'], label=f'Season {season}', marker='o')
plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
plt.xticks(rotation=90)
plt.title('Win-Loss Ratio Difference (Home vs. Away) by Team and Season')
plt.xlabel('Team')
plt.ylabel('Win-Loss Ratio Difference')
plt.legend(title='Season')
plt.tight_layout()
plt.show()

## Conclusion
#### Reviewed the code

# Question 2

## Introduction
#### Comparing the average points scored by teams at home versus away, grouped by season

In [None]:
print("\nQuestion 2: Average Points Difference (Home vs. Away) by Season")

## Rising Action
#### Fetch data for average points scored at home and away

In [None]:
# Rising Action: Fetching and calculating average points scored at home
pts_home = pd.read_sql_query(
    f"SELECT team_name_home AS team, AVG(pts_home) AS avg_pts_home, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_home, season_id", 
    con
)

# Fetching and calculating average points scored away
pts_away = pd.read_sql_query(
    f"SELECT team_name_away AS team, AVG(pts_away) AS avg_pts_away, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_away, season_id", 
    con
)

## Climax
#### Combine the home and away data to compute the average points difference

In [None]:
# Climax: Merging the data and calculating the points difference
points_diff = pd.merge(pts_home, pts_away, on=['team', 'season_id'])
points_diff['points_diff'] = points_diff['avg_pts_home'] - points_diff['avg_pts_away']
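
`pd.merge` defaults to an inner join, so a team and season that appears on only one side is dropped without warning. One way to check for that is an outer merge with `indicator=True`. The block below is a small sketch with made-up frames; `home`, `away`, `check`, and `unmatched` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Made-up inputs: Team B has no away row and Team C has no home row.
home = pd.DataFrame({
    "team": ["Team A", "Team B"],
    "season_id": ["22019", "22019"],
    "avg_pts_home": [112.0, 108.5],
})
away = pd.DataFrame({
    "team": ["Team A", "Team C"],
    "season_id": ["22019", "22019"],
    "avg_pts_away": [109.0, 105.0],
})

# The _merge column records whether each key came from one side or both.
check = pd.merge(home, away, on=["team", "season_id"], how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["team", "season_id", "_merge"]])
```

An empty `unmatched` frame would indicate that the inner merge kept every team and season.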

## Falling Action
#### Check results to ensure calculations are accurate

In [None]:
# Falling Action: Inspecting the calculated points difference
print("Calculated Average Points Differences (Sample):")
print(points_diff.head(), "\n")


plt.figure(figsize=(12, 6))
for season in points_diff['season_id'].unique():
    data = points_diff[points_diff['season_id'] == season]
    plt.plot(data['team'], data['points_diff'], label=f'Season {season}', marker='o')
plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
plt.xticks(rotation=90)
plt.title('Average Points Difference (Home vs. Away) by Team and Season')
plt.xlabel('Team')
plt.ylabel('Points Difference')
plt.legend(title='Season')
plt.tight_layout()
plt.show()

## Conclusion
#### Reviewed the code

# Question 3

## Introduction
#### Calculating the difference in offensive and defensive rebounds between home and away games, grouped by season

In [None]:
# Introduction
print("\nQuestion 3: Rebounds Difference (Offensive and Defensive) by Season")

## Rising Action
#### Calculate average rebounds (offensive and defensive) at home and away

In [None]:
# Rising Action: Fetching and calculating average offensive and defensive rebounds at home
reb_home = pd.read_sql_query(
    f"SELECT team_name_home AS team, AVG(oreb_home) AS avg_oreb_home, AVG(dreb_home) AS avg_dreb_home, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_home, season_id", 
    con
)

# Fetching and calculating average offensive and defensive rebounds away
reb_away = pd.read_sql_query(
    f"SELECT team_name_away AS team, AVG(oreb_away) AS avg_oreb_away, AVG(dreb_away) AS avg_dreb_away, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_away, season_id", 
    con
)


## Climax
#### Combine the home and away data to compute the rebound differences.

In [None]:
# Climax: Merging data and calculating the offensive and defensive rebound differences
rebounds_diff = pd.merge(reb_home, reb_away, on=['team', 'season_id'])
rebounds_diff['offensive_rebound_diff'] = rebounds_diff['avg_oreb_home'] - rebounds_diff['avg_oreb_away']
rebounds_diff['defensive_rebound_diff'] = rebounds_diff['avg_dreb_home'] - rebounds_diff['avg_dreb_away']
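
The fetch, merge, and subtract steps in Questions 2 through 5 share one shape and differ only in the statistic's column prefix. They could be factored into a single function. The following is a minimal sketch, assuming a hypothetical helper named `home_away_diff` and an in-memory table with made-up numbers in place of the real database.

```python
import sqlite3

import pandas as pd


def home_away_diff(con, stat, seasons):
    """Return per-team, per-season means of `stat` at home and away, plus their difference.

    `stat` is interpolated into the SQL as a column-name prefix, so it should come
    from a trusted, hard-coded list rather than from user input.
    """
    marks = ", ".join("?" for _ in seasons)
    frames = {}
    for side in ("home", "away"):
        sql = (
            f"SELECT team_name_{side} AS team, season_id, "
            f"AVG({stat}_{side}) AS avg_{stat}_{side} "
            f"FROM game WHERE season_id IN ({marks}) "
            f"GROUP BY team_name_{side}, season_id"
        )
        frames[side] = pd.read_sql_query(sql, con, params=list(seasons))
    out = pd.merge(frames["home"], frames["away"], on=["team", "season_id"])
    out[f"{stat}_diff"] = out[f"avg_{stat}_home"] - out[f"avg_{stat}_away"]
    return out


# Tiny in-memory table with made-up numbers, standing in for the real database.
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE game (season_id TEXT, team_name_home TEXT, team_name_away TEXT, "
    "oreb_home REAL, oreb_away REAL)"
)
demo_con.executemany(
    "INSERT INTO game VALUES (?, ?, ?, ?, ?)",
    [
        ("22019", "Team A", "Team B", 12.0, 9.0),
        ("22019", "Team B", "Team A", 10.0, 8.0),
    ],
)

oreb = home_away_diff(demo_con, "oreb", ["22019"])
print(oreb)
```

Called with `"dreb"`, `"pts"`, `"fg3_pct"`, or `"ft_pct"` against the real connection, the same function would produce frames with the column naming used elsewhere in this notebook.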


## Falling Action
#### Verify the calculated rebound differences

In [None]:
plt.figure(figsize=(12, 6))
for season in rebounds_diff['season_id'].unique():
    data = rebounds_diff[rebounds_diff['season_id'] == season]
    plt.bar(data['team'], data['offensive_rebound_diff'], label=f'Season {season} (Offensive)', alpha=0.6)
plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
plt.xticks(rotation=90)
plt.title('Offensive Rebounds Difference (Home vs. Away) by Team and Season')
plt.xlabel('Team')
plt.ylabel('Offensive Rebound Difference')
plt.legend(title='Season')
plt.tight_layout()
plt.show()

plt.figure(figsize=(12, 6))
for season in rebounds_diff['season_id'].unique():
    data = rebounds_diff[rebounds_diff['season_id'] == season]
    plt.bar(data['team'], data['defensive_rebound_diff'], label=f'Season {season} (Defensive)', alpha=0.6)
plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
plt.xticks(rotation=90)
plt.title('Defensive Rebounds Difference (Home vs. Away) by Team and Season')
plt.xlabel('Team')
plt.ylabel('Defensive Rebound Difference')
plt.legend(title='Season')
plt.tight_layout()
plt.show()

## Conclusion
#### Reviewed the code

# Question 4

## Introduction
#### Determining the difference in three-point field goal percentages between home and away games by season.

In [None]:
# Introduction
print("\nQuestion 4: Three-Point FG% Difference (Home vs. Away) by Season")

## Rising Action
#### Find the average three-point field goal percentage at home and away

In [None]:
# Rising Action: Fetching and calculating the average three-point field goal percentage at home
fg3_home = pd.read_sql_query(
    f"SELECT team_name_home AS team, AVG(fg3_pct_home) AS avg_fg3_pct_home, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_home, season_id", 
    con
)

# Fetching and calculating the average three-point field goal percentage away
fg3_away = pd.read_sql_query(
    f"SELECT team_name_away AS team, AVG(fg3_pct_away) AS avg_fg3_pct_away, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_away, season_id", 
    con
)


## Climax
#### Combine the home and away data to find the difference in three-point field goal percentages.

In [None]:
# Climax: Merging data and calculating the three-point FG% difference
fg3_diff = pd.merge(fg3_home, fg3_away, on=['team', 'season_id'])
fg3_diff['fg3_pct_diff'] = fg3_diff['avg_fg3_pct_home'] - fg3_diff['avg_fg3_pct_away']
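
Averaging a per-game percentage gives every game equal weight, no matter how many shots were attempted, so it can differ from the percentage computed over all attempts pooled together. The short arithmetic sketch below illustrates the gap on two made-up games. The column names `made` and `attempted` are hypothetical and are not confirmed to exist in the game table.

```python
import pandas as pd

# Two made-up games for one hypothetical team; `made` and `attempted` are
# illustrative column names, not columns confirmed to exist in the game table.
games = pd.DataFrame({"made": [2, 15], "attempted": [10, 30]})
games["pct"] = games["made"] / games["attempted"]

# Mean of per-game percentages: the quantity an AVG over a percentage column yields.
mean_of_game_pcts = games["pct"].mean()

# Pooled percentage: total makes divided by total attempts.
pooled_pct = games["made"].sum() / games["attempted"].sum()

print(round(mean_of_game_pcts, 3), round(pooled_pct, 3))
```

Either measure can be reasonable; the point is only that the figures reported here are means of per-game percentages.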

## Falling Action
#### Verify the differences.

In [None]:
# Falling Action: Inspecting the calculated three-point FG% differences
print("Calculated Three-Point FG% Differences (Sample):")
print(fg3_diff.head(), "\n")  # Display the first few rows of the result

plt.figure(figsize=(12, 6))
for season in fg3_diff['season_id'].unique():
    data = fg3_diff[fg3_diff['season_id'] == season]
    plt.scatter(data['team'], data['fg3_pct_diff'], label=f'Season {season}', alpha=0.7)
plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
plt.xticks(rotation=90)
plt.title('Three-Point FG% Difference (Home vs. Away) by Team and Season')
plt.xlabel('Team')
plt.ylabel('Three-Point FG% Difference')
plt.legend(title='Season')
plt.tight_layout()
plt.show()

## Conclusion
#### Reviewed the code

# Question 5

## Introduction
#### Comparing the free throw shooting percentages between home and away games by season.

In [None]:
# Introduction
print("\nQuestion 5: Free Throw Percentage Difference (Home vs. Away) by Season")

## Rising Action
#### Calculate the average free throw percentage at home and away

In [None]:
# Rising Action: Fetching and calculating the average free throw percentage at home
ft_home = pd.read_sql_query(
    f"SELECT team_name_home AS team, AVG(ft_pct_home) AS avg_ft_pct_home, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_home, season_id", 
    con
)

# Fetching and calculating the average free throw percentage away
ft_away = pd.read_sql_query(
    f"SELECT team_name_away AS team, AVG(ft_pct_away) AS avg_ft_pct_away, season_id FROM game WHERE season_id IN ({', '.join(seasons)}) GROUP BY team_name_away, season_id", 
    con
)

## Climax
#### Combine the home and away data to calculate the free throw percentage difference.

In [None]:
# Climax: Merging data and calculating the free throw percentage difference
ft_diff = pd.merge(ft_home, ft_away, on=['team', 'season_id'])
ft_diff['ft_pct_diff'] = ft_diff['avg_ft_pct_home'] - ft_diff['avg_ft_pct_away']

## Falling Action
#### Verify the calculated differences

In [None]:
# Falling Action: Inspecting the calculated free throw percentage differences
print("Calculated Free Throw Percentage Differences (Sample):")
print(ft_diff.head(), "\n")  # Display the first few rows of the result

# Pivot to a team-by-season grid so the differences can be drawn as a heatmap
ft_pivot = ft_diff.pivot(index='team', columns='season_id', values='ft_pct_diff')

plt.figure(figsize=(10, 8))
plt.imshow(ft_pivot, cmap='coolwarm', aspect='auto')
plt.colorbar(label='Free Throw % Difference')
plt.xticks(range(len(ft_pivot.columns)), ft_pivot.columns, rotation=45)
plt.yticks(range(len(ft_pivot.index)), ft_pivot.index)
plt.title('Free Throw Percentage Difference (Home vs. Away) by Team and Season')
plt.tight_layout()
plt.show()

## Conclusion
#### Reviewed the code