# A deep-dive on the effects of head coaching changes in the NBA

The purpose of this project is to investigate the impact of head coach changes on underperforming NBA teams. In the NBA, head coaches are the first to face criticism when a team underperforms. This usually results in a change of head coach and, sometimes, of the entire coaching staff. But how effective is this strategy? This project analyzes the measurable effects of such coaching transitions on team performance in subsequent seasons. My focus is on providing a data-driven exploration of the role head coaches play in the NBA and how these changes may influence a team's future performance.

At the end of the 2022-2023 regular season, the Toronto Raptors placed 9th in the Eastern Conference standings, marking their 2nd missed playoffs in the last 3 seasons after their 2019 championship run. This led the Raptors to part ways with head coach Nick Nurse, as well as the majority of the coaching staff, in the summer of 2023. This season, the Raptors hired Darko Rajakovic, an assistant coach from the Memphis Grizzlies, as their new head coach to implement new systems and new offensive and defensive philosophies, and to facilitate the development of the young Toronto Raptors core.

Despite high anticipation coming out of the offseason, the Toronto Raptors are 2-4 to start the season. This raises the question: what level of impact can we realistically expect from these coaching changes? This scenario provides a real-world backdrop for our investigation into the effects of coaching transitions across the NBA.

We seek to answer several questions through exploratory data analysis (EDA):

- What is the average number of playoff berths clinched by teams that undergo coaching changes?
- How does changing the head coach correlate with the average change in team win percentage in subsequent years?

While considering the nuances that might contribute to team growth, such as roster changes and player development, we will also delve into predictive modeling using neural networks (NN). The NN will help us predict the team's future win percentage and determine which seed they could potentially secure in their respective conference. Furthermore, we aim to predict the winningness of the team in subsequent years based on their regular-season records. These predictions will be categorized into:

- High seed team (1-4)
- Low seed team (5-8)
- Teams out of playoff contention
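The three categories above could be turned into a label for the model with a small helper. The following is only a sketch: the function name `seed_category` and the `playoff_spots` parameter are hypothetical, and it assumes the final conference seed is already known and ignores the play-in tournament.

```python
def seed_category(seed, playoff_spots=8):
    """Map a conference seed (1 = best record) to one of three categories.

    Hypothetical helper for illustration; assumes seeds 1-8 make the playoffs.
    """
    if seed < 1:
        raise ValueError("seed must be a positive integer")
    if seed <= 4:
        return "High seed"
    if seed <= playoff_spots:
        return "Low seed"
    return "Out of playoff contention"


# Label a few example seeds
labels = [seed_category(s) for s in (1, 4, 5, 8, 9, 15)]
print(labels)
```

Keeping the cut-offs in one function makes it easy to change the bucket boundaries later without touching the rest of the pipeline.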

We can also redefine the team's winningness by measuring playoff performance, predicting whether they will:

- Win a championship
- Win the conference
- Advance to the second round
- Reach the first round

Through this project, we aim to provide data-driven insights into the impact of coaching changes on NBA teams and their future performance, shedding light on the strategies employed in the dynamic world of professional basketball.

In [5]:
import pandas as pd
from nba_api.stats.endpoints import commonteamroster
from nba_api.stats.static import teams
from nba_api.stats.endpoints import teamyearbyyearstats
from nba_api.stats.endpoints import playoffpicture

In [4]:
# Get a list of team info
team_info = teams.get_teams()
head_coaches_data = []

# Iterate through teams and seasons
for team in team_info:
    team_id = team['id']
    team_name = team['abbreviation']
    for season in range(2005, 2024):
        # Season label in the 'YYYY-YY' form, e.g. 2005 -> '2005-06'
        seasons = f"{season}-{str(season+1)[-2:]}"
        coach_data = commonteamroster.CommonTeamRoster(team_id=team_id, season=seasons)
        coach_data_df = coach_data.coaches.get_data_frame()
        # Reset on every iteration so a season with no data never inherits the previous coach
        coach_name = 'None'
        if not coach_data_df.empty:
            try:
                coach_name = coach_data_df[coach_data_df['COACH_TYPE'] == 'Head Coach']['COACH_NAME'].values[0]
            except IndexError:
                # Coaching staff was returned, but no row is marked 'Head Coach'
                pass
        head_coaches_data.append({
            'Team ID': team_id,
            'Season': seasons,
            'Team': team_name,
            'Coach': coach_name
        })

head_coaches_df = pd.DataFrame(head_coaches_data)

In [41]:
# Find the seasons where no head coach was returned (the API data is not fully clean)
no_data = head_coaches_df[head_coaches_df['Coach'] == 'None']
no_data

| | Team ID | Season | Team | Coach |
|---|---|---|---|---|
| 66 | 1610612740 | 2014-15 | NOP | |
| 121 | 1610612743 | 2012-13 | DEN | |
| 200 | 1610612747 | 2015-16 | LAL | |
| 349 | 1610612755 | 2012-13 | PHI | |
| 374 | 1610612756 | 2018-19 | PHX | |
| 409 | 1610612758 | 2015-16 | SAC | |
| 446 | 1610612760 | 2014-15 | OKC | |
| 567 | 1610612766 | 2021-22 | CHA | |


In [6]:
# Get a list of team info
team_info = teams.get_teams()
team_record_data_list = []  # List to store individual data frames

# Iterate through teams and seasons
for team in team_info:
    team_id = team['id']
    team_record_data = teamyearbyyearstats.TeamYearByYearStats(team_id=team_id)
    team_record_data_df = team_record_data.team_stats.get_data_frame()
    team_df = pd.DataFrame({
        'Team ID': team_id,
        'Season': team_record_data_df['YEAR'],
        'Team Name': team_record_data_df['TEAM_NAME'],
        'Wins': team_record_data_df['WINS'],
        'Losses': team_record_data_df['LOSSES'],
        'Win PCT': team_record_data_df['WIN_PCT']
    })
    team_record_data_list.append(team_df)

# Concatenate all data frames into the final result, with a fresh 0..n-1 index
team_record_df = pd.concat(team_record_data_list, ignore_index=True)


In [26]:
combined_df = pd.merge(head_coaches_df, team_record_df, on=['Team ID', 'Season'], how='left')
combined_df

| | Season | Team ID | Team | Coach | Team Name | Wins | Losses | Win PCT |
|---|---|---|---|---|---|---|---|---|
| 0 | 2005-06 | 1610612737 | ATL | Mike Woodson | Hawks | 26 | 56 | 0.317 |
| 1 | 2006-07 | 1610612737 | ATL | Mike Woodson | Hawks | 30 | 52 | 0.366 |
| 2 | 2007-08 | 1610612737 | ATL | Mike Woodson | Hawks | 37 | 45 | 0.451 |
| 3 | 2008-09 | 1610612737 | ATL | Mike Woodson | Hawks | 47 | 35 | 0.573 |
| 4 | 2009-10 | 1610612737 | ATL | Mike Woodson | Hawks | 53 | 29 | 0.646 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 535 | 2018-19 | 1610612766 | CHA | James Borrego | Hornets | 39 | 43 | 0.476 |
| 536 | 2019-20 | 1610612766 | CHA | James Borrego | Hornets | 23 | 42 | 0.354 |
| 537 | 2020-21 | 1610612766 | CHA | James Borrego | Hornets | 33 | 39 | 0.458 |
| 538 | 2021-22 | 1610612766 | CHA | No Data | Hornets | 43 | 39 | 0.524 |
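With coach and record in one frame, the second EDA question from the introduction (how a coaching change relates to the change in win percentage) could be approached roughly as follows. This is a minimal sketch on a made-up toy frame that mirrors the `Team`, `Season`, `Coach`, and `Win PCT` columns of `combined_df`. The team codes, coach names, and values are placeholders rather than real results, and `Coach Changed` and `Win PCT Delta` are hypothetical column names.

```python
import pandas as pd

# Toy stand-in for combined_df; values are illustrative, not real results
toy = pd.DataFrame({
    "Team": ["AAA"] * 4 + ["BBB"] * 3,
    "Season": ["2005-06", "2006-07", "2007-08", "2008-09",
               "2005-06", "2006-07", "2007-08"],
    "Coach": ["Coach A", "Coach A", "Coach B", "Coach B",
              "Coach C", "Coach D", "Coach D"],
    "Win PCT": [0.400, 0.350, 0.500, 0.550, 0.600, 0.500, 0.450],
})

# Order each team's seasons chronologically ('YYYY-YY' labels sort correctly as strings)
toy = toy.sort_values(["Team", "Season"]).reset_index(drop=True)
by_team = toy.groupby("Team")

# A change is a season whose coach differs from the same team's previous season;
# the first season of each team has no predecessor and is not counted as a change
prev_coach = by_team["Coach"].shift(1)
toy["Coach Changed"] = prev_coach.notna() & (toy["Coach"] != prev_coach)

# Year-over-year change in win percentage within each team
toy["Win PCT Delta"] = toy["Win PCT"] - by_team["Win PCT"].shift(1)

# Average change in win percentage, split by whether the coach changed
avg_delta = toy.groupby("Coach Changed")["Win PCT Delta"].mean()
print(avg_delta)
```

On the real data, rows where the coach is a placeholder such as `'None'` would be flagged as changes by this comparison, so they would need to be filled in or excluded first.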


In [1]:
import pandas as pd
from nba_api.stats.endpoints import playoffpicture

season_frames = []  # One standings DataFrame per season

# The playoff picture is league-wide, so query once per season rather than once per team
for season in range(2000, 2024):
    playoff_picture = playoffpicture.PlayoffPicture(season_id='2'+str(season))
    playoff_picture_east_df = playoff_picture.east_conf_standings.get_data_frame()
    # playoff_picture_west_df = playoff_picture.west_conf_standings.get_data_frame()

    # Keep one row per team with only the columns of interest
    east_df = playoff_picture_east_df[['TEAM', 'WINS', 'LOSSES', 'PCT', 'CLINCHED_PLAYOFFS']].rename(columns={
        'TEAM': 'Team',
        'WINS': 'East Wins',
        'LOSSES': 'East Losses',
        'PCT': 'East PCT',
        'CLINCHED_PLAYOFFS': 'East Clinched'})
    east_df.insert(0, 'Season', season)
    season_frames.append(east_df)

records = pd.concat(season_frames, ignore_index=True)
records


# # Create an instance of the PlayoffPicture object
# playoff_picture = playoffpicture.PlayoffPicture(season_id='22005')

# # Call the get_data_frame method to retrieve the data frame
# data_frames = playoff_picture.east_conf_standings.get_data_frame()
# data_frames