# Exploring the Volume and Efficiency of Different Types of Offensive Possessions in the NBA

This notebook is an exploratory data analysis of NBA offense, starting with the current 2022-2023 season.

Note: If the graphs aren't displaying, open the notebook file and make sure it is trusted so that the interactive plots will render.

The NBA API breaks down offensive possessions into several types:

* Cut
* Handoff
* Isolation
* Miscellaneous
* OffScreen
* Postup
* PRBallHandler
* PRRollman
* OffRebound
* Spotup
* Transition

A good starting point for measuring offensive efficiency is points per possession (PPP). To put our analysis into context, let's start by calculating the average points per possession across all players and play types.

## Scraping the NBA Stats API

The NBA SynergyPlayTypes endpoint doesn't let us request all players and play types at once, so instead we'll make one API call per play type and concatenate the results into a single dataset.

In [215]:
from nba_api.stats import endpoints
import pandas as pd
import time
import numpy as np

# loop over the different play types
play_types_list = ['Cut','Handoff','Isolation','Misc','OffScreen','Postup','PRBallHandler','PRRollman','OffRebound','Spotup','Transition']
df = pd.DataFrame()

for play_type in play_types_list:

    # The NBA API often times out on calls, using a while loop to simply automate retries of our api call
    while True:
        try:
            response = endpoints.SynergyPlayTypes(play_type_nullable=play_type, player_or_team_abbreviation='P', type_grouping_nullable='Offensive').get_data_frames()[0]
            print(f'{play_type} called successfully.')
            break # quit the loop if successful
        except Exception: # a bare except would also swallow KeyboardInterrupt
            print(f'Error with {play_type} call.')
            time.sleep(5)

    # combine each call into one dataframe
    df = pd.concat([df,response])
    time.sleep(5)

# save the combined data so later runs can read it back from disk
df.to_csv('synergy_all_offensive_possessions_by_player.csv')

Cut called successfully.
Handoff called successfully.
Isolation called successfully.
Misc called successfully.
OffScreen called successfully.
Postup called successfully.
PRBallHandler called successfully.
PRRollman called successfully.
OffRebound called successfully.
Spotup called successfully.
Transition called successfully.
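
The retry loop above repeats forever if the API never responds. A minimal sketch of a bounded variant follows. The helper name `call_with_retries` and its parameters are hypothetical (not part of `nba_api`), and `fetch` stands in for any zero-argument function that performs one API call.

```python
import time


def call_with_retries(fetch, max_attempts=5, wait_seconds=5.0):
    """Call fetch() until it succeeds or max_attempts is reached.

    Hypothetical helper for illustration; fetch is any zero-argument callable.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as error:  # narrow this if the failure types are known
            last_error = error
            print(f'Attempt {attempt} failed: {error!r}')
            if attempt < max_attempts:
                time.sleep(wait_seconds)  # pause before the next attempt
    # every attempt failed, so surface the last error instead of looping forever
    raise RuntimeError(f'Gave up after {max_attempts} attempts') from last_error
```

In the loop above, the call for each play type could be wrapped in a small function or lambda and passed in as `fetch`, so a persistent outage stops the notebook with an error rather than hanging.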


Let's save this as a csv so that in the future we can just read from the csv to save some time for analysis. We can always rerun the previous cell when we want to update our data.

In [216]:
df = pd.read_csv('synergy_all_offensive_possessions_by_player.csv')
print('Read file from csv.')

Read file from csv.
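
The "read from the csv if we already have it" pattern can also be written as one function. This is a sketch under assumptions: `load_or_fetch` and `fetch` are hypothetical names, and `fetch` is any zero-argument function that returns the scraped DataFrame.

```python
from pathlib import Path

import pandas as pd


def load_or_fetch(csv_path, fetch):
    """Read csv_path if it exists; otherwise call fetch() and cache the result.

    Hypothetical helper for illustration; fetch must return a pandas DataFrame.
    """
    path = Path(csv_path)
    if path.exists():
        return pd.read_csv(path)
    data = fetch()
    # index=False keeps the row index out of the file, so reading it back
    # does not add an extra unnamed column
    data.to_csv(path, index=False)
    return data
```

Deleting the csv file then forces a fresh scrape on the next run.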


## Calculating Average Points Per Possession for the Entire League

The complete csv contains many extraneous columns, so let's select just the few that we are interested in.

* PLAYER_NAME and TEAM_NAME will identify the players.
* PLAY_TYPE describes the type of offensive possession.
* PPP gives us the average number of points scored per possession by that player for a given play type.
* POSS is the number of possessions that each player uses for a given play type.

In [217]:
df = df[['PLAYER_NAME','TEAM_NAME','PLAY_TYPE','PPP','POSS']]

# sort by player name then play type
df = df.sort_values(['PLAYER_NAME','PLAY_TYPE'])

Now, by averaging each player's PPP weighted by their number of possessions, we get a league average PPP of 1.019. That should give us some useful context for our play type analysis.

In [218]:
league_average_ppp = np.average(df['PPP'], weights=df['POSS'])
league_average_ppp

1.0194306169357525
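
To see why the weights matter, here is a small worked example. The numbers are made up for illustration and are not taken from the dataset: one low-volume row with a high PPP and one high-volume row with a lower PPP.

```python
import numpy as np

# Hypothetical rows, not from the Synergy data
ppp = np.array([1.50, 0.90])   # points per possession for each row
poss = np.array([10, 490])     # possessions behind each PPP value

simple_mean = ppp.mean()                       # treats both rows equally
weighted_mean = np.average(ppp, weights=poss)  # each possession counts once

# The weighted mean equals total points divided by total possessions
total_points = (ppp * poss).sum()
points_per_possession = total_points / poss.sum()

print(f'{simple_mean:.3f}')            # 1.200
print(f'{weighted_mean:.3f}')          # 0.912
print(f'{points_per_possession:.3f}')  # 0.912
```

The unweighted mean lets the 10-possession row pull the average up as much as the 490-possession row pulls it down, which is why the notebook weights by `POSS`.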

## Calculating PPP and Possession Frequency per Play Type

Next, let's calculate PPP for each play type to see which types of plays tend to be more and less efficient.

In [219]:
# create new dataframe for this analysis
df_play_type_PPP = df.sort_values('PLAY_TYPE')

# calculate PPP using grouped weighted average
df_play_type_PPP_weights = df_play_type_PPP.groupby('PLAY_TYPE').apply(lambda x: np.average(x['PPP'], weights=x['POSS'])).reset_index(name='PPP')

# calculate sum of possessions
df_play_type_PPP_possessions = df_play_type_PPP.groupby('PLAY_TYPE')['POSS'].sum().reset_index()
df_play_type_PPP = pd.merge(df_play_type_PPP_weights,df_play_type_PPP_possessions)
df_play_type_PPP.sort_values('PPP')

Unnamed: 0,PLAY_TYPE,PPP,POSS
3,Misc,0.542987,10269
6,PRBallHandler,0.90687,31514
2,Isolation,0.947907,12065
5,OffScreen,0.951863,6508
1,Handoff,0.960385,8298
8,Postup,0.978828,7238
9,Spotup,1.039979,43021
10,Transition,1.134757,30843
4,OffRebound,1.148954,9044
7,PRRollMan,1.176522,9234
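
The grouped weighted average can also be computed without `groupby.apply`, by summing points and possessions per group and dividing. The sketch below uses a toy DataFrame with made-up values, and `POINTS` is a helper column introduced here for illustration.

```python
import pandas as pd

# Toy data with made-up values, shaped like the player-level table
toy = pd.DataFrame({
    'PLAY_TYPE': ['Cut', 'Cut', 'Spotup', 'Spotup'],
    'PPP': [1.4, 1.2, 1.1, 0.9],
    'POSS': [100, 300, 200, 200],
})

# Reconstruct points per row, then sum points and possessions per play type
per_type = (
    toy.assign(POINTS=toy['PPP'] * toy['POSS'])
       .groupby('PLAY_TYPE', as_index=False)[['POINTS', 'POSS']]
       .sum()
)

# Weighted PPP is total points over total possessions within each group
per_type['PPP'] = per_type['POINTS'] / per_type['POSS']
print(per_type[['PLAY_TYPE', 'PPP', 'POSS']])
```

This yields the weighted PPP and the possession totals in one pass, so no separate merge is needed.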


Now that we have volume and efficiency data for each play type, let's graph our results.

In [224]:
import plotly.express as px
fig = px.scatter(df_play_type_PPP, x='POSS', y='PPP', text='PLAY_TYPE', width=1000, height=700)
fig.update_layout(title_text='Frequency and Efficiency of NBA Offensive Possessions by Shot Type (2022-2023)', title_x=0.5)
fig.update_traces(textposition='bottom center')
fig.add_hline(y=league_average_ppp, line_dash='dash', line_width = 0.5)
fig.update_xaxes(title_text='Number of Possessions')
fig.update_yaxes(title_text='Points per Possession (PPP)')
fig.show()

First, let's start by taking a look at efficiency as measured by PPP.

* Cutting to the basket is by far the most efficient type of possession at 1.31 PPP.
* Misc possessions are extremely inefficient at 0.54 PPP.
* The rest of the play types are more tightly clustered between 0.8 and 1.2 PPP.

Next, let's look at the number of possessions utilized for each shot type.
* Spotting up is by far the most common type of possession, with over 43k occurrences in the 2022-2023 season.
* Transition and Pick and Roll Ball Handler possessions are also quite frequent at around 31k occurrences.
* All other possession types occur significantly less frequently, clustered between 6k and 12k occurrences.

A major trend in modern NBA offense has been the rise of faster-paced play that attacks transition defenses before they are set. The data here is consistent with that trend: transition possessions are both frequent and efficient.
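
One way to put the volume and efficiency observations on a common scale is to express volume as a share of all possessions and efficiency as a difference from the league average. The sketch below uses made-up play types and values, and the column names `POSS_SHARE` and `PPP_VS_LEAGUE` are assumptions introduced for illustration.

```python
import pandas as pd

# Made-up summary table, shaped like the per-play-type results
summary = pd.DataFrame({
    'PLAY_TYPE': ['A', 'B', 'C'],
    'PPP': [1.2, 1.0, 0.8],
    'POSS': [100, 300, 100],
})

# League average PPP: total points over total possessions
league_ppp = (summary['PPP'] * summary['POSS']).sum() / summary['POSS'].sum()

# Share of all possessions used by each play type
summary['POSS_SHARE'] = summary['POSS'] / summary['POSS'].sum()

# Positive values are more efficient than the league average
summary['PPP_VS_LEAGUE'] = summary['PPP'] - league_ppp

print(summary.sort_values('PPP_VS_LEAGUE', ascending=False))
```

Applied to the real table, a play type with both a large share and a positive difference would be one that is both frequent and efficient.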

## PPP and Possession Frequency at the Team Level

Now that we have looked at offensive possessions by play type at the league level, let's move on to analyzing them at the team level.

In [221]:
# create new dataframe for this analysis
df_team = df.sort_values('PLAY_TYPE')

# group data by team and play type, then calculate PPP with weighted average
df_team_PPP = df_team.groupby(['TEAM_NAME','PLAY_TYPE']).apply(lambda x: np.average(x['PPP'], weights=x['POSS'])).reset_index(name='PPP')

# calculate sum of possessions
df_team_possessions = df_team.groupby(['TEAM_NAME','PLAY_TYPE'])['POSS'].sum().reset_index()

df_team = pd.merge(df_team_PPP,df_team_possessions)
df_team.sort_values(['TEAM_NAME','PLAY_TYPE'])

Unnamed: 0,TEAM_NAME,PLAY_TYPE,PPP,POSS
0,Atlanta Hawks,Cut,1.388962,342
1,Atlanta Hawks,Handoff,1.019487,154
2,Atlanta Hawks,Isolation,0.977146,481
3,Atlanta Hawks,Misc,0.613968,285
4,Atlanta Hawks,OffRebound,1.114693,322
...,...,...,...,...
325,Washington Wizards,PRBallHandler,0.897693,919
326,Washington Wizards,PRRollMan,1.197404,324
327,Washington Wizards,Postup,0.996683,293
328,Washington Wizards,Spotup,1.035888,1423


In [222]:
fig = px.scatter(df_team, x='POSS', y='PPP', color='PLAY_TYPE', hover_name='TEAM_NAME')
fig.update_layout(title_text='Frequency and Efficiency of NBA Teams by Shot Type (2022-2023)', title_x=0.5, width=1000, height=700)
fig.add_hline(y=league_average_ppp, line_dash='dash', line_width = 0.5)
fig.update_xaxes(title_text='Number of Possessions')
fig.update_yaxes(title_text='Points per Possession (PPP)')
fig.show()

The graph above plots a point for each NBA team and play type. From the big picture perspective, we can see that the distribution of shot types varies widely from team to team. Let's split this into separate graphs for each play type for a clearer picture.

In [223]:
fig = px.scatter(df_team, x='POSS', y='PPP', facet_col='PLAY_TYPE', facet_col_wrap=3, color='TEAM_NAME', width=1100, height=1000)
fig.update_layout(showlegend=False)
fig.update_layout(title_text='Frequency and Efficiency of NBA Teams by Shot Type (2022-2023)', title_x=0.5)
fig.add_hline(y=league_average_ppp, line_dash='dash', line_width = 0.5)
fig.update_xaxes(title_text='Possessions')
fig.update_yaxes(title_text='PPP')
fig.show()

#### Cut
* The Golden State Warriors are an excellent cutting team. Generating effective cutting possessions requires an effective motion offense with the presence of multiple skilled ball handlers and players who excel at off ball movement. The Warriors are a veteran team with a core that has played together for many years. In contrast, the Houston Rockets are a rebuilding team with talented but inexperienced players and generate the least efficient looks from cutting possessions.
#### Handoff
* The Sacramento Kings do an excellent job of generating a large volume of efficient handoff possessions.
#### Isolation
* The San Antonio Spurs are very poor in isolation, but they also take the fewest isolation attempts, which is better than taking a large volume of an inefficient shot type. In contrast, the Dallas Mavericks generate a large volume of isolation looks with just above league average efficiency, driven in large part by their franchise star Luka Doncic's ability to break down defenses on his own.
#### Off Rebounds
* Off Rebound possessions are incredibly efficient, with only the Oklahoma City Thunder, Golden State Warriors, and LA Clippers performing below league average. The Warriors and Clippers both run offenses that do not rely on much activity from the center. The Thunder, however, are missing the presence of 7'1" rookie Chet Holmgren. It will be interesting to see whether the efficiency and frequency of their Off Rebound possessions improve once he returns to play.
#### Off Screen
* The Golden State Warriors generate by far the most possessions coming off of screens, but only do so at a league average level. The Houston Rockets and Philadelphia 76ers both generate very inefficient looks off of screens but do so infrequently.
#### Pick and Roll Ball Handler
* The Pick and Roll Ball Handler category is almost entirely below league average efficiency, with only the Dallas Mavericks and Portland Trail Blazers coming particularly close. Luka Doncic and Damian Lillard are both excellent ball handling threats. Interestingly, while these types of possessions are inefficient, almost all teams run them with high frequency. The Pick and Roll is a simple, fundamental basketball play that can be executed effectively even with limited time, whereas a good spotup or off screen look, for example, might take more time to emerge in an offensive possession. It might be interesting to further analyze possessions by the time left on the shot clock when the shot attempt occurs, to see which types of shots tend to be bail-out shots taken under the pressure of limited time left in the possession.
#### Pick and Roll Roller
* In contrast to the PR Ball Handler, the PR Roll Man is less frequent but much more efficient, with every team performing at above-average efficiency. The Denver Nuggets, Dallas Mavericks, and Los Angeles Lakers do the best job of utilizing these possessions.
#### Postup
* The Postup is run effectively by the Denver Nuggets, powered by MVP center Nikola Jokic's impressive individual scoring ability combined with his excellent passing.
#### Spot Up
* The Spot Up is most effectively used by the Boston Celtics, who generate a large number of highly efficient shots for their team.
#### Transition
* Finally, the Transition play is both a frequently used and incredibly efficient shot type for every team. A significant trend in the league has been the increased desire for teams to generate efficient looks in transition to take advantage of defenses that have not yet had time to get into position. The data shows that this trend is occurring for a reason.

## Future Analysis Ideas

* looking at changes in shot type frequency and efficiency over time
* comparing deviation from league average frequency and efficiency at the team level per shot type to generate uniqueness scores that describe which NBA offenses are most similar to average and which offenses deviate most from average
* regression analysis of how player variables affect play type effectiveness, for example how height affects Off Rebound possessions
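
As a starting point for the uniqueness score idea, one possible approach is to standardize each team's PPP and POSS within each play type and then average the absolute z-scores per team. This is only a sketch under assumptions: the teams and values below are made up, the column names `PPP_Z`, `POSS_Z`, and `DEVIATION` are hypothetical, and the equal weighting of frequency and efficiency is an arbitrary choice.

```python
import pandas as pd

# Made-up team-level data, shaped like df_team
team = pd.DataFrame({
    'TEAM_NAME': ['Team A', 'Team A', 'Team B', 'Team B', 'Team C', 'Team C'],
    'PLAY_TYPE': ['Cut', 'Spotup', 'Cut', 'Spotup', 'Cut', 'Spotup'],
    'PPP': [1.30, 1.05, 1.32, 1.04, 1.10, 0.90],
    'POSS': [300, 1400, 310, 1380, 500, 1000],
})


def zscore(values):
    """Standardize a Series to mean 0 and (sample) standard deviation 1."""
    return (values - values.mean()) / values.std()


# Compare each team only against other teams within the same play type
team['PPP_Z'] = team.groupby('PLAY_TYPE')['PPP'].transform(zscore)
team['POSS_Z'] = team.groupby('PLAY_TYPE')['POSS'].transform(zscore)

# Average absolute deviation across efficiency and frequency (equal weights)
team['DEVIATION'] = (team['PPP_Z'].abs() + team['POSS_Z'].abs()) / 2

# Higher score means the team's offense looks less like the average team
uniqueness = (
    team.groupby('TEAM_NAME')['DEVIATION']
        .mean()
        .sort_values(ascending=False)
)
print(uniqueness)
```

In this toy data, Team C differs most from the other two teams on every measure, so it ranks first.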