# NBA Four Factors of Success
## Visualizing
The "four factors" of the NBA are advanced metrics on four aspects of a team's performance, namely:
   * Shooting (40%)
   * Turnovers (25%)
   * Rebounding (20%)
   * Free throws (15%)

Statistician and NBA advanced analytics pioneer Dean Oliver estimated the weights shown above in parentheses, which indicate how important each factor is to a team's chances of winning. The factors are tracked, in the same order, by the metrics below.

* eFG% - Effective field goal percentage
* TOV% - Turnover rate
* OREB% - Offensive rebounding percentage
* FTA Rate - Free throw attempt rate
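
As a rough illustration of how these four metrics can be computed from box-score totals, here is a minimal sketch. The formulas mirror the ones used in `get_adv_metrics` later in this notebook; the helper name `four_factors_from_totals` and the sample totals are invented for the example and do not come from the dataset.

```python
def four_factors_from_totals(fgm, fg3m, fga, fta, tov, oreb, opp_dreb):
    """Compute the four factor metrics from raw totals (hypothetical helper)."""
    return {
        # A made three counts as 1.5 made field goals.
        "EFG_PCT": (fgm + 0.5 * fg3m) / fga,
        # Turnovers per estimated possession; 0.44 is the usual estimate of
        # the share of free throw attempts that end a possession.
        "TOV_PCT": tov / (fga + 0.44 * fta + tov),
        # Share of available offensive rebounds that the team collected.
        "ORB_PCT": oreb / (oreb + opp_dreb),
        # Free throw attempts per field goal attempt.
        "FT_RT": fta / fga,
    }


# Invented single-game totals, for illustration only.
sample = four_factors_from_totals(
    fgm=40, fg3m=10, fga=80, fta=20, tov=12, oreb=10, opp_dreb=30
)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

The opponent versions of each metric use the same formulas with the two teams' totals swapped.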

In this notebook I will visualize the four factors for the San Antonio Spurs (Go Spurs Go!) from the 2000/2001 season to the 2020/2021 season. I will also visualize the same factors from their opponents' perspective. Finally, I will combine the four factors into a simple index and plot it against the team's winning percentage.
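
The index can be sketched as a weighted sum of the four metrics using the weights listed above. This follows the formula in `get_adv_metrics` later in this notebook, where the turnover rate enters as `1 - TOV%` so that a higher index is better. The names `WEIGHTS` and `four_factors_index` and the input values are assumptions made for this example.

```python
# Dean Oliver's weights as listed at the top of this notebook.
WEIGHTS = {"shooting": 0.40, "turnovers": 0.25, "rebounding": 0.20, "free_throws": 0.15}


def four_factors_index(efg_pct, tov_pct, orb_pct, ft_rt):
    """Weighted sum of the four factors (hypothetical helper name)."""
    return (
        WEIGHTS["shooting"] * efg_pct
        # Fewer turnovers is better, so the turnover rate enters as (1 - TOV%).
        + WEIGHTS["turnovers"] * (1 - tov_pct)
        + WEIGHTS["rebounding"] * orb_pct
        + WEIGHTS["free_throws"] * ft_rt
    )


# Invented inputs, for illustration only.
index = four_factors_index(efg_pct=0.55, tov_pct=0.12, orb_pct=0.25, ft_rt=0.25)
print(round(index, 4))
```

Because the metrics sit on different scales, the index is best read as a relative measure across seasons rather than as a probability.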


# Steps

1. Add the [Basketball Dataset](https://www.kaggle.com/wyattowalsh/basketball) from Kaggle into the kernel
2. [Connect to the "basketball" SQLite database/file using the SQLite3 engine](#1)
3. [Query the "Game" table for the desired team and component metrics needed to calculate the four factors](#2)
4. [Aggregate the game data by season using Pandas, compute the four factors and index](#3)
5. [Create a single figure visual of 5 subplots using Plotly graph objects](#4)
    * One subplot for each factor and one for the combined index   


## 2. Importing & Connecting <a class="anchor" id="1"></a>


In [None]:
!pip install -U kaleido

In [None]:
import numpy as np
import pandas as pd

import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots

pio.templates.default = "plotly_white"

import sqlite3 as sql

import os

# Create the output folder for exported figures if it does not exist yet.
os.makedirs("images", exist_ok=True)

In [None]:
db_path = '../input/basketball/basketball.sqlite'
connection = sql.connect(db_path)
print("SQL DB connected.")
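
Note that `sqlite3.connect` creates a new, empty database file when the path does not point to an existing one, so a successful connection does not by itself confirm that the dataset was found. A minimal sanity check, sketched here against an in-memory database, is to list the tables. The helper name `list_tables` and the demo tables are assumptions for the example; the real database has its own schema.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in with invented tables, for illustration only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Game (GAME_ID TEXT, SEASON TEXT)")
demo.execute("CREATE TABLE Team (id TEXT)")

tables = list_tables(demo)
print(tables)
```

Running `list_tables(connection)` on the real connection should include `Game`, the table queried in the next section; an empty list would suggest a wrong `db_path`.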

In [None]:
team_mappings={
    'ATL':'Atlanta Hawks','BOS':'Boston Celtics','CLE':'Cleveland Cavaliers',
    'CHI':'Chicago Bulls','GSW':'Golden State Warriors','LAC':'Los Angeles Clippers',
    'CHA':'Charlotte Hornets','DAL':'Dallas Mavericks','DEN':'Denver Nuggets',
    'DET':'Detroit Pistons','HOU':'Houston Rockets','LAL':'Los Angeles Lakers',
    'IND':'Indiana Pacers','MIN':'Minnesota Timberwolves','MIL':'Milwaukee Bucks',
    'MEM':'Memphis Grizzlies','MIA':'Miami Heat','NYK':'New York Knicks',
    'NOP':'New Orleans Pelicans', 'PHI':'Philadelphia 76ers','OKC':'Oklahoma City Thunder',
    'ORL':'Orlando Magic','POR':'Portland Trail Blazers','SAC':'Sacramento Kings',
    'SAS':'San Antonio Spurs','TOR':'Toronto Raptors','BKN':'Brooklyn Nets',
    'UTA':'Utah Jazz','PHX':'Phoenix Suns','WAS':'Washington Wizards'
    
}

<a id="2"></a> <br>
## 3. Write & Query

In [None]:
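def get_data(team):
    """Query every game since the 2000 season for `team` (an abbreviation such as 'SAS'), seen from that team's perspective."""
    # The query below contains 20 "?" placeholders, all bound to the same team abbreviation.
    team_code = 20*[team]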
    four_factors_query = """
        SELECT SEASON,
            GAME_DATE,
            MATCHUP_AWAY,
            GAME_ID,
            ? AS Team,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN WL_HOME ELSE WL_AWAY
            END AS WL,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FG3M_HOME ELSE FG3M_AWAY
            END AS FG3M,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FGA_HOME ELSE FGA_AWAY
            END AS FGA,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FGM_HOME ELSE FGM_AWAY
            END AS FGM,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FTA_HOME ELSE FTA_AWAY
            END AS FTA,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FTM_HOME ELSE FTM_AWAY
            END AS FTM,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN TOV_HOME ELSE TOV_AWAY
            END AS TOV,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN OREB_HOME ELSE OREB_AWAY
            END AS OREB, 
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN DREB_AWAY ELSE DREB_HOME
            END AS DREB_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FG3M_AWAY ELSE FG3M_HOME
            END AS FG3M_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FGA_AWAY ELSE FGA_HOME
            END AS FGA_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FGM_AWAY ELSE FGM_HOME
            END AS FGM_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FTA_AWAY ELSE FTA_HOME
            END AS FTA_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN FTM_AWAY ELSE FTM_HOME
            END AS FTM_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN TOV_AWAY ELSE TOV_HOME
            END AS TOV_OPP,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN DREB_HOME ELSE DREB_AWAY
            END AS DREB,
            CASE TEAM_ABBREVIATION_HOME
                WHEN ? THEN OREB_AWAY ELSE OREB_HOME
            END AS OREB_OPP
        FROM Game
        WHERE (TEAM_ABBREVIATION_HOME = ?
            OR TEAM_ABBREVIATION_AWAY = ?)
            AND (SEASON > '1999')
        ORDER BY GAME_DATE;
    """
    print('Query running...')
    four_factors = pd.read_sql(four_factors_query, connection, params=team_code)

    four_factors = four_factors.astype({'SEASON': 'int',
                                'FG3M':'int',
                                'FGA':'int',
                                'FGM':'int',
                                'FTA':'int',
                                'FTM':'int',
                                'TOV':'int',
                                'OREB':'int',
                                'DREB_OPP':'int',
                                'FG3M_OPP':'int',
                                'FGA_OPP':'int',
                                'FGM_OPP':'int',
                                'FTA_OPP':'int',
                                'FTM_OPP':'int',
                                'TOV_OPP':'int',
                                'DREB':'int',
                                'OREB_OPP':'int'})
    
    print('Query done.')
    return four_factors, team
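
The query above binds the same team abbreviation to every `?` placeholder. One alternative, sketched below under the assumption that the `sqlite3` module is called directly, is a named placeholder such as `:team`, which is bound once from a dictionary. The in-memory table is a cut-down, invented stand-in for the real `Game` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Game (
        TEAM_ABBREVIATION_HOME TEXT, TEAM_ABBREVIATION_AWAY TEXT,
        FGA_HOME INTEGER, FGA_AWAY INTEGER)"""
)
# Invented rows, for illustration only.
conn.executemany(
    "INSERT INTO Game VALUES (?, ?, ?, ?)",
    [("SAS", "DAL", 85, 90), ("LAL", "SAS", 88, 79), ("BOS", "NYK", 91, 84)],
)

query = """
    SELECT CASE TEAM_ABBREVIATION_HOME
               WHEN :team THEN FGA_HOME ELSE FGA_AWAY
           END AS FGA,
           CASE TEAM_ABBREVIATION_HOME
               WHEN :team THEN FGA_AWAY ELSE FGA_HOME
           END AS FGA_OPP
    FROM Game
    WHERE TEAM_ABBREVIATION_HOME = :team OR TEAM_ABBREVIATION_AWAY = :team
    ORDER BY rowid
"""
# The single dictionary entry is reused for every :team occurrence.
rows = conn.execute(query, {"team": "SAS"}).fetchall()
print(rows)
```

Each `CASE` picks the home column when the team played at home and the away column otherwise, so every returned row is expressed from the chosen team's point of view regardless of venue.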

<a id="3"></a> <br>
## 4. Aggregate & Compute Factors

In [None]:
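def get_adv_metrics(df):
    """Aggregate game rows by season and compute the four factors, the index, and the win percentage."""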
    num_cols = list(df.select_dtypes('number').columns)
    num_cols.remove('SEASON')
    ffs = df.groupby('SEASON')[num_cols].sum()

    ffs['EFG_PCT'] = (ffs.FGM + 0.5*ffs.FG3M)/ffs.FGA
    ffs['TOV_PCT'] = ffs.TOV/(ffs.FGA + (0.44*ffs.FTA) + ffs.TOV)
    ffs['ORB_PCT'] = ffs.OREB/(ffs.OREB + ffs.DREB_OPP)
    ffs['FT_RT'] = ffs.FTA/ffs.FGA
    ffs['FF_Index'] = 0.4*ffs.EFG_PCT + 0.25*(1-ffs.TOV_PCT) + 0.2*ffs.ORB_PCT + 0.15*ffs.FT_RT

    ffs['EFG_PCT_OPP'] = (ffs.FGM_OPP + 0.5*ffs.FG3M_OPP)/ffs.FGA_OPP
    ffs['TOV_PCT_OPP'] = ffs.TOV_OPP/(ffs.FGA_OPP + (0.44*ffs.FTA_OPP) + ffs.TOV_OPP)
    ffs['ORB_PCT_OPP'] = ffs.OREB_OPP/(ffs.DREB + ffs.OREB_OPP)
    ffs['FT_RT_OPP'] = ffs.FTA_OPP/ffs.FGA_OPP
    ffs['FF_Index_OPP'] = 0.4*ffs.EFG_PCT_OPP + 0.25*(1-ffs.TOV_PCT_OPP) + 0.2*ffs.ORB_PCT_OPP + 0.15*ffs.FT_RT_OPP

    ffs.drop(num_cols, axis=1, inplace=True)
    
    wl = pd.get_dummies(df[['SEASON','WL']])
    wl = wl.groupby('SEASON').sum()
    wl['win_pct'] = wl.WL_W/wl.sum(axis=1)

    ffs = ffs.join(wl)

    return ffs
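
`get_adv_metrics` sums the raw counts per season before forming the ratios, which weights every shot equally rather than every game equally. The sketch below, in plain Python with two invented games, shows that this pooled figure generally differs from the average of per-game percentages.

```python
# Two invented games, for illustration only.
games = [
    {"FGM": 30, "FG3M": 5, "FGA": 60},
    {"FGM": 45, "FG3M": 15, "FGA": 100},
]


def efg(fgm, fg3m, fga):
    """Effective field goal percentage."""
    return (fgm + 0.5 * fg3m) / fga


# Approach used in this notebook: sum the counts first, then take the ratio.
totals = {key: sum(game[key] for game in games) for key in ("FGM", "FG3M", "FGA")}
pooled = efg(totals["FGM"], totals["FG3M"], totals["FGA"])

# Alternative: average the per-game percentages.
per_game_mean = sum(efg(g["FGM"], g["FG3M"], g["FGA"]) for g in games) / len(games)

print(f"pooled: {pooled:.4f}, mean of games: {per_game_mean:.4f}")
```

The two numbers coincide only when every game has the same number of attempts, so the choice matters most for seasons with uneven game pace.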

In [None]:
four_factors, team = get_data('SAS')

In [None]:
ffs = get_adv_metrics(four_factors)
ffs.tail()

<a id="4"></a> <br>
## 5. Create Visual

In [None]:
team_color = '#1f85ff'
opp_color = '#868a91'

fig = make_subplots(rows=3, cols=2,
                   subplot_titles=('Four Factors Index','eFG%','TOV%','OREB%','FTA Rate'),
                    specs=[[{"colspan": 2, 'secondary_y':True}, None],
                            [{}, {}],
                            [{}, {}]]
                   )

fig.add_trace(
    go.Scatter(
        x=ffs.index, 
        y=ffs.FF_Index_OPP,
        name='Opponent',
        marker_color=opp_color,
        line_dash='dash',
        line_shape='spline',
        legendgroup='indices',
        legendgrouptitle_text='Index (Right)'
    ), 
    row=1, col=1,
    secondary_y=True
)

fig.add_trace(
    go.Scatter(
        x=ffs.index, 
        y=ffs.FF_Index,
        name=team,
        marker_color = team_color,
        line_width=2.5,
        line_shape='spline',
        legendgroup='indices'
    ),
    row=1, col=1,
    secondary_y=True
)

fig.add_trace(
    go.Bar(
        x=ffs.index, 
        y=ffs.win_pct,
        name=team,
        marker_color='#bddbff',
        legendgroup='winpct',
        legendgrouptitle_text='Win% (Left)'
    ), row=1, col=1
)

metrics = ['EFG_PCT_OPP', 'EFG_PCT', 'TOV_PCT_OPP', 'TOV_PCT',
           'ORB_PCT_OPP', 'ORB_PCT', 'FT_RT_OPP', 'FT_RT']

metric=0

for row in [2,3]:
    for col in [1,2]:  

        fig.add_trace(
            go.Scatter(
                x=ffs.index, 
                y=ffs[metrics[metric]],
                name='Opponent',
                marker_color = opp_color,
                line_shape='spline',
                line_dash='dash',
                showlegend=False
            ), row=row, col=col
        )
        metric+=1
        fig.add_trace(
            go.Scatter(
                x=ffs.index, 
                y=ffs[metrics[metric]],
                name=team,
                marker_color=team_color,
                line_shape='spline',
                showlegend=False
            ), row=row, col=col
        )
        metric+=1

# The grid has 3 rows and 2 columns: hide x gridlines and show y ticks as percentages.
for i in [1,2,3]:
    for j in [1,2]:
        fig.update_xaxes(showgrid=False, row=i, col=j)
        fig.update_yaxes(tickformat='%', row=i, col=j)

fig.update_yaxes(secondary_y=True,side='right', showgrid=False, tickformat='.2f',row=1, col=1)
fig.update_yaxes(secondary_y=False,side='left', row=1, col=1)

# Subplot titles are stored as layout annotations; enlarge all five of them.
for annotation in fig.layout.annotations:
    annotation.update(font_size=14)

fig.update_layout(
    title='<b>NBA Four Factors Index & Components</b><br>'+team_mappings[team],
    title_font_size=20,
    font_size=10,
    xaxis_showgrid=False,
    font_family='helvetica, verdana, arial, sans-serif',
    height=700
)

fig.write_image("images/ffindex.jpeg")
fig.write_image("images/ffindex.pdf")
fig.write_image("images/ffindex.svg")

fig.show()
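
The nested loop above relies on a running `metric` counter to walk through the opponent/team column pairs. As a possible alternative, the mapping from grid cell to column pair can be built up front, which makes the layout easier to inspect. This is only a sketch: the names `factors`, `cells`, and `layout` are introduced here and are not part of the notebook.

```python
from itertools import product

# Base column names in the same order as the subplot titles after the index.
factors = ["EFG_PCT", "TOV_PCT", "ORB_PCT", "FT_RT"]

# Grid cells for the four factor subplots: rows 2-3, columns 1-2, row by row.
cells = list(product([2, 3], [1, 2]))

# One (row, col, opponent column, team column) tuple per subplot.
layout = [
    (row, col, f"{name}_OPP", name)
    for (row, col), name in zip(cells, factors)
]

for entry in layout:
    print(entry)
```

Iterating over `layout` and adding the opponent trace followed by the team trace for each tuple would reproduce the same placement as the counter-based loop.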