# Does defense actually win championships?

In this analysis, we attempt to answer this question using the defensive team statistics of NBA Finals teams from the 1996-97 season up to the 2020-21 season. This span covers different eras, which helps identify whether there is a consistent underlying pattern that is fundamental in the NBA even as the style of play changes.

This analysis is relevant to coaches deciding whether their next draft pick or trade should be geared towards a more defensive player. It is also relevant to players, as it can help illustrate how important defense is to their championship aspirations.

Subsequent analyses will also address offense and the combination of offense and defense.

In [1]:
import sqlalchemy
import pandas as pd
from os import environ


engine = sqlalchemy.create_engine("mariadb+mariadbconnector://"+environ.get("USER")+\
                                  ":"+environ.get("PSWD")+"@127.0.0.1:3306/nba")

### The first step is to collect the data from the team standings in the playoffs from the database.

Since we are interested in each team's position in the playoffs in relation to its defensive statistics, we will only collect the defensive statistics, the wins and the team names. The teams can then be ordered by number of wins within the season they participated in.

We will therefore take:
- The playoff season that follows the format "004YY" where YY is the year the season starts (SEASON_ID)
- The team names of the participating teams (TEAM)
- Their wins (W)
- Their average rebounds (REB)
- Their average steals (STL)
- Their average blocks (BLK)

In [2]:
fields = "SEASON_ID, Teams.Name as TEAM, W, REB, STL, BLK "

join =  "Team_standings INNER JOIN Teams on Team_standings.TEAM_ID = Teams.ID "

condition = "where SEASON_ID LIKE '004%' "

select = "SELECT "+ fields + "FROM " + join + condition + "order by SEASON_ID asc, W desc"

df = pd.read_sql(select, engine)
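To make the assembled query easier to read, the sketch below rebuilds the same string pieces and compares them with a single multi-line query. The name `query` is introduced here for illustration only; nothing is sent to the database.

```python
# Same pieces as in the cell above.
fields = "SEASON_ID, Teams.Name as TEAM, W, REB, STL, BLK "
join = "Team_standings INNER JOIN Teams on Team_standings.TEAM_ID = Teams.ID "
condition = "where SEASON_ID LIKE '004%' "
select = "SELECT " + fields + "FROM " + join + condition + "order by SEASON_ID asc, W desc"

# The same statement written out in one piece (illustrative name: query).
query = """
SELECT SEASON_ID, Teams.Name as TEAM, W, REB, STL, BLK
FROM Team_standings INNER JOIN Teams on Team_standings.TEAM_ID = Teams.ID
where SEASON_ID LIKE '004%'
order by SEASON_ID asc, W desc
"""

# Collapse whitespace so that only the SQL text itself is compared.
same = " ".join(query.split()) == select
print(same)
```

The `LIKE '004%'` filter keeps only playoff seasons, and the ordering places the team with the most playoff wins first within each season.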

In [6]:
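from time import localtime


def build_year(year):
    # Expand a two-digit year: values that would fall in the future belong to the 1900s.
    if 2000 + year > localtime().tm_year:
        return 1900 + year
    return 2000 + year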
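As a quick check of the two-digit year expansion, the sketch below repeats the function and applies it to the last two characters of a few season IDs. The sample IDs follow the "004YY" format described earlier; the variable names are illustrative.

```python
from time import localtime


def build_year(year):
    # Two-digit years that would land in the future are read as 19YY.
    if 2000 + year > localtime().tm_year:
        return 1900 + year
    return 2000 + year


# Sample playoff season IDs in the "004YY" format.
season_ids = ["00496", "00499", "00400", "00420"]
years = [build_year(int(s[-2:])) for s in season_ids]
print(years)
```

The cut-off depends on the current year, so a two-digit value such as 96 is read as 1996 only because 2096 has not yet been reached.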

## Since this analysis is on championships, the data required is for the teams that appear in the NBA Finals. The following function extracts the top two teams in the playoffs for each season from 1996-97 to 2020-21.

In [48]:
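def segment_data(df):
    new_df = pd.DataFrame()
    seasons = df['SEASON_ID'].unique()

    for s in seasons:
        # Rows arrive ordered by wins (descending), so the first two are the finalists.
        # Copy the slice so that adding columns does not write into a view of df.
        d = df.loc[df['SEASON_ID'] == s].head(2).copy()
        d["POSITION"] = list(range(1, len(d) + 1))
        d["YEAR"] = build_year(int(s[-2:]))
        new_df = pd.concat([new_df, d])

    # The runner-up is listed before the winner within each year.
    return new_df.sort_values(["YEAR", "POSITION"], ascending=[True, False])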

In [49]:
df = segment_data(df)

In [101]:
df.head()

Unnamed: 0,SEASON_ID,TEAM,W,REB,STL,BLK,POSITION,YEAR
337,496,UTA,13,41.8,7.4,4.9,2,1996
336,496,CHI,15,43.5,8.5,4.8,1,1996
353,497,UTA,13,39.8,6.5,4.9,2,1997
352,497,CHI,15,41.0,9.0,4.5,1,1997
369,498,NYK,12,39.3,7.6,3.9,2,1998
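The output above lists the runner-up before the winner within each season. A minimal plain-Python sketch of the same selection and ordering follows. The two "00496" finalists and their wins come from the output above, while "XXX" is a hypothetical third team added only to show the top-two cut.

```python
from itertools import groupby

# (SEASON_ID, TEAM, W); "XXX" and its 9 wins are hypothetical, for illustration only.
rows = [
    ("00496", "XXX", 9),
    ("00496", "CHI", 15),
    ("00496", "UTA", 13),
]

# Same ordering as the SQL query: season ascending, wins descending.
rows.sort(key=lambda r: (r[0], -r[2]))

finalists = []
for season, group in groupby(rows, key=lambda r: r[0]):
    # Keep the two teams with the most playoff wins and number them 1 and 2.
    for position, (sid, team, wins) in enumerate(list(group)[:2], start=1):
        finalists.append({"SEASON_ID": sid, "TEAM": team, "W": wins, "POSITION": position})

# Runner-up (POSITION 2) first, then the winner (POSITION 1).
finalists.sort(key=lambda f: (f["SEASON_ID"], -f["POSITION"]))
print([(f["TEAM"], f["POSITION"]) for f in finalists])
```

Placing the winner last within each season means that a row-to-row difference inside a season reads as winner minus runner-up.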


## For each Finals, we will determine the percentage of winning teams that:

- Have no defensive stats higher than the losing team
- Have one defensive stat higher than the losing team
- Have two defensive stats higher than the losing team
- Have all defensive stats higher than the losing team

Comparing the winners with two or all of the defensive stats higher against those with one or none will show whether a more complete defense likely leads to a win in the Finals.

## We first calculate the differences in the defensive stats for each year's NBA Finals. Positive differences mean that the winning team had higher defensive stats than the runner-up. Negative differences mean the winning team had lower defensive stats than the runner-up.

In [96]:
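# Difference only the numeric stat columns; subtracting TEAM strings is not meaningful.
diffs = df.groupby("SEASON_ID")[["REB", "STL", "BLK"]].diff()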

In [97]:
diffs["SEASON_ID"] = df["SEASON_ID"]

In [98]:
diffs = diffs.dropna()

In [99]:
diffs = diffs[["REB","STL","BLK","SEASON_ID"]]

In [100]:
diffs

Unnamed: 0,REB,STL,BLK,SEASON_ID
336,1.7,1.1,-0.1,496
352,1.2,2.5,-0.4,497
368,0.9,-0.3,1.9,498
384,2.8,1.2,1.4,499
0,4.3,1.2,0.9,400
16,2.7,-1.1,0.3,401
32,0.4,0.3,2.6,402
48,4.6,0.7,2.9,403
64,-0.2,-1.7,-0.3,404
80,-0.7,0.4,0.3,405
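The first row of the differences above can be reproduced by hand from the 1996 finalists shown in the earlier output. The sketch below uses plain Python rather than pandas to make the arithmetic explicit; the dictionary names are illustrative.

```python
# 1996 finalists, values taken from the earlier df.head() output.
runner_up = {"REB": 41.8, "STL": 7.4, "BLK": 4.9}  # UTA
winner = {"REB": 43.5, "STL": 8.5, "BLK": 4.8}     # CHI

# Winner minus runner-up, rounded to one decimal to hide floating-point noise.
diff_1996 = {stat: round(winner[stat] - runner_up[stat], 1) for stat in winner}
print(diff_1996)
```

Chicago out-rebounded and out-stole Utah but blocked slightly fewer shots, so two of the three differences are positive.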


In [71]:
def count_higher(df, stats, number_higher):
    """Count the Finals in which the winner led in exactly `number_higher` of `stats`."""
    # A non-negative difference counts as "higher", so a tie is credited to the winner.
    higher_per_final = (df[stats] >= 0).sum(axis=1)
    return int((higher_per_final == number_higher).sum())

In [72]:
total_finals = len(diffs)
stats = ["REB","STL","BLK"]

In [73]:
zero_higher = count_higher(diffs,stats, 0)
one_higher = count_higher(diffs,stats, 1)
two_higher = count_higher(diffs,stats, 2)
all_higher = count_higher(diffs,stats, 3)

In [80]:
highers = [zero_higher, one_higher, two_higher, all_higher]

In [82]:
probs = [zero_higher/total_finals,one_higher/total_finals,two_higher/total_finals,all_higher/total_finals]
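To illustrate the counting and the proportions, the sketch below applies the same rule in plain Python to the ten rows of `diffs` displayed earlier. It covers only those ten Finals, so its proportions are not the full 25-season result; the variable names are illustrative.

```python
# (REB, STL, BLK) differences, copied from the ten rows of diffs shown earlier.
sample_diffs = [
    (1.7, 1.1, -0.1), (1.2, 2.5, -0.4), (0.9, -0.3, 1.9), (2.8, 1.2, 1.4),
    (4.3, 1.2, 0.9), (2.7, -1.1, 0.3), (0.4, 0.3, 2.6), (4.6, 0.7, 2.9),
    (-0.2, -1.7, -0.3), (-0.7, 0.4, 0.3),
]

# For each Finals, count how many stats the winner led in (ties count as higher).
higher_per_final = [sum(1 for d in row if d >= 0) for row in sample_diffs]

# Tally the Finals with exactly 0, 1, 2 or 3 stats higher.
sample_counts = [higher_per_final.count(n) for n in range(4)]
sample_probs = [c / len(sample_diffs) for c in sample_counts]

print(sample_counts)
print(sample_probs)
```

In this subset the winner led in at least two of the three stats in nine of the ten Finals.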

In [93]:
labels = ["Zero", "One","Two","All"]

## We will now build a dashboard that shows the proportion of Finals winners against the number of defensive stats in which they were higher than the runner-up, with colour separating the categories.

In [85]:
from jupyter_dash import JupyterDash
from dash import html
from dash.dependencies import Input, Output
import plotly.express as px
from dash import dcc

In [86]:
from time import localtime

In [116]:
fig = px.bar(y=labels, x=probs,color=labels,barmode='overlay',opacity=1,orientation='h')
fig.update_layout(title_text="Percentage of NBA final winners who have defensive stats higher than the runner up",
                  title_x=0.5,yaxis={'categoryorder':'max ascending'},xaxis_title="Proportion of winners",
                  yaxis_title = "Defensive stats higher than the runner up")


app = JupyterDash(__name__)
colours = {'text': '#7FDBFF', 'background':'#333333','radio_button':'#BBBBBB'} 
text_size = {'H1':48,'H2':40,'text':28,'radio_button':20}

app.layout = html.Div(style={'backgroundColor':colours['background'],'fontFamily':'Arial'}, children=[

    html.H1(children='NBA Data visualisation',
        style = {'textAlign': 'center',
                 'color':colours['text'],
                 'fontSize':text_size['H1']}),


    html.Div(children=[dcc.Graph(figure = fig, id = 'graph')])
        
#             html.Div([html.Label("Position: ", style = {'textAlign': 'center',
#                          'color':colours['text'],
#                          'fontSize':text_size['text']}),
#                         pos_dropdown()
#                     ],
#                 style = {'textAlign': 'left',"flex":1}
#             ),
        
#             html.Div([html.Label("Season: ", style = {'textAlign': 'center',
#                          'color':colours['text'],
#                          'fontSize':text_size['text']}),
#                         season_dropdown()
#                      ],
#                 style = {'textAlign': 'center',"flex":1}
#             ),
        
        
#             html.Div([html.Label("Stats", style = {'textAlign': 'center',
#                          'color':colours['text'],
#                          'fontSize':text_size['text']}),
#                         player_stats()
#                      ],
#                 style = {'textAlign': 'right',"flex":1}
#             )
        
        
            

#         ],style = {'display':'flex','flex-direction': 'row'}
#     ),
    
#     player_slider(),
    
   
    
#     ,year_slider()
])

# @app.callback(
#     Output('graph','figure'),
#     Input('stats','value'),
#     Input('years','value'),
#     Input('pos_drop','value'),
#     Input('n_players','value'),
#     Input('season_drop','value'))
# def update_figure(stat,y,pos,top_n_players,season_type):
    
#     fig = px.scatter(dfs, x='YEAR', y="REB", color="POSITION",markers=True)
    
# #     fig.update_layout(yaxis={'categoryorder':'max ascending'},
# #                       title_text= stat+ " for top "+str(top_n_players)+" players at "+ pos +\
# #                       " in the "+str(y) +" regular season", title_x=0.5)

#     return fig

In [117]:
app.run_server(mode = "external")



Dash app running on http://127.0.0.1:8050/


## The resulting graph shows that teams with a more complete defense tend to win the Finals, with a total proportion of 0.72.
![graph](../newplot.png)
