# Looking Past the Basics: An Analysis of NBA Player Draymond Green and How He Contributed To A Championship Team

Basketball players in the NBA are often judged by the “basic” statistics: how many points they score, how many rebounds they get, and so on. However, the game is deeper than these surface-level stats. How can a player like Draymond Green of the Golden State Warriors, whose traditional stats are pedestrian at best, be a 3-time All-Star with 3 NBA championships? In this project, I look at more advanced statistics to examine what we miss when we judge players on just their points, rebounds, and assists. To do so, I collected NBA player data across 6 seasons from Basketball Reference and FiveThirtyEight, which cover many more advanced aspects of the game.

## Basic Offensive Statistics

The easiest way to judge NBA players is by how well they shoot the ball. Field goal percentage (the percentage of a player's shots that go into the basket) and points scored per game are the metrics by which the league's best players are most often compared. These statistics are easy to count and have an obvious impact on any game you watch. To aggregate and display them simply, I performed a principal component analysis on 6 typical offensive statistics (percentage of three-pointers made, percentage of two-pointers made, free throw attempts, free throw percentage, shots attempted, and assists) in order to capture the impact a player has offensively.
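
The scale-then-project step described above can be sketched on synthetic data. This is a minimal illustration rather than the notebook's actual pipeline: the random matrix only stands in for the six offensive columns, and the share of variance explained by each component is read from scikit-learn's `explained_variance_ratio_` attribute instead of being typed in by hand.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for six offensive statistics (rows = player-seasons).
# The values are random and purely illustrative, not real NBA data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) * np.array([1.0, 2.0, 30.0, 5.0, 100.0, 10.0])

# Rescale every column to [0, 1] so that large-valued stats (such as shot
# attempts) do not dominate the components simply because of their units.
X_scaled = MinMaxScaler().fit_transform(X)

# Project onto the two directions of greatest variance.
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

# Fraction of total variance captured by each component, largest first.
explained = pca.explained_variance_ratio_
print(components.shape)
print([f"{share:.0%}" for share in explained])
```

Reading the percentages from the fitted model keeps axis labels in step with the data if the input columns or seasons ever change.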

In this visualization, Draymond Green is compared to the rest of the NBA players in a given season (subject to an adjustable minimum number of minutes played, 500 by default). All-NBA players, the 15 best players in a given season as voted on by 100 sportswriters and broadcasters, are also identified to highlight where the peak talent is.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interactive
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline

data = pd.read_csv('./data/BballrefDatav3.csv')
data['draymond'] = np.where(data.Player.str.contains('Draymond Green'), 1, 0)

def pcaplot(year = 2014, minMP = 500):
    # Scale the six offensive statistics to [0, 1], then project onto two components.
    pca = PCA(n_components=2)
    scale = MinMaxScaler()
    X = data[['Thr','TwoPper','FTA','FTper','FGA','ASTper']]
    Xs = scale.fit_transform(X)
    Xp = pca.fit_transform(Xs)
    dfp = pd.DataFrame(data = Xp, columns = ['PCA1', 'PCA2'])
    dfp['Player'] = data['Player']
    dfp['TRAIN'] = data['TRAIN']
    dfp['MP'] = data['MP']
    # Player strings contain the season year, so match on it to select one season,
    # then keep only players above the minutes-played threshold.
    yeardf = dfp[dfp.Player.str.contains(str(year)) & (dfp.MP > minMP)].copy()
    # note: TRAIN is a relic from previous analyses; any player with a TRAIN of 5 or more made the All-NBA team.
    # I'm converting the variable to capture Green (2), All-NBA players (1), and others (0).
    yeardf.loc[yeardf.TRAIN < 5, 'TRAIN'] = 0
    yeardf.loc[yeardf.TRAIN > 0, 'TRAIN'] = 1
    yeardf.loc[yeardf.Player.str.contains('Draymond Green'), 'TRAIN'] = 2
    draydf = yeardf.loc[yeardf['TRAIN'] == 2]
    allnbadf = yeardf.loc[yeardf['TRAIN'] == 1]
    restdf = yeardf.loc[yeardf['TRAIN'] == 0]
    plt.figure(figsize = (10,10))
    ax = plt.gca()
    ax.scatter(restdf.PCA1, restdf.PCA2, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(allnbadf.PCA1, allnbadf.PCA2, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(draydf.PCA1, draydf.PCA2, ec = 'black', alpha = 1, s = 100)
    plt.legend(['Other Players','All-NBA','Draymond Green'],
                fontsize=14,
                bbox_to_anchor = (0.3,1))
    # PCA1 is plotted on the x-axis and PCA2 on the y-axis.
    plt.xlabel('PCA1, 42% Explained', fontsize = 14)
    plt.ylabel('PCA2, \n 23% Explained', fontsize = 14, rotation = 0, labelpad = 25)
    ax.get_xaxis().set_ticks([])
    ax.get_yaxis().set_ticks([])
    plt.title("PCA of Basic Offensive Statistics", fontsize = 17)
    plt.grid(axis = 'both', color = 'gray', alpha = .5)
    for s in ax.spines:
        ax.spines[s].set_visible(False)
    plt.show()


interactive_plot = interactive(pcaplot,
                               year = (2014, 2019, 1),
                               minMP = (500, 2000, 100))
output = interactive_plot.children[-1]
output.layout.height = '600px'
interactive_plot
```

Based on this data, it is hard to believe that Green is anywhere near the talent level of the best players in the league. Regardless, he is considered to be that good. In fact, he made the All-NBA team in two of the years displayed: 2016 and 2017. There must be something deeper that explains why he is such an outlier.

## Defense: Affecting Shots and Winning Games

It is harder to quantify the impact a basketball player has defensively. Defense is inherently more cooperative than offense, and it is often difficult to tell who is guarding whom on a given possession, let alone what impact a defender has on a shot. However, the politics and sports statistics site FiveThirtyEight tried to quantify the average impact a defender has on a shot, using spatial data to identify the closest defender on each shot. By adjusting for the skill of the shooter, they came up with a metric that captures the change in field goal percentage a player imposes on whomever he is guarding.
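
The idea of adjusting for shooter skill can be illustrated with a toy calculation. This is a simplified sketch, not FiveThirtyEight's actual model: the shot log and its column names (`defender`, `expected_make`, `made`) are hypothetical, and `expected_make` stands in for whatever estimate of the shooter's usual accuracy is available. A negative value means shooters make fewer shots than expected when guarded by that defender.

```python
import pandas as pd

# Hypothetical shot log: one row per shot, with the closest defender,
# the shooter's expected make probability, and whether the shot went in.
shots = pd.DataFrame({
    "defender":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "expected_make": [0.50, 0.40, 0.45, 0.45, 0.50, 0.40, 0.45, 0.45],
    "made":          [0, 0, 1, 0, 1, 1, 0, 1],
})

# Per-shot difference between what happened and what was expected.
shots["over_expected"] = shots["made"] - shots["expected_make"]

# Average that difference per defender: the change in field goal
# percentage associated with being guarded by that player.
effect = shots.groupby("defender")["over_expected"].mean()
print(effect)
```

In this made-up log, defender A holds shooters about 20 percentage points below expectation, while defender B allows about 30 points above it. A real version would need many more shots per defender before the averages mean anything.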

Unlike traditional statistics, this metric captures the impact of Green. Over the six years analyzed, he has the largest effect on opponents' shooting of any of the players analyzed.

The following scatter plot shows this metric plotted against a second measure of defensive impact, defensive win shares (the number of wins a player is estimated to provide his team over the course of a season through his defense). In this chart, it is clear that Green is among the best.

```python
import matplotlib.ticker as mticker

def defense_plot(year, MinMP = 500):
    # Select one season (Player strings contain the year) and apply the minutes threshold.
    yeardf = data[data.Player.str.contains(str(year)) & (data.MP > MinMP)].copy()
    # Recode TRAIN: others (0), All-NBA players (1, originally TRAIN >= 5), Green (2).
    yeardf.loc[yeardf.TRAIN < 5, 'TRAIN'] = 0
    yeardf.loc[yeardf.TRAIN > 0, 'TRAIN'] = 1
    yeardf.loc[yeardf.Player.str.contains('Draymond Green'), 'TRAIN'] = 2
    draydf = yeardf.loc[yeardf['TRAIN'] == 2]
    allnbadf = yeardf.loc[yeardf['TRAIN'] == 1]
    restdf = yeardf.loc[yeardf['TRAIN'] == 0]
    plt.figure(figsize = (10,10))
    ax = plt.gca()
    # The 'Draymond' column is the opponent-shooting metric; DWS is defensive win shares.
    ax.scatter(restdf.Draymond, restdf.DWS, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(allnbadf.Draymond, allnbadf.DWS, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(draydf.Draymond, draydf.DWS, ec = 'black', alpha = 1, s = 100)
    plt.legend(['Other Players', 'All-NBA Players', 'Draymond Green'],
                fontsize=14,
                bbox_to_anchor = (0.3,1))
    # Pin the tick positions, then display the x tick labels with their sign flipped.
    ticks_loc = ax.get_xticks().tolist()
    ax.xaxis.set_major_locator(mticker.FixedLocator(ticks_loc))
    ax.set_xticklabels([-x for x in ticks_loc], fontsize = 15)
    plt.ylabel('Defensive \n Win Shares', fontsize = 14, rotation = 0, labelpad = 40)
    plt.xlabel('Effect on Opponent FG Percentage', fontsize = 14)
    plt.title("Draymond Green Excels Defensively", fontsize = 17)
    plt.grid(axis = 'both', color = 'gray', alpha = .5)
    for s in ax.spines:
        ax.spines[s].set_visible(False)
    plt.show()

interactive_plot = interactive(defense_plot,
                               year = (2014, 2019, 1),
                               MinMP = (500, 2000, 100))
output = interactive_plot.children[-1]
output.layout.height = '700px'
interactive_plot
```

## Putting It All Together

We have seen that, while Draymond Green may be only average offensively, he is able to impact the game heavily on defense. However, when it is all put together, is there any way to tell whether his total impact is positive? In a real game where both offense and defense matter, do I want Green on my team?

There are a few advanced metrics that try to quantify a player’s total impact on a basketball game. One is win shares (WS), which estimates the number of wins a player contributes to a team over a season. The defensive subset of this statistic was used above, but the combined version captures a player's complete impact. Another is VORP, or Value Over Replacement Player, which estimates the points a player contributes per 100 team possessions above a replacement-level player.
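
To make the VORP definition concrete, here is a rough sketch of the calculation as I understand Basketball Reference's published description. It is not the code used to produce the data in this project. It assumes a player's Box Plus/Minus (BPM, a points-per-100-possessions rating that is not otherwise used here), a replacement level of -2.0, and the player's share of team playing time; the function name and the example inputs are hypothetical.

```python
def vorp_estimate(bpm, minutes_share, team_games=82, replacement_level=-2.0):
    """Rough VORP: points per 100 team possessions above replacement level,
    scaled by playing time and prorated to an 82-game season.

    Assumed form: (BPM - replacement level) * share of playing time * (games / 82).
    """
    return (bpm - replacement_level) * minutes_share * (team_games / 82)


# Hypothetical player: +4.0 BPM, on the floor for 70% of the team's minutes.
print(vorp_estimate(bpm=4.0, minutes_share=0.70))

# A player exactly at replacement level adds nothing, however much he plays.
print(vorp_estimate(bpm=-2.0, minutes_share=1.0))
```

The point of the playing-time factor is that VORP rewards both quality and availability: a very good player who rarely plays can end up with a lower VORP than a solid player who is always on the floor.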

A scatter plot of these two metrics lets us visualize the impact Green has compared to the rest of the NBA.

```python
def wins_plot(year, MinMP = 500):
    # Select one season (Player strings contain the year) and apply the minutes threshold.
    yeardf = data[data.Player.str.contains(str(year)) & (data.MP > MinMP)].copy()
    # Recode TRAIN: others (0), All-NBA players (1, originally TRAIN >= 5), Green (2).
    yeardf.loc[yeardf.TRAIN < 5, 'TRAIN'] = 0
    yeardf.loc[yeardf.TRAIN > 0, 'TRAIN'] = 1
    yeardf.loc[yeardf.Player.str.contains('Draymond Green'), 'TRAIN'] = 2
    draydf = yeardf.loc[yeardf['TRAIN'] == 2]
    allnbadf = yeardf.loc[yeardf['TRAIN'] == 1]
    restdf = yeardf.loc[yeardf['TRAIN'] == 0]
    plt.figure(figsize = (10,10))
    ax = plt.gca()
    # VORP on the x-axis, total win shares on the y-axis.
    ax.scatter(restdf.VORP, restdf.WS, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(allnbadf.VORP, allnbadf.WS, ec = 'gray', alpha = 0.5, s = 75)
    ax.scatter(draydf.VORP, draydf.WS, ec = 'black', alpha = 1, s = 100)
    plt.legend(['Other Players', 'All-NBA Players', 'Draymond Green'],
                fontsize=14,
                bbox_to_anchor = (0.3,1))
    plt.ylabel('Win Shares', fontsize = 14, rotation = 0, labelpad = 40)
    plt.xlabel('Value Over Replacement Player', fontsize = 14)
    plt.title("Draymond Green's Overall Value", fontsize = 17)
    plt.grid(axis = 'both', color = 'gray', alpha = .5)
    for s in ax.spines:
        ax.spines[s].set_visible(False)
    plt.show()

interactive_plot = interactive(wins_plot,
                               year = (2014, 2019, 1),
                               MinMP = (500, 2000, 100))
output = interactive_plot.children[-1]
output.layout.height = '700px'
interactive_plot
```

Looking year by year, Draymond Green's overall value varies. However, with the possible exception of 2014 and 2019, he is consistently well above the average player. His 2016 and 2017 All-NBA selections appear well justified, and he may have deserved the award in 2015 as well. This is despite an offensive impact that is only around the league average, which he makes up for with exceptional defense. It is safe to say that over this period, Green was worthy of the accolades he received and the credit he was given as an integral part of the Golden State Warriors during their 3 championship seasons.


### References
- For opponent field goal percentage statistics: https://fivethirtyeight.com/features/a-better-way-to-evaluate-nba-defense/
- For all other statistics and All-NBA selections: https://www.basketball-reference.com/