
# NBA 3-Point Statistics

## Table of Contents
1. [Introduction](#Introduction)
2. [Data Wrangling](#Data-Wrangling)
3. [New NBA Stats](#New-NBA-Stats)
4. [3Pave Rankings](#3Pave-Rankings)
5. [Conclusion](#Conclusion)


## Introduction

There has never been a single metric to determine the best NBA 3-point shooters. Most fans know the reality from watching the games. When they look for objective evidence, most consider two numbers: 3-point percentage and 3-pointers made.

Is there an empirical way to combine 3-point Percentage and 3-pointers Made into one metric? Will the metric verify that Steph Curry is the greatest 3-point shooter of all-time? What about the best 3-point shooting team of all-time?

To answer these questions, I developed 3Pave, or 3-point Average. 3Pave computes the number of points a team gains per possession when the player makes a 3-pointer, minus the number of points the team loses when the player misses a 3-pointer. The metric is based on the expected value of points per possession.

This Jupyter Notebook contains Exploratory Data Analysis introducing the new metric, 3Pave. It also introduces EM3A and EM3: Expected Minutes before a 3-point Attempt and Expected Minutes before a 3. All metrics evaluate 3-point shooters throughout NBA history. Data Wrangling steps are included for those with an interest in pandas. Readers uninterested in pandas can skip directly to [New NBA Stats](#New-NBA-Stats).


#### References

https://www.kaggle.com/drgilermo/nba-players-stats <br>
https://www.basketball-reference.com/

#### Copyright

Corey J Wade<br>
May 31, 2018

This Jupyter Notebook and the statistics within may be redistributed provided that credit is given to the author, Corey J Wade.


## Data Wrangling

The following csv file is taken from https://www.kaggle.com/drgilermo/nba-players-stats. When I downloaded the file, it contained NBA statistics through 2017. Dr. Guillermo scraped it from https://www.basketball-reference.com/. 

#### NBA Stats Through 2017

In [620]:
# Import pandas
import pandas as pd

# Open file as DataFrame
df_2017 = pd.read_csv('Seasons_Stats.csv')

# Display first five rows
df_2017.head()

Unnamed: 0.1,Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,PER,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,0,1950.0,Curly Armstrong,G-F,31.0,FTW,63.0,,,,...,0.705,,,,176.0,,,,217.0,458.0
1,1,1950.0,Cliff Barker,SG,29.0,INO,49.0,,,,...,0.708,,,,109.0,,,,99.0,279.0
2,2,1950.0,Leo Barnhorst,SF,25.0,CHS,67.0,,,,...,0.698,,,,140.0,,,,192.0,438.0
3,3,1950.0,Ed Bartels,F,24.0,TOT,15.0,,,,...,0.559,,,,20.0,,,,29.0,63.0
4,4,1950.0,Ed Bartels,F,24.0,DNN,13.0,,,,...,0.548,,,,20.0,,,,27.0,59.0


Basketball statistics were not widely computed before the modern era, hence the null values. The 3-point shot did not exist before 1979, so we can start there.

In [621]:
# Delete unnecessary column
del df_2017['Unnamed: 0']

# Only select years from 1979 onward
df_2017 = df_2017[df_2017['Year']>=1979]

# Display last five rows
df_2017.tail()

Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,PER,TS%,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
24686,2017.0,Cody Zeller,PF,24.0,CHO,62.0,58.0,1725.0,16.7,0.604,...,0.679,135.0,270.0,405.0,99.0,62.0,58.0,65.0,189.0,639.0
24687,2017.0,Tyler Zeller,C,27.0,BOS,51.0,5.0,525.0,13.0,0.508,...,0.564,43.0,81.0,124.0,42.0,7.0,21.0,20.0,61.0,178.0
24688,2017.0,Stephen Zimmerman,C,20.0,ORL,19.0,0.0,108.0,7.3,0.346,...,0.6,11.0,24.0,35.0,4.0,2.0,5.0,3.0,17.0,23.0
24689,2017.0,Paul Zipser,SF,22.0,CHI,44.0,18.0,843.0,6.9,0.503,...,0.775,15.0,110.0,125.0,36.0,15.0,16.0,40.0,78.0,240.0
24690,2017.0,Ivica Zubac,C,19.0,LAL,38.0,11.0,609.0,17.0,0.547,...,0.653,41.0,118.0,159.0,30.0,14.0,33.0,30.0,66.0,284.0


In [622]:
# Display info
df_2017.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 19271 entries, 5382 to 24690
Data columns (total 52 columns):
Year      19271 non-null float64
Player    19271 non-null object
Pos       19271 non-null object
Age       19271 non-null float64
Tm        19271 non-null object
G         19271 non-null float64
GS        18233 non-null float64
MP        19271 non-null float64
PER       19266 non-null float64
TS%       19193 non-null float64
3PAr      18839 non-null float64
FTr       19181 non-null float64
ORB%      19266 non-null float64
DRB%      19266 non-null float64
TRB%      19266 non-null float64
AST%      19266 non-null float64
STL%      19266 non-null float64
BLK%      19266 non-null float64
TOV%      19209 non-null float64
USG%      19266 non-null float64
blanl     0 non-null float64
OWS       19271 non-null float64
DWS       19271 non-null float64
WS        19271 non-null float64
WS/48     19266 non-null float64
blank2    0 non-null float64
OBPM      19271 non-null float64
DBPM    

#### 2018 NBA Stats

The 2018 NBA season recently finished. I used the same link as Dr. Guillermo, https://www.basketball-reference.com/, to scrape the 2018 statistics.

In [623]:
# Read html file
df_2018, = pd.read_html("https://www.basketball-reference.com/leagues/NBA_2018_totals.html", header=0)

# Convert to csv file
df_2018.to_csv("df_2018.csv", index=False)

# Display first five rows
df_2018.head()

Unnamed: 0,Rk,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,1,Alex Abrines,SG,24,OKC,75,8,1134,115,291,...,0.848,26,88,114,28,38,8,25,124,353
1,2,Quincy Acy,PF,27,BRK,70,8,1359,130,365,...,0.817,40,216,256,57,33,29,60,149,411
2,3,Steven Adams,C,24,OKC,76,76,2487,448,712,...,0.557,384,301,685,88,92,78,128,215,1056
3,4,Bam Adebayo,C,20,MIA,69,19,1368,174,340,...,0.721,118,263,381,101,32,41,66,138,477
4,5,Arron Afflalo,SG,32,ORL,53,3,682,65,162,...,0.846,4,62,66,30,4,9,21,56,179


Since there is no column for 'Year', I need to add one.

In [624]:
# Delete unnecessary column
del df_2018['Rk']

# Add column for year, place at index 0
df_2018.insert(0, 'Year', 2018.0)

# Display last five rows
df_2018.tail()

Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
685,2018.0,Tyler Zeller,C,28,BRK,42,33,703,125,229,...,0.667,63,131,194,28,8,21,35,78,300
686,2018.0,Tyler Zeller,C,28,MIL,24,1,406,62,105,...,0.895,47,64,111,19,7,14,12,48,141
687,2018.0,Paul Zipser,SF,23,CHI,54,12,824,81,234,...,0.731,13,118,131,46,20,15,43,86,218
688,2018.0,Ante Zizic,C,21,CLE,32,2,214,49,67,...,0.724,24,36,60,5,2,13,11,30,119
689,2018.0,Ivica Zubac,C,20,LAL,43,0,410,61,122,...,0.765,45,78,123,25,8,15,26,47,161


In [625]:
# Display info
df_2018.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 690 entries, 0 to 689
Data columns (total 30 columns):
Year      690 non-null float64
Player    690 non-null object
Pos       690 non-null object
Age       690 non-null object
Tm        690 non-null object
G         690 non-null object
GS        690 non-null object
MP        690 non-null object
FG        690 non-null object
FGA       690 non-null object
FG%       686 non-null object
3P        690 non-null object
3PA       690 non-null object
3P%       625 non-null object
2P        690 non-null object
2PA       690 non-null object
2P%       672 non-null object
eFG%      686 non-null object
FT        690 non-null object
FTA       690 non-null object
FT%       632 non-null object
ORB       690 non-null object
DRB       690 non-null object
TRB       690 non-null object
AST       690 non-null object
STL       690 non-null object
BLK       690 non-null object
TOV       690 non-null object
PF        690 non-null object
PTS       690 non-null o

#### Concatenating DataFrames

Before concatenating DataFrames, I select relevant columns for computing 3-point statistics.

In [626]:
# Select relevant columns
tp_2017 = df_2017[['Year', 'Tm', 'Player', 'G','MP', 'PTS', '3P', '3PA', '3P%']]
tp_2018 = df_2018[['Year', 'Tm', 'Player', 'G','MP', 'PTS', '3P', '3PA', '3P%']]

# Concatenate dataframes
tp = pd.concat([tp_2017, tp_2018], ignore_index=True)

# Show last five rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%
19956,2018.0,BRK,Tyler Zeller,42,703,300,10,26,0.385
19957,2018.0,MIL,Tyler Zeller,24,406,141,0,2,0.0
19958,2018.0,CHI,Paul Zipser,54,824,218,37,110,0.336
19959,2018.0,CLE,Ante Zizic,32,214,119,0,0,
19960,2018.0,LAL,Ivica Zubac,43,410,161,0,1,0.0


#### Column Consistency

In [627]:
# Display info
tp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19961 entries, 0 to 19960
Data columns (total 9 columns):
Year      19961 non-null float64
Tm        19961 non-null object
Player    19961 non-null object
G         19961 non-null object
MP        19961 non-null object
PTS       19961 non-null object
3P        19617 non-null object
3PA       19617 non-null object
3P%       16041 non-null object
dtypes: float64(1), object(8)
memory usage: 1.4+ MB


With the exception of 'Year', the statistical columns are stored as objects (strings) rather than numbers. They must be converted to floats for mathematical operations.

In [628]:
# Convert numeric columns to floats
tp.G = pd.to_numeric(tp.G, errors='coerce')
tp.MP = pd.to_numeric(tp.MP, errors='coerce')
tp.PTS = pd.to_numeric(tp.PTS, errors='coerce')
tp['3P'] = pd.to_numeric(tp['3P'], errors='coerce')
tp['3PA'] = pd.to_numeric(tp['3PA'], errors='coerce')
tp['3P%'] = pd.to_numeric(tp['3P%'], errors='coerce')

# Check columns
tp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19961 entries, 0 to 19960
Data columns (total 9 columns):
Year      19961 non-null float64
Tm        19961 non-null object
Player    19961 non-null object
G         19935 non-null float64
MP        19935 non-null float64
PTS       19935 non-null float64
3P        19591 non-null float64
3PA       19591 non-null float64
3P%       16015 non-null float64
dtypes: float64(7), object(2)
memory usage: 1.4+ MB
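
The six near-identical conversions can also be written in one step. The following is a minimal sketch on a toy frame: the rows are made up for illustration (including one stray non-numeric row, which is one way such nulls can arise), and `numeric_cols` is a name introduced here rather than part of the notebook.

```python
import pandas as pd

# Toy data: numbers stored as strings, plus one stray non-numeric row
toy = pd.DataFrame({
    'Player': ['A', 'Player', 'B'],
    'G': ['75', 'G', '70'],
    '3P': ['115', '3P', '40'],
    '3P%': ['0.380', '3P%', None],
})

# Hypothetical list of the columns that should be numeric
numeric_cols = ['G', '3P', '3P%']

# Convert every listed column at once; unparseable entries become NaN
toy[numeric_cols] = toy[numeric_cols].apply(pd.to_numeric, errors='coerce')

print(toy.dtypes)
print(toy)
```

Comparing the non-null counts before and after the conversion, as the two `info()` outputs above do, shows how many entries were coerced to NaN.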


#### Minimum Requirements

It's not necessary to examine data from all players. If a player was never recorded as taking a 3-pointer, he can be excluded from the DataFrame. Players who only took a few 3's may also be excluded. The purpose of the minimum requirements is to eliminate non-3-point shooters and very low outliers. My minimum requirements are less stringent than other NBA "qualified" statistics online. See, for instance, https://stats.nba.com/help/statminimums/.

In [629]:
# Choose players with more than 20 3's per season
tp = tp[(tp['3P'] > 20)]

# Choose players with more than 320 minutes played per season
tp = tp[(tp['MP'] > 320)]

# Choose players with more than 41 games per season
tp = tp[(tp['G'] > 41)]

# Display last 5 rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%
19948,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366
19951,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379
19952,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377
19953,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32
19958,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336


In [630]:
tp.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5036 entries, 354 to 19958
Data columns (total 9 columns):
Year      5036 non-null float64
Tm        5036 non-null object
Player    5036 non-null object
G         5036 non-null float64
MP        5036 non-null float64
PTS       5036 non-null float64
3P        5036 non-null float64
3PA       5036 non-null float64
3P%       5036 non-null float64
dtypes: float64(7), object(2)
memory usage: 393.4+ KB


Now every column has the same number of non-null values.

#### Points Per Possession

The last piece of Data Wrangling is points per possession. It will be used to compute the expected value of points each time a team has the ball. I obtained the team ratings at https://www.basketball-reference.com/leagues/NBA_stats.html.

In [631]:
# Read html file
df_teams, = pd.read_html("https://www.basketball-reference.com/leagues/NBA_stats.html", header=0)

# Display first five rows
df_teams.head()

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Per Game,Shooting,Advanced,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 22,Unnamed: 23,Unnamed: 24,Unnamed: 25,Unnamed: 26,Unnamed: 27,Unnamed: 28,Unnamed: 29,Unnamed: 30,Unnamed: 31
0,Rk,Season,Lg,Age,Ht,Wt,G,MP,FG,FGA,...,PTS,FG%,3P%,FT%,Pace,eFG%,TOV%,ORB%,FT/FGA,ORtg
1,1,2017-18,NBA,26.4,6-7,218,1230,241.4,39.6,86.1,...,106.3,.460,.362,.767,97.3,.521,13.0,22.3,.193,108.6
2,2,2016-17,NBA,26.6,6-7,220,1230,241.6,39.0,85.4,...,105.6,.457,.358,.772,96.4,.514,12.7,23.3,.209,108.8
3,3,2015-16,NBA,26.7,6-7,221,1230,241.8,38.2,84.6,...,102.7,.452,.354,.757,95.8,.502,13.2,23.8,.209,106.4
4,4,2014-15,NBA,26.7,6-7,222,1230,242.0,37.5,83.6,...,100.0,.449,.350,.750,93.9,.496,13.3,25.1,.205,105.6


In [632]:
# Drop first row
df_teams.drop(df_teams.index[0], inplace=True)

# Choose relevant columns
df_PPP = df_teams[['Unnamed: 1','Unnamed: 31']]

# Rename columns
df_PPP.columns = ['Year', 'PPP']

# Show first five rows
df_PPP.head()

Unnamed: 0,Year,PPP
1,2017-18,108.6
2,2016-17,108.8
3,2015-16,106.4
4,2014-15,105.6
5,2013-14,106.6


In [633]:
# Show column info
df_PPP.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 78 entries, 1 to 78
Data columns (total 2 columns):
Year    75 non-null object
PPP     48 non-null object
dtypes: object(2)
memory usage: 1.8+ KB


In [634]:
# Work on an independent copy so the assignments below modify df_PPP itself
df_PPP = df_PPP.copy()

# Convert year column to year listed before hyphen
df_PPP['Year'] = df_PPP['Year'].str.split('-').str[0]

# Convert columns to numbers
df_PPP['Year'] = pd.to_numeric(df_PPP['Year'], errors='coerce')
df_PPP['PPP'] = pd.to_numeric(df_PPP['PPP'], errors='coerce')

# Drop NaN values
df_PPP = df_PPP.dropna()

# Add 1 to each year, since NBA seasons are marked by the second year, not the first
df_PPP['Year'] = df_PPP['Year'] + 1

# Offensive rating is defined as points per 100 possessions
# Divide by 100 to convert to points per possession
df_PPP['PPP'] = df_PPP['PPP'] / 100

# View DataFrame
df_PPP


Unnamed: 0,Year,PPP
1,2018.0,1.086
2,2017.0,1.088
3,2016.0,1.064
4,2015.0,1.056
5,2014.0,1.066
6,2013.0,1.058
7,2012.0,1.046
8,2011.0,1.073
9,2010.0,1.076
10,2009.0,1.083
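
The two conversions in this step (season label to year, offensive rating to points per possession) can be sketched in plain Python. The helper names `season_end_year` and `rating_to_ppp` are introduced here for illustration and do not appear in the notebook.

```python
def season_end_year(label):
    """Return the year in which a season such as '2017-18' ends."""
    start_year = int(label.split('-')[0])  # text before the hyphen
    return start_year + 1


def rating_to_ppp(offensive_rating):
    """Convert points per 100 possessions to points per possession."""
    return offensive_rating / 100


# Season labels in the format used by the league table
for label in ['2017-18', '2016-17', '1999-00']:
    print(label, '->', season_end_year(label))

# Offensive rating for 2017-18, taken from the table above
print(rating_to_ppp(108.6))
```

Taking the part before the hyphen and adding 1 avoids parsing the two-digit suffix, so a label such as '1999-00' needs no special handling.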


## New NBA Stats

### Expected Minutes Before 3's

This first group of statistics computes the number of minutes players are on the court before attempting and making 3's.

#### AM3A : Average Minutes per 3-point Attempt

A player's Average Minutes per 3-Point Attempt is total minutes played divided by total 3-pointers attempted.

In [635]:
# Define AM3A, Average Minutes per 3-point Attempt 
tp['AM3A'] = tp['MP'] / tp['3PA']

# Show last five entrants
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A
19948,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366,9.366013
19951,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379,8.454545
19952,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006
19953,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32,14.403315
19958,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336,7.490909


#### EM3A : Expected Minutes before 3-point Attempt

If a player's attempts were spread evenly over his minutes, the wait from a random moment on the court until his next attempt would range from zero up to the full interval, so the expected wait sits at the halfway mark. Will Nick Young (listed above) take a 3 as soon as he checks in, or after 4.27 minutes? On average, about halfway between, at roughly 2.14 minutes. This is his expected minutes played before attempting a 3.
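
The halving step can be checked with a quick simulation. The sketch below rests on an assumption that the data does not confirm: attempts arrive exactly every AM3A minutes, and the moment we start watching falls uniformly at random within one of those intervals.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

am3a = 1393 / 326  # Nick Young, 2018: minutes played per 3-point attempt
trials = 100_000

# Assumption: attempts come exactly every `am3a` minutes, and the time
# already elapsed since the previous attempt is uniform on [0, am3a].
waits = [am3a - random.uniform(0, am3a) for _ in range(trials)]
mean_wait = sum(waits) / trials

print('AM3A / 2      :', am3a / 2)
print('simulated mean:', mean_wait)
```

Under a different model the factor changes. For example, if attempts arrived as a memoryless (Poisson) process, the expected wait would equal the full average interval rather than half of it.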

In [636]:
# Define EM3A, Expected Minutes before 3-point Attempt
tp['EM3A'] = tp['AM3A'] / 2

# Sort DataFrame by new category
tp_EM3A = tp.sort_values('EM3A', ascending=True)

# Reset index
tp_EM3A = tp_EM3A.reset_index(drop=True)

# Start index at 1 instead of 0
tp_EM3A.index = tp_EM3A.index + 1

# View players who attempt 3s faster than anyone in NBA history
tp_EM3A.head(20)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A
1,2018.0,ORL,Marreese Speights,52.0,675.0,402.0,86.0,233.0,0.369,2.896996,1.448498
2,2018.0,TOR,C.J. Miles,70.0,1337.0,699.0,164.0,454.0,0.361,2.944934,1.472467
3,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,3.047404,1.523702
4,2015.0,DAL,Charlie Villanueva,64.0,678.0,403.0,83.0,221.0,0.376,3.067873,1.533937
5,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745
6,2017.0,MEM,Troy Daniels,67.0,1183.0,551.0,138.0,355.0,0.389,3.332394,1.666197
7,2017.0,GSW,Stephen Curry,79.0,2638.0,1999.0,324.0,789.0,0.411,3.343473,1.671736
8,2015.0,TOT,Troy Daniels,47.0,397.0,176.0,43.0,118.0,0.364,3.364407,1.682203
9,2017.0,HOU,Eric Gordon,75.0,2323.0,1217.0,246.0,661.0,0.372,3.514372,1.757186
10,2018.0,CHO,Malik Monk,63.0,854.0,421.0,83.0,243.0,0.342,3.514403,1.757202


Statistical Notes<ul>
    <li> Many players on the list come off the bench. EM3A does not distinguish between starters and reserves.</li>
    <li> Most top performers are from the last few years, due to the meteoric rise of NBA 3-pointers. </li>
     <li> Joe Hassett from 1982 is a shocker! </li>
    </ul>

#### AM3 : Average Minutes per 3-Pointer

AM3 computes the average minutes played per 3-pointer made.

In [637]:
# Define AM3, Average Minutes per 3-pointer made
tp['AM3'] = tp['MP']/tp['3P']

# Show last five rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3
19948,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366,9.366013,4.683007,25.589286
19951,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379,8.454545,4.227273,22.32
19952,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006,2.136503,11.325203
19953,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32,14.403315,7.201657,44.948276
19958,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336,7.490909,3.745455,22.27027


#### EM3 : Expected Minutes Before a 3

This is my favorite statistic of the group. It's how long a player is expected to be on the court before making a 3. EM3 is AM3 divided by two.

In [638]:
# Define EM3, Expected Minutes before 3-pointer
tp['EM3'] = tp['AM3'] / 2

# Sort DataFrame by new category
tp_EM3 = tp.sort_values('EM3', ascending=True)

# Reset index
tp_EM3 = tp_EM3.reset_index(drop=True)

# Start index at 1 instead of 0
tp_EM3.index = tp_EM3.index + 1

# Display top twenty seasons of all-time
tp_EM3.head(20)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3,EM3
1,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,3.047404,1.523702,6.716418,3.358209
2,2012.0,NYK,Steve Novak,54.0,1020.0,477.0,133.0,282.0,0.472,3.617021,1.808511,7.669173,3.834586
3,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698
4,2018.0,ORL,Marreese Speights,52.0,675.0,402.0,86.0,233.0,0.369,2.896996,1.448498,7.848837,3.924419
5,2016.0,CHO,Troy Daniels,43.0,476.0,242.0,59.0,122.0,0.484,3.901639,1.95082,8.067797,4.033898
6,2017.0,GSW,Stephen Curry,79.0,2638.0,1999.0,324.0,789.0,0.411,3.343473,1.671736,8.141975,4.070988
7,2018.0,TOR,C.J. Miles,70.0,1337.0,699.0,164.0,454.0,0.361,2.944934,1.472467,8.152439,4.07622
8,2015.0,DAL,Charlie Villanueva,64.0,678.0,403.0,83.0,221.0,0.376,3.067873,1.533937,8.168675,4.084337
9,2017.0,MEM,Troy Daniels,67.0,1183.0,551.0,138.0,355.0,0.389,3.332394,1.666197,8.572464,4.286232
10,2018.0,PHO,Troy Daniels,79.0,1622.0,703.0,183.0,458.0,0.4,3.541485,1.770742,8.863388,4.431694


EM3 Statistical Notes:<ul>
    <li> EM3 measures how quickly shooters make 3-pointers upon taking the court.</li>
    <li> More restrictive minimum requirements could eliminate reserves. I prefer leaving them in. </li>
    <li> Steph Curry's legendary 2016 MVP season is a clear #1. </li>
    </ul>

I prefer EM3A and EM3 to AM3A and AM3. They are shorter, more informative, and have a better ring. Since AM3A and AM3 are just doubles of EM3A and EM3, they can be eliminated without losing any valuable information.

In [639]:
# Delete extraneous columns
del tp['AM3A'] 
del tp['AM3']

Also, I have reindexed twice, and expect to do so again. It's always better to write a function instead of copying and pasting.

In [640]:
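def reindex_start_1(data):
    """Return a new index for `data` that starts at 1 instead of 0."""

    # Reset index
    data = data.reset_index(drop=True)

    # Start index at 1 instead of 0
    data.index = data.index + 1

    # Return only the index; callers assign it with df.index = reindex_start_1(df)
    return data.index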
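
Note that `reindex_start_1` returns the new index rather than the re-indexed DataFrame, so it is used as `df.index = reindex_start_1(df)`. Here is a minimal sketch on a toy frame (the data is made up), alongside a hypothetical variant, `rank_from_1`, that returns the DataFrame itself:

```python
import pandas as pd


def reindex_start_1(data):
    # Same logic as the notebook's helper: returns the new index only
    data = data.reset_index(drop=True)
    data.index = data.index + 1
    return data.index


def rank_from_1(data):
    # Hypothetical variant (not in the notebook): returns the DataFrame
    out = data.reset_index(drop=True)
    out.index = out.index + 1
    return out


toy = pd.DataFrame({'Player': ['A', 'B', 'C'], 'EM3': [5.2, 3.4, 4.1]})
toy = toy.sort_values('EM3')

# Notebook style: assign the returned index back to the frame
toy.index = reindex_start_1(toy)
print(toy)

# Variant style: chainable, no separate index assignment
ranked = rank_from_1(toy.sort_values('EM3', ascending=False))
print(ranked)
```

Either style works. The variant may read more naturally in a method chain, while the notebook's version keeps the assignment to the index explicit.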

### 3Pave

The 3-point statistics above are compelling, but they do not provide a single metric to rank all 3-point shooters. This is where 3Pave, or 3-point Average, comes in. For each 3-pointer attempted, 3Pave adds what the team gains beyond the expected value on a make and subtracts what the team loses on a miss.

#### Points Per Possession

3Pave depends on the expected value. Should the expected value be points per possession? Or points per field goal attempt? I have chosen points per possession since each time a team has the ball, this is what they are expected to earn. I will use mean points per possession throughout NBA history. The statistic was first computed in 1974.

In [641]:
# Define ev, expected value, as points per possession
ev = df_PPP['PPP'].mean()

# Display ev
print('Avg. Points Per Possession:', ev)

Avg. Points Per Possession: 1.0554222222222218


This is close to what current teams average, about 1.09 in 2017 and 2018.

#### 3Pave Formula

When a player makes a 3-pointer, the team gains an extra 3 points minus the expected value. When a player misses a 3-pointer, the team loses the expected value.

In [642]:
# Compute 3PG, 3-pointers per Game
tp['3PG'] = tp['3P'] / tp['G']

# Compute 3PAG, 3-point Attempts per Game
tp['3PAG'] = tp['3PA'] / tp['G']

# Compute 3-point Misses per Game
tp_misses = tp['3PAG'] - tp['3PG']

# Declare expected value
ev = df_PPP['PPP'].mean()

# Compute 3Pave, 3-point Average
tp['3Pave'] = tp['3PG'] * (3 - ev) - tp_misses * ev

# (3 - ev) is what the team gains per 3-pointer made
# -ev is what the team loses per 3-pointer missed
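
The formula can be checked by hand for a single season. The sketch below uses Stephen Curry's 2016 totals (402 made, 886 attempted, 79 games) and the mean points per possession printed earlier. The function name `three_pave` is introduced here for illustration only.

```python
def three_pave(threes_made, threes_attempted, games, ev):
    """3Pave per game: gains on makes minus losses on misses."""
    makes_pg = threes_made / games
    misses_pg = (threes_attempted - threes_made) / games
    return makes_pg * (3 - ev) - misses_pg * ev


ev = 1.0554222222222218  # mean points per possession printed above

# Stephen Curry, 2016: 402 made, 886 attempted, 79 games
curry_2016 = three_pave(402, 886, 79, ev)
print(round(curry_2016, 6))

# Algebraically the same as 3 * 3PG - ev * 3PAG
shortcut = 3 * 402 / 79 - ev * 886 / 79
print(abs(curry_2016 - shortcut) < 1e-12)
```

The result agrees with the 3Pave of 3.429062 listed for that season in the rankings. The shortcut form also makes the trade-off visible: every make is worth 3 points, and every attempt, made or missed, costs one possession's expected value.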

## 3Pave Rankings

#### The Top 25

In [644]:
# Sort dataframe by 3Pave
tp = tp.sort_values('3Pave', ascending=False)

# Reset index
tp.index = reindex_start_1(tp)

# Display top 25 3-point shooting seasons of all-time
tp.head(25)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave
1,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,1.523702,3.358209,5.088608,11.21519,3.429062
2,2015.0,ATL,Kyle Korver,75.0,2418.0,911.0,221.0,449.0,0.492,2.69265,5.470588,2.946667,5.986667,2.521539
3,2013.0,GSW,Stephen Curry,78.0,2983.0,1786.0,272.0,600.0,0.453,2.485833,5.483456,3.487179,7.692308,2.342906
4,2015.0,GSW,Stephen Curry,80.0,2613.0,1900.0,286.0,646.0,0.443,2.022446,4.568182,3.575,8.075,2.202466
5,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,1.627745,3.846698,4.156863,9.823529,2.102617
6,2016.0,LAC,J.J. Redick,75.0,2097.0,1226.0,200.0,421.0,0.475,2.490499,5.2425,2.666667,5.613333,2.075563
7,2014.0,ATL,Kyle Korver,71.0,2408.0,850.0,185.0,392.0,0.472,3.071429,6.508108,2.605634,5.521127,1.989782
8,1997.0,CHH,Glen Rice,79.0,3362.0,2115.0,207.0,440.0,0.47,3.820455,8.120773,2.620253,5.56962,1.982459
9,2018.0,GSW,Klay Thompson,73.0,2506.0,1461.0,229.0,520.0,0.44,2.409615,5.471616,3.136986,7.123288,1.892883
10,2002.0,MIL,Ray Allen,69.0,2525.0,1503.0,229.0,528.0,0.434,2.391098,5.5131,3.318841,7.652174,1.880247


3Pave Statistical Notes:<ul>
    <li> Steph Curry's legendary MVP season is head and shoulders above the rest, and he dominates the list as a player.</li> 
    <li> 3Pave does a nice job of comparing 3-point shooters over the years. </li>
    <li> 3Pave has real meaning. It conveys the actual points a team gains beyond the average by the player shooting 3-pointers. </li>
    </ul>

#### Weighted

It's telling to use the same measure, mean points per possession, across all years. But is it justifiable? Teams score more points per possession these days, so it could be argued that 3-pointers were more valuable in years past. The expected value can be weighted, by taking the mean points per possession for each given year. 

In [645]:
# Merge df_PPP, dataframe with 'Year' and 'PPP', with tp, the current dataframe
tp = tp.merge(df_PPP, on='Year')

# Declare weighted expected value
evw = tp['PPP']

# Compute 3-point Misses per Game
tp_misses = tp['3PAG'] - tp['3PG']

# Compute 3Pave using weighted expected value
tp['3Pave/w'] = tp['3PG'] * (3 - evw) - tp_misses * evw
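
The merge above attaches each season's PPP to every player row. The following is a minimal sketch on toy frames, with the join key and join type spelled out. The frame names `players` and `ppp` are stand-ins introduced here, and `validate='many_to_one'` is an optional safeguard that is not part of the notebook.

```python
import pandas as pd

# Toy stand-ins for tp and df_PPP (values copied from the tables above)
players = pd.DataFrame({
    'Year': [2016.0, 2018.0, 2018.0],
    'Player': ['Stephen Curry', 'Stephen Curry', 'Klay Thompson'],
    '3PG': [402 / 79, 212 / 51, 229 / 73],
    '3PAG': [886 / 79, 501 / 51, 520 / 73],
})
ppp = pd.DataFrame({'Year': [2018.0, 2016.0], 'PPP': [1.086, 1.064]})

# Explicit key and join type; validate raises if a year repeats in ppp
merged = players.merge(ppp, on='Year', how='inner', validate='many_to_one')

# Per-year expected value, same formula as the notebook cell
evw = merged['PPP']
misses = merged['3PAG'] - merged['3PG']
merged['3Pave/w'] = merged['3PG'] * (3 - evw) - misses * evw

print(merged[['Year', 'Player', 'PPP', '3Pave/w']])
```

With an inner join, any player season whose year is missing from the PPP table is dropped, so comparing row counts before and after the merge is a cheap sanity check.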

#### The Top 25, Weighted

In [646]:
# Keep dataframe tight by eliminating unnecessary columns
tp.drop(['MP', 'PPP'], axis=1, inplace=True)

# Sort dataframe by 3Pave/w
tp = tp.sort_values('3Pave/w', ascending=False)

# Reset index
tp.index = reindex_start_1(tp)

# Display top 25 3-point shooting weighted seasons
tp.head(25)

Unnamed: 0,Year,Tm,Player,G,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave,3Pave/w
1,2016.0,GSW,Stephen Curry,79.0,2375.0,402.0,886.0,0.454,1.523702,3.358209,5.088608,11.21519,3.429062,3.332861
2,2015.0,ATL,Kyle Korver,75.0,911.0,221.0,449.0,0.492,2.69265,5.470588,2.946667,5.986667,2.521539,2.51808
3,2013.0,GSW,Stephen Curry,78.0,1786.0,272.0,600.0,0.453,2.485833,5.483456,3.487179,7.692308,2.342906,2.323077
4,2015.0,GSW,Stephen Curry,80.0,1900.0,286.0,646.0,0.443,2.022446,4.568182,3.575,8.075,2.202466,2.1978
5,2016.0,LAC,J.J. Redick,75.0,1226.0,200.0,421.0,0.475,2.490499,5.2425,2.666667,5.613333,2.075563,2.027413
6,2002.0,MIL,Ray Allen,69.0,1503.0,229.0,528.0,0.434,2.391098,5.5131,3.318841,7.652174,1.880247,1.96
7,2014.0,ATL,Kyle Korver,71.0,850.0,185.0,392.0,0.472,3.071429,6.508108,2.605634,5.521127,1.989782,1.93138
8,2012.0,NYK,Steve Novak,54.0,477.0,133.0,282.0,0.472,1.808511,3.834586,2.462963,5.222222,1.87724,1.926444
9,1997.0,CHH,Glen Rice,79.0,2115.0,207.0,440.0,0.47,3.820455,8.120773,2.620253,5.56962,1.982459,1.917975
10,2004.0,SAC,Peja Stojakovic,81.0,1964.0,240.0,554.0,0.433,2.945848,6.8,2.962963,6.839506,1.670322,1.851037


The values are very close. Some players from earlier eras, like Ray Allen, move up the list, but others, like Glen Rice, actually move down. It depends on how many points per possession the league averaged that year. Consider Klay Thompson's 2018 drop from 9 to 17. Was his 3-point season less valuable because the league scored more per possession that year?

I prefer unweighted as the default metric because it provides one basis of comparison. I do not think that shooters lose value because others have improved. Perhaps a balance between weighted and unweighted would strike the right chord. For now, I will return to unweighted as the default ranking.

In [647]:
# Sort Dataframe by 3Pave, unweighted
tp = tp.sort_values('3Pave', ascending=False)

# Reset index
tp.index = reindex_start_1(tp)

#### 2018 League Leaders

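We can check the league leaders for any given year. Note that even within a single year, weighted and unweighted rankings can differ slightly: the expected value multiplies 3-point attempts per game, so a higher expected value penalizes high-volume shooters more. The table below is sorted by unweighted 3Pave.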

In [649]:
# Create 2018 dataframe
tp_2018 = tp[tp['Year']==2018.0]

# Reset index
tp_2018.index = reindex_start_1(tp_2018)

# Show top 10 3Pave
tp_2018.head(10)

Unnamed: 0,Year,Tm,Player,G,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave,3Pave/w
1,2018.0,GSW,Stephen Curry,51.0,1346.0,212.0,501.0,0.423,1.627745,3.846698,4.156863,9.823529,2.102617,1.802235
2,2018.0,GSW,Klay Thompson,73.0,1461.0,229.0,520.0,0.44,2.409615,5.471616,3.136986,7.123288,1.892883,1.675068
3,2018.0,UTA,Joe Ingles,82.0,940.0,204.0,464.0,0.44,2.778017,6.318627,2.487805,5.658537,1.491269,1.318244
4,2018.0,PHI,J.J. Redick,70.0,1198.0,193.0,460.0,0.42,2.3,5.481865,2.757143,6.571429,1.335797,1.134857
5,2018.0,CLE,Kyle Korver,73.0,672.0,164.0,376.0,0.436,2.093085,4.79878,2.246575,5.150685,1.303579,1.146082
6,2018.0,DET,Reggie Bullock,62.0,698.0,125.0,281.0,0.445,3.081851,6.928,2.016129,4.532258,1.264941,1.126355
7,2018.0,GSW,Kevin Durant,68.0,1792.0,173.0,413.0,0.419,2.81477,6.719653,2.544118,6.073529,1.222215,1.0365
8,2018.0,SAC,Buddy Hield,80.0,1079.0,176.0,408.0,0.431,2.480392,5.75,2.2,5.1,1.217347,1.0614
9,2018.0,DET,Anthony Tolliver,79.0,703.0,159.0,365.0,0.436,2.406849,5.525157,2.012658,4.620253,1.161657,1.02038
10,2018.0,OKC,Paul George,79.0,1734.0,244.0,608.0,0.401,2.377467,5.92418,3.088608,7.696203,1.14308,0.907747
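
Because 3Pave simplifies to 3 * 3PG - ev * 3PAG, changing the expected value shifts each player by an amount proportional to his attempts, so two players from the same season can trade places. A small check using J.J. Redick's and Kyle Korver's 2018 totals from the table above (the helper `three_pave` is a name introduced here for illustration):

```python
def three_pave(made, attempted, games, ev):
    """3Pave per game, written as 3 * 3PG - ev * 3PAG."""
    return (3 * made - ev * attempted) / games


ev_all = 1.0554222222222218  # mean points per possession, all seasons
ev_2018 = 1.086              # league points per possession for 2018

# (3P, 3PA, G) for 2018, copied from the table above
redick = (193, 460, 70)
korver = (164, 376, 73)

for name, line in [('Redick', redick), ('Korver', korver)]:
    print(name,
          round(three_pave(*line, ev_all), 6),
          round(three_pave(*line, ev_2018), 6))
```

Redick leads on the unweighted measure, but his higher attempt rate costs him more once the 2018 expected value is used, and Korver edges ahead on the weighted one.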


The Golden State Warriors dominate the list. What about the Houston Rockets? They have the reputation of being a great 3-point shooting team.

#### 2018 Warriors v Rockets

In [584]:
# Create 2018 dataframe for GSW and HOU
tp_2018_GSW_HOU = tp_2018[(tp_2018['Tm']=='GSW') | (tp_2018['Tm']=='HOU')]

# Display DataFrame with 2018 rankings
tp_2018_GSW_HOU

Unnamed: 0,Year,Tm,Player,G,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave,3Pave/w
1,2018.0,GSW,Stephen Curry,51.0,1346.0,212.0,501.0,0.423,1.627745,3.846698,4.156863,9.823529,2.102617,1.802235
2,2018.0,GSW,Klay Thompson,73.0,1461.0,229.0,520.0,0.44,2.409615,5.471616,3.136986,7.123288,1.892883,1.675068
7,2018.0,GSW,Kevin Durant,68.0,1792.0,173.0,413.0,0.419,2.81477,6.719653,2.544118,6.073529,1.222215,1.0365
46,2018.0,HOU,Chris Paul,58.0,1081.0,144.0,379.0,0.38,2.436675,6.413194,2.482759,6.534483,0.551638,0.351828
50,2018.0,HOU,Ryan Anderson,66.0,617.0,131.0,339.0,0.386,2.544248,6.583969,1.984848,5.136364,0.533513,0.376455
63,2018.0,HOU,James Harden,72.0,2191.0,265.0,722.0,0.367,1.76662,4.813208,3.680556,10.027778,0.458127,0.1515
76,2018.0,HOU,Trevor Ariza,67.0,782.0,170.0,462.0,0.368,2.455628,6.673529,2.537313,6.895522,0.334253,0.123403
80,2018.0,GSW,Nick Young,80.0,581.0,123.0,326.0,0.377,2.136503,5.662602,1.5375,4.075,0.311654,0.18705
101,2018.0,HOU,P.J. Tucker,82.0,502.0,115.0,310.0,0.371,3.679032,9.917391,1.402439,3.780488,0.217306,0.101707
112,2018.0,HOU,Eric Gordon,69.0,1243.0,218.0,608.0,0.359,1.771382,4.940367,3.15942,8.811594,0.178309,-0.09113


Golden State takes the top three spots, while Houston fills most of the remaining places, including the bottom two. It's interesting to note that Eric Gordon is positive or negative depending on whether the column is weighted. Summing 3Pave will give us the winner.

In [585]:
# Sum 3Pave for GSW and HOU
tp_2018_GSW_HOU.groupby('Tm')['3Pave'].sum()

Tm
GSW    4.586971
HOU    2.378259
Name: 3Pave, dtype: float64

Golden State is the clear winner. How does the team rank historically?

#### Best 3-Point Shooting Teams of All-Time

In [586]:
# Eliminate TOT, Total for traded players, from list of NBA Teams
tp_teams = tp[tp['Tm'] != 'TOT']

# Group teams and year by 3Pave, sort in order
tp_teams = tp_teams.groupby(['Tm','Year'])['3Pave'].sum().sort_values(ascending=False)

# Convert top 25 to DataFrame
pd.DataFrame(tp_teams.head(25))

Unnamed: 0_level_0,Unnamed: 1_level_0,3Pave
Tm,Year,Unnamed: 2_level_1
GSW,2016.0,6.37441
GSW,2018.0,4.586971
GSW,2015.0,4.250061
PHO,2006.0,4.203874
PHO,2010.0,3.986143
CHH,1997.0,3.874711
PHO,2007.0,3.837297
MIA,2013.0,3.747401
GSW,2013.0,3.73075
SAS,2014.0,3.441617


It's no surprise that the Warriors take the top 3 spots. The 7-seconds-or-less Suns are close behind. The Charlotte Hornets from '97 are a surprise #6 until one recalls that they had Dell Curry and Glen Rice. Four of the last five NBA champions made the top 10.

#### Best 3-Point Shooting Teams of All-Time, Weighted

In [587]:
# Eliminate TOT, Total for traded players, from list of NBA Teams
tp_teams = tp[tp['Tm'] != 'TOT']

# Group teams and year by weighted 3Pave, and sort in order
tp_teams = tp_teams.groupby(['Tm','Year'])['3Pave/w'].sum().sort_values(ascending=False)

# Convert top 25 to DataFrame
pd.DataFrame(tp_teams.head(25))

Unnamed: 0_level_0,Unnamed: 1_level_0,3Pave/w
Tm,Year,Unnamed: 2_level_1
GSW,2016.0,6.082173
GSW,2015.0,4.233746
PHO,2006.0,4.041437
CHH,1997.0,3.696312
MIA,2013.0,3.687041
GSW,2013.0,3.683147
SAC,2004.0,3.671448
SEA,2001.0,3.630873
PHO,2007.0,3.620564
GSW,2018.0,3.590728


Weighted includes more Spurs teams, the Reggie Miller Pacers team that made the Finals, and a Ray Allen Bucks team. The 73-win Warriors team is still way above the rest.

#### Best of the 90s

In [651]:
# Create 90s DataFrame
tp_90s = tp[(tp['Year']<2000) & (tp['Year']>1989)]

# Reset index
tp_90s.index = reindex_start_1(tp_90s)

# Display top 25 3-point shooting seasons of all-time for the 90s 
tp_90s.head(25)

Rank,Year,Tm,Player,G,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave,3Pave/w
1,1997.0,CHH,Glen Rice,79.0,2115.0,207.0,440.0,0.47,3.820455,8.120773,2.620253,5.56962,1.982459,1.917975
2,1995.0,PHI,Dana Barros,82.0,1686.0,197.0,425.0,0.464,3.903529,8.42132,2.402439,5.182927,1.737141,1.594207
3,1996.0,ORL,Dennis Scott,82.0,1431.0,267.0,628.0,0.425,2.421178,5.694757,3.256098,7.658537,1.685303,1.527707
4,1996.0,WSB,Tim Legler,77.0,726.0,128.0,245.0,0.522,3.622449,6.933594,1.662338,3.181818,1.628851,1.563377
5,1996.0,SAC,Mitch Richmond*,81.0,1872.0,225.0,515.0,0.437,2.860194,6.546667,2.777778,6.358025,1.622933,1.492099
6,1997.0,IND,Reggie Miller*,81.0,1751.0,229.0,536.0,0.427,2.766791,6.475983,2.82716,6.617284,1.497453,1.42084
7,1996.0,CHI,Steve Kerr,82.0,688.0,122.0,237.0,0.515,4.048523,7.864754,1.487805,2.890244,1.412987,1.353512
8,1996.0,NYK,Hubert Davis,74.0,789.0,127.0,267.0,0.476,3.320225,6.980315,1.716216,3.608108,1.340571,1.266324
9,1997.0,SAC,Mitch Richmond*,81.0,2095.0,204.0,477.0,0.428,3.275681,7.659314,2.518519,5.888889,1.340291,1.272111
10,1999.0,MIL,Dell Curry,42.0,423.0,69.0,145.0,0.476,2.97931,6.26087,1.642857,3.452381,1.284852,1.400238


The big 90s shooters are all here: Glen Rice, Reggie Miller, Dell Curry, Dennis Scott, Mitch Richmond, Dale Ellis.
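The helper `reindex_start_1` is defined earlier in the notebook and is not shown in this section. For readers following along, here is a hypothetical stand-in, assuming all it needs to do is number the rows from 1 so that the index reads as a rank. The name `reindex_start_1_sketch` and the toy data are mine, not the notebook's.

```python
import pandas as pd

def reindex_start_1_sketch(df):
    """Hypothetical stand-in: a 1-based index with one label per row of df."""
    return pd.RangeIndex(start=1, stop=len(df) + 1)

# Toy data, illustrative only
toy = pd.DataFrame({'Player': ['A', 'B', 'C'], '3Pave': [0.5, 1.5, 1.0]})

# Sort best-first, then relabel the rows 1, 2, 3, ...
ranked = toy.sort_values('3Pave', ascending=False).copy()
ranked.index = reindex_start_1_sketch(ranked)
print(ranked)
```

The index only works as a rank if the frame is already sorted by 3Pave before it is relabeled.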

#### Best of the 80s

In [650]:
# Create 80s dataframe
# The first 3-point shot was recorded in the 1980 season.
tp_80s = tp[tp['Year']<1990]

# Reset index
tp_80s.index = reindex_start_1(tp_80s)

# Display top 25
tp_80s.head(25)

Rank,Year,Tm,Player,G,PTS,3P,3PA,3P%,EM3A,EM3,3PG,3PAG,3Pave,3Pave/w
1,1989.0,SEA,Dale Ellis,82.0,2253.0,162.0,339.0,0.478,4.705015,9.845679,1.97561,4.134146,1.563559,1.47022
2,1988.0,TOT,Craig Hodges,66.0,629.0,86.0,175.0,0.491,4.128571,8.401163,1.30303,2.651515,1.110623,1.045455
3,1988.0,MIL,Craig Hodges,43.0,397.0,55.0,118.0,0.466,4.165254,8.936364,1.27907,2.744186,0.940934,0.873488
4,1988.0,BOS,Danny Ainge,81.0,1270.0,148.0,357.0,0.415,4.226891,10.195946,1.82716,4.407407,0.829806,0.721481
5,1989.0,CLE,Mark Price,75.0,1414.0,93.0,211.0,0.441,6.464455,14.666667,1.24,2.813333,0.750745,0.687227
6,1988.0,CLE,Mark Price,80.0,1279.0,72.0,148.0,0.486,8.871622,18.236111,0.9,1.85,0.747469,0.702
7,1987.0,BOS,Danny Ainge,71.0,1053.0,85.0,192.0,0.443,6.507812,14.7,1.197183,2.704225,0.73745,0.662873
8,1989.0,CHI,Craig Hodges,49.0,490.0,71.0,168.0,0.423,3.309524,7.830986,1.44898,3.428571,0.728348,0.650939
9,1986.0,MIL,Craig Hodges,66.0,716.0,73.0,162.0,0.451,5.367284,11.910959,1.106061,2.454545,0.7276,0.686909
10,1989.0,MIA,Jon Sundvold,68.0,709.0,48.0,92.0,0.522,7.271739,13.9375,0.705882,1.352941,0.689723,0.659176


The 80s belong to Craig Hodges, Mark Price, Larry Bird, and more Dale Ellis. The NBA's first 3-point shot was made in 1979, at the start of the 1979-80 season (recorded here as 1980), so we can't go back any further.

#### Career Totals

Which players gained the most points by shooting 3s over their entire careers?

In [590]:
# Create 3Pave/c using same formula as 3Pave, but use totals instead of per game
tp['3Pave/c'] = tp['3P'] * (3 - ev) - (tp['3PA'] - tp['3P']) * ev

# Group by player, and sum over their career
tp_player = tp.groupby('Player')['3Pave/c'].sum()

# Order from the top
tp_player = tp_player.sort_values(ascending=False)

# Convert to DataFrame for nicer viewing
tp_player = pd.DataFrame(tp_player)

# Display top 25
tp_player.head(25)

Player,3Pave/c
Kyle Korver,1245.264622
Stephen Curry,1199.245644
Ray Allen,1119.032
Steve Nash,890.282311
Reggie Miller*,834.531467
Klay Thompson,775.436578
J.J. Redick,677.701689
Dale Ellis,649.176044
Chauncey Billups,642.6984
Peja Stojakovic,603.778489
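
To make the totals formula concrete, here is a small worked example using Glen Rice's 1997 line from the 90s table (207 makes, 440 attempts, 79 games). The value of `ev` is set earlier in the notebook and is not shown in this section, so `EV_APPROX` below is an assumption: I back-solved it from the per-game 3Pave in that same row, and it is only approximate. The function name is also mine.

```python
def three_pave_total(made, attempted, ev):
    """Points gained on makes minus points lost on misses, relative to ev."""
    missed = attempted - made
    return made * (3 - ev) - missed * ev

# Assumption: approximate ev back-solved from the table, not the notebook's stored value
EV_APPROX = 1.0554

rice_total = three_pave_total(made=207, attempted=440, ev=EV_APPROX)
rice_per_game = rice_total / 79  # should land close to the 3Pave column for that row

# The formula simplifies to 3 * made - ev * attempted
simplified = 3 * 207 - EV_APPROX * 440

print(round(rice_total, 1), round(rice_per_game, 2))
```

The simplified form shows why volume and accuracy both matter: every make adds 3 points, and every attempt, made or missed, costs `ev`.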


Kyle Korver beats out Ray Allen and Reggie Miller. (Note that stats are for the regular season only.) Steph Curry and Klay Thompson are the only players on this list who are neither retired nor at the end of their careers. It's amazing to think that Steph Curry is already number two. Let's compare this to traditional 3-pointers made.

In [591]:
# Convert to DataFrame: Group by player, sum over 3 pointers, sort values, display top 25
pd.DataFrame(tp.groupby('Player')['3P'].sum().sort_values(ascending=False).head(25))

Player,3P
Ray Allen,3096.0
Reggie Miller*,2560.0
Tim Hardaway,2289.0
Kyle Korver,2286.0
Vince Carter,2284.0
Jason Terry,2243.0
Jamal Crawford,2234.0
Paul Pierce,2128.0
Joe Johnson,2087.0
Stephen Curry,2074.0


Most fans would agree that Kobe Bryant, LeBron James, and Nick Van Exel are not as good at 3-pointers as Glen Rice, Dell Curry, and Steve Kerr. Finally, we have a statistic to prove it.
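One caveat when summing by player: `tp` holds both per-team rows and a combined `TOT` row for seasons in which a player was traded (Craig Hodges in 1988 appears both ways in the 80s table), so a plain career sum can count those seasons twice. The sketch below shows one possible way to guard against that on toy data, assuming the `TOT` row already aggregates the team rows for that season. It is not how the totals above were computed.

```python
import pandas as pd

# Toy rows: player X was traded in 2016, so that season has two team rows plus TOT.
toy = pd.DataFrame({
    'Player': ['X', 'X', 'X', 'X', 'Y'],
    'Year':   [2015, 2016, 2016, 2016, 2016],
    'Tm':     ['AAA', 'TOT', 'AAA', 'BBB', 'CCC'],
    '3P':     [100, 150, 90, 60, 120],
})

# Plain sum counts X's 2016 season twice
naive = toy.groupby('Player')['3P'].sum()

# Flag every row of a player-season that has a TOT row
is_tot = toy['Tm'].eq('TOT')
has_tot = is_tot.groupby([toy['Player'], toy['Year']]).transform('any')

# Keep the TOT row where one exists, otherwise keep the single team row
deduped = toy[is_tot | ~has_tot]
career = deduped.groupby('Player')['3P'].sum()

print(naive['X'], career['X'])
```

Keeping `TOT` rather than dropping it is a judgment call; it still works if some of a traded player's per-team rows were filtered out earlier.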

#### Career Averages

In [592]:
# Count number of seasons
tp['seasons'] = tp.groupby('Player')['3Pave'].transform('count')

# Require at least 4 seasons
tp_seasons = tp[tp['seasons']>=4]

# Compute each player's average 3Pave per season (sum divided by number of seasons)
tp_av = tp_seasons.groupby('Player')['3Pave'].mean()

# Sort and display top 25 as DataFrame
pd.DataFrame(tp_av.sort_values(ascending=False).head(25))

Player,3Pave
Stephen Curry,2.001557
Klay Thompson,1.437085
Kyle Korver,1.079106
J.J. Redick,0.866753
C.J. McCollum,0.846514
Ray Allen,0.831735
Hubert Davis,0.831196
Steve Novak,0.787593
Steve Nash,0.739509
Peja Stojakovic,0.739333


Steph Curry doubles everyone on the list except for teammate Klay Thompson and Kyle Korver. I think we can safely answer the questions posed at the beginning of this notebook. The Warriors are the best 3-point shooting team of all-time because they have the two best 3-point shooters of all-time. The statistics verify what every fan knows to be true: Steph Curry is the greatest 3-point shooter of all-time.
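For readers who want the season count next to the average, the career-average step can also be sketched as a single aggregation. The toy data below is made up; only the four-season minimum comes from the notebook.

```python
import pandas as pd

# Toy per-season rows, illustrative only: A has 4 seasons, B has 2, C has 5
toy = pd.DataFrame({
    'Player': ['A'] * 4 + ['B'] * 2 + ['C'] * 5,
    '3Pave':  [2.0, 1.0, 1.5, 1.5, 3.0, 3.0, 0.5, 0.5, 1.0, 1.0, 0.5],
})

# Mean 3Pave and number of seasons per player in one pass
summary = toy.groupby('Player')['3Pave'].agg(['mean', 'count'])

# Require at least 4 seasons, then rank by career average
qualified = summary[summary['count'] >= 4].sort_values('mean', ascending=False)
print(qualified)
```

Player B has the highest average in the toy data but is dropped for having only two seasons, which is the point of the minimum.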

## Conclusion

Three new NBA statistics have been presented: EM3A, EM3, and 3Pave. EM3A, Expected Minutes before a 3-point Attempt, could be of value to coaches preparing for opponents and working with their own players. EM3, Expected Minutes before a 3, is a fun statistic that could be used for similar reasons. 3Pave, 3-point Average, is a powerful statistic that provides a single number to rank 3-point shooters across all seasons.

3Pave rewards players for making 3-point shots and penalizes them for missing. Players who make a lot of 3s but shoot a low percentage are exposed as making only slight contributions to their teams. Players who shoot a high percentage need to make a high volume to be competitive. 3Pave rankings are statistically verifiable while simultaneously communicating valuable information.

3Pave reveals the net gain in points beyond the league average that a player adds to his team by shooting 3's. It can be weighted, summed, or displayed as per game averages. It can be used as a barometer to determine whether a player should be encouraged or discouraged from shooting 3's. Any positive score is a plus for the team, while negative scores are a detriment.
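The sign of 3Pave comes down to a break-even percentage. Per attempt, a shooter who makes a fraction `pct` of his 3s nets `pct * (3 - ev) - (1 - pct) * ev`, which simplifies to `3 * pct - ev`, so the score is positive exactly when `pct` exceeds `ev / 3`. Here is a minimal sketch of that check; the function names and the value of `ev_example` are illustrative assumptions, not the notebook's.

```python
def three_pave_per_attempt(pct, ev):
    """Expected net points per 3-point attempt for a shooter hitting pct."""
    return pct * (3 - ev) - (1 - pct) * ev

def break_even_pct(ev):
    """3-point percentage at which 3Pave is zero."""
    return ev / 3

# Illustrative points per possession, not the notebook's value
ev_example = 1.05
threshold = break_even_pct(ev_example)

# Shooters above the threshold help their team; shooters below it hurt
for pct in (0.30, threshold, 0.40):
    print(round(pct, 3), round(three_pave_per_attempt(pct, ev_example), 3))
```

A higher expected value per possession raises the bar, so the same shooter can be a plus in one league or era and a minus in another.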

3Pave can be used to analyze playoff statistics and clutch 3-point shooters. It can be used during basketball seasons past and future to analyze the success of 3-point shooters. It can be used for any league (WNBA, college, high school, etc.), provided that an appropriate expected value, like points per possession, is used in the formula given above.