
# NBA 3-Point Statistics

## Introduction

There has never been a single metric to determine the best NBA 3-Point shooter. Most fans consider two numbers: 3-Point Percentage and 3-Pointers Made. Is there a way to combine these two categories into one?

This Jupyter Notebook reveals a new metric that will rank NBA 3-Point shooters using one number. I considered different ways of combining 3-Point Percentage and 3-Pointers Made before coming up with an empirical solution.


#### References

https://www.kaggle.com/drgilermo/nba-players-stats <br>
https://www.basketball-reference.com/leagues/NBA_2018_totals.html

#### Copyright

Corey J Wade<br>
May 28, 2018

This Jupyter Notebook and the statistics within may be redistributed provided that Corey J Wade is given credit.


## Import NBA Statistics

The following csv file is taken from https://www.kaggle.com/drgilermo/nba-players-stats. When I downloaded the file, it contained standard statistics through 2017. Dr. Guillermo scraped it from https://www.basketball-reference.com/. 

#### NBA Stats Through 2017

In [1]:
# import pandas
import pandas as pd

# open file as dataframe
df_2017 = pd.read_csv('Seasons_Stats.csv')

# display first five rows
df_2017.head()

Unnamed: 0.1,Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,PER,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,0,1950.0,Curly Armstrong,G-F,31.0,FTW,63.0,,,,...,0.705,,,,176.0,,,,217.0,458.0
1,1,1950.0,Cliff Barker,SG,29.0,INO,49.0,,,,...,0.708,,,,109.0,,,,99.0,279.0
2,2,1950.0,Leo Barnhorst,SF,25.0,CHS,67.0,,,,...,0.698,,,,140.0,,,,192.0,438.0
3,3,1950.0,Ed Bartels,F,24.0,TOT,15.0,,,,...,0.559,,,,20.0,,,,29.0,63.0
4,4,1950.0,Ed Bartels,F,24.0,DNN,13.0,,,,...,0.548,,,,20.0,,,,27.0,59.0


Statistics were not widely computed before the modern era, hence the null values.

In [2]:
# delete unnecessary column
del df_2017['Unnamed: 0']

# display last five rows
df_2017.tail()

Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,PER,TS%,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
24686,2017.0,Cody Zeller,PF,24.0,CHO,62.0,58.0,1725.0,16.7,0.604,...,0.679,135.0,270.0,405.0,99.0,62.0,58.0,65.0,189.0,639.0
24687,2017.0,Tyler Zeller,C,27.0,BOS,51.0,5.0,525.0,13.0,0.508,...,0.564,43.0,81.0,124.0,42.0,7.0,21.0,20.0,61.0,178.0
24688,2017.0,Stephen Zimmerman,C,20.0,ORL,19.0,0.0,108.0,7.3,0.346,...,0.6,11.0,24.0,35.0,4.0,2.0,5.0,3.0,17.0,23.0
24689,2017.0,Paul Zipser,SF,22.0,CHI,44.0,18.0,843.0,6.9,0.503,...,0.775,15.0,110.0,125.0,36.0,15.0,16.0,40.0,78.0,240.0
24690,2017.0,Ivica Zubac,C,19.0,LAL,38.0,11.0,609.0,17.0,0.547,...,0.653,41.0,118.0,159.0,30.0,14.0,33.0,30.0,66.0,284.0


As expected, all visible columns are full of data.

#### 2018 NBA Stats

The 2018 NBA season recently finished. I used the same link, https://www.basketball-reference.com/, to scrape the 2018 statistics.

In [3]:
# read html file
df_2018, = pd.read_html("https://www.basketball-reference.com/leagues/NBA_2018_totals.html", header=0)

# save a local copy as a csv file
df_2018.to_csv("tp_2018.csv", index=False)

# display first five rows
df_2018.head()

Unnamed: 0,Rk,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
0,1,Alex Abrines,SG,24,OKC,75,8,1134,115,291,...,0.848,26,88,114,28,38,8,25,124,353
1,2,Quincy Acy,PF,27,BRK,70,8,1359,130,365,...,0.817,40,216,256,57,33,29,60,149,411
2,3,Steven Adams,C,24,OKC,76,76,2487,448,712,...,0.557,384,301,685,88,92,78,128,215,1056
3,4,Bam Adebayo,C,20,MIA,69,19,1368,174,340,...,0.721,118,263,381,101,32,41,66,138,477
4,5,Arron Afflalo,SG,32,ORL,53,3,682,65,162,...,0.846,4,62,66,30,4,9,21,56,179


The 2018 table has no column for year, so I will add one. I plan to concatenate the two dataframes, so their columns need to match.

In [4]:
# delete unnecessary column
del df_2018['Rk']

# add column for year, place at index 0
df_2018.insert(0, 'Year', 2018.0)

# display last five rows
df_2018.tail()

Unnamed: 0,Year,Player,Pos,Age,Tm,G,GS,MP,FG,FGA,...,FT%,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS
685,2018.0,Tyler Zeller,C,28,BRK,42,33,703,125,229,...,0.667,63,131,194,28,8,21,35,78,300
686,2018.0,Tyler Zeller,C,28,MIL,24,1,406,62,105,...,0.895,47,64,111,19,7,14,12,48,141
687,2018.0,Paul Zipser,SF,23,CHI,54,12,824,81,234,...,0.731,13,118,131,46,20,15,43,86,218
688,2018.0,Ante Zizic,C,21,CLE,32,2,214,49,67,...,0.724,24,36,60,5,2,13,11,30,119
689,2018.0,Ivica Zubac,C,20,LAL,43,0,410,61,122,...,0.765,45,78,123,25,8,15,26,47,161


#### Concatenating Dataframes

Since the dataframes have a different number of columns, and I clearly do not need all of the columns since my focus is on 3-pointers, I will select the relevant columns before concatenating.

In [5]:
# select relevant columns
tp_2017 = df_2017[['Year', 'Tm', 'Player', 'G','MP', 'PTS', '3P', '3PA', '3P%']]
tp_2018 = df_2018[['Year', 'Tm', 'Player', 'G','MP', 'PTS', '3P', '3PA', '3P%']]

# concatenate dataframes
tp = pd.concat([tp_2017, tp_2018], ignore_index=True)

# show last five rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%
25376,2018.0,BRK,Tyler Zeller,42,703,300,10,26,0.385
25377,2018.0,MIL,Tyler Zeller,24,406,141,0,2,0.0
25378,2018.0,CHI,Paul Zipser,54,824,218,37,110,0.336
25379,2018.0,CLE,Ante Zizic,32,214,119,0,0,
25380,2018.0,LAL,Ivica Zubac,43,410,161,0,1,0.0
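
The select-then-concatenate step can be sketched on a toy pair of frames. This is a minimal illustration, not the notebook's data: `older`, `newer`, and `shared` are hypothetical names, and the handful of values are copied from rows shown above.

```python
import pandas as pd

# Toy stand-ins for df_2017 and df_2018; values copied from rows shown above
older = pd.DataFrame({
    'Year': [2017.0],
    'Player': ['Paul Zipser'],
    'Tm': ['CHI'],
    'G': [44.0],
    'PER': [6.9],       # extra column the 2018 table does not have
})
newer = pd.DataFrame({
    'Player': ['Paul Zipser'],
    'Tm': ['CHI'],
    'G': [54],
})

# Add the missing 'Year' column at position 0, as in cell In [4]
newer.insert(0, 'Year', 2018.0)

# Keep only the columns both frames share, in one fixed order
shared = ['Year', 'Tm', 'Player', 'G']
combined = pd.concat([older[shared], newer[shared]], ignore_index=True)

print(combined)
```

With `ignore_index=True` the combined frame gets a fresh 0, 1, 2, ... index, which is why the 2018 rows above continue the numbering of the 2017 rows.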


## Editing Dataframe

#### Column Consistency

In [6]:
# display column info
tp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25381 entries, 0 to 25380
Data columns (total 9 columns):
Year      25314 non-null float64
Tm        25314 non-null object
Player    25314 non-null object
G         25314 non-null object
MP        24828 non-null object
PTS       25314 non-null object
3P        19617 non-null object
3PA       19617 non-null object
3P%       16041 non-null object
dtypes: float64(1), object(8)
memory usage: 1.7+ MB


With the exception of 'Year', none of the columns has been read as numeric. The statistical columns must be converted to floats for mathematical operations.

In [7]:
# convert numeric columns to floats
tp.G = pd.to_numeric(tp.G, errors='coerce')
tp.MP = pd.to_numeric(tp.MP, errors='coerce')
tp.PTS = pd.to_numeric(tp.PTS, errors='coerce')
tp['3P'] = pd.to_numeric(tp['3P'], errors='coerce')
tp['3PA'] = pd.to_numeric(tp['3PA'], errors='coerce')
tp['3P%'] = pd.to_numeric(tp['3P%'], errors='coerce')

# check columns
tp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25381 entries, 0 to 25380
Data columns (total 9 columns):
Year      25314 non-null float64
Tm        25314 non-null object
Player    25314 non-null object
G         25288 non-null float64
MP        24802 non-null float64
PTS       25288 non-null float64
3P        19591 non-null float64
3PA       19591 non-null float64
3P%       16015 non-null float64
dtypes: float64(7), object(2)
memory usage: 1.7+ MB
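
Each numeric column has 26 fewer non-null entries after the conversion (for example, G drops from 25314 to 25288), so some entries could not be parsed and were coerced to NaN. The sketch below shows that behavior on a toy column. The `'G'` string is an assumption standing in for something like a repeated header row in the scraped table; I have not confirmed that this is what the 26 entries actually contain.

```python
import pandas as pd

# Toy column of games played; the 'G' string mimics a hypothetical repeated header row
games = pd.Series(['75', '70', 'G', '76'])

# errors='coerce' turns anything unparseable into NaN instead of raising an error
games_numeric = pd.to_numeric(games, errors='coerce')

print(games_numeric.dtype)           # float64
print(games_numeric.isna().sum())    # 1
```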


#### Minimum Requirements

It's not necessary to examine data from all players. For instance, if a player was never recorded as taking a 3-pointer, he should be excluded from the dataframe. The same holds for a player who only took one 3-pointer and made it. The purpose of the minimum requirements is to eliminate non-3-point shooters and low outliers.

In [8]:
# choose players with more than 10 3-pointers made
tp = tp[(tp['3P'] > 10)]

# choose players with more than 320 minutes played
tp = tp[(tp['MP'] > 320)]

# choose players with more than 41 games
tp = tp[(tp['G'] > 41)]

# display last 5 rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%
25368,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366
25371,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379
25372,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377
25373,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32
25378,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336


In [9]:
tp.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5928 entries, 5734 to 25378
Data columns (total 9 columns):
Year      5928 non-null float64
Tm        5928 non-null object
Player    5928 non-null object
G         5928 non-null float64
MP        5928 non-null float64
PTS       5928 non-null float64
3P        5928 non-null float64
3PA       5928 non-null float64
3P%       5928 non-null float64
dtypes: float64(7), object(2)
memory usage: 463.1+ KB
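
The three filters in cell In [8] can also be written as a single boolean mask. Here is a minimal sketch on four rows copied from the tables above; `sample`, `meets_minimums`, and `qualified` are hypothetical names used only for illustration.

```python
import pandas as pd

# Four rows copied from the tables above (3P, MP, and G only)
sample = pd.DataFrame({
    'Player': ['Tyler Zeller', 'Paul Zipser', 'Ante Zizic', 'Joe Young'],
    '3P': [10.0, 37.0, 0.0, 25.0],
    'MP': [703.0, 824.0, 214.0, 558.0],
    'G': [42.0, 54.0, 32.0, 53.0],
})

# Same thresholds as cell In [8], combined with & into one mask
meets_minimums = (sample['3P'] > 10) & (sample['MP'] > 320) & (sample['G'] > 41)
qualified = sample[meets_minimums]

print(qualified['Player'].tolist())
```

Tyler Zeller's Brooklyn stint is dropped because the comparison is strict: exactly 10 made 3-pointers does not satisfy `> 10`.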


## New 3-point Statistics

### Expected Minutes

This first group of statistics computes the number of minutes players are on the court before attempting and making 3's.

#### 1) AM3A : Average Minutes per 3-point Attempt

A player's Average Minutes per 3-Point Attempt is total minutes played divided by total 3-pointers attempted.

In [10]:
# define new column, AM3A: Average Minutes per 3-point Attempt 
tp['AM3A'] = tp['MP'] / tp['3PA']

# show last five entrants
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A
25368,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366,9.366013
25371,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379,8.454545
25372,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006
25373,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32,14.403315
25378,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336,7.490909


#### 2) EM3A : Expected Minutes before 3-point Attempt

If a 3-point attempt is equally likely at any moment within an interval of time, the expected waiting time is the halfway mark. Will Nick Young (listed above) take a 3 as soon as he checks in, or after 4.27 minutes? Under that assumption, his expected value is halfway between, at about 2.14 minutes. This is his expected minutes played before attempting a 3.
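
As a quick check of the halving step, here is a small sketch using Nick Young's 2018 totals from the table above. It assumes the attempt time is spread evenly over the interval, and approximates that with an evenly spaced grid of times; `grid_mean` and `n` are hypothetical names for this illustration.

```python
# Nick Young's 2018 totals, copied from the table above
minutes_played = 1393.0
three_point_attempts = 326.0

am3a = minutes_played / three_point_attempts   # average minutes per attempt
em3a = am3a / 2                                # midpoint of the interval [0, am3a]

# Mean of evenly spaced times across [0, am3a], a stand-in for a uniform distribution
n = 10_001
grid_mean = sum(am3a * i / (n - 1) for i in range(n)) / n

print(round(am3a, 6), round(em3a, 6), round(grid_mean, 6))
```

The grid mean lands on the midpoint, which is all that dividing AM3A by two relies on.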

In [11]:
# define new column, EM3A: Expected Minutes before 3-point Attempt
tp['EM3A'] = tp['AM3A'] / 2

# sort dataframe by new category
tp_EM3A = tp.sort_values('EM3A', ascending=True)

# view players who attempt 3s faster than anyone in NBA history
tp_EM3A.head(20)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A
25279,2018.0,ORL,Marreese Speights,52.0,675.0,402.0,86.0,233.0,0.369,2.896996,1.448498
25119,2018.0,TOR,C.J. Miles,70.0,1337.0,699.0,164.0,454.0,0.361,2.944934,1.472467
23633,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,3.047404,1.523702
23463,2015.0,DAL,Charlie Villanueva,64.0,678.0,403.0,83.0,221.0,0.376,3.067873,1.533937
24842,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745
24217,2017.0,MEM,Troy Daniels,67.0,1183.0,551.0,138.0,355.0,0.389,3.332394,1.666197
24216,2017.0,GSW,Stephen Curry,79.0,2638.0,1999.0,324.0,789.0,0.411,3.343473,1.671736
23002,2015.0,TOT,Troy Daniels,47.0,397.0,176.0,43.0,118.0,0.364,3.364407,1.682203
24290,2017.0,HOU,Eric Gordon,75.0,2323.0,1217.0,246.0,661.0,0.372,3.514372,1.757186
25130,2018.0,CHO,Malik Monk,63.0,854.0,421.0,83.0,243.0,0.342,3.514403,1.757202


EM3A Statistical Notes:<ul>
    <li> Many players on the list come off the bench. EM3A does not distinguish between starters and reserves.</li>
    <li> Most top performers are from the last few years, due to the meteoric rise of NBA 3-pointers. </li>
     <li> Joe Hassett from 1982 is a shocker. </li>
    </ul>

#### 3) AM3P : Average Minutes per 3-Pointer

AM3P is the player's total minutes divided by the number of 3-pointers made. This statistic is more interesting because we are now looking at makes instead of just attempts.

In [12]:
# define new column, AM3P: Average Minutes per 3-pointer
tp['AM3P'] = tp['MP']/tp['3P']

# show last five rows
tp.tail()

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P
25368,2018.0,TOR,Delon Wright,69.0,1433.0,555.0,56.0,153.0,0.366,9.366013,4.683007,25.589286
25371,2018.0,IND,Joe Young,53.0,558.0,207.0,25.0,66.0,0.379,8.454545,4.227273,22.32
25372,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006,2.136503,11.325203
25373,2018.0,IND,Thaddeus Young,81.0,2607.0,955.0,58.0,181.0,0.32,14.403315,7.201657,44.948276
25378,2018.0,CHI,Paul Zipser,54.0,824.0,218.0,37.0,110.0,0.336,7.490909,3.745455,22.27027


#### 4) EM3P : Expected Minutes before 3-Pointer

This is the best statistic of the group. It's how long a player is expected to be on the court before making a 3. As before, Average Minutes per 3-pointer is simply divided by two.

In [13]:
# define new column, EM3P: Expected Minutes before 3-pointer
tp['EM3P'] = tp['AM3P'] / 2

# sort dataframe by new category
tp_EM3P = tp.sort_values('EM3P', ascending=True)

# display top twenty seasons of all-time
tp_EM3P.head(20)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P
23633,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,3.047404,1.523702,6.716418,3.358209
21504,2012.0,NYK,Steve Novak,54.0,1020.0,477.0,133.0,282.0,0.472,3.617021,1.808511,7.669173,3.834586
24842,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698
25279,2018.0,ORL,Marreese Speights,52.0,675.0,402.0,86.0,233.0,0.369,2.896996,1.448498,7.848837,3.924419
23634,2016.0,CHO,Troy Daniels,43.0,476.0,242.0,59.0,122.0,0.484,3.901639,1.95082,8.067797,4.033898
24216,2017.0,GSW,Stephen Curry,79.0,2638.0,1999.0,324.0,789.0,0.411,3.343473,1.671736,8.141975,4.070988
25119,2018.0,TOR,C.J. Miles,70.0,1337.0,699.0,164.0,454.0,0.361,2.944934,1.472467,8.152439,4.07622
23463,2015.0,DAL,Charlie Villanueva,64.0,678.0,403.0,83.0,221.0,0.376,3.067873,1.533937,8.168675,4.084337
24217,2017.0,MEM,Troy Daniels,67.0,1183.0,551.0,138.0,355.0,0.389,3.332394,1.666197,8.572464,4.286232
24844,2018.0,PHO,Troy Daniels,79.0,1622.0,703.0,183.0,458.0,0.4,3.541485,1.770742,8.863388,4.431694


EM3P Statistical Notes:<ul>
    <li> EM3P, like 3-Point Percentage, levels the playing field between starters and reserves.</li> 
    <li> More restrictive minimum requirements would eliminate many players from this list. </li>
     <li> Steph Curry's legendary 2016 MVP season is a clear #1. </li>
    </ul>

### 3-Point Advantage

Although the 3-point statistics above are compelling, they do not provide a single metric to rank 3-point shooters. This is where 3-point Advantage comes in. Simply put, 3-Point Advantage adds the net gain per 3-pointer made, and subtracts the net loss per 3-pointer missed.

#### Points Per Possession

Computing net gain and net loss depends on the expected value. Should the expected value be points per possession? Or points per field goal attempt? There is no clear answer. I have decided to employ the classical strategy of assuming that NBA teams average 1.0 points per possession, very close to current averages. The expected value will be 1.0 points per possession.

#### Computation

The computation is determined by expected value. When a player makes a 3-pointer, the team gains more than the expected value. How much more? An additional 3 points minus the expected value. When a player misses a 3-pointer, the team does worse than the expected value. How much worse? They lose the expected value.

More concisely, to compute 3-Point Advantage, each 3-pointer made is multiplied by 3 minus the expected value, and each 3-pointer missed is multiplied by the expected value and subtracted.
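
To make the arithmetic concrete, here is a worked example for one season. `three_point_advantage` is a hypothetical helper written for this illustration; it takes season totals and divides by games, which is equivalent to working with the per-game columns.

```python
def three_point_advantage(made, attempted, games, ev=1.0):
    """Per-game 3-Point Advantage for one season (hypothetical helper)."""
    missed = attempted - made
    gain = made * (3 - ev)    # each make beats the expected value by (3 - ev)
    loss = missed * ev        # each miss forfeits the expected value
    return (gain - loss) / games

# Stephen Curry, 2016: 402 makes on 886 attempts in 79 games
curry_2016 = three_point_advantage(402, 886, 79)
print(round(curry_2016, 6))
```

With an expected value of 1.0 this is (402 × 2 − 484 × 1) / 79 = 320 / 79, or roughly 4.05 points per game.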


In [14]:
# 3PAd, 3-Point Advantage

# Compute 3PG, 3-pointers per Game
tp['3PG'] = tp['3P'] / tp['G']

# Compute 3PAG, 3-point Attempts per Game
tp['3PAG'] = tp['3PA'] / tp['G']

# Compute 3PMiG, 3-point Misses per Game
tp['3PMiG'] = tp['3PAG'] - tp['3PG']

# Declare expected value
ev = 1.0

# Compute 3PAd, 3-point Advantage
tp['3PAd'] = tp['3PG'] * (3 - ev) - tp['3PMiG'] * ev

# (3 - ev) is how much the team gains per 3-pointer made
# -ev is how much the team loses per 3-pointer missed

#### Top 40 Seasons of All-Time

In [37]:
# sort dataframe by 3PAd
tp=tp.sort_values('3PAd', ascending=False)

# reset index
tp = tp.reset_index(drop=True)

# start index at 1 instead of 0
tp.index = tp.index + 1

# display top 40 seasons
with pd.option_context("display.max_rows", 40): display(tp.head(40))

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd
1,2016.0,GSW,Stephen Curry,79.0,2700.0,2375.0,402.0,886.0,0.454,3.047404,1.523702,6.716418,3.358209,5.088608,11.21519,6.126582,4.050633
2,2015.0,ATL,Kyle Korver,75.0,2418.0,911.0,221.0,449.0,0.492,5.385301,2.69265,10.941176,5.470588,2.946667,5.986667,3.04,2.853333
3,2013.0,GSW,Stephen Curry,78.0,2983.0,1786.0,272.0,600.0,0.453,4.971667,2.485833,10.966912,5.483456,3.487179,7.692308,4.205128,2.769231
4,2015.0,GSW,Stephen Curry,80.0,2613.0,1900.0,286.0,646.0,0.443,4.044892,2.022446,9.136364,4.568182,3.575,8.075,4.5,2.65
5,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698,4.156863,9.823529,5.666667,2.647059
6,2016.0,LAC,J.J. Redick,75.0,2097.0,1226.0,200.0,421.0,0.475,4.980998,2.490499,10.485,5.2425,2.666667,5.613333,2.946667,2.386667
7,2017.0,GSW,Stephen Curry,79.0,2638.0,1999.0,324.0,789.0,0.411,3.343473,1.671736,8.141975,4.070988,4.101266,9.987342,5.886076,2.316456
8,2002.0,MIL,Ray Allen,69.0,2525.0,1503.0,229.0,528.0,0.434,4.782197,2.391098,11.026201,5.5131,3.318841,7.652174,4.333333,2.304348
9,2014.0,ATL,Kyle Korver,71.0,2408.0,850.0,185.0,392.0,0.472,6.142857,3.071429,13.016216,6.508108,2.605634,5.521127,2.915493,2.295775
10,1997.0,CHH,Glen Rice,79.0,3362.0,2115.0,207.0,440.0,0.47,7.640909,3.820455,16.241546,8.120773,2.620253,5.56962,2.949367,2.291139


3PAd Statistical Notes:<ul>
    <li> Steph Curry's legendary MVP season is head and shoulders above the rest, and he dominates the list.</li> 
    <li> The metric does an adequate job of comparing 3-point shooters over the years. </li>
     <li> Different expected values will produce different lists. </li>
    <li> The metric has real meaning. It conveys the points a team gains per game by the player shooting 3-pointers. </li>
    </ul>
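
The note about different expected values can be illustrated with two neighboring seasons from the table above. This is only a sketch: the alternative value of 1.1 is a hypothetical choice, and `three_point_advantage` is a helper written for this example.

```python
def three_point_advantage(made, attempted, games, ev):
    """Per-game 3PAd; algebraically (3 * made - ev * attempted) / games."""
    return (made * (3 - ev) - (attempted - made) * ev) / games

# (made, attempted, games) copied from the top-40 table above
ray_allen_2002 = (229, 528, 69)
kyle_korver_2014 = (185, 392, 71)

for ev in (1.0, 1.1):
    allen = three_point_advantage(*ray_allen_2002, ev)
    korver = three_point_advantage(*kyle_korver_2014, ev)
    leader = 'Ray Allen' if allen > korver else 'Kyle Korver'
    print(f"ev={ev}: Allen {allen:.3f}, Korver {korver:.3f}, leader: {leader}")
```

A higher expected value penalizes every attempt more heavily, so the higher-percentage, lower-volume season moves ahead.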

#### 2018 League Leaders

In [16]:
# Create 2018 dataframe
tp_2018 = tp[tp['Year']==2018.0]

# Show top 20 3PAd
tp_2018.head(20)

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd
24842,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698,4.156863,9.823529,5.666667,2.647059
25301,2018.0,GSW,Klay Thompson,73.0,2506.0,1461.0,229.0,520.0,0.44,4.819231,2.409615,10.943231,5.471616,3.136986,7.123288,3.986301,2.287671
24991,2018.0,UTA,Joe Ingles,82.0,2578.0,940.0,204.0,464.0,0.44,5.556034,2.778017,12.637255,6.318627,2.487805,5.658537,3.170732,1.804878
25231,2018.0,PHI,J.J. Redick,70.0,2116.0,1198.0,193.0,460.0,0.42,4.6,2.3,10.963731,5.481865,2.757143,6.571429,3.814286,1.7
25051,2018.0,CLE,Kyle Korver,73.0,1574.0,672.0,164.0,376.0,0.436,4.18617,2.093085,9.597561,4.79878,2.246575,5.150685,2.90411,1.589041
24908,2018.0,OKC,Paul George,79.0,2891.0,1734.0,244.0,608.0,0.401,4.754934,2.377467,11.848361,5.92418,3.088608,7.696203,4.607595,1.56962
24869,2018.0,GSW,Kevin Durant,68.0,2325.0,1792.0,173.0,413.0,0.419,5.62954,2.81477,13.439306,6.719653,2.544118,6.073529,3.529412,1.558824
24994,2018.0,BOS,Kyrie Irving,60.0,1931.0,1466.0,166.0,407.0,0.408,4.744472,2.372236,11.63253,5.816265,2.766667,6.783333,4.016667,1.516667
24784,2018.0,DET,Reggie Bullock,62.0,1732.0,698.0,125.0,281.0,0.445,6.163701,3.081851,13.856,6.928,2.016129,4.532258,2.516129,1.516129
25082,2018.0,TOR,Kyle Lowry,78.0,2510.0,1267.0,238.0,596.0,0.399,4.211409,2.105705,10.546218,5.273109,3.051282,7.641026,4.589744,1.512821


The Golden State Warriors dominate the list. What about the Houston Rockets? They have the reputation of being a great 3-point shooting team. Let's make a comparison.

#### Warriors v Rockets

In [17]:
# Create 2018 dataframe for GSW and HOU
tp_2018_GSW_HOU = tp_2018[(tp_2018['Tm']=='GSW') | (tp_2018['Tm']=='HOU')]

# Display dataframe
tp_2018_GSW_HOU

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd
24842,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698,4.156863,9.823529,5.666667,2.647059
25301,2018.0,GSW,Klay Thompson,73.0,2506.0,1461.0,229.0,520.0,0.44,4.819231,2.409615,10.943231,5.471616,3.136986,7.123288,3.986301,2.287671
24869,2018.0,GSW,Kevin Durant,68.0,2325.0,1792.0,173.0,413.0,0.419,5.62954,2.81477,13.439306,6.719653,2.544118,6.073529,3.529412,1.558824
24932,2018.0,HOU,James Harden,72.0,2551.0,2191.0,265.0,722.0,0.367,3.533241,1.76662,9.626415,4.813208,3.680556,10.027778,6.347222,1.013889
25198,2018.0,HOU,Chris Paul,58.0,1847.0,1081.0,144.0,379.0,0.38,4.873351,2.436675,12.826389,6.413194,2.482759,6.534483,4.051724,0.913793
24704,2018.0,HOU,Ryan Anderson,66.0,1725.0,617.0,131.0,339.0,0.386,5.088496,2.544248,13.167939,6.583969,1.984848,5.136364,3.151515,0.818182
24710,2018.0,HOU,Trevor Ariza,67.0,2269.0,782.0,170.0,462.0,0.368,4.911255,2.455628,13.347059,6.673529,2.537313,6.895522,4.358209,0.716418
24915,2018.0,HOU,Eric Gordon,69.0,2154.0,1243.0,218.0,608.0,0.359,3.542763,1.771382,9.880734,4.940367,3.15942,8.811594,5.652174,0.666667
25372,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006,2.136503,11.325203,5.662602,1.5375,4.075,2.5375,0.5375
25307,2018.0,HOU,P.J. Tucker,82.0,2281.0,502.0,115.0,310.0,0.371,7.358065,3.679032,19.834783,9.917391,1.402439,3.780488,2.378049,0.426829


Even though Golden State has the best shooters, it's not clear which team is the best at shooting 3's. This is easily resolved by summing 3PAd.

In [24]:
# work on an explicit copy so the new column is not assigned to a slice of tp_2018
tp_2018_GSW_HOU = tp_2018_GSW_HOU.copy()

# add each team's summed 3PAd to every row of that team
tp_2018_GSW_HOU['3PAd/T'] = tp_2018_GSW_HOU.groupby('Tm')['3PAd'].transform('sum')


In [25]:
# Display updated dataframe
tp_2018_GSW_HOU

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd,3PAd/T
24842,2018.0,GSW,Stephen Curry,51.0,1631.0,1346.0,212.0,501.0,0.423,3.255489,1.627745,7.693396,3.846698,4.156863,9.823529,5.666667,2.647059,5.989152
25301,2018.0,GSW,Klay Thompson,73.0,2506.0,1461.0,229.0,520.0,0.44,4.819231,2.409615,10.943231,5.471616,3.136986,7.123288,3.986301,2.287671,5.989152
24869,2018.0,GSW,Kevin Durant,68.0,2325.0,1792.0,173.0,413.0,0.419,5.62954,2.81477,13.439306,6.719653,2.544118,6.073529,3.529412,1.558824,5.989152
24932,2018.0,HOU,James Harden,72.0,2551.0,2191.0,265.0,722.0,0.367,3.533241,1.76662,9.626415,4.813208,3.680556,10.027778,6.347222,1.013889,4.818073
25198,2018.0,HOU,Chris Paul,58.0,1847.0,1081.0,144.0,379.0,0.38,4.873351,2.436675,12.826389,6.413194,2.482759,6.534483,4.051724,0.913793,4.818073
24704,2018.0,HOU,Ryan Anderson,66.0,1725.0,617.0,131.0,339.0,0.386,5.088496,2.544248,13.167939,6.583969,1.984848,5.136364,3.151515,0.818182,4.818073
24710,2018.0,HOU,Trevor Ariza,67.0,2269.0,782.0,170.0,462.0,0.368,4.911255,2.455628,13.347059,6.673529,2.537313,6.895522,4.358209,0.716418,4.818073
24915,2018.0,HOU,Eric Gordon,69.0,2154.0,1243.0,218.0,608.0,0.359,3.542763,1.771382,9.880734,4.940367,3.15942,8.811594,5.652174,0.666667,4.818073
25372,2018.0,GSW,Nick Young,80.0,1393.0,581.0,123.0,326.0,0.377,4.273006,2.136503,11.325203,5.662602,1.5375,4.075,2.5375,0.5375,5.989152
25307,2018.0,HOU,P.J. Tucker,82.0,2281.0,502.0,115.0,310.0,0.371,7.358065,3.679032,19.834783,9.917391,1.402439,3.780488,2.378049,0.426829,4.818073


Golden State is the clear winner. They gain 5.99 points per game by shooting 3s while Houston gains 4.82.
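
For reference, the per-team sum can be sketched on a toy subset. Only four rows are copied from the table above, so the totals printed here are smaller than the full-roster figures; `rows`, `subset`, and `team_totals` are hypothetical names.

```python
import pandas as pd

# Toy subset: four rows copied from the table above, not the full rosters
rows = pd.DataFrame({
    'Tm': ['GSW', 'GSW', 'HOU', 'HOU'],
    'Player': ['Stephen Curry', 'Klay Thompson', 'James Harden', 'Chris Paul'],
    '3PAd': [2.647059, 2.287671, 1.013889, 0.913793],
})

# .copy() gives an independent frame, so adding a column does not touch a slice
subset = rows[rows['Tm'].isin(['GSW', 'HOU'])].copy()

# transform('sum') repeats each team's total on every row of that team
subset['3PAd/T'] = subset.groupby('Tm')['3PAd'].transform('sum')

# A plain sum() gives one row per team instead
team_totals = subset.groupby('Tm')['3PAd'].sum()
print(team_totals.round(3))
```

`transform('sum')` keeps the player rows and is handy for side-by-side display, while the plain `sum()` is enough when only the team comparison matters.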

## Fun with 3PAd

Who had the best 3-point seasons before the year 2000? Let's find out.

#### Best of the 90s

In [46]:
tp_before2000 = tp[(tp['Year']<2000) & (tp['Year']>1989)]

with pd.option_context("display.max_rows", 25): display(tp_before2000.head(25))

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd
10,1997.0,CHH,Glen Rice,79.0,3362.0,2115.0,207.0,440.0,0.47,7.640909,3.820455,16.241546,8.120773,2.620253,5.56962,2.949367,2.291139
17,1996.0,ORL,Dennis Scott,82.0,3041.0,1431.0,267.0,628.0,0.425,4.842357,2.421178,11.389513,5.694757,3.256098,7.658537,4.402439,2.109756
20,1995.0,PHI,Dana Barros,82.0,3318.0,1686.0,197.0,425.0,0.464,7.807059,3.903529,16.84264,8.42132,2.402439,5.182927,2.780488,2.02439
22,1996.0,SAC,Mitch Richmond*,81.0,2946.0,1872.0,225.0,515.0,0.437,5.720388,2.860194,13.093333,6.546667,2.777778,6.358025,3.580247,1.975309
28,1997.0,IND,Reggie Miller*,81.0,2966.0,1751.0,229.0,536.0,0.427,5.533582,2.766791,12.951965,6.475983,2.82716,6.617284,3.790123,1.864198
34,1996.0,WSB,Tim Legler,77.0,1775.0,726.0,128.0,245.0,0.522,7.244898,3.622449,13.867188,6.933594,1.662338,3.181818,1.519481,1.805195
43,1997.0,SAC,Mitch Richmond*,81.0,3125.0,2095.0,204.0,477.0,0.428,6.551363,3.275681,15.318627,7.659314,2.518519,5.888889,3.37037,1.666667
55,1995.0,ORL,Dennis Scott,62.0,1499.0,802.0,150.0,352.0,0.426,4.258523,2.129261,9.993333,4.996667,2.419355,5.677419,3.258065,1.580645
57,1996.0,CHI,Steve Kerr,82.0,1919.0,688.0,122.0,237.0,0.515,8.097046,4.048523,15.729508,7.864754,1.487805,2.890244,1.402439,1.573171
58,1998.0,CLE,Wesley Person,82.0,3198.0,1204.0,192.0,447.0,0.43,7.154362,3.577181,16.65625,8.328125,2.341463,5.45122,3.109756,1.573171


The big 90s shooters: Glen Rice, Reggie Miller, Dell Curry, Dennis Scott, Mitch Richmond, Dale Ellis. What about the 80s?

#### Best of the 80s

In [45]:
tp_before90 = tp[tp['Year']<1990]

with pd.option_context("display.max_rows", 25): display(tp_before90.head(25))

Unnamed: 0,Year,Tm,Player,G,MP,PTS,3P,3PA,3P%,AM3A,EM3A,AM3P,EM3P,3PG,3PAG,3PMiG,3PAd
36,1989.0,SEA,Dale Ellis,82.0,3190.0,2253.0,162.0,339.0,0.478,9.410029,4.705015,19.691358,9.845679,1.97561,4.134146,2.158537,1.792683
163,1988.0,TOT,Craig Hodges,66.0,1445.0,629.0,86.0,175.0,0.491,8.257143,4.128571,16.802326,8.401163,1.30303,2.651515,1.348485,1.257576
262,1988.0,MIL,Craig Hodges,43.0,983.0,397.0,55.0,118.0,0.466,8.330508,4.165254,17.872727,8.936364,1.27907,2.744186,1.465116,1.093023
272,1988.0,BOS,Danny Ainge,81.0,3018.0,1270.0,148.0,357.0,0.415,8.453782,4.226891,20.391892,10.195946,1.82716,4.407407,2.580247,1.074074
427,1989.0,CHI,Craig Hodges,49.0,1112.0,490.0,71.0,168.0,0.423,6.619048,3.309524,15.661972,7.830986,1.44898,3.428571,1.979592,0.918367
445,1989.0,CLE,Mark Price,75.0,2728.0,1414.0,93.0,211.0,0.441,12.92891,6.464455,29.333333,14.666667,1.24,2.813333,1.573333,0.906667
465,1987.0,BOS,Danny Ainge,71.0,2499.0,1053.0,85.0,192.0,0.443,13.015625,6.507812,29.4,14.7,1.197183,2.704225,1.507042,0.887324
505,1986.0,MIL,Craig Hodges,66.0,1739.0,716.0,73.0,162.0,0.451,10.734568,5.367284,23.821918,11.910959,1.106061,2.454545,1.348485,0.863636
533,1988.0,CLE,Mark Price,80.0,2626.0,1279.0,72.0,148.0,0.486,17.743243,8.871622,36.472222,18.236111,0.9,1.85,0.95,0.85
566,1988.0,SEA,Dale Ellis,75.0,2790.0,1938.0,107.0,259.0,0.413,10.772201,5.3861,26.074766,13.037383,1.426667,3.453333,2.026667,0.826667


Craig Hodges. Mark Price. Larry Bird. More Dale Ellis. The NBA adopted the 3-point shot in 1979, so we can't go back much further.

#### Best Career Totals

In [52]:
# sum each player's per-game 3PAd over all of their seasons
tp_player = tp.groupby('Player')['3PAd'].sum()
tp_player = tp_player.sort_values(ascending=False)
with pd.option_context("display.max_rows", 25): display(tp_player.head(25))

Player
Ray Allen           21.905276
Kyle Korver         21.289620
Stephen Curry       19.562224
Reggie Miller*      15.245892
Steve Nash          14.655158
Chauncey Billups    12.766855
Klay Thompson       12.703306
J.J. Redick         12.424284
Dale Ellis          12.127907
Mike Miller         11.625611
Peja Stojakovic     11.457409
Jason Terry         10.602291
Vince Carter        10.519366
Rashard Lewis       10.357993
Brent Barry         10.145916
Glen Rice            9.934763
Dirk Nowitzki        9.782523
Dell Curry           9.485008
J.R. Smith           9.370183
Wesley Person        9.033540
Jose Calderon        9.003851
Hubert Davis         8.865226
Allan Houston        8.862571
Paul Pierce          8.665088
Mike Bibby           8.587188
Name: 3PAd, dtype: float64

In [55]:
# Create new category, use totals instead of per game
tp['3PNG/C'] = tp['3P'] * (3 - ev) - (tp['3PA'] - tp['3P']) * ev

# group by player, and sum over their career
tp_player = tp.groupby('Player')['3PNG/C'].sum()

# order from the top
tp_player = tp_player.sort_values(ascending=False)

# display top 25
tp_player.head(25)

Player
Ray Allen           1548.0
Kyle Korver         1540.0
Stephen Curry       1463.0
Reggie Miller*      1194.0
Steve Nash          1100.0
Klay Thompson        980.0
Chauncey Billups     933.0
Dale Ellis           891.0
J.J. Redick          877.0
Peja Stojakovic      835.0
Jason Terry          822.0
Mike Miller          808.0
Glen Rice            780.0
Vince Carter         762.0
Rashard Lewis        758.0
Dirk Nowitzki        749.0
Brent Barry          727.0
Wesley Person        662.0
J.R. Smith           658.0
Allan Houston        655.0
Mike Bibby           639.0
Dell Curry           637.0
Joe Johnson          613.0
Paul Pierce          611.0
Hubert Davis         598.0
Name: 3PNG/C, dtype: float64
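
The two career lists above are ordered slightly differently (Klay Thompson and Chauncey Billups swap places, for example). Summing per-game 3PAd counts every season equally, while summing season totals weights each season by games played. A small sketch with two Stephen Curry seasons from the tables above shows how the two quantities relate; the variable names are hypothetical.

```python
ev = 1.0

# (games, made, attempted) for two Stephen Curry seasons from the tables above
seasons = {2016: (79, 402, 886), 2018: (51, 212, 501)}

per_game_sum = 0.0
career_total = 0.0
for year, (games, made, attempted) in seasons.items():
    net_points = made * (3 - ev) - (attempted - made) * ev   # season total
    per_game = net_points / games                            # that season's 3PAd
    per_game_sum += per_game
    career_total += net_points
    print(year, round(per_game, 6), net_points)

print(round(per_game_sum, 6), career_total)
```

The shortened 51-game season in 2018 contributes about 40% of the per-game sum but under 30% of the total, which is the kind of difference that reshuffles the rankings.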

What about the best teams of all time?

In [None]:
# group by team abbreviation (the column is named 'Tm'), and sum 3PAd
tp_team = tp.groupby('Tm')['3PAd'].sum()
tp_team = tp_team.sort_values(ascending=False)
with pd.option_context("display.max_rows", 25): display(tp_team.head(25))

## Conclusion

3PAd is a fun, informative, and comparative statistic used to rank 3-point shooters. It is a single metric that combines the number of 3-pointers made with a player's 3-point percentage. It stands up well to general expectations about who the best 3-point shooters are. It can be used to compare teams and players across years and eras.