## Introduction

The motivation behind this project is to look back and see which teams, if any, have consistently gotten it right in the draft. Which teams have historically drafted well? To simplify data collection, I used the following criteria to narrow down the group of players.

* Drafted in 1985 or later (the lottery era, so unfortunately no Jordan or Hakeem, though fortunately for Portland)
* Played at least 100 games in the NBA
* Averaged at least 10 minutes per game
* Contributed at least 1 win share while in the NBA

The [data](https://www.basketball-reference.com/play-index/draft_finder.cgi?request=1&year_min=1985&college_id=0&pos_is_g=Y&pos_is_gf=Y&pos_is_f=Y&pos_is_fg=Y&pos_is_fc=Y&pos_is_c=Y&pos_is_cf=Y&c1stat=g&c1comp=gt&c1val=100&c2stat=mp_per_g&c2comp=gt&c2val=10&c3stat=ws&c3comp=gt&c3val=1&order_by=ws) was collected, as usual, from the trusted **Basketball-Reference**.

As we will see shortly, 1046 players meet these criteria.
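
The filtering itself was done by the Basketball-Reference query, but the same criteria can be sketched in pandas. This is only an illustration: the toy rows and the `Year` column name are assumptions, and the thresholds follow the "at least" wording of the list above.

```python
import pandas as pd

# Hypothetical toy rows; the players and the "Year" column name are made up.
players = pd.DataFrame({
    "Player": ["A", "B", "C", "D", "E"],
    "Year":   [1984, 1990, 2001, 2010, 1995],   # draft year
    "G":      [900, 50, 400, 150, 300],         # games played
    "MP":     [35.0, 12.0, 8.5, 20.0, 15.0],    # minutes per game
    "WS":     [100.0, 2.0, 5.0, 0.4, 12.3],     # career win shares
})

# Combine the four criteria into one boolean mask.
mask = (
    (players["Year"] >= 1985)
    & (players["G"] >= 100)
    & (players["MP"] >= 10)
    & (players["WS"] >= 1)
)
qualified = players[mask]
print(qualified)
```

Each toy player except `E` fails exactly one criterion, so it is easy to check which condition removes which row.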

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pandas import DataFrame, Series

In [2]:
# header=1 takes the column names from the second line of the file
draft = pd.read_table('team_drafting.txt', sep=',', header=1)

In [7]:
# Let us look at the top 10 players by win shares
draft.head(10).iloc[:,[6] + list(range(13,26))]

,Player,G,MP,PTS,TRB,AST,STL,BLK,FG%,2P%,3P%,FT%,WS,WS/48
0,Karl Malone\malonka01,1476,37.2,25.0,10.1,3.6,1.4,0.8,0.516,0.519,0.274,0.742,234.6,0.205
1,LeBron James\jamesle01,1143,38.8,27.2,7.4,7.2,1.6,0.8,0.504,0.547,0.344,0.739,219.4,0.238
2,Tim Duncan\duncati01,1392,34.0,19.0,10.8,3.0,0.7,2.2,0.506,0.509,0.179,0.696,206.4,0.209
3,Dirk Nowitzki\nowitdi01,1471,34.4,21.2,7.7,2.5,0.8,0.9,0.472,0.497,0.383,0.879,206.1,0.196
4,Kevin Garnett\garneke01,1462,34.5,17.8,10.0,3.7,1.3,1.4,0.497,0.504,0.275,0.789,191.4,0.182
5,Shaquille O'Neal\onealsh01,1207,34.7,23.7,10.9,2.5,0.6,2.3,0.582,0.583,0.045,0.527,181.7,0.208
6,David Robinson\robinda01,987,34.7,21.1,10.6,2.5,1.4,3.0,0.518,0.52,0.25,0.736,178.7,0.25
7,Reggie Miller\millere01,1389,34.3,18.2,3.0,3.0,1.1,0.2,0.471,0.516,0.395,0.888,174.4,0.176
8,Kobe Bryant\bryanko01,1346,36.1,25.0,5.2,4.7,1.4,0.5,0.447,0.479,0.329,0.837,172.7,0.17
9,Chris Paul\paulch01,892,35.3,18.7,4.5,9.8,2.3,0.1,0.472,0.505,0.372,0.868,164.8,0.251
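
The `Player` values above combine a display name and a Basketball-Reference id, separated by a backslash. A minimal sketch of splitting them apart, assuming that format holds for every row (the `Name` and `PlayerID` column names are my own):

```python
import pandas as pd

# Two values copied from the output above; raw strings keep the backslash literal.
raw = pd.Series([r"Karl Malone\malonka01", r"LeBron James\jamesle01"], name="Player")

# Split once on the backslash and expand the two pieces into columns.
parts = raw.str.split("\\", n=1, expand=True)
parts.columns = ["Name", "PlayerID"]  # hypothetical column names
print(parts)
```

On the real table, the same call would be applied to `draft['Player']`.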


In [8]:
# And the bottom 10 players by win shares
draft.tail(10).iloc[:,[6] + list(range(13,26))]

,Player,G,MP,PTS,TRB,AST,STL,BLK,FG%,2P%,3P%,FT%,WS,WS/48
1036,A.J. English\engliaj01,151,20.6,9.9,2.1,2.1,0.4,0.2,0.435,0.449,0.138,0.778,1.1,0.017
1037,Eric Mobley\mobleer01,113,13.9,3.9,3.1,0.5,0.2,0.5,0.541,0.538,0.75,0.475,1.1,0.034
1038,Darrin Hancock\hancoda01,133,10.5,3.5,1.3,0.7,0.4,0.1,0.53,0.533,0.333,0.579,1.1,0.038
1039,Ed O'Bannon\obanned01,128,16.1,5.0,2.5,0.8,0.6,0.2,0.367,0.4,0.222,0.755,1.1,0.025
1040,Zarko Cabarkapa\cabarza01,150,10.3,4.3,2.1,0.6,0.2,0.2,0.427,0.457,0.273,0.733,1.1,0.035
1041,Archie Goodwin\goodwar01,165,14.5,6.3,2.0,1.2,0.4,0.2,0.429,0.484,0.236,0.7,1.1,0.022
1042,Markel Brown\brownma02,113,15.9,5.2,2.1,1.2,0.6,0.2,0.38,0.427,0.295,0.781,1.1,0.029
1043,Chris Welp\welpch01,109,10.8,3.3,2.4,0.4,0.3,0.5,0.446,0.447,0.0,0.681,1.0,0.042
1044,Anthony Cook\cookan01,116,12.9,3.6,3.7,0.3,0.4,0.8,0.439,0.446,0.0,0.524,1.0,0.033
1045,Terence Morris\morrite01,139,13.9,3.4,2.7,0.7,0.3,0.4,0.407,0.465,0.196,0.711,1.0,0.025


In [22]:
# Mean and standard deviation of the main stat columns
draft.describe().iloc[1:3, 7:20]

,G,MP,PTS,TRB,AST,STL,BLK,FG%,2P%,3P%,FT%,WS,WS/48
mean,533.557361,22.760994,9.31501,4.049426,1.981071,0.736807,0.487763,0.45659,0.479311,0.275153,0.737172,29.34522,0.090098
std,298.407262,6.793028,4.52382,2.068926,1.602715,0.363177,0.457728,0.043901,0.039155,0.125334,0.088917,31.936057,0.039545


In [23]:
# How many players (rows) and columns
draft.shape

(1046, 26)

In [25]:
# Check for missing data: only a small share of the 1046 x 26 cells is missing
draft.isnull().sum().sum()

155
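
The total above does not say where the gaps are. A minimal sketch of a per-column breakdown, using a made-up toy frame rather than the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with a few missing percentages.
toy = pd.DataFrame({
    "Player": ["A", "B", "C"],
    "3P%": [0.35, np.nan, np.nan],
    "FT%": [0.80, 0.75, np.nan],
})

# isnull().sum() counts missing cells per column; a second sum() gives the total.
per_column = toy.isnull().sum()
total = per_column.sum()
print(per_column[per_column > 0])
print(total)
```

Running `draft.isnull().sum()` the same way would show which columns account for the 155 missing values.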

## Querying the data

For the first part, let us ask some questions. We will start with some basic ones:

 * What is the correlation between where a player is selected and Win Shares?
 * Which teams have drafted the most players that have contributed some amount to the team?
 * Which teams have drafted players with the most total win shares?

## What is the correlation between where a player is selected and Win Shares?

In [27]:
# Pick number and win shares are weakly negatively correlated: later picks tend to contribute fewer wins
# Series.corr compares just these two columns, so non-numeric columns cannot cause errors
draft['Pk'].corr(draft['WS'])

-0.26902079410924884

In [28]:
draft['Pk'].corr(draft['WS/48'])

-0.11392251237677788

In [29]:
draft['Pk'].corr(draft['G'])

-0.23040995516039001
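
pandas computes the Pearson correlation by default. A minimal sketch of what that number means, worked by hand on made-up values and checked against pandas. Because pick number is an ordinal quantity, the sketch also computes a rank-based (Spearman) correlation as the Pearson correlation of the ranks; that second measure is my addition, not something the analysis above uses.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, not taken from the real table.
toy = pd.DataFrame({
    "Pk": [1, 5, 10, 20, 40],
    "WS": [120.0, 60.0, 80.0, 10.0, 15.0],
})

# Pearson r: covariance of the centered values divided by the product of their spreads.
x = toy["Pk"] - toy["Pk"].mean()
y = toy["WS"] - toy["WS"].mean()
r_manual = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

# The same quantity from pandas.
r_pandas = toy["Pk"].corr(toy["WS"])

# Spearman correlation: Pearson correlation applied to the ranks.
r_spearman = toy["Pk"].rank().corr(toy["WS"].rank())

print(r_manual, r_pandas, r_spearman)
```

A negative value in either measure reads the same way as above: the later the pick, the lower the win shares tend to be.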

## Which teams have drafted the most players that have contributed some amount to the team?

In [152]:
# A roundabout way to look up the players a team drafted and their win shares
def WS(teamname):
    return draft.loc[:, ['Tm', 'Player', 'WS']].set_index('Tm').loc[teamname, :]
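
A simpler lookup is possible with a boolean mask. This is a sketch on a toy frame; `team_players` is a hypothetical helper name and the rows are placeholders.

```python
import pandas as pd

# Hypothetical toy rows with placeholder player names.
toy = pd.DataFrame({
    "Tm": ["CLE", "SAS", "CLE", "UTA"],
    "Player": ["P1", "P2", "P3", "P4"],
    "WS": [10.0, 20.0, 5.5, 7.0],
})

def team_players(frame, teamname):
    """Return the Player and WS columns for the rows drafted by `teamname`."""
    return frame.loc[frame["Tm"] == teamname, ["Player", "WS"]]

cle = team_players(toy, "CLE")
print(cle)
```

Unlike the `set_index(...).loc[teamname, :]` version, the mask always returns a DataFrame, even when a team has only one matching row.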

In [135]:
# Keep only the team, player, and win shares columns
draft_WS = draft.loc[:, ['Tm', 'Player', 'WS']]

In [143]:
# Total win shares per drafting team
teams_WS = pd.pivot_table(draft_WS, values = 'WS', index = 'Tm', aggfunc = 'sum')
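
The same per-team total can be written as a `groupby`, which also makes it easy to count how many qualifying players each team drafted. A minimal sketch on made-up rows (the `total_ws` and `players` column names are my own):

```python
import pandas as pd

# Hypothetical toy rows, not the real draft table.
toy = pd.DataFrame({
    "Tm": ["CLE", "SAS", "CLE", "UTA", "SAS", "CLE"],
    "WS": [10.0, 20.0, 5.5, 7.0, 3.0, 1.5],
})

# One pass gives both the total win shares and the number of players per team.
summary = (
    toy.groupby("Tm")["WS"]
       .agg(total_ws="sum", players="count")
       .sort_values("total_ws", ascending=False)
)
print(summary)
```

Sorting by `players` instead of `total_ws` would rank teams by how many contributing players they drafted rather than by total win shares.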

In [149]:
teams_WS.sort_values(by = 'WS', ascending = False)

Tm,WS
CLE,1561.6
SEA,1457.7
PHO,1383.0
GSW,1355.1
SAS,1239.7
CHI,1203.1
MIL,1194.1
DET,1164.1
UTA,1107.1
SAC,1093.9
