# 2018 NFL Draft Analysis

## Introduction
The goal of this project is to explore which traits of college football players are most associated with success in the NFL, and then to project the careers of this year's draft class.

In [94]:
%matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Reading in Data
First, I read in CSVs containing data from this year's combine, the previous 17 years' combines, and the last 17 years' Pro Bowls. All data is courtesy of Pro Football Reference.

In [95]:
CB2018DF = pd.read_csv("./CSVs/CB2018Combine") #Need to clean out the links and unused columns
CBHistoryDF = pd.read_csv("./CSVs/PastCornersCombine")
CBProBowlDF = pd.read_csv('./CSVs/CBProBowls')
# Keep untouched copies; plain assignment would only alias the same DataFrame
uneditedCB2018 = CB2018DF.copy()
uneditedCBHistory = CBHistoryDF.copy()
uneditedCBProBowl = CBProBowlDF.copy()
#CBProBowlDf = TempCBProBowlDF.groupby('Name').count()

## Data Cleaning
The data must now be cleaned. Specifically, the links from the original CSVs are still embedded in the player-name strings (the 'Player' and 'Name' columns), and the unused columns need to be dropped. Additionally, we only want a count of the Pro Bowls each player made.

In [96]:
sep = "\\"  # the link follows a backslash; keep only the text before it
CB2018DF['Player'] = CB2018DF['Player'].map(lambda x: x.split(sep, 1)[0])
CBHistoryDF['Player'] = CBHistoryDF['Player'].map(lambda x: x.split(sep, 1)[0])
# Drop unused columns (axis must be passed by keyword in pandas 2.0+)
unused_columns = ['Pos', 'AV', 'School', 'College', 'Drafted (tm/rnd/yr)']
CB2018DF = CB2018DF.drop(unused_columns, axis=1)
CBHistoryDF = CBHistoryDF.drop(unused_columns, axis=1)
# Strip the '%' and '+' markers and the link from the Pro Bowl names
CBProBowlDF['Name'] = CBProBowlDF['Name'].map(lambda x: x.replace('%', ''))
CBProBowlDF['Name'] = CBProBowlDF['Name'].map(lambda x: x.split(sep, 1)[0])
CBProBowlDF['Name'] = CBProBowlDF['Name'].map(lambda x: x.replace('+', ''))
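
To illustrate what this cleaning step does, here is a minimal sketch on made-up strings. The sample values, including the ID suffixes after the backslash and the placement of the '%' and '+' markers, are hypothetical and only meant to resemble the kind of name strings described above.

```python
import pandas as pd

# Hypothetical raw values: a display name, optional markers, a backslash, then an ID.
raw = pd.Series(["Jalen Ramsey\\RamsJa00", "Aqib Talib%+\\TaliAq00", "Josh Norman"])

# Keep the text before the first backslash, then strip the marker characters.
clean = (
    raw.str.split("\\", n=1).str[0]
       .str.replace("%", "", regex=False)
       .str.replace("+", "", regex=False)
)
print(clean.tolist())
```

The vectorized `.str` accessor is one possible alternative to `map` with a lambda; both should give the same result on string columns without missing values.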

Next, we need to know how many Pro Bowls each player made, so we group the Pro Bowl DataFrame by name and count the rows for each player.

In [97]:
CBProBowlDF = CBProBowlDF.groupby('Name').count()
display(CBProBowlDF)
#display(uneditedCBProBowl)

Name,Pos,Conf,Tm,Age,Yrs,G,GS,Cmp,Att,Yds,...,Att.1,Yds.1,TD.1,Rec,Yds.2,TD.2,Tkl,Sk,Int.1,All-pro teams
A.J. Bouye,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
Aaron Glenn,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
Adam Jones,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,0
Aeneas Williams,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
Al Harris,2,2,2,2,2,2,2,2,2,2,...,2,2,2,2,2,2,2,2,2,1
Allen Rossum,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,0
Alterraun Verner,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
Antoine Winfield,3,3,3,3,3,3,3,3,3,3,...,3,3,3,3,3,3,3,3,3,2
Antonio Cromartie,4,4,4,4,4,4,4,4,4,4,...,4,4,4,4,4,4,4,4,4,2
Aqib Talib,5,4,5,5,5,5,5,5,5,5,...,5,5,5,5,5,5,5,5,5,2
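
Note that `count()` tallies the non-missing values in each column, which is why a column with a gap (such as `Conf` for Aqib Talib above) can show a smaller number than the others. As a hedged alternative, `size()` counts rows per group regardless of missing values. The toy frame below is made up for illustration and is not part of the original data.

```python
import numpy as np
import pandas as pd

# Made-up selections: one row per player per Pro Bowl season.
toy = pd.DataFrame({
    "Name": ["Player A", "Player A", "Player B"],
    "Conf": ["AFC", np.nan, "NFC"],  # one missing value for Player A
})

per_column = toy.groupby("Name").count()  # non-missing values per column
per_row = toy.groupby("Name").size()      # rows per group, missing or not

print(per_column["Conf"].to_dict())
print(per_row.to_dict())
```

Here `Player A` has two rows but only one non-missing `Conf`, so the two methods disagree for that player.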


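Next, we only care about the number of Pro Bowls and All-Pro teams each of these players made, so we will remove everything else. After `count()`, every column holds the number of non-missing rows per player, so any column without missing values gives the number of Pro Bowls. We keep 'Pos' for that purpose and relabel it. The 'All-pro teams' column likewise holds the number of non-missing entries in that column for each player.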

In [98]:
CBProBowlDF = CBProBowlDF[['Pos', 'All-pro teams']]
CBProBowlDF = CBProBowlDF.rename(columns = {'Pos' : 'ProBowls', 'All-pro teams' : 'All-Pro Teams'})
CBProBowlDF

Name,ProBowls,All-Pro Teams
A.J. Bouye,1,1
Aaron Glenn,1,1
Adam Jones,1,0
Aeneas Williams,1,1
Al Harris,2,1
Allen Rossum,1,0
Alterraun Verner,1,1
Antoine Winfield,3,2
Antonio Cromartie,4,2
Aqib Talib,5,2


Next, we join the Pro Bowl DataFrame onto the cornerback history DataFrame, matching on each player's name.

In [99]:
CBHistoryDF = CBHistoryDF.join(CBProBowlDF, on = 'Player')
CBHistoryDF

Unnamed: 0,Rk,Year,Player,Height,Wt,40YD,Vertical,BenchReps,Broad Jump,3Cone,Shuttle,ProBowls,All-Pro Teams
0,1,2017,Ahkello Witherspoon,6-3,198,4.45,40.5,,127.0,6.93,4.13,,
1,2,2017,Quincy Wilson,6-1,211,4.54,32.0,14.0,118.0,6.86,4.02,,
2,3,2017,Howard Wilson,6-1,184,4.57,33.5,,119.0,6.68,3.94,,
3,4,2017,TreDavious White,5-11,192,4.47,32.0,16.0,119.0,6.90,4.32,,
4,5,2017,Marquez White,6-0,194,4.59,36.0,,123.0,,,,
5,6,2017,Jack Tocho,6-0,202,4.54,35.0,21.0,125.0,,,,
6,7,2017,Cordrea Tankersley,6-1,199,4.40,29.5,13.0,121.0,7.00,4.32,,
7,8,2017,Teez Tabor,6-0,199,4.62,31.0,9.0,120.0,,,,
8,9,2017,Cameron Sutton,5-11,188,4.52,34.0,11.0,120.0,6.81,,,
9,10,2017,Channing Stribling,6-1,188,4.60,31.5,5.0,114.0,6.94,4.56,,
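
`DataFrame.join` with `on='Player'` matches the `Player` column against the index of the other frame and performs a left join by default, so players with no Pro Bowl record end up with `NaN`, as in the rows above. A minimal sketch with made-up data (the player names and numbers are placeholders):

```python
import pandas as pd

# Made-up combine rows and a Pro Bowl table indexed by name.
combine = pd.DataFrame({"Player": ["Player A", "Player B"], "40YD": [4.41, 4.60]})
honors = pd.DataFrame({"ProBowls": [2]}, index=pd.Index(["Player A"], name="Name"))

# Left join: every combine row is kept; unmatched players get NaN.
joined = combine.join(honors, on="Player")
matched = int(joined["ProBowls"].notna().sum())
print(joined)
print(matched)
```

Because matching is done on exact strings, any spelling difference between the two sources would silently produce `NaN` for that player.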


Next, we drop the entries in the cornerback history DataFrame for players who never made a Pro Bowl, because we are only looking at past successful corners to judge whether this year's corners will be successful.

In [100]:
# Keep only rows with a Pro Bowl count (NaN means no match in the Pro Bowl data)
CBHistoryDF = CBHistoryDF[pd.notnull(CBHistoryDF['ProBowls'])]
CBHistoryDF

Unnamed: 0,Rk,Year,Player,Height,Wt,40YD,Vertical,BenchReps,Broad Jump,3Cone,Shuttle,ProBowls,All-Pro Teams
16,17,2017,Marshon Lattimore,6-0,193,4.36,38.5,,132.0,,,1.0,0.0
41,42,2016,Jalen Ramsey,6-1,209,4.41,41.5,14.0,135.0,6.94,4.18,1.0,1.0
81,12,2015,Marcus Peters,6-0,197,4.53,37.5,17.0,121.0,7.08,4.08,2.0,2.0
105,36,2014,Jason Verrett,5-9,189,4.38,39.0,,128.0,6.69,4.0,1.0,0.0
147,78,2013,Desmond Trufant,6-0,190,4.38,37.5,16.0,125.0,,3.85,1.0,0.0
150,81,2013,Darius Slay,6-0,192,4.36,35.5,14.0,124.0,6.9,4.21,1.0,1.0
154,85,2013,Xavier Rhodes,6-1,210,4.43,40.5,14.0,132.0,,,2.0,1.0
160,91,2013,Tyrann Mathieu,5-9,186,4.5,34.0,4.0,117.0,6.87,4.14,1.0,1.0
185,116,2012,Josh Norman,6-0,197,4.61,33.0,14.0,124.0,7.09,4.23,1.0,1.0
191,122,2012,Janoris Jenkins,5-10,193,4.41,33.5,,121.0,6.95,4.13,1.0,1.0
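
The same filter can also be written with `dropna`. The sketch below uses made-up rows; resetting the index afterwards is optional and is not something the original notebook does.

```python
import numpy as np
import pandas as pd

# Made-up rows: Player B has no Pro Bowl count.
toy = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C"],
    "ProBowls": [1.0, np.nan, 2.0],
})

kept = toy[toy["ProBowls"].notna()]     # boolean-mask form
same = toy.dropna(subset=["ProBowls"])  # equivalent dropna form
tidy = same.reset_index(drop=True)      # optional: renumber rows from 0
print(tidy)
```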
