

The following is a simple exploratory data analysis (EDA) of NFL wide receiver (WR) stats. The purpose of this project is to show how individual wide receiver stats correlate with fantasy points. For example, in the given data, Team Target Percentage (TAR%) and Team Air Yards Percentage (AY%) appear to be the stats most strongly correlated with fantasy points. Data was retrieved from Rotowire.com.



In [1]:
import pandas as pd
import plotly.express as px


In [None]:
# header=1 tells pandas to take the column names from the second row of the file
data = pd.read_csv("wr_data.csv", header=1)

In [None]:
data.head()

Unnamed: 0,Name,Team,Pos,G,TAR,REC,YDS,Rts,TPRR%,YPRR,...,Cmp AY,Catch %,Drop %,AY.1,TAR.1,YDS.1,AVG,YAC%,Drawn,On
0,Justin Jefferson,MIN,WR,16,179,124,1771,646,27.7,2.74,...,1161,70.9,2.9,39.4,28.8,610,4.9,34.4,4,1
1,Tyreek Hill,MIA,WR,16,165,117,1687,473,34.9,3.57,...,1186,70.9,3.6,40.8,30.9,501,4.3,29.7,5,1
2,Davante Adams,LV,WR,16,171,95,1443,511,33.5,2.82,...,938,55.6,3.5,41.8,32.6,511,5.4,35.4,4,0
3,A.J. Brown,PHI,WR,16,135,84,1401,469,28.8,2.99,...,855,61.8,5.1,39.9,28.7,546,6.5,39.0,1,2
4,Stefon Diggs,BUF,WR,15,144,101,1325,591,24.4,2.24,...,895,71.6,5.7,34.0,27.3,430,4.3,32.5,3,2


In [None]:
data.columns

Index(['Name', 'Team', 'Pos', 'G', 'TAR', 'REC', 'YDS', 'Rts', 'TPRR%', 'YPRR',
       'AY', 'aDOT', 'AY/Snap', 'Cmp AY', 'Catch %', 'Drop %', 'AY.1', 'TAR.1',
       'YDS.1', 'AVG', 'YAC%', 'Drawn', 'On'],
      dtype='object')

In [None]:
renamed_columns = {'AY.1': 'AY%', 'TAR.1':'TAR%', 'YDS.1':'YAC', 'AVG':'YAC_AVG'}
data = data.rename(columns=renamed_columns)
data.columns

Index(['Name', 'Team', 'Pos', 'G', 'TAR', 'REC', 'YDS', 'Rts', 'TPRR%', 'YPRR',
       'AY', 'aDOT', 'AY/Snap', 'Cmp AY', 'Catch %', 'Drop %', 'AY%', 'TAR%',
       'YAC', 'YAC_AVG', 'YAC%', 'Drawn', 'On'],
      dtype='object')

In [None]:
data.isnull().sum()

In [None]:
# Count fully duplicated rows instead of printing one boolean per row
data.duplicated().sum()

In [None]:
f_points = pd.read_csv('wr_fantasy_points.csv')


In [None]:
f_points.head()

Unnamed: 0,Name,Team,Pos,Fantasy_Points
0,Justin Jefferson,MIN,WR,235.2
1,Tyreek Hill,MIA,WR,219.9
2,Davante Adams,LV,WR,228.2
3,A.J. Brown,PHI,WR,206.1
4,Stefon Diggs,BUF,WR,192.2


In [None]:
# Inner join on the columns the two tables share, stated explicitly
app_data = pd.merge(data, f_points, on=['Name', 'Team', 'Pos'])
app_data

Unnamed: 0,Name,Team,Pos,G,TAR,REC,YDS,Rts,TPRR%,YPRR,...,Catch %,Drop %,AY%,TAR%,YAC,YAC_AVG,YAC%,Drawn,On,Fantasy_Points
0,Justin Jefferson,MIN,WR,16,179,124,1771,646,27.7,2.74,...,70.9,2.9,39.4,28.8,610,4.9,34.4,4,1,235.2
1,Tyreek Hill,MIA,WR,16,165,117,1687,473,34.9,3.57,...,70.9,3.6,40.8,30.9,501,4.3,29.7,5,1,219.9
2,Davante Adams,LV,WR,16,171,95,1443,511,33.5,2.82,...,55.6,3.5,41.8,32.6,511,5.4,35.4,4,0,228.2
3,A.J. Brown,PHI,WR,16,135,84,1401,469,28.8,2.99,...,61.8,5.1,39.9,28.7,546,6.5,39.0,1,2,206.1
4,Stefon Diggs,BUF,WR,15,144,101,1325,591,24.4,2.24,...,71.6,5.7,34.0,27.3,430,4.3,32.5,3,2,192.2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
225,Kadarius Toney,NYG,WR,1,3,2,0,16,18.8,0.00,...,66.7,33.3,-4.8,9.4,7,3.5,0.0,0,1,0.0
226,Lance McCutcheon,LAR,WR,2,5,0,0,18,27.8,0.00,...,0.0,25.0,16.3,9.5,0,0.0,0.0,0,0,0.0
227,DJ Turner,LV,WR,1,1,0,0,2,50.0,0.00,...,0.0,100.0,-1.7,5.3,0,0.0,0.0,0,0,0.6
228,Cade Johnson,SEA,WR,1,2,0,0,18,11.1,0.00,...,0.0,0.0,7.7,6.9,0,0.0,0.0,0,0,0.0
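An inner merge silently drops players that appear in only one table, for example when a name is spelled differently in the two files. The sketch below is one possible way to check for that, not part of the original analysis. It uses two small stand-in frames built from rows shown above, and "Hypothetical Player" is an invented row added only to show what an unmatched record looks like.

```python
import pandas as pd

# Stand-in frames: two real rows from the output above plus one invented
# row ("Hypothetical Player") that has no counterpart in the points table.
stats = pd.DataFrame({
    "Name": ["Justin Jefferson", "Tyreek Hill", "Hypothetical Player"],
    "Team": ["MIN", "MIA", "XXX"],
    "Pos": ["WR", "WR", "WR"],
    "TAR%": [28.8, 30.9, 10.0],
})
points = pd.DataFrame({
    "Name": ["Justin Jefferson", "Tyreek Hill"],
    "Team": ["MIN", "MIA"],
    "Pos": ["WR", "WR"],
    "Fantasy_Points": [235.2, 219.9],
})

keys = ["Name", "Team", "Pos"]

# how="outer" keeps rows from both sides, indicator=True adds a "_merge"
# column saying where each row came from, and validate="one_to_one"
# raises an error if a key appears more than once in either table.
checked = pd.merge(stats, points, on=keys, how="outer",
                   indicator=True, validate="one_to_one")

# Rows that would have been dropped by the default inner merge
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[keys + ["_merge"]])
```

With the real data, `stats` and `points` would be replaced by `data` and `f_points`; an empty `unmatched` frame would mean the inner merge lost no players.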


In [None]:
# With y given, px.histogram sums Fantasy_Points within each TAR% bin
tar_hist = px.histogram(app_data, x='TAR%', y='Fantasy_Points')
tar_hist.show()

In [None]:
tar_scatter = px.scatter(app_data, x='TAR%', y='Fantasy_Points')
tar_scatter.show()

In [None]:
# With y given, px.histogram sums Fantasy_Points within each AY% bin
ay_hist = px.histogram(app_data, x='AY%', y='Fantasy_Points')
ay_hist.show()

In [None]:
ay_scatter = px.scatter(app_data, x='AY%', y='Fantasy_Points')
ay_scatter.show()

Based on the given dataset, a receiver's percentage of team targets (TAR%) and percentage of team air yards (AY%) tend to have the highest correlation with fantasy points, as the scatter plots above suggest.
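
The claim above could be checked numerically by ranking every numeric stat by its Pearson correlation with `Fantasy_Points`. The following is a minimal sketch of that idea, not the method used in this notebook. The helper name `rank_correlations` is an assumption introduced here, and `sample` holds only the five rows printed in the merged output, which is far too few to support conclusions; it is there to keep the example runnable.

```python
import pandas as pd

# The five rows printed in the merged output above (subset of columns).
# Illustrative only: five players cannot establish a real correlation.
sample = pd.DataFrame({
    "Name": ["Justin Jefferson", "Tyreek Hill", "Davante Adams",
             "A.J. Brown", "Stefon Diggs"],
    "TAR": [179, 165, 171, 135, 144],
    "YPRR": [2.74, 3.57, 2.82, 2.99, 2.24],
    "AY%": [39.4, 40.8, 41.8, 39.9, 34.0],
    "TAR%": [28.8, 30.9, 32.6, 28.7, 27.3],
    "Fantasy_Points": [235.2, 219.9, 228.2, 206.1, 192.2],
})


def rank_correlations(df, target="Fantasy_Points"):
    """Return each numeric column's Pearson correlation with `target`,
    sorted from strongest to weakest by absolute value."""
    numeric = df.select_dtypes(include="number")
    # Take the target's column of the correlation matrix and drop the
    # trivial correlation of the target with itself.
    corr = numeric.corr()[target].drop(target)
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)


ranking = rank_correlations(sample)
print(ranking)
```

Calling `rank_correlations(app_data)` on the full merged table would show whether TAR% and AY% actually sit at the top of the ranking.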