NFL Combine Athleticism Calculator

Importing libraries

In [19]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

Reading in 2013-2022 NFL Combine Data

In [20]:
combine_2013 = pd.read_csv('Datasets/2013_combine.csv')
combine_2014 = pd.read_csv('Datasets/2014_combine.csv')
combine_2015 = pd.read_csv('Datasets/2015_combine.csv')
combine_2016 = pd.read_csv('Datasets/2016_combine.csv')
combine_2017 = pd.read_csv('Datasets/2017_combine.csv')
combine_2018 = pd.read_csv('Datasets/2018_combine.csv')
combine_2019 = pd.read_csv('Datasets/2019_combine.csv')
combine_2020 = pd.read_csv('Datasets/2020_combine.csv')
combine_2021 = pd.read_csv('Datasets/2021_combine.csv')
combine_2022 = pd.read_csv('Datasets/2022_combine.csv')

Combing Datasets

In [21]:
combine_datasets = [combine_2013, combine_2014, combine_2015, combine_2016,
                    combine_2017, combine_2018, combine_2019, combine_2020, combine_2021, combine_2022]

combine = pd.concat(
    combine_datasets, ignore_index=True).sort_values(["Pos", "Player"])
combine = combine.set_index("Player")

combine.sample(5)

Unnamed: 0_level_0,Pos,School,Ht,Wt,40yd,Vertical,Bench,Broad Jump,3Cone,Shuttle
Player,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
Jaquiski Tartt,S,Samford,6-1,221.0,4.53,,,124.0,,
Kain Colter,WR,Northwestern,5-10,198.0,4.71,,,,,
Greg Dortch,WR,Wake Forest,5-7,173.0,,,,,,
Keith Kelsey,LB,Louisville,6-0,233.0,4.92,29.5,23.0,112.0,7.28,4.56
Jamaal Johnson-Webb,OG,Alabama A&M,6-5,313.0,5.37,23.0,17.0,92.0,8.12,4.74


Currently, all heights in the Datasets are strings structure as so: "Feet-Inches". Will change this by splitting the columns at the "-" character and multply height by 12 plus inches to get each player's height in inches.

In [22]:
heights = combine["Ht"].str.split(pat="-", expand=True)
heights[[0, 1]] = heights[[0, 1]].apply(pd.to_numeric)
combine["Ht"] = (heights[0] * 12) + heights[1]

combine["Ht"].sample(5)

Player
Jeremiah Gemmel    73.0
Duke Dawson        70.0
Dillon Radunz      77.0
Darius Hodge       72.0
Andre Hal          70.0
Name: Ht, dtype: float64

Will ask the user to enter the name of a Participant from the 2013-2022 NFL Combine they want to see the Athletcism Score of.

In [23]:
def get_player(combine):
    while True:
        player = input("Enter a participant from the 2013-2022 NFL Combine: ")
        if player not in combine.index:
            print("Player is not in dataset, try someone else.")
        else:
            return player

player_name = get_player(combine)
player_data = combine.loc[player_name]
player_data


Pos              WR
School          LSU
Ht             71.0
Wt            198.0
40yd           4.38
Vertical       38.5
Bench           7.0
Broad Jump    122.0
3Cone          6.69
Shuttle        3.94
Name: Odell Beckham Jr., dtype: object

For our calculator, we will first calculate for each player the percentile their height, weight, 40 time, vertical jump, bench reps, broad jump, three cone time, and shuttle time fall into relative to their position group. We will then calculate the mean value of these percentiles, labelling it a "raw score". Lastly, we will take this raw score and find what percentile it is in relative to its position group. This final value will be the player's Athletic Score.

To start, let's first filter out all players that are not in the player's position group. We are doing this because each position in Football require  different types of physiques and athleticism to play. For this reason it would not be accurate to compare these differing positions to eachother. To demonstrate this, let's view the mean testing for each position group to visualize how much different they are.

In [24]:
combine["Pos"] = combine["Pos"].str.replace(pat="DB", repl="CB", regex=False)
means_by_position = combine.pivot_table(values=["Ht", "Wt", "40yd", "Vertical", "Bench", "Broad Jump", "3Cone", "Shuttle"], index="Pos", aggfunc=[np.mean])
means_by_position

Unnamed: 0_level_0,mean,mean,mean,mean,mean,mean,mean,mean
Unnamed: 0_level_1,3Cone,40yd,Bench,Broad Jump,Ht,Shuttle,Vertical,Wt
Pos,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
C,7.713333,5.236184,24.909091,104.274194,75.41573,4.673103,27.914062,305.606742
CB,6.957857,4.500449,14.302932,123.522293,71.569682,4.202276,35.935938,193.435208
DE,7.259851,4.799088,23.193277,117.313953,75.717391,4.42432,33.19771,263.950311
DT,7.730064,5.11086,27.061111,105.644444,74.991803,4.683789,28.860215,308.060976
FB,7.279444,4.804167,24.142857,114.2,72.52,4.365,32.775,241.2
K,,4.894286,13.0,113.5,71.933333,,33.5,193.673913
LB,7.117853,4.699216,20.964706,119.178947,73.604457,4.319,34.070652,238.45961
LS,7.470833,5.080625,18.0,108.5,73.882353,4.556667,29.233333,238.705882
OG,7.835362,5.256199,26.680556,102.877419,76.07772,4.790069,27.058442,314.902062
OT,7.823422,5.208718,23.979592,105.341232,77.587121,4.759746,28.055046,313.412879


Let's filter for only participants in the player's position group, and the mean for the position group.

In [25]:
pos_group_data = combine[combine["Pos"] == player_data["Pos"]]
pos_group_mean = pos_group_data.loc[:, "Ht":"Shuttle"].mean().round(2)

Put all the measurements/performances into list.

In [26]:
heights = pos_group_data["Ht"].dropna().to_numpy()
weights = pos_group_data["Wt"].dropna().to_numpy()
fourtys = pos_group_data["40yd"].dropna().to_numpy()
verticals = pos_group_data["Vertical"].dropna().to_numpy()
bench_reps = pos_group_data["Bench"].dropna().to_numpy()
broad_jumps = pos_group_data["Broad Jump"].dropna().to_numpy()
three_cones = pos_group_data["3Cone"].dropna().to_numpy()
shuttle_times = pos_group_data["Shuttle"].dropna().to_numpy()

Calculate percentiles each player's data is in relative to position group. Then will divide the percentiles into two scores: physical and drill Score. Physical score will be the mean of the  player's height and weight percentiles. Drill score will be the mean of the player's fourty, vertical, bench press, broad jump, three cone, shuttle percentiles. A player's drill score will automatically be 0 if they did not complete at least 2 drills.

In [27]:
pd.options.mode.chained_assignment = None

pos_group_data["Ht Perc"] = pos_group_data["Ht"].map(
    lambda x: stats.percentileofscore(heights, x))
pos_group_data["Wt Perc"] = pos_group_data["Wt"].map(
    lambda x: stats.percentileofscore(weights, x))
pos_group_data["Fourty Perc"] = pos_group_data["40yd"].map(
    lambda x: 100 - stats.percentileofscore(fourtys, x))
pos_group_data["Vertical Perc"] = pos_group_data["Vertical"].map(
    lambda x: stats.percentileofscore(verticals, x))
pos_group_data["Bench Perc"] = pos_group_data["Bench"].map(
    lambda x: stats.percentileofscore(bench_reps, x))
pos_group_data["Broad Jump Perc"] = pos_group_data["Broad Jump"].map(
    lambda x: stats.percentileofscore(broad_jumps, x))
pos_group_data["3Cone Perc"] = pos_group_data["3Cone"].map(
    lambda x: 100 - stats.percentileofscore(three_cones, x))
pos_group_data["Shuttle Perc"] = pos_group_data["Shuttle"].map(
    lambda x: 100 - stats.percentileofscore(shuttle_times, x))

pos_group_data["Size Score"] = pos_group_data.loc[:,
                                                      ["Ht Perc","Wt Perc", "Bench"]].apply(np.mean, axis=1)
pos_group_data["Speed Score"] = pos_group_data.loc[:,
                                                   ["Fourty Perc"]].apply(np.mean, axis=1)
pos_group_data["Explosive Score"] = pos_group_data.loc[:,
                                                   ["Broad Jump Perc", "Vertical Perc"]].apply(np.mean, axis=1)
pos_group_data["Agility Score"] = pos_group_data.loc[:,
                                                   ["3Cone Perc", "Shuttle Perc"]].apply(np.mean, axis=1)


size_scores = pos_group_data["Size Score"].dropna().to_numpy()
speed_scores = pos_group_data["Speed Score"].dropna().to_numpy()
explosive_score = pos_group_data["Explosive Score"].dropna().to_numpy()
agility_score = pos_group_data["Agility Score"].dropna().to_numpy()

pos_group_data["Size Score"] = pos_group_data["Size Score"].map(
    lambda x: stats.percentileofscore(size_scores, x))
pos_group_data["Speed Score"] = pos_group_data["Speed Score"].map(
    lambda x: stats.percentileofscore(speed_scores, x))
pos_group_data["Explosive Score"] = pos_group_data["Explosive Score"].map(
    lambda x: stats.percentileofscore(explosive_score, x))
pos_group_data["Agility Score"] = pos_group_data["Agility Score"].map(
    lambda x: stats.percentileofscore(agility_score, x))

pos_group_data["Size Score"] = pos_group_data["Size Score"].round(2)
pos_group_data["Speed Score"] = pos_group_data["Speed Score"].round(2)
pos_group_data["Explosive Score"] = pos_group_data["Explosive Score"].round(2)
pos_group_data["Agility Score"] = pos_group_data["Agility Score"].round(2)

Outputs the player's Size, Speed, Explosive, and Agility scores.

In [28]:
player_scores = pos_group_data.loc[player_name,
                                   "Size Score":"Agility Score"]
player_scores

Size Score          22.8
Speed Score        91.49
Explosive Score    72.55
Agility Score      98.23
Name: Odell Beckham Jr., dtype: object

Calculate the Athletic Score.

In [29]:
pos_group_data["Athletic Score"] = pos_group_data[["Size Score", "Speed Score", "Explosive Score", "Agility Score"]].mean(axis=1)

athletic_scores = pos_group_data["Athletic Score"].dropna().to_numpy()

pos_group_data["Athletic Score"] = pos_group_data["Athletic Score"].map(
    lambda x: stats.percentileofscore(athletic_scores, x))
pos_group_data["Athletic Score"] = pos_group_data["Athletic Score"].apply(
    lambda x: 0 if pos_group_data.loc[player_name, "40yd":"Shuttle"].isna().sum() > 4 else x)

pos_group_data["Athletic Score"] = pos_group_data["Athletic Score"].round(2)

Output the player's Athletic Score.

In [30]:
athletic_score = pos_group_data.loc[player_name, "Athletic Score"]
print(f"{player_name} is a top {athletic_score}% athlete at the {player_data['Pos']} position.")

Odell Beckham Jr. is a top 86.64% athlete at the WR position.
