# NFL Draft Tool

### Brandon Wallace

This notebook uses scraped data from hundreds of 2023 NFL mock drafts by sports journalists to build a drafting tool. NFL teams typically enter the draft with several college football players they hope to select. However, the draft is a highly uncertain environment: no team has perfect information about the intentions of other teams. Because the draft is ordered, each team closely studies the likely behavior of the teams picking ahead of it.

This project builds an empirical distribution for each player in the draft based on how often he is mocked at each pick. I then simulate the first round many times by sampling from those distributions.

Lastly, I focus on the Pittsburgh Steelers, my favorite team, and estimate the likelihood that a player the Steelers want is still available at their pick, number 17.
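The availability question can be sketched on toy data. The snippet below is a minimal illustration, not necessarily the method used later in this notebook: it assumes each slot is drawn from the players mocked at that slot, skipping anyone already taken, and falls back to any remaining player if that pool is empty. The names `simulate_draft`, `availability`, and `toy_mocks` are hypothetical, and the four toy mocks are made up.

```python
import random

def simulate_draft(mocks, rng):
    """Simulate one draft: each slot draws from players mocked there who are still on the board."""
    n_picks = len(mocks[0])
    taken = set()
    draft = []
    for slot in range(n_picks):
        # One entry per mock, so players mocked here more often are more likely to be drawn
        candidates = [m[slot] for m in mocks if m[slot] not in taken]
        if not candidates:
            # Assumed fallback: everyone mocked at this slot is gone, so use any remaining player
            candidates = [p for m in mocks for p in m if p not in taken]
        choice = rng.choice(candidates)
        taken.add(choice)
        draft.append(choice)
    return draft

def availability(mocks, player, pick, n_sims=2000, seed=0):
    """Estimate the probability that `player` is still on the board at 1-indexed slot `pick`."""
    rng = random.Random(seed)
    available = 0
    for _ in range(n_sims):
        draft = simulate_draft(mocks, rng)
        if player not in draft[:pick - 1]:
            available += 1
    return available / n_sims

# Made-up mocks: each inner list is one mock draft, in pick order
toy_mocks = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "C", "D"],
    ["A", "B", "D", "C"],
]
print(availability(toy_mocks, "C", pick=3))
```

For the Steelers, the same call would use `pick=17` and the real mocks as lists of 32 names. Tracking who is already taken keeps a player from being drafted twice within one simulated round.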

In [106]:
import pandas as pd
import numpy as np
import random
import warnings
import altair as alt
warnings.filterwarnings('ignore')

In [107]:
# Load the mock draft CSV into a pandas DataFrame
column_names = ["Publication", "Author", "Date"] + [f'pick {i}' for i in range(1, 43)]
df = pd.read_csv('mock_draft_data.csv', header=None)
df.columns = column_names

# Clean data 
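# regex=True is required: pandas 2.0+ treats the pattern as a literal string by default
df['Publication'] = df['Publication'].astype(str).str.replace(r"\[|\]|'", '', regex=True)
df['Author'] = df['Author'].astype(str).str.extract(r"By (\w+ \w+)")[0]
df['Date'] = df['Date'].astype(str).str.replace(r"\[|\]|'", '', regex=True)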
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y') 
# Remove non-first round picks
df = df.iloc[:, :-10]
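To make the cleaning step concrete, here is a small sketch of the same regex operations on hypothetical raw values. The bracket-and-quote wrapping and the trailing text after the author name are assumptions about the scraped format, not taken from the actual CSV.

```python
import pandas as pd

# Hypothetical raw rows; the wrapping characters are an assumed scrape format
raw = pd.DataFrame({
    "Publication": ["['Sporting News']", "['Bucs Nation']"],
    "Author": ["['By Vinnie Iyer']", "['By Mike Kiwak | Apr 27']"],
    "Date": ["['04/27/23']", "['04/27/23']"],
})

# Matches a literal '[', ']' or single quote
strip_wrappers = r"\[|\]|'"

clean = pd.DataFrame({
    "Publication": raw["Publication"].str.replace(strip_wrappers, "", regex=True),
    # Keeps only the first two words after "By "
    "Author": raw["Author"].str.extract(r"By (\w+ \w+)")[0],
    "Date": pd.to_datetime(
        raw["Date"].str.replace(strip_wrappers, "", regex=True),
        format="%m/%d/%y",
    ),
})
print(clean)
```

The author pattern only captures two `\w+` words, so a three-part or hyphenated name would be truncated or come back as `NaN`.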

In [108]:
# Checking Cleaned Data
df.head(5)

Unnamed: 0,Publication,Author,Date,pick 1,pick 2,pick 3,pick 4,pick 5,pick 6,pick 7,...,pick 23,pick 24,pick 25,pick 26,pick 27,pick 28,pick 29,pick 30,pick 31,pick 32
0,Sporting News,Vinnie Iyer,2023-04-27,Bryce Young,Will Anderson Jr.,Tyree Wilson,Anthony Richardson,Jalen Carter,Devon Witherspoon,Christian Gonzalez,...,Jordan Addison,Brian Branch,Drew Sanders,Antonio Johnson,Trenton Simpson,Michael Mayer,Lukas Van Ness,Nolan Smith,Darnell Wright,Calijah Kancey
1,Bucs Nation,Mike Kiwak,2023-04-27,Bryce Young,Will Anderson Jr.,C.J. Stroud,Anthony Richardson,Tyree Wilson,Devon Witherspoon,Nolan Smith,...,Zay Flowers,Brian Branch,Emmanuel Forbes,Calijah Kancey,O'Cyrus Torrence,Jahmyr Gibbs,Myles Murphy,Dalton Kincaid,Jordan Addison,Quentin Johnston
2,Draft Countdown,Brian Bosarge,2023-04-27,Bryce Young,Will Anderson Jr.,Paris Johnson Jr.,C.J. Stroud,Tyree Wilson,Devon Witherspoon,Darnell Wright,...,Zay Flowers,Michael Mayer,O'Cyrus Torrence,Will McDonald IV,Quentin Johnston,Emmanuel Forbes,Jahmyr Gibbs,Deonte Banks,Myles Murphy,Bryan Bresee
3,Behind The Steel Curtain,Mike Frazer,2023-04-27,C.J. Stroud,Jalen Carter,Will Levis,Peter Skoronski,Christian Gonzalez,Tyree Wilson,Calijah Kancey,...,O'Cyrus Torrence,Brian Branch,Bijan Robinson,Jack Campbell,Michael Mayer,Emmanuel Forbes,Josh Downs,Quentin Johnston,Anthony Richardson,Lukas Van Ness
4,Barstool Sports,Matt Fitzgerald,2023-04-27,Bryce Young,Tyree Wilson,Will Anderson Jr.,Will Levis,Jalen Carter,Devon Witherspoon,Christian Gonzalez,...,Myles Murphy,Calijah Kancey,Quentin Johnston,Michael Mayer,Jalin Hyatt,Dalton Kincaid,Zay Flowers,Steve Avila,Lukas Van Ness,Jordan Addison


In [109]:
# Extract player names from pick columns
pick_columns = df.columns[3:]  # Assumes pick columns start from index 3

# Create Melted df and distribution of picks
melted_df = df.melt(id_vars=['Publication', 'Author', 'Date'], value_vars=pick_columns, var_name='Pick', value_name='Player')

# Get the full distribution: number of mocks for each (player, pick) pair
distribution = melted_df.groupby(['Player', 'Pick'])['Player'].count().sort_values(ascending=False)
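The counts above can also be expressed as per-pick probabilities. The following is a sketch on a made-up long-format table; the rows in `toy` are hypothetical and `pick_probs` is an illustrative name, not a variable used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical long-format mocks: four mock drafts, two picks each
toy = pd.DataFrame({
    "Pick": ["pick 1"] * 4 + ["pick 2"] * 4,
    "Player": [
        "Bryce Young", "Bryce Young", "Bryce Young", "C.J. Stroud",
        "C.J. Stroud", "C.J. Stroud", "Will Anderson Jr.", "Bryce Young",
    ],
})

# Rows are picks, columns are players; normalize="index" makes each row sum to 1
pick_probs = pd.crosstab(toy["Pick"], toy["Player"], normalize="index")
print(pick_probs.round(2))
```

Each row of `pick_probs` is the empirical distribution of players at that pick. On the real data, the equivalent call would take `melted_df['Pick']` and `melted_df['Player']`.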

In [156]:
# Check distribution variable
distribution 

Player               Pick   
Bryce Young          pick 1     312
Devon Witherspoon    pick 6     173
Jalen Carter         pick 5     163
Will Anderson Jr.    pick 3     131
                     pick 2     115
                               ... 
Adetomiwa Adebawore  pick 18      1
Jalin Hyatt          pick 24      1
                     pick 22      1
                     pick 20      1
                     pick 29      1
Name: Player, Length: 868, dtype: int64

In [116]:
# Check for each player
player_b_young = 'Bryce Young'
b_young_distribution = distribution.loc[player_b_young]
print(b_young_distribution)

Pick
pick 1     312
pick 2      20
pick 4       3
pick 5       2
pick 3       2
pick 6       1
pick 11      1
Name: Player, dtype: int64
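One way to read these counts is as a cumulative "gone by pick N" share. The sketch below copies the Bryce Young counts from the output above. The denominator is his 341 first-round appearances, which assumes mocks that leave him out of the first round can be ignored.

```python
import pandas as pd

# Counts copied from the Bryce Young output above
counts = pd.Series({
    "pick 1": 312, "pick 2": 20, "pick 4": 3, "pick 5": 2,
    "pick 3": 2, "pick 6": 1, "pick 11": 1,
})

# Re-index by numeric slot so sorting follows draft order rather than count
slots = [int(label.split()[1]) for label in counts.index]
by_slot = pd.Series(counts.values, index=slots).sort_index()

# Share of appearances at each slot, then the running total
share = by_slot / by_slot.sum()
gone_by = share.cumsum()
print(gone_by.round(3))
```

Here `gone_by.loc[2]` is 332/341, the share of these mocks in which he is selected within the first two picks.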


In [129]:
# Investigate a single player
player_j_carter = 'Jalen Carter'
j_carter_distribution = melted_df.loc[melted_df['Player'] == player_j_carter, 'Pick']
# Set an explicit order so that the x-axis follows the draft order
pick_order = list(pick_columns)

# Create histogram using Altair
chart_data = pd.DataFrame({'Pick': j_carter_distribution})

chart = alt.Chart(chart_data).mark_bar().encode(
    alt.X('Pick:O', title='Pick', sort=pick_order),  # Sort bars by draft slot, not alphabetically
    alt.Y('count()', title='Frequency')
).properties(
    title=f'Distribution of Mock Draft Order for {player_j_carter} Based on All Mocks'
)

# Display the histogram 
chart

In [115]:
# First round picks simulator 
rounds = 1
simulations = 1000

for i in range(1, rounds + 1):
    print(f"Most Common Selection at Each Pick based on {simulations} Simulations:")
    for pick in pick_columns:
        # Players mocked at this pick, one entry per mock draft
        player_distribution = melted_df.loc[melted_df['Pick'] == pick, 'Player']
        pick_results = []
        # Draw a player in proportion to how often he is mocked at this pick.
        # Each pick is sampled independently, so the same player can lead several picks.
        for _ in range(simulations):
            draft = np.random.choice(player_distribution)
            pick_results.append(draft)

        unique_players, player_counts = np.unique(pick_results, return_counts=True)
        max_count = np.max(player_counts)
        max_player = unique_players[np.argmax(player_counts)]
        max_percentage = max_count / simulations * 100

        print(f"{pick}: {max_player} ({max_count}/{simulations}, {max_percentage:.2f}%)")

Most Common Selection at Each Pick based on 1000 Simulations:
pick 1: Bryce Young (898/1000, 89.80%)
pick 2: Will Anderson Jr. (339/1000, 33.90%)
pick 3: Will Anderson Jr. (382/1000, 38.20%)
pick 4: Will Levis (325/1000, 32.50%)
pick 5: Jalen Carter (478/1000, 47.80%)
pick 6: Devon Witherspoon (490/1000, 49.00%)
pick 7: Christian Gonzalez (322/1000, 32.20%)
pick 8: Bijan Robinson (255/1000, 25.50%)
pick 9: Paris Johnson Jr. (345/1000, 34.50%)
pick 10: Nolan Smith (231/1000, 23.10%)
pick 11: Peter Skoronski (216/1000, 21.60%)
pick 12: Jaxon Smith-Njigba (340/1000, 34.00%)
pick 13: Jaxon Smith-Njigba (189/1000, 18.90%)
pick 14: Broderick Jones (137/1000, 13.70%)
pick 15: Broderick Jones (256/1000, 25.60%)
pick 16: Joey Porter Jr. (244/1000, 24.40%)
pick 17: Joey Porter Jr. (256/1000, 25.60%)
pick 18: Calijah Kancey (215/1000, 21.50%)
pick 19: Anton Harrison (149/1000, 14.90%)
pick 20: Myles Murphy (142/1000, 14.20%)
pick 21: Jordan Addison (212/1000, 21.20%)
pick 22: Deonte Banks (144/10