<p style="font-size:24px"><b>Predicting Fantasy Football Stars</b></p>

I love fantasy football. I love sports in general, but nothing beats sitting down for an entire Sunday and watching football with your league mates. To me, football is easily the best fantasy sport. Basketball and baseball have so many games each week that you are constantly updating your lineup, and soccer has too few countable stats to build a satisfying scoring system on. Football has the right mix of frequency and statistical measurement: games are neither too frequent nor too scarce, and points come from yards and touchdowns, so one play can literally make or break your week. Fantasy football is exciting and fun, and I’m hoping I can predict who I should draft next year.

<i>Cleaning Data</i>

First and foremost, I need a data set to work with. Thankfully, Funk Monarch on Kaggle posted a huge data set containing every important offensive metric for every player from every week since 2012. That is a lot of data to sort through, but with the help of SQL queries and ChatGPT, I isolated what I determined to be the important metrics for flex positions (wide receivers, running backs, and tight ends) in one spreadsheet, and for quarterbacks in another. Now, I can begin building a predictive model.

<i>Building The Model (Flex Players)</i>

To build the actual model, I moved from MySQL to VS Code so I could use Python, which has various libraries that make building predictive models much, much easier. Also, everything in the data is geared toward PPR scoring, so I apologize in advance to everyone in standard and half-PPR leagues.

In [89]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

First, I want to construct a very basic baseline model to compare our future models against. This model relies only on the consensus rankings from various fantasy football sites like ESPN, Sleeper, and more, and it sets the accuracy score our own models need to beat. To use the data, I created a table in SQL that contains all of the information relevant to this base/control model.


In [90]:
control_flex_path = '.../Data/Created Data/control_flex.csv'
control_flex = pd.read_csv(control_flex_path)
print(control_flex.head(10))

   rank player_name  year position  points  preseason_rank  postseason_rank  \
0   257    AJ Brown  2019       WR   217.1             164               38   
1    42    AJ Brown  2020       WR   247.5              40               21   
2    24    AJ Brown  2021       WR   180.9              23               59   
3    31    AJ Brown  2022       WR   299.6              29               10   
4    15    AJ Brown  2023       WR   289.6              14                7   
5   543    AJ Derby  2016       TE    30.0             319              257   
6   410    AJ Derby  2017       TE     4.0             254              386   
7   475    AJ Derby  2018       TE    13.8             283              322   
8   167   AJ Dillon  2020       RB    40.3             130              261   
9   101   AJ Dillon  2021       RB   185.6              85               56   

   accuracy_score  yearly_ac  total_accuracy_score  
0               0         10                    94  
1               0       

As you can see, the table contains each player's preseason_rank, which uses the consensus rankings, and their postseason_rank, which ranks how highly they finished based on their total points. An accuracy score was then calculated for each player-season: 2 if the preseason and postseason ranks are equal, and 1 if both ranks are in the top 20, which shows some accuracy. Those scores were summed into a yearly accuracy score (yearly_ac), and the yearly scores were summed into a total accuracy score for the control model across all years (total_accuracy_score).
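The scoring rule described above was computed in SQL, but it can be sketched in a few lines of Python. This is only an illustration on made-up rows: the helper name `control_accuracy` and the toy players are hypothetical, and it assumes an exact rank match earns 2 points even outside the top 20, which is one reading of the rule.

```python
import pandas as pd

def control_accuracy(preseason_rank: int, postseason_rank: int, top_n: int = 20) -> int:
    """Score one player-season (hypothetical helper mirroring the rule above)."""
    if preseason_rank == postseason_rank:
        return 2  # exact rank match (assumed to count regardless of rank)
    if preseason_rank <= top_n and postseason_rank <= top_n:
        return 1  # both ranks inside the top 20
    return 0

# Toy rows, not taken from control_flex.csv
toy = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C"],
    "preseason_rank": [5, 14, 164],
    "postseason_rank": [5, 7, 38],
})
toy["accuracy_score"] = [
    control_accuracy(pre, post)
    for pre, post in zip(toy["preseason_rank"], toy["postseason_rank"])
]
yearly_total = int(toy["accuracy_score"].sum())  # analogous to yearly_ac for one season
print(toy)
print(f"Yearly accuracy score: {yearly_total}")
```

Summing `accuracy_score` within each year would give the equivalent of `yearly_ac`, and summing across years the equivalent of `total_accuracy_score`.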

Now that that is done, we can begin to construct our own model. Obviously, for our model, using the fantasy point totals from the year before is cheating, and we will not be doing that. Instead, we will be looking to use other metrics. From the dataset provided, I ran a correlation test to see which statistical measures would be most applicable.

In [91]:
flex_data_path = '.../Data/Created Data/yearly_flex.csv'
flex_data = pd.read_csv(flex_data_path)
numeric_columns = flex_data.select_dtypes(include=['number'])
correlation = numeric_columns.corr()
ppr_correlation = correlation['fantasy_points_ppr'].sort_values(ascending=False).head(10)
print(ppr_correlation)

fantasy_points_ppr             1.000000
total_yards                    0.977413
receptions                     0.908898
ppr_ppg                        0.908640
receiving_yards_after_catch    0.905792
total_tds                      0.892195
targets                        0.880267
ypg                            0.873735
receiving_yards                0.857100
receiving_first_downs          0.840573
Name: fantasy_points_ppr, dtype: float64


Based on the test, I decided to train my model based on total yards, receptions, receiving yards after catch, total touchdowns, targets, and PPR PPG.

In [92]:
top_features = ['total_yards','receptions','receiving_yards_after_catch','total_tds','targets','ppr_ppg']
X = numeric_columns[top_features]
y = numeric_columns['fantasy_points_ppr']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")

Mean Squared Error: 2.929150192509859
R-squared: 0.9995025063574153
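
An R-squared this close to 1 is less surprising once you remember that PPR points are themselves computed from yards, receptions, and touchdowns from the same season. The sketch below uses synthetic numbers and assumes the common PPR rules (0.1 points per yard, 1 per reception, 6 per touchdown, ignoring fumbles and bonuses); the variable names are made up for the illustration. A linear regression recovers those scoring weights almost exactly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 200

# Synthetic season totals, not real player data
total_yards = rng.integers(0, 2000, size=n)
receptions = rng.integers(0, 120, size=n)
total_tds = rng.integers(0, 20, size=n)

# Assumed PPR scoring: 0.1 per yard, 1 per reception, 6 per touchdown
points = 0.1 * total_yards + 1.0 * receptions + 6.0 * total_tds

X_demo = np.column_stack([total_yards, receptions, total_tds])
demo_model = LinearRegression().fit(X_demo, points)
demo_r2 = r2_score(points, demo_model.predict(X_demo))

print("Recovered weights:", demo_model.coef_.round(3))
print("R-squared:", round(demo_r2, 6))
```

In other words, the fit mostly measures how well the features reconstruct the scoring formula for a season that has already happened, which is a separate question from how well they forecast the next season.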


<i>Testing the Algorithm</i>

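Now that we have our model (which has a low MSE of about 2.9, meaning a typical error of roughly 1.7 fantasy points, and an almost perfect R^2), I want to test how strong it is. To do this, I used the model to project each season's top 20 players and compared its accuracy score with the control model's.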

In [93]:
def predict_next_year(train_year):

    data_train = flex_data[flex_data['season'] == train_year]
    data_actual = flex_data[flex_data['season'] == (train_year + 1)]
    numeric_columns_train = data_train.select_dtypes(include=['number'])

    X_train = numeric_columns_train[top_features]
    y_train = numeric_columns_train['fantasy_points_ppr']

    model = LinearRegression()
    model.fit(X_train, y_train)

    data_predict = data_train.copy()
    data_predict['predicted_fantasy_points_ppr'] = model.predict(X_train)

    top_20_predicted = []
    top_20_actual = []
    for _, row in data_predict[['name', 'predicted_fantasy_points_ppr']].sort_values(by='predicted_fantasy_points_ppr', ascending=False).iterrows():
        if row['name'] not in [player['name'] for player in top_20_predicted]:
            top_20_predicted.append({'name': row['name'], 'predicted_fantasy_points_ppr': row['predicted_fantasy_points_ppr']})
        if len(top_20_predicted) == 20:
            break

    if not data_actual.empty and 'name' in data_actual.columns:
        for _, row in data_actual[['name', 'fantasy_points_ppr']].sort_values(by='fantasy_points_ppr', ascending=False).iterrows():
            if row['name'] not in [player['name'] for player in top_20_actual]:
                top_20_actual.append({'name': row['name'], 'fantasy_points_ppr': row['fantasy_points_ppr']})
            if len(top_20_actual) == 20:
                break
    else:
        for i in range(20):
            top_20_actual.append({'name': 'NA', 'fantasy_points_ppr': 'NA'})

    top_20_predicted_df = pd.DataFrame(top_20_predicted)
    top_20_predicted_df['predicted_rank'] = range(1, 21)

    top_20_actual_df = pd.DataFrame(top_20_actual)
    top_20_actual_df['actual_rank'] = range(1, 21)

    if 'name' in top_20_predicted_df.columns and 'name' in top_20_actual_df.columns:
        if 'NA' not in top_20_actual_df['name'].values:
            score = accuracy_test(top_20_actual_df, top_20_predicted_df)
        else:
            score = 'NA'
    else:
        score = 'NA'

    combined_results = pd.DataFrame({
        'Rank': range(1, 21),
        'Projected Leaders': top_20_predicted_df['name'],
        'Projected PPR PPG': top_20_predicted_df['predicted_fantasy_points_ppr'],
        '': range(1, 21),
        'Actual Leaders': top_20_actual_df['name'],
        'Actual PPR Total': top_20_actual_df['fantasy_points_ppr']
    })

    print(f"Projected vs Actual Leaders for the {train_year + 1} Season based on {train_year} Stats:")
    print(combined_results)
    print(f"Accuracy Score: {score}")
    print(f"Control Score: {control_flex[control_flex['year'] == train_year + 1]['yearly_ac'].unique()}")
    return score

In [94]:
def accuracy_test(actual, predicted):
    score = 0
    predicted_ranks = {row['name']: row['predicted_rank'] for index, row in predicted.iterrows()}
    actual_ranks = {row['name']: row['actual_rank'] for index, row in actual.iterrows()}
    for player in actual_ranks:
        if player in predicted_ranks:
            if predicted_ranks[player] == actual_ranks[player]:
                score += 2  # Exact rank match
            else:
                score += 1  # Player is in the top 20 but not the exact rank

    return(score)
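
As a cross-check on the scoring logic, the same rule can be written with a pandas merge. This is a sketch and not a replacement for the function above; `accuracy_test_merge` and the toy frames are hypothetical, and it assumes each name appears only once per frame (which the de-duplication in predict_next_year is meant to guarantee).

```python
import pandas as pd

def accuracy_test_merge(actual: pd.DataFrame, predicted: pd.DataFrame) -> int:
    """1 point per player on both lists, plus 1 more if the ranks match exactly."""
    both = actual.merge(predicted, on="name", how="inner")
    exact = (both["actual_rank"] == both["predicted_rank"]).sum()
    return int(len(both) + exact)

# Toy top-3 lists instead of real top-20 lists
toy_actual = pd.DataFrame({"name": ["A", "B", "C"], "actual_rank": [1, 2, 3]})
toy_predicted = pd.DataFrame({"name": ["A", "C", "D"], "predicted_rank": [1, 2, 3]})

toy_score = accuracy_test_merge(toy_actual, toy_predicted)
print(toy_score)  # A: exact match (2) + C: on both lists, different rank (1) = 3
```

With two top-20 lists the maximum possible score is 40, which gives some context for the yearly scores printed further down.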

In [95]:
def test_data():
    # Sum the model's accuracy scores over training years 2013-2022
    # (i.e. predicted seasons 2014-2023).
    total_score = 0
    total_score_c = 0
    for i in range(3,13):
        score = predict_next_year(2010+i)
        total_score += score
        # total_accuracy_score is the control's all-years total stored on each row,
        # so this simply keeps the value from the most recent year's rows.
        total_score_c = control_flex[control_flex['year'] == 2010 + i]['total_accuracy_score'].unique()
    print(f"Total Score: {total_score}")
    print(f"Total Score (Control): {total_score_c}")
test_data()

Projected vs Actual Leaders for the 2014 Season based on 2013 Stats:
    Rank Projected Leaders  Projected PPR PPG        Actual Leaders  \
0      1    Jamaal Charles         378.363094   1     Antonio Brown   
1      2        Matt Forte         335.763961   2      Le'Veon Bell   
2      3      LeSean McCoy         327.032161   3    DeMarco Murray   
3      4  Demaryius Thomas         318.014981   4        Matt Forte   
4      5     Antonio Brown         309.098671   5  Demaryius Thomas   
5      6          AJ Green         306.096294   6      Jordy Nelson   
6      7    Calvin Johnson         303.857212   7        Dez Bryant   
7      8      Jimmy Graham         303.557052   8    Marshawn Lynch   
8      9  Brandon Marshall         301.859309   9  Emmanuel Sanders   
9     10        Dez Bryant         294.326344  10       Julio Jones   
10    11   Knowshon Moreno         293.799293  11     Odell Beckham   
11    12    Alshon Jeffery         282.038986  12      Randall Cobb   
12    13

<i>Improving the Model</i>

Based on the results, my model had a combined accuracy score of 80 vs the control's score of 94. My model clearly has a long way to go before it can compete with the experts at ESPN. To try to improve it, I found a data set on Kaggle from Nick Cantalupa that has every important team metric from the last 20 years. After cleaning and fixing some inconsistent variables, I joined my 2 data sets together to get one big flex player data set. I then ran a correlation test on that to find out which team stats are most applicable.

In [96]:
flex_data_path = '.../Data/Created Data/flex_teamonly.csv'
flex_data = pd.read_csv(flex_data_path)
numeric_columns = flex_data.select_dtypes(include=['number'])
correlation = numeric_columns.corr()
fantasy_points_ppr_correlation = correlation['fantasy_points_ppr'].sort_values(ascending=False).head(10)
print(fantasy_points_ppr_correlation)

fantasy_points_ppr      1.000000
total_team_yds          0.130583
pass_yds                0.129999
pass_net_yds_per_att    0.123012
points                  0.120800
yds_per_play_offense    0.120282
pass_fd                 0.119585
first_down              0.116607
pass_td                 0.111316
score_pct               0.106461
Name: fantasy_points_ppr, dtype: float64


While the correlation is minimal (topping out at only 0.13), I still want to include this data in my model because, from a logical standpoint, it makes sense to factor in how the player's team is doing when estimating how well the player will do.

Since the correlation is so small, I didn't stick entirely to the top correlated attributes and experimented with others. Ultimately, I decided to add how many points the player's team scored that year ('points'), how many offensive plays the team ran ('plays_offense'), and how many games the player played ('games'; this wasn't part of the new team data, but rather something that came to mind while I was working on this). I chose these because I believe that a fantasy football player is only as good as his team. If the team isn't putting up yards on offense or getting near the end zone, then how can I expect the player to produce?

In [97]:
flex_data_path = '.../Data/Created Data/flex_team.csv'
flex_data = pd.read_csv(flex_data_path)
numeric_columns = flex_data.select_dtypes(include=['number'])

top_features = ['ypg','total_yards','total_tds','receiving_yards_after_catch','receptions','ppr_ppg','games','points','plays_offense']
X = numeric_columns[top_features]
y = numeric_columns['fantasy_points_ppr']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")

Mean Squared Error: 3.258201569787962
R-squared: 0.9994921975729246


Adding these metrics to my model actually increased the MSE by about 0.33 (from 2.93 to 3.26). Regardless, I will check it against the existing data to see how it performs against the control.

In [98]:
test_data()

Projected vs Actual Leaders for the 2014 Season based on 2013 Stats:
    Rank Projected Leaders  Projected PPR PPG        Actual Leaders  \
0      1    Jamaal Charles         378.428921   1     Antonio Brown   
1      2        Matt Forte         336.769443   2      Le'Veon Bell   
2      3      LeSean McCoy         328.330262   3        Matt Forte   
3      4  Demaryius Thomas         318.258946   4  Demaryius Thomas   
4      5     Antonio Brown         309.789018   5      Jordy Nelson   
5      6          AJ Green         306.124287   6    Marshawn Lynch   
6      7    Calvin Johnson         303.684499   7  Emmanuel Sanders   
7      8      Jimmy Graham         302.749090   8       Julio Jones   
8      9  Brandon Marshall         302.180192   9     Odell Beckham   
9     10   Knowshon Moreno         294.444767  10      Randall Cobb   
10    11    Alshon Jeffery         283.144931  11     Jeremy Maclin   
11    12       Eric Decker         281.662998  12        Eddie Lacy   
12    13

Despite having a higher MSE, this upgraded model performed better than the previous one (an accuracy score of 88, 8 more than the previous 80). So far, our model keeps improving and is closing the gap on the base copy-and-paste ESPN model (94), which is great. However, we are far from finished. Out of curiosity, I want to see if the model is skewed towards a certain position (i.e. does it favor WRs over RBs).

In [99]:
def predict_next_year_position(train_year):
    data_train = flex_data[flex_data['season'] == train_year]
    data_actual = flex_data[flex_data['season'] == (train_year + 1)]
    numeric_columns_train = data_train.select_dtypes(include=['number'])

    X_train = numeric_columns_train[top_features]
    y_train = numeric_columns_train['fantasy_points_ppr']

    model = LinearRegression()
    model.fit(X_train, y_train)

    data_predict = data_train.copy()
    data_predict['predicted_fantasy_points_ppr'] = model.predict(X_train)

    top_20_predicted = []
    top_20_actual = []
    for _, row in data_predict[['name', 'position', 'predicted_fantasy_points_ppr']].sort_values(by='predicted_fantasy_points_ppr', ascending=False).iterrows():
        if row['name'] not in [player['name'] for player in top_20_predicted]:
            top_20_predicted.append({'name': row['name'], 'position': row['position'], 'predicted_fantasy_points_ppr': row['predicted_fantasy_points_ppr']})
        if len(top_20_predicted) == 20:
            break

    if not data_actual.empty and 'name' in data_actual.columns:
        for _, row in data_actual[['name', 'position', 'fantasy_points_ppr']].sort_values(by='fantasy_points_ppr', ascending=False).iterrows():
            if row['name'] not in [player['name'] for player in top_20_actual]:
                top_20_actual.append({'name': row['name'], 'position': row['position'], 'fantasy_points_ppr': row['fantasy_points_ppr']})
            if len(top_20_actual) == 20:
                break
    else:
        for i in range(20):
            top_20_actual.append({'name': 'NA', 'position': 'NA', 'fantasy_points_ppr': 'NA'})

    top_20_predicted_df = pd.DataFrame(top_20_predicted)
    top_20_predicted_df['predicted_rank'] = range(1, 21)

    top_20_actual_df = pd.DataFrame(top_20_actual)
    top_20_actual_df['actual_rank'] = range(1, 21)

    if 'name' in top_20_predicted_df.columns and 'name' in top_20_actual_df.columns:
        if 'NA' not in top_20_actual_df['name'].values:
            score = accuracy_test(top_20_actual_df, top_20_predicted_df)
        else:
            score = 'NA'
    else:
        score = 'NA'

    combined_results = pd.DataFrame({
        'Rank': range(1, 21),
        'Projected Leaders': top_20_predicted_df['name'] + " (" + top_20_predicted_df['position'] + ")",
        'Projected PPR PPG': top_20_predicted_df['predicted_fantasy_points_ppr'],
        '': range(1, 21),
        'Actual Leaders': top_20_actual_df['name'] + " (" + top_20_actual_df['position'] + ")",
        'Actual PPR Total': top_20_actual_df['fantasy_points_ppr']
    })

    predicted_positions_count = top_20_predicted_df['position'].value_counts()
    actual_positions_count = top_20_actual_df['position'].value_counts()

    position_comparison = pd.DataFrame({
        'Position': predicted_positions_count.index.union(actual_positions_count.index),
        'Predicted Count': predicted_positions_count.reindex(predicted_positions_count.index.union(actual_positions_count.index), fill_value=0),
        'Actual Count': actual_positions_count.reindex(predicted_positions_count.index.union(actual_positions_count.index), fill_value=0)
    }).reset_index(drop=True)
    position_comparison['Difference'] = position_comparison['Predicted Count'] - position_comparison['Actual Count']

    print(f"Projected vs Actual Leaders for the {train_year + 1} Season based on {train_year} Stats:")
    print(combined_results)
    print(f"Accuracy Score: {score}")
    print("\nPosition Comparison:")
    print(position_comparison)
    
    return score, position_comparison

def test_data_position():
    total_score = 0
    total_score_c = 0
    total_position_diff = pd.DataFrame(columns=['Position', 'Predicted Count', 'Actual Count', 'Difference'])
    
    for i in range(3, 13):
        score, position_comparison = predict_next_year_position(2010 + i)
        total_score += score
        total_position_diff = pd.concat([total_position_diff, position_comparison], ignore_index=True)
        score_c = control_flex[control_flex['year'] == 2010 + i]['yearly_ac'].unique()
        total_score_c = control_flex[control_flex['year'] == 2010 + i]['total_accuracy_score'].unique()
    
    total_position_diff = total_position_diff.groupby('Position').agg(
        Predicted_Count_Total=('Predicted Count', 'sum'),
        Actual_Count_Total=('Actual Count', 'sum'),
        Difference_Total=('Difference', 'sum'),
        Difference_Max=('Difference', 'max'),
        Difference_Min=('Difference', 'min')
    ).reset_index()
    
    print(f"Total Score: {total_score}")
    print(f"Total Score (Control): {total_score_c}")
    print("\nTotal Position Difference over all years:")
    print(total_position_diff)

test_data_position()


Projected vs Actual Leaders for the 2014 Season based on 2013 Stats:
    Rank      Projected Leaders  Projected PPR PPG             Actual Leaders  \
0      1    Jamaal Charles (RB)         378.428921   1     Antonio Brown (WR)   
1      2        Matt Forte (RB)         336.769443   2      Le'Veon Bell (RB)   
2      3      LeSean McCoy (RB)         328.330262   3        Matt Forte (RB)   
3      4  Demaryius Thomas (WR)         318.258946   4  Demaryius Thomas (WR)   
4      5     Antonio Brown (WR)         309.789018   5      Jordy Nelson (WR)   
5      6          AJ Green (WR)         306.124287   6    Marshawn Lynch (RB)   
6      7    Calvin Johnson (WR)         303.684499   7  Emmanuel Sanders (WR)   
7      8      Jimmy Graham (TE)         302.749090   8       Julio Jones (WR)   
8      9  Brandon Marshall (WR)         302.180192   9     Odell Beckham (WR)   
9     10   Knowshon Moreno (RB)         294.444767  10      Randall Cobb (WR)   
10    11    Alshon Jeffery (WR)         

Summed over all of the available years, the model's net positional error is small: it predicts only 1 more WR, and 1 fewer RB, than actually appeared. Yes, this differs from year to year (some years have more inconsistent positioning), but the volatility is relatively consistent across all of the positions (as shown in the max/min table). Running backs are the most volatile, but they are also the most injury-prone and have the most boom-bust potential, so this makes sense.
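
The per-position comparison can also be sketched more compactly with `value_counts` and `Series.sub`. The positions below are toy data and not output from the model, and the variable names are made up for the illustration.

```python
import pandas as pd

# Toy top-5 lists of positions instead of real top-20 lists
pred_pos = pd.Series(["RB", "RB", "WR", "WR", "TE"])
act_pos = pd.Series(["RB", "WR", "WR", "WR", "WR"])

pred_counts = pred_pos.value_counts()
act_counts = act_pos.value_counts()

# fill_value=0 treats a position missing from one list as a count of zero
diff = pred_counts.sub(act_counts, fill_value=0).astype(int)
print(diff.sort_index())
```

Because both lists contain the same number of players, the differences always sum to zero: every position the model over-predicts has to be offset by one it under-predicts, which is why a surplus WR shows up alongside a missing RB.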

Out of curiosity, I want to use the same approach I used in the QB Predictor to see if there's a different combination of top features that might improve my model, whose MSE currently stands around 3.2.

In [100]:
from itertools import combinations
from sklearn.feature_selection import RFE

flex_data_path = '.../Data/Created Data/flex_team.csv'
flex_data = pd.read_csv(flex_data_path)
numeric_columns = flex_data.select_dtypes(include=['number'])

correlation = numeric_columns.corr()
fantasy_points_ppr_correlation = correlation['fantasy_points_ppr'].sort_values(ascending=False)

top_25_features = fantasy_points_ppr_correlation.index[1:26].tolist()  # index[0] is 'fantasy_points_ppr'

X = numeric_columns[top_25_features]
y = numeric_columns['fantasy_points_ppr']

model = LinearRegression()
rfe = RFE(model, n_features_to_select=1)
rfe.fit(X, y)

ranking = rfe.ranking_
ranked_features = pd.DataFrame({'Feature': top_25_features, 'Rank': ranking}).sort_values(by='Rank')

def evaluate_top_feature_combinations(X, y, max_features=5):
    results = []
    feature_list = ranked_features['Feature'].tolist()

    for r in range(1, max_features + 1):
        for combo in combinations(feature_list[:max_features], r):
            combo = list(combo)
            
            X_subset = X[combo]
            X_train, X_test, y_train, y_test = train_test_split(X_subset, y, test_size=0.2, random_state=42)

            model = LinearRegression()
            model.fit(X_train, y_train)
            
            y_pred = model.predict(X_test)
            mse = mean_squared_error(y_test, y_pred)
            
            results.append((combo, mse))
    
    results.sort(key=lambda x: x[1])
    
    return results

results = evaluate_top_feature_combinations(X, y, max_features=15)

print("Top 10 feature combinations based on MSE:")
for combo, mse in results[:10]:
    print(f"Features: {combo}, MSE: {mse}")


Top 10 feature combinations based on MSE:
Features: ['target_share', 'air_yards_share', 'total_tds', 'rushing_tds', 'receptions', 'receiving_tds', 'rush_ypg', 'receiving_fumbles', 'ppr_ppg', 'offense_pct', 'ypg', 'total_yards', 'receiving_yards', 'rushing_yards'], MSE: 2.660569513338025
Features: ['target_share', 'air_yards_share', 'total_tds', 'rushing_tds', 'receptions', 'receiving_tds', 'rush_ypg', 'receiving_fumbles', 'ppr_ppg', 'rec_ypg', 'offense_pct', 'ypg', 'total_yards', 'receiving_yards', 'rushing_yards'], MSE: 2.66085912475115
Features: ['target_share', 'air_yards_share', 'total_tds', 'rushing_tds', 'receptions', 'receiving_tds', 'receiving_fumbles', 'ppr_ppg', 'rec_ypg', 'offense_pct', 'ypg', 'total_yards', 'receiving_yards', 'rushing_yards'], MSE: 2.661050988688859
Features: ['target_share', 'air_yards_share', 'total_tds', 'rushing_tds', 'receptions', 'receiving_tds', 'rush_ypg', 'receiving_fumbles', 'ppr_ppg', 'rec_ypg', 'offense_pct', 'total_yards', 'receiving_yards', 'r
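
For a sense of how much work the search above does: with max_features=15, it fits one model for every non-empty subset of the 15 top-ranked features. A quick count, using only the standard library:

```python
from itertools import combinations
from math import comb

max_features = 15

# Number of non-empty subsets of 15 features: C(15,1) + ... + C(15,15) = 2**15 - 1
n_subsets = sum(comb(max_features, r) for r in range(1, max_features + 1))
print(n_subsets)  # 32767

# Sanity check by actually enumerating a smaller case (4 features)
small = sum(1 for r in range(1, 5) for _ in combinations(range(4), r))
print(small)  # 15
```

Each extra feature doubles the number of fits, so this brute-force approach is practical for 15 features but would get slow quickly if the candidate pool were much larger.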

In [101]:
top_features =['target_share', 'air_yards_share', 'total_tds', 'rushing_tds', 'receptions', 'receiving_tds', 'rush_ypg', 'receiving_fumbles', 'ppr_ppg', 'offense_pct', 'ypg', 'total_yards', 'receiving_yards', 'rushing_yards']
test_data()

Projected vs Actual Leaders for the 2014 Season based on 2013 Stats:
    Rank Projected Leaders  Projected PPR PPG        Actual Leaders  \
0      1    Jamaal Charles         378.303206   1     Antonio Brown   
1      2        Matt Forte         336.225348   2      Le'Veon Bell   
2      3      LeSean McCoy         328.612328   3        Matt Forte   
3      4  Demaryius Thomas         318.756769   4  Demaryius Thomas   
4      5     Antonio Brown         311.274887   5      Jordy Nelson   
5      6          AJ Green         306.672038   6    Marshawn Lynch   
6      7    Calvin Johnson         304.936544   7  Emmanuel Sanders   
7      8      Jimmy Graham         304.283147   8       Julio Jones   
8      9  Brandon Marshall         302.726353   9     Odell Beckham   
9     10   Knowshon Moreno         294.345761  10      Randall Cobb   
10    11    Alshon Jeffery         281.254811  11     Jeremy Maclin   
11    12       Eric Decker         280.485542  12        Eddie Lacy   
12    13

Despite this new model having an MSE of 2.66, roughly an 18% decrease from 3.26, the accuracy score dropped by 1 to 87. Currently, the best version of the model (accuracy score of 88) performs only about 6% worse than the consensus rankings of various fantasy football sites (94). I would also like to point out that the current model does not account for rookies, since they do not have any data from the prior year, and rookies appear frequently on these lists.
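
As a quick check of the arithmetic behind those comparisons, here are the relative changes computed from the figures printed earlier in the notebook (the variable names are just for this illustration):

```python
# MSE of the team-augmented model vs. the best feature combination found by the search
mse_before = 3.258201569787962
mse_after = 2.660569513338025
mse_drop_pct = (mse_before - mse_after) / mse_before * 100

# Best accuracy score achieved so far vs. the control's total accuracy score
best_score, control_score = 88, 94
gap_pct = (control_score - best_score) / control_score * 100

print(f"MSE decrease: {mse_drop_pct:.1f}%")
print(f"Gap to control: {gap_pct:.1f}%")
```

The fact that a lower MSE came with a slightly lower accuracy score suggests the two metrics reward different things: MSE measures how well same-season points are reconstructed, while the accuracy score measures whether the right names land in the next season's top 20.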