# Predicting NBA Players of the Week: A Logistic Regression Approach
***

![NBAUrl](https://i.pinimg.com/originals/9e/d1/3d/9ed13d1846a5f262edaea59c29483c02.gif "nba")

This exercise takes NBA box score data from Kaggle (via kagglehub) and attempts to predict the winners of the NBA Player of the Week award for the 2023-24 season, using the prior ten seasons of data as historical precedent. In this notebook, I will walk through the logic and processes I used to clean and transform the data: conditionally filtering dataframes, applying user-defined functions to dataframes, performing arithmetic with datetime objects, aggregating, merging, and concatenating dataframes, and finally feeding the data through a logistic regression model.

Let's get started!

## 1. Background

<img src="https://cdn.nba.com/manage/2024/10/davis-tatum-potw.jpeg" alt="potw" style="width: 700px;"/>

The NBA Player of the Week is awarded to one player from each conference (East / West) for each week of the NBA regular season (with exceptions for the weeks of the NBA Cup semifinals and NBA All-Star game). Information about the criteria for the award is not publicly available (to my best knowledge), but a few general characteristics of players who have historically been winners of the award are:

1) _Outstanding individual performances_ - The player should have strongly positive production on the court and be a major contributor to his team's success
2) _Team record_ - The player's team should win the majority of its games during the week
3) _Availability_ - The player should not miss significant time during the week relative to the team's number of scheduled games in the week

LeBron James is the all-time leader in Player of the Week awards (68 award wins before age 40 as of this notebook's publishing). Basketball Reference maintains a great one-page view of every Player of the Week award winner __[here](https://www.basketball-reference.com/awards/pow.html)__, but I will be using a data export of __[this neat table](https://basketball.realgm.com/nba/awards/by-type/Player-Of-The-Week/30)__ from RealGM as my data source.



## 2. Loading the Data and Making Initial Observations
Start by loading the Python packages that we'll be using:

In [None]:
import kagglehub #Used to connect to the dataset on Kaggle
import pandas as pd #Used for dataframe transformations and manipulations
import numpy as np #Used for a few helper functions
import statsmodels.api as sm #Used to run our logistic regression, analyze the results, and produce predictions

pd.set_option('display.max_columns',20)

Next, let's load the data and preview it in our notebook:

In [None]:
#Load NBA player of the week data, while simultaneously converting the date column to a date datatype
potw_data = pd.read_csv('https://github.com/BryanDfor3/NBA-POTW-PREDICTION/blob/main/NBA%20Players%20of%20the%20Week%20Data.csv?raw=true', parse_dates = ['Date'], date_format = '%b %d, %Y')

#Download the latest version of the traditional and advanced box score data from Kaggle
trad_path = kagglehub.dataset_download("szymonjwiak/nba-traditional")
adv_path = kagglehub.dataset_download("szymonjwiak/nba-advanced-boxscores-1997-2023")

#Load the box score data downloaded from the previous step into variables
trad_bs_data = pd.read_csv(trad_path+"/traditional.csv", parse_dates = ['date'], date_format = '%Y-%m-%d')
adv_bs_data = pd.read_csv(adv_path+"/advanced.csv", parse_dates = ['date'], date_format = '%Y-%m-%d')

print('potw_data: ', '\n', potw_data.head(5), '\n', 'trad_bs_data: ', '\n', trad_bs_data.head(5))

One thing to consider is that the Player of the Week Award may not always be awarded on the same day of the week. We need to check for this to ensure that the way we define our weeks is standardized:

In [None]:
#Observe which days of the week the POTW is awarded on to treat data appropriately downstream. Then add weekday to the box score data
potw_data['weekday'] = potw_data['Date'].dt.day_name()
trad_bs_data['weekday'] = trad_bs_data['date'].dt.day_name()
adv_bs_data['weekday'] = adv_bs_data['date'].dt.day_name()

#Observe the breakdown of weekdays that Player of the Week is awarded on
potw_data['weekday'].value_counts(normalize=True)

Let's see if this changes when we filter our data for the last 11 NBA seasons worth of data:

In [None]:
#Filter the data to keep the last 11 years of data only
potw_data = potw_data[((potw_data['Date'] >= '2014') | (potw_data['Season'] == '2013-2014')) & (potw_data['Season'] != '2024-2025')]
trad_bs_data = trad_bs_data[((trad_bs_data['date'] >= '2014') | (trad_bs_data['season'] == 2014)) & (trad_bs_data['type'] == 'regular')]
adv_bs_data = adv_bs_data[((adv_bs_data['date'] >= '2014') | (adv_bs_data['season'] == 2014)) & (adv_bs_data['type'] == 'regular')]

#Observe the breakdown of weekdays that Player of the Week is awarded on
potw_data['weekday'].value_counts(normalize=True)

Looks like the NBA has been pretty consistent about announcing the award on Mondays with the exception of a few Tuesday announcements over the last 11 years. Let's make sure to account for this in our next steps.

## 3. Using Data Transformations to Define Our Weeks and Defining Our Model

Ideally, our date definitions for the award winners should be consistent. I want to establish 'week starting' and 'week ending' columns to avoid any overlap between our dataframes. Since we have some data that represents Mondays and other data representing Tuesdays, let's standardize both cases so that all of our weeks start on Monday and end on Sunday:

In [None]:
#Calculate the time delta between the date the POTW was announced and the Sunday that ended the award week (award date version)
def week_ending(weekday):
    '''
    This function takes the day of the week that an award was announced on and returns the time delta to subtract from the announcement date to reach the Sunday that ended the award week

    INPUTS
    weekday - The name of the day of the week (e.g. Monday, Tuesday, etc.) of the announcement date

    OUTPUTS
    date_delta - A pd.Timedelta holding the number of days to subtract from the announcement date
    '''

    deltas = {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3, 'Thursday': 4, 'Friday': 5, 'Saturday': 6, 'Sunday': 0}
    date_delta = pd.Timedelta(days=deltas[weekday])
    return date_delta

#Calculate the time delta between the date a game was played and the Sunday that ends that week (game date version)
def week_ending2(weekday):
    '''
    This function takes the day of the week that a game was played on and returns the time delta to add to the game date to reach the Sunday that ends that week

    INPUTS
    weekday - The name of the day of the week (e.g. Monday, Tuesday, etc.) of the game date

    OUTPUTS
    date_delta - A pd.Timedelta holding the number of days to add to the game date
    '''

    deltas = {'Monday': 6, 'Tuesday': 5, 'Wednesday': 4, 'Thursday': 3, 'Friday': 2, 'Saturday': 1, 'Sunday': 0}
    date_delta = pd.Timedelta(days=deltas[weekday])
    return date_delta

#Calculate the week ending date using the custom functions defined earlier (subtract for award dates, add for game dates)
potw_data['week ending'] = potw_data['Date'] - potw_data['weekday'].apply(week_ending)
trad_bs_data['week ending'] = trad_bs_data['date'] + trad_bs_data['weekday'].apply(week_ending2)
adv_bs_data['week ending'] = adv_bs_data['date'] + adv_bs_data['weekday'].apply(week_ending2)


#Calculate the week starting date by subtracting a six day time delta from the week ending date calculated in the previous step
potw_data['week starting'] = potw_data['week ending'] - pd.Timedelta(days=6)
trad_bs_data['week starting'] = trad_bs_data['week ending'] - pd.Timedelta(days=6)
adv_bs_data['week starting'] = adv_bs_data['week ending'] - pd.Timedelta(days=6)
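
As a quick sanity check on the logic above, here is a minimal sketch on made-up dates (not rows from the real datasets). It uses the same two delta mappings, but applies them with `Series.map` and `pd.to_timedelta` instead of `apply`; the names `AWARD_DELTAS`, `GAME_DELTAS`, `awards`, and `games` are placeholders of mine. A Monday announcement and a Tuesday announcement should land on the same Sunday as the games played in the week before them:

```python
import pandas as pd

# Days to subtract from an award date to land on the Sunday that closed the week
AWARD_DELTAS = {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3, 'Thursday': 4,
                'Friday': 5, 'Saturday': 6, 'Sunday': 0}
# Days to add to a game date to land on the Sunday that closes its week
GAME_DELTAS = {'Monday': 6, 'Tuesday': 5, 'Wednesday': 4, 'Thursday': 3,
               'Friday': 2, 'Saturday': 1, 'Sunday': 0}

# 2024-01-15 is a Monday and 2024-01-16 is a Tuesday (made-up announcement dates)
awards = pd.DataFrame({'Date': pd.to_datetime(['2024-01-15', '2024-01-16'])})
# Made-up games on the Monday, Wednesday, and Sunday of the preceding week
games = pd.DataFrame({'date': pd.to_datetime(['2024-01-08', '2024-01-10', '2024-01-14'])})

# Award dates move backwards to the prior Sunday; game dates move forwards to the coming Sunday
awards['week ending'] = awards['Date'] - pd.to_timedelta(
    awards['Date'].dt.day_name().map(AWARD_DELTAS), unit='D')
games['week ending'] = games['date'] + pd.to_timedelta(
    games['date'].dt.day_name().map(GAME_DELTAS), unit='D')

print(awards)
print(games)
```

Every row in both frames should show a week ending of 2024-01-14, which is what lets the award data and the box score data be joined on the same week later on.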

Since we've standardized our dates and have all of the raw data, now would be a good time to consider the variables we want to incorporate into our model. For the sake of simplicity and to minimize the risk of multicollinearity, we will restrict this analysis to the following set of variables:

### Independent Variables
* Game Score **`(gscore)`** - Gives a rough measure of a player's productivity for a single game, created by John Hollinger to be a simpler version of PER, which has historically predicted MVP winners with significant accuracy. A game score of 10 is considered average, while a game score of 40 is considered outstanding. <br>
* Win Percentage **`(win%)`** - The win percentage of the player's team in games that he played in during the week. <br>
* Participation **`(played%)`** - The percentage of a player's team's games that he played in during the week. <br>
* Team Games vs. Max **`(% vs. max)`** - The percentage of games scheduled for a player's team relative to the highest count of games scheduled in a week for a team per conference. For example: If the Lakers, Clippers, Nets and Knicks respectively had 3, 4, 2, and 5 games scheduled in a week, the variable should produce the following results: <br> <br>
Western Conference: LAL - 0.75 (3/4), LAC - 1.00 (4/4)  
Eastern Conference: BKN - 0.40 (2/5), NYK - 1.00 (5/5)
<br>
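
To make the **`% vs. max`** definition concrete, here is a small sketch that reproduces the Lakers / Clippers / Nets / Knicks example on a toy schedule. It uses `groupby(...).transform('max')` as a compact stand-in for this calculation, which is not necessarily how the notebook computes it on the real data; the `schedule` frame and its values are illustrative only:

```python
import pandas as pd

# Toy one-week schedule taken from the example in the text
schedule = pd.DataFrame({
    'team': ['LAL', 'LAC', 'BKN', 'NYK'],
    'conference': ['West', 'West', 'East', 'East'],
    'games': [3, 4, 2, 5],
})

# Highest game count within each conference, broadcast back onto every team row
schedule['max_games'] = schedule.groupby('conference')['games'].transform('max')

# Share of the conference's busiest schedule that each team played
schedule['% vs. max'] = schedule['games'] / schedule['max_games']

print(schedule)
```

The resulting column is 0.75 for LAL, 1.00 for LAC, 0.40 for BKN, and 1.00 for NYK, matching the figures listed above.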
### Dependent Variable
* Player of the Week Winner **`(potw_winner)`** - 1 if the player was the NBA Player of the Week, 0 otherwise.

Because our dependent variable is binary in nature and because we want to bound our result as a value between 0 and 1, we will perform a logistic regression to inform our predictions.

Note that our predicted values fall between 0 and 1 and can be read as probabilities; however, since we will be running the model on entire seasons rather than individual weeks, the player with the highest predicted probability in each week and conference will be deemed the model's predicted winner. In other words, we care about the rank of the probability within the week rather than the probability value itself.
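
Here is a minimal sketch of that "rank over raw probability" idea, using hypothetical players and probabilities (none of these numbers come from the model). It picks one winner per week and conference with `idxmax`, which is just one of several ways to do this:

```python
import pandas as pd

# Hypothetical model output: one row per player-week with a predicted probability
preds = pd.DataFrame({
    'week ending': ['2024-01-14'] * 4,
    'conference': ['East', 'East', 'West', 'West'],
    'player': ['Player A', 'Player B', 'Player C', 'Player D'],
    'probability': [0.08, 0.21, 0.35, 0.12],
})

# Row label of the highest probability within each week-conference group
top_rows = preds.groupby(['week ending', 'conference'])['probability'].idxmax()

# Keep only those rows: exactly one predicted winner per week and conference
predicted_winners = preds.loc[top_rows, ['week ending', 'conference', 'player', 'probability']]

print(predicted_winners)
```

Note that no probability in this toy example clears 0.5, yet each conference still gets a predicted winner (Player B in the East, Player C in the West). A fixed probability cutoff would not work here, because the award is always handed out to someone.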

Our next steps will involve preparing the data in order to perform joins and define our variables in the code as they're described above. Recall that the NBA Player of the Week is awarded to one player for each conference, so we will need multiple levels of granularity between the data that we will be joining.

In [None]:
#Create a flag indicating whether the player played or not for the box score data
trad_bs_data['played'] = [True if x > 0 else False for x in trad_bs_data['MIN']]
adv_bs_data['played'] = [True if x > 0 else False for x in adv_bs_data['MIN']]

#Define player conference across the box score datasets
East = ['IND', 'ORL', 'CHI', 'MIA', 'BKN', 'CLE', 'PHI', 'BOS', 'TOR', 'DET', 'WAS', 'MIL', 'NYK', 'CHA', 'ATL']
trad_bs_data['conference'] = ['East' if x in East else 'West' for x in trad_bs_data['team']]
adv_bs_data['conference'] = ['East' if x in East else 'West' for x in adv_bs_data['team']]

#Create unique ID using 'week ending', 'conference', and 'player'
potw_data['id'] = potw_data['week ending'].astype(str) + ":" + potw_data['Conference'] + ":" + potw_data['Player']
trad_bs_data['id'] = trad_bs_data['week ending'].astype(str) + ":" +  trad_bs_data['conference'] + ":" + trad_bs_data['player']
adv_bs_data['id'] = adv_bs_data['week ending'].astype(str) + ":" + adv_bs_data['conference'] + ":" + adv_bs_data['player']

trad_bs_data['team id'] = trad_bs_data['week ending'].astype(str) + ":" +  trad_bs_data['team']
trad_bs_data['conf id'] = trad_bs_data['week ending'].astype(str) + ":" +  trad_bs_data['conference']

#Create a dataframe to determine the number of games played per team downstream
team_data = trad_bs_data.loc[:,['team id','team','week ending', 'gameid']]

#Create a dataframe to determine the max games played by a team in the week
conf_data = trad_bs_data.loc[:,['conf id', 'week ending', 'conference', 'team', 'gameid']]
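
To illustrate why the composite `id` matters, here is a hedged sketch on toy data showing how a `week ending:conference:player` key lets a left merge flag winners and leave everyone else at 0. The frames, player names, and the `make_id` helper are all hypothetical:

```python
import pandas as pd

# Hypothetical player-week rows (one row per player per week)
players = pd.DataFrame({
    'week ending': pd.to_datetime(['2024-01-14'] * 3),
    'conference': ['East', 'East', 'West'],
    'player': ['Player A', 'Player B', 'Player C'],
})

# Hypothetical award table with a single East winner for that week
winners = pd.DataFrame({
    'week ending': pd.to_datetime(['2024-01-14']),
    'Conference': ['East'],
    'Player': ['Player B'],
})

def make_id(week, conf, name):
    # Concatenate the three fields into one string key, e.g. '2024-01-14:East:Player B'
    return week.astype(str) + ':' + conf + ':' + name

players['id'] = make_id(players['week ending'], players['conference'], players['player'])
winners['id'] = make_id(winners['week ending'], winners['Conference'], winners['Player'])
winners['potw_winner'] = 1

# Left merge keeps every player-week; non-winners come back as NaN and are set to 0
labelled = players.merge(winners[['id', 'potw_winner']], on='id', how='left')
labelled['potw_winner'] = labelled['potw_winner'].fillna(0).astype(int)

print(labelled[['id', 'potw_winner']])
```

One caveat with a string key like this: the join only matches when player names and conference labels are spelled identically in both sources, so any mismatch silently turns a real winner into a 0.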

## 4. Aggregating Our Data and Further Data Prep

We now need to convert our game-level data (our box scores) into weekly aggregate stats in order to properly compare players to one another to predict the award winners. Additionally, we will need to create groupby objects with the number of games played per team and the highest number of games played by an Eastern Conference / Western Conference team per week. From there, we will need to join our dataframes together to have full information for every player each week:


In [None]:
#Perform a groupby on the box score datasets
trad_bs_data = trad_bs_data.groupby(['id', 'week starting', 'week ending', 'team', 'conference', 'team id', 'conf id']).agg({'MIN':'sum', 'PTS':'sum', 'FGM':'sum', 'FGA':'sum','3PM':'sum','3PA':'sum', 'FTM':'sum', 'FTA':'sum','OREB':'sum', 'DREB':'sum','AST':'sum','STL':'sum','BLK':'sum','TOV':'sum','PF':'sum','win':'sum','played':'sum'})
adv_bs_data = adv_bs_data.groupby(['id', 'week starting', 'week ending', 'team', 'conference']).agg({'OFFRTG':'mean', 'DEFRTG':'mean', 'NETRTG':'mean', 'AST%':'mean','OREB%':'mean','DREB%':'mean', 'REB%':'mean', 'TS%':'mean','USG%':'mean', 'PIE':'mean'})
team_data = team_data.groupby(['team id', 'team', 'week ending']).agg({'gameid':'nunique'})
conf_data = conf_data.groupby(['conf id', 'conference', 'week ending', 'team']).agg({'gameid':'nunique'})

conf_data.sort_values(by=['week ending', 'conference', 'gameid'], ascending=[True, True, False], inplace=True)
conf_data = conf_data.groupby(['conf id', 'conference', 'week ending']).first()

#Remove multiindex from the groupbys
trad_bs_data = trad_bs_data.reset_index()
adv_bs_data = adv_bs_data.reset_index()
team_data = team_data.reset_index()
conf_data = conf_data.reset_index()

#Remove columns that are no longer needed from the team and conference dfs, and rename the conference game count to 'max_games'
team_data.drop(['team', 'week ending'], axis=1, inplace=True)
conf_data.drop(['conference','week ending'], axis=1, inplace=True)
conf_data.rename(columns={'gameid':'max_games'}, inplace=True)

#Merge the team df with the traditional box score data
trad_bs_data = trad_bs_data.merge(team_data, how='left', on='team id')
trad_bs_data = trad_bs_data.merge(conf_data, how='left', on='conf id')

#Rename aggregation to games, and calculate win% and percentage of team's games played by the player
trad_bs_data.rename(columns={'gameid':'games'}, inplace=True)
trad_bs_data['win%'] = trad_bs_data['win']/trad_bs_data['played']
trad_bs_data['played%'] = trad_bs_data['played']/trad_bs_data['games']
trad_bs_data['% vs. max'] = trad_bs_data['games']/trad_bs_data['max_games']

#Flag Player of the Week winners to serve as the binary dependent variable
potw_data['potw_winner'] = 1
potw_data_copy = potw_data.copy()
potw_data.drop(['week starting', 'week ending', 'Conference', 'Player'], axis=1, inplace=True)

#Calculate game score from the box score statistics
trad_bs_data['gscore'] = (trad_bs_data['PTS'] + (0.4 * trad_bs_data['FGM']) - (0.7 * trad_bs_data['FGA']) - (0.4*(trad_bs_data['FTA'] - trad_bs_data['FTM'])) + (0.7 * trad_bs_data['OREB']) + (0.3 * trad_bs_data['DREB']) + trad_bs_data['STL'] + (0.7 * trad_bs_data['AST']) + (0.7 * trad_bs_data['BLK']) - (0.4 * (trad_bs_data['PF'])) - trad_bs_data['TOV'])/trad_bs_data['played']

#Set indexes to be the id column in all datasets
potw_data.set_index('id', inplace=True)
trad_bs_data.set_index('id', inplace=True)
adv_bs_data.set_index('id', inplace=True)

#Join first dataset
model_1_data = trad_bs_data.merge(potw_data, how='left', left_index=True, right_index=True)
model_1_data['potw_winner'] = model_1_data['potw_winner'].fillna(0)
model_1_data.drop(['Date', 'Team','Pos','Height','Weight','Age','Draft Yr', 'YOS','weekday','Season'], axis=1, inplace=True)
model_1_data = model_1_data[model_1_data['played%'] != 0]
model_1_data.sort_values(by='gscore', ascending=False, inplace=True)
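
For reference, here is a small standalone sketch of the game score arithmetic used in the cell above, applied to one hypothetical stat line. The `game_score` helper and the weekly totals are made up for illustration:

```python
def game_score(pts, fgm, fga, ftm, fta, oreb, dreb, stl, ast, blk, pf, tov):
    """Hollinger game score for one stat line, using the same terms as the notebook's formula."""
    return (pts + 0.4 * fgm - 0.7 * fga - 0.4 * (fta - ftm)
            + 0.7 * oreb + 0.3 * dreb + stl + 0.7 * ast + 0.7 * blk
            - 0.4 * pf - tov)

# Hypothetical weekly totals for a player who appeared in 3 games
weekly_total = game_score(pts=90, fgm=30, fga=60, ftm=24, fta=30,
                          oreb=3, dreb=21, stl=3, ast=18, blk=3, pf=6, tov=9)
games_played = 3

# Average game score for the week: weekly total divided by games played
avg_gscore = weekly_total / games_played
print(round(avg_gscore, 1))
```

This works out to roughly 24.1 per game. Because the formula is linear in every stat, applying it to weekly totals and dividing by games played gives the same number as averaging the per-game scores, which is why the notebook can aggregate first and compute game score afterwards.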

## 5. Fitting a Logistic Regression Model to our Data

Now for the fun part - we will fit a logistic regression (logit) to 10 seasons of data (2014 - 2023 seasons), then apply this model to each player's weekly statistics in the 2024 season to get the model's Player of the Week predictions:

In [None]:
model_1_X = model_1_data.loc[:,['gscore', 'win%', 'played%', '% vs. max']]

#Test the model on the 2023-24 season data
model_1_X_test = model_1_X[model_1_data['week starting'] >= '2023-10-22']

#Train the model on the pre 2023-24 season data
model_1_X_train = model_1_X[model_1_data['week starting'] < '2023-10-22']

#Isolate the player of the week column to serve as the dependent variable to be tested and trained
model_1_Y = model_1_data.loc[:,'potw_winner']
model_1_Y_test = model_1_Y[model_1_data['week starting'] >= '2023-10-22']
model_1_Y_train = model_1_Y[model_1_data['week starting'] < '2023-10-22']

#Initialize the logistic regression model, fit the model to the data, and generate predictions
model_1_X_train = sm.add_constant(model_1_X_train)
model = sm.Logit(model_1_Y_train, model_1_X_train)
result = model.fit()
model_1_X_test = sm.add_constant(model_1_X_test)
model_1_pred = result.predict(model_1_X_test)

#Print a summary of our regression
print(result.summary())

I'd like to re-emphasize that our model's 'accuracy' is not based on the pseudo R-squared value of the regression, but rather on whether the players with the highest probability from the model were the actual winners. However, the summary table gives us a good idea of which variables most strongly influence which players the model chooses as the most likely winners of the award (bearing in mind that an 'average' game score is 10). Here are a few notable takeaways:

* Given that **`win%`**, **`played%`**, and **`% vs. max`** are all bounded between 0 and 1, they are the easiest to compare at face value. The model results imply that **`played%`** is the most influential characteristic, followed by **`win%`**, and lastly **`% vs. max`**.
* Our constant in the model is negative and of significant magnitude relative to the bounded coefficients. In fact, a player who performs to an average standard (gscore = 10), plays in and wins all games during the week (**`win%`**, **`played%`** = 1.00), and is on a team with the most games in the conference during the week (**`% vs. max`** = 1.00) still ends up with negative log-odds, i.e. a predicted probability below 0.5! This is good news, because we would expect a Player of the Week to have to perform above average in order to win the award, which the model's parameters also imply. A bit of algebra tells us that under these same conditions, a player would have to average a game score of ~18.3 for the log-odds to exceed zero (a predicted probability above 0.5). My guesstimation after glancing at __[the season leaders for average game score in 2023-24](https://www.teamrankings.com/nba/player-stat/game-score?season_id=221)__ is that 18.3 sits around borderline All-Star level output.
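
The algebra behind that break-even game score can be sketched as follows. In a logit model the linear predictor is the log-odds, and log-odds of zero correspond to a probability of exactly 0.5, so the break-even point comes from setting the linear predictor to zero and solving for `gscore`. The coefficient values below are placeholders I made up so the snippet runs on its own; they are NOT the fitted coefficients from the regression summary:

```python
import math

def break_even_gscore(params, win_pct=1.0, played_pct=1.0, pct_vs_max=1.0):
    """Game score at which the linear predictor (log-odds) equals zero."""
    other_terms = (params['const']
                   + params['win%'] * win_pct
                   + params['played%'] * played_pct
                   + params['% vs. max'] * pct_vs_max)
    return -other_terms / params['gscore']

def predicted_probability(params, gscore, win_pct=1.0, played_pct=1.0, pct_vs_max=1.0):
    """Logistic transform of the linear predictor."""
    log_odds = (params['const']
                + params['gscore'] * gscore
                + params['win%'] * win_pct
                + params['played%'] * played_pct
                + params['% vs. max'] * pct_vs_max)
    return 1 / (1 + math.exp(-log_odds))

# Placeholder coefficients for illustration only (not the model's estimates)
placeholder = {'const': -10.0, 'gscore': 0.5, 'win%': 2.0, 'played%': 3.0, '% vs. max': 1.0}

g_star = break_even_gscore(placeholder)
print(g_star, predicted_probability(placeholder, g_star))
```

With these placeholder values the break-even game score is 8.0 and the probability at that point is 0.5. Assuming the fitted parameters are labelled `const`, `gscore`, `win%`, `played%`, and `% vs. max` (which is what `sm.add_constant` plus the column names above should produce), passing `result.params` in place of `placeholder` should recover the notebook's own break-even figure.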

Let's now do a manual calculation of the accuracy from our model to see how well it performed on our test data. 

## 6. Final Data Cleansing and Calculating Model Accuracy

Now that we've produced probability estimates, let's connect each estimate to the respective player's weekly box score statistics. From there, we should be able to sort and aggregate to identify the player with the highest probability in each conference for each week of the 2023-24 season according to the model, and compare those predictions with the actual winners:

In [None]:
#Convert the predictions to be a dataframe and rename the one column to 'probability'
model_1_pred = pd.DataFrame(data=model_1_pred)
model_1_pred.columns = ['probability']
model_1_pred = model_1_pred.reset_index()
predictions_df = model_1_pred.copy()
model_1_pred.drop('id', axis=1, inplace=True)

#Filter the observational data to only include the 2023-24 season and reset the index
model_1_obs = model_1_data[model_1_data['week starting'] >= '2023-10-22']
model_1_obs = model_1_obs.reset_index()

#Combine the observational data with the predictions from the model, and sort by week, conference, and the model's predicted win 'probability'
all_model_1_data = pd.concat([model_1_obs, model_1_pred], axis=1)
all_model_1_data.sort_values(by=['week ending', 'conference', 'probability'], ascending=[True, True, False], inplace=True)

#Group the data by week and conference, returning the highest probability player as the aggregation using .max() and .first()
all_model_1_data_grouped = all_model_1_data.groupby(['week ending', 'conference']).agg({'probability':'max', 'id':'first'})
all_model_1_data_grouped = all_model_1_data_grouped.reset_index()

#Create a new id in order to combine the data with the actual player of the week winners
all_model_1_data_grouped['potw_id'] = all_model_1_data_grouped['week ending'].astype(str)+":"+all_model_1_data_grouped['conference']

#Create the same id using the actual player of the week winners data, sort by date and conference, and drop out of scope information
potw_data_copy['potw_id'] = potw_data_copy['week ending'].astype(str)+":"+potw_data_copy['Conference']
potw_data_copy.sort_values(by=['Date', 'Conference'], ascending=[True, True], inplace=True)
potw_data_copy = potw_data_copy[potw_data_copy['Date']>'2023-10-22']
potw_data_copy.drop(['Pos', 'potw_winner', 'week ending', 'Date', 'Conference','week starting', 'Player', 'Season', 'Team','Height', 'Weight', 'Age', 'Draft Yr', 'YOS','weekday'], axis=1, inplace=True) 

#Rename the id from the aggregation to be 'predicted winner' and change the id in the Player of the Week data to 'actual winner'
all_model_1_data_grouped.rename(columns={'id':'predicted winner'}, inplace=True)
potw_data_copy.rename(columns={'id':'actual winner'}, inplace=True)

#Combine the predictions and actuals into a single dataframe, then drop the join key
final_df = all_model_1_data_grouped.merge(potw_data_copy, how='left', left_on='potw_id', right_on='potw_id')
final_df.drop(columns='potw_id', inplace=True)

#Determine the accuracy of the model for each week-conference combination
final_df['accurate'] = np.where((final_df['predicted winner'] == final_df['actual winner']), 1, 0)

#Remove null values (weeks where there was no player of the week awarded)
final_df.dropna(axis=0, inplace=True)

#Get the proportion of accurate predictions of the model to determine the model's accuracy
accuracy = final_df.value_counts('accurate', normalize=True)

print(final_df.head(5), '\n', accuracy)

Looks like our model accurately predicted 45% of the players of the week with a one-shot guess - not terrible!

## 7. Analyzing the Results (More Data Prep)

From here, we can investigate the cases where the model was inaccurate to see where the gap between the predicted winner's probability and the actual winner's probability was largest. This might inform additional variables to consider for future analyses to produce a more accurate model:

In [None]:
#Perform a join to compare probability between actual winner and model's predicted winner
predictions_df.rename(columns={'probability':'actual probability'}, inplace=True)
final_df = final_df.merge(predictions_df, how='left', left_on='actual winner', right_on='id')
final_df.drop('id', axis=1, inplace=True)

#Calculate the difference in probability between the actual winner and the predicted winner, and sort in descending order
final_df['prob diff'] = final_df['probability'] - final_df['actual probability']
final_df.sort_values(by='prob diff', ascending=False, inplace=True)

#Now let's compare the variables from our regression across these large differences in probability by performing some data stitching
player_data = trad_bs_data[['gscore', 'win%', 'played%', '% vs. max']]
final_df = final_df.merge(player_data, how='left', left_on='predicted winner', right_index=True)
final_df.rename(columns={'gscore': 'pred gscore', 'win%': 'pred win%', 'played%':'pred played%', '% vs. max':'pred % vs. max', 'probability': 'pred probability'}, inplace=True)
final_df = final_df.merge(player_data, how='left', left_on='actual winner', right_index=True)
final_df.rename(columns={'gscore': 'actual gscore', 'win%': 'actual win%', 'played%':'actual played%', '% vs. max':'actual % vs. max'}, inplace=True)
final_df = final_df[['week ending', 'conference', 'predicted winner', 'actual winner', 'pred probability', 'actual probability', 'prob diff', 'pred gscore', 'actual gscore', 'pred win%', 'actual win%', 'pred played%', 'actual played%', 'pred % vs. max', 'actual % vs. max', 'accurate']]

#Show the five biggest values in probability difference and respective player of the week cases between the predicted winner and the actual winner
print(final_df.head(5))

I've linked Reddit threads on these 5 cases - very interesting to read the discourse and compare the sentiment against our model's predictions. I will also attempt to analyze possible explanations for the model's shortcomings in these cases.

__[2023-10-29 Player of the Week: Nikola Jokic (West)](https://www.reddit.com/r/nba/comments/17k1hqj/nba_pr_denver_nuggets_center_nikola_joki%C4%87_and/)__ <br>
Luka Doncic had a fantastic set of games in the week, averaging 41 PTS, 11.5 REB, 8.5 AST, and only 2.0 TOV on 71.3 TS%. However, he only played in two games compared to Jokic's three, which I think hurt his case. The model believes Doncic's exceptional performances should offset the one-game difference in games played between him and Jokic, when clearly the voting committee did not share that belief. Several users in the thread also expected Luka to be the winner.

__[2024-03-03 Player of the Week: Jaylen Brown (East)](https://www.reddit.com/r/nba/comments/1b6kjn0/nba_pr_los_angeles_lakers_forward_lebron_james/)__ <br>
Giannis Antetokounmpo played the same number of games as Jaylen Brown in the week (3 GP) and had strong averages of 31.3 PTS, 11.3 REB, 4.0 AST and only 1.0 TOV on 76.0 TS%. However, his opponents were the Hornets (2x) and the Bulls; the former had the worst Net Rating in the league through that point in the season (according to __[Cleaning The Glass](https://cleaningtheglass.com/stats/league/summary?season=2023&seasontype=regseason&start=09/29/2023&end=03/3/2024)__), while the Bulls were also in the bottom 10. Jaylen Brown's opponents were the 76ers, Mavericks, and Warriors, whose net ratings were 11th, 14th, and 15th, respectively. Additionally, Giannis had already won Eastern Conference Player of the Week twice to this point in the season, while this was Jaylen Brown's first and only time winning the award in the 2023-24 season.

__[2024-02-04 Player of the Week: Trae Young (East)](https://www.reddit.com/r/nba/comments/1ajqp4u/nba_pr_la_clippers_forward_kawhi_leonard_and/)__ <br>
I think this one is splitting hairs, as observed in comparing game scores between Donovan Mitchell and Trae Young. Mitchell had a compelling argument to win, having played more games (4 GP) than Trae (3 GP) while also not losing and averaging 32.3 PTS, 5.5 REB, 8.5 AST, 1.5 STL, 1.0 BLK and 2.5 TOV on 67.2 TS%. However, I think this is actually the opposite of case #1, where the model may be penalizing Trae too much for playing one fewer game. I would venture to guess that Trae Young had the second best odds of winning Eastern Conference Player of the Week in this week, despite the large delta in odds output by the model.

__[2024-01-14 Players of the Week: Lauri Markkanen (West) and Bam Adebayo (East)](https://www.reddit.com/r/nba/comments/197j0n7/nba_pr_utah_jazz_forward_lauri_markkanen_and/)__ <br>
In the case of Lauri Markkanen and SGA, I believe this may be another example similar to case #2. Shai Gilgeous-Alexander had won the Western Conference Player of the Week just two weeks prior, and was part of a 62-point victory against the Portland Trail Blazers (the 5th largest victory margin in NBA history through 2024), and a win against the Washington Wizards. The Blazers and Wizards were the 4th and 5th worst teams by net rating, respectively, according to __[Cleaning The Glass](https://cleaningtheglass.com/stats/league/summary?season=2023&seasontype=regseason&start=09/29/2023&end=01/14/2024)__.

The Bam Adebayo vs. Donovan Mitchell argument is a simple one - Donovan Mitchell only played in one game during the week (compared to Bam playing in 4) and is not being penalized enough by the model. Bam Adebayo also impacts games defensively at an elite level, and defensive impact is not measured well by box score statistics alone, which the model is unable to capture.