# Fantasy basketball, building a roster around relative stats

- toc: false
- branch: master
- badges: true
- comments: false
- categories: [personal, data science, basketball]

## Overview

Fantasy basketball comes in two formats: points leagues and category leagues.
In points leagues, a player's counting stats combine (via the league's scoring settings) into a single fantasy points value that contributes to your fantasy team's success.
In category leagues, a player's counting stats are considered separately: you count how many rebounds your team obtained, how many points, how many assists, etc.

Both formats generally impose some roster constraints.
For example, you can play 1 PG, 1 SG, 1 SF, 1 PF, 1 C, 1 G, 1 F, and 2-3 UTIL players each day.
This introduces only mild positional complexity to fantasy basketball.
I say mild because, on any given day, only a few of your players will be playing, so you are likely not constrained by available positions you can play.
This is unlike fantasy football, where all your players play once a week, so positional constraints are strong.
At the very least, however, there are some positional curveballs in fantasy basketball leagues.
One such curveball is that some leagues only allow up to 4 Cs per team.

In points leagues, evaluating players is somewhat 1-dimensional: who is projected to score more fantasy points as a function of their projected box score stats and league scoring settings?

In category leagues, scoring is 9-dimensional (for 9 categories): you can try to optimize on all 9 dimensions, or you can admit defeat and "punt" a couple of categories so that you can more easily succeed in the remaining ones.

This post will focus on category leagues, aiming to discuss a quantitative approach to punting and leveraging positional superiority.


## Data exercise

NBA player projections are pulled from [hashtagbasketball](https://hashtagbasketball.com/fantasy-basketball-projections).
In fact, for this exercise, you could pull NBA stats from anywhere you want and anyone you trust -- we just need to have some box score stats per NBA player.
These hashtagbasketball projections also include average draft position (ADP) from some common fantasy basketball platforms.

Data cleaning aside, we can see the top 10 players are led by two centers, followed by a couple of wings and a lot of guards.
We can also see a bunch of numbers for each box score stat, but they are hard to interpret and compare because each stat lives on a different scale:

- Is 6 AST a lot or a little?
- 10 REB seems like a lot for a center, but to what extent?
- How do we juxtapose that against 9 AST from a guard?

In [1]:
from IPython.display import display
import pandas as pd
from scipy.stats import zscore

STAT_COLS = ["FG%", "FT%", "3pm", "PTS", "TREB", "AST", "STL", "BLK", "TO"]
POSITIONS = ["PG", "SG", "SF", "PF", "C"]

def format_percentages(val):
    """ Convert a percentage cell to float, dropping any parenthesized suffix """
    if "(" in val:
        # Keep only the text before the first "("
        return float(val[0:val.index('(')])
    else:
        return float(val)
    
def encode_positions(val):
    """ Split comma-joined list of positions into separate columns """
    positions = {
        pos: False
        for pos in POSITIONS
    }
    for code in val.split(","):
        positions[code] = True
    return pd.Series(positions)

def clean_df(df):
    """ Clean and format data """
    df["FG%"] = df["FG%"].apply(format_percentages)
    df["FT%"] = df["FT%"].apply(format_percentages)   
    positions = df["POS"].apply(encode_positions)
    df = df.merge(positions, left_index=True, right_index=True)
    return df
    
df = (
    pd.read_csv("files/2023hashtagbasketballprojections.csv", index_col=2)
    .sort_values("ADP")
    .pipe(clean_df)
)
df.drop(columns=["PG", "SG", "SF", "PF", "C"]).head(10)

PLAYER,R#,ADP,POS,TEAM,GP,MPG,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,TOTAL
Nikola Jokic,1,1.1,C,DEN,72,33.6,0.624,0.821,1.0,24.5,11.7,9.7,1.4,0.7,3.7,14.56
Joel Embiid,2,2.6,C,PHI,67,34.2,0.537,0.836,1.2,32.6,10.2,4.2,1.1,1.6,3.3,13.62
Luka Doncic,6,3.2,PG,DAL,66,35.8,0.492,0.761,3.0,32.7,8.8,8.1,1.3,0.5,3.6,9.78
Jayson Tatum,5,4.5,"SF,PF",BOS,75,36.5,0.466,0.858,3.3,29.5,8.2,4.8,1.1,0.7,3.0,9.82
Tyrese Haliburton,4,5.7,"PG,SG",IND,73,34.7,0.488,0.857,2.8,23.3,4.0,10.5,1.7,0.4,2.6,9.92
Stephen Curry,7,6.6,PG,GS,65,34.6,0.476,0.914,4.7,28.0,5.6,5.6,0.9,0.3,3.3,9.69
Shai Gilgeous-Alexander,3,6.8,"PG,SG",OKC,67,35.5,0.486,0.884,1.0,29.6,5.0,5.6,1.5,0.9,2.9,11.03
Damian Lillard,8,9.6,PG,POR,65,36.3,0.456,0.906,3.8,29.2,4.0,7.1,0.8,0.3,3.1,9.54
Giannis Antetokounmpo,37,10.1,"PF,C",MIL,65,32.0,0.553,0.683,0.9,30.3,11.6,5.7,0.9,1.0,3.6,5.08
Kevin Durant,10,10.8,"SF,PF",PHO,60,34.2,0.522,0.9,1.9,25.3,6.5,5.0,0.7,1.0,3.1,9.46


## Z-scores to normalize and compare stats

The [Z-score](https://en.wikipedia.org/wiki/Standard_score) is a statistical technique to scale data into more "common sense" ranges.
Specifically, when you standardize a set of data, the average shifts to 0 and the standard deviation becomes 1, so a z-score of +1 means "one standard deviation above average".
This is a useful tool to better understand how much better certain players are relative to other players.
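
To make the calculation concrete, here is a minimal sketch of the z-score done by hand on the projected AST values of the ten players in the table above. The helper name `zscores` is my own. Because the pool here is only those ten players rather than the full projection file, the resulting numbers are not the ones reported later in this post. The population standard deviation is used, which is also the default behaviour (`ddof=0`) of `scipy.stats.zscore`.

```python
from statistics import mean, pstdev


def zscores(values):
    """Return (x - mean) / population standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population std, i.e. ddof=0
    return [(x - mu) / sigma for x in values]


# Projected AST for the ten players shown in the first table
ast = {
    "Nikola Jokic": 9.7,
    "Joel Embiid": 4.2,
    "Luka Doncic": 8.1,
    "Jayson Tatum": 4.8,
    "Tyrese Haliburton": 10.5,
    "Stephen Curry": 5.6,
    "Shai Gilgeous-Alexander": 5.6,
    "Damian Lillard": 7.1,
    "Giannis Antetokounmpo": 5.7,
    "Kevin Durant": 5.0,
}

ast_z = dict(zip(ast, zscores(list(ast.values()))))

# Print players from the highest to the lowest AST z-score within this small pool
for player, z in sorted(ast_z.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{player:<26}{z:+.2f}")
```

After standardizing, the values average to 0 and have a standard deviation of 1 regardless of the original units, which is what lets AST and REB be compared on the same footing.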

Note: for the picky ones, applying the z-score does not require that your data follow a normal distribution.
When you standardize data that *do* follow a normal distribution, you get the added benefit of the [68-95-99.7 percentile heuristics](https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule).
NBA counting stats don't strongly follow a normal distribution, so we can't say that a player with an AST z-score of 2 sits at the percentile a normal distribution would imply (roughly the 97.7th).
If we did a deeper investigation into these distributions, we could identify these cumulative distribution properties, but that's for another time.

Continuing, we can take the example of Nikola Jokic.
He has a REB z-score of 2.5, an AST z-score of 2.99, and a PTS z-score of 1.4.
This corroborates the general consensus: not an outstanding scorer (though he still gets points), but a great passer and rebounder.
On the other hand, Joel Embiid has a PTS z-score of 2.8 (we intuitively know Embiid is a stronger scorer than Jokic).

Looking at the top fantasy players, it becomes easy to pick out which players are crazy good in certain categories (roughly, those with z-scores > 2):

- AST: Jokic, Doncic, Haliburton, Ball, Harden, Young
- REB: Jokic, Embiid, Giannis, Davis
- 3PM: Tatum, Curry, Lillard, Ball, Mitchell

By doing this exercise, we can identify which players may be associated with certain punts and focuses for roster composition.

In [3]:
(
    df[STAT_COLS]
    .apply(zscore)
    .assign(
        ADP=df["ADP"],
        total_z=lambda df_: df_[STAT_COLS].sum(axis=1)
    )
    .head(50)
    [["ADP", *STAT_COLS, "total_z"]]
)

PLAYER,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Nikola Jokic,1.1,1.88013,0.457226,-0.563614,1.427499,2.534388,2.994016,1.47034,0.113336,2.320597,12.633918
Joel Embiid,2.6,0.641376,0.632632,-0.360875,2.755406,1.901424,0.336963,0.512464,1.856959,1.821946,10.098294
Luka Doncic,3.2,0.000641,-0.244399,1.463774,2.7718,1.310658,2.221055,1.151048,-0.274136,2.195934,10.596374
Jayson Tatum,4.5,-0.369562,0.889895,1.767883,2.247195,1.057472,0.626823,0.512464,0.113336,1.447958,8.293463
Tyrese Haliburton,5.7,-0.056313,0.878201,1.261036,1.230773,-0.714827,3.380496,2.428216,-0.467872,0.949307,8.889015
Stephen Curry,6.6,-0.227176,1.544745,3.187055,2.001286,-0.039666,1.013303,-0.12612,-0.661608,1.821946,8.513764
Shai Gilgeous-Alexander,6.8,-0.084791,1.193932,-0.563614,2.263589,-0.292851,1.013303,1.789632,0.500807,1.323295,7.143303
Damian Lillard,9.6,-0.511947,1.451195,2.27473,2.198013,-0.714827,1.737954,-0.445412,-0.661608,1.57262,6.900717
Giannis Antetokounmpo,10.1,0.869193,-1.156512,-0.664983,2.378346,2.492191,1.061614,-0.12612,0.694543,2.195934,7.744204
Kevin Durant,10.8,0.427797,1.381032,0.348711,1.558651,0.340113,0.723443,-0.764705,0.694543,1.57262,6.282206


## Choosing the set of data to compute z-scores

One key characteristic of the z-score is that it is always relative to the input data you provide.
It becomes extremely versatile if you carefully choose what your input data are.

In the above example, we looked at the pool of all available players.
This is certainly useful to understand the "landscape" of counting stats, but as top tier players get drafted, the remaining players all begin to appear muted and unimpressive with z-scores all close to 0.

We can remove players from our pool as they get drafted and recompute z-scores to continually adjust and re-scale our data.
For example, if we drop Jokic and Embiid from the dataset, the overall average REB goes down but is still standardized to 0 (data below).
Anyone who stands out relative to this new average REB will show up with z-scores above 1 or 2.

Admittedly, this re-scaling does nothing more than shift and stretch the numbers; it will not change the qualitative trend of who produces more AST.
In reality, you could do this z-score calculation over all players once and work off that.
Re-scaling simply helps make numbers pop out more easily.
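
As a quick check of that claim, here is a small sketch using only the projected TREB values of the ten players from the first table (so the z-scores are illustrative and will not match the full-pool tables in this post). The helper `zscore_series` is my own name for a plain pandas version of the z-score. It standardizes the pool twice, once with everyone and once without Jokic and Embiid, and confirms that the remaining players keep the same ordering within the category.

```python
import pandas as pd


def zscore_series(s: pd.Series) -> pd.Series:
    """Standardize a Series with the population standard deviation (ddof=0)."""
    return (s - s.mean()) / s.std(ddof=0)


# Projected TREB for the ten players shown in the first table
treb = pd.Series(
    {
        "Nikola Jokic": 11.7,
        "Joel Embiid": 10.2,
        "Luka Doncic": 8.8,
        "Jayson Tatum": 8.2,
        "Tyrese Haliburton": 4.0,
        "Stephen Curry": 5.6,
        "Shai Gilgeous-Alexander": 5.0,
        "Damian Lillard": 4.0,
        "Giannis Antetokounmpo": 11.6,
        "Kevin Durant": 6.5,
    },
    name="TREB",
)

full_pool = zscore_series(treb)
reduced_pool = zscore_series(treb.drop(index=["Nikola Jokic", "Joel Embiid"]))

# Compare the two sets of z-scores for the players still in the pool
comparison = pd.DataFrame(
    {
        "full_pool_z": full_pool.loc[reduced_pool.index],
        "reduced_pool_z": reduced_pool,
    }
)
comparison["same_rank"] = (
    comparison["full_pool_z"].rank(ascending=False)
    == comparison["reduced_pool_z"].rank(ascending=False)
)
print(comparison.round(2))
```

The individual numbers move, but the within-category ranking does not, because standardizing is only a shift and a rescale of the same raw values.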

In [6]:
(
    df[STAT_COLS]
    .drop(index=["Nikola Jokic", "Joel Embiid"])
    .apply(zscore)
    .assign(
        ADP=df["ADP"],
        total_z=lambda df_: df_[STAT_COLS].sum(axis=1)
    )
    .head(50)
    [["ADP", *STAT_COLS, "total_z"]]
)

PLAYER,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Luka Doncic,3.2,0.013443,-0.238064,1.453436,2.849007,1.36133,2.279315,1.162367,-0.265179,2.255881,10.871537
Jayson Tatum,4.5,-0.358623,0.892284,1.756362,2.313868,1.102775,0.655564,0.523065,0.12375,1.494738,8.503784
Tyrese Haliburton,5.7,-0.043798,0.880631,1.251485,1.277036,-0.707108,3.460225,2.440971,-0.459643,0.987308,9.087108
Stephen Curry,6.6,-0.215521,1.544857,3.170019,2.063022,-0.017629,1.049201,-0.116237,-0.654108,1.875309,8.698914
Shai Gilgeous-Alexander,6.8,-0.072418,1.195264,-0.566075,2.330591,-0.276184,1.049201,1.801669,0.512679,1.36788,7.342608
Damian Lillard,9.6,-0.501725,1.451632,2.26124,2.263699,-0.707108,1.787269,-0.435888,-0.654108,1.621595,7.086606
Giannis Antetokounmpo,10.1,0.886367,-1.147004,-0.66705,2.447653,2.567919,1.098405,-0.116237,0.707144,2.255881,8.033077
Kevin Durant,10.8,0.44275,1.381714,0.342705,1.611498,0.370203,0.753973,-0.755539,0.707144,1.621595,6.476043
Anthony Davis,11.6,0.843436,-0.26137,-1.272904,1.527883,2.481734,-0.328528,0.523065,2.846253,0.353022,6.712592
LaMelo Ball,12.6,-0.873791,0.915591,2.160264,1.293759,0.585666,2.525338,2.12132,-0.459643,2.255881,10.524385


## Positional z-score comparisons

The other consideration in fantasy drafting is position.
If you decide to build a team that focuses on REB and BLK and are willing to give up something like 3PM, you'd probably want to draft a bunch of Cs.
However, due to positional constraints, you will inevitably have to draft some PGs/SGs.

We can, again, apply z-scores to compare players.
In this situation, however, we restrict our input data to players at a single position, such as PG.
This way, we are asking ourselves, "among PGs, who rebounds really well?"
Below, we have 5 tables of z-scores, one for each position.

Again, this ultimately doesn't change qualitative trends (we know Luka gets more rebounds than Steph), but in the world of PGs, we can see this gap is huge.

When searching for a good rebounding PG, you might observe that someone like Cade fits the role well, but he's somewhat far down in the draft (mid ADP).
Unless you are extremely confident that someone like Cade will outperform his projections or fit your team composition extremely well, it's not a good idea to "reach" and draft him super early as your rebounding PG; there are likely some better options out there that could still fit your build.
With an ADP of 43.7, it may be reasonable to pick him 5-10 spots early based on role fit.
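
One possible way to express "z-scores within a position" in a single pass is to explode the comma-joined `POS` column and standardize within each position group. This is a sketch of an alternative to the boolean position columns used in this post, not a replacement for them. It uses only the ten players and projected TREB values from the first table, so the numbers will differ from the per-position tables below, and the column name `TREB_z_in_pos` is my own.

```python
import pandas as pd

# POS and projected TREB for the ten players shown in the first table
players = pd.DataFrame(
    {
        "PLAYER": [
            "Nikola Jokic", "Joel Embiid", "Luka Doncic", "Jayson Tatum",
            "Tyrese Haliburton", "Stephen Curry", "Shai Gilgeous-Alexander",
            "Damian Lillard", "Giannis Antetokounmpo", "Kevin Durant",
        ],
        "POS": ["C", "C", "PG", "SF,PF", "PG,SG", "PG", "PG,SG", "PG", "PF,C", "SF,PF"],
        "TREB": [11.7, 10.2, 8.8, 8.2, 4.0, 5.6, 5.0, 4.0, 11.6, 6.5],
    }
)

# One row per (player, eligible position)
long = (
    players.assign(POS=players["POS"].str.split(","))
    .explode("POS")
    .reset_index(drop=True)
)

# Standardize TREB within each position group (population std, ddof=0)
long["TREB_z_in_pos"] = long.groupby("POS")["TREB"].transform(
    lambda s: (s - s.mean()) / s.std(ddof=0)
)

# Who rebounds well *for a PG* in this small pool?
pg = long[long["POS"] == "PG"].sort_values("TREB_z_in_pos", ascending=False)
print(pg[["PLAYER", "TREB", "TREB_z_in_pos"]].round(2))
```

A multi-position player appears once per eligible position, so the same raw TREB can look very different depending on which group it is compared against.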

In [5]:
positions_zscores = {}
for pos in POSITIONS:
    positions_zscores[pos] = (
        df.loc[df[pos], STAT_COLS]
        .apply(zscore)
        .assign(
            ADP=df.loc[df[pos], "ADP"],
            total_z=lambda df_: df_[STAT_COLS].sum(axis=1)
        )
    )


for pos, subdf in positions_zscores.items():
    subdf.index.rename(pos, inplace=True)
    display(subdf.head(20)[["ADP", *STAT_COLS, "total_z"]])
    print("--")

PG,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Luka Doncic,3.2,1.035558,-0.746131,1.07894,2.376326,3.695595,1.433339,0.650081,0.763193,1.677393,11.964294
Tyrese Haliburton,5.7,0.926413,0.529644,0.86624,0.947392,-0.094236,2.655579,2.016353,0.254398,0.483382,8.585163
Stephen Curry,6.6,0.598978,1.287135,2.886894,1.661859,1.169041,0.160173,-0.716191,-0.254398,1.31919,8.112681
Shai Gilgeous-Alexander,6.8,0.87184,0.888456,-1.048064,1.905082,0.695312,0.160173,1.333217,2.798374,0.841585,8.445974
Damian Lillard,9.6,0.053252,1.180821,1.929742,1.844276,-0.094236,0.924072,-1.057759,-0.254398,1.080388,5.606158
LaMelo Ball,12.6,-0.656191,0.569512,1.823392,0.962593,2.274408,1.687972,1.674785,0.254398,1.677393,10.268263
Kyrie Irving,15.2,0.87184,1.207399,1.291641,1.46424,0.537402,0.109246,0.308513,1.780784,0.005777,7.576843
Donovan Mitchell,19.7,0.735409,0.569512,1.823392,1.631456,0.221583,-0.450947,1.333217,-0.254398,0.602783,6.212007
Fred VanVleet,21.0,-1.474779,0.941613,1.185291,0.202522,-0.173191,0.567586,1.674785,0.763193,-0.113624,3.573396
James Harden,22.4,-0.628904,0.755562,0.440839,0.536954,1.958589,2.706505,0.308513,0.763193,1.796794,8.638045


--


SG,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Tyrese Haliburton,5.7,0.758798,0.603462,0.944849,1.158943,-0.223524,3.498986,1.990494,0.075512,0.92597,9.733489
Shai Gilgeous-Alexander,6.8,0.702515,1.006429,-1.233553,2.24479,0.597382,0.899831,1.381982,2.68697,1.323498,9.609842
LaMelo Ball,12.6,-0.873397,0.648236,2.034051,1.176178,2.239193,2.49115,1.686238,0.075512,2.251064,11.728224
Anthony Edwards,14.6,0.083407,-0.709912,1.307917,1.693248,1.254106,0.528523,1.990494,2.164678,1.853536,10.165996
Kyrie Irving,15.2,0.702515,1.364621,1.428939,1.744955,0.433201,0.846787,0.469214,1.642387,0.395932,9.028551
Devin Booker,17.0,0.730657,0.66316,0.339738,1.831134,0.515291,1.271139,0.164958,0.075512,0.92597,6.517558
Mikal Bridges,19.1,0.392961,0.94673,0.46076,1.520892,0.26902,-0.267137,0.164958,1.120095,-0.134106,4.474172
Donovan Mitchell,19.7,0.561809,0.648236,2.034051,1.934548,0.104839,0.263303,1.381982,-0.44678,1.058479,7.540465
Desmond Bane,20.4,0.252255,1.095977,1.670984,1.038293,0.925744,0.051127,0.469214,0.075512,0.395932,5.975037
James Harden,22.4,-0.845256,0.857182,0.46076,0.69358,1.910831,3.55203,0.469214,0.597804,2.383573,10.079717


--


SF,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Jayson Tatum,4.5,-0.037308,0.886156,1.781118,2.374858,2.338643,1.346554,0.408842,1.09873,1.858725,12.056318
Kevin Durant,10.8,1.667714,1.566866,-0.051627,1.623311,1.116148,1.493413,-0.895778,2.467324,1.999567,10.986939
Anthony Edwards,14.6,-0.098202,-0.556301,1.519297,1.820145,0.612767,1.419984,2.365773,1.554928,2.281252,10.919643
Devin Booker,17.0,0.602075,0.934778,0.472015,1.963297,-0.034436,2.447998,0.408842,-0.269864,1.295354,7.820059
Mikal Bridges,19.1,0.236713,1.242718,0.602925,1.641205,-0.250171,0.31854,0.408842,0.642532,0.168614,5.011919
Desmond Bane,20.4,0.084479,1.404792,1.912028,1.140174,0.325121,0.759117,0.734997,-0.269864,0.731984,6.82283
Lauri Markkanen,25.4,0.662969,1.080644,1.257477,1.372795,2.19482,-0.782904,-0.895778,0.642532,0.309457,5.842012
LeBron James,25.5,1.028331,-0.68596,0.472015,1.873827,1.907174,2.227709,-0.569623,0.186334,1.999567,8.439373
Kawhi Leonard,26.6,1.119671,0.967193,-0.051627,1.247538,0.900413,0.538828,1.061152,0.186334,0.027772,5.997275
Jimmy Butler,26.8,1.332799,0.902363,-1.884371,0.907552,0.540856,1.493413,2.039618,-0.269864,0.168614,5.23098


--


PF,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Jayson Tatum,4.5,-0.544238,1.19938,2.37276,2.387777,0.864325,1.152265,0.89705,-0.18154,1.551945,9.699723
Giannis Antetokounmpo,10.1,0.962311,-1.021022,-0.515817,2.521639,2.609102,1.682526,0.175286,0.414138,2.285973,9.114136
Kevin Durant,10.8,0.425495,1.732276,0.687756,1.684997,-0.008064,1.270101,-0.546479,0.414138,1.674283,7.334503
Anthony Davis,11.6,0.910361,-0.056734,-1.237962,1.601333,2.506468,-0.026092,0.89705,2.598291,0.450903,7.64362
Jaren Jackson Jr.,15.0,-0.319121,0.400035,0.447042,0.731225,0.299838,-0.968778,0.536168,4.385326,0.328565,5.840299
Domantas Sabonis,21.4,1.914727,-0.259742,-0.997247,0.714492,3.019638,2.625212,0.175286,-0.578659,1.551945,8.165653
Lauri Markkanen,25.4,-0.145955,1.351636,1.89133,1.450737,0.761691,-0.556353,-0.546479,-0.380099,0.206227,4.032735
LeBron James,25.5,0.061845,-0.031358,1.169186,1.919257,0.556423,1.859279,-0.185597,-0.578659,1.674283,6.44466
Jimmy Butler,26.8,0.235012,1.212068,-0.997247,1.015684,-0.4186,1.270101,2.701462,-0.777218,0.083889,4.325149
Karl-Anthony Towns,27.8,0.113795,0.882179,0.928471,1.082615,1.018276,1.034429,-0.546479,-0.18154,1.307269,5.639015


--


C,ADP,FG%,FT%,3pm,PTS,TREB,AST,STL,BLK,TO,total_z
Nikola Jokic,1.1,0.739002,1.212933,0.549929,1.754116,1.554789,4.05507,2.766862,-0.780403,2.522974,14.375272
Joel Embiid,2.6,-0.365621,1.384186,0.859264,3.119817,0.854356,1.002204,1.497084,0.634805,2.019312,11.005406
Giannis Antetokounmpo,10.1,-0.162472,-0.36259,0.395261,2.732026,1.508093,1.834804,0.650565,-0.308667,2.397059,8.684078
Anthony Davis,11.6,-0.200563,0.505089,-0.532744,1.804698,1.414702,0.22511,1.497084,1.421032,0.508326,6.642736
Jaren Jackson Jr.,15.0,-1.102036,0.916095,1.632601,0.927952,-0.593206,-0.662996,1.073825,2.836241,0.38241,5.410885
Domantas Sabonis,21.4,0.535853,0.32242,-0.223409,0.911091,1.881657,2.72291,0.650565,-1.094894,1.641565,7.34776
Karl-Anthony Towns,27.8,-0.784616,1.349935,2.251271,1.282022,0.060531,1.22423,-0.195953,-0.780403,1.389734,5.796752
Victor Wembanyama,30.1,-1.419457,0.733426,0.549929,0.54016,-0.079555,-0.329956,-0.195953,1.735523,0.130579,1.664696
Pascal Siakam,30.6,-1.114733,0.550757,0.859264,1.349464,-0.266338,1.612777,1.073825,-1.094894,0.886072,3.856194
Myles Turner,34.4,-0.403712,0.676342,1.323266,0.337834,-0.45312,-0.60749,-0.619213,2.050014,-0.247168,2.056755


--


## Holistically comparing players and re-ranking players according to punts

In the data tables presented so far, I've included a `total_z` column.
Roughly speaking, if you wanted to find the most-general "best" player, you'd want the player with the highest z-scores across the board.
This can be evaluated simply by summing a player's z-scores.
Revisiting the data, we can see that a higher total z-score does not directly translate to an earlier ADP.
ADP will generally factor in things like player availability (injuries) and season outlook, for which our z-score has not accounted.
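
One caveat worth checking against your own league settings: `total_z` as computed in this post sums all nine columns as-is, so a high projected turnover count *raises* a player's total. If your league treats fewer turnovers as better (which is common in category leagues), you may prefer to flip the sign of the TO z-score before summing. The sketch below does that for three rows copied from the all-player z-score table above; `LOWER_IS_BETTER` and the output column names are my own.

```python
import pandas as pd

STAT_COLS = ["FG%", "FT%", "3pm", "PTS", "TREB", "AST", "STL", "BLK", "TO"]
LOWER_IS_BETTER = ["TO"]  # assumption: your league rewards fewer turnovers

# z-scores copied from the all-player table earlier in this post
z = pd.DataFrame(
    [
        [1.88013, 0.457226, -0.563614, 1.427499, 2.534388, 2.994016, 1.47034, 0.113336, 2.320597],
        [0.641376, 0.632632, -0.360875, 2.755406, 1.901424, 0.336963, 0.512464, 1.856959, 1.821946],
        [0.000641, -0.244399, 1.463774, 2.7718, 1.310658, 2.221055, 1.151048, -0.274136, 2.195934],
    ],
    index=["Nikola Jokic", "Joel Embiid", "Luka Doncic"],
    columns=STAT_COLS,
)

# Flip the sign so that "above average" always means "helps your team"
signed = z.copy()
signed[LOWER_IS_BETTER] = -signed[LOWER_IS_BETTER]

totals = pd.DataFrame(
    {
        "total_z_as_published": z.sum(axis=1),
        "total_z_to_flipped": signed.sum(axis=1),
    }
)
print(totals.round(3))
```

High-usage players tend to carry large TO z-scores, so this choice can shift their totals noticeably and occasionally reorder players with similar totals.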

As a team is constructed through the draft, you may begin to identify (perhaps by using z-scores) certain categories you want to punt.
If you want to commit to the punt, you can re-evaluate the remaining players solely based on the categories on which you're focusing.
For example, the table below examines a scenario where we focus on PTS, REB, AST, STL, and FT%.
In this case, we pretend the other 4 categories don't exist and only examine these 5: compute the z-scores and sum across these 5 categories.
By neglecting certain categories, the new "best" players can shift around, and this may help prioritize players for the punt.

As a counterpoint, some leagues might be more amenable to drafting the best player available.
In this situation, you do not draft based on your team build, but instead try to draft the best players to maximize the amount of "draft capital" you have throughout the season.
Drafting the best player available means the player will likely have good value to both you *and* opposing league managers.
The team you draft might not have any notably strong or weak categories, but you have a lot of valuable trade pieces with which you can then re-construct your team (if your league trades a lot).
If you draft based on your punt, your valuation of a player *will noticeably differ* from another manager's valuation of that player.
For example, our PTS, REB, AST, STL, FT% focus says Trae Young is the 8th best player in the league even though his ADP is 23.5.
Spending your first round pick on Trae Young means you likely miss out on some players that would be highly valued by your competition (like missing out on Embiid or Giannis).
In my experience, the fact that there's no objective valuation for a player is what makes fantasy basketball trades so interesting and complex.

In [9]:
FOCUS_COLS = ["PTS", "TREB", "AST", "STL", "FT%"]
(
    df[FOCUS_COLS]
    .apply(zscore)
    .assign(
        ADP=df["ADP"],
        total_z=lambda df_: df_[FOCUS_COLS].sum(axis=1),
        new_rank=lambda df_: df_["total_z"].rank(ascending=False)
    )
    .sort_values("total_z", ascending=False)
    .head(50)
    [["ADP", "new_rank", *FOCUS_COLS, "total_z"]]
)

PLAYER,ADP,new_rank,PTS,TREB,AST,STL,FT%,total_z
Nikola Jokic,1.1,1.0,1.427499,2.534388,2.994016,1.47034,0.457226,8.88347
LaMelo Ball,12.6,2.0,1.247166,0.551101,2.462605,2.108924,0.913282,7.283079
Luka Doncic,3.2,3.0,2.7718,1.310658,2.221055,1.151048,-0.244399,7.210161
Tyrese Haliburton,5.7,4.0,1.230773,-0.714827,3.380496,2.428216,0.878201,7.202859
James Harden,22.4,5.0,0.788137,0.38231,3.428806,0.831756,1.076995,6.508005
Joel Embiid,2.6,6.0,2.755406,1.901424,0.336963,0.512464,0.632632,6.138889
Shai Gilgeous-Alexander,6.8,7.0,2.263589,-0.292851,1.013303,1.789632,1.193932,5.967605
Trae Young,23.5,8.0,1.853741,-0.968013,3.138946,0.193172,1.322564,5.540409
Jayson Tatum,4.5,9.0,2.247195,1.057472,0.626823,0.512464,0.889895,5.333848
Dejounte Murray,33.1,10.0,0.804531,0.340113,1.399784,2.108924,0.46892,5.122272


## A framework or starting point for evaluating trades

Even though I just said there's no objective valuation for fantasy basketball players, we can still try to establish some principles when it comes to conducting fair trades.

When a trade happens, people quickly react with "X got fleeced".
However, if it was so easy to identify that X got fleeced, why would X have made the trade in the first place?
It ultimately comes down to the value each player has to their new team.

For example, having a notoriously inefficient shooter like Fred VanVleet is bad if you need to keep a high FG%, but great if you're looking for AST and STL.
If you're looking to add FVV to your AST/STL focus, his "value" comes from how many AST/STL z-scores he could add to your team.
If you don't care about FG%, you don't need to look into his abysmal FG% z-score.

Conversely, if you're trying to build a reasonable trade offer for FVV, it helps to understand the team composition of your trade counterpart.
If your trade counterpart is looking to optimize on something like BLK and PTS, then examine who on your roster can add as many BLK/PTS z-scores as FVV adds AST/STL z-scores.
In summary, valuations of players for trades come down to the dimensions/categories that are relevant for you (and then any personal opinions/availability outlook/player news that could affect projections).
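
Here is a rough sketch of that idea: value a player as the sum of his z-scores over only the categories a given manager cares about, and compare what each side gains and gives up. The z-scores are copied from the all-player table earlier in this post. Haliburton stands in for the AST/STL specialist because FVV's all-player z-scores are not shown in that table. The function and variable names (`value_to`, `my_focus`, `their_focus`) and the two focus lists are hypothetical.

```python
# All-player z-scores copied from the table earlier in this post
Z = {
    "Tyrese Haliburton": {"PTS": 1.230773, "AST": 3.380496, "STL": 2.428216, "BLK": -0.467872},
    "Joel Embiid": {"PTS": 2.755406, "AST": 0.336963, "STL": 0.512464, "BLK": 1.856959},
}


def value_to(focus_categories, player):
    """Sum a player's z-scores over only the categories a manager cares about."""
    return sum(Z[player][cat] for cat in focus_categories)


my_focus = ["AST", "STL"]     # hypothetical: my build
their_focus = ["BLK", "PTS"]  # hypothetical: my trade counterpart's build

# Proposed trade: I receive Haliburton and send Embiid
my_gain = value_to(my_focus, "Tyrese Haliburton") - value_to(my_focus, "Joel Embiid")
their_gain = value_to(their_focus, "Joel Embiid") - value_to(their_focus, "Tyrese Haliburton")

print(f"My net change in AST/STL z-scores:    {my_gain:+.2f}")
print(f"Their net change in BLK/PTS z-scores: {their_gain:+.2f}")
```

When both net changes are positive, each manager improves in the categories they are building around, which is one way a trade can look lopsided from the outside and still make sense to both parties.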

Or, yes, you could try to fleece your opponent and trade a player with "value 10" for an unequivocally better player with "value 15", but this is probably not the best way to conduct fantasy basketball trades.
If you are successful, then this is certainly one way to increase the "net worth" of your team without any consideration for winning categories.

## Building an interactive tool

I've updated the [streamlit fantasy basketball dashboard](https://ahy3nz-catsketball-catsketballapp-aif99i.streamlit.app/) to apply these ideas ([GitHub repo here](https://github.com/ahy3nz/catsketball)).