## To-Dos

### Basic

- [x] Filter out players that should not count toward statistics
- [x] Calculate averages with top 10 per position
- [x] Calculate averages with top 50 general
- [x] Include new players for this season
- [x] Include players that got injured early last season
- [x] Add position value (position average against general average)
- [x] Assign tiers manually in final board

### Intermediate

- [ ] Include missing players from app
- [ ] Identify players who were unavailable last season and are likely to return strong (steals; check Cartola blogs)
- [x] Recalculate z-score using per-position number of players (GK: 10, CB: 20, FB: 20, MD: 30, AT: 30)
- [x] Add risk factor (with gamma to penalize unavailability less because z_all does it already)

### Advanced

- [x] Add floor factor (see ChatGPT)
- [x] Add Average Draft Position (ADP)

### Analysis

- [x] Prepare board for usage in 2026 season (quick filters, quick evaluation)
- [x] Rescore position tier
- [x] Score general tier
- [x] Reassign position and general ADP based on tiers
- [x] Send to WhatsApp

In [1]:
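# Each `!` command runs in its own subshell, so activating a venv there does not persist.
# %pip installs into the environment of the running kernel.
%pip install pandas
%pip install openpyxl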

Looking in indexes: https://pypi.org/simple, https://aws:****@dsi-835811189142.d.codeartifact.eu-west-1.amazonaws.com/pypi/dsi_repository/simple


In [2]:
import pandas as pd

In [3]:
def prepare_data(year: int, num_round: int) -> pd.DataFrame:
    """Load one round's raw CSV and return it with English column names and position labels."""
    data_round = pd.read_csv(f"data/01_raw/{year}/rodada-{num_round}.csv")
    
    cols = {
        "atletas.atleta_id": "id",
        "atletas.apelido": "name",
        "atletas.clube.id.full.name": "team",
        "atletas.posicao_id": "position_id",
        "atletas.jogos_num": "matches",
        "atletas.pontos_num": "pts_round",
        "atletas.media_num": "pts_avg_played",
        "atletas.entrou_em_campo": "has_played",
    }
    # Select the raw columns and rename them in one step
    data_round = data_round[list(cols)].rename(columns=cols)
    
    dict_position = {
        1: "GK",
        2: "FB",
        3: "CB",
        4: "MD",
        5: "AT",
        6: "HC",
    }
    data_round["position"] = data_round["position_id"].map(dict_position)
    data_round.drop("position_id", axis=1, inplace=True)
    
    data_round["round"] = num_round
    
    return data_round

In [4]:
# Use data from last round since it aggregates full season stats
data_25 = prepare_data(2025, 38)
data_25.head()

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round
0,37281,Mano Menezes,GRE,31,10.67,5.43,True,HC,38
1,37457,Léo Condé,CEA,35,4.09,4.98,True,HC,38
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38
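
The select-rename-map pattern inside `prepare_data` can be tried without the raw files. The sketch below is a minimal illustration on an invented two-row frame; the sample values and the `extra_column` field are assumptions, and only three of the real columns are included.

```python
import pandas as pd

# Hypothetical two-row sample mimicking the raw CSV layout (values invented).
raw = pd.DataFrame(
    {
        "atletas.atleta_id": [1, 2],
        "atletas.apelido": ["Player A", "Player B"],
        "atletas.posicao_id": [1, 5],
        "extra_column": ["x", "y"],  # not in `cols`, so it is dropped
    }
)

cols = {
    "atletas.atleta_id": "id",
    "atletas.apelido": "name",
    "atletas.posicao_id": "position_id",
}
dict_position = {1: "GK", 5: "AT"}

# Select by the dict's keys and rename with the same dict,
# so the selection and the new names cannot drift apart.
tidy = raw[list(cols)].rename(columns=cols)

# Translate numeric position ids into short labels, then drop the id column.
tidy["position"] = tidy["position_id"].map(dict_position)
tidy = tidy.drop(columns="position_id")

print(tidy.columns.tolist())
```

Any position id missing from the mapping becomes `NaN` after `.map`, which is worth checking if the raw data ever introduces a new position.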


In [5]:
# Load all 38 rounds and concatenate once (avoids shadowing the built-in `round`)
data_all_25 = pd.concat(
    [prepare_data(2025, num_round) for num_round in range(1, 39)],
    ignore_index=True,
)
data_all_25.head()

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round
0,37281,Mano Menezes,FLU,0,0.0,0.0,False,HC,1
1,37457,Léo Condé,CEA,0,0.0,0.0,False,HC,1
2,37656,Fábio,FLU,0,0.0,0.0,False,GK,1
3,37715,Thiago Silva,FLU,0,0.0,0.0,False,CB,1
4,38398,Renato Augusto,FLU,0,0.0,0.0,False,MD,1


In [6]:
def count_max_matches(player_id: int) -> int:
    """Count the rounds in which the player is listed, i.e. the matches they could have played."""
    return data_all_25[data_all_25["id"] == player_id]["matches"].count()

In [7]:
data_25["matches_max"] = data_25["id"].apply(count_max_matches)
data_25["availability"] = data_25["matches"] / data_25["matches_max"]
data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,availability
0,37281,Mano Menezes,GRE,31,10.67,5.43,True,HC,38,34,0.911765
1,37457,Léo Condé,CEA,35,4.09,4.98,True,HC,38,38,0.921053
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,0.947368
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,0.605263
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,0.605263
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,0.710526
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,0.684211
7,39850,Vagner Mancini,RBB,5,3.04,4.26,True,HC,38,8,0.625
8,40006,Abel Braga,INT,2,7.58,5.6,True,HC,38,2,1.0
9,40990,Dorival Júnior,COR,32,4.65,5.11,True,HC,38,33,0.969697
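
Availability here is `matches / matches_max`, where `matches_max` is the number of rounds in which the player is listed. The per-player `apply` rescans the full table for every id; a grouped count gives the same numbers in one pass. This is a sketch on invented data, with `rounds` and `season` standing in for `data_all_25` and `data_25`.

```python
import pandas as pd

# Toy stand-in for data_all_25: one row per player per round listed (values invented).
rounds = pd.DataFrame(
    {
        "id": [1, 1, 1, 1, 2, 2],
        "matches": [0, 1, 2, 2, 1, 2],
    }
)
# Toy stand-in for data_25: season totals taken from the final round.
season = pd.DataFrame({"id": [1, 2], "matches": [2, 2]})

# Number of rounds each player is listed in, computed once for all players.
rounds_listed = rounds.groupby("id")["matches"].count()

season["matches_max"] = season["id"].map(rounds_listed)
season["availability"] = season["matches"] / season["matches_max"]

print(season)
```

Player 1 is listed in four rounds and played two, so availability is 0.5. Player 2 joined later, is listed in two rounds and played both, so availability is 1.0 (the same effect as the Abel Braga row above).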


In [8]:
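def calculate_std_played(player_id: int) -> float:
    """Standard deviation of the player's season-to-date average across rounds (not of per-round points)."""
    return data_all_25[data_all_25["id"] == player_id]["pts_avg_played"].std()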

In [9]:
data_25["pts_std_played"] = data_25["id"].apply(calculate_std_played)
data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,availability,pts_std_played
0,37281,Mano Menezes,GRE,31,10.67,5.43,True,HC,38,34,0.911765,0.964043
1,37457,Léo Condé,CEA,35,4.09,4.98,True,HC,38,38,0.921053,1.082931
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,0.947368,0.738093
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,0.605263,0.864865
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,0.605263,0.909614
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,0.710526,1.132822
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,0.684211,0.51608
7,39850,Vagner Mancini,RBB,5,3.04,4.26,True,HC,38,8,0.625,0.853914
8,40006,Abel Braga,INT,2,7.58,5.6,True,HC,38,2,1.0,1.407142
9,40990,Dorival Júnior,COR,32,4.65,5.11,True,HC,38,33,0.969697,1.055737
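
`pts_std_played` is the standard deviation of `pts_avg_played` across rounds. Assuming that column is a season-to-date average (it is 0.0 in round 1 and matches the season figure in round 38), this measures how much the running average moved, which is usually smoother than match-to-match scoring. The sketch below contrasts the two on an invented log; using `pts_round` with `has_played` as an alternative is a suggestion, not what the notebook does.

```python
import pandas as pd

# Toy per-round log for one player (values invented).
log = pd.DataFrame(
    {
        "id": [1, 1, 1, 1],
        "pts_round": [2.0, 10.0, 0.0, 6.0],
        "has_played": [True, True, False, True],
    }
)

played = log[log["has_played"]]

# Running average after each played round, analogous to pts_avg_played.
running_avg = played["pts_round"].expanding().mean()

# Spread of the running average (analogous to the cell above).
std_of_running_avg = running_avg.std()

# Spread of the actual per-round scores (match-to-match volatility).
std_of_round_pts = played["pts_round"].std()

print(std_of_running_avg, std_of_round_pts)
```

On this toy log the per-round spread is 4.0 while the running-average spread is about 2.3, so the two measures can rank volatile players quite differently.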


In [10]:
# Keep players who:
# - Played at least 5 matches
# - Have availability of at least 30%
# - Are not head coaches
mask = (data_25["matches"] >= 5) & (data_25["availability"] >= 0.3) & (data_25["position"] != "HC")
data_25 = data_25[mask].copy()  # copy so later column assignments do not raise SettingWithCopyWarning
data_25.head()

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,availability,pts_std_played
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,0.947368,0.738093
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,0.605263,0.864865
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,0.605263,0.909614
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,0.710526,1.132822
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,0.684211,0.51608


In [11]:
data_25["pts_avg_all"] = data_25["pts_avg_played"] * data_25["availability"]
data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,availability,pts_std_played,pts_avg_all
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,0.947368,0.738093,4.102105
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,0.605263,0.864865,2.554211
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,0.605263,0.909614,1.670526
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,0.710526,1.132822,4.398158
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,0.684211,0.51608,1.628421
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,0.684211,0.720983,1.17
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,0.921053,1.084309,5.084211
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,0.736842,2.74148,4.841053
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,0.789474,1.230944,3.039474
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,0.552632,0.942813,1.132895
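
`pts_avg_all` spreads a player's points over every round they were listed, so missed rounds count as zero. A quick check by hand, using the Fábio row from the output above:

```python
# Values copied from the Fábio row in the output above.
pts_avg_played = 4.33
matches = 36
matches_max = 38

# Share of listed rounds in which the player actually played.
availability = matches / matches_max

# Average per listed round: missed rounds contribute zero points.
pts_avg_all = pts_avg_played * availability

print(round(availability, 6), round(pts_avg_all, 6))
```

This should reproduce the 0.947368 and 4.102105 shown in the table.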


In [12]:
# Calculate value of each position for top players

# Average of top 10 players in each position
top10_means = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(10).mean())
(top10_means - top10_means.mean()).sort_values(ascending=False)

position
AT    0.9536
FB    0.7906
MD    0.0096
GK   -0.6784
CB   -1.0754
Name: pts_avg_played, dtype: float64
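
The position value above is each position's top-N mean minus the mean of those top-N means. A minimal sketch on invented numbers, using the top 2 instead of the top 10 so the arithmetic is easy to follow:

```python
import pandas as pd

# Toy data: three players in each of two positions (values invented).
toy = pd.DataFrame(
    {
        "position": ["GK", "GK", "GK", "AT", "AT", "AT"],
        "pts_avg_played": [5.0, 3.0, 1.0, 8.0, 6.0, 2.0],
    }
)

# Mean of the top 2 players per position (the notebook uses 10).
top2_means = toy.groupby("position")["pts_avg_played"].apply(
    lambda x: x.nlargest(2).mean()
)

# Position value: top-N mean relative to the average across positions.
position_value = top2_means - top2_means.mean()

print(position_value.sort_values(ascending=False))
```

Here the top-2 means are 7.0 for AT and 4.0 for GK, so AT sits 1.5 above the cross-position average and GK 1.5 below it.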

In [13]:
# Measure the spread within each position's top players

# Standard deviation of the top 10 players in each position
top10_std = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(10).std())
top10_std.sort_values(ascending=False)

position
MD    1.318943
GK    1.258522
AT    1.203144
FB    0.806391
CB    0.318580
Name: pts_avg_played, dtype: float64

In [14]:
# Calculate value of each position for general players

# Average of top 30 players in each position
top30_means = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(30).mean())
(top30_means - top30_means.mean()).sort_values(ascending=False)

position
FB    0.865400
AT    0.559067
MD   -0.234933
CB   -0.510600
GK   -0.678933
Name: pts_avg_played, dtype: float64

In [15]:
# Measure the spread within each position's general players

# Standard deviation of the top 30 players in each position
top30_std = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(30).std())
top30_std.sort_values(ascending=False)

position
GK    1.376150
AT    1.301418
MD    1.244857
FB    0.890872
CB    0.491676
Name: pts_avg_played, dtype: float64

In [16]:
# Number of players to compare per position
dict_num_players = {
    "GK": 10,
    "FB": 20,
    "CB": 20,
    "MD": 30,
    "AT": 30,
}

In [17]:
# Calculate mean and std of top X players per position
topX_means = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(dict_num_players[x.name]).mean())
topX_stds = data_25.groupby("position")["pts_avg_played"].apply(lambda x: x.nlargest(dict_num_players[x.name]).std())

data_25["position__pts_avg_played"] = data_25["position"].map(topX_means)
data_25["position__pts_std_played"] = data_25["position"].map(topX_stds)

# Calculate z-score
data_25["pts_avg_played__vs_top_position"] = data_25["pts_avg_played"] - data_25["position__pts_avg_played"]
data_25["pts_z_played__vs_top_position"] = (data_25["pts_avg_played"] - data_25["position__pts_avg_played"]) / data_25["position__pts_std_played"]

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,availability,pts_std_played,pts_avg_all,position__pts_avg_played,position__pts_std_played,pts_avg_played__vs_top_position,pts_z_played__vs_top_position
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,0.947368,0.738093,4.102105,5.541,1.258522,-1.211,-0.96224
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,0.605263,0.864865,2.554211,4.857,0.396818,-0.637,-1.605272
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,0.605263,0.909614,1.670526,4.877333,1.244857,-2.117333,-1.700865
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,0.710526,1.132822,4.398158,5.671333,1.301418,0.518667,0.39854
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,0.684211,0.51608,1.628421,4.877333,1.244857,-2.497333,-2.006121
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,0.684211,0.720983,1.17,5.671333,1.301418,-3.961333,-3.04386
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,0.921053,1.084309,5.084211,5.541,1.258522,-0.021,-0.016686
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,0.736842,2.74148,4.841053,5.541,1.258522,1.029,0.817626
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,0.789474,1.230944,3.039474,4.877333,1.244857,-1.027333,-0.825262
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,0.552632,0.942813,1.132895,5.671333,1.301418,-3.621333,-2.782607
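
The z-score above compares every player against a reference pool made of only the top X players in their position, with X taken from `dict_num_players` through the group key `x.name`. A sketch of the same pattern on invented data, where `pool_size` plays the role of `dict_num_players`:

```python
import pandas as pd

# Toy data (values invented): 3 goalkeepers and 4 attackers.
toy = pd.DataFrame(
    {
        "position": ["GK", "GK", "GK", "AT", "AT", "AT", "AT"],
        "pts_avg_played": [6.0, 4.0, 1.0, 9.0, 7.0, 5.0, 1.0],
    }
)
pool_size = {"GK": 2, "AT": 3}  # stand-in for dict_num_players

grouped = toy.groupby("position")["pts_avg_played"]

# Inside apply, x.name is the group key, so each position gets its own pool size.
pool_mean = grouped.apply(lambda x: x.nlargest(pool_size[x.name]).mean())
pool_std = grouped.apply(lambda x: x.nlargest(pool_size[x.name]).std())

# Every player, inside or outside the pool, is scored against the pool.
toy["z"] = (
    toy["pts_avg_played"] - toy["position"].map(pool_mean)
) / toy["position"].map(pool_std)

print(toy)
```

Because the mean and standard deviation come from the pool only, players outside it can land well below -2, which is why rows such as Osvaldo show z-scores near -3 in the real table.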


In [18]:
# Calculate mean and std of top X players per position
topX_means = data_25.groupby("position")["pts_avg_all"].apply(lambda x: x.nlargest(dict_num_players[x.name]).mean())
topX_stds = data_25.groupby("position")["pts_avg_all"].apply(lambda x: x.nlargest(dict_num_players[x.name]).std())

data_25["position__pts_avg_all"] = data_25["position"].map(topX_means)
data_25["position__pts_std_all"] = data_25["position"].map(topX_stds)

# Calculate z-score
data_25["pts_avg_all__vs_top_position"] = data_25["pts_avg_all"] - data_25["position__pts_avg_all"]
data_25["pts_z_all__vs_top_position"] = (data_25["pts_avg_all"] - data_25["position__pts_avg_all"]) / data_25["position__pts_std_all"]

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,pts_std_played,pts_avg_all,position__pts_avg_played,position__pts_std_played,pts_avg_played__vs_top_position,pts_z_played__vs_top_position,position__pts_avg_all,position__pts_std_all,pts_avg_all__vs_top_position,pts_z_all__vs_top_position
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,...,0.738093,4.102105,5.541,1.258522,-1.211,-0.96224,4.16164,0.713226,-0.059535,-0.083473
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,...,0.864865,2.554211,4.857,0.396818,-0.637,-1.605272,3.313605,0.554797,-0.759395,-1.368779
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,...,0.909614,1.670526,4.877333,1.244857,-2.117333,-1.700865,3.59762,1.149005,-1.927094,-1.677185
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,...,1.132822,4.398158,5.671333,1.301418,0.518667,0.39854,4.358933,1.096716,0.039225,0.035766
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,...,0.51608,1.628421,4.877333,1.244857,-2.497333,-2.006121,3.59762,1.149005,-1.969199,-1.71383
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,...,0.720983,1.17,5.671333,1.301418,-3.961333,-3.04386,4.358933,1.096716,-3.188933,-2.907712
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,...,1.084309,5.084211,5.541,1.258522,-0.021,-0.016686,4.16164,0.713226,0.92257,1.293517
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,...,2.74148,4.841053,5.541,1.258522,1.029,0.817626,4.16164,0.713226,0.679413,0.95259
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,...,1.230944,3.039474,4.877333,1.244857,-1.027333,-0.825262,3.59762,1.149005,-0.558147,-0.485765
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,...,0.942813,1.132895,5.671333,1.301418,-3.621333,-2.782607,4.358933,1.096716,-3.226038,-2.941545


In [19]:
# Calculate mean and std of top 200 players
data_25["general__pts_avg_played"] = data_25["pts_avg_played"].nlargest(200).mean()
data_25["general__pts_std_played"] = data_25["pts_avg_played"].nlargest(200).std()

# Calculate z-score
data_25["pts_avg_played__vs_top_general"] = data_25["pts_avg_played"] - data_25["general__pts_avg_played"]
data_25["pts_z_played__vs_top_general"] = (data_25["pts_avg_played"] - data_25["general__pts_avg_played"]) / data_25["general__pts_std_played"]

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,pts_avg_played__vs_top_position,pts_z_played__vs_top_position,position__pts_avg_all,position__pts_std_all,pts_avg_all__vs_top_position,pts_z_all__vs_top_position,general__pts_avg_played,general__pts_std_played,pts_avg_played__vs_top_general,pts_z_played__vs_top_general
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,...,-1.211,-0.96224,4.16164,0.713226,-0.059535,-0.083473,4.90055,1.10607,-0.57055,-0.515835
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,...,-0.637,-1.605272,3.313605,0.554797,-0.759395,-1.368779,4.90055,1.10607,-0.68055,-0.615287
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,...,-2.117333,-1.700865,3.59762,1.149005,-1.927094,-1.677185,4.90055,1.10607,-2.14055,-1.935276
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,...,0.518667,0.39854,4.358933,1.096716,0.039225,0.035766,4.90055,1.10607,1.28945,1.165795
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,...,-2.497333,-2.006121,3.59762,1.149005,-1.969199,-1.71383,4.90055,1.10607,-2.52055,-2.278835
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,...,-3.961333,-3.04386,4.358933,1.096716,-3.188933,-2.907712,4.90055,1.10607,-3.19055,-2.884583
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,...,-0.021,-0.016686,4.16164,0.713226,0.92257,1.293517,4.90055,1.10607,0.61945,0.560046
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,...,1.029,0.817626,4.16164,0.713226,0.679413,0.95259,4.90055,1.10607,1.66945,1.509353
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,...,-1.027333,-0.825262,3.59762,1.149005,-0.558147,-0.485765,4.90055,1.10607,-1.05055,-0.949805
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,...,-3.621333,-2.782607,4.358933,1.096716,-3.226038,-2.941545,4.90055,1.10607,-2.85055,-2.577188


In [20]:
# Calculate mean and std of top 200 players
data_25["general__pts_avg_all"] = data_25["pts_avg_all"].nlargest(200).mean()
data_25["general__pts_std_all"] = data_25["pts_avg_all"].nlargest(200).std()

# Calculate z-score
data_25["pts_avg_all__vs_top_general"] = data_25["pts_avg_all"] - data_25["general__pts_avg_all"]
data_25["pts_z_all__vs_top_general"] = (data_25["pts_avg_all"] - data_25["general__pts_avg_all"]) / data_25["general__pts_std_all"]

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,pts_avg_all__vs_top_position,pts_z_all__vs_top_position,general__pts_avg_played,general__pts_std_played,pts_avg_played__vs_top_general,pts_z_played__vs_top_general,general__pts_avg_all,general__pts_std_all,pts_avg_all__vs_top_general,pts_z_all__vs_top_general
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,...,-0.059535,-0.083473,4.90055,1.10607,-0.57055,-0.515835,3.401697,1.082086,0.700408,0.647276
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,...,-0.759395,-1.368779,4.90055,1.10607,-0.68055,-0.615287,3.401697,1.082086,-0.847486,-0.783197
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,...,-1.927094,-1.677185,4.90055,1.10607,-2.14055,-1.935276,3.401697,1.082086,-1.731171,-1.599846
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,...,0.039225,0.035766,4.90055,1.10607,1.28945,1.165795,3.401697,1.082086,0.996461,0.920871
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,...,-1.969199,-1.71383,4.90055,1.10607,-2.52055,-2.278835,3.401697,1.082086,-1.773276,-1.638757
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,...,-3.188933,-2.907712,4.90055,1.10607,-3.19055,-2.884583,3.401697,1.082086,-2.231697,-2.062403
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,...,0.92257,1.293517,4.90055,1.10607,0.61945,0.560046,3.401697,1.082086,1.682514,1.55488
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,...,0.679413,0.95259,4.90055,1.10607,1.66945,1.509353,3.401697,1.082086,1.439356,1.330168
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,...,-0.558147,-0.485765,4.90055,1.10607,-1.05055,-0.949805,3.401697,1.082086,-0.362223,-0.334745
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,...,-3.226038,-2.941545,4.90055,1.10607,-2.85055,-2.577188,3.401697,1.082086,-2.268802,-2.096693


In [21]:
# Calculate Draft Value Score (DVS)

alpha = 0.6
beta = 0.4

data_25["dvs_position"] = (
    alpha * data_25["pts_z_played__vs_top_position"] +
    beta * data_25["pts_z_all__vs_top_position"]
)

data_25["dvs_general"] = (
    alpha * data_25["pts_z_played__vs_top_general"] +
    beta * data_25["pts_z_all__vs_top_general"]
)

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,general__pts_avg_played,general__pts_std_played,pts_avg_played__vs_top_general,pts_z_played__vs_top_general,general__pts_avg_all,general__pts_std_all,pts_avg_all__vs_top_general,pts_z_all__vs_top_general,dvs_position,dvs_general
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,...,4.90055,1.10607,-0.57055,-0.515835,3.401697,1.082086,0.700408,0.647276,-0.610733,-0.050591
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,...,4.90055,1.10607,-0.68055,-0.615287,3.401697,1.082086,-0.847486,-0.783197,-1.510675,-0.682451
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,...,4.90055,1.10607,-2.14055,-1.935276,3.401697,1.082086,-1.731171,-1.599846,-1.691393,-1.801104
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,...,4.90055,1.10607,1.28945,1.165795,3.401697,1.082086,0.996461,0.920871,0.25343,1.067825
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,...,4.90055,1.10607,-2.52055,-2.278835,3.401697,1.082086,-1.773276,-1.638757,-1.889205,-2.022804
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,...,4.90055,1.10607,-3.19055,-2.884583,3.401697,1.082086,-2.231697,-2.062403,-2.989401,-2.555711
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,...,4.90055,1.10607,0.61945,0.560046,3.401697,1.082086,1.682514,1.55488,0.507395,0.95798
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,...,4.90055,1.10607,1.66945,1.509353,3.401697,1.082086,1.439356,1.330168,0.871612,1.437679
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,...,4.90055,1.10607,-1.05055,-0.949805,3.401697,1.082086,-0.362223,-0.334745,-0.689463,-0.703781
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,...,4.90055,1.10607,-2.85055,-2.577188,3.401697,1.082086,-2.268802,-2.096693,-2.846182,-2.38499
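
The Draft Value Score is a weighted sum of the two z-scores: `alpha` weights performance when playing and `beta` weights performance across all listed rounds. The helper name `draft_value_score` below is hypothetical; the inputs are copied from the Hulk row above.

```python
ALPHA = 0.6  # weight of the z-score over matches played
BETA = 0.4   # weight of the z-score over all listed rounds


def draft_value_score(z_played: float, z_all: float,
                      alpha: float = ALPHA, beta: float = BETA) -> float:
    """Weighted combination of the two z-scores (hypothetical helper)."""
    return alpha * z_played + beta * z_all


# z-scores copied from the Hulk row in the outputs above.
hulk_dvs_position = draft_value_score(0.39854, 0.035766)
hulk_dvs_general = draft_value_score(1.165795, 0.920871)

print(round(hulk_dvs_position, 5), round(hulk_dvs_general, 5))
```

Up to rounding, these should match the `dvs_position` (0.25343) and `dvs_general` (1.067825) values in the table. Since `alpha + beta = 1`, the score stays on the same scale as the z-scores.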


In [22]:
# Calculate Adjusted DVS

data_25["risk"] = 1 - data_25["availability"]

gamma = 0.3  # mild penalty: availability is already reflected in pts_avg_all (and therefore in the z_all terms)

data_25["dvs_position_adj"] = data_25["dvs_position"] * (1 - data_25["risk"] * gamma)
data_25["dvs_general_adj"] = data_25["dvs_general"] * (1 - data_25["risk"] * gamma)

data_25.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,pts_z_played__vs_top_general,general__pts_avg_all,general__pts_std_all,pts_avg_all__vs_top_general,pts_z_all__vs_top_general,dvs_position,dvs_general,risk,dvs_position_adj,dvs_general_adj
2,37656,Fábio,FLU,36,11.5,4.33,True,GK,38,38,...,-0.515835,3.401697,1.082086,0.700408,0.647276,-0.610733,-0.050591,0.052632,-0.60109,-0.049792
3,37715,Thiago Silva,FLU,23,13.9,4.22,True,CB,38,38,...,-0.615287,3.401697,1.082086,-0.847486,-0.783197,-1.510675,-0.682451,0.394737,-1.331779,-0.601634
4,38913,Nenê,JUV,23,1.0,2.76,True,MD,38,38,...,-1.935276,3.401697,1.082086,-1.731171,-1.599846,-1.691393,-1.801104,0.394737,-1.491096,-1.587815
5,39148,Hulk,CAM,27,19.2,6.19,True,AT,38,38,...,1.165795,3.401697,1.082086,0.996461,0.920871,0.25343,1.067825,0.289474,0.231422,0.975093
6,39656,Alan Franco,CAM,26,1.2,2.38,True,MD,38,38,...,-2.278835,3.401697,1.082086,-1.773276,-1.638757,-1.889205,-2.022804,0.315789,-1.710227,-1.83117
11,42222,Osvaldo,VIT,26,0.0,1.71,False,AT,38,38,...,-2.884583,3.401697,1.082086,-2.231697,-2.062403,-2.989401,-2.555711,0.315789,-2.706194,-2.313591
12,42234,Cássio,CRU,35,0.0,5.52,False,GK,38,38,...,0.560046,3.401697,1.082086,1.682514,1.55488,0.507395,0.95798,0.078947,0.495378,0.935291
14,51413,Walter,MIR,28,0.0,6.57,False,GK,38,38,...,1.509353,3.401697,1.082086,1.439356,1.330168,0.871612,1.437679,0.263158,0.8028,1.324178
15,51772,Everton Ribeiro,BAH,30,1.7,3.85,True,MD,38,38,...,-0.949805,3.401697,1.082086,-0.362223,-0.334745,-0.689463,-0.703781,0.210526,-0.645918,-0.659332
16,61188,Gilberto,JUV,21,0.0,2.05,False,AT,38,38,...,-2.577188,3.401697,1.082086,-2.268802,-2.096693,-2.846182,-2.38499,0.447368,-2.464194,-2.0649
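
The adjustment multiplies the score by `1 - risk * gamma`, a factor between 0.7 and 1.0 for `gamma = 0.3`. One property worth knowing: for a negative DVS the multiplication moves the score toward zero, so a risky below-average player ends up slightly higher than before (Thiago Silva goes from -1.510675 to -1.331779 above). The sketch below reproduces the notebook's formula and shows a hypothetical subtractive variant, `adjust_subtractive`, which is an assumption for illustration and not part of the notebook.

```python
GAMMA = 0.3


def adjust_multiplicative(dvs: float, availability: float, gamma: float = GAMMA) -> float:
    """Adjustment used in the cell above: scale the score by (1 - risk * gamma)."""
    risk = 1 - availability
    return dvs * (1 - risk * gamma)


def adjust_subtractive(dvs: float, availability: float, gamma: float = GAMMA) -> float:
    """Hypothetical alternative: subtract a penalty so risk always lowers the score."""
    risk = 1 - availability
    return dvs - gamma * risk * abs(dvs)


# Hulk: positive DVS, 27 of 38 matches.
print(adjust_multiplicative(0.25343, 27 / 38))

# Thiago Silva: negative DVS, 23 of 38 matches.
print(adjust_multiplicative(-1.510675, 23 / 38))  # closer to zero than the input
print(adjust_subtractive(-1.510675, 23 / 38))     # further below zero than the input
```

For positive scores the two variants are identical, so the top of the board is unaffected; the difference only matters when ranking below-average players.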


In [23]:
# Ranking players per position based on adjusted position DVS
data_25["adp_position"] = data_25.groupby("position")["dvs_position_adj"].rank(ascending=False, method="min").astype(int)
data_25.sort_values(by=["dvs_position_adj"], ascending=False).head(20)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,general__pts_avg_all,general__pts_std_all,pts_avg_all__vs_top_general,pts_z_all__vs_top_general,dvs_position,dvs_general,risk,dvs_position_adj,dvs_general_adj,adp_position
161,87863,Arrascaeta,FLA,32,0.0,9.32,False,MD,38,38,...,3.401697,1.082086,4.446724,4.109401,3.62111,4.041141,0.157895,3.449584,3.849718,1
359,103445,Kaio Jorge,CRU,33,0.0,9.46,False,AT,38,38,...,3.401697,1.082086,4.813566,4.448415,3.153212,4.252691,0.131579,3.028743,4.084822,1
514,114802,Vitor Roque,PAL,31,0.0,8.31,False,AT,38,38,...,3.401697,1.082086,3.377514,3.121299,2.099256,3.098014,0.184211,1.983245,2.926808,2
258,96610,Matheus Pereira,CRU,34,0.0,7.03,False,MD,38,38,...,3.401697,1.082086,2.888303,2.6692,1.97484,2.222824,0.105263,1.912477,2.15263,2
85,78850,Reinaldo,MIR,32,0.0,7.94,False,FB,38,38,...,3.401697,1.082086,3.284619,3.035451,1.94191,2.862965,0.157895,1.849925,2.72735,1
398,105300,Ferraresi,SAO,23,0.0,5.99,False,CB,38,38,...,3.401697,1.082086,0.223829,0.20685,1.93802,0.673724,0.394737,1.708518,0.593941,1
497,112709,Kaiki Bruno,CRU,34,0.0,7.34,False,FB,38,38,...,3.401697,1.082086,3.165672,2.925527,1.484969,2.493518,0.105263,1.438075,2.414775,2
425,106708,Igor Formiga,JUV,16,6.3,7.47,True,FB,38,19,...,3.401697,1.082086,2.888829,2.669686,1.451088,2.461702,0.157895,1.382352,2.345095,3
565,122486,Rayan,VAS,32,0.0,7.3,False,AT,38,38,...,3.401697,1.082086,2.745672,2.537388,1.403161,2.316564,0.157895,1.336696,2.206832,3
423,106593,Villalba,CRU,32,0.0,5.21,False,CB,38,38,...,3.401697,1.082086,0.985672,0.9109,1.307913,0.532224,0.157895,1.245959,0.507014,2
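
The positional ADP is a within-group rank of the adjusted score, highest first. With `method="min"`, tied players share the best rank of their tie. A minimal sketch on invented scores:

```python
import pandas as pd

# Toy scores (values invented); two attackers are tied at 2.0.
toy = pd.DataFrame(
    {
        "position": ["AT", "AT", "AT", "GK", "GK"],
        "dvs_position_adj": [3.0, 2.0, 2.0, 1.5, 0.5],
    }
)

# Rank within each position, best score first; ties take the lowest rank number.
toy["adp_position"] = (
    toy.groupby("position")["dvs_position_adj"]
    .rank(ascending=False, method="min")
    .astype(int)
)

print(toy)
```

The tied attackers both receive rank 2. Note that `.astype(int)` raises an error if any score is `NaN`, so this step relies on every remaining player having a computed score.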


In [24]:
# Ranking all players based on adjusted general DVS
data_25["adp_general"] = data_25["dvs_general_adj"].rank(ascending=False, method="min").astype(int)
data_25.sort_values(by="dvs_general_adj", ascending=False).head(20)


Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round,matches_max,...,general__pts_std_all,pts_avg_all__vs_top_general,pts_z_all__vs_top_general,dvs_position,dvs_general,risk,dvs_position_adj,dvs_general_adj,adp_position,adp_general
359,103445,Kaio Jorge,CRU,33,0.0,9.46,False,AT,38,38,...,1.082086,4.813566,4.448415,3.153212,4.252691,0.131579,3.028743,4.084822,1,1
161,87863,Arrascaeta,FLA,32,0.0,9.32,False,MD,38,38,...,1.082086,4.446724,4.109401,3.62111,4.041141,0.157895,3.449584,3.849718,1,2
514,114802,Vitor Roque,PAL,31,0.0,8.31,False,AT,38,38,...,1.082086,3.377514,3.121299,2.099256,3.098014,0.184211,1.983245,2.926808,2,3
85,78850,Reinaldo,MIR,32,0.0,7.94,False,FB,38,38,...,1.082086,3.284619,3.035451,1.94191,2.862965,0.157895,1.849925,2.72735,1,4
497,112709,Kaiki Bruno,CRU,34,0.0,7.34,False,FB,38,38,...,1.082086,3.165672,2.925527,1.484969,2.493518,0.105263,1.438075,2.414775,2,5
425,106708,Igor Formiga,JUV,16,6.3,7.47,True,FB,38,19,...,1.082086,2.888829,2.669686,1.451088,2.461702,0.157895,1.382352,2.345095,3,6
565,122486,Rayan,VAS,32,0.0,7.3,False,AT,38,38,...,1.082086,2.745672,2.537388,1.403161,2.316564,0.157895,1.336696,2.206832,3,7
258,96610,Matheus Pereira,CRU,34,0.0,7.03,False,MD,38,38,...,1.082086,2.888303,2.6692,1.97484,2.222824,0.105263,1.912477,2.15263,2,8
176,89275,William,CRU,30,0.0,7.39,False,FB,38,38,...,1.082086,2.432514,2.247986,1.196747,2.249625,0.210526,1.121163,2.107543,5,9
229,93716,Paulo Henrique,VAS,26,0.0,7.82,False,FB,38,38,...,1.082086,1.948829,1.800993,1.274391,2.304086,0.315789,1.153659,2.085804,4,10


In [25]:
data_25 = data_25[
    [
        "id",
        "name",
        "team",
        "position",
        "matches",
        "availability",
        "pts_avg_played",
        "pts_avg_all",
        "pts_std_played",
        "adp_position",
        "dvs_position_adj",
        "dvs_position",
        "pts_z_played__vs_top_position",
        "pts_z_all__vs_top_position",
        "pts_avg_played__vs_top_position",
        "pts_avg_all__vs_top_position",
        "adp_general",
        "dvs_general_adj",
        "dvs_general",
        "pts_z_played__vs_top_general",
        "pts_z_all__vs_top_general",
        "pts_avg_played__vs_top_general",
        "pts_avg_all__vs_top_general",
    ]
]
data_25.head(10)

Unnamed: 0,id,name,team,position,matches,availability,pts_avg_played,pts_avg_all,pts_std_played,adp_position,...,pts_z_all__vs_top_position,pts_avg_played__vs_top_position,pts_avg_all__vs_top_position,adp_general,dvs_general_adj,dvs_general,pts_z_played__vs_top_general,pts_z_all__vs_top_general,pts_avg_played__vs_top_general,pts_avg_all__vs_top_general
2,37656,Fábio,FLU,GK,36,0.947368,4.33,4.102105,0.738093,8,...,-0.083473,-1.211,-0.059535,76,-0.049792,-0.050591,-0.515835,0.647276,-0.57055,0.700408
3,37715,Thiago Silva,FLU,CB,23,0.605263,4.22,2.554211,0.864865,23,...,-1.368779,-0.637,-0.759395,148,-0.601634,-0.682451,-0.615287,-0.783197,-0.68055,-0.847486
4,38913,Nenê,JUV,MD,23,0.605263,2.76,1.670526,0.909614,67,...,-1.677185,-2.117333,-1.927094,293,-1.587815,-1.801104,-1.935276,-1.599846,-2.14055,-1.731171
5,39148,Hulk,CAM,AT,27,0.710526,6.19,4.398158,1.132822,9,...,0.035766,0.518667,0.039225,23,0.975093,1.067825,1.165795,0.920871,1.28945,0.996461
6,39656,Alan Franco,CAM,MD,26,0.684211,2.38,1.628421,0.51608,85,...,-1.71383,-2.497333,-1.969199,331,-1.83117,-2.022804,-2.278835,-1.638757,-2.52055,-1.773276
11,42222,Osvaldo,VIT,AT,26,0.684211,1.71,1.17,0.720983,117,...,-2.907712,-3.961333,-3.188933,401,-2.313591,-2.555711,-2.884583,-2.062403,-3.19055,-2.231697
12,42234,Cássio,CRU,GK,35,0.921053,5.52,5.084211,1.084309,3,...,1.293517,-0.021,0.92257,24,0.935291,0.95798,0.560046,1.55488,0.61945,1.682514
14,51413,Walter,MIR,GK,28,0.736842,6.57,4.841053,2.74148,1,...,0.95259,1.029,0.679413,16,1.324178,1.437679,1.509353,1.330168,1.66945,1.439356
15,51772,Everton Ribeiro,BAH,MD,30,0.789474,3.85,3.039474,1.230944,27,...,-0.485765,-1.027333,-0.558147,161,-0.659332,-0.703781,-0.949805,-0.334745,-1.05055,-0.362223
16,61188,Gilberto,JUV,AT,21,0.552632,2.05,1.132895,0.942813,103,...,-2.941545,-3.621333,-3.226038,363,-2.0649,-2.38499,-2.577188,-2.096693,-2.85055,-2.268802


In [26]:
# Load data for 2026 season
data_26_start = prepare_data(2026, 1)
data_26_start.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round
0,37656,Fábio,FLU,1,2.1,2.1,True,GK,1
1,39148,Hulk,CAM,1,5.3,5.3,True,AT,1
2,39656,Alan Franco,CAM,1,-0.1,-0.1,True,MD,1
3,39850,Vagner Mancini,RBB,1,8.07,8.07,True,HC,1
4,40990,Dorival Júnior,COR,1,6.6,6.6,True,HC,1
5,42135,Willian,GRE,1,-0.3,-0.3,True,MD,1
6,42222,Osvaldo,VIT,1,0.8,0.8,True,AT,1
7,42234,Cássio,CRU,1,-2.7,-2.7,True,GK,1
8,42500,Fagner,CRU,1,0.9,0.9,True,FB,1
9,45125,Tite,CRU,1,1.37,1.37,True,HC,1


In [27]:
# Remove head coaches
data_26_start = data_26_start[data_26_start["position"] != "HC"]
data_26_start.head(10)

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round
0,37656,Fábio,FLU,1,2.1,2.1,True,GK,1
1,39148,Hulk,CAM,1,5.3,5.3,True,AT,1
2,39656,Alan Franco,CAM,1,-0.1,-0.1,True,MD,1
5,42135,Willian,GRE,1,-0.3,-0.3,True,MD,1
6,42222,Osvaldo,VIT,1,0.8,0.8,True,AT,1
7,42234,Cássio,CRU,1,-2.7,-2.7,True,GK,1
8,42500,Fagner,CRU,1,0.9,0.9,True,FB,1
10,50742,Gabriel Leite,CFC,0,0.0,0.0,False,GK,1
11,51413,Walter,MIR,1,4.2,4.2,True,GK,1
12,51772,Everton Ribeiro,BAH,1,1.5,1.5,True,MD,1


In [28]:
# Remove unused columns
data_26_start = data_26_start[["id", "name", "team", "position"]]
data_26_start.head(10)

Unnamed: 0,id,name,team,position
0,37656,Fábio,FLU,GK
1,39148,Hulk,CAM,AT
2,39656,Alan Franco,CAM,MD
5,42135,Willian,GRE,MD
6,42222,Osvaldo,VIT,AT
7,42234,Cássio,CRU,GK
8,42500,Fagner,CRU,FB
10,50742,Gabriel Leite,CFC,GK
11,51413,Walter,MIR,GK
12,51772,Everton Ribeiro,BAH,MD


In [29]:
# Left-merge 2025 stats onto the 2026 roster; players without 2025 stats get NaN
data = pd.merge(left=data_26_start, right=data_25, how="left", on="id", suffixes=("", "_25"))
data.head(10)

Unnamed: 0,id,name,team,position,name_25,team_25,position_25,matches,availability,pts_avg_played,...,pts_z_all__vs_top_position,pts_avg_played__vs_top_position,pts_avg_all__vs_top_position,adp_general,dvs_general_adj,dvs_general,pts_z_played__vs_top_general,pts_z_all__vs_top_general,pts_avg_played__vs_top_general,pts_avg_all__vs_top_general
0,37656,Fábio,FLU,GK,Fábio,FLU,GK,36.0,0.947368,4.33,...,-0.083473,-1.211,-0.059535,76.0,-0.049792,-0.050591,-0.515835,0.647276,-0.57055,0.700408
1,39148,Hulk,CAM,AT,Hulk,CAM,AT,27.0,0.710526,6.19,...,0.035766,0.518667,0.039225,23.0,0.975093,1.067825,1.165795,0.920871,1.28945,0.996461
2,39656,Alan Franco,CAM,MD,Alan Franco,CAM,MD,26.0,0.684211,2.38,...,-1.71383,-2.497333,-1.969199,331.0,-1.83117,-2.022804,-2.278835,-1.638757,-2.52055,-1.773276
3,42135,Willian,GRE,MD,,,,,,,...,,,,,,,,,,
4,42222,Osvaldo,VIT,AT,Osvaldo,VIT,AT,26.0,0.684211,1.71,...,-2.907712,-3.961333,-3.188933,401.0,-2.313591,-2.555711,-2.884583,-2.062403,-3.19055,-2.231697
5,42234,Cássio,CRU,GK,Cássio,CRU,GK,35.0,0.921053,5.52,...,1.293517,-0.021,0.92257,24.0,0.935291,0.95798,0.560046,1.55488,0.61945,1.682514
6,42500,Fagner,CRU,FB,,,,,,,...,,,,,,,,,,
7,50742,Gabriel Leite,CFC,GK,,,,,,,...,,,,,,,,,,
8,51413,Walter,MIR,GK,Walter,MIR,GK,28.0,0.736842,6.57,...,0.95259,1.029,0.679413,16.0,1.324178,1.437679,1.509353,1.330168,1.66945,1.439356
9,51772,Everton Ribeiro,BAH,MD,Everton Ribeiro,BAH,MD,30.0,0.789474,3.85,...,-0.485765,-1.027333,-0.558147,161.0,-0.659332,-0.703781,-0.949805,-0.334745,-1.05055,-0.362223
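
After the left merge, players with no row in the filtered 2025 table show up with empty stats. The `indicator=True` option of `pd.merge` makes that explicit. This is a sketch on invented frames; the `is_new` column is a hypothetical addition, and "new" here means absent from the filtered 2025 table, which also covers players removed by the earlier match and availability filter.

```python
import pandas as pd

# Toy stand-ins (values invented) for data_26_start and data_25.
current = pd.DataFrame({"id": [1, 2, 3], "name": ["A", "B", "C"]})
previous = pd.DataFrame({"id": [1, 3], "name": ["A", "C"], "dvs": [0.5, -0.2]})

merged = current.merge(
    previous,
    how="left",
    on="id",
    suffixes=("", "_25"),  # 2026 columns keep their names, 2025 duplicates get _25
    indicator=True,        # adds a _merge column: "both" or "left_only"
)

# Flag players that have no 2025 row.
merged["is_new"] = merged["_merge"] == "left_only"

print(merged[["id", "name", "dvs", "is_new"]])
```

Player 2 has no 2025 row, so `dvs` is `NaN` and `is_new` is `True`.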


In [30]:
# Removing duplicate columns
data.drop([col for col in data.columns if col.endswith("_25")], axis=1, inplace=True)
data.head(10)

Unnamed: 0,id,name,team,position,matches,availability,pts_avg_played,pts_avg_all,pts_std_played,adp_position,...,pts_z_all__vs_top_position,pts_avg_played__vs_top_position,pts_avg_all__vs_top_position,adp_general,dvs_general_adj,dvs_general,pts_z_played__vs_top_general,pts_z_all__vs_top_general,pts_avg_played__vs_top_general,pts_avg_all__vs_top_general
0,37656,Fábio,FLU,GK,36.0,0.947368,4.33,4.102105,0.738093,8.0,...,-0.083473,-1.211,-0.059535,76.0,-0.049792,-0.050591,-0.515835,0.647276,-0.57055,0.700408
1,39148,Hulk,CAM,AT,27.0,0.710526,6.19,4.398158,1.132822,9.0,...,0.035766,0.518667,0.039225,23.0,0.975093,1.067825,1.165795,0.920871,1.28945,0.996461
2,39656,Alan Franco,CAM,MD,26.0,0.684211,2.38,1.628421,0.51608,85.0,...,-1.71383,-2.497333,-1.969199,331.0,-1.83117,-2.022804,-2.278835,-1.638757,-2.52055,-1.773276
3,42135,Willian,GRE,MD,,,,,,,...,,,,,,,,,,
4,42222,Osvaldo,VIT,AT,26.0,0.684211,1.71,1.17,0.720983,117.0,...,-2.907712,-3.961333,-3.188933,401.0,-2.313591,-2.555711,-2.884583,-2.062403,-3.19055,-2.231697
5,42234,Cássio,CRU,GK,35.0,0.921053,5.52,5.084211,1.084309,3.0,...,1.293517,-0.021,0.92257,24.0,0.935291,0.95798,0.560046,1.55488,0.61945,1.682514
6,42500,Fagner,CRU,FB,,,,,,,...,,,,,,,,,,
7,50742,Gabriel Leite,CFC,GK,,,,,,,...,,,,,,,,,,
8,51413,Walter,MIR,GK,28.0,0.736842,6.57,4.841053,2.74148,1.0,...,0.95259,1.029,0.679413,16.0,1.324178,1.437679,1.509353,1.330168,1.66945,1.439356
9,51772,Everton Ribeiro,BAH,MD,30.0,0.789474,3.85,3.039474,1.230944,27.0,...,-0.485765,-1.027333,-0.558147,161.0,-0.659332,-0.703781,-0.949805,-0.334745,-1.05055,-0.362223


In [31]:
# data.to_excel("draft-board__new.xlsx", engine="openpyxl")

In [32]:
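# Load every available 2026 round (only round 1 so far) into a single frame.
# The loop variable is named round_num so it does not shadow the built-in round().
frames = [prepare_data(2026, round_num) for round_num in range(1, 2)]
data_all_26 = pd.concat(frames, ignore_index=True)
data_all_26.head()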

Unnamed: 0,id,name,team,matches,pts_round,pts_avg_played,has_played,position,round
0,37656,Fábio,FLU,1,2.1,2.1,True,GK,1
1,39148,Hulk,CAM,1,5.3,5.3,True,AT,1
2,39656,Alan Franco,CAM,1,-0.1,-0.1,True,MD,1
3,39850,Vagner Mancini,RBB,1,8.07,8.07,True,HC,1
4,40990,Dorival Júnior,COR,1,6.6,6.6,True,HC,1


In [33]:
def avg_last_n_matches(player_id: int, n: int) -> float:
    """Mean points over the player's n most recent rounds in which they played."""
    played = data_all_26[(data_all_26["id"] == player_id) & (data_all_26["has_played"])]
    return played.sort_values(by="round", ascending=False)["pts_round"].head(n).mean()

In [34]:
def std_last_n_matches(player_id: int, n: int) -> float:
    """Sample std of points over the player's n most recent played rounds (NaN with fewer than 2)."""
    played = data_all_26[(data_all_26["id"] == player_id) & (data_all_26["has_played"])]
    return played.sort_values(by="round", ascending=False)["pts_round"].head(n).std()

In [41]:
def count_played_in_last_n_matches(player_id: int, n: int) -> int:
    """Number of the player's n most recent rounds (played or not) in which they actually played."""
    recent = data_all_26[data_all_26["id"] == player_id].sort_values(by="round", ascending=False).head(n)
    return int(recent["has_played"].sum())
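
The helpers above filter the full round table once per player, which can get slow as rounds accumulate. A possible vectorized alternative is sketched below. The function `last_n_stats` and the `toy_rounds` frame are hypothetical names introduced for illustration, and the sketch assumes the same `id`, `round`, `pts_round` and `has_played` columns.

```python
import pandas as pd

def last_n_stats(rounds: pd.DataFrame, n: int) -> pd.DataFrame:
    """Mean and sample std of pts_round over each player's n most recent played rounds."""
    played = rounds[rounds["has_played"]].sort_values("round", ascending=False)
    recent = played.groupby("id").head(n)  # first n rows per player = n latest played rounds
    stats = recent.groupby("id")["pts_round"].agg(["mean", "std"])
    return stats.rename(
        columns={"mean": f"pts_avg_played__last_{n}", "std": f"pts_std_played__last_{n}"}
    )

# Toy data: player 1 played four rounds, player 2 played one and missed one
toy_rounds = pd.DataFrame(
    {
        "id": [1, 1, 1, 1, 2, 2],
        "round": [1, 2, 3, 4, 1, 2],
        "pts_round": [2.0, 4.0, 6.0, 8.0, 5.0, 0.0],
        "has_played": [True, True, True, True, True, False],
    }
)
stats_3 = last_n_stats(toy_rounds, 3)
print(stats_3)
```

Players who never played are absent from the result, so joining it back with `how="left"` on `id` would leave them as NaN, matching what the per-player helpers return.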

In [42]:
data_26 = data_all_26.groupby("id").agg(
    {
        "name": "first",
        "team": "first",
        "position": "first",
    }
).reset_index()
data_26.head(10)

Unnamed: 0,id,name,team,position
0,37656,Fábio,FLU,GK
1,39148,Hulk,CAM,AT
2,39656,Alan Franco,CAM,MD
3,39850,Vagner Mancini,RBB,HC
4,40990,Dorival Júnior,COR,HC
5,42135,Willian,GRE,MD
6,42222,Osvaldo,VIT,AT
7,42234,Cássio,CRU,GK
8,42500,Fagner,CRU,FB
9,45125,Tite,CRU,HC


In [None]:
data_26["pts_avg_played__last_3"] = data_26["id"].apply(lambda idx: avg_last_n_matches(idx, 3))
data_26["pts_std_played__last_3"] = data_26["id"].apply(lambda idx: std_last_n_matches(idx, 3))

data_26["pts_avg_played__last_5"] = data_26["id"].apply(lambda idx: avg_last_n_matches(idx, 5))
data_26["pts_std_played__last_5"] = data_26["id"].apply(lambda idx: std_last_n_matches(idx, 5))

data_26["pts_avg_played__last_10"] = data_26["id"].apply(lambda idx: avg_last_n_matches(idx, 10))
data_26["pts_std_played__last_10"] = data_26["id"].apply(lambda idx: std_last_n_matches(idx, 10))

data_26["matches_played_in_last_5"] = data_26["id"].apply(lambda idx: count_played_in_last_n_matches(idx, 5))

data_26.head(10)

Unnamed: 0,id,name,team,position,pts_avg_played__last_3,pts_std_played__last_3,pts_avg_played__last_5,pts_std_played__last_5,pts_avg_played__last_10,pts_std_played__last_10,matches_played_in_last_5
0,37656,Fábio,FLU,GK,2.1,,2.1,,2.1,,1
1,39148,Hulk,CAM,AT,5.3,,5.3,,5.3,,1
2,39656,Alan Franco,CAM,MD,-0.1,,-0.1,,-0.1,,1
3,39850,Vagner Mancini,RBB,HC,8.07,,8.07,,8.07,,1
4,40990,Dorival Júnior,COR,HC,6.6,,6.6,,6.6,,1
5,42135,Willian,GRE,MD,-0.3,,-0.3,,-0.3,,1
6,42222,Osvaldo,VIT,AT,0.8,,0.8,,0.8,,1
7,42234,Cássio,CRU,GK,-2.7,,-2.7,,-2.7,,1
8,42500,Fagner,CRU,FB,0.9,,0.9,,0.9,,1
9,45125,Tite,CRU,HC,1.37,,1.37,,1.37,,1


In [None]:
# Last 3 matches

# Calculate mean and std of top X players per position
topX_means = data_26.groupby("position")["pts_avg_played__last_3"].apply(lambda x: x.nlargest(dict_num_players[x.name]).mean())
topX_stds = data_26.groupby("position")["pts_avg_played__last_3"].apply(lambda x: x.nlargest(dict_num_players[x.name]).std())

data_26["position__pts_avg_played__last_3"] = data_26["position"].map(topX_means)
data_26["position__pts_std_played__last_3"] = data_26["position"].map(topX_stds)

# Calculate z-score
data_26["pts_avg_played__vs_top_position__last_3"] = data_26["pts_avg_played__last_3"] - data_26["position__pts_avg_played__last_3"]
data_26["pts_z_played__vs_top_position__last_3"] = (data_26["pts_avg_played__last_3"] - data_26["position__pts_avg_played__last_3"]) / data_26["position__pts_std_played__last_3"]



# Calculate mean and std of top 200 players
data_26["general__pts_avg_played__last_3"] = data_26["pts_avg_played__last_3"].nlargest(200).mean()
data_26["general__pts_std_played__last_3"] = data_26["pts_avg_played__last_3"].nlargest(200).std()

# Calculate z-score
data_26["pts_avg_played__vs_top_general__last_3"] = data_26["pts_avg_played__last_3"] - data_26["general__pts_avg_played__last_3"]
data_26["pts_z_played__vs_top_general__last_3"] = (data_26["pts_avg_played__last_3"] - data_26["general__pts_avg_played__last_3"]) / data_26["general__pts_std_played__last_3"]

In [None]:
# Last 5 matches

# Calculate mean and std of top X players per position
topX_means = data_26.groupby("position")["pts_avg_played__last_5"].apply(lambda x: x.nlargest(dict_num_players[x.name]).mean())
topX_stds = data_26.groupby("position")["pts_avg_played__last_5"].apply(lambda x: x.nlargest(dict_num_players[x.name]).std())

data_26["position__pts_avg_played__last_5"] = data_26["position"].map(topX_means)
data_26["position__pts_std_played__last_5"] = data_26["position"].map(topX_stds)

# Calculate z-score
data_26["pts_avg_played__vs_top_position__last_5"] = data_26["pts_avg_played__last_5"] - data_26["position__pts_avg_played__last_5"]
data_26["pts_z_played__vs_top_position__last_5"] = (data_26["pts_avg_played__last_5"] - data_26["position__pts_avg_played__last_5"]) / data_26["position__pts_std_played__last_5"]



# Calculate mean and std of top 200 players
data_26["general__pts_avg_played__last_5"] = data_26["pts_avg_played__last_5"].nlargest(200).mean()
data_26["general__pts_std_played__last_5"] = data_26["pts_avg_played__last_5"].nlargest(200).std()

# Calculate z-score
data_26["pts_avg_played__vs_top_general__last_5"] = data_26["pts_avg_played__last_5"] - data_26["general__pts_avg_played__last_5"]
data_26["pts_z_played__vs_top_general__last_5"] = (data_26["pts_avg_played__last_5"] - data_26["general__pts_avg_played__last_5"]) / data_26["general__pts_std_played__last_5"]

In [None]:
# Last 10 matches

# Calculate mean and std of top X players per position
topX_means = data_26.groupby("position")["pts_avg_played__last_10"].apply(lambda x: x.nlargest(dict_num_players[x.name]).mean())
topX_stds = data_26.groupby("position")["pts_avg_played__last_10"].apply(lambda x: x.nlargest(dict_num_players[x.name]).std())

data_26["position__pts_avg_played__last_10"] = data_26["position"].map(topX_means)
data_26["position__pts_std_played__last_10"] = data_26["position"].map(topX_stds)

# Calculate z-score
data_26["pts_avg_played__vs_top_position__last_10"] = data_26["pts_avg_played__last_10"] - data_26["position__pts_avg_played__last_10"]
data_26["pts_z_played__vs_top_position__last_10"] = (data_26["pts_avg_played__last_10"] - data_26["position__pts_avg_played__last_10"]) / data_26["position__pts_std_played__last_10"]



# Calculate mean and std of top 200 players
data_26["general__pts_avg_played__last_10"] = data_26["pts_avg_played__last_10"].nlargest(200).mean()
data_26["general__pts_std_played__last_10"] = data_26["pts_avg_played__last_10"].nlargest(200).std()

# Calculate z-score
data_26["pts_avg_played__vs_top_general__last_10"] = data_26["pts_avg_played__last_10"] - data_26["general__pts_avg_played__last_10"]
data_26["pts_z_played__vs_top_general__last_10"] = (data_26["pts_avg_played__last_10"] - data_26["general__pts_avg_played__last_10"]) / data_26["general__pts_std_played__last_10"]
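
The three cells above differ only in the window size, so they could be folded into one parameterized helper. The sketch below is one way to do that, not the notebook's actual code: `add_window_scores`, `toy` and `toy_num_players` are hypothetical, and the per-position counts are made-up stand-ins for `dict_num_players`.

```python
import pandas as pd

def add_window_scores(df: pd.DataFrame, n: int, num_players: dict, top_general: int = 200) -> pd.DataFrame:
    """Add position and general baselines, differences and z-scores for a last-n window."""
    col = f"pts_avg_played__last_{n}"
    out = df.copy()

    # Baseline per position: mean and std of the top X players in that position
    grouped = out.groupby("position")[col]
    pos_mean = grouped.apply(lambda s: s.nlargest(num_players[s.name]).mean())
    pos_std = grouped.apply(lambda s: s.nlargest(num_players[s.name]).std())
    out[f"position__pts_avg_played__last_{n}"] = out["position"].map(pos_mean)
    out[f"position__pts_std_played__last_{n}"] = out["position"].map(pos_std)
    diff_pos = out[col] - out[f"position__pts_avg_played__last_{n}"]
    out[f"pts_avg_played__vs_top_position__last_{n}"] = diff_pos
    out[f"pts_z_played__vs_top_position__last_{n}"] = diff_pos / out[f"position__pts_std_played__last_{n}"]

    # General baseline: mean and std of the top players overall
    top = out[col].nlargest(top_general)
    out[f"general__pts_avg_played__last_{n}"] = top.mean()
    out[f"general__pts_std_played__last_{n}"] = top.std()
    diff_gen = out[col] - out[f"general__pts_avg_played__last_{n}"]
    out[f"pts_avg_played__vs_top_general__last_{n}"] = diff_gen
    out[f"pts_z_played__vs_top_general__last_{n}"] = diff_gen / out[f"general__pts_std_played__last_{n}"]
    return out

# Toy data with made-up per-position counts
toy = pd.DataFrame(
    {
        "position": ["GK", "GK", "GK", "AT", "AT"],
        "pts_avg_played__last_3": [6.0, 4.0, 2.0, 5.0, 3.0],
    }
)
toy_num_players = {"GK": 2, "AT": 2}
scored = add_window_scores(toy, 3, toy_num_players)
print(scored.filter(like="vs_top").round(3))
```

With a helper like this, the three windows would reduce to a loop over `(3, 5, 10)` that reassigns `data_26` on each pass.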

In [None]:
# Last 3 matches

# Rank players per position by z-score.
# Nullable Int64 keeps players with a NaN z-score unranked instead of raising on the cast.
data_26["adp_position__last_3"] = data_26.groupby("position")["pts_z_played__vs_top_position__last_3"].rank(ascending=False, method="min").astype("Int64")

# Rank all players by the general z-score.
# data_26 has no "dvs_general_adj" column, so the last-3 general z-score is used here.
data_26["adp_general__last_3"] = data_26["pts_z_played__vs_top_general__last_3"].rank(ascending=False, method="min").astype("Int64")
data_26.sort_values(by="adp_general__last_3", ascending=True).head(20)

In [None]:
# Average points over the last 3 matches (played only)
# Average points over the last 5 matches (played only)
# Average points over the last 10 matches (played only)

# Standard deviation of points over the last 3 matches (played only)
# Standard deviation of points over the last 5 matches (played only)
# Standard deviation of points over the last 10 matches (played only)

# Availability over the last 5 matches

# Position DVS considering only the last 3 matches
# Position DVS considering only the last 5 matches
# Position DVS considering only the last 10 matches

# General DVS considering only the last 3 matches
# General DVS considering only the last 5 matches
# General DVS considering only the last 10 matches

# Position ADP considering only the last 3 matches
# Position ADP considering only the last 5 matches
# Position ADP considering only the last 10 matches

# General ADP considering only the last 3 matches
# General ADP considering only the last 5 matches
# General ADP considering only the last 10 matches