#### In this notebook, I fit logistic regression models on each team's ratings by role.

Ratings come from the players_stats.csv file, which records each player's rating for each match.

<u> Previously... </u>

As an experiment, I previously did a team rating analysis: I computed a team rating by adding up the five players' individual ratings.
The predictors/features were "team A rating" and "team B rating", and the target was "team A win" (True if team A won, False otherwise).
LinearRegression did quite well on both the training and test sets, but that is expected because there is data leakage in some sense: if a team's players perform well, they get higher ratings and the team is more likely to win.

To prevent this kind of data leakage, I decided to compute team ratings from the earlier tournaments of the year.
Goal: predict game results in the "Valorant Champions 2023" tournament using teams' ratings from earlier tournaments.
There were 746 games in total (train set) before "Valorant Champions 2023" and 84 games in total (test set) in "Valorant Champions 2023".
I did the following steps.
- Step 1:  Compute each team's rating by averaging the team's previous ratings by map.
- Step 2:  Train a LogisticRegression model on the earlier tournaments (746 games).
- Step 3:  Predict the results of the 84 games in "Valorant Champions 2023" using the team ratings computed in step 1.
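
As a rough illustration of step 1, here is a minimal sketch of averaging a team's earlier ratings by map. The dataframe, its column names, and all numbers are made up for this example; the real analysis lives in "vct_2023_team_rating.ipynb" and may differ in detail.

```python
import pandas as pd

# Toy per-game team ratings; every name and number here is hypothetical.
games = pd.DataFrame({
    "tournament": ["League", "League", "League", "Champions"],
    "team": ["Fruits", "Fruits", "Veggies", "Fruits"],
    "map": ["Bind", "Bind", "Bind", "Bind"],
    "team_rating": [5.0, 6.0, 4.5, 7.0],
})

# Use only the earlier tournaments, so the rating earned in the game being
# predicted never feeds into its own feature.
earlier = games[games["tournament"] != "Champions"]

# Step 1: average each team's previous ratings by map.
prior_rating = earlier.groupby(["team", "map"])["team_rating"].mean()
print(prior_rating)
```

The Champions row is excluded before averaging, which is the point of the split: the feature for a Champions game only uses information that was available beforehand.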

Unfortunately, the accuracy score and confusion matrix don't look great in this experiment.  You can find this analysis in "vct_2023_team_rating.ipynb".


<u> In this notebook... </u>

Instead of simply summing the five players' ratings, I will compute each team's duelist, controller, initiator, and sentinel ratings.

For example, let's say team "Fruits" had a game and in this game,

- "apple" and "pineapple" played a duelist and their ratings were 1.5 and 1.22, respectively,
- "orange" played a controller and its rating was 1.7,
- "grapefruit" played an initiator and its rating was 1.5 and
- "banana" played a sentinel and its rating was 1.65.

Then team "Fruits"'s <u>*role ratings*</u> data for this game is:
- duelist rating: 2.72 (which is the sum of 1.5 and 1.22)
- controller rating: 1.7
- initiator rating: 1.5
- sentinel rating: 1.65
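
The "Fruits" example above can be reproduced with a small groupby. This is only a sketch on a hand-made dataframe (the `player`, `role`, and `rating` column names are assumptions for this example), not the processing used on the real files later in the notebook.

```python
import pandas as pd

# One row per player for the single hypothetical game of team "Fruits".
game = pd.DataFrame({
    "player": ["apple", "pineapple", "orange", "grapefruit", "banana"],
    "role": ["duelist", "duelist", "controller", "initiator", "sentinel"],
    "rating": [1.5, 1.22, 1.7, 1.5, 1.65],
})

# Sum the ratings within each role; the two duelists add up to one duelist rating.
role_ratings = game.groupby("role")["rating"].sum()
print(role_ratings.round(2))
```

Note that a role nobody played would simply be missing from this result. The notebook handles that case by assigning 0 to every role a player did not play before summing.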


In this notebook, I combine the players_stats, overview, and maps_scores files into a dataframe holding the processed <u>*role ratings*</u> I just explained above.

I first fit a LogisticRegression and looked at the model's coefficients.  The coefficients were all quite different, which would imply that some roles are more important than others.
Naturally, I got curious about how these coefficients vary depending on the map.  I fit a polynomial logistic regression, and it showed that the coefficients change when the map changes.
In this polynomial regression, I used each of the <u>*role ratings*</u> together with <u>*interaction terms between role ratings and maps*</u>.
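
In symbols, the model with interaction terms can be written roughly as follows. The notation is mine and is not taken from any library: $p$ is the probability that team A wins, $x_k$ are the eight role ratings (four per team), and $\mathbb{1}[\text{map}=m]$ is 1 when the game is played on map $m$ and 0 otherwise.

$$
\log\frac{p}{1-p} \;=\; \beta_0 \;+\; \sum_{k=1}^{8} \beta_k\, x_k \;+\; \sum_{m}\sum_{k=1}^{8} \gamma_{m,k}\; \mathbb{1}[\text{map}=m]\; x_k
$$

Under this form, the effective coefficient of role rating $x_k$ on map $m$ is $\beta_k + \gamma_{m,k}$, so a nonzero $\gamma_{m,k}$ is what lets the weight of a role differ from map to map.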


In [1]:
import pandas as pd
import numpy as np
from IPython.display import HTML, display

In [49]:
def side_by_side(*dfs):
    html = '<div style="display:flex">'
    for df in dfs:
        html += '<div style="margin-right: 2em">'
        html += df.to_html()
        html += '</div>'
    html += '</div>'
    display(HTML(html))

In [2]:
roles = {"duelist": {"jett", "phoenix", "reyna", "raze", "yoru", "neon", "iso"},
             "initiator": {"sova", "breach", "skye", "kayo", "fade", "gekko"},
             "controller": {"brimstone", "omen", "viper", "astra", "harbor"},
             "sentinel": {"cypher", "sage", "killjoy", "chamber", "deadlock"}}
# Union of every role's agents
allagents = set().union(*roles.values())

duelists = roles["duelist"]
initiators = roles["initiator"]
controllers = roles["controller"]
sentinels = roles["sentinel"]

In [3]:
players_stats = pd.read_csv("../../vct-erdos-project/data/vct_2023/players_stats/players_stats.csv")
overview = pd.read_csv("../../vct-erdos-project/data/vct_2023/matches/overview.csv")
maps_scores = pd.read_csv("../../vct-erdos-project/data/vct_2023/matches/maps_scores.csv")

In [4]:
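# Work on copies so the later assignments do not trigger SettingWithCopyWarning.
players_stats = players_stats[['Tournament', 'Stage', 'Match Type', 'Player', 'Team', 'Agents', 'Rating']].copy()

# Strip spaces so agent names match the names in `roles`.
players_stats["Agents"] = players_stats["Agents"].str.replace(" ", "", regex=False)

# Keep only rows whose agent is in `allagents`.
players_stats = players_stats[players_stats.Agents.isin(allagents)].copy()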
players_stats

Unnamed: 0,Tournament,Stage,Match Type,Player,Team,Agents,Rating
0,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Round 1,Melser,KRÜ Esports,brimstone,1.14
1,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Round 1,Melser,KRÜ Esports,omen,1.12
3,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Round 1,DaveeyS,KRÜ Esports,killjoy,1.29
4,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Round 1,keznit,KRÜ Esports,raze,1.24
5,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Round 1,Klaus,KRÜ Esports,skye,1.25
...,...,...,...,...,...,...,...
10509,Champions Tour 2023: Champions China Qualifier,All Stages,All Match Types,Biank,Bilibili Gaming,harbor,
10510,Champions Tour 2023: Champions China Qualifier,All Stages,All Match Types,Biank,Bilibili Gaming,skye,
10511,Champions Tour 2023: Champions China Qualifier,All Stages,All Match Types,Biank,Bilibili Gaming,sova,
10513,Champions Tour 2023: Champions China Qualifier,All Stages,All Match Types,whzy,Bilibili Gaming,jett,


In [5]:
players_stats["Rating"].isna().sum()

449

In [6]:
# Fill a missing rating with the mean for the same player, team, and agent.
players_stats["Rating"] = (
    players_stats.groupby(["Player", "Team", "Agents"])["Rating"]
    .transform(lambda x: x.fillna(x.mean()))
)

players_stats.Rating.isna().sum()

111

In [7]:
# Still missing: fall back to the mean for the same player and team.
players_stats["Rating"] = (
    players_stats.groupby(["Player", "Team"])["Rating"]
    .transform(lambda x: x.fillna(x.mean()))
)

players_stats.Rating.isna().sum()

34

In [8]:
# Still missing: fall back to the team mean.
players_stats["Rating"] = (
    players_stats.groupby(["Team"])["Rating"]
    .transform(lambda x: x.fillna(x.mean()))
)

players_stats.Rating.isna().sum()

0
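
The three imputation cells above follow one pattern: fill missing ratings with a group mean, then fall back to a coarser grouping for whatever is still missing. A minimal sketch of that pattern as a single helper is shown below. The function name `fill_with_group_means` and the toy data are hypothetical and only meant to illustrate the idea.

```python
import pandas as pd

def fill_with_group_means(df, column, group_levels):
    """Fill NaNs in `column` with group means, trying each grouping in order."""
    filled = df[column].copy()
    for keys in group_levels:
        # Mean of the (partly filled) column within each group; NaNs are skipped,
        # and a group with no known values yields NaN and is left for the next level.
        group_mean = filled.groupby([df[k] for k in keys]).transform("mean")
        filled = filled.fillna(group_mean)
    return filled

# Toy data: p1 has one known jett rating, p2 has no known rating at all.
toy = pd.DataFrame({
    "Player": ["p1", "p1", "p2", "p3"],
    "Team": ["T", "T", "T", "T"],
    "Agents": ["jett", "jett", "sova", "omen"],
    "Rating": [1.0, None, None, 1.4],
})

toy["Rating"] = fill_with_group_means(
    toy, "Rating", [["Player", "Team", "Agents"], ["Player", "Team"], ["Team"]]
)
print(toy)
```

As in the cells above, values filled at a finer level take part in the means computed at the coarser levels, so the order of the groupings matters.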

In [9]:
players_stats.sort_values(by=["Rating"], ascending=False).head(10)

Unnamed: 0,Tournament,Stage,Match Type,Player,Team,Agents,Rating
3264,Champions Tour 2023: EMEA League,Regular Season,Week 2,Jamppi,Team Liquid,skye,2.29
3180,Champions Tour 2023: EMEA League,Regular Season,Week 1,Sayf,Team Liquid,raze,2.25
7707,Champions Tour 2023: Pacific League,League Play,Week 3,xffero,Rex Regum Qeon,sova,2.1
531,Champions Tour 2023: Pacific Last Chance Quali...,Main Event,Lower Final,invy,Team Secret,kayo,2.08
3314,Champions Tour 2023: EMEA League,Regular Season,Week 3,Shao,Natus Vincere,fade,2.04
9355,Champions Tour 2023: Champions China Qualifier,Preliminary Stage,Round 1,Septem7,Shenzhen NTER,killjoy,1.98
6704,Champions Tour 2023: Americas League,Regular Season,Week 7,Victor,NRG Esports,raze,1.97
141,Champions Tour 2023: Americas Last Chance Qual...,Main Event,Upper Final,NagZ,KRÜ Esports,viper,1.96
6126,Champions Tour 2023: Americas League,Regular Season,Week 2,s0m,NRG Esports,viper,1.96
7238,Champions Tour 2023: Pacific League,Playoffs,Upper Semifinals,BuZz,DRX,raze,1.95


In [10]:
ind1 = overview["Side"] == "both"
ind2 = overview["Map"] != "All Maps"
ind = ind1 & ind2

overview = overview[ind]
overview = overview[['Tournament', 'Stage', 'Match Type', 'Match Name', 'Map', 'Player',
       'Team', 'Agents']]
overview

Unnamed: 0,Tournament,Stage,Match Type,Match Name,Map,Player,Team,Agents
0,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,nAts,Team Liquid,viper
3,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Sayf,Team Liquid,breach
6,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,soulcas,Team Liquid,astra
9,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Jamppi,Team Liquid,neon
12,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Redgar,Team Liquid,sova
...,...,...,...,...,...,...,...,...
34929,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Derke,FNATIC,jett
34932,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Boaster,FNATIC,viper
34935,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Alfajer,FNATIC,killjoy
34938,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Leo,FNATIC,sova


In [11]:
rating_df = overview.merge(players_stats, on=['Tournament', 'Stage', 'Match Type', 'Player', "Agents", "Team"], how="left")

rating_df["Rating"] = rating_df.groupby(['Tournament', 'Stage', "Player"])["Rating"]\
                        .transform(lambda x: x.fillna(x.mean()))

rating_df.isna().sum()

Tournament    0
Stage         0
Match Type    0
Match Name    0
Map           0
Player        0
Team          0
Agents        0
Rating        3
dtype: int64

In [12]:
rating_df["Rating"] = (
    rating_df.groupby(['Tournament', 'Stage', 'Match Type', "Team"])["Rating"]
    .transform(lambda x: x.fillna(x.mean()))
)

rating_df.isna().sum()

Tournament    0
Stage         0
Match Type    0
Match Name    0
Map           0
Player        0
Team          0
Agents        0
Rating        0
dtype: int64

In [13]:
rating_df

Unnamed: 0,Tournament,Stage,Match Type,Match Name,Map,Player,Team,Agents,Rating
0,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,nAts,Team Liquid,viper,1.260
1,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Sayf,Team Liquid,breach,0.960
2,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,soulcas,Team Liquid,astra,0.950
3,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Jamppi,Team Liquid,neon,0.890
4,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Redgar,Team Liquid,sova,0.690
...,...,...,...,...,...,...,...,...,...
8295,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Derke,FNATIC,jett,1.100
8296,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Boaster,FNATIC,viper,0.950
8297,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Alfajer,FNATIC,killjoy,0.960
8298,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Leo,FNATIC,sova,1.195


In [14]:
# A player's rating counts toward the role of the agent played and is 0 for every other role.
duelist_r = rating_df["Rating"].where(rating_df["Agents"].isin(duelists), 0)
controller_r = rating_df["Rating"].where(rating_df["Agents"].isin(controllers), 0)
initiator_r = rating_df["Rating"].where(rating_df["Agents"].isin(initiators), 0)
sentinel_r = rating_df["Rating"].where(rating_df["Agents"].isin(sentinels), 0)

In [15]:
role_rating = {
    "duelist_r": duelist_r,
    "controller_r": controller_r,
    "initiator_r": initiator_r,
    "sentinel_r": sentinel_r,
}

for role, rating_list in role_rating.items():
    rating_df[role] = rating_list

rating_df

Unnamed: 0,Tournament,Stage,Match Type,Match Name,Map,Player,Team,Agents,Rating,duelist_r,controller_r,initiator_r,sentinel_r
0,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,nAts,Team Liquid,viper,1.260,0.00,1.26,0.000,0.00
1,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Sayf,Team Liquid,breach,0.960,0.00,0.00,0.960,0.00
2,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,soulcas,Team Liquid,astra,0.950,0.00,0.95,0.000,0.00
3,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Jamppi,Team Liquid,neon,0.890,0.89,0.00,0.000,0.00
4,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Redgar,Team Liquid,sova,0.690,0.00,0.00,0.690,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...,...
8295,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Derke,FNATIC,jett,1.100,1.10,0.00,0.000,0.00
8296,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Boaster,FNATIC,viper,0.950,0.00,0.95,0.000,0.00
8297,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Alfajer,FNATIC,killjoy,0.960,0.00,0.00,0.000,0.96
8298,Champions Tour 2023: Lock-In Sao Paulo,Playoffs,Grand Final,LOUD vs FNATIC,Icebox,Leo,FNATIC,sova,1.195,0.00,0.00,1.195,0.00


In [16]:
rating_df.isna().sum()

Tournament      0
Stage           0
Match Type      0
Match Name      0
Map             0
Player          0
Team            0
Agents          0
Rating          0
duelist_r       0
controller_r    0
initiator_r     0
sentinel_r      0
dtype: int64

In [17]:
rating_df_by_roles_temp =\
    rating_df.groupby(['Tournament', 'Stage', 'Match Type', 'Match Name', 'Map', 'Team'])\
        [["Rating", "duelist_r", "controller_r", "initiator_r", "sentinel_r"]]\
        .agg("sum").reset_index()

In [18]:
maps_scores["Team_A_win"] = maps_scores["Team A Score"] > maps_scores["Team B Score"]
maps_scores["Team_B_win"] = maps_scores["Team B Score"] > maps_scores["Team A Score"]

maps_scores.head()

Unnamed: 0,Tournament,Stage,Match Type,Match Name,Map,Team A,Team A Score,Team A Attacker Score,Team A Defender Score,Team A Overtime Score,Team B,Team B Score,Team B Attacker Score,Team B Defender Score,Team B Overtime Score,Duration,Team_A_win,Team_B_win
0,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Fracture,Team Liquid,11,6,5,,Natus Vincere,13,7,6,,1:18:55,False,True
1,Valorant Champions 2023,Group Stage,Opening (D),Team Liquid vs Natus Vincere,Bind,Team Liquid,15,7,5,3.0,Natus Vincere,17,7,5,5.0,1:22:57,False,True
2,Valorant Champions 2023,Group Stage,Opening (D),DRX vs LOUD,Lotus,DRX,13,7,5,1.0,LOUD,15,7,5,3.0,1:17:19,False,True
3,Valorant Champions 2023,Group Stage,Opening (D),DRX vs LOUD,Split,DRX,13,8,5,,LOUD,6,2,4,,47:47,True,False
4,Valorant Champions 2023,Group Stage,Opening (D),DRX vs LOUD,Ascent,DRX,13,8,5,,LOUD,8,4,4,,,True,False


In [19]:
maps_scores = maps_scores[["Tournament", "Stage", "Match Type", "Match Name", "Map", "Team A", "Team A Score", "Team_A_win", "Team B", "Team B Score", "Team_B_win"]]

In [20]:
rating_df_by_roles_temp.keys()

Index(['Tournament', 'Stage', 'Match Type', 'Match Name', 'Map', 'Team',
       'Rating', 'duelist_r', 'controller_r', 'initiator_r', 'sentinel_r'],
      dtype='object')

In [21]:
match_keys = ['Tournament', 'Stage', 'Match Type', 'Match Name', 'Map']

def team_role_ratings(side):
    """Role ratings renamed for one side ("A" or "B") and indexed for joining."""
    renamed = rating_df_by_roles_temp.rename(columns={
        "Team": f"Team {side}",
        "Rating": f"Team_{side}_rating",
        "duelist_r": f"{side}_duel_r",
        "controller_r": f"{side}_cont_r",
        "initiator_r": f"{side}_init_r",
        "sentinel_r": f"{side}_sent_r",
    })
    return renamed.set_index(match_keys + [f"Team {side}"])

# Attach team A's role ratings, then team B's, to each map result.
rating_df_by_roles_2023 = (
    maps_scores.set_index(match_keys + ["Team A"])
    .join(team_role_ratings("A"))
    .reset_index()
    .set_index(match_keys + ["Team B"])
    .join(team_role_ratings("B"))
    .reset_index()
)


In [22]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

In [23]:
features = ["A_duel_r", "A_cont_r", "A_init_r", "A_sent_r",
            "B_duel_r", "B_cont_r", "B_init_r", "B_sent_r"]
X = rating_df_by_roles_2023[features]
# Keep y one-dimensional; a column vector makes scikit-learn emit a DataConversionWarning.
y = rating_df_by_roles_2023["Team_A_win"].to_numpy()

In [24]:
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, stratify=y, random_state=1)

In [25]:
# sanity check
assert len(X_train) == len(y_train), "X and y sizes are different"
assert len(X_test) == len(y_test), "X and y sizes are different"

In [26]:
lr = LogisticRegression()

In [27]:
lr.fit(X_train, y_train)
pred = lr.predict(X_test)
acc = accuracy_score(y_test, pred)
print("mean of true value:", np.mean(y_test))
print("mean of prediction:", np.mean(pred))
print("Accuracy:", acc)

mean of true value: 0.5096153846153846
mean of prediction: 0.5528846153846154
Accuracy: 0.8798076923076923




In [28]:
lr.coef_

array([[ 1.69396458,  1.54651472,  1.33528804,  1.90713237, -1.67593836,
        -1.90207293, -1.65382624, -2.07438117]])

In [29]:
print(np.mean(lr.coef_[0][:4]))
print(np.mean(lr.coef_[0][4:]))

1.6207249253093339
-1.8265546761587297


In [30]:
# Let's test by map

features = ["A_duel_r", "A_cont_r", "A_init_r", "A_sent_r",
            "B_duel_r", "B_cont_r", "B_init_r", "B_sent_r"]

maps = rating_df_by_roles_2023["Map"].unique()
analysis_df = pd.DataFrame()
analysis_df["map"] = maps
analysis_dictionary = {
    "played_time": [],
    "mean_y_train": [],
    "mean_y_test": [],
    "mean_train_pred": [],
    "mean_test_pred": [],
    "train_acc": [],
    "test_acc": [],
    "coefficients": [],
}

for map_name in maps:  # `map_name` avoids shadowing the built-in `map`
    # Games played on this map only
    map_df = rating_df_by_roles_2023[rating_df_by_roles_2023["Map"] == map_name]
    X = map_df[features]
    y = map_df["Team_A_win"]

    X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, stratify=y, random_state=1)

    # Fit a separate model per map
    logr = LogisticRegression()
    logr.fit(X_train, y_train)
    train_pred = logr.predict(X_train)
    pred = logr.predict(X_test)

    analysis_dictionary["played_time"].append(len(map_df))
    analysis_dictionary["mean_y_train"].append(np.mean(y_train))
    analysis_dictionary["mean_y_test"].append(np.mean(y_test))
    analysis_dictionary["mean_train_pred"].append(np.mean(train_pred))
    analysis_dictionary["mean_test_pred"].append(np.mean(pred))
    analysis_dictionary["train_acc"].append(accuracy_score(y_train, train_pred))
    analysis_dictionary["test_acc"].append(accuracy_score(y_test, pred))
    analysis_dictionary["coefficients"].append(np.round(logr.coef_[0],2))

for feature_name, lst in analysis_dictionary.items():
    analysis_df[feature_name] = lst

analysis_df

Unnamed: 0,map,played_time,mean_y_train,mean_y_test,mean_train_pred,mean_test_pred,train_acc,test_acc,coefficients
0,Fracture,93,0.536232,0.541667,0.57971,0.625,0.898551,0.833333,"[1.37, 1.52, 0.77, 0.99, -0.83, -1.3, -0.85, -..."
1,Bind,81,0.433333,0.428571,0.416667,0.333333,0.916667,0.619048,"[0.77, 0.96, 0.84, 1.5, -0.34, -2.1, -0.5, -0.26]"
2,Lotus,134,0.53,0.529412,0.49,0.441176,0.86,0.794118,"[1.17, 0.56, 0.69, 1.57, -0.69, -1.9, -0.63, -..."
3,Split,129,0.53125,0.545455,0.572917,0.545455,0.895833,0.939394,"[1.16, 0.49, 0.93, 0.64, -1.67, -1.75, -1.58, ..."
4,Ascent,131,0.540816,0.545455,0.561224,0.545455,0.816327,0.818182,"[0.46, 1.75, 1.97, 0.73, -0.91, -0.51, -0.75, ..."
5,Pearl,108,0.555556,0.555556,0.580247,0.666667,0.82716,0.740741,"[1.04, 1.24, 1.06, 1.42, -0.95, -1.14, -1.01, ..."
6,Haven,119,0.449438,0.433333,0.41573,0.433333,0.853933,0.8,"[0.6, 1.31, 1.9, 0.39, -1.65, -0.73, -1.26, -1..."
7,Icebox,35,0.423077,0.444444,0.423077,0.444444,1.0,1.0,"[0.42, 1.26, -0.04, 0.61, -1.42, -0.21, -0.83,..."


In [31]:
coef_list = ["A_duel_coef", "A_cont_coef", "A_init_coef", "A_sent_coef",
            "B_duel_coef", "B_cont_coef", "B_init_coef", "B_sent_coef"]

In [32]:
for i in range(len(coef_list)):
    analysis_df[coef_list[i]] = analysis_df.coefficients.apply(lambda l: l[i])

analysis_df[["map"]+coef_list]

Unnamed: 0,map,A_duel_coef,A_cont_coef,A_init_coef,A_sent_coef,B_duel_coef,B_cont_coef,B_init_coef,B_sent_coef
0,Fracture,1.37,1.52,0.77,0.99,-0.83,-1.3,-0.85,-1.53
1,Bind,0.77,0.96,0.84,1.5,-0.34,-2.1,-0.5,-0.26
2,Lotus,1.17,0.56,0.69,1.57,-0.69,-1.9,-0.63,-1.43
3,Split,1.16,0.49,0.93,0.64,-1.67,-1.75,-1.58,-1.64
4,Ascent,0.46,1.75,1.97,0.73,-0.91,-0.51,-0.75,-1.3
5,Pearl,1.04,1.24,1.06,1.42,-0.95,-1.14,-1.01,-1.04
6,Haven,0.6,1.31,1.9,0.39,-1.65,-0.73,-1.26,-1.08
7,Icebox,0.42,1.26,-0.04,0.61,-1.42,-0.21,-0.83,-0.9


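`We can see here that the coefficients vary depending on the map.`  The per-map samples are small, though (Icebox has only 35 games), so these differences should be read with caution.  features_2 will be the role ratings plus map information.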

In [33]:
features_2 = ["Map"] + features

In [34]:
W = rating_df_by_roles_2023[features_2]
z = rating_df_by_roles_2023["Team_A_win"]

In [35]:
W = pd.get_dummies(W, columns=["Map"], dtype=int)

In [36]:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline


In [37]:
W.keys()

Index(['A_duel_r', 'A_cont_r', 'A_init_r', 'A_sent_r', 'B_duel_r', 'B_cont_r',
       'B_init_r', 'B_sent_r', 'Map_Ascent', 'Map_Bind', 'Map_Fracture',
       'Map_Haven', 'Map_Icebox', 'Map_Lotus', 'Map_Pearl', 'Map_Split'],
      dtype='object')

In [38]:
rating_features = ['A_duel_r', 'A_cont_r', 'A_init_r', 'A_sent_r', 'B_duel_r', 'B_cont_r',
       'B_init_r', 'B_sent_r']
map_features = ['Map_Ascent', 'Map_Bind', 'Map_Fracture',
       'Map_Haven', 'Map_Icebox', 'Map_Lotus', 'Map_Pearl', 'Map_Split']


In [39]:
interaction_features = []  # Use this list to select interaction terms and drop "Map_Ascent", "Map_Bind", ...
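for map_col in map_features:  # `map_col` avoids shadowing the built-in `map`
    for rating in rating_features:
        # Interaction term: the role rating counts only when the game is on this map
        W[f"{map_col}_{rating}"] = W[map_col]*W[rating]
        interaction_features.append(f"{map_col}_{rating}")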


In [40]:
W_interaction = W[rating_features + interaction_features]

In [41]:
W_train, W_test, z_train, z_test = train_test_split(W_interaction, z, shuffle=True, stratify=z, random_state=1)

In [42]:
poly_logr = LogisticRegression()
poly_logr.fit(W_train, z_train)
poly_train_pred = poly_logr.predict(W_train)
poly_test_pred = poly_logr.predict(W_test)
poly_train_acc = accuracy_score(z_train, poly_train_pred)
poly_test_acc = accuracy_score(z_test, poly_test_pred)

print("Train set true mean:", np.mean(z_train))
print("Train prediction mean:", np.mean(poly_train_pred))
print("Test set true mean:", np.mean(z_test))
print("Test prediction mean:", np.mean(poly_test_pred))
print("Train accuracy:", poly_train_acc)
print("Test accuracy:", poly_test_acc)

Train set true mean: 0.5112540192926045
Train prediction mean: 0.5209003215434084
Test set true mean: 0.5096153846153846
Test prediction mean: 0.5336538461538461
Train accuracy: 0.842443729903537
Test accuracy: 0.8605769230769231


In [46]:
poly_logr_coef = pd.DataFrame()
poly_logr_coef["feature_name"] = W_interaction.keys()
poly_logr_coef["coef"] = poly_logr.coef_[0]


In [55]:
side_by_side(poly_logr_coef[:8], poly_logr_coef[8:24], poly_logr_coef[24:40],  poly_logr_coef[40:56], poly_logr_coef[56:])

Unnamed: 0,feature_name,coef
0,A_duel_r,1.387638
1,A_cont_r,1.571408
2,A_init_r,1.236617
3,A_sent_r,1.87197
4,B_duel_r,-1.602893
5,B_cont_r,-1.660232
6,B_init_r,-1.611142
7,B_sent_r,-1.987616

Unnamed: 0,feature_name,coef
8,Map_Ascent_A_duel_r,-0.091224
9,Map_Ascent_A_cont_r,0.30625
10,Map_Ascent_A_init_r,0.365488
11,Map_Ascent_A_sent_r,-0.497668
12,Map_Ascent_B_duel_r,0.385476
13,Map_Ascent_B_cont_r,-0.360618
14,Map_Ascent_B_init_r,0.041009
15,Map_Ascent_B_sent_r,-0.462264
16,Map_Bind_A_duel_r,-0.043347
17,Map_Bind_A_cont_r,-0.181862

Unnamed: 0,feature_name,coef
24,Map_Fracture_A_duel_r,0.65221
25,Map_Fracture_A_cont_r,0.658954
26,Map_Fracture_A_init_r,0.070718
27,Map_Fracture_A_sent_r,0.392985
28,Map_Fracture_B_duel_r,0.223034
29,Map_Fracture_B_cont_r,-0.744814
30,Map_Fracture_B_init_r,-0.539053
31,Map_Fracture_B_sent_r,-0.783882
32,Map_Haven_A_duel_r,-0.090979
33,Map_Haven_A_cont_r,0.647001

Unnamed: 0,feature_name,coef
40,Map_Icebox_A_duel_r,-0.303627
41,Map_Icebox_A_cont_r,0.59512
42,Map_Icebox_A_init_r,-0.238494
43,Map_Icebox_A_sent_r,0.034512
44,Map_Icebox_B_duel_r,-0.976057
45,Map_Icebox_B_cont_r,0.161332
46,Map_Icebox_B_init_r,-0.234251
47,Map_Icebox_B_sent_r,-0.223026
48,Map_Lotus_A_duel_r,0.554312
49,Map_Lotus_A_cont_r,-0.539022

Unnamed: 0,feature_name,coef
56,Map_Pearl_A_duel_r,-0.155322
57,Map_Pearl_A_cont_r,0.000579
58,Map_Pearl_A_init_r,-0.396824
59,Map_Pearl_A_sent_r,0.161977
60,Map_Pearl_B_duel_r,0.090459
61,Map_Pearl_B_cont_r,-0.092376
62,Map_Pearl_B_init_r,0.32768
63,Map_Pearl_B_sent_r,0.195696
64,Map_Split_A_duel_r,0.865616
65,Map_Split_A_cont_r,0.08439
