## 1) Inspiration

- I have enjoyed watching league of legend championships since Middle school and I recently started actually playing the game. I thought it would be fun to look into different components of league of legend and see how they affect the outcome. 
- The dataset contains in-game statistics, making it ideal for building a predictive model.
- The insights gained from this model can help players, analysts, and esports teams make better decisions regarding team compositions and gameplay strategies.

## 2) Stakeholders


- Esports Coaches & Analysts – They can use the insights from this model to optimize game strategies. By understanding which in-game factors (e.g., gold difference, objective control, early kills) contribute most to winning, analysts can focus on the most impactful areas during training sessions.
- League of Legends Players – Competitive and ranked players can benefit from knowing which key statistics influence match outcomes. This information can help them prioritize objectives such as securing dragons or maintaining a gold lead.
- Game Developers – The insights from this project could be valuable for game balancing. If the model finds that certain features (e.g., early kills or dragons) have an overwhelming influence on victory, Riot Games could adjust gameplay mechanics to ensure more balanced matches.

## 3) Task and Metrics

This is a classification problem. Therefore, I will evaluate my models using multiple performance metrics:

- Accuracy Score: Measures overall correctness but can be misleading in imbalanced datasets.
- Precision and Recall: Helps evaluate how well the model predicts wins vs. losses.
- ROC-AUC Score: Evaluates the model’s ability to distinguish between wins and losses across different probability thresholds.
- Confusion Matrix: Provides insight into false positives and false negatives, which will help fine-tune model thresholds.

## 4) Data

- Data Source: The dataset is from Kaggle’s League of Legends dataset (https://www.kaggle.com/datasets/bobbyscience/league-of-legends-diamond-ranked-games-10-min). 
- League of Legends is a multiplayer online battle by Riot Games. 
- The dataset contains data on the first 10 minutes of solo queues from a high ELO (Diamond & Masters). 
- The dataset was created by one person (Yi Lan Ma) and the expected update frequency is monthly. 

How many observations and variables does the dataset have?

- There are 9879 observations and 40 variables

What does each observation and variable represent?

- Each observation represents a single game match snapshot at the 10-minute mark, with statistics for both the blue and red teams.
- Variables
    - `gameId`: Unique identifier for each match.
    - `blueWins`: Binary variable (1 if the blue team wins, 0 if the red team wins).
    - `blueWardsPlaced`: Number of wards placed by the blue team in the first 10 minutes.
    - `blueWardsDestroyed`: Number of enemy wards destroyed by the blue team.
    - `redWardsPlaced`: Number of wards placed by the red team.
    - `redWardsDestroyed`: Number of enemy wards destroyed by the red team.
    - `blueFirstBlood`: 1 if the blue team got first blood (first kill of the game), 0 otherwise.
    - `blueKills`: Number of kills secured by the blue team.
    - `blueDeaths`: Number of times the blue team died.
    - `blueAssists`: Number of assists made by the blue team.
    - `blueEliteMonsters`: Number of elite monsters (Dragon or Rift Herald) killed by the blue team.
    - `blueDragons`: Number of Dragons killed by the blue team.
    - `blueHeralds`: Number of Rift Heralds killed by the blue team.
    - `redKills`: Number of kills secured by the red team.
    - `redDeaths`: Number of times the red team died.
    - `redAssists`: Number of assists made by the red team.
    - `redEliteMonsters`: Number of elite monsters (Dragon or Rift Herald) killed by the red team.
    - `redDragons`: Number of Dragons killed by the red team.
    - `redHeralds`: Number of Rift Heralds killed by the red team.
    - `blueTowersDestroyed`: Number of towers destroyed by the blue team.
    - `blueTotalGold`: Total gold accumulated by the blue team.
    - `blueAvgLevel`: Average champion level of all blue team players.
    - `blueTotalExperience`: Total experience accumulated by the blue team.
    - `blueTotalMinionsKilled`: Total minions (lane creeps) killed by the blue team.
    - `blueTotalJungleMinionsKilled`: Total jungle monsters killed by the blue team.
    - `blueGoldDiff`: Gold difference between blue and red teams (positive if blue is ahead).
    - `blueExperienceDiff`: Experience difference between blue and red teams.
    - `blueCSPerMin`: Creep Score (CS) per minute for the blue team.
    - `blueGoldPerMin`: Gold per minute for the blue team.
    - `redTowersDestroyed`: Number of towers destroyed by the red team.
    - `redTotalGold`: Total gold accumulated by the red team.
    - `redAvgLevel`: Average champion level of all red team players.
    - `redTotalExperience`: Total experience accumulated by the red team.
    - `redTotalMinionsKilled`: Total minions (lane creeps) killed by the red team.
    - `redTotalJungleMinionsKilled`: Total jungle monsters killed by the red team.
    - `redGoldDiff`: Gold difference between red and blue teams (positive if red is ahead).
    - `redExperienceDiff`: Experience difference between red and blue teams.
    - `redCSPerMin`: Creep Score (CS) per minute for the red team.
    - `redGoldPerMin`: Gold per minute for the red team.

How many variables are numeric and how many variables are categorical?

- There are 40 numeric variables and 0 categorical variables. 

Which variables are the predictors and which variable is the response?

- Predictors: All variables related to the blue team except blueWins are predictors. These include game performance stats (kills, assists, gold, experience, objectives).We will discard the variables that start with "red" because we are focusing on the blue team. 
- Response Variable: blueWins (indicating whether the blue team won).

Data cleaning

- Cleaning was not necessary because the dataset has no missing values and no categorical variables.

## 5) Prediction

In [6]:
#| echo: false
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, recall_score, precision_score, confusion_matrix, roc_curve, auc, roc_auc_score, precision_recall_curve
from sklearn.linear_model import Ridge, Lasso, LogisticRegression

### Linear and Unregularized Model

The first predictor used was "blueGoldDiff" because this variable had the highest correlation with "blueWins" with a correlation of 0.511. This is because gold is the primary economic resource in League of Legends, determining item purchases and power spikes.

Training and Test Performance:



In [8]:
#| echo: false
df = pd.read_csv("high_diamond_ranked_10min.csv") 
train_df, test_df = train_test_split(df, test_size=0.2, random_state=2)
X_train = train_df[["blueGoldDiff"]]  # Only blue team variables
y_train = train_df["blueWins"]  # Response variable

X_test = test_df[["blueGoldDiff"]]
y_test = test_df["blueWins"]

scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

logreg = LogisticRegression(penalty=None)
logreg.fit(X_train_scaled, y_train)
y_pred = logreg.predict(X_test_scaled)
y_pred_proba = logreg.predict_proba(X_test_scaled)[:, 1]

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"ROC-AUC: {roc_auc:.4f}")
print("Confusion Matrix:")
print(conf_matrix)

Accuracy: 0.7298
Precision: 0.7286
Recall: 0.7331
ROC-AUC: 0.8080
Confusion Matrix:
[[717 270]
 [264 725]]


After that, each predictor was added in a stepwise manner, and the training and test performance was recorded.

In [10]:
#| echo: false
all_predictors = [col for col in df.columns if col != "blueWins"]
stepwise_results = []
selected_features = []
for predictor in all_predictors:
    if predictor not in selected_features:
        selected_features.append(predictor)

        # Slice predictors and response
        X_train = train_df[selected_features]
        y_train = train_df["blueWins"]
        X_test = test_df[selected_features]
        y_test = test_df["blueWins"]

        # Standardize the predictors
        scaler = StandardScaler()
        scaler.fit(X_train)
        X_train_scaled = scaler.transform(X_train)
        X_test_scaled = scaler.transform(X_test)

        # Train logistic regression model with no regularization
        logreg = LogisticRegression(penalty=None)
        logreg.fit(X_train_scaled, y_train)

        # Make predictions
        y_pred = logreg.predict(X_test_scaled)
        y_pred_proba = logreg.predict_proba(X_test_scaled)[:, 1]

        # Compute evaluation metrics
        accuracy = accuracy_score(y_test, y_pred)
        precision = precision_score(y_test, y_pred)
        recall = recall_score(y_test, y_pred)
        roc_auc = roc_auc_score(y_test, y_pred_proba)

        # Store results
        stepwise_results.append({
            "Added Predictor": predictor,
            "Accuracy": accuracy,
            "Precision": precision,
            "Recall": recall,
            "ROC-AUC": roc_auc
        })

# Convert results into DataFrame and display
stepwise_results_df = pd.DataFrame(stepwise_results)
stepwise_results_df

Unnamed: 0,Added Predictor,Accuracy,Precision,Recall,ROC-AUC
0,gameId,0.489879,0.489572,0.450961,0.491311
1,blueWardsPlaced,0.50253,0.506818,0.22548,0.493609
2,blueWardsDestroyed,0.485324,0.483452,0.413549,0.487393
3,blueFirstBlood,0.602227,0.603255,0.599596,0.597592
4,blueKills,0.648785,0.6551,0.629929,0.710353
5,blueDeaths,0.696356,0.697062,0.695652,0.781864
6,blueAssists,0.698381,0.697487,0.701719,0.781862
7,blueEliteMonsters,0.706478,0.708037,0.703741,0.789241
8,blueDragons,0.708502,0.708797,0.708797,0.789714
9,blueHeralds,0.70749,0.707786,0.707786,0.789701


Key Takeaways:

- Adding gold-based predictors (blueTotalGold, blueGoldDiff) increased accuracy the most.
- Kills, deaths, and assists also improved performance, reinforcing that combat success is linked to winning.
- Wards placed/destroyed did not contribute much, suggesting that vision alone does not determine a game outcome.

### Non-linear and Regularized Model

In [38]:
#| echo: false
import warnings
warnings.filterwarnings("ignore")
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import RidgeCV, LassoCV, LogisticRegressionCV


X = df.drop(['gameId', 'blueWins'], axis=1)
y = df['blueWins']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)
feature_names = poly.get_feature_names_out(X.columns)


scaler = StandardScaler()
X_train_poly_scaled = scaler.fit_transform(X_train_poly)
X_test_poly_scaled = scaler.transform(X_test_poly)

Cs = [0.001, 0.01, 0.1, 1, 10, 100]  # 6 hyperparameter values
logregcv = LogisticRegressionCV(penalty="l2", Cs=Cs, cv=10, solver='saga')
logregcv.fit(X_train_poly_scaled, y_train)

best_C = logregcv.C_[0]
print(f"Best C value: {best_C}")

best_cv_score = logregcv.scores_[1].mean(axis=0).max()
print(f"Best CV Score: {best_cv_score:.4f}")

y_pred = logregcv.predict(X_test_poly_scaled)
y_pred_proba = logregcv.predict_proba(X_test_poly_scaled)[:, 1]


Best C value: 0.001
Best CV Score: 0.7311


In [13]:
#| echo: false
# compute test performance metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)
conf_matrix = confusion_matrix(y_test, y_pred)

print("\nTest Performance of the Trained and Tuned Model:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"ROC-AUC: {roc_auc:.4f}")
print("Confusion Matrix:")
print(conf_matrix)


Test Performance of the Trained and Tuned Model:
Accuracy: 0.7379
Precision: 0.7420
Recall: 0.7300
ROC-AUC: 0.8194
Confusion Matrix:
[[736 251]
 [267 722]]


To systematically introduce non-linearities, all interaction terms were created using PolynomialFeatures (degree=2).
A Ridge Regularized Logistic Regression (L2 penalty) was applied to select the most informative non-linear terms.

- The non-linear model outperformed the linear model, particularly in ROC-AUC, showing it better distinguishes winning from losing games.
- Higher recall (0.7300) suggests fewer false negatives, meaning the model correctly predicts wins more often.

In [14]:
#| echo: false
# feature coefficients from the best model
coefficients = logregcv.coef_[0]

# use DataFrame for feature importance
feature_importance = pd.DataFrame({'Feature': feature_names, 'Coefficient': coefficients})

# sort by absolute coefficient value
feature_importance['Abs_Coefficient'] = feature_importance['Coefficient'].abs()
sorted_importance = feature_importance.sort_values('Abs_Coefficient', ascending=False)

# top 5 most important features
print("\n5 Most Important Features:")
print(sorted_importance.head(5))

# 5 least important features
print("\n5 Least Important Features:")
print(sorted_importance.tail(5))

# exactly zero coefficients
zero_coefs = sorted_importance[sorted_importance['Coefficient'] == 0]
print(f"\nFeatures with Zero Coefficients ({len(zero_coefs)}):")
if len(zero_coefs) > 0:
    print(zero_coefs)
else:
    print("No features have exactly zero coefficients.")


5 Most Important Features:
                                       Feature  Coefficient  Abs_Coefficient
363     blueTowersDestroyed redTowersDestroyed     0.029965         0.029965
595            redWardsPlaced redEliteMonsters    -0.026285         0.026285
138                  blueFirstBlood redHeralds     0.026087         0.026087
480  blueTotalJungleMinionsKilled blueGoldDiff     0.023834         0.023834
499   blueTotalJungleMinionsKilled redGoldDiff    -0.023834         0.023834

5 Least Important Features:
                                 Feature  Coefficient  Abs_Coefficient
698   redEliteMonsters redExperienceDiff    -0.000002         0.000002
535  blueExperienceDiff redEliteMonsters     0.000002         0.000002
333               blueHeralds redHeralds     0.000000         0.000000
302               blueDragons redDragons     0.000000         0.000000
132         blueFirstBlood redFirstBlood     0.000000         0.000000

Features with Zero Coefficients (3):
                 

Key Takeaways for 5 Most Important Features:

- Tower destruction interactions are the most significant. This is because to win the game, players have to destroy the towers to get to the Nexus. Blue team also has to defend their towers so that the red team doesn't destory their towers.
- If the blue team gets First Blood and the red team secures Rift Herald, it has an effect on the game’s direction. This is interesting because I didn't think these two variables would affect each other.
- Warding and elite monster interactions have a negative impact, implying warding alone does not guarantee objective control.

Key Takeaways for 5 Least Important Features:

- The number of elite monsters taken by red team has almost no impact on experience difference.
    - This suggests getting objectives like Dragon/Herald doesn’t always translate to an XP lead. 
- If both teams take Herald/Dragon/First blood, it has no impact on the model’s predictions.
    - Likely means that trading cancels out its effect on game advantage.

Currently, the features include both the red team and blue team variables. I decided to try using only the blue team data to investigate if the coefficients of the features and test performance improve. 

In [39]:
#| echo: false
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import RidgeCV, LassoCV, LogisticRegressionCV

import warnings
warnings.filterwarnings("ignore")


blue_team_columns = [
    "blueWardsPlaced", "blueWardsDestroyed", "blueFirstBlood",
    "blueKills", "blueDeaths", "blueAssists", "blueEliteMonsters", "blueDragons",
    "blueHeralds", "blueTowersDestroyed", "blueTotalGold", "blueAvgLevel",
    "blueTotalExperience", "blueTotalMinionsKilled", "blueTotalJungleMinionsKilled",
    "blueGoldDiff", "blueExperienceDiff", "blueCSPerMin", "blueGoldPerMin"
]
X = df[blue_team_columns] 
y = df['blueWins']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)
feature_names = poly.get_feature_names_out(X.columns)


scaler = StandardScaler()
X_train_poly_scaled = scaler.fit_transform(X_train_poly)
X_test_poly_scaled = scaler.transform(X_test_poly)

Cs = [0.001, 0.01, 0.1, 1, 10, 100]  # 6 hyperparameter values
logregcv = LogisticRegressionCV(penalty="l1", Cs=Cs, cv=10, solver='saga')
logregcv.fit(X_train_poly_scaled, y_train)

best_C = logregcv.C_[0]
print(f"Best C value: {best_C}")

best_cv_score = logregcv.scores_[1].mean(axis=0).max()
print(f"Best CV Score: {best_cv_score:.4f}")

y_pred = logregcv.predict(X_test_poly_scaled)
y_pred_proba = logregcv.predict_proba(X_test_poly_scaled)[:, 1]

# feature coefficients from the best model
coefficients = logregcv.coef_[0]

# use DataFrame for feature importance
feature_importance = pd.DataFrame({'Feature': feature_names, 'Coefficient': coefficients})

# sort by absolute coefficient value
feature_importance['Abs_Coefficient'] = feature_importance['Coefficient'].abs()
sorted_importance = feature_importance.sort_values('Abs_Coefficient', ascending=False)

# top 5 most important features
print("\n5 Most Important Features:")
print(sorted_importance.head(5))

# 5 least important features
print("\n5 Least Important Features:")
print(sorted_importance.tail(5))

# exactly zero coefficients
zero_coefs = sorted_importance[sorted_importance['Coefficient'] == 0]
print(f"\nFeatures with Zero Coefficients ({len(zero_coefs)}):")
if len(zero_coefs) > 0:
    print(zero_coefs)
else:
    print("No features have exactly zero coefficients.")

# compute test performance metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)
conf_matrix = confusion_matrix(y_test, y_pred)

print("\nTest Performance of the Trained and Tuned Model:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"ROC-AUC: {roc_auc:.4f}")
print("Confusion Matrix:")
print(conf_matrix)


Best C value: 0.01
Best CV Score: 0.7295

5 Most Important Features:
                                       Feature  Coefficient  Abs_Coefficient
15                                blueGoldDiff     0.306387         0.306387
195  blueTotalJungleMinionsKilled blueGoldDiff     0.188640         0.188640
177                  blueAvgLevel blueGoldDiff     0.146996         0.146996
201                  blueGoldDiff blueCSPerMin     0.139644         0.139644
190        blueTotalMinionsKilled blueGoldDiff     0.139644         0.139644

5 Least Important Features:
                         Feature  Coefficient  Abs_Coefficient
74          blueKills blueDeaths          0.0              0.0
75         blueKills blueAssists          0.0              0.0
76   blueKills blueEliteMonsters          0.0              0.0
77         blueKills blueDragons          0.0              0.0
208             blueGoldPerMin^2          0.0              0.0

Features with Zero Coefficients (192):
                      

Here, we can see that the coefficients for the most important features increased compared to the previous model that included red team data. 

- This is because when red team data was included, some features were competing for importance (e.g., redGoldDiff and blueGoldDiff both represent the same concept in opposite directions).
- Now that only blue team data is used, there is no redundancy, so the model assigns higher weights to the remaining important features.
- This means that the model is more confident in the importance of blueGoldDiff and its interactions, leading to larger coefficients.

Moreover, ROC-AUC of 0.8193 is nearly identical to the model that included both teams.

- This suggests that red team data was not providing additional useful information—likely because gold, experience, and objectives are already reflected in blue team stats.

### The Interpretation of the Results

Most Informative Predictors

- Gold-related features (blueGoldDiff, blueTotalGold, blueGoldPerMin) consistently showed the highest predictive power.
    - blueGoldDiff achieved an accuracy of 0.7267 and an ROC-AUC of 0.8145.
    - blueGoldPerMin performed similarly, with an accuracy of 0.7363 and ROC-AUC of 0.8186.
    - Justification: Gold difference is a direct measure of a team's economic strength, affecting item purchases and overall game control. Higher gold often correlates with better team fights and objective control.
- Experience-based predictors (blueExperienceDiff, blueAvgLevel) also ranked high in importance.
    - blueExperienceDiff had an accuracy of 0.7358 and an ROC-AUC of 0.8186.
    - blueAvgLevel performed similarly with an accuracy of 0.7231 and ROC-AUC of 0.8039.
    - Justification: Experience difference determines champion level, which translates to stronger abilities and better survivability. A higher experience lead can mean winning skirmishes and fights more easily.
- Objective control, particularly blueTowersDestroyed, played a crucial role.
    - blueTowersDestroyed had an accuracy of 0.7110 and an ROC-AUC of 0.7906.
    - Justification: Towers provide vision, map pressure, and strategic control, allowing a team to dominate rotations and jungle camps.

Least Informative Predictors

- Warding statistics (blueWardsPlaced, blueWardsDestroyed) had minimal impact.
    - blueWardsPlaced resulted in an accuracy of just 0.5025, slightly above random chance.
    - Justification: While vision is important for strategy, placing wards alone does not directly lead to winning. It depends on how teams act on the vision gained.
- First Blood (blueFirstBlood) was moderately useful but not as strong as economy-based stats.
    - Accuracy: 0.6022 | ROC-AUC: 0.5976
    - Justification: Getting the first kill provides an early advantage, but it does not guarantee sustained dominance throughout the game.

## 6) Inference

In constructing our final model, I chose to use only blue team variables rather than incorporating both blue and red team statistics. This decision was based on the observation that blue team metrics alone provided sufficient predictive power without redundancy. When both teams' data were included, many features were highly correlated with one another (e.g., blueGoldDiff and redGoldDiff are essentially the same but in opposite directions), leading to redundancy in the model without adding new information. By selecting only blue team variables, I aimed to create a more interpretable model while maintaining high accuracy.

To determine the most important predictors, I used absolute coefficient values from the full model, as well as their impact on performance metrics such as ROC-AUC, accuracy, and precision. The five selected predictors— blueGoldDiff, blueTotalJungleMinionsKilled, blueAvgLevel, blueCSPerMin, and blueTotalMinionsKilled— consistently showed the highest contribution to predicting game outcomes. These features directly influence gold accumulation, experience gains, and farming efficiency, which are the most reliable indicators of team strength.

In [40]:
#| echo: false
import warnings
warnings.filterwarnings("ignore")
final_predictors = [
    "blueGoldDiff",
    "blueTotalJungleMinionsKilled",
    "blueAvgLevel",
    "blueCSPerMin",
    "blueTotalMinionsKilled"
]

X = df[final_predictors]  # Use only selected predictors
y = df["blueWins"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)
feature_names = poly.get_feature_names_out(X.columns)

# Standardization
scaler = StandardScaler()
X_train_poly_scaled = scaler.fit_transform(X_train_poly)
X_test_poly_scaled = scaler.transform(X_test_poly)

# Train Logistic Regression with Ridge Regularization (L2)
Cs = [0.001, 0.01, 0.1, 1, 10, 100]  # 6 hyperparameter values
logregcv = LogisticRegressionCV(penalty="l2", Cs=Cs, cv=10, solver='saga', max_iter=1000)
logregcv.fit(X_train_poly_scaled, y_train)

# Best hyperparameter value
best_C = logregcv.C_[0]

# Best cross-validation score
best_cv_score = logregcv.scores_[1].mean(axis=0).max()

# Make test predictions
y_pred = logregcv.predict(X_test_poly_scaled)
y_pred_proba = logregcv.predict_proba(X_test_poly_scaled)[:, 1]

# Compute test performance metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)
conf_matrix = confusion_matrix(y_test, y_pred)

print("\nTest Performance of the Trained and Tuned Model:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"ROC-AUC: {roc_auc:.4f}")
print("Confusion Matrix:")
print(conf_matrix)



Test Performance of the Trained and Tuned Model:
Accuracy: 0.7318
Precision: 0.7330
Recall: 0.7300
ROC-AUC: 0.8106
Confusion Matrix:
[[724 263]
 [267 722]]


### Effect of Each Predictor on the Response (blueWins)

- blueGoldDiff (Coefficient: 0.3064)
    - For each 1,000 gold increase in blueGoldDiff, the log-odds of blueWins increase by 0.3064.
    - The exponent of 0.3064 is around 1.36, meaning that for each additional 1,000 gold lead, the odds of the blue team winning increase by 36%.
    - A higher gold difference translates into better itemization, stronger champions, and more map control, increasing the likelihood of victory.
- blueTotalJungleMinionsKilled * blueGoldDiff (Coefficient: 0.1893)
    - For each unit increase in blueTotalJungleMinionsKilled * blueGoldDiff, the log-odds of blueWins increase by 0.1893.
    - The exponent of 0.1893 is around 1.21, meaning that for each additional jungle minion farmed while ahead in gold, the odds of winning increase by 21%.
    - Jungle farming is important for maintaining a gold lead, and teams that farm the jungle efficiently while leading in gold have a significantly higher chance of winning.
- blueAvgLevel * blueGoldDiff (Coefficient: 0.1468)
    - For each unit increase in blueAvgLevel * blueGoldDiff, the log-odds of blueWins increase by 0.1468.
    - The exponent of 0.1468 is around 1.16, meaning that for each additional level gained while ahead in gold, the odds of winning increase by 16%.
    - A higher level advantage combined with a gold lead significantly increases the blue team’s chances of winning, as champions with more experience deal more damage, have stronger abilities, and survive longer in fights.
- blueGoldDiff * blueCSPerMin (Coefficient: 0.1397)
    - For each unit increase in blueGoldDiff * blueCSPerMin, the log-odds of blueWins increase by 0.1397.
    - The exponent of 0.1397 is around 1.15, meaning that for each increase in CS per minute while ahead in gold, the odds of winning increase by 15%.
    - Teams that continue farming efficiently after getting ahead in gold sustain their lead and make it harder for the enemy to recover, ensuring a higher win probability.
- blueTotalMinionsKilled * blueGoldDiff (Coefficient: 0.1397)
    - For each unit increase in blueTotalMinionsKilled * blueGoldDiff, the log-odds of blueWins increase by 0.1397.
    - The exponent of 0.1397 is around 1.15, meaning that for each additional minion killed while ahead in gold, the odds of winning increase by 15%.
    - Consistently farming minions increases a team's gold lead, which directly contributes to their chances of winning.
    - Minion kills alone aren’t enough—teams must translate that farm into meaningful economic and objective advantages.

### Reliability

To assess the reliability of each predictor, we need to evaluate whether the estimated coefficients are statistically significant.

In [41]:
#| echo: false
X_train_poly_df = pd.DataFrame(X_train_poly, columns=feature_names, index=train_df.index)
X_train_poly_df["blueWins"] = train_df["blueWins"]
logit_model = smf.logit(formula = "blueWins ~ blueGoldDiff + blueTotalJungleMinionsKilled + blueAvgLevel + blueCSPerMin + blueTotalMinionsKilled + blueGoldDiff*blueTotalJungleMinionsKilled + blueGoldDiff*blueAvgLevel + blueGoldDiff*blueCSPerMin + blueGoldDiff*blueTotalMinionsKilled", data=X_train_poly_df)
result = logit_model.fit()
print(result.summary())

         Current function value: 0.539110
         Iterations: 35
                           Logit Regression Results                           
Dep. Variable:               blueWins   No. Observations:                 7903
Model:                          Logit   Df Residuals:                     7893
Method:                           MLE   Df Model:                            9
Date:                Mon, 17 Mar 2025   Pseudo R-squ.:                  0.2222
Time:                        15:53:51   Log-Likelihood:                -4260.6
converged:                      False   LL-Null:                       -5477.9
Covariance Type:            nonrobust   LLR p-value:                     0.000
                                                coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------------------------
Intercept                                    -2.1020      0.820     -2.564      0.

From the summary, we can see that blueGoldDiff, blueTotalJungleMinionsKilled, blueAvgLevel, blueGoldDiff:blueTotalJungleMinionsKilled are statistically significant and reliably contributes to predicting wins because their p-values are below 0.05. blueGoldDiff:blueAvgLevel, blueGoldDiff:blueCSPerMin, blueGoldDiff:blueTotalMinionsKilled might not be significant, meaning its effect could be due to random chance. The p-values for blueCSPerMin and blueTotalMinionsKilled are NaN could be due to multicollinearity. To check this, I plan to compute VIF for each predictor

In [30]:
#| echo: false
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Compute VIF for each predictor
vif_data = pd.DataFrame()
vif_data["Feature"] = X_train_poly_df.columns
vif_data["VIF"] = [variance_inflation_factor(X_train_poly_df.values, i) for i in range(X_train_poly_df.shape[1])]

print(vif_data.sort_values(by="VIF", ascending=False))


  vif = 1. / (1. - r_squared_i)


                                              Feature           VIF
8                           blueGoldDiff blueCSPerMin           inf
17                                     blueCSPerMin^2           inf
15                          blueAvgLevel blueCSPerMin           inf
13  blueTotalJungleMinionsKilled blueTotalMinionsK...           inf
12          blueTotalJungleMinionsKilled blueCSPerMin           inf
9                 blueGoldDiff blueTotalMinionsKilled           inf
16                blueAvgLevel blueTotalMinionsKilled           inf
18                blueCSPerMin blueTotalMinionsKilled           inf
19                           blueTotalMinionsKilled^2           inf
4                              blueTotalMinionsKilled           inf
3                                        blueCSPerMin           inf
14                                     blueAvgLevel^2  1.626993e+05
2                                        blueAvgLevel  1.431420e+05
11          blueTotalJungleMinionsKilled blueAvg

My Variance Inflation Factor (VIF) results indicate severe multicollinearity in my dataset. blueCSPerMin and blueTotalMinionsKilled are highly problematic—both their original and squared terms show infinite multicollinearity. Other interaction terms (e.g., blueGoldDiff * blueCSPerMin, blueTotalJungleMinionsKilled * blueTotalMinionsKilled) also have perfect collinearity. This is a problem because multicollinearity makes coefficient estimates unstable.

### Variation in the Response

Looking at the model summary, we can see that the Pseudo R-squared value is 0.222. This means the model explains 22.22% of the variation in blueWins compared to a null model that only includes an intercept, which indicates a moderately strong predictive ability. The model is significantly better than random guessing (R^2 = 0). However, there is still 77.78% of variation unexplained, which could be explained by adding more variables. 

## 7) Recommendation to Stakeholders

### Main Action Items for Stakeholders

For esports coaches and analysts, the key takeaway from this analysis is that early-game advantages, particularly gold difference, jungle farming, and level leads, are crucial for setting up a win. Since blueGoldDiff was the strongest predictor of victory, teams should focus on securing a gold lead through efficient farming, early objective control, and kill conversions. Players can apply this by prioritizing dragons and towers and making sure their gold advantage leads to action rather than passive farming. Since this analysis only looks at the first 10 minutes of Diamond-ranked games, these findings are most relevant for high-level play, where small advantages can snowball quickly. For Riot Games, this data might highlight potential balance concerns—if gold or jungle control is too dominant, adjustments may be needed to keep matches competitive rather than one-sided from the start.



### Limitations

While the model offers valuable insights, it comes with some important limitations. The biggest one is that correlation doesn’t mean causation—just because a gold lead strongly correlates with winning doesn’t mean simply increasing gold guarantees a win. The model also lacks real-time decision-making context—it only looks at game stats, not the choices players make in response to their opponents. Things like team communication, rotations, and overall strategy are missing from this data. Another major gap is that the analysis doesn’t factor in champion matchups, player skill, or team compositions, all of which can completely change how a game plays out. There’s also the issue of scope—since this data only comes from Diamond-ranked games, the findings might not hold for lower-ranked players, where mistakes and playstyles are different, or for Challenger-level games, where coordination is much tighter. Finally, the model’s pseudo R-squared of 0.2222 tells us that while it captures some of what leads to a win, a lot of variation remains unexplained.

### Overcoming Limitations

To improve on this analysis, future work should bring in more contextual data—things like champion picks, team compositions, and player rankings could provide a clearer picture of how early-game decisions influence the rest of the match. It would also be useful to repeat this analysis across different ranks or game patches to see if the same trends hold, since strategies that work in Diamond might not be as effective in lower or higher tiers. Finally, expanding the dataset to track games beyond the first 10 minutes would give a better understanding of how early leads translate into mid and late-game wins, rather than just assuming a strong start always leads to victory.

## 8) Conclusion

My analysis shows that in Diamond-ranked League of Legends matches, early-game advantages in gold, jungle farming, and level leads play a major role in determining the outcome. Teams that build a gold lead while efficiently farming jungle camps and minions set themselves up for success, as these factors work together to create a snowball effect. That said, the model only captures part of the picture—it doesn’t account for champion matchups, team strategy, or what happens after the 10-minute mark, which are all crucial in shaping a game. To get a fuller understanding, future work should bring in more contextual data and explore more complex models to better capture the dynamics of high-level play.