### **Predicting and Understanding Factors Affecting Game Outcomes in League of Legends**

##### By Antonio Alphonse

In [2]:
import os

import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

import sklearn.tree as tree
import sklearn.metrics as mt
import sklearn.ensemble as ens

from sklearn.inspection import PartialDependenceDisplay, partial_dependence
from sklearn.model_selection import train_test_split

## **Abstract**

&nbsp;&nbsp;&nbsp;&nbsp; With the increasing popularity of League of Legends in the esports scene, many want to understand which parts of the game matter most to winning. Many factors may affect the outcome of a game. This study seeks to identify the factors that most affect the chances of winning, and to test whether a game's outcome can be predicted before it has even started. The results help in assessing the fairness of the game and in showing new players what to prioritize as they play more games. The study also informs the choice of machine learning algorithm for building a prediction model. Using Random Forests on data from the three highest skill groups of the game, the most influential factors in predicting outcomes were the number of kills obtained, the overall length of the game, the amount of gold a team earned, and the team's skill on their chosen characters. With these and the additional factors examined, the models achieved precision and recall above 80% when classifying games. Using only the pregame factors of account level and champion skill, the models predicted wins with a precision above 50%.

## **Introduction**

### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Background**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Since its release in 2009 by Riot Games, League of Legends has grown to be adored by many worldwide. The game, with over 150 different characters to choose from, pits two teams of five against each other in a quest for control over the map, ending when one team destroys the enemy's nexus. It's no surprise that with its popularity there exists a competitive scene home to teams from around the world. Besides its presence in esports, League of Legends also has a ranked mode alongside its casual mode, with its lowest three ranks being Iron, Bronze, and Silver, and its highest three ranks being Master, Grandmaster, and Challenger. 

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;As with any sport, players want to improve, or at least understand what they should be working towards. However, that's difficult to do without a clear understanding of what matters in a game. The outcome of a game of League of Legends may depend on a mix of factors, or there may be one overarching factor that decides who wins. So which factors does winning depend on? Are there any behaviors that we should focus on to improve our chances of winning? Is there one defining factor that dominates the prediction of match outcomes? These are the questions this experiment seeks to answer, alongside the goal of accurately predicting and classifying matches from the factors used.

### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Prior Art**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;In the past, neural networks were used to predict game outcomes, achieving an 84% accuracy rate (Hall, 2017). In 2016, Gradient Boosted Trees and Logistic Regression were used, reaching an Area Under Curve score of over 0.5 and an error of less than 1 (Lin, 2016). Most recently, one experiment compared Logistic Regression, Random Forests, and Decision Trees, and then used the best model to predict game outcomes. Logistic Regression was the best model for their situation with an accuracy of 92.2%, followed by Random Forests at 92% and Decision Trees at 88% (Quintana, 2019). All of these experiments use both pregame and postgame information to fit their models.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; There is one paper that stands out in pregame prediction, which used Random Forests, Logistic Regression, and Support Vector Machines on data from professional gameplay. The study used pregame statistics, like winrate and games played, as well as the champions that were banned and picked, to train the model. They found that the champions selected for ban and play alone produced a model that predicted no better than random guessing. However, using player statistics in conjunction with the champions picked produced a model with an AUC score higher than 0.9, and using fewer features led to higher AUC scores (Costa et al., 2021). It is interesting that player winrates from before the analyzed game translated into accurate win predictions. Unlike Costa's study, which aims to find a model that predicts game outcomes based on trends, this study attempts to see whether the matchmaking algorithm in League of Legends produces fair matchups.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A study performed in 2020 looked at predicting game outcomes based on the five roles in the game using regression techniques and econometrics. Its results show that having more kills as a player in the support or midlane role had an adverse effect on the chances of winning, whereas playing as a top laner or attack-damage-carry character increased the chances of winning (Hubbard, 2020). Though the results are interesting in showing how different roles and their in-game statistics affect the chances of winning a game, the study fails to take into account the different characters and their purposes. It generalizes its results by treating all characters as the same, rather than accounting for their differences in growth and impact.

### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **Experiment and Motivations**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; This experiment is done to understand how different variables in League of Legends impact a team's chances of winning the game. The experiment looks at variables that a player can easily see, whether in game or before the game. From these variables, newer players can gain an understanding of which parts of the game matter most in securing a victory. In conjunction with previous studies, players and analysts alike can see how the importance of different factors has changed over the game's lifetime. Aside from revealing more about the inner workings of League of Legends, this study can be used to assess the fairness of the game and to communicate to the developers what changes would make it fairer. If there are factors prior to a game that heavily affect the chances of winning, this would suggest that Riot Games' matchmaking algorithm is not fairly designed.
 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Two experiments were performed to explore the data: one on post-match data and one on pre-match data. In both situations, the algorithm takes factors such as total team gold earned, total kills, vision score, game duration, average team account level, average champion mastery, and total time immobilized, and produces a binary value representing whether the team won (1) or lost (0). These terms are explained in more detail in the Data section. In the first part of this experiment, I use Random Forests on match data from players in the Master, Grandmaster, and Challenger skill divisions to analyze how heavily each model weighs the variables in postgame data. I expect that the order of importance of the features will remain constant across the skill groups. For the second part of the experiment, I use Random Forests on the same datasets, but with only pregame factors, specifically average champion mastery and average team account level, to investigate whether a game can be considered won before it is even played. Here I expect that the model will predict wins accurately over 50% of the time.

## **Data** 

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 525 ranked games from each of the Master, Grandmaster, and Challenger skill groups in North America were gathered using Riot's API in conjunction with RiotWatcher, a library used to gather match and player information. 21 random players from each division were selected, and their 25 most recent games were used for data. Since match data provided information for both teams in the game, each game yields one row per team, doubling the available data to 1050 rows per division. This data was then placed into a csv file. The data was gathered on April 16th, 2022 during patch 12.7. Below is an example of the data from the Challenger skill group.
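As a rough illustration of how one game becomes two rows, the sketch below aggregates a single match record into per-team features. It is not the collection script used for this study: the dictionary layout and the field names (`gameDuration`, `goldEarned`, `timeCCingOthers`, `summonerLevel`, `visionScore`, `puuid`) are assumptions modeled on Riot's match data, and `mastery_by_player` is a hypothetical lookup standing in for a separate champion mastery request.

```python
from statistics import mean


def team_rows(match, mastery_by_player):
    """Aggregate one match record into one feature row per team.

    `match` is assumed to be a dict shaped like Riot's match data; the field
    names are assumptions, not the study's actual script.
    `mastery_by_player` is a hypothetical {player id: mastery points} lookup.
    """
    rows = []
    team_ids = sorted({p["teamId"] for p in match["participants"]})
    for team_id in team_ids:
        players = [p for p in match["participants"] if p["teamId"] == team_id]
        rows.append({
            # Left in seconds here; the notebook converts to minutes after loading.
            "Game Duration": match["gameDuration"],
            "Total Team Gold Earned": sum(p["goldEarned"] for p in players),
            "Total Team Crowd Control Time": sum(p["timeCCingOthers"] for p in players),
            "Average Team Account Level": mean(p["summonerLevel"] for p in players),
            "Average Team Champion Mastery": mean(mastery_by_player[p["puuid"]] for p in players),
            "Total Team Kills": sum(p["kills"] for p in players),
            "Total Team Vision Score": sum(p["visionScore"] for p in players),
            "Win": int(players[0]["win"]),
        })
    return rows


# Made-up two-versus-two match, only to show the shape of the output.
example_match = {
    "gameDuration": 1800,
    "participants": [
        {"puuid": "a", "teamId": 100, "goldEarned": 10000, "kills": 5,
         "visionScore": 20, "timeCCingOthers": 15, "summonerLevel": 200, "win": True},
        {"puuid": "b", "teamId": 100, "goldEarned": 12000, "kills": 7,
         "visionScore": 30, "timeCCingOthers": 25, "summonerLevel": 300, "win": True},
        {"puuid": "c", "teamId": 200, "goldEarned": 9000, "kills": 3,
         "visionScore": 25, "timeCCingOthers": 10, "summonerLevel": 150, "win": False},
        {"puuid": "d", "teamId": 200, "goldEarned": 8000, "kills": 2,
         "visionScore": 35, "timeCCingOthers": 20, "summonerLevel": 250, "win": False},
    ],
}
example_mastery = {"a": 10000, "b": 20000, "c": 5000, "d": 15000}

rows = team_rows(example_match, example_mastery)
for row in rows:
    print(row)
```

Each match contributes one winning row and one losing row, which is why the Win column is perfectly balanced in the collected data.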

In [3]:
challengerData = pd.read_csv(os.path.join("Data","ChallengerMatchData.csv"))

# Time is reported in seconds, converted to minutes.
challengerData["Game Duration"] = challengerData["Game Duration"] / 60

challengerData.sample(10)

| Unnamed: 0 | Game Duration | Total Team Gold Earned | Total Team Crowd Control Time | Average Team Account Level | Average Team Champion Mastery | Total Team Kills | Total Team Vision Score | Win |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 625 | 39.066667 | 82857 | 96 | 168.2 | 19995.6 | 35 | 285 | 1 |
| 173 | 22.8 | 48009 | 105 | 318.0 | 10596.8 | 26 | 112 | 1 |
| 140 | 22.7 | 48286 | 81 | 443.8 | 10763.0 | 19 | 146 | 1 |
| 60 | 28.933333 | 50597 | 61 | 345.4 | 11605.2 | 10 | 159 | 0 |
| 692 | 28.666667 | 50361 | 74 | 298.2 | 11604.8 | 18 | 151 | 0 |
| 30 | 3.216667 | 4515 | 3 | 197.2 | 703.0 | 0 | 2 | 0 |
| 518 | 30.033333 | 61799 | 74 | 295.8 | 15750.6 | 32 | 164 | 1 |
| 882 | 28.083333 | 62769 | 133 | 144.8 | 14060.4 | 43 | 170 | 1 |
| 150 | 26.35 | 61754 | 123 | 322.0 | 13699.4 | 38 | 160 | 1 |
| 1000 | 33.016667 | 67950 | 185 | 186.4 | 15730.4 | 37 | 173 | 0 |


&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; For the experiments, a 60/20/20 split into training, validation, and test sets was applied to the data. Both teams' rows were added to the csv file at the same time, creating an alternating order of 1 and 0 in the Win column. To remove this pattern, the data was shuffled prior to splitting.
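The shuffle-then-split step can be sketched in plain Python as below. This is only an illustration of the idea on a toy Win column; the notebook itself relies on two chained calls to `train_test_split` (first holding out 20% for testing, then 25% of the remaining 80% for validation, which is again 20% of the whole). The function name `shuffled_split` and the seed are assumptions made for the example.

```python
import random


def shuffled_split(rows, seed=0):
    """Shuffle rows, then cut them into 60% train, 20% validation, 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    # Integer arithmetic keeps the cut points exact.
    n_train = len(shuffled) * 60 // 100
    n_valid = len(shuffled) * 20 // 100

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Toy stand-in for the Win column: winners and losers alternate row by row.
wins = [1, 0] * 525  # 1050 rows

train, valid, test = shuffled_split(wins)
print(len(train), len(valid), len(test))  # 630 210 210
```

Shuffling first breaks up the alternating winner/loser order, so no subset inherits the file's original row pattern.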

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  Previously, I listed the various features that I planned to use for the experiments. Here I go more into detail about the data and the metrics:

1.  **Game Duration**: *The total time for which the game lasted in minutes.* 
    Time was recorded in seconds by RiotWatcher, so the data was converted into minutes to create data that could be more easily understood.
     
2. **Total Team Gold Earned**: *The total amount of gold obtained by the team.*
    In League of Legends, players earn gold through gaining kills, destroying towers, and conquering objectives. Gold is then spent on items to make their individual character more powerful.

3. **Total Team Crowd Control Time**: *The total amount of time the team immobilized the enemy in seconds.* 
    Crowd control is a term referring to an ability that immobilizes an enemy.

4. **Average Team Account Level**: *The average account level for a team.*
    In many games, the account level of a player is a good indicator of how long a player has played the game. This still holds true for League of Legends, but from its release the game had a level cap of 30. In 2018, this cap was removed (Çakır, 2021), which may skew the results.

5. **Average Team Champion Mastery**: *The average champion mastery for a team.*
    Champion mastery is a metric employed by League of Legends that quantifies a player's skill on a particular champion. This number increases the more a player plays that champion and the better that player performs on that champion. Champion mastery does not decrease.

6. **Total Team Kills**: *The total amount of kills gained by a team.* 

7. **Total Team Vision Score**: *The total amount of vision on the map a team has provided and denied.*
    To monitor areas on the map, teams can place wards, which provide vision of the surrounding area for a fixed amount of time, after which they expire. Teams can also destroy wards placed by enemies to cripple the enemy team's awareness of the map. Destroying a ward adds to your vision score however long that ward was placed. Certain character abilities also provide vision score ("Vision"), but for the purposes of the experiment, we assume that vision score is only obtained through ward placement and removal.
    
8. **Win**: *A value denoting whether the team won or lost.*
    This value is a boolean value, with 1 signifying a win, and 0 signifying a loss.

## **Methods**

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;To understand the Random Forest algorithm, we first need to understand decision trees. A decision tree is formed by repeated binary splitting of the dataset into non-overlapping regions. Though useful, a single tree can grow very complex and overfit the training data, leading to poor performance on new data; pruning is one remedy. Random Forests are another: they build many decision trees, each on a bootstrap sample of the data with a random subset of features considered at each split, and each tree makes its own prediction. These separate predictions are then combined, and the class predicted by the most trees is selected as the overall vote. <img align="left" style="background-color: white; margin: 2%;"  src="Images/postgame_precision_rates.png">Because the errors of individual trees tend to average out, forests usually make better classifiers than single decision trees.
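The voting step described above can be sketched in a few lines. This is a simplified illustration with made-up votes, not the library's internals; scikit-learn's `RandomForestClassifier` averages each tree's predicted class probabilities rather than counting hard votes, although the two usually agree.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Five hypothetical trees vote on one game: 1 = win, 0 = loss.
votes = [1, 0, 1, 1, 0]
print(majority_vote(votes))  # 1
```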

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; To create the model used for classifying games based on post-match information, the number of trees in the Random Forest and the number of random features considered at each split had to be chosen. For the number of trees, I looked at where the validation precision stabilizes, which seemed to be around 300 trees. For the number of predictors, I used the validation set to determine how many predictors maximized the precision of the model. These numbers varied across the models: 5 for Grandmaster, and 7 for both Master and Challenger. Nodes were split using the Gini impurity index. The process was similar for the pre-match analysis, except that two predictors were used as the number of random features. Due to time constraints, cross-validation was not performed.
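For readers unfamiliar with the Gini impurity index, the sketch below shows how a candidate split can be scored. It is a minimal hand-written illustration on toy labels, not scikit-learn's implementation; the helper names `gini_impurity` and `split_impurity` are chosen for this example.

```python
def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n


parent = [1, 1, 1, 0, 0, 0]                  # three wins, three losses
print(gini_impurity(parent))                 # 0.5
print(split_impurity([1, 1, 1], [0, 0, 0]))  # 0.0, a perfect split
print(split_impurity([1, 1, 0], [1, 0, 0]))  # a weaker split, still below 0.5
```

A tree prefers the split with the lowest weighted impurity, so the perfect split would be chosen over the weaker one.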

Below is the implementation of the model algorithm for the Challenger dataset. Scikit-Learn's ensemble methods library was used to create the Random Forest.

In [4]:
# Initialize variables
data = challengerData

# Use every column except the label and the saved row index ("Unnamed: 0"),
# which leaves the seven features reported in the results.
featureSet = data.columns.drop(["Unnamed: 0", "Win"], errors="ignore")

# Split into training and testing set (80/20)
trainingX, testingX, trainingY, testingY = train_test_split(data[featureSet], data["Win"], random_state=0, shuffle=True, test_size=0.2)

# Split into validation set from training set (0.25 of the remaining 80% is 20% overall)
trainingX, validX, trainingY, validY = train_test_split(trainingX, trainingY, random_state=0, shuffle=True, test_size=0.25)

In [5]:
# Find the optimal number of predictors for the random forest algorithm
predictorRange = range(1, len(featureSet) + 1)

precisions = []

for predictors in predictorRange:
    classifier = ens.RandomForestClassifier(n_estimators=300, max_features=predictors)

    model = classifier.fit(trainingX, trainingY)

    validationPredict = model.predict(validX)

    # Precision on the validation set for this number of predictors
    precision = mt.precision_score(validY, validationPredict)

    precisions.append(precision)

# Keep the number of predictors with the highest validation precision
maxPredictors = predictorRange[precisions.index(max(precisions))]

In [6]:
# Find the number of trees at which validation precision stabilizes

numTrees = range(50, 501, 1)
treePrecisions = []

for trees in numTrees:
    classifier = ens.RandomForestClassifier(n_estimators=trees, max_features=maxPredictors)
    treeModel = classifier.fit(trainingX, trainingY)

    treeValidationPredict = treeModel.predict(validX)

    # Track precision rather than squared error so the values match the plot labels
    treePrecisions.append(mt.precision_score(validY, treeValidationPredict))

# plt.figure(0)
# plt.title("Precision of Validation Set Across Number of Trees")
# plt.xlabel("Number of Trees")
# plt.ylabel("Precision Rate")
# plt.plot(numTrees, treePrecisions)

In [7]:
# Implement the optimal random forest, fit, and predict
optimalRF = ens.RandomForestClassifier(n_estimators=300, max_features=maxPredictors)
optimalModel = optimalRF.fit(trainingX, trainingY)

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Before reporting the results of the experiments, it is important to note which metrics are used to judge the models' power. Since both problems are classification problems, precision, recall, and AUC score were the primary metrics. In addition, feature importance, measured by how much each feature reduces Gini impurity across the trees' splits, was used to quantify a feature's influence. The main interest in the experiment was the model's ability to correctly predict wins, so the precision, recall, and AUC scores for the win class were the focus. Recall, precision, and AUC scores were computed with Scikit-Learn's metrics library. Feature importance was taken from the `feature_importances_` attribute of Scikit-Learn's Random Forests.
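To make these metrics concrete, the sketch below computes precision, recall, and AUC for the win class by hand on a handful of made-up predictions. It is only meant to show what the numbers mean; the reported results come from Scikit-Learn's metrics library, and the helper names and toy values here are assumptions for the example.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the win class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc_score(y_true, scores):
    """AUC as the share of (win, loss) pairs in which the win is scored higher.

    Ties count as half a correctly ordered pair.
    """
    wins = [s for t, s in zip(y_true, scores) if t == 1]
    losses = [s for t, s in zip(y_true, scores) if t == 0]
    ordered = sum(1.0 if w > l else 0.5 if w == l else 0.0
                  for w in wins for l in losses)
    return ordered / (len(wins) * len(losses))


# Toy outcomes, hard predictions, and predicted win probabilities.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

precision, recall = precision_recall(y_true, y_pred)
auc = auc_score(y_true, scores)
print(round(precision, 3), round(recall, 3), round(auc, 3))
```

Precision answers "of the games predicted as wins, how many were wins", recall answers "of the actual wins, how many were found", and AUC measures how well the predicted probabilities rank wins above losses regardless of the decision threshold.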

## **Results**

<h3 align=center><strong>Table 1 - ROC Curves</strong></h3>

**Master - Postgame**             |  **Grandmaster - Postgame**         | **Challenger - Postgame** |
:-------------------------:|:-------------------------: |:-------------------------:
<img style="background-color: white;" src="./Images/Master Postgame/master_ROC_Curve_post.png">  | <img style="background-color: white;" src="./Images/Grandmaster Postgame/grandmaster_ROC_Curve.png"> |<img style="background-color: white;" src="./Images/Challenger Postgame/challenger_postgame_ROC_Curve.png">
**Master - Pregame**            |  **Grandmaster - Pregame**        | **Challenger - Pregame**|
<img style="background-color: white;" src="./Images/Master Pregame/master_pregame_ROC_Curve.png">  | <img style="background-color: white;" src="./Images/Grandmaster Pregame/grandmaster_pregame_ROC_curve.png"> |<img style="background-color: white;" src="./Images/Challenger Pregame/challenger_pregame_ROC_curve.png">

<h3 align=center><strong>Table 2 - Precision, Recall, and Important Features of Postgame Matches</strong></h3>

<table align=center>
  <thead align=center>
    <tr>
      <th></th>
      <th><strong>Master - Postgame</strong></th>
      <th><strong>Grandmaster - Postgame</strong></th>
      <th><strong>Challenger - Postgame</strong></th>
    </tr>
  </thead>

  <tbody>
    <tr style="text-align:center">
      <td>Precision: </td>
      <td>82%</td>
      <td>87%</td>
      <td>84%</td>
    </tr>
    <tr style="text-align:center">
      <td>Recall: </td>
      <td>85%</td>
      <td>86%</td>
      <td>83%</td>
    </tr>
    <tr>
      <td>Important Features: </td>
      <td>
        <ol>
          <li>Kills (30)</li>
          <li>Duration (25)</li>
          <li>Gold Earned (13)</li>
          <li>Champion Mastery (13)</li>
        </ol>
      </td>
      <td>
        <ol>
          <li>Kills (35)</li>
          <li>Duration (24)</li>
          <li>Champion Mastery (13)</li>
          <li>Gold Earned (12)</li>
        </ol>
      </td>
      <td>
        <ol>
          <li>Kills (37)</li>
          <li>Duration (22)</li>
          <li>Gold Earned (14)</li>
          <li>Champion Mastery (11)</li>
        </ol>
      </td>
    </tr>
  </tbody>
</table>

<h3 align=center><strong>Table 3 - Precision and Recall of Pregame Prediction</strong></h3>

<table align=center style="text-align:center">
<thead>
  <tr>
    <th></th>
    <th><strong>Master - Pregame</strong></th>
    <th><strong>Grandmaster - Pregame</strong></th>
    <th><strong>Challenger - Pregame</strong></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>Precision: </td>
    <td>52%</td>
    <td>56%</td>
    <td>56%</td>
  </tr>
  <tr>
    <td>Recall: </td>
    <td>59%</td>
    <td>62%</td>
    <td>55%</td>
  </tr>
</tbody>
</table>

<h3 align=center><strong>Figure 1 - Feature Importance of Pregame and Postgame Random Forests</strong></h3>
<div align=center>
    <img style="background-color: white;" src="./Images/pregame_feature_importance_together.png">
    <img style="background-color: white;" src="./Images/postgame_feature_importance.png">

    <p><em>Note: for the postgame plot, the features on the x-axis are, from left to right: "Average Team Account Level", "Total Team Vision Score", "Total Team Crowd Control Time", "Average Team Champion Mastery", "Total Team Gold Earned", "Game Duration", "Total Team Kills".</em></p>
</div>

<h3 align=center><strong>Table 4 - Partial Dependence Plot of Postgame Features </strong></h3>

**Total Team Kills**             |  **Game Duration**         | **Total Team Gold Earned** |
:-------------------------:|:-------------------------: |:-------------------------:
<img style="background-color: white;" src="./Images/postgame_kills_pdp.png">  | <img style="background-color: white;" src="./Images/postgame_gameduration_pdp.png"> |<img style="background-color: white;" src="./Images/postgame_gold_pdp.png">
**Average Team Champion Mastery**            |  **Total Team Crowd Control Time**        | **Total Team Vision Score**|
<img style="background-color: white;" src="./Images/postgame_mastery_pdp.png">  | <img style="background-color: white;" src="./Images/postgame_cc_pdp.png"> |<img style="background-color: white;" src="./Images/postgame_vision_pdp.png">
**Average Team Account Level**|          | |
<img style="background-color: white;" src="./Images/postgame_accountlevel_pdp.png">| |

<h3 align=center><strong>Table 5 - Feature Importances of Postgame Random Forests</strong></h3>

| Challenger Post Game | Importance |
:-----------------|:-------------------:
Total Team Vision Score |4.180141
Average Team Account Level |5.261508
Total Team Crowd Control Time |7.019170
Average Team Champion Mastery |11.230696
Total Team Gold Earned |13.541553
Game Duration |21.602689
Total Team Kills |37.164242
      

| Grandmaster Post Game | Importance |
:-----------------|:-------------------:
Average Team Account Level |3.964566
Total Team Vision Score |5.638101
Total Team Crowd Control Time |6.075774
Total Team Gold Earned |12.103502
Average Team Champion Mastery|13.010113
Game Duration |23.920335
Total Team Kills |35.287609
      

| Master Post Game | Importance |
:-----------------|:-------------------:
Average Team Account Level |5.277408
Total Team Vision Score |6.406849
Total Team Crowd Control Time |6.890230
Average Team Champion Mastery |13.135445
Total Team Gold Earned |13.320580
Game Duration |24.579913
Total Team Kills |30.389576

## **Discussion**

### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **Summary of Results**

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; For all skill groups in the postgame analysis, the most important factors were Total Team Kills, Game Duration, and a near tie between Total Team Gold Earned and Average Team Champion Mastery (Table 2). Team kills and game duration consistently carried the most weight in the random forests' decisions, while total team gold earned and average team champion mastery stayed at roughly the same values throughout the skill groups. The importance of Total Team Vision Score, Average Team Champion Mastery, and Game Duration decreases from Master to Challenger, whereas the importance of Total Team Kills increases across the ranks. All other features observed in the experiment had no observable trend across skill groups (Figure 1).

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The data suggests that Average Team Account Level, Total Team Vision Score, and Total Team Crowd Control Time have little to no impact on a team's chances of winning. The longer a game lasts, the lower the predicted chance of winning. The more kills a team obtains, the more its chance of winning increases, and dramatically so. The higher a team's champion mastery, the higher its chance of winning. Teams with more gold also tend to have a higher chance of winning (Table 4).
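The partial dependence plots behind these statements can be read as follows: one feature is forced to a fixed value for every row, the model's predicted win probability is averaged, and the process repeats across a grid of values. The sketch below reproduces that calculation by hand. The `toy_model` function is a hypothetical stand-in for a fitted forest and its coefficients are invented; the notebook imports `partial_dependence` and `PartialDependenceDisplay` from `sklearn.inspection` for the real plots.

```python
def partial_dependence_curve(predict_win_prob, rows, feature, grid):
    """Average predicted win probability with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature in every row, leaving the other features as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append(sum(predict_win_prob(r) for r in modified) / len(modified))
    return curve


def toy_model(row):
    """Hypothetical model: more kills help, longer games hurt (invented numbers)."""
    score = 0.5 + 0.01 * row["kills"] - 0.005 * row["duration"]
    return min(1.0, max(0.0, score))


rows = [{"kills": 20, "duration": 25}, {"kills": 30, "duration": 35}]
curve = partial_dependence_curve(toy_model, rows, "kills", [10, 20, 30])
print([round(p, 3) for p in curve])
```

A rising curve, as for kills here, corresponds to a feature whose higher values raise the predicted chance of winning; a falling curve corresponds to the game duration pattern described above.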

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; In classifying matches postgame, all models achieved precision and recall above 80% (Table 2), with AUC scores over 0.9 (Table 1). Predicting games using only pregame factors gave a precision above 50% but below 60% (Table 3), with AUC scores over 0.6. Random forests are designed to reduce overfitting, so overfitting to the training set is expected to be low, although this was not measured directly.

### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **Discussion of Results**

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The order of importance of factors in the first experiment did change across ranks, but the most impactful factors kept the same order. It's interesting that the importance of vision score decreases as the skill group gets higher, which suggests that map awareness has little impact on a game among higher skilled players. One would expect that the higher a team's vision score, the higher its chances of winning, since having view of the map and its objectives allows a team to better coordinate its actions. The experiment suggests the opposite: vision score isn't especially important. What's also interesting is that in lower ranks kills are partially associated with vision score, in that a lack of awareness of the enemy team leads to more ally deaths. I would have thought that this would apply to higher ranks as well, but it seems not. Vision score may also fail to affect the probability of winning because players at these ranks opt not to make risky plays. Most risky plays stem from a lack of awareness, so by keeping to a low-risk environment, players don't expose themselves to the enemy team.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; I did expect that more kills and gold would lead to a higher chance of winning, since the two factors are connected. However, it makes me question what makes kills more valuable than gold. Kills lead to more gold, but gold can be gained from a variety of sources besides kills. What made the model value kills more than total team gold?

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; It does make sense that the longer a game goes, the lower the chance of winning, but it's interesting how clearly defined the trend is. The trend makes sense in that players gain power over the course of the game, making it harder to close out a win efficiently. This holds especially true given champions that gain power without limit as the game goes on. However, such a defined trend might say something about the characters often played in these higher divisions. It suggests that team compositions in these ranks are designed to finish a game quickly. The starting probability in the partial dependence plot for game duration decreases as rank increases, from about 0.83 for Master to about 0.78 for Challenger, suggesting that more of these efficient team compositions are pitted against each other as the ranks increase.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Compared with the previous studies, the postgame models performed comparably to or better than those in the Hall and Lin studies, though the metrics reported are not identical. This supports the running theme that Random Forests are a good algorithm for predicting wins, though not better than Logistic Regression as noted in Quintana's study. It's difficult to make a direct comparison between the results of this study and those of previous studies, mainly because the datasets come from different times. The issue is made worse by Riot Games releasing patches for the game every two weeks, which makes values harder to compare. At one point in the game's history, gold might have carried greater importance than kills, or game duration might have had the opposite effect on classifying wins. The studies also differ in the factors they use.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The precision of pregame predictions being less than 60% is disappointing in that games can't be predicted accurately with only champion mastery and account level, but it's very good news for players of the game. The low precision suggests that the game is fair, giving both teams a roughly equal probability of winning before the game starts. However, it would be valuable to have access to other pregame features, such as laptop type, average internet ping, latency, and the use of external assistive programs such as Overwolf, OP.gg, or Mobalytics. I feel these factors could improve the accuracy of pregame prediction, as well as of classifying postgame matches into wins and losses. Having access to them would also make the model more realistic and considerate of factors outside of the domain of League of Legends. Given the facilities to gather this information, the effect of ping on game outcomes would be analyzed in conjunction with these factors.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Due to time constraints, the experiments were limited in that the models were not put through k-fold cross-validation. In addition, looking at more machine learning algorithms, such as Logistic Regression, Bagging, and Boosted Trees, would add more information for deciding which approach to use when tackling prediction and classification in League of Legends. Given more time, it would be interesting to repeat these experiments on the lower skill groups to analyze how the importance of game features changes across the skill ladder. That would offer more insight into the importance of vision, or whether vision is even that important in the game. I fear, though, that there would be issues with players taking the game less seriously, as well as with higher skilled players being placed into lower tiers, skewing the results.
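For reference, the k-fold cross-validation mentioned above as future work could start from a fold generator like the one sketched here. This was not part of the experiments; the function name is an assumption, and in practice Scikit-Learn's `KFold` or `cross_val_score` from `sklearn.model_selection` would likely be used instead.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds.

    Rows are assumed to be shuffled already, as in the data preparation step.
    """
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, valid
        start += size


folds = list(k_fold_indices(10, 5))
for train_idx, valid_idx in folds:
    print(valid_idx, "held out;", len(train_idx), "rows used for training")
```

Each row is held out exactly once, so a hyperparameter such as the number of predictors would be judged on the average score across folds rather than on a single validation set.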

## **Conclusion**

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Kills, Total Team Gold, Champion Mastery, and Game Duration are the strongest determiners of a game's outcome. Higher kills, team gold, and champion mastery increase the chances of winning, whereas dragging the game on lowers them. Using random forests, postgame classification reached AUC scores of over 0.9, while prediction from pregame factors alone reached AUC scores of over 0.6. The pregame models did not perform as well as hoped; however, this low performance suggests that the game is currently fair. Both teams are offered a roughly even chance of winning before the game starts, even with differing skill on their respective champions.


## **References**

Quintana, Diego Angulo. “Predicting Wins in League of Legends.” RPubs, 30 Aug. 2019, https://rpubs.com/diegolas/LogisticLoL.

Hall, Kenneth. “Deep Learning for League of Legends Match Prediction.” GitHub, Minihat/LoL-Match-Prediction, 11 Dec. 2017, https://github.com/minihat/LoL-Match-Prediction/blob/master/FINAL%20REPORT.pdf.

Lin, Lucas. League of Legends Match Outcome Prediction, 2016, http://cs229.stanford.edu/proj2016/report/Lin-LeagueOfLegendsMatchOutcomePrediction-report.pdf. 

L. M. Costa, R. G. Mantovani, F. C. Monteiro Souza and G. Xexéo, "Feature Analysis to League of Legends Victory Prediction on the Picks and Bans Phase," 2021 IEEE Conference on Games (CoG), 2021, pp. 01-05, doi: 10.1109/CoG52621.2021.9619019.

Hubbard, Chandler. Esports Win Probability: A Role Specific Look into League of Legends, Samford University, 27 May 2020, https://www.samford.edu/sports-analytics/fans/2020/Esports-Win-Probability-A-Role-Specific-Look-into-League-of-Legends. 

“Vision Score.” Vision Score | League of Legends Wiki | Fandom, https://leagueoflegends.fandom.com/wiki/Vision_score. 

Çakır, Gökhan. “What's the Highest Level in League of Legends?” Dot Esports, 1 June 2021, https://dotesports.com/league-of-legends/news/whats-the-highest-level-in-league-of-legends.

“Welcome to RiotWatcher's Documentation!” RiotWatcher 3.2.2 Documentation, https://riot-watcher.readthedocs.io/en/latest/index.html.