# League of Legends Match Predictor

### Project introduction <br/>
League of Legends, a popular multiplayer online battle arena (MOBA) game, generates extensive match data, which makes it a good test bed for applying machine learning to a real-world problem. I will build a logistic regression model that predicts the outcome of a League of Legends match.
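
To make the goal concrete, here is a minimal sketch of how a logistic regression model turns match statistics into a win probability: a weighted sum of the features is passed through the sigmoid function, and a probability of at least 0.5 is read as a predicted win. The feature values, weights, and bias below are hypothetical placeholders, not values fitted to the dataset, and `predict_win_probability` is an illustrative helper. The notebook itself imports PyTorch; this standard-library version only illustrates the prediction rule.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_win_probability(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical standardized features (for example kills, deaths, assists)
# and hypothetical weights; none of these come from a trained model.
features = [1.2, -0.5, 0.3]
weights = [0.8, -0.6, 0.4]
bias = 0.0

p_win = predict_win_probability(features, weights, bias)
predicted_label = 1 if p_win >= 0.5 else 0  # 1 = win, 0 = loss
print(f"P(win) = {p_win:.3f}, predicted label = {predicted_label}")
```

Training amounts to choosing the weights and bias so that these probabilities match the observed `win` labels as closely as possible.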

Use the [league_of_legends_data_large.csv](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/rk7VDaPjMp1h5VXS-cUyMg/league-of-legends-data-large.csv) file to perform the tasks. 

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import torch
from torch.utils.data import DataLoader, TensorDataset
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

In [2]:
LoL = pd.read_csv('league_of_legends_data_large.csv')  # Load the League of Legends dataset from a CSV file into a DataFrame

print(LoL.head())      # Print the first five rows of the DataFrame to get a quick look at the data
print(LoL.describe())  # Print summary statistics (mean, std, min, max, etc.) for numerical columns
LoL.info()             # Print a concise summary (column types, non-null counts); info() prints directly and returns None

   win  kills  deaths  assists  gold_earned   cs  wards_placed  wards_killed  \
0    0     16       6       19        17088  231            11             7   
1    1      8       8        5        14865  259            10             2   
2    0      0      17       11        15919  169            14             5   
3    0     19      11        1        11534  264            14             3   
4    0     12       7        6        18926  124            15             7   

   damage_dealt  
0         15367  
1         38332  
2         24642  
3         15789  
4         40268  
              win        kills       deaths      assists   gold_earned  \
count  1000.00000  1000.000000  1000.000000  1000.000000   1000.000000   
mean      0.51000     9.332000     9.487000     9.395000  12433.808000   
std       0.50015     5.798569     5.773488     5.765086   4388.138751   
min       0.00000     0.000000     0.000000     0.000000   5002.000000   
25%       0.00000     4.000000     4.0000

In [4]:
# Split the data into features and target
X = LoL.drop('win', axis=1)  # Features: all columns except 'win'
y = LoL['win']               # Target: the 'win' column

# Split the data into training, validation, and test sets
# First split: hold out 20% of the data for validation; the remaining 80% goes to training and testing
X_temp, X_val, y_temp, y_val = train_test_split(X, y, test_size=0.20, random_state=42)  # 20% validation, 80% temp

# Second split: divide the remaining 80% into 75% train and 25% test (60% train and 20% test of the full dataset)
X_train, X_test, y_train, y_test = train_test_split(X_temp, y_temp, test_size=0.25, random_state=42)  # 25% of 80% is 20%
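
As a quick check of the split arithmetic, the sketch below reproduces the 60/20/20 set sizes using only the standard library. It assumes the 1000 rows reported by `describe()` above, and it assumes that `train_test_split` rounds a fractional `test_size` up to a whole number of rows. `split_sizes` is a hypothetical helper written for this illustration, not part of scikit-learn.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_kept, n_held_out) for a fractional test_size.

    Assumption: the held-out count is rounded up, and the kept set
    receives whatever rows remain.
    """
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out


n_total = 1000  # row count taken from the describe() output

# First split: 20% validation, 80% kept for training and testing
n_temp, n_val = split_sizes(n_total, 0.20)

# Second split: 25% of the remaining rows become the test set
n_train, n_test = split_sizes(n_temp, 0.25)

print(f"train={n_train}, val={n_val}, test={n_test}")
print(f"train share of full dataset: {n_train / n_total:.0%}")
```

In the notebook itself, comparing `len(X_train)`, `len(X_val)`, and `len(X_test)` against these numbers confirms that the two-step split behaves as intended.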