# League of Legends Esports 2025 - Exploratory Data Analysis

This notebook performs comprehensive exploratory data analysis on professional League of Legends match data from the 2025 season.

## Objectives
- Load and clean the 2025 esports dataset
- Analyze player performance metrics
- Examine champion meta and pick/ban patterns
- Compare team performances
- Visualize key trends and insights

## 1. Setup and Imports

In [None]:
""" 
Imports all required libraries for data analysis and visualization.

Returns:
    None: Imports libraries into the notebook environment.
"""

import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings

# Silence library warnings (e.g. deprecation notices) to keep cell output readable
warnings.filterwarnings('ignore')

# Add parent directory to path for imports
sys.path.append('..')

from src.data_loader import DataLoader
from src.eda_analysis import EsportsEDA

# Configure visualization settings
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette('husl')
%matplotlib inline

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

print("✓ Libraries imported successfully")

## 2. Load Data

In [None]:
"""
Loads the 2025 LoL Esports dataset from local storage or Google Drive.

Returns:
    pd.DataFrame: Raw esports match data for 2025 season.
"""

# Initialize data loader
loader = DataLoader()

# Load 2025 data
print("Loading 2025 LoL Esports data...")
df_raw = loader.load_year_data(2025, download_if_missing=True)

print("\n✓ Data loaded successfully!")
print(f"Dataset shape: {df_raw.shape}")
print(f"Columns: {len(df_raw.columns)}")

## 3. Initial Data Exploration

In [None]:
"""
Displays basic information about the dataset structure and content.

Returns:
    None: Prints dataset information and first few rows.
"""

# Display first few rows
print("First 5 rows of the dataset:")
display(df_raw.head())

# Data info
print("\nDataset Info:")
df_raw.info()

In [None]:
"""
Examines column names and data types in the dataset.

Returns:
    None: Prints column names and types.
"""

print("Column Names:")
for i, col in enumerate(df_raw.columns, 1):
    print(f"{i:2d}. {col}")

In [None]:
"""
Generates summary statistics for numeric columns.

Returns:
    pd.DataFrame: Descriptive statistics for numerical features.
"""

print("Summary Statistics:")
display(df_raw.describe())

## 4. Data Cleaning and Preprocessing

In [None]:
"""
Initializes EDA analyzer and cleans the dataset.

Returns:
    pd.DataFrame: Cleaned dataset ready for analysis.
"""

# Initialize EDA analyzer
eda = EsportsEDA(df_raw)

# Clean data
df_clean = eda.clean_data()

print(f"\nCleaned dataset shape: {df_clean.shape}")
print(f"Rows removed: {len(df_raw) - len(df_clean)}")

In [None]:
"""
Checks for missing values in the cleaned dataset.

Returns:
    pd.DataFrame: Count and percentage of missing values per column.
"""

missing_data = df_clean.isnull().sum()
missing_percent = (missing_data / len(df_clean)) * 100

missing_df = pd.DataFrame({
    'Missing_Count': missing_data,
    'Percentage': missing_percent
})

missing_df = missing_df[missing_df['Missing_Count'] > 0].sort_values('Missing_Count', ascending=False)

if len(missing_df) > 0:
    print("Columns with missing values:")
    display(missing_df)
else:
    print("✓ No missing values found in any column!")

## 5. Player Performance Analysis

In [None]:
"""
Analyzes and ranks player performance based on KDA and other metrics.

Returns:
    pd.DataFrame: Player statistics sorted by performance.
"""

player_stats = eda.analyze_player_performance(top_n=30)
display(player_stats.head(20))
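
The ranking above depends on how `EsportsEDA.analyze_player_performance` defines KDA, which this notebook does not show. Below is a minimal sketch of one common definition, (kills + assists) / max(deaths, 1), computed on season totals. The column names `playername`, `kills`, `deaths`, and `assists`, the `player_kda` helper, and the sample rows are all assumptions for illustration, not the project's actual implementation.

```python
import pandas as pd

# Hypothetical data: one row per player per game (column names are assumed).
games = pd.DataFrame({
    "playername": ["A", "A", "B", "B", "B"],
    "kills":      [5, 3, 2, 0, 4],
    "deaths":     [1, 0, 4, 2, 2],
    "assists":    [7, 9, 3, 5, 6],
})


def player_kda(df: pd.DataFrame, min_games: int = 1) -> pd.DataFrame:
    """Sum stats per player, then compute KDA = (K + A) / max(D, 1)."""
    totals = df.groupby("playername").agg(
        games=("kills", "size"),
        kills=("kills", "sum"),
        deaths=("deaths", "sum"),
        assists=("assists", "sum"),
    )
    # Drop players with too few games to give a stable ratio.
    totals = totals[totals["games"] >= min_games].copy()
    # Clamp deaths to 1 so a deathless player does not divide by zero.
    totals["kda"] = (totals["kills"] + totals["assists"]) / totals["deaths"].clip(lower=1)
    return totals.sort_values("kda", ascending=False)


kda_table = player_kda(games)
print(kda_table)
```

Computing the ratio on season totals rather than averaging per-game KDA keeps a single deathless game from inflating a player's ranking.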

In [None]:
"""
Visualizes top players by KDA ratio.

Returns:
    None: Displays bar chart of top players.
"""

eda.visualize_top_players_kda(top_n=20)

## 6. Champion Meta Analysis

In [None]:
"""
Analyzes champion pick rates, win rates, and meta trends.

Returns:
    pd.DataFrame: Champion statistics including picks and win rates.
"""

champion_stats = eda.analyze_champion_meta(top_n=30)
display(champion_stats.head(25))

In [None]:
"""
Visualizes most picked champions in professional play.

Returns:
    None: Displays bar chart of champion pick rates.
"""

eda.visualize_champion_pickrate(top_n=25)

In [None]:
"""
Analyzes win rates for most popular champions.

Returns:
    None: Displays champions with highest win rates (min 20 picks).
"""

# Require a minimum number of picks so a handful of games cannot dominate the ranking
MIN_PICKS = 20
popular_champions = champion_stats[champion_stats['games_picked'] >= MIN_PICKS].copy()
top_winrate = popular_champions.nlargest(20, 'win_rate')

print(f"Champions with Highest Win Rates (min {MIN_PICKS} picks):")
display(top_winrate[['champion', 'games_picked', 'win_rate', 'avg_kda']])
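
For readers without access to `src/eda_analysis.py`, the following sketch shows one way `games_picked` and `win_rate` could be derived from raw rows, and why the minimum-pick filter matters. It assumes hypothetical columns `champion` and `result` (1 for a win, 0 for a loss) with one row per pick. The `champion_table` helper and the sample data are illustrative only.

```python
import pandas as pd

# Hypothetical data: one row per champion pick (column names are assumed).
picks = pd.DataFrame({
    "champion": ["Azir", "Azir", "Azir", "Rumble", "Rumble", "Vi"],
    "result":   [1, 0, 1, 1, 1, 0],
})


def champion_table(df: pd.DataFrame, min_picks: int = 1) -> pd.DataFrame:
    """Count picks and average the win flag per champion."""
    stats = df.groupby("champion")["result"].agg(
        games_picked="size",  # number of rows, i.e. picks
        win_rate="mean",      # mean of a 0/1 flag is the win fraction
    )
    # Champions below the threshold are dropped before ranking.
    stats = stats[stats["games_picked"] >= min_picks]
    stats = stats.sort_values(["win_rate", "games_picked"], ascending=False)
    return stats.reset_index()


table = champion_table(picks, min_picks=2)
print(table)
```

With `min_picks=2`, the single-game champion is excluded. Without the filter, a champion picked once and won once would show a 100% win rate and top the list.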

## 7. Position-Based Analysis

In [None]:
"""
Compares performance metrics across different positions/roles.

Returns:
    pd.DataFrame: Average statistics for each position.
"""

position_stats = eda.analyze_position_metrics()
display(position_stats)

In [None]:
"""
Visualizes performance metric distributions across positions.

Returns:
    None: Displays box plots comparing positions.
"""

eda.visualize_position_comparison()

## 8. Team Performance Analysis

In [None]:
"""
Analyzes and ranks team performance based on win rates.

Returns:
    pd.DataFrame: Team statistics sorted by win rate.
"""

team_stats = eda.analyze_team_performance(top_n=25)
display(team_stats.head(20))

## 9. Game Duration Impact

In [None]:
"""
Analyzes how game duration affects performance metrics.

Returns:
    None: Displays visualization of duration impact.
"""

eda.analyze_game_duration_impact()
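
One way to study duration effects by hand is to bucket games by length and compare a metric across buckets. The sketch below assumes a hypothetical `gamelength` column measured in seconds and a `kills` column. The bin edges and sample values are arbitrary choices for illustration and may differ from what `analyze_game_duration_impact` does internally.

```python
import pandas as pd

# Hypothetical data: one row per team per game; `gamelength` assumed to be in seconds.
games = pd.DataFrame({
    "gamelength": [1500, 1620, 1850, 2100, 2400, 2650],
    "kills":      [10, 14, 12, 18, 22, 25],
})

minutes = games["gamelength"] / 60

# Left-closed bins: [0, 25), [25, 30), [30, 35), [35, 40), [40, inf)
bins = [0, 25, 30, 35, 40, float("inf")]
labels = ["<25", "25-30", "30-35", "35-40", "40+"]
games["duration_bucket"] = pd.cut(minutes, bins=bins, labels=labels, right=False)

# observed=True keeps only buckets that actually contain games.
by_bucket = games.groupby("duration_bucket", observed=True)["kills"].agg(["size", "mean"])
print(by_bucket)
```

Reporting the bucket size next to the mean makes it easier to spot buckets with too few games to support a conclusion.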

## 10. Correlation Analysis

In [None]:
"""
Creates correlation heatmap of performance metrics.

Returns:
    None: Displays correlation matrix visualization.
"""

eda.create_correlation_heatmap()
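
To inspect the numbers behind a heatmap, the correlation matrix can be computed directly and flattened into a ranked list of metric pairs. This is a sketch on made-up data. The column names `kills`, `deaths`, `assists`, and `totalgold` are assumptions, and `create_correlation_heatmap` may select different metrics or a different correlation method.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric metrics (column names and values are assumed).
metrics = pd.DataFrame({
    "kills":     [2, 5, 7, 1, 9],
    "deaths":    [4, 2, 1, 5, 3],
    "assists":   [6, 3, 8, 2, 5],
    "totalgold": [7000, 10000, 12000, 6000, 14000],
})

# Pearson correlation between every pair of columns.
corr = metrics.corr(method="pearson")

# Keep only the strict upper triangle so each pair appears once
# and the trivial self-correlations on the diagonal are dropped.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack().dropna()

# Rank pairs by absolute strength, strongest first.
pairs = pairs.sort_values(key=abs, ascending=False)
print(pairs)
```

The same `corr` frame can be passed to `sns.heatmap(corr, annot=True)` to draw the matrix. Correlation here only indicates that two metrics move together, not that one drives the other.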

## 11. Custom Analysis

Use the cells below for your own custom analysis and exploration.

In [None]:
"""
Custom analysis space - add your own explorations here.

Returns:
    None: Space for user-defined analysis.
"""

# Example: Analyze specific league or region
if 'league' in df_clean.columns:
    print("Available Leagues:")
    print(df_clean['league'].value_counts())

## Summary

This notebook provided a comprehensive exploratory data analysis of the 2025 League of Legends esports season, covering:

1. **Data Loading & Cleaning** - Imported and preprocessed match data
2. **Player Analysis** - Identified top performers and calculated KDA metrics
3. **Champion Meta** - Analyzed pick rates, win rates, and champion performance
4. **Position Comparison** - Compared role-specific statistics
5. **Team Performance** - Ranked teams by win rates and statistics
6. **Game Duration** - Examined impact of game length on metrics
7. **Correlations** - Identified relationships between performance metrics

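Visualizations are displayed inline above. No cell in this notebook saves files explicitly, so whether plots and processed data are also written to output directories depends on the `EsportsEDA` implementation.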