# Exploratory Data Analysis

This notebook performs exploratory data analysis (EDA) on the processed match data. The goal is to understand the structure of the data, visualize key features, and identify patterns that may be useful for model training.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (set_theme is the current name for sns.set)
sns.set_theme(style='whitegrid')

In [2]:
# Load the processed data
data = pd.read_csv('../data/processed/match_data.csv')

# Display the first few rows of the dataset
data.head()

In [3]:
# Summary statistics
data.describe()
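
`describe()` covers only numeric columns by default and does not report missing values directly. As a complement, the sketch below tabulates dtype, missing-value count and share, and number of distinct values per column. The helper name `summarize_columns` and the toy columns (`home_goals`, `away_goals`, `venue`) are hypothetical stand-ins, not names taken from `match_data.csv`.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count and share, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(1),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Hypothetical toy frame standing in for the real match data
toy = pd.DataFrame({
    "home_goals": [2, 1, None, 3],
    "away_goals": [0, 1, 2, 2],
    "venue": ["A", "B", "A", None],
})

summary = summarize_columns(toy)
print(summary)
```

On the real dataset, `summarize_columns(data)` would presumably be the call; columns with a high `pct_missing` are candidates for imputation or removal before modelling.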

In [4]:
# Visualize the distribution of a key feature
# ('feature_name' is a placeholder; substitute an actual column name)
plt.figure(figsize=(10, 6))
sns.histplot(data['feature_name'], bins=30, kde=True)
plt.title('Distribution of Feature Name')
plt.xlabel('Feature Name')
plt.ylabel('Frequency')
plt.show()

In [5]:
# Correlation heatmap
plt.figure(figsize=(12, 8))
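# Restrict to numeric columns: corr() raises on non-numeric columns in pandas >= 2.0
correlation_matrix = data.select_dtypes(include='number').corr()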
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
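
A heatmap becomes hard to read once there are many columns, so it can help to list the strongest relationships explicitly. The following is a minimal sketch, not part of the original notebook: it ranks column pairs by absolute Pearson correlation, using each pair once. The helper name `top_correlated_pairs` and the toy columns `x`, `y`, `z`, `label` are hypothetical.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return up to n column pairs with the largest absolute Pearson correlation."""
    corr = df.select_dtypes(include="number").corr()
    cols = corr.columns
    # Indices of the strict upper triangle: each pair once, diagonal excluded
    i, j = np.triu_indices(len(cols), k=1)
    pairs = pd.Series(
        corr.to_numpy()[i, j],
        index=pd.MultiIndex.from_arrays([cols[i], cols[j]]),
    ).dropna()  # constant columns yield NaN correlations
    # Sort by magnitude but keep the sign in the returned values
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Hypothetical toy frame; 'label' is non-numeric and is ignored
toy = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 3, 4, 1, 2],
    "label": ["a", "b", "c", "d", "e"],
})

result = top_correlated_pairs(toy)
print(result)
```

With the real data this would presumably be called as `top_correlated_pairs(data, n=10)`. Pairs with very high absolute correlation may indicate redundant features worth reviewing before model training.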

## Conclusion

This exploratory analysis characterizes the match data through summary statistics, a feature distribution, and a correlation heatmap, highlighting important features and the relationships between them. These findings will guide feature engineering and model training.