# 🏐 Exploratory Data Analysis on Volleyball Nations League 2023

This project explores the statistics of players participating in the Volleyball Nations League 2023 (VNL 2023). The goal is to analyze player performance, compare countries, and find patterns in the data using data visualization and summary statistics.


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv('VNL2023.csv')

## 📄 Dataset Overview
Let's take a look at the structure and basic statistics of the dataset.

In [None]:
df.head()

In [None]:
df.describe()

In [None]:
df.isnull().sum()  # Checking missing values

In [None]:
df.duplicated().sum()  # Checking for duplicate rows
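
The missing-value and duplicate checks above can be folded into a single per-column report. The following is a minimal sketch, not part of the original notebook: `quality_report` is a hypothetical helper, and `sample` holds made-up rows so the example runs without `VNL2023.csv`.

```python
import pandas as pd


def quality_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, and missing percentage for each column."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isnull().sum(),
        "missing_pct": (frame.isnull().mean() * 100).round(2),
    })


# Made-up rows for illustration only; they are not taken from VNL2023.csv.
sample = pd.DataFrame({
    "Player": ["A", "B", "B", "C"],
    "Age": [24, 30, 30, None],
    "Attack": [10.5, 8.0, 8.0, 12.25],
})

report = quality_report(sample)
n_duplicates = int(sample.duplicated().sum())  # fully identical rows only

print(report)
print("duplicate rows:", n_duplicates)
```

On the real data, calling `quality_report(df)` would give the same table for every column of the dataset.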

## 🔗 Correlation Analysis
Checking correlations between numerical features to identify strong relationships.

In [None]:
# 'number' matches every integer and float width, regardless of platform
numeric_cols = df.select_dtypes(include='number').columns
corr_matrix = df[numeric_cols].corr()
corr_matrix

In [None]:
# Two-decimal annotations and thin cell borders keep the grid readable
sns.heatmap(corr_matrix, annot=True, fmt='.2f', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()
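
A heatmap makes strong relationships visible, but it can also help to list them explicitly. The sketch below ranks column pairs by absolute correlation. `top_correlations` is a hypothetical helper and the `sample` values are invented for illustration; only the column names are borrowed from the dataset.

```python
import numpy as np
import pandas as pd


def top_correlations(frame: pd.DataFrame, n: int = 3) -> pd.Series:
    """Return the n column pairs with the largest absolute correlation."""
    corr = frame.select_dtypes(include="number").corr()
    # Keep the strict upper triangle so each pair appears once
    # and the diagonal of self-correlations is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Invented values; the text column is ignored by select_dtypes.
sample = pd.DataFrame({
    "Player": ["A", "B", "C", "D", "E"],
    "Attack": [1, 2, 3, 4, 5],
    "Block": [2, 4, 6, 8, 11],
    "Dig": [5, 3, 4, 1, 2],
})

top = top_correlations(sample)
print(top)
```

Sorting by absolute value keeps strong negative relationships near the top instead of burying them at the bottom of the list.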

## 🧩 Player Positions Distribution

In [None]:
df['Position'].value_counts().plot(kind='pie', autopct='%.2f')
plt.title('Distribution of Player Positions')
plt.ylabel('')
plt.show()

## ⚔️ Average Attack by Country

In [None]:
avg_attack = df.groupby("Country")["Attack"].mean().sort_values(ascending=False)
avg_attack.plot(kind='bar')
plt.title('Average Attack by Country')
plt.ylabel('Average Attack')
plt.show()

## 🛡️ Total Digs by Country

In [None]:
df.groupby("Country")["Dig"].sum().sort_values(ascending=False).plot(kind="bar")
plt.title('Total Digs by Country')
plt.ylabel('Total Digs')
plt.show()
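
A total per country grows with the number of players listed for that country, so a ranking by sum and a ranking by per-player mean can disagree. The sketch below shows both side by side. The countries `X` and `Y` and their values are made up for illustration.

```python
import pandas as pd

# Made-up rows: country X lists three players, country Y lists two.
sample = pd.DataFrame({
    "Country": ["X", "X", "X", "Y", "Y"],
    "Dig": [4.0, 4.0, 4.0, 5.0, 5.0],
})

# Named aggregation: roster size, total, and per-player mean in one table.
summary = sample.groupby("Country")["Dig"].agg(
    players="count",
    total="sum",
    per_player="mean",
)

leader_by_total = summary["total"].idxmax()
leader_by_mean = summary["per_player"].idxmax()

print(summary)
print("highest total:", leader_by_total)
print("highest per-player mean:", leader_by_mean)
```

Here `X` leads on total digs only because it has more players, while `Y` leads per player. Checking the `players` column next to the totals helps avoid reading roster size as defensive strength.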

## 📉 Block vs Receive Scatter Plot

In [None]:
sns.scatterplot(x=df['Block'], y=df['Receive'])
plt.title('Block vs Receive')
plt.xlabel('Block')
plt.ylabel('Receive')
plt.show()

## 📦 Serve Performance Distribution

In [None]:
sns.boxplot(x=df['Serve'])
plt.title('Boxplot of Serve')
plt.show()
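
By default a boxplot draws points beyond 1.5 times the interquartile range (IQR) from the quartiles as individual outliers. The same rule can be applied numerically to see which rows those points are. This is a minimal sketch with invented `Serve` values, not results from the dataset.

```python
import pandas as pd

# Invented serve values; the last one is deliberately extreme.
serve = pd.Series([0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 2.5], name="Serve")

q1 = serve.quantile(0.25)
q3 = serve.quantile(0.75)
iqr = q3 - q1

# Fences of the standard 1.5 * IQR rule.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = serve[(serve < lower) | (serve > upper)]
print(f"fences: [{lower:.3f}, {upper:.3f}]")
print(outliers)
```

With the real data, replacing `serve` with `df['Serve']` and indexing `df` with the same boolean mask would return the full rows of the flagged players.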

## 🎂 Age Distribution of Players

In [None]:
plt.hist(df['Age'], bins=20, color='blue', edgecolor='black')
plt.title('Age Distribution')
plt.xlabel('Age')
plt.ylabel('Count')
plt.show()

## 📈 Serve vs Age Line Plot

In [None]:
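# Age goes on the x-axis so Serve is read as a function of age
sns.lineplot(x=df['Age'], y=df['Serve'])
plt.title('Serve vs Age')
plt.xlabel('Age')
plt.ylabel('Serve')
plt.show()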
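
When several rows share the same x value, `sns.lineplot` aggregates them by default and draws the mean with a confidence band. The sketch below computes mean `Serve` per `Age` explicitly with `groupby`, which also shows how many players stand behind each point. The values are made up for illustration.

```python
import pandas as pd

# Made-up rows: two players aged 22, two aged 25, one aged 30.
sample = pd.DataFrame({
    "Age": [22, 22, 25, 25, 30],
    "Serve": [1.0, 3.0, 2.0, 4.0, 5.0],
})

# Mean serve value and number of players for each age.
serve_by_age = sample.groupby("Age")["Serve"].agg(["mean", "count"])
print(serve_by_age)
```

Ages with a low `count` rest on very few players, so sharp jumps in the line at those ages are likely noise rather than a trend.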

## 🧱 Total Attack & Block by Country

In [None]:
total_attack_block_by_country = df.groupby("Country")[["Attack", "Block"]].sum()
total_attack_block_by_country.plot(kind="bar", colormap="viridis")
plt.title("Total Attack & Block by Country")
plt.ylabel("Total Count")
plt.show()

## ✅ Conclusion

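- Ranking countries by average `Attack` shows which teams get the most offensive output per player; a higher average suggests a more attack-oriented roster.
- A few countries lead in total `Dig`, which points to stronger floor defense, although totals also grow with the number of players listed per country.
- The `Serve` boxplot and the `Block` vs `Receive` scatter plot help separate player roles, since blocking and receiving are typically handled by different positions.
- The position pie chart and the age histogram show how positions and ages are spread across the dataset.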

This analysis provides a strong base for deeper performance or team composition analysis in sports analytics.
