# Analysis of Bundesliga

We will analyse data on the top 7 Bundesliga teams with respect to the following aspects:
1. Proportion of German players
2. Aggressive Defenders
3. Attacking Defenders
4. Goal scoring goalkeepers
5. Penalty Attempts

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
bundesliga_file_path = '../input/bundesliga-top-7-teams-offensive-stats/bundesliga_top7_offensive.csv'

bundesliga = pd.read_csv(bundesliga_file_path)
# Flag German players with a 'Yes'/'No' label (anything other than 'GER' counts as international)
bundesliga['German_National'] = np.where(bundesliga['Nationality'] == 'GER', 'Yes', 'No')

bundesliga.head()


### Proportion of German players in top 7 teams

**Why do the top 7 teams have more international players than German players?**

Every team except **Union Berlin** has more international players than German players, which might not be the case for the other Bundesliga teams. One possible reason is that the top 7 teams earn more revenue and can therefore afford to sign more international talent.

In [None]:
player_count = bundesliga.groupby(['Club', 'German_National']).size().unstack()
player_count['%german_players'] = (player_count['Yes'])/(player_count['No']+player_count['Yes'])
player_count['%international_players'] = 1 - player_count['%german_players']
player_count.drop(['Yes', 'No'], axis=1, inplace=True)
_ = player_count.plot.bar(figsize=(8,6), stacked=True)
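
The same per-club shares can also be computed in one step with `pd.crosstab` and `normalize='index'`. This is only a minimal sketch on a small made-up frame: the club names and rows are illustrative placeholders, not values from the dataset.

```python
import pandas as pd

# Toy data for illustration only; not taken from the Bundesliga dataset.
toy = pd.DataFrame({
    'Club': ['Club A', 'Club A', 'Club A', 'Club B', 'Club B'],
    'Nationality': ['GER', 'FRA', 'GER', 'ESP', 'GER'],
})
toy['German_National'] = toy['Nationality'].eq('GER').map({True: 'Yes', False: 'No'})

# normalize='index' divides each row by its row total, giving per-club shares.
shares = pd.crosstab(toy['Club'], toy['German_National'], normalize='index')
print(shares)
```

Each row of `shares` sums to 1, so it could be passed straight to `plot.bar(stacked=True)`.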

### Aggressive Defenders

**Which teams are the most aggressive, and how aggressive are their defenders?**

Eintracht Frankfurt and Wolfsburg are the most aggressive teams overall, measured by cards per minute played with a red card counted as two cards. If we consider only defenders, Wolfsburg is the most aggressive team: on average its defenders get 2 cards every 9 games.

In [None]:
# Sum the numeric stats per club; numeric_only=True skips text columns such as names
total_club_stats = bundesliga.groupby('Club').sum(numeric_only=True).reset_index()
# A red card is weighted as two cards
total_club_stats['Total_Cards'] = total_club_stats['Yellow_Cards'] + 2*total_club_stats['Red_Cards']
total_club_stats['Cards_per_Minute'] = total_club_stats['Total_Cards']/total_club_stats['Mins']

# Same calculation, restricted to defenders
total_club_stats_DF = bundesliga.loc[bundesliga['Position']=='DF'].groupby('Club').sum(numeric_only=True).reset_index()
total_club_stats_DF['Total_Cards'] = total_club_stats_DF['Yellow_Cards'] + 2*total_club_stats_DF['Red_Cards']
total_club_stats_DF['Cards_per_Minute'] = total_club_stats_DF['Total_Cards']/total_club_stats_DF['Mins']

fig, axes = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(16,6))
_ = total_club_stats.sort_values(by='Cards_per_Minute', ascending=False, axis=0).plot.bar(x='Club', y='Cards_per_Minute', ax=axes[0], title='All Players')
_ = total_club_stats_DF.sort_values(by='Cards_per_Minute', ascending=False, axis=0).plot.bar(x='Club', y='Cards_per_Minute', ax=axes[1], title='Only Defenders')
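
Per-minute rates are hard to read, so it can help to convert them into "games per event". The helper below is a hypothetical sketch, not part of the original analysis. It assumes a game lasts 90 minutes and that `Mins` is summed over players, so one "game" here means 90 player-minutes. The input numbers are chosen for illustration.

```python
MINUTES_PER_GAME = 90  # assumption: a full match, ignoring stoppage time


def games_per_event(events, minutes):
    """Return the average number of 90-minute games played per event."""
    if events == 0:
        return float('inf')  # no events recorded in the minutes played
    return minutes / (events * MINUTES_PER_GAME)


# Illustrative numbers: 2 cards in 810 minutes (9 games) is one card every 4.5 games.
print(games_per_event(2, 810))
```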

### Attacking Defenders

**Which team has the most attacking defenders?**

Borussia Dortmund has the most attacking defenders, very closely followed by RB Leipzig. Borussia Dortmund's defenders score one goal every 9 games.

In [None]:
# Sum the numeric stats of defenders per club
total_club_stats = bundesliga.loc[bundesliga['Position']=='DF'].groupby('Club').sum(numeric_only=True).reset_index()
total_club_stats['Goals_per_Minute'] = total_club_stats['Goals'] / total_club_stats['Mins']
_ = total_club_stats.sort_values(by='Goals_per_Minute', ascending=False, axis=0).plot.bar(x='Club', y='Goals_per_Minute')

### Goal scoring goalkeepers

It's really surprising to see that no goalkeeper in the top 7 teams has scored or assisted a goal.

In [None]:
no_of_goalkeepers = bundesliga[(bundesliga['Position'] == 'GK') & ((bundesliga['Goals'] > 0) | (bundesliga['Assists'] > 0))].shape[0]
no_of_goalkeepers

### Penalty Attempts

**Which teams had the most penalty attempts, and how many did they convert?**

Bayern Munich had the most attempts, whereas Eintracht Frankfurt converted every single penalty, **8/8!!**

In [None]:
_ = bundesliga.groupby('Club')[['Penalty_Attempted', 'Penalty_Goals']].sum().sort_values(by='Penalty_Attempted', ascending=False).plot(kind='bar')
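
To compare clubs directly, the attempts and goals can be turned into a conversion rate. The sketch below uses a made-up frame with the same column names. The `Conversion_Rate` column is an addition for illustration, and the values are not from the dataset.

```python
import pandas as pd

# Toy data for illustration only; not taken from the Bundesliga dataset.
toy = pd.DataFrame({
    'Club': ['Club A', 'Club A', 'Club B', 'Club C'],
    'Penalty_Attempted': [3, 1, 2, 0],
    'Penalty_Goals': [2, 1, 2, 0],
})

penalties = toy.groupby('Club')[['Penalty_Attempted', 'Penalty_Goals']].sum()

# Mask clubs with zero attempts so their rate is NaN instead of a 0/0 division.
attempts = penalties['Penalty_Attempted'].where(penalties['Penalty_Attempted'] > 0)
penalties['Conversion_Rate'] = penalties['Penalty_Goals'] / attempts
print(penalties)
```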