# Project Title: Analyzing Performance and Market Value of Top 5 German Bundesliga Clubs (Season 2023-2024)

# 1. Introduction

This portfolio project, conducted for itzmore | Data Empowers, focuses on *Bayer 04 Leverkusen*'s performance and market dynamics throughout the 2023-2024 Bundesliga season. Using a combination of Python, SQL, and R, this analysis aims to deliver actionable insights for the club and its stakeholders, identifying key trends, drivers of success, and potential future outcomes.

Last Updated: 2024-04-14 by:

Moritz Philipp Haaf, BSc (WU) MA - Founder/CEO of itzmore | Data Empowers e.U.

*Contact:*
Mail:       moritz.haaf@itzmore.net

Tel.:       +43 664 404 38 64

Website:    https://itzmore.net

GitHub:     https://github.com/itzmore-mph

# 2. Methodology

2.1 Data Collection:
The data was sourced from the Bundesliga official website, Kaggle datasets, and sports APIs, capturing player statistics, match results, and market values.

2.2 Data Preparation:
The raw data was thoroughly cleaned and preprocessed to ensure consistency and accuracy. This stage involved handling missing values, converting data types, and merging datasets.
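
The three steps named here (handling missing values, converting data types, merging datasets) can be sketched on a toy example. The column names (`player_id`, `date`, `market_value_eur`, `name`) and the sample rows are illustrative assumptions, not the project's actual schema or data, and `prepare_valuations` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-ins for two raw tables; the schema and values are hypothetical.
valuations_raw = pd.DataFrame({
    "player_id": [1, 1, 2, 3],
    "date": ["2023-08-01", "2024-01-15", "2023-08-01", None],
    "market_value_eur": ["1000000", "1500000", None, "700000"],
})
players_raw = pd.DataFrame({
    "player_id": [1, 2, 3],
    "name": ["Player A", "Player B", "Player C"],
})

def prepare_valuations(valuations, players):
    """Clean a valuation table and attach player names (hypothetical helper)."""
    cleaned = valuations.copy()
    # Convert data types: text dates to datetime, text amounts to numbers.
    cleaned["date"] = pd.to_datetime(cleaned["date"], errors="coerce")
    cleaned["market_value_eur"] = pd.to_numeric(cleaned["market_value_eur"], errors="coerce")
    # Handle missing values: drop rows lacking a date or a value.
    cleaned = cleaned.dropna(subset=["date", "market_value_eur"])
    # Merge datasets: a left join keeps every remaining valuation row.
    return cleaned.merge(players, on="player_id", how="left")

prepared = prepare_valuations(valuations_raw, players_raw)
print(prepared)
```

Dropping incomplete rows is only one possible policy; imputing or flagging them may suit the real data better.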

2.3 Exploratory Data Analysis (EDA):
We utilized summary statistics and visualization techniques to examine player performance metrics and market value trends, conducting correlation analysis to pinpoint relationships between variables.

2.4 Performance Analysis:
We performed a comparative analysis of wins, losses, and draws to evaluate Bayer 04 Leverkusen's performance against league competitors.

2.5 Market Value Analysis:
Market value changes were visualized to understand player valuation dynamics over the season, analyzing the impact of various factors such as player performance and positions.

2.6 Predictive Modeling:
Predictive models were developed to forecast future market values of players, using historical data as a base for our projections.

2.7 SQL Analysis:
SQL queries were executed to efficiently extract specific performance and market value data from our databases.

2.8 R Programming Analysis:
We applied R for detailed statistical analysis and data visualization to supplement insights gained from Python and SQL.

2.9 Conclusion:
Our methodology combined various analytical techniques to provide a comprehensive overview of Bayer 04 Leverkusen's performance and market dynamics, offering actionable insights for strategic decision-making.



# 3. Data Collection and Preparation

The data came from the official Bundesliga website and Kaggle datasets. It was then cleaned and preprocessed (missing values handled, data types converted, datasets merged) to remove inconsistencies before analysis.

3.1 Importing Necessary Libraries

In [1]:
# Importing necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.io as pio
from datetime import datetime







3.2 Setting Default Renderer for Plotly

In [None]:
# Setting default renderer for Plotly
pio.renderers.default = 'notebook'

3.3 Loading CSV Files

In [None]:
# File path to CSV files
file_path = "itzmore-mph/portfolio-github-mph/Football-Analysis_German-Bundesliga/"

# List of data files loaded
appearances = pd.read_csv(file_path + "appearances.csv")
club_games = pd.read_csv(file_path + "club_games.csv")
clubs = pd.read_csv(file_path + "clubs.csv")
competitions = pd.read_csv(file_path + "competitions.csv")
game_events = pd.read_csv(file_path + "game_events.csv")
game_lineups = pd.read_csv(file_path + "game_lineups.csv")
games = pd.read_csv(file_path + "games.csv")
player_valuations = pd.read_csv(file_path + "player_valuations.csv")
players = pd.read_csv(file_path + "players.csv")

# 4. Enhanced Data Overview

4.1 Detailed Dataset Descriptions

In [None]:
print("Detailed description of the 'players' dataset:")
players.info()
print(players.describe())



# 5. Dynamic Team Analysis Setup

5.1 Selecting Team and Season for Detailed Analysis

In [None]:
team_name = 'Leverkusen'  # Parameterized for user selection
season_year = 2023

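# NOTE: `filter_by_team_and_season` is not defined in this section and must be defined
# before this cell runs; `players` is the DataFrame loaded in section 3.3.
selected_team_data = filter_by_team_and_season(players, team_name, season_year)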
print(f"Analysis for {team_name} in the {season_year}/{season_year+1} season:")
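
The cell above calls `filter_by_team_and_season`, which is not defined anywhere in this notebook and has to exist before that cell runs. A minimal sketch of such a helper follows, assuming the table has one column holding the club name and one holding the season year. The default column names `current_club_name` and `last_season`, and the demo rows, are assumptions for illustration.

```python
import pandas as pd

def filter_by_team_and_season(df, team_name, season_year,
                              team_col="current_club_name", season_col="last_season"):
    """Keep rows whose team column contains `team_name` and whose season column equals `season_year`.

    The default column names are assumptions; pass the names your table actually uses.
    """
    # Case-insensitive substring match, so 'Leverkusen' matches 'Bayer 04 Leverkusen'.
    team_match = df[team_col].astype(str).str.contains(team_name, case=False, regex=False)
    season_match = df[season_col] == season_year
    return df[team_match & season_match].copy()

# Hypothetical rows for illustration only.
demo_players = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C"],
    "current_club_name": ["Bayer 04 Leverkusen", "Another Club", "Bayer 04 Leverkusen"],
    "last_season": [2023, 2023, 2022],
})
demo_selection = filter_by_team_and_season(demo_players, "Leverkusen", 2023)
print(demo_selection)
```

A substring match is convenient for short names but can over-match, so filtering on a club identifier is safer when one is available.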


# 6. Exploratory Data Analysis (EDA)

6.1 Summary Statistics

In [None]:
summary_statistics = player_valuations.describe()
print(summary_statistics)


6.2 Visualization: Interactive Market Value Trend

Using Plotly for dynamic, interactive visualizations:

In [None]:
fig = px.line(selected_team_data, x='date', y='market_value_eur', title=f'Market Value Trends for {team_name}')
fig.show()



6.3 Correlation Analysis

Visualizing the correlation matrix to identify relationships:

In [None]:
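# numeric_only=True skips text columns (such as dates), which would otherwise raise an error in pandas 2.x
correlation_matrix = player_valuations.corr(numeric_only=True)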
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Matrix')
plt.show()



6.4 Productivity and Discipline Metrics

In [None]:
# Calculating and visualizing productivity and discipline
players['productivity'] = players['goals'] + players['assists']
players['discipline_score'] = players['yellow_cards'] + 2 * players['red_cards']

fig = px.scatter(players, x='productivity', y='discipline_score', color='age',
                 size='market_value_eur', hover_data=['player_name'],
                 title='Player Productivity vs. Discipline')
fig.show()
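
The cell above expects `goals`, `assists`, `yellow_cards`, and `red_cards` to be columns of `players`. If those counts are stored per match instead (for example in an appearances-style table), they would first need to be summed per player. The sketch below shows one way to do that; the schema and the demo rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical per-match rows; the schema is assumed for illustration.
appearances_demo = pd.DataFrame({
    "player_id": [1, 1, 2, 2, 2],
    "goals": [1, 0, 0, 2, 1],
    "assists": [0, 1, 0, 0, 1],
    "yellow_cards": [0, 1, 1, 0, 0],
    "red_cards": [0, 0, 0, 1, 0],
})
players_demo = pd.DataFrame({
    "player_id": [1, 2, 3],
    "player_name": ["Player A", "Player B", "Player C"],
})

stat_cols = ["goals", "assists", "yellow_cards", "red_cards"]
# Sum the per-match counts into one row per player.
season_totals = appearances_demo.groupby("player_id", as_index=False)[stat_cols].sum()
# Left join keeps players without appearances; their missing totals become zero.
players_with_stats = players_demo.merge(season_totals, on="player_id", how="left")
players_with_stats[stat_cols] = players_with_stats[stat_cols].fillna(0).astype(int)

# Same formulas as the notebook cell: a red card counts twice as much as a yellow card.
players_with_stats["productivity"] = players_with_stats["goals"] + players_with_stats["assists"]
players_with_stats["discipline_score"] = (
    players_with_stats["yellow_cards"] + 2 * players_with_stats["red_cards"]
)
print(players_with_stats)
```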


Interpretation:

These analyses provide insights into the distribution and relationships of player performance metrics and market values. The interactive visualizations facilitate a deeper understanding of the data, highlighting trends and outliers that are critical for strategic decisions.

# 7. Performance Analysis

7.1 Comparative Analysis of Key Performance Indicators

We conducted a detailed comparison of key performance indicators to gauge Bayer 04 Leverkusen's standings in the league.
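
The bar plots in this section use `wins`, `losses`, and `draws`, which are not computed anywhere in the notebook. A minimal sketch of how they could be derived from per-club match rows follows. It assumes columns named `club_id`, `own_goals` (goals scored by the club itself), and `opponent_goals`; the club id `15` and all rows are made up for illustration.

```python
import pandas as pd

# Hypothetical match rows from each club's point of view; the column names are assumed.
club_games_demo = pd.DataFrame({
    "club_id": [15, 15, 15, 15, 27],
    "own_goals": [3, 1, 0, 2, 1],       # goals scored by the club itself
    "opponent_goals": [2, 1, 1, 0, 0],
})

def count_results(club_games, club_id):
    """Return (wins, losses, draws) for one club from per-club match rows."""
    rows = club_games[club_games["club_id"] == club_id]
    goal_diff = rows["own_goals"] - rows["opponent_goals"]
    wins = int((goal_diff > 0).sum())
    losses = int((goal_diff < 0).sum())
    draws = int((goal_diff == 0).sum())
    return wins, losses, draws

wins, losses, draws = count_results(club_games_demo, club_id=15)
win_percentage = 100 * wins / (wins + losses + draws)
print(wins, losses, draws, win_percentage)
```

Calling the same function for each competitor's id yields the figures needed for a league-wide comparison.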

Using bar plots to visualize performance metrics:

In [None]:
sns.barplot(x=['Wins', 'Losses', 'Draws'], y=[wins, losses, draws], palette='viridis')
plt.title('Performance Metrics of Bayer 04 Leverkusen')
plt.show()



Interpretation:

This analysis helps us understand the club's competitive performance, identifying strengths and areas for improvement relative to league rivals.

7.2 Match Results Visualization

In [None]:
# Bar chart of match outcomes; `wins`, `losses`, and `draws` must already be defined
plt.figure(figsize=(10, 6))
sns.barplot(x=['Wins', 'Losses', 'Draws'], y=[wins, losses, draws], palette='viridis')
plt.xlabel('Performance Metrics')
plt.ylabel('Count')
plt.title('Performance Metrics of Bayer 04 Leverkusen')
plt.show()


Interpretation:
Analyzing total matches, wins, losses, draws, and win percentage provides insights into the club's performance in the Bundesliga.
Comparing performance metrics with other clubs helps identify strengths and weaknesses.

# 8. Market Value Analysis

8.1 Visualization of Market Value Changes Over Time

We analyzed fluctuations in player market values to identify trends and influencing factors.
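
The plots in this section read from a `leverkusen_valuations` table with `Date` and `Market_Value` columns, whose construction is not shown. One plausible way to build it is to sum player values per valuation date, as sketched below. The input schema (`club`, `date`, `market_value_eur`), the `Pct_Change` column, and all numbers are assumptions for illustration.

```python
import pandas as pd

# Hypothetical valuation rows; schema and numbers are for illustration only.
valuations_demo = pd.DataFrame({
    "player_id": [1, 2, 1, 2, 3],
    "club": ["Bayer 04 Leverkusen"] * 4 + ["Another Club"],
    "date": ["2023-10-01", "2023-10-01", "2024-03-01", "2024-03-01", "2024-03-01"],
    "market_value_eur": [10_000_000, 20_000_000, 15_000_000, 30_000_000, 5_000_000],
})

club_rows = valuations_demo[valuations_demo["club"] == "Bayer 04 Leverkusen"].copy()
club_rows["date"] = pd.to_datetime(club_rows["date"])

# One row per valuation date: the summed value of all players valued on that date.
leverkusen_valuations = (
    club_rows.groupby("date", as_index=False)["market_value_eur"].sum()
    .rename(columns={"date": "Date", "market_value_eur": "Market_Value"})
    .sort_values("Date")
)
# Change between consecutive valuation dates, in percent.
leverkusen_valuations["Pct_Change"] = leverkusen_valuations["Market_Value"].pct_change() * 100
print(leverkusen_valuations)
```

Summing per date is only meaningful when all players are revalued on the same dates; otherwise each player's latest known value would need to be carried forward first.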

Line plot of market value over time:

In [None]:
sns.lineplot(x='Date', y='Market_Value', data=leverkusen_valuations, color='b')
plt.title('Market Value Trend of Bayer 04 Leverkusen')
plt.xticks(rotation=45)
plt.grid(True)
plt.show()

8.2 Factors Influencing Market Value Fluctuation

Scatter plot to explore the relationship between performance metrics and market values:

In [None]:
sns.scatterplot(x='Performance_Metric', y='Market_Value', data=leverkusen_valuations, hue='Player_Position')
plt.title('Factors Influencing Market Value Fluctuations')
plt.show()


Interpretation:

Understanding market value trends is crucial for financial planning and player investment strategies. This analysis aids in identifying potential opportunities for optimal player acquisition and sales.

# 9. Predictive Modeling

9.1 Model Evaluation

Predictive modeling was used to forecast future market values, providing a basis for strategic planning.
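
The notebook does not show which model was used or how `y_test` and `y_pred` were produced. As a stand-in, the sketch below fits a straight-line trend with NumPy and evaluates it on a chronological hold-out. The series is made up, and the linear model is an illustrative assumption rather than the project's actual method.

```python
import numpy as np

# Hypothetical monthly market values (million EUR), for illustration only.
values = np.array([20.0, 22.0, 24.5, 26.0, 28.5, 30.0, 32.5, 34.0, 36.5, 38.0])
months = np.arange(len(values))

# Chronological split: fit on the first 80 percent, evaluate on the remainder.
split = int(len(values) * 0.8)
x_train, x_test = months[:split], months[split:]
y_train, y_test = values[:split], values[split:]

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x_train, y_train, deg=1)
y_pred = slope * x_test + intercept

# Mean absolute error on the held-out months.
mae = float(np.mean(np.abs(y_test - y_pred)))
print(f"slope={slope:.3f}, intercept={intercept:.3f}, MAE={mae:.3f}")
```

Splitting by time rather than at random keeps later observations out of the training set, which matters when the goal is forecasting.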

Plotting actual vs. predicted values to evaluate model performance:

In [None]:
plt.plot(y_test, label='Actual')
plt.plot(y_pred, label='Predicted')
plt.title('Actual vs. Predicted Market Value')
plt.legend()
plt.show()


9.2 Forecasting Future Market Values

Forecasting future market values to aid strategic planning:

In [None]:
plt.plot(leverkusen_valuations['Date'], leverkusen_valuations['Market_Value'], label='Actual')
plt.plot(future_dates, future_forecast, label='Forecast')
plt.title('Forecasting Future Market Values of Bayer 04 Leverkusen')
plt.legend()
plt.grid(True)
plt.show()
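
The plot above needs `future_dates` and `future_forecast`, which are not constructed in the notebook. The sketch below shows one way to produce them by extrapolating a linear trend over monthly dates. The history values are made up, and the linear trend is an assumption standing in for whatever model the project used.

```python
import numpy as np
import pandas as pd

# Hypothetical history; 'Date' and 'Market_Value' mirror the column names used in this notebook.
history = pd.DataFrame({
    "Date": pd.date_range("2023-08-01", periods=6, freq="MS"),
    "Market_Value": [300.0, 310.0, 320.0, 330.0, 340.0, 350.0],  # million EUR, made up
})

# Express time as days since the first observation so the trend has a numeric x-axis.
start = history["Date"].min()
days = (history["Date"] - start).dt.days.to_numpy()
slope, intercept = np.polyfit(days, history["Market_Value"].to_numpy(), deg=1)

# The next three month starts after the last observed date.
future_dates = pd.date_range(history["Date"].max() + pd.offsets.MonthBegin(1),
                             periods=3, freq="MS")
future_days = (future_dates - start).days.to_numpy()
future_forecast = slope * future_days + intercept
print(pd.Series(future_forecast, index=future_dates))
```

A straight-line extrapolation ignores transfer windows and form swings, so it is best read as a baseline to compare richer models against.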

Interpretation:

These models offer predictions that are vital for preparing for future market conditions, helping the club make informed decisions regarding player contracts and transfers.

# 10. SQL Analysis

This section leverages SQL to extract specific performance metrics and market value data, providing direct insights from the database.

10.1 Querying Performance Metrics:

In [None]:
-- Extract performance metrics for Bayer 04 Leverkusen
SELECT AVG(goals_scored) AS avg_goals_scored FROM matches WHERE team = 'Bayer 04 Leverkusen';



10.2 Extracting Market Value Data

In [None]:
-- Retrieve date and market value for players from Bayer 04 Leverkusen
SELECT
    Date,
    Market_Value
FROM
    player_valuations
WHERE
    Club = 'Bayer 04 Leverkusen';


Interpretation:

Efficient data extraction via SQL supports targeted analysis of performance metrics, providing a quick overview and detailed breakdowns as needed.
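
The two queries above can be tried from Python with the standard-library `sqlite3` module. The sketch below builds an in-memory database whose table and column names follow the queries; the choice of SQLite and all inserted rows are assumptions for illustration, since the notebook does not say which database was used.

```python
import sqlite3

# In-memory database with hypothetical rows; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (team TEXT, goals_scored INTEGER)")
conn.execute("CREATE TABLE player_valuations (Club TEXT, Date TEXT, Market_Value REAL)")
conn.executemany("INSERT INTO matches VALUES (?, ?)", [
    ("Bayer 04 Leverkusen", 3), ("Bayer 04 Leverkusen", 1), ("Another Club", 2),
])
conn.executemany("INSERT INTO player_valuations VALUES (?, ?, ?)", [
    ("Bayer 04 Leverkusen", "2023-10-01", 10.0),
    ("Bayer 04 Leverkusen", "2024-03-01", 15.0),
    ("Another Club", "2024-03-01", 5.0),
])

club = "Bayer 04 Leverkusen"
# Placeholders (?) pass the club name as a parameter instead of pasting it into the SQL text.
avg_goals_scored = conn.execute(
    "SELECT AVG(goals_scored) FROM matches WHERE team = ?", (club,)
).fetchone()[0]
valuation_rows = conn.execute(
    "SELECT Date, Market_Value FROM player_valuations WHERE Club = ? ORDER BY Date", (club,)
).fetchall()
conn.close()
print(avg_goals_scored, valuation_rows)
```

Parameterized queries make it easy to rerun the same extraction for other clubs without editing the SQL.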

# 11. R Programming Analysis

Statistical analysis and visualization in R complemented our findings, providing additional depth:

11.1 Statistical Analysis

In [None]:
# Load the Bayer 04 Leverkusen data
leverkusen_data <- read.csv("leverkusen_data.csv")

# Perform a summary of key statistics
summary(leverkusen_data)


Interpretation:

Using R for statistical analysis helps in uncovering deeper insights, such as distribution characteristics and anomalies within the data, enhancing the overall strategic understanding.

11.2 Data Visualization

In [None]:
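# Plot the market value trend using R
# as.Date() converts the text dates from read.csv (assumes YYYY-MM-DD) so the x-axis is chronological
plot(as.Date(leverkusen_data$Date), leverkusen_data$Market_Value,
     type="l", col="blue", xlab="Date", ylab="Market Value",
     main="Market Value Trend of Bayer 04 Leverkusen")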


Interpretation:

This line graph shows how player market values have trended over time. It can reveal periods of significant increase or decrease in value, which can then be compared with events such as transfers, performance peaks, or injuries. This supports strategic decisions on player contracts and investments.

# 12. Conclusion

This extensive analysis offers a comprehensive view of Bayer 04 Leverkusen's performance and market dynamics throughout the 2023-2024 Bundesliga season. Our approach combined advanced data analytics techniques using Python, SQL, and R to highlight significant patterns and determinants of the club's operational successes and areas needing improvement.

The process began with rigorous data collection and preprocessing, ensuring a robust basis for our analysis. Our exploratory efforts unveiled intricate details about player performance metrics and their market values, revealing how these elements interact within the broader competitive framework of the league. Advanced visualization tools were instrumental in these discoveries, offering clear, dynamic representations of our data that enhanced both the accessibility and the depth of our analysis.

SQL queries played a pivotal role in streamlining the extraction of precise performance metrics and market valuation data, enabling us to build a detailed quantitative foundation for further analysis. On the predictive front, our modeling efforts provided forecasts of player market values, which are essential for strategic planning. These predictions are vital for optimizing transfer and contract negotiation strategies, helping the club anticipate financial outlays and investment returns.

Moreover, employing R for detailed statistical analysis and visualization enriched our understanding of underlying trends and anomalies. This analytical depth was crucial for strategic insight, particularly in identifying and understanding the nuances of market value fluctuations and their financial implications for Bayer 04 Leverkusen.

Our findings do more than just summarize the club's current standings within the Bundesliga; they equip Bayer 04 Leverkusen and its stakeholders with actionable insights, facilitating informed decisions that optimize performance and financial outcomes. The analysis clearly demonstrates the power of data-driven strategies in enhancing competitive edge and operational efficiency in the realm of professional football.

Looking ahead, it is imperative for Bayer 04 Leverkusen to continue embracing and advancing its data analytics capabilities. The dynamic nature of football, with its ever-evolving strategies and technologies, demands a proactive and predictive approach to management. By further refining its analytical methodologies and integrating cutting-edge technologies, Bayer 04 Leverkusen can not only adapt to emerging trends but also shape them, securing a sustainable and successful future in both domestic and international arenas.