# Premier League Data Warehouse Analysis

### Analysis 01: Top 5 Teams by Total Goals Scored
**Description:** Find the top 5 teams with the highest total goals.

In [3]:
SELECT TOP 5 
    T.team_name, 
    SUM(TS.goals) AS total_goals
FROM Team_Stats TS
JOIN Teams T ON TS.team_name = T.team_name
GROUP BY T.team_name
ORDER BY total_goals DESC;

team_name,total_goals
Liverpool,84
Manchester City,69
Newcastle Utd,66
Arsenal,65
Brentford,64


### Analysis 02: Top 5 Players by Goals Scored
**Description:** Find the top 5 players who scored the most goals.

In [2]:
SELECT TOP 5 
    P.player_name, 
    SUM(PS.goals) AS total_goals
FROM Player_Stats PS
JOIN Players P ON PS.player_name = P.player_name
GROUP BY P.player_name
ORDER BY total_goals DESC;

player_name,total_goals
Mohamed Salah,28
Alexander Isak,23
Erling Haaland,21
Chris Wood,20
Bryan Mbeumo,19


### Analysis 03: Average Age of Players Per Team
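**Description:** Calculate the average age of players in each team. The output below lists the first 10 rows. The averages are whole numbers, which suggests `age` is stored as an integer: in SQL Server, `AVG` over an integer column returns a truncated integer.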

In [4]:
SELECT 
    P.team_name, 
    AVG(P.age) AS avg_age
FROM Players P
GROUP BY P.team_name
ORDER BY avg_age DESC;

team_name,avg_age
Fulham,27
Newcastle Utd,27
West Ham,27
Wolves,26
Crystal Palace,26
Everton,26
Nott'ham Forest,26
Ipswich Town,26
Leicester City,26
Liverpool,26
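
The whole-number averages above make many teams look tied. A minimal sketch of a variant that keeps two decimal places, assuming `age` is an integer column (an inference from the output, not something the notebook states):

```sql
-- Cast age to DECIMAL before averaging so the result is not truncated
-- to an integer. Assumes Players.age is an integer column.
SELECT
    P.team_name,
    CAST(AVG(CAST(P.age AS DECIMAL(5, 2))) AS DECIMAL(5, 2)) AS avg_age
FROM Players P
GROUP BY P.team_name
ORDER BY avg_age DESC;
```

With fractional averages, the `ORDER BY` can separate teams that currently share the same truncated value.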


### Analysis 04: Team with Highest Possession
**Description:** Find the team with the highest average possession.

In [5]:
SELECT TOP 1
    team_name,
    AVG(possession) AS avg_possession
FROM Team_Stats
GROUP BY team_name
ORDER BY avg_possession DESC;

team_name,avg_possession
Manchester City,61.59999847412109
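
The long decimal tail in the result is floating-point noise rather than real precision. A minimal sketch that rounds the value for display, assuming `possession` is a percentage between 0 and 100:

```sql
-- Cast the average to a fixed-precision DECIMAL so the output
-- shows two decimal places instead of floating-point noise.
SELECT TOP 1
    team_name,
    CAST(AVG(possession) AS DECIMAL(5, 2)) AS avg_possession
FROM Team_Stats
GROUP BY team_name
ORDER BY avg_possession DESC;
```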


### Analysis 05: Top 5 Teams by Salary Spending (Annual)
**Description:** Find the teams that spend the most on salaries annually.

In [6]:
SELECT TOP 5
    TS.team_name,
    SUM(TS.annual) AS total_annual_salary
FROM Team_Salaries TS
GROUP BY TS.team_name
ORDER BY total_annual_salary DESC;

team_name,total_annual_salary
Manchester City,270992872
Manchester Utd,242779887
Chelsea,231062958
Arsenal,227432965
Liverpool,172913278
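
Total spending is easier to interpret next to results. The following sketch relates each team's salary bill to its wins. It assumes that `team_name` values match between `Team_Salaries` and `Standings`, which the notebook does not confirm, and it leaves the currency unit unspecified, as the source does.

```sql
-- Salary spent per league win (lower means more wins per unit of salary).
-- Assumes team_name is spelled identically in both tables.
WITH salary AS (
    SELECT
        team_name,
        SUM(annual) AS total_annual_salary
    FROM Team_Salaries
    GROUP BY team_name
)
SELECT
    s.team_name,
    s.total_annual_salary,
    st.win,
    s.total_annual_salary * 1.0 / NULLIF(st.win, 0) AS salary_per_win
FROM salary s
JOIN Standings st ON st.team_name = s.team_name
ORDER BY salary_per_win ASC;
```

`NULLIF` guards against division by zero for a team with no wins. Such a team gets a `NULL` ratio, which SQL Server sorts first in ascending order.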


### Analysis 06: Player with Highest Expected Goals (xG)

**Description:** Find the player with the highest total xG (Expected Goals).

In [7]:
SELECT TOP 1 
    P.player_name,
    SUM(PS.expected_goals) AS total_xG
FROM Player_Stats PS
JOIN Players P ON PS.player_name = P.player_name
GROUP BY P.player_name
ORDER BY total_xG DESC;

player_name,total_xG
Mohamed Salah,24.799999237060547
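
Total xG says more when set against actual goals. A minimal sketch that ranks players by how far their goals exceed their xG, using the `goals` and `expected_goals` columns already queried above (the choice of five rows is arbitrary):

```sql
-- Positive goals_minus_xG: the player scored more than the chances suggested.
SELECT TOP 5
    PS.player_name,
    SUM(PS.goals) AS total_goals,
    CAST(SUM(PS.expected_goals) AS DECIMAL(6, 2)) AS total_xG,
    CAST(SUM(PS.goals) - SUM(PS.expected_goals) AS DECIMAL(6, 2)) AS goals_minus_xG
FROM Player_Stats PS
GROUP BY PS.player_name
ORDER BY goals_minus_xG DESC;
```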


### Analysis 07: Team Standings Summary (Win Ratio)

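**Description:** Calculate each team's win ratio (wins divided by matches played). The output below lists the first 10 rows. Teams with the same ratio appear in no guaranteed order because the query has no tie-breaker.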

In [9]:
SELECT 
    team_name,
    win,
    draw,
    loss,
    CAST(win AS FLOAT) / (win + draw + loss) AS win_ratio
FROM Standings
ORDER BY win_ratio DESC;

team_name,win,draw,loss,win_ratio
Liverpool,25,8,4,0.6756756756756757
Manchester City,20,8,9,0.5405405405405406
Newcastle Utd,20,6,11,0.5405405405405406
Nott'ham Forest,19,8,10,0.5135135135135135
Arsenal,19,14,4,0.5135135135135135
Aston Villa,19,9,9,0.5135135135135135
Chelsea,19,9,9,0.5135135135135135
Brentford,16,7,14,0.4324324324324324
Brighton,15,13,9,0.4054054054054054
Fulham,15,9,13,0.4054054054054054
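
Win ratio ignores draws, so several teams tie on it. A minimal sketch that adds points per game (3 points for a win, 1 for a draw, the standard league scoring) and uses it as a tie-breaker:

```sql
-- points_per_game separates teams with equal win ratios;
-- team_name is a final tie-breaker so the order is deterministic.
SELECT
    team_name,
    win,
    draw,
    loss,
    CAST(win AS FLOAT) / NULLIF(win + draw + loss, 0) AS win_ratio,
    (3.0 * win + draw) / NULLIF(win + draw + loss, 0) AS points_per_game
FROM Standings
ORDER BY win_ratio DESC, points_per_game DESC, team_name;
```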


### Analysis 09: Player Efficiency (Goals per 90 Minutes)

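**Description:** Calculate goals per 90 minutes for each player and list the top 10. Because the query applies no minimum-minutes threshold, players with very few minutes can rank highly.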

In [12]:
SELECT TOP 10
    P.player_name,
    SUM(PS.goals) * 90.0 / NULLIF(SUM(PS.minutes), 0) AS goals_per_90
FROM Player_Stats PS
JOIN Players P ON PS.player_name = P.player_name
GROUP BY P.player_name
HAVING SUM(PS.minutes) > 0
ORDER BY goals_per_90 DESC;


player_name,goals_per_90
Matheus França,1.666666666666
Daniel Jebbison,0.98901098901
Jáder Durán,0.987460815047
Donyell Malen,0.906040268456
Alexander Isak,0.776444111027
James Mcatee,0.773638968481
Mohamed Salah,0.768058518744
Richarlison,0.751565762004
Rodrigo Muniz,0.746887966804
Erling Haaland,0.712669683257
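
Rate statistics are unstable for small samples. A minimal sketch that restricts the ranking to players with a minimum number of minutes; the 900-minute cutoff (ten full matches) is an illustrative assumption, not a value from the source:

```sql
-- Same calculation as above, limited to players with at least 900 minutes.
-- The 900-minute threshold is an assumed cutoff; adjust as needed.
SELECT TOP 10
    P.player_name,
    SUM(PS.minutes) AS total_minutes,
    SUM(PS.goals) * 90.0 / NULLIF(SUM(PS.minutes), 0) AS goals_per_90
FROM Player_Stats PS
JOIN Players P ON PS.player_name = P.player_name
GROUP BY P.player_name
HAVING SUM(PS.minutes) >= 900
ORDER BY goals_per_90 DESC;
```

Showing `total_minutes` alongside the rate lets a reader judge how much playing time each figure rests on.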


### Analysis 10: Best Defensive Team (Fewest Goals Conceded)

**Description:** Find the team that conceded the fewest goals.

In [13]:
SELECT TOP 1
    team_name,
    conceded
FROM Standings
ORDER BY conceded ASC;

team_name,conceded
Arsenal,33
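
`TOP 1` returns a single row even when several teams share the lowest value. A minimal sketch using T-SQL's `WITH TIES` so that all teams tied for the fewest goals conceded are returned:

```sql
-- WITH TIES returns every row that ties with the last row under the ORDER BY.
SELECT TOP 1 WITH TIES
    team_name,
    conceded
FROM Standings
ORDER BY conceded ASC;
```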


### Analysis 11: Fixtures with the Highest Attendance

**Description:** List the top 5 fixtures with the highest attendance.

In [14]:
SELECT TOP 5
    home_team,
    away_team,
    match_date,
    attendance
FROM Fixtures
ORDER BY attendance DESC;

home_team,away_team,match_date,attendance
Manchester Utd,Aston Villa,2025-05-25,73839
Manchester Utd,Leicester City,2024-11-10,73829
Manchester Utd,Ipswich Town,2025-02-26,73827
Manchester Utd,Wolves,2025-04-20,73819
Manchester Utd,Everton,2024-12-01,73817


### Analysis 12: Team Carry Distance Leaders

**Description:** Find the teams with the most total distance carried.

In [15]:
SELECT TOP 5
    team_name,
    SUM(total_distance_carried) AS total_carry_distance
FROM Team_Possession_Stats
GROUP BY team_name
ORDER BY total_carry_distance DESC;


team_name,total_carry_distance
Manchester City,101274
Tottenham,81603
Chelsea,76825
Brighton,74918
Liverpool,73791


### Analysis 13: Players with Most Carries

**Description:** Identify the players with the highest number of carries.

In [16]:
SELECT TOP 5
    player_name,
    SUM(carries) AS total_carries
FROM Player_Possession_Stats
GROUP BY player_name
ORDER BY total_carries DESC;

player_name,total_carries
Joško Gvardiol,1966
Virgil van Dijk,1779
William Saliba,1756
Rúben Dias,1724
Jan Paul van Hecke,1660


### Analysis 14: Most Frequent Referee

**Description:** Find the referee who officiated the most matches.

In [17]:
SELECT TOP 1
    referee,
    COUNT(*) AS matches_officiated
FROM Fixtures
GROUP BY referee
ORDER BY matches_officiated DESC;

referee,matches_officiated
Anthony Taylor,31


### Analysis 15: Day with the Most Matches

**Description:** Find which day of the week had the most matches played.

In [18]:
SELECT TOP 1
    day,
    COUNT(*) AS matches_played
FROM Fixtures
GROUP BY day
ORDER BY matches_played DESC;

day,matches_played
Sat,179


## Final Note: Endless Analysis Opportunities

The analyses in this notebook cover only a small part of what this football dataset can support.

Many further analyses are possible, although some may need data beyond the tables queried above. Examples include:

- **Injury Impact Analysis:** Studying how player injuries affect team performance over a season.
    
- **Home vs Away Performance Comparison:** Evaluating whether teams perform better at home versus away fixtures.
    
- **Player Value vs Salary Efficiency:** Comparing player market value to their salaries to determine cost-efficiency.
    
- **Pass Network Visualization:** Analyzing and visualizing passing networks between players during matches.
    
- **Expected Points (xP) Calculation:** Predicting final league standings based on advanced metrics like xG and xA.
    
- **Managerial Impact Analysis:** Measuring changes in team performance before and after managerial changes.
    
- **Weather Conditions Effect:** Investigating if weather conditions (e.g., rain, temperature) impact match outcomes or player performance.
    
- **Fan Attendance Impact:** Evaluating whether high attendance correlates with better home team performance.
    
- **Financial Fair Play (FFP) Analysis:** Monitoring if teams comply with spending regulations related to player salaries and transfers.
    
- **Clustering and Segmentation:** Segmenting teams or players based on playing style, performance, or financial metrics using machine learning.
    

These are only a few of the possible directions for further exploration of this dataset.
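
As a starting point for the fan attendance idea listed above, the following sketch pairs each team's average home attendance with its overall win ratio. It uses only columns that appear in the queries above and assumes that `Fixtures.home_team` matches `Standings.team_name`. Note that `Standings` holds overall results, so this is not a home-only win ratio, and stadium capacity is a likely confounder.

```sql
-- Average home attendance per team next to the team's overall win ratio.
-- Assumes home_team values match Standings.team_name.
WITH home_attendance AS (
    SELECT
        home_team AS team_name,
        AVG(CAST(attendance AS FLOAT)) AS avg_home_attendance
    FROM Fixtures
    WHERE attendance IS NOT NULL
    GROUP BY home_team
)
SELECT
    ha.team_name,
    ha.avg_home_attendance,
    CAST(s.win AS FLOAT) / NULLIF(s.win + s.draw + s.loss, 0) AS win_ratio
FROM home_attendance ha
JOIN Standings s ON s.team_name = ha.team_name
ORDER BY ha.avg_home_attendance DESC;
```

A proper version of this analysis would need per-match results so that home performance can be separated from away performance.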