# Premier League xG for and against animation
This project animates each Premier League team's xG for and against as a 5-game rolling average.

The code is available in this [GitHub repository](https://github.com/stanleyrudkin/Premier-League-xG-for-and-against-animation.git).

The data can be downloaded from [football-data.co.uk](https://www.football-data.co.uk/englandm.php).

## Set up
Import packages.

In [1]:
import pandas as pd
import plotly.express as px
import numpy as np

## Load data
Read the 2021-22 season CSV into a pandas dataframe.

In [2]:
# Raw string so the backslashes in the Windows path are not treated as escape sequences
data = pd.read_csv(r"D:\Data analysis projects\Premier-League-xG-for-and-against-animation\Data\Premier League 2021-22 data.csv")
data

Unnamed: 0,Div,Date,Time,HomeTeam,AwayTeam,FTHG,FTAG,FTR,HTHG,HTAG,...,AvgC<2.5,AHCh,B365CAHH,B365CAHA,PCAHH,PCAHA,MaxCAHH,MaxCAHA,AvgCAHH,AvgCAHA
0,E0,13/08/2021,20:00,Brentford,Arsenal,2,0,H,1,0,...,1.62,0.50,1.75,2.05,1.81,2.13,2.05,2.17,1.80,2.09
1,E0,14/08/2021,12:30,Man United,Leeds,5,1,H,1,0,...,2.25,-1.00,2.05,1.75,2.17,1.77,2.19,1.93,2.10,1.79
2,E0,14/08/2021,15:00,Burnley,Brighton,1,2,A,1,0,...,1.62,0.25,1.79,2.15,1.81,2.14,1.82,2.19,1.79,2.12
3,E0,14/08/2021,15:00,Chelsea,Crystal Palace,3,0,H,2,0,...,1.94,-1.50,2.05,1.75,2.12,1.81,2.16,1.93,2.06,1.82
4,E0,14/08/2021,15:00,Everton,Southampton,3,1,H,0,1,...,1.67,-0.50,2.05,1.88,2.05,1.88,2.08,1.90,2.03,1.86
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
375,E0,22/05/2022,16:00,Crystal Palace,Man United,1,0,H,1,0,...,2.04,0.25,1.68,2.15,1.74,2.23,1.88,2.25,1.74,2.16
376,E0,22/05/2022,16:00,Leicester,Southampton,4,1,H,0,0,...,2.63,-0.75,1.83,2.07,1.88,2.03,1.94,2.26,1.87,2.01
377,E0,22/05/2022,16:00,Liverpool,Wolves,3,1,H,1,1,...,3.28,-2.50,2.02,1.77,2.06,1.83,2.19,1.99,2.07,1.80
378,E0,22/05/2022,16:00,Man City,Aston Villa,3,2,H,0,1,...,3.36,-2.25,2.06,1.84,2.05,1.86,2.09,2.03,2.01,1.87
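
The `Date` column holds day-first strings such as `13/08/2021`, which pandas reads as plain text unless told otherwise. A minimal sketch of parsing them, using a three-row inline sample instead of the real file (the rows are copied from the output above; `sample_csv` and `sample` are illustrative names, not part of the notebook):

```python
import io
import pandas as pd

# Three fixtures copied from the table above, reduced to a few columns.
sample_csv = io.StringIO(
    "Date,HomeTeam,AwayTeam,FTHG,FTAG\n"
    "13/08/2021,Brentford,Arsenal,2,0\n"
    "14/08/2021,Man United,Leeds,5,1\n"
    "22/05/2022,Liverpool,Wolves,3,1\n"
)
sample = pd.read_csv(sample_csv)

# An explicit day/month/year format avoids any month-first guessing.
sample["Date"] = pd.to_datetime(sample["Date"], format="%d/%m/%Y")
print(sample[["Date", "HomeTeam", "AwayTeam"]])
```

With real datetimes, the matches can be sorted or filtered by date rather than by string order.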


## Calculate 5-game rolling average of goals for and against
I have not yet been able to find xG data by match, so I will settle for actual goals for and against in the meantime.
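
As a quick illustration of what a 5-game rolling average does, here is a minimal sketch on made-up goal counts (the numbers in `goals` are hypothetical, not taken from the season data):

```python
import pandas as pd

# Hypothetical goals scored in seven consecutive matches.
goals = pd.Series([2, 0, 1, 3, 4, 0, 2])

# rolling(5).mean() averages each match with the four before it.
# The first four matches have no full window, so they come out as NaN.
rolling_avg = goals.rolling(5).mean()
print(rolling_avg)
```

Only the fifth match onwards gets a value, which is why the first four games of each team are dropped later on.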

In [3]:
# Take a look at the teams
data['HomeTeam'].unique().tolist()

['Brentford',
 'Man United',
 'Burnley',
 'Chelsea',
 'Everton',
 'Leicester',
 'Watford',
 'Norwich',
 'Newcastle',
 'Tottenham',
 'Liverpool',
 'Aston Villa',
 'Crystal Palace',
 'Leeds',
 'Man City',
 'Brighton',
 'Southampton',
 'Wolves',
 'Arsenal',
 'West Ham']

In [7]:
# Create empty dictionary to add dataframe for each team
TeamDict = {}

# Loop through each team and get goals for and against for each game
for Team in data['HomeTeam'].unique().tolist():
    
    # Filter data to matches where the chosen team is either the home or away side
    HomeData = data.loc[data['HomeTeam'] == Team]
    AwayData = data.loc[data['AwayTeam'] == Team]   
    
    # Rename the goal columns to For/Against from the chosen team's point of view,
    # and add a column marking whether each match was played home or away
    HomeData = HomeData.rename(columns={'FTHG':'For', 'FTAG':'Against', 'AwayTeam':'Opponent'})
    HomeData['HomeOrAway'] = 'Home'
        
    AwayData = AwayData.rename(columns={'FTAG':'For', 'FTHG':'Against', 'HomeTeam':'Opponent'})
    AwayData['HomeOrAway'] = 'Away'
    
    # Stack the home rows above the away rows and keep only the columns needed
    TeamData = pd.concat([HomeData, AwayData], axis=0)[['Date', 'HomeOrAway', 'Opponent', 'For', 'Against']]
    
    # Mark the team
    TeamData['Team'] = Team
    
    # Calculate rolling averages over 5 games
    TeamData['MovingAverageFor'] = TeamData['For'].rolling(5).mean()
    TeamData['MovingAverageAgainst'] = TeamData['Against'].rolling(5).mean()
    
    # Add identifier for matches played
    TeamData['MatchesPlayed'] = TeamData.reset_index().index + 1
    
    # Put the chosen team's new dataframe in the aggregated dictionary,
    # removing first 4 games since they can't have a rolling average
    TeamDict[Team] = TeamData.loc[TeamData['MatchesPlayed'] >= 5]

# Stack the dataframes in the dictionary on top of each other and take a look
AllData = pd.concat(TeamDict, axis=0)
AllData

Unnamed: 0,Unnamed: 1,Date,HomeOrAway,Opponent,For,Against,Team,MovingAverageFor,MovingAverageAgainst,MatchesPlayed
Brentford,87,24/10/2021,Home,Leicester,1,2,Brentford,1.2,1.4,5
Brentford,102,06/11/2021,Home,Norwich,1,2,Brentford,1.0,1.8,6
Brentford,125,28/11/2021,Home,Everton,1,0,Brentford,1.2,1.6,7
Brentford,149,10/12/2021,Home,Watford,2,1,Brentford,1.0,1.2,8
Brentford,181,29/12/2021,Home,Man City,0,1,Brentford,1.0,1.2,9
...,...,...,...,...,...,...,...,...,...,...
West Ham,287,20/03/2022,Away,Tottenham,1,3,West Ham,1.2,1.8,34
West Ham,305,10/04/2022,Away,Brentford,0,2,West Ham,0.6,1.8,35
West Ham,328,24/04/2022,Away,Chelsea,0,1,West Ham,0.6,1.8,36
West Ham,350,08/05/2022,Away,Norwich,4,0,West Ham,1.0,1.4,37
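
In the output above, Brentford's rows are all home fixtures and West Ham's are all away. This is because `pd.concat([HomeData, AwayData])` stacks home matches ahead of away ones, so the rolling window and `MatchesPlayed` follow that order rather than the calendar. If a chronological window is wanted, one option is to sort each team's rows by parsed date before rolling. A minimal sketch on made-up results for a single hypothetical team (the data and the name `team_data` are illustrative, not from the season file):

```python
import pandas as pd

# Made-up results: three home rows stacked above three away rows,
# mimicking the row order produced by pd.concat([HomeData, AwayData]).
team_data = pd.DataFrame({
    "Date": ["14/08/2021", "28/08/2021", "18/09/2021",
             "21/08/2021", "11/09/2021", "25/09/2021"],
    "HomeOrAway": ["Home", "Home", "Home", "Away", "Away", "Away"],
    "For": [2, 0, 1, 3, 1, 0],
})

# Parse the day-first date strings, then sort into calendar order.
team_data["Date"] = pd.to_datetime(team_data["Date"], format="%d/%m/%Y")
team_data = team_data.sort_values("Date").reset_index(drop=True)

# Number the matches and compute the rolling mean on the sorted rows
# (a window of 3 is used here only because the sample is short).
team_data["MatchesPlayed"] = team_data.index + 1
team_data["MovingAverageFor"] = team_data["For"].rolling(3).mean()
print(team_data)
```

After sorting, home and away fixtures interleave and `MatchesPlayed` counts matches in the order they were played.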


In [41]:
# Create animated plot
fig = px.scatter(data_frame=AllData,
                 title='<b>Premier League 2021-22: 5-game rolling average of goals<br>for and against by team</b>',
                 color='Team',
                 x='MovingAverageFor',
                 y='MovingAverageAgainst',
                 animation_frame='MatchesPlayed',
                 template='simple_white',
                 opacity=0.8,
                 labels={
                     'MovingAverageFor':'Average goals scored last five games',
                     'MovingAverageAgainst':'Average goals conceded last five games',
                     'MatchesPlayed' : 'Matches played'
                 },
                 height=600,
                 width=700
                )

# Update fonts
fig.update_layout(
    font_family="Courier New",
    font_color="black",
    font_size=12,
    legend_title=None
)

# Update marker appearance
fig.update_traces(marker=dict(size=12,
                              line=dict(width=2,
                                        color='black'))
                 )

# Harmonise axis ranges
fig.update_xaxes(range=[0, 5])
fig.update_yaxes(range=[0, 5])

fig

In [None]:
# Add animated league table as legend?

# Add club badges as markers?
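
One possible starting point for the league-table idea is a running points total per team, derived from the full-time result column `FTR` (`H`, `D` or `A`). The sketch below uses made-up results with placeholder team names `A`, `B` and `C`, and assumes the rows are already in chronological order. The names `matches`, `team_points` and `table` are illustrative.

```python
import pandas as pd

# Made-up results in the football-data.co.uk layout (FTR: H, D or A).
matches = pd.DataFrame({
    "HomeTeam": ["A", "C", "B", "A"],
    "AwayTeam": ["B", "A", "C", "C"],
    "FTR": ["H", "D", "A", "A"],
})

# Points earned by each side for each full-time result.
home_points = matches["FTR"].map({"H": 3, "D": 1, "A": 0})
away_points = matches["FTR"].map({"H": 0, "D": 1, "A": 3})

# One row per team per match, restored to the original match order.
team_points = pd.concat([
    pd.DataFrame({"Team": matches["HomeTeam"], "Points": home_points}),
    pd.DataFrame({"Team": matches["AwayTeam"], "Points": away_points}),
]).sort_index(kind="mergesort")

# Running totals per team; the last value per team gives the current table.
team_points["MatchesPlayed"] = team_points.groupby("Team").cumcount() + 1
team_points["CumulativePoints"] = team_points.groupby("Team")["Points"].cumsum()
table = (team_points.groupby("Team")["CumulativePoints"]
         .last()
         .sort_values(ascending=False))
print(table)
```

Because `team_points` carries a `MatchesPlayed` column, it could in principle be joined to the rolling-average data on `Team` and `MatchesPlayed`.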