# EPL 2022-23 - Arsenal vs Man City

#### In this notebook, we perform a mid-season analysis of the English Premier League fixtures and the difficulty level of each game.

* Difficulty ranges from 1 to 20, with 1 the lowest and 20 the highest

## Packages and Functions 

- Python packages and helper functions are kept in the utils folder

In [1]:
from utils.packages import *
from utils.functions import *

### Data processing

- Read the latest EPL results, downloaded from https://fixturedownload.com/results/epl-2022

In [2]:
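data = {}
# Load the results file and drop rows with missing values (presumably fixtures not yet played)
data['df'] = pd.read_csv('data/epl-2022-UTC.csv')
data['df'] = data['df'].dropna()
# Normalise column names: lower case, with underscores instead of spaces
data['df'].columns = map(str.lower, data['df'].columns)
data['df'].columns = data['df'].columns.str.replace(' ', '_')
data['df'].head(5)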

| | match_number | round_number | date | location | home_team | away_team | result |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 05/08/2022 19:00 | Selhurst Park | Crystal Palace | Arsenal | 0 - 2 |
| 1 | 2 | 1 | 06/08/2022 11:30 | Craven Cottage | Fulham | Liverpool | 2 - 2 |
| 2 | 3 | 1 | 06/08/2022 14:00 | Vitality Stadium | Bournemouth | Aston Villa | 2 - 0 |
| 3 | 4 | 1 | 06/08/2022 14:00 | Elland Road | Leeds | Wolves | 2 - 1 |
| 4 | 6 | 1 | 06/08/2022 14:00 | St. James' Park | Newcastle | Nottingham Forest | 2 - 0 |


* Find the winner of each match and calculate the points for each game

In [3]:
data = get_points(data)
data['df'].head(5)

| | match_number | round_number | date | location | home_team | away_team | result | home_score | away_score | home_points | away_points |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 05/08/2022 19:00 | Selhurst Park | Crystal Palace | Arsenal | 0 - 2 | 0 | 2 | 0.0 | 3.0 |
| 1 | 2 | 1 | 06/08/2022 11:30 | Craven Cottage | Fulham | Liverpool | 2 - 2 | 2 | 2 | 1.0 | 1.0 |
| 2 | 3 | 1 | 06/08/2022 14:00 | Vitality Stadium | Bournemouth | Aston Villa | 2 - 0 | 2 | 0 | 3.0 | 0.0 |
| 3 | 4 | 1 | 06/08/2022 14:00 | Elland Road | Leeds | Wolves | 2 - 1 | 2 | 1 | 3.0 | 0.0 |
| 4 | 6 | 1 | 06/08/2022 14:00 | St. James' Park | Newcastle | Nottingham Forest | 2 - 0 | 2 | 0 | 3.0 | 0.0 |
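
The body of get_points lives in the utils folder and is not shown in this notebook. As a rough sketch only, the scores and points in the table above could be derived from the result string as follows. The function name points_from_result is hypothetical, and the usual 3/1/0 scoring for a win, draw and loss is assumed.

```python
def points_from_result(result: str) -> tuple[int, int, int, int]:
    """Return (home_score, away_score, home_points, away_points) for a result like '0 - 2'."""
    home_str, away_str = result.split("-")
    # int() ignores the spaces left around each score.
    home_score, away_score = int(home_str), int(away_str)
    if home_score > away_score:
        return home_score, away_score, 3, 0
    if home_score < away_score:
        return home_score, away_score, 0, 3
    return home_score, away_score, 1, 1


# Results taken from the first three rows of the table above.
for result in ["0 - 2", "2 - 2", "2 - 0"]:
    print(result, "->", points_from_result(result))
```

In a pandas workflow, a function like this could be applied to the result column row by row; whether the notebook's own helper works that way is not known.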


* Calculate the EPL rank and difficulty for each team.
    * As league leaders, Arsenal has the most difficulty points (20)
    * Southampton has just 1 difficulty point, as they are at the bottom of the league table

In [15]:
data = get_rank_difficulty(data)

data['points_df'].iloc[[0,-1]]

| | club | games_played | points | curr_rank | difficulty |
|---|---|---|---|---|---|
| 1 | Arsenal | 20 | 50.0 | 1.0 | 20.0 |
| 20 | Southampton | 21 | 15.0 | 20.0 | 1.0 |
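
The outputs in this notebook are consistent with difficulty being 21 minus the current rank in a 20-team league, with clubs level on points sharing an average rank (Crystal Palace appears later with rank 12.5 and difficulty 8.5). The sketch below illustrates that reading. It is an assumption inferred from the outputs, not the actual get_rank_difficulty code, and the function name and four-club league are hypothetical.

```python
def rank_and_difficulty(points: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Map each club to (curr_rank, difficulty); clubs level on points share the average rank."""
    n = len(points)
    ordered = sorted(points.values(), reverse=True)
    table = {}
    for club, pts in points.items():
        # 1-based table positions occupied by every club on this points total.
        positions = [i + 1 for i, p in enumerate(ordered) if p == pts]
        rank = sum(positions) / len(positions)
        # The leader gets the highest difficulty (n), the bottom club gets 1.
        table[club] = (rank, n + 1 - rank)
    return table


# Hypothetical four-club league with two clubs level on points.
demo = rank_and_difficulty({"A": 50, "B": 45, "C": 45, "D": 15})
for club, (rank, difficulty) in demo.items():
    print(club, rank, difficulty)
```

With 20 clubs the same formula gives 20 for rank 1 and 1 for rank 20, matching the Arsenal and Southampton rows above.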


### Remaining Difficulty Calculation

* Some teams have already played most of their difficult fixtures and have relatively easier fixtures from now until the end of the season


* Using the difficulty points and the remaining fixtures, we can calculate the remaining difficulty for each team
    * The average remaining difficulty (avg_rem_diff) is the sum of the difficulty points of a team's remaining opponents divided by the number of remaining fixtures
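
The definition above can be sketched in a few lines. This is an illustration under assumptions, not the helper used in the notebook: the function name is hypothetical, the three-game run-in is made up, and rounding to two decimals is assumed from the values reported in the results table.

```python
def average_remaining_difficulty(opponent_difficulties: list[float]) -> float:
    """Sum of the remaining opponents' difficulty points divided by the number of fixtures."""
    if not opponent_difficulties:
        raise ValueError("no remaining fixtures")
    return round(sum(opponent_difficulties) / len(opponent_difficulties), 2)


# Hypothetical three-game run-in against clubs with difficulty 19, 12 and 1.
print(average_remaining_difficulty([19, 12, 1]))

# Arsenal's reported totals (total_rem_diff 175.0 over num_opp_rem 18) give the reported 9.72.
print(round(175.0 / 18, 2))
```

An opponent still to be played twice would appear twice in the list, once per fixture.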

### Findings  
* Interestingly, Spurs have the least difficult remaining run, followed by Manchester United
* Arsenal and Man City have the 4th and 6th easiest average remaining difficulty (avg_rem_diff), respectively

* Arsenal's avg_rem_diff  : 9.72
* Man City's avg_rem_diff : 10.06

In [10]:
data = get_remining_difficulty(data)
data['points_df'].to_csv('data/results.csv')
data['points_df'][0:10]

| | club | games_played | points | curr_rank | difficulty | total_rem_diff | num_opp_rem | avg_rem_diff | pend_double_head | yet_to_play |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Spurs | 22 | 39.0 | 5.0 | 16.0 | 142.0 | 16.0 | 8.88 | [] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 2 | Man Utd | 22 | 43.0 | 3.0 | 18.0 | 144.5 | 16.0 | 9.03 | [] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 3 | Fulham | 22 | 32.0 | 8.0 | 13.0 | 152.0 | 16.0 | 9.5 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 4 | Arsenal | 20 | 50.0 | 1.0 | 20.0 | 175.0 | 18.0 | 9.72 | ['Man City'] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 5 | Crystal Palace | 21 | 24.0 | 12.5 | 8.5 | 169.5 | 17.0 | 9.97 | ['Brighton'] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 6 | Man City | 21 | 45.0 | 2.0 | 19.0 | 171.0 | 17.0 | 10.06 | ['Arsenal'] | ['Arsenal', 'Arsenal', 'Aston Villa', 'Bournem... |
| 7 | Newcastle | 21 | 40.0 | 4.0 | 17.0 | 171.5 | 17.0 | 10.09 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 8 | Liverpool | 20 | 29.0 | 10.0 | 11.0 | 184.0 | 18.0 | 10.22 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 9 | Brighton | 20 | 34.0 | 6.0 | 15.0 | 185.5 | 18.0 | 10.31 | ['Crystal Palace'] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 10 | Chelsea | 21 | 30.0 | 9.0 | 12.0 | 176.5 | 17.0 | 10.38 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |


### Pending Double Header

* The most interesting factor, however, is the pending double header column (pend_double_head), which shows that Arsenal and Man City have not played each other in the Premier League yet. Yes! The last time they faced each other was in the FA Cup

* Because Arsenal and Man City hold the top two positions, each adds heavily to the other's remaining difficulty
* So it is essential to look at Arsenal's remaining fixtures excluding Man City, and at Man City's remaining fixtures excluding Arsenal

In [6]:
data['team_a'] = 'Arsenal'
data['team_b'] = 'Man City'
data = remove_and_compare(data)

Arsenal has 16 games left with average difficulty of 8.56 (excluding Man City)
Man City has 15 games left with average difficulty of 8.73 (excluding Arsenal)


* After removing the two title contenders from each other's fixture lists, Arsenal and Man City have very similar difficulty levels left until the end of the season (8.56 vs 8.73)
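
The two printed averages can be cross-checked from the points table by subtracting the rival's difficulty once per pending meeting. The sketch below does only that arithmetic; the function name is hypothetical, and it is not a reconstruction of remove_and_compare, whose code is not shown here.

```python
def average_excluding(total_rem_diff: float, num_opp_rem: int,
                      rival_difficulty: float, meetings: int = 2) -> tuple[int, float]:
    """Return (games_left, average difficulty) after dropping the fixtures against one rival."""
    games_left = num_opp_rem - meetings
    total = total_rem_diff - meetings * rival_difficulty
    return games_left, round(total / games_left, 2)


# Arsenal: total 175.0 over 18 games, minus two games against Man City (difficulty 19).
print(average_excluding(175.0, 18, 19.0))
# Man City: total 171.0 over 17 games, minus two games against Arsenal (difficulty 20).
print(average_excluding(171.0, 17, 20.0))
```

Both calls reproduce the figures printed by the cell above: 16 games at 8.56 for Arsenal and 15 games at 8.73 for Man City.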

### Key Double Header

* Arsenal and Man City meet on 16th Feb at the Emirates and on 27th Apr at the Etihad
* Arsenal has a 5-point lead over Man City with a game in hand; if Arsenal wins that game, the lead becomes 8 points
* From that position, Man City needs to win both of these fixtures to reduce the gap to just 2 points

* Even if Arsenal manages to draw one of these fixtures, it puts Arsenal in a commanding position to win the league
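
The gap arithmetic in these bullets can be written out explicitly. This is a small sketch under the same assumption the text makes, namely that Arsenal wins its game in hand; the names are illustrative and all other remaining fixtures are ignored.

```python
# Points after 20 (Arsenal) and 21 (Man City) games, from the table above.
ARSENAL_POINTS, CITY_POINTS = 50, 45

# Change in Arsenal's lead for each head-to-head outcome.
SWING = {"arsenal_win": 3, "draw": 0, "city_win": -3}


def gap_after(start_gap: int, outcomes: list[str]) -> int:
    """Return Arsenal's lead after the given head-to-head outcomes."""
    return start_gap + sum(SWING[o] for o in outcomes)


current_gap = ARSENAL_POINTS - CITY_POINTS  # 5 points
# Assumption carried over from the text: Arsenal wins the game in hand.
level_games_gap = current_gap + 3           # 8 points

print(gap_after(level_games_gap, ["city_win", "city_win"]))  # City wins both
print(gap_after(level_games_gap, ["city_win", "draw"]))      # one City win, one draw
```

Under these assumptions, two Man City wins leave Arsenal 2 points ahead, while a single draw keeps the lead at 5 points.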

### Full Points Table

In [9]:
data['points_df']

| | club | games_played | points | curr_rank | difficulty | total_rem_diff | num_opp_rem | avg_rem_diff | pend_double_head | yet_to_play |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Spurs | 22 | 39.0 | 5.0 | 16.0 | 142.0 | 16.0 | 8.88 | [] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 2 | Man Utd | 22 | 43.0 | 3.0 | 18.0 | 144.5 | 16.0 | 9.03 | [] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 3 | Fulham | 22 | 32.0 | 8.0 | 13.0 | 152.0 | 16.0 | 9.5 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 4 | Arsenal | 20 | 50.0 | 1.0 | 20.0 | 175.0 | 18.0 | 9.72 | ['Man City'] | ['Aston Villa', 'Bournemouth', 'Brentford', 'B... |
| 5 | Crystal Palace | 21 | 24.0 | 12.5 | 8.5 | 169.5 | 17.0 | 9.97 | ['Brighton'] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 6 | Man City | 21 | 45.0 | 2.0 | 19.0 | 171.0 | 17.0 | 10.06 | ['Arsenal'] | ['Arsenal', 'Arsenal', 'Aston Villa', 'Bournem... |
| 7 | Newcastle | 21 | 40.0 | 4.0 | 17.0 | 171.5 | 17.0 | 10.09 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 8 | Liverpool | 20 | 29.0 | 10.0 | 11.0 | 184.0 | 18.0 | 10.22 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 9 | Brighton | 20 | 34.0 | 6.0 | 15.0 | 185.5 | 18.0 | 10.31 | ['Crystal Palace'] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
| 10 | Chelsea | 21 | 30.0 | 9.0 | 12.0 | 176.5 | 17.0 | 10.38 | [] | ['Arsenal', 'Aston Villa', 'Bournemouth', 'Bre... |
