# Are referees biased towards the home team in the Premier League?

# Dataset information 

Data taken from [https://www.football-data.co.uk](https://www.football-data.co.uk)

Key to results data:  

- Div = League Division
- Date = Match Date (dd/mm/yy)
- Time = Time of match kick off
- HomeTeam = Home Team
- AwayTeam = Away Team
- FTHG and HG = Full Time Home Team Goals
- FTAG and AG = Full Time Away Team Goals
- FTR and Res = Full Time Result (H=Home Win, D=Draw, A=Away Win)
- HTHG = Half Time Home Team Goals
- HTAG = Half Time Away Team Goals
- HTR = Half Time Result (H=Home Win, D=Draw, A=Away Win)  

<u> **Match Statistics (where available)** </u>  
- Attendance = Crowd Attendance
- Referee = Match Referee
- HS = Home Team Shots
- AS = Away Team Shots
- HST = Home Team Shots on Target
- AST = Away Team Shots on Target
- HHW = Home Team Hit Woodwork
- AHW = Away Team Hit Woodwork
- HC = Home Team Corners
- AC = Away Team Corners
- HF = Home Team Fouls Committed
- AF = Away Team Fouls Committed
- HFKC = Home Team Free Kicks Conceded
- AFKC = Away Team Free Kicks Conceded
- HO = Home Team Offsides
- AO = Away Team Offsides
- HY = Home Team Yellow Cards
- AY = Away Team Yellow Cards
- HR = Home Team Red Cards
- AR = Away Team Red Cards
- HBP = Home Team Bookings Points (10 = yellow, 25 = red)
- ABP = Away Team Bookings Points (10 = yellow, 25 = red)

*Note that Free Kicks Conceded includes fouls, offsides and any other offence committed, and will always be equal to or higher than the number of fouls. Fouls make up the vast majority of Free Kicks Conceded. Free Kicks Conceded are shown when specific data on Fouls are not available (France 2nd, Belgium 1st and Greece 1st divisions).*

*Note also that English and Scottish yellow cards do not include the initial yellow card when a second is shown to a player converting it into a red, but this is included as a yellow (plus red) for European games.*
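
As a worked example of the bookings-points definition in the key above, the sketch below recomputes HBP and ABP from the card counts. The function name `booking_points` and the sample row are illustrative assumptions, not part of the dataset. For English matches, the note above implies that a second-yellow dismissal is counted as a red card only.

```python
# Weights taken from the key above: 10 points per yellow, 25 per red.
YELLOW_POINTS = 10
RED_POINTS = 25


def booking_points(yellows: int, reds: int) -> int:
    """Return the bookings points for one team in one match."""
    return YELLOW_POINTS * yellows + RED_POINTS * reds


# Hypothetical match row using the column names from the key (values are made up).
row = {"HY": 2, "HR": 0, "AY": 3, "AR": 1}

hbp = booking_points(row["HY"], row["HR"])  # 2*10 + 0*25
abp = booking_points(row["AY"], row["AR"])  # 3*10 + 1*25
print(hbp, abp)
```

The same arithmetic could be applied column-wise to the full table once it is loaded, which gives a single per-team sanction score when the HBP and ABP columns are missing for a season.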


In [1]:
# Importing packages ------------------------------------------------------

import pandas as pd
import numpy as np 
import os 

# Reading in data ------------------------------------------------------

years = np.arange(5,23) 
years_1 = np.arange(6,24)

## Convert to string and pad with a leading zero (e.g. 5 -> "05"),
## as required by the season codes in the website's URLs

years = years.astype(str)
years = np.char.zfill(years, 2)

years_1 = years_1.astype(str)
years_1 = np.char.zfill(years_1, 2)

## Combine start and end years into season codes, e.g. "05" + "06" -> "0506"
url_years = np.char.add(years, years_1)

data_url_base = 'https://www.football-data.co.uk/mmz4281/{Year}/E0.csv'


## Check if the data already exists locally (previously compiled in case of no internet)

if os.path.exists("football_data.csv"):
    df_raw = pd.read_csv("football_data.csv", low_memory=False, index_col=False)
else:
    ## creating url list with all years needed
    ## Also adding delimiter '/' between seasons for new column 
    data_urls_list = []
    season = []
    for year in url_years:
        url_with_year = data_url_base.replace('{Year}', year)
        season_year = year[:2] + '/' + year[2:]
        season.append(season_year)
        data_urls_list.append(url_with_year)

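    df_list = []

    ## Download each season and tag its rows with the season label
    for season_label, url in zip(season, data_urls_list):
        df = pd.read_csv(url)
        df['Season'] = season_label
        df_list.append(df)

    ## Use the same name as the cached branch so later cells work either way
    df_raw = pd.concat(df_list, ignore_index=True)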
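
To make the season codes used in the URLs easier to check, here is a minimal standard-library sketch of the same construction. The URL template comes from the cell above; the helper name `season_codes` and the dictionary layout are assumptions made for illustration.

```python
# URL pattern from the cell above; "{code}" is a season code such as "0506".
URL_TEMPLATE = "https://www.football-data.co.uk/mmz4281/{code}/E0.csv"


def season_codes(first_start: int, last_start: int) -> list[str]:
    """Build codes like '0506' for seasons starting in first_start..last_start."""
    return [
        f"{year % 100:02d}{(year + 1) % 100:02d}"
        for year in range(first_start, last_start + 1)
    ]


codes = season_codes(2005, 2022)

# Map a readable season label ("05/06") to the CSV URL for that season.
urls = {f"{code[:2]}/{code[2:]}": URL_TEMPLATE.format(code=code) for code in codes}

print(len(codes), codes[0], codes[-1])
```

Printing the first and last code is a quick way to confirm that the range of seasons matches the one intended before any files are downloaded.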



## Introduction
In this report, we test for evidence of referee bias in favour of home teams in the English Premier League, using data for all matches from 2005 to 2022 (the current season). One role of the referee is to enforce the rules set out by the Football Association (FA) during a football match.

According to the IFAB Laws of the Game 2022-23:  
>Decisions will be made to the best of the referee's ability according to the Laws of the Game and the 'spirit of the game' and will be based on the opinion of the referee who has the discretion to take appropriate action within the framework of the Laws of the Game. (The Football Association, 2022)[<sup>1</sup>](#fn1)

There will always be a concern that referees make partial or biased decisions in favour of one team, specifically the home team. This raises issues of moral hazard and conflicts of interest. This report explores whether the number of sanctions awarded by referees (yellow cards, red cards, free kicks, penalties, etc.) depends on whether a team is playing at home, and whether the referee's behaviour is influenced by other factors such as crowd attendance.
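
One simple way to frame this comparison, sketched here under stated assumptions, is to look at the per-match gap between away and home yellow cards and at cards per foul committed for each side. The rows below are hypothetical values shaped like the dataset columns (HY, AY, HF, AF); they are not real results, and this is not necessarily the method used later in the report.

```python
from statistics import mean

# Hypothetical rows shaped like the dataset (values are made up for illustration).
matches = [
    {"HY": 1, "AY": 3, "HF": 10, "AF": 12},
    {"HY": 2, "AY": 2, "HF": 11, "AF": 11},
    {"HY": 0, "AY": 1, "HF": 8, "AF": 10},
]

# Per-match gap in yellow cards: positive means the away side was booked more.
yellow_gap = [m["AY"] - m["HY"] for m in matches]
mean_gap = mean(yellow_gap)

# Yellow cards per foul committed, pooled over all matches, for each side.
home_rate = sum(m["HY"] for m in matches) / sum(m["HF"] for m in matches)
away_rate = sum(m["AY"] for m in matches) / sum(m["AF"] for m in matches)

print(mean_gap, home_rate, away_rate)
```

Dividing by fouls is a design choice: a raw gap in cards could simply reflect away teams committing more fouls, whereas a higher card rate per foul is closer to what a biased referee would produce.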

These split-second decisions can decide games and often have huge financial implications for teams. For example, if a decision costs a team a game and, as a result, it misses out on qualifying for the UEFA Champions League, the club could lose extra revenue and find it harder to attract top talent for the season ahead. It can also have implications for the manager, who may find themselves out of a job, especially if qualification forms part of their contract.

Decisions that affect the overall match result fuel debate between players, teams, pundits, journalists and the average fan, in some cases for years afterwards.  

This concern is nothing new: football fans across the globe, in multiple leagues, have complained about refereeing decisions for as long as anyone can remember. Some recent articles are linked below for reference. Some fans even claim that certain referees conspire against their clubs, and managers have openly expressed dismay when particular match officials are appointed to officiate their matches.  

[https://www.newcastleworld.com/sport/football/newcastle-united/i-hate-sunderland-premier-league-referee-on-bias-and-love-for-newcastle-united-3812928](https://www.newcastleworld.com/sport/football/newcastle-united/i-hate-sunderland-premier-league-referee-on-bias-and-love-for-newcastle-united-3812928)  
[https://www.football365.com/news/mailbox-arsenal-var-referee-conspiracy-man-utd-elon-musk-lionesses](https://www.football365.com/news/mailbox-arsenal-var-referee-conspiracy-man-utd-elon-musk-lionesses)  
[https://www.thesportsman.com/articles/tuchel-s-fury-and-87k-petitions-but-referees-are-not-biased-against-your-team](https://www.thesportsman.com/articles/tuchel-s-fury-and-87k-petitions-but-referees-are-not-biased-against-your-team)  
[https://www.caughtoffside.com/2022/08/16/anthony-taylor-to-continue-to-referee-chelsea-games/](https://www.caughtoffside.com/2022/08/16/anthony-taylor-to-continue-to-referee-chelsea-games/)  
[https://www.football365.com/news/mailbox-liverpool-1-1-palace-darwin-nunez-man-utd-arsenal](https://www.football365.com/news/mailbox-liverpool-1-1-palace-darwin-nunez-man-utd-arsenal)  


This bias, however, is not limited to football; it has been studied in other sporting competitions too. Examples include hockey (Pappas, 2011)[<sup>2</sup>](#fn2) and basketball, in the well-known National Basketball Association (NBA) competition (Rodenberg and Lim, 2009).

In [None]:
# Dataset 

# References
<span id="fn1"> 1: The Football Association, 2022. Law 5 - The Referee. [online] www.thefa.com. Available at: <https://www.thefa.com/football-rules-governance/lawsandrules/laws/football-11-11/law-5---the-referee> [Accessed 20 August 2022].</span>  

<span id="fn2"> 2: Pappas, C., 2011. Theoretical analysis of referee bias in youth hockey. [online] Digscholarship.unco.edu. Available at: <https://digscholarship.unco.edu/cgi/viewcontent.cgi?article=1069&context=theses> [Accessed 20 August 2022].</span>
