I started this project to understand penalty kick data in the Premier League, the top soccer league in England, before the implementation of Video Assistant Referees (VAR) at the start of the 2019-20 season. VAR is a replay system that reviews referee decisions in-game to increase the accuracy of the most important calls. Decisions subject to VAR review include goals (checked for offside or for fouls missed in the buildup), incorrect offside calls that ruled out goals, red card fouls, cases of mistaken identity, and penalty kick decisions.
The system was developed over the course of the last decade, but England was one of the last major soccer nations to adopt it in its top league, owing to a traditionalist view of the game. Unsurprisingly, the rollout of this new technology has been controversial, with many saying that it damages the quality and experience of the game for fans, coaches, and players alike by disrupting its natural flow. Most of the debate over the value of VAR has centered on balancing overall improvements in the accuracy of calls against the disruption caused by waiting for calls to be reviewed. Indeed, VAR was adopted for the 2018 FIFA World Cup on the basis of experiments demonstrating that it improved overall call accuracy from 93.0% to 98.8% while lengthening games by only 55 seconds on average. However, another question of interest about the implementation of VAR is its effect on standardizing referee performance.
Is there evidence that soccer referees call the game in different fashions? Is it likely that VAR will eliminate this variability?
Most of these data were sourced from Football-Data.co.uk, which has game-level data for the past ten years of the Premier League. These data included the participating teams, the referee's name, some game statistics broken down by team (more on this in a moment), and a variety of betting information. For my purposes, the betting data were excised and the following statistics were kept: goals, fouls, corners, shots, yellow cards, and red cards. Notably, this dataset did not include penalty data, so I gathered those data myself.
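To make the filtering step concrete, here is a minimal Python sketch of keeping the identifying fields and match statistics while dropping everything else, including the betting columns. It is not the code used for this project. The column names (`FTHG`, `HF`, `B365H`, and so on) are assumptions about the file layout and should be checked against the header of the actual download, and the sample row is made up.

```python
import csv
import io

# Assumed column names; verify them against the header of the real file.
ID_COLUMNS = ["Date", "HomeTeam", "AwayTeam", "Referee"]
STAT_COLUMNS = [
    "FTHG", "FTAG",  # goals (home, away)
    "HF", "AF",      # fouls
    "HC", "AC",      # corners
    "HS", "AS",      # shots
    "HY", "AY",      # yellow cards
    "HR", "AR",      # red cards
]
KEEP = ID_COLUMNS + STAT_COLUMNS


def keep_match_stats(csv_text):
    """Return one dict per game holding only the columns listed in KEEP.

    Columns that are absent from the file are skipped rather than raising.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{col: row[col] for col in KEEP if col in row} for row in reader]


# Made-up sample with a single betting column (B365H) to be dropped.
SAMPLE = (
    "Date,HomeTeam,AwayTeam,Referee,FTHG,FTAG,HF,AF,B365H\n"
    "01/01/2000,Team A,Team B,Ref One,2,1,10,12,1.95\n"
)

rows = keep_match_stats(SAMPLE)
```

Listing the columns to keep, rather than the ones to drop, means any betting column that appears in only some seasons is excluded without extra work.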
I could not find a publicly available dataset with game-level penalty data. Aggregate penalty data (such as for an entire team's season) can be found, and web sources exist listing game events, but as far as I can tell, no one has combined these into a dataset listing the penalties that occurred in each game of a Premier League season and whether they were scored. To gather these data, I scraped ESPN's website, which has a database (accessible through its API) of game results with an event log for each match that uses standard phrases for each type of event. As such, penalty data, including both converted and missed penalties, were collected with relatively little language processing needed (I may put up the scripts I wrote to do this later if I find time). These data were then merged with the Football-Data.co.uk dataset. See the codebook for a description of the variables present in the dataset.
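The phrase matching and merge described above might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the scripts used for this project. The event phrases ("Penalty - Scored", "Penalty - Saved"), the field names `pens_scored` and `pens_missed`, and the merge key of date plus both team names are all hypothetical, and the sample games are made up.

```python
import re

# Hypothetical phrases; the wording in the real event logs may differ.
SCORED = re.compile(r"penalty\s*-\s*scored", re.IGNORECASE)
MISSED = re.compile(r"penalty\s*-\s*(missed|saved)", re.IGNORECASE)


def count_penalties(events):
    """Count converted and unconverted penalties in a list of event strings."""
    scored = sum(1 for event in events if SCORED.search(event))
    missed = sum(1 for event in events if MISSED.search(event))
    return {"pens_scored": scored, "pens_missed": missed}


def merge_penalties(games, event_logs):
    """Attach penalty counts to each game record.

    games: list of dicts with "Date", "HomeTeam", and "AwayTeam" keys.
    event_logs: dict mapping (date, home, away) to a list of event strings.
    Games without an event log get zero counts, which cannot be told apart
    from games that had no penalties.
    """
    merged = []
    for game in games:
        key = (game["Date"], game["HomeTeam"], game["AwayTeam"])
        counts = count_penalties(event_logs.get(key, []))
        merged.append({**game, **counts})
    return merged


# Made-up sample data.
games = [
    {"Date": "01/01/2000", "HomeTeam": "Team A", "AwayTeam": "Team B"},
    {"Date": "02/01/2000", "HomeTeam": "Team C", "AwayTeam": "Team D"},
]
event_logs = {
    ("01/01/2000", "Team A", "Team B"): [
        "Goal",
        "Penalty - Scored",
        "Yellow Card",
        "Penalty - Saved",
    ],
}

merged = merge_penalties(games, event_logs)
```

In practice the two sources are likely to spell team names and format dates differently, so a real merge would need a normalization step before building the key.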