## Replicating FIFA Football Intelligence - Forced Turnovers (Team-level)

---
> ### 1. SET UP DEVELOPMENT ENVIRONMENT

**1.0** Import the required Python libraries into the current development environment (i.e. this notebook):
```
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
```

**1.1** Configure the notebook for code autocompletion, inline plot display, and the number of rows shown when displaying `pandas` data objects:
```
%config Completer.use_jedi = False
%matplotlib inline
pd.options.display.max_rows, pd.options.display.min_rows = 20, 20
```

---
> ### 2. LOAD & PREP THE FOOTBALL DATA

**2.0** Read in the `match_data.csv` file located in the `data` directory (folder):
```
raw_data = pd.read_csv("data/match_data.csv")
```

**2.1** Make a copy of the raw data to work on called `df`:

```
df = raw_data.copy()
```

**2.2** View the `df` object. This is a `pandas` DataFrame: data held in a table with rows and columns, much like an Excel spreadsheet:
```
df
```

**2.3** Check the dimensions of the `df` object, returned as `(<no. of rows>, <no. of columns>)`. It should be `(1912, 18)`:
```
df.shape
```
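If the shape differs from the expected value, later steps may behave unexpectedly, so it can help to turn the check into an explicit comparison. The following is a minimal sketch using a small made-up table in place of `match_data.csv`; the `check_shape` helper and the toy values are illustrative assumptions, not part of the workshop data.

```python
import pandas as pd

def check_shape(frame, expected):
    """Return True if the frame has the expected (rows, columns) shape."""
    return frame.shape == expected

# Toy stand-in for the match data: 3 rows, 2 columns (made-up values)
toy = pd.DataFrame({
    "event_detail": ["forced_turnover", "pass", "forced_turnover"],
    "start_x": [10.0, 52.5, 80.0],
})

print(toy.shape)  # (rows, columns)
shape_ok = check_shape(toy, (3, 2))
print(shape_ok)
```

In the notebook, the equivalent check would be `check_shape(df, (1912, 18))`.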

**2.4** Load the `pitch.png` graphic located in the `data` directory (folder) and store in a variable called `pitch`:
```
pitch = Image.open("data/pitch.png")
```

**2.5** Check the `pitch` object using the `imshow()` function available from the `matplotlib` plotting library:
```
plt.imshow(pitch)
```

---
> ### 3. PREP DATA FOR GENERATING THE FORCED TURNOVER VISUALISATIONS

**3.0** Have a look at what's in the `event_detail` column of the `df` data by using the `value_counts()` function to see the different categories of detailed events and how many rows there are of each category:
```
df["event_detail"].value_counts()
```
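To see how `value_counts()` behaves without the match file, here is a small sketch on made-up event labels (the toy column contents are assumptions for illustration). It also shows the `normalize=True` option, which reports each category as a share of all rows rather than a raw count.

```python
import pandas as pd

# Made-up event labels standing in for the real "event_detail" column
toy = pd.DataFrame({
    "event_detail": ["pass", "forced_turnover", "pass", "shot", "pass"],
})

# Raw counts per category, most frequent first
counts = toy["event_detail"].value_counts()
print(counts)

# Same categories expressed as a fraction of all rows
share = toy["event_detail"].value_counts(normalize=True)
print(share)
```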

**3.1** Create a new variable called `metric` that contains the text string `"forced_turnover"`:
```
metric = "forced_turnover"
```

**3.2** Create a new variable called `ft` that contains just the rows from the `df` data where the text in the `"event_detail"` column is the same as the text in the `metric` variable, i.e. is exactly `"forced_turnover"`:
```
ft = df[df["event_detail"] == metric].copy()
```
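The filter above works in two stages: the comparison produces a column of `True`/`False` values (a boolean mask), and indexing with that mask keeps only the `True` rows. A minimal sketch on made-up rows (the toy data and the `mask` and `ft_toy` names are assumptions for illustration) makes both stages visible, and shows that `.copy()` leaves the original table untouched when the cut is edited.

```python
import pandas as pd

# Made-up rows standing in for the match data
toy = pd.DataFrame({
    "event_detail": ["pass", "forced_turnover", "shot", "forced_turnover"],
    "player1_team": ["arsenal", "man_u", "arsenal", "arsenal"],
})

metric = "forced_turnover"

# Stage 1: element-wise comparison gives one True/False per row
mask = toy["event_detail"] == metric
print(mask.tolist())

# Stage 2: keep only the rows where the mask is True, as an independent copy
ft_toy = toy[mask].copy()

# Editing the copy does not change the original table
ft_toy["player1_team"] = "edited"
print(toy["player1_team"].tolist())
```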

**3.3** Check the `ft` cut of data to see if the filter for the `"forced_turnover"` metric has worked as expected:
```
ft
```

**3.4** Create a new variable called `FT` that contains a cut of the `ft` data with just the columns needed for the final data visualisation, i.e. `"player1"`, `"player1_team"`, `"event"`, `"event_detail"`, `"start_x"`, and `"start_y"`:
```
FT = ft[["player1", "player1_team", "event", "event_detail", "start_x", "start_y"]].copy()
```

**3.5** Check the new `FT` data:
```
FT
```

---
> ### 4. GENERATE THE FORCED TURNOVER VISUALISATIONS

**4.0** Use the `matplotlib` plotting library to create a scatter graph over a pitch graphic, where each marker represents the location at which one of the chosen metric events occurred in the match. The x and y co-ordinates of the relevant events are plotted, with a separate marker colour for each team to differentiate them.

TIP: Check out the range of official named colors you can use with matplotlib https://matplotlib.org/stable/gallery/color/named_colors.html#css-colors



```
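# Create the figure and fix the axes to the pitch dimensions (105 x 68)
fig, ax = plt.subplots(figsize=(12,8))
plt.axis( [0, 105, 0, 68] )
# Draw the pitch graphic so it fills the same co-ordinate range
ax.imshow(pitch, extent=[0,105,0,68])

# Plot one marker per event: red for Arsenal, blue for the other team
for index, row in FT.iterrows():
    if row["player1_team"] == "arsenal":
        plt.scatter(x=row["start_x"], y=row["start_y"], c="red", s=150)
    else:
        plt.scatter(x=row["start_x"], y=row["start_y"], c="blue", s=150)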

```
Extra options:
- `plt.tight_layout()`
- `edgecolors="lightblue", linewidth=1, alpha=0.8` (extra arguments to `plt.scatter()`)
- `plt.savefig("FIFAIntel_ForcedTurnovers.png")`
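A possible variant of the scatter step adds a legend and looks up each team's colour in a dictionary, grouping the events by team with `groupby()`. This is only a sketch under stated assumptions: the toy co-ordinates and the `team_colours` mapping are made up, the pitch image is omitted so the example runs without `pitch.png`, and the `matplotlib.use("Agg")` line is there only so it runs outside a notebook (leave it out when using `%matplotlib inline`).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside the notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up event locations standing in for the FT data
toy = pd.DataFrame({
    "player1_team": ["arsenal", "man_u", "arsenal", "man_u"],
    "start_x": [20.0, 60.0, 75.5, 33.0],
    "start_y": [10.0, 40.0, 55.0, 22.5],
})

# Hypothetical team-to-colour lookup
team_colours = {"arsenal": "red", "man_u": "blue"}

fig, ax = plt.subplots(figsize=(12, 8))
ax.set_xlim(0, 105)
ax.set_ylim(0, 68)

# One scatter layer per team, labelled so the legend can name it
for team, events in toy.groupby("player1_team"):
    ax.scatter(events["start_x"], events["start_y"],
               c=team_colours[team], s=150, label=team)

ax.legend()
n_layers = len(ax.collections)  # one collection per scatter call
legend_labels = ax.get_legend_handles_labels()[1]
```

To draw this over the pitch, the same `ax.imshow(pitch, extent=[0,105,0,68])` call from the step above would go before the loop.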

**4.1** Run some extra steps to import the `seaborn` visualisation library into the current development environment (i.e. this notebook). NOTE: the `piplite` calls are not standard practice, but they are a necessary workaround for the browser-based notebook technology used in this workshop.
```
import piplite
await piplite.install("seaborn")
import seaborn as sns
```

**4.2** Make 2 new cuts of the `FT` data, each containing just the rows for one team, i.e. where the `"player1_team"` column is equal to `arsenal` or `man_u`, and save them to the new variables `ARS` and `MAN` respectively:
```
ARS = FT[FT["player1_team"] == "arsenal"].copy()
MAN = FT[FT["player1_team"] == "man_u"].copy()
```
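With only two teams, two explicit filters are easy to read. As a sketch of an alternative that scales to any number of teams, `groupby()` can build one cut per team in a dictionary (the toy rows and the `by_team` name are assumptions for illustration).

```python
import pandas as pd

# Made-up rows standing in for the FT data
toy = pd.DataFrame({
    "player1_team": ["arsenal", "man_u", "arsenal", "man_u", "man_u"],
    "start_x": [20.0, 60.0, 75.5, 33.0, 90.0],
    "start_y": [10.0, 40.0, 55.0, 22.5, 34.0],
})

# One independent DataFrame per team, keyed by the team name
by_team = {team: events.copy() for team, events in toy.groupby("player1_team")}

print(sorted(by_team))
print(len(by_team["arsenal"]), len(by_team["man_u"]))
```

Here `by_team["arsenal"]` and `by_team["man_u"]` play the roles of `ARS` and `MAN`.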

**4.3** Use the `seaborn` visualisation library to create a heat map of one team's `forced_turnover` events on the pitch graphic by passing the x and y co-ordinates of where the team's `forced_turnover` events occurred to the `sns.kdeplot()` function:
```
fig, ax = plt.subplots(figsize=(12,8))
plt.axis([0,105,0,68])
ax.imshow(pitch, extent=[0,105,0,68])

# `levels` sets the number of contour levels (it replaces the deprecated `n_levels`)
sns.kdeplot(x=ARS["start_x"], y=ARS["start_y"], levels=20, cmap="rocket_r", fill=True, alpha=0.6)

```
Extra options:
- `plt.tight_layout()`
- `cmap` options: `"rocket"`, `"mako"`, `"flare"`, and `"crest"`. Append `"_r"` to the string to reverse the order of the colours
- `plt.savefig("FIFAIntel_FTHeatMap.png")`
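The heat map is drawn over a pitch image stretched to the range 0 to 105 on x and 0 to 68 on y, so it only lines up if the event co-ordinates use that same range. A quick sketch of a sanity check that could be run before plotting is shown below; the toy co-ordinates and the `PITCH_LENGTH`, `PITCH_WIDTH` and `on_pitch` names are assumptions for illustration.

```python
import pandas as pd

# Same values as the `extent` passed to ax.imshow() in the plotting steps
PITCH_LENGTH, PITCH_WIDTH = 105, 68

# Made-up co-ordinates; the last two rows are deliberately off the pitch
toy = pd.DataFrame({
    "start_x": [12.0, 104.9, 110.0, 50.0],
    "start_y": [30.0, 67.0, 20.0, -1.0],
})

# True where both co-ordinates fall inside the pitch boundaries (inclusive)
on_pitch = (
    toy["start_x"].between(0, PITCH_LENGTH)
    & toy["start_y"].between(0, PITCH_WIDTH)
)

off_pitch_events = toy[~on_pitch]
print(f"{len(off_pitch_events)} of {len(toy)} events fall outside the pitch")
```

The same mask could be applied to `ARS` or `MAN` to inspect any rows that would be drawn outside the graphic.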

**4.4** Repeat for the other team:
```
fig, ax = plt.subplots(figsize=(12,8))
plt.axis([0,105,0,68])
ax.imshow(pitch, extent=[0,105,0,68])

# `levels` sets the number of contour levels (it replaces the deprecated `n_levels`)
sns.kdeplot(x=MAN["start_x"], y=MAN["start_y"], levels=20, cmap="mako_r", fill=True, alpha=0.6)
```
Extra options:
- `plt.tight_layout()`
- `cmap` options: `"rocket"`, `"mako"`, `"flare"`, and `"crest"`. Append `"_r"` to the string to reverse the order of the colours
- `plt.savefig("FIFAIntel_FTHeatMap.png")` (change the filename if the image saved in step 4.3 should not be overwritten)

---

_Sports Python Educational Project content, licensed under Attribution-NonCommercial-ShareAlike 4.0 International_