## Replicating FIFA Football Intelligence - Passing Networks (Players within a Team)

---
> ### 1. SET UP DEVELOPMENT ENVIRONMENT

**1.0 Import the required Python libraries into the current development environment (i.e. this notebook)**
```
import pandas as pd
```

**1.1 Configure the notebook for code autocompletion, inline plots, and the max & min number of rows shown when displaying pandas data objects**
```
%config Completer.use_jedi = False
%matplotlib inline
pd.options.display.max_rows, pd.options.display.min_rows = 20, 20
```

---
> ### 2. LOAD & CHECK THE FOOTBALL DATA

**2.0** Read in the `match_data.csv` file located in the `data` directory (folder):
```
raw_data = pd.read_csv("data/match_data.csv")
```

**2.1** Make a copy of raw data to work on called `df`:

```
df = raw_data.copy()
```
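To see why working on a copy is useful, here is a minimal sketch on a made-up two-row dataframe (the names `raw` and `work` and the values are illustrative, not taken from `match_data.csv`). Editing the copy should leave the original untouched:

```python
import pandas as pd

# Toy stand-in for raw_data (hypothetical values)
raw = pd.DataFrame({"event": ["completed_pass", "shot"]})

# copy() makes an independent (deep) copy by default
work = raw.copy()
work.loc[0, "event"] = "edited"

print(raw.loc[0, "event"])   # the original value is unchanged
print(work.loc[0, "event"])  # only the copy was edited
```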

**2.2** View the `df` object, which is a `pandas` dataframe (df): a tabular, two-dimensional data structure with rows & columns:
```
df
```

**2.3** Check the dimensions of `df`, given as `(no. of rows, no. of columns)`. This should be `(1912, 18)`:
```
df.shape
```

---
> ### 3. PREP DATA FOR GENERATING THE PASSING NETWORKS

**3.0** Have a look at what's in the `event` column:
```
df["event"]
```

**3.1** Use the `value_counts()` method to count how many times each type of event appears in the `event` column:
```
df["event"].value_counts()
```
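As a small illustration of what `value_counts()` returns, here is a sketch on a made-up `events` Series. The event labels other than `completed_pass` are hypothetical and are not taken from `match_data.csv`:

```python
import pandas as pd

# Toy data: hypothetical event labels
events = pd.Series(
    ["completed_pass", "shot", "completed_pass", "tackle", "completed_pass"]
)

# One row per distinct value, sorted from most to least frequent
counts = events.value_counts()
print(counts)
```

The result is itself a pandas Series, so a single count can be looked up by label, e.g. `counts["completed_pass"]`.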

**3.2** For analysing Passing Networks we're only interested in successful passes. To start filtering the data for these, first see which rows in the `event` column contain the text string `"completed_pass"`:
```
df["event"] == "completed_pass"
```
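The comparison above returns one `True`/`False` value per row (a boolean mask). A minimal sketch on a made-up three-row dataframe (`toy` is a hypothetical name, not the match data) shows the idea:

```python
import pandas as pd

# Toy data: three hypothetical events
toy = pd.DataFrame({"event": ["completed_pass", "shot", "completed_pass"]})

# Comparing a column to a value gives a boolean Series, one entry per row
mask = toy["event"] == "completed_pass"

print(mask.tolist())  # True where the event is a completed pass
print(mask.sum())     # True counts as 1, so the sum is the number of matching rows
```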

**3.3** Let's use this True/False filter to create a subset of the full match data containing only the rows/events that represent a `"completed_pass"`. Save this subset as a new variable called `completed_passes`:
```
completed_passes = df[df["event"] == "completed_pass"].copy()
```

**3.4** Check what's in the new `completed_passes` data:
```
completed_passes
```

**3.5** Choose which team, `"arsenal"` or `"man_u"`, to create the Passing Network for, and store the choice in a new variable called `team`:
```
team = "arsenal"
```

**3.6** Have a look at which of the rows in the `"player1_team"` column of `completed_passes` are equal to the value of your `team` variable:
```
completed_passes["player1_team"] == team
```

**3.7** Create a new variable called `team_passes` containing just the rows from the `completed_passes` data where the value in the `"player1_team"` column is the same as the value of your `team` variable, i.e. either `"arsenal"` or `"man_u"`:
```
team_passes = completed_passes[completed_passes["player1_team"] == team].copy()
```
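The two filters from steps 3.3 and 3.7 can also be combined into a single step with the `&` (and) operator. This is an optional alternative sketched on made-up data; `toy`, `is_pass`, `is_team`, and `toy_team_passes` are hypothetical names:

```python
import pandas as pd

# Toy data with the same two column names used in the steps above
toy = pd.DataFrame({
    "event": ["completed_pass", "shot", "completed_pass", "completed_pass"],
    "player1_team": ["arsenal", "arsenal", "man_u", "arsenal"],
})
team = "arsenal"

# Build each boolean mask separately, then keep rows where both are True
is_pass = toy["event"] == "completed_pass"
is_team = toy["player1_team"] == team
toy_team_passes = toy[is_pass & is_team].copy()

print(toy_team_passes)
```

Filtering in two separate steps, as in the notebook, keeps the intermediate `completed_passes` data available for checking, which is handy while learning.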

**3.8** Check `team_passes` to see if the additional filter has worked as expected:
```
team_passes
```

**Question** - how many completed passes did each team make in this match?


---
> ### 4. GENERATE THE PASSING NETWORKS

**4.0** Create a Passing Network for your chosen team by calling the `pd.crosstab()` function with two inputs: first the `team_passes["player1"]` column, and second the `team_passes["player2"]` column:
```
pd.crosstab(team_passes["player1"], team_passes["player2"])
```
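To make the output easier to read, here is a sketch of `pd.crosstab()` on five made-up passes between three hypothetical players `A`, `B`, and `C`. Rows are the passer (`player1`), columns are the receiver (`player2`), and each cell counts the passes for that pair:

```python
import pandas as pd

# Toy data: five hypothetical completed passes
toy_passes = pd.DataFrame({
    "player1": ["A", "A", "B", "C", "A"],
    "player2": ["B", "C", "A", "A", "B"],
})

# Count how often each (passer, receiver) pair occurs
network = pd.crosstab(toy_passes["player1"], toy_passes["player2"])
print(network)

# A passed to B twice in the toy data
print(network.loc["A", "B"])
```

Pairs that never occur, such as a player passing to themselves, show up as `0`.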

**4.1** Further customise this function call with the `normalize=` parameter, which returns each value as a proportion of all the values in the matrix, of its row, or of its column when given `"all"`, `"index"`, or `"columns"` respectively. Chain on the `round()` method with the integer `3` to round the values to 3 decimal places, and then multiply by 100 to display them as percentages (to one decimal place):

```
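# normalize="all": each cell as a share of all the team's completed passes
# Other options: normalize="index" (share of each row), normalize="columns" (share of each column)
pd.crosstab(team_passes["player1"], team_passes["player2"], normalize="all").round(3) * 100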
```
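The difference between the `normalize=` options is easiest to see on a tiny example. The sketch below uses five made-up passes between hypothetical players `A`, `B`, and `C`. It also multiplies by 100 before rounding, an optional variation that can avoid tiny floating-point artefacts in the displayed percentages:

```python
import pandas as pd

# Toy data: five hypothetical completed passes
toy_passes = pd.DataFrame({
    "player1": ["A", "A", "B", "C", "A"],
    "player2": ["B", "C", "A", "A", "B"],
})
p1, p2 = toy_passes["player1"], toy_passes["player2"]

# "all": every cell is a share of all passes, so the whole matrix sums to 1
share_all = pd.crosstab(p1, p2, normalize="all")

# "index": every cell is a share of that passer's passes, so each row sums to 1
share_row = pd.crosstab(p1, p2, normalize="index")

# Percentages to one decimal place: multiply first, then round
pct_row = (share_row * 100).round(1)
print(pct_row)
```

With `"index"`, a row answers "where did this player send the ball?", while `"columns"` answers "who did this player receive the ball from?".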

**4.2 OPTIONAL EXTENSION** Save this Passing Network as a csv by first storing it in a new variable, e.g. `matrix`, and then using that variable's `to_csv()` method to create a new csv file:

```
matrix = pd.crosstab(team_passes["player1"], team_passes["player2"])
matrix.to_csv("FIFAIntel_matrix.csv")
```
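To check a saved file, it can be read back with `pd.read_csv()`. The sketch below uses a made-up matrix and a temporary folder so it does not touch the project files; `toy_matrix`, `restored`, and the file name are hypothetical. Passing `index_col=0` tells pandas to use the first column (the passers) as row labels again:

```python
import os
import tempfile

import pandas as pd

# Toy Passing Network built from three hypothetical passes
toy_matrix = pd.crosstab(
    pd.Series(["A", "A", "B"], name="player1"),
    pd.Series(["B", "C", "A"], name="player2"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_matrix.csv")
    toy_matrix.to_csv(path)                      # write the matrix to disk
    restored = pd.read_csv(path, index_col=0)    # read it back, first column as the index

print(restored)
```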

---

_Sports Python Educational Project content, licensed under Attribution-NonCommercial-ShareAlike 4.0 International_