## Getting Started 

[Kloppy](https://kloppy.pysport.org/) is _the_ industry-standard open-source package for standardizing soccer data, used by clubs in the English Premier League, Italian Serie A, La Liga, German Bundesliga, Major League Soccer, Dutch Eredivisie and other leagues. It converts data from different data providers into a single, consistent format.

We can use Kloppy to directly load data from GitHub (see below).

### Before we get started:
- Install Python 3.11+ if you don't have it already.
- Create a virtual environment to hold all the Python packages for this project.
- Activate the virtual environment and select it as the kernel for this Jupyter Notebook.
- [Install Kloppy](https://kloppy.pysport.org/user-guide/installation/) (`pip install kloppy`) into the virtual environment.
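
The setup steps above can be sanity-checked from inside the notebook. The following is a minimal sketch using only the standard library; the variable names are illustrative. It reports whether the running kernel meets the Python 3.11+ requirement, whether it appears to run inside a virtual environment, and whether a `kloppy` installation is visible to it.

```python
import importlib.util
import sys

# Minimum Python version stated in the setup list above
REQUIRED_PYTHON = (3, 11)

python_ok = sys.version_info >= REQUIRED_PYTHON
# find_spec returns None when the package is not installed in this environment
kloppy_installed = importlib.util.find_spec("kloppy") is not None
# Inside a venv/virtualenv environment, sys.prefix differs from sys.base_prefix
in_virtualenv = sys.prefix != sys.base_prefix

print(f"Python {sys.version_info.major}.{sys.version_info.minor} (3.11+ required): {python_ok}")
print(f"Running inside a virtual environment: {in_virtualenv}")
print(f"kloppy importable: {kloppy_installed}")
```

The virtual environment check only recognizes environments created with `venv` or `virtualenv`; other environment managers may report `False` here even when an isolated environment is active.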

---

### Kloppy

Now we can load our tracking data directly from GitHub.

In [6]:
# !pip install kloppy  # uncomment to install Kloppy from inside the notebook
from kloppy import skillcorner

# Both URLs are pinned to one commit of the SkillCorner open data repository
match_id = 1886347
tracking_data_github_url = f"https://media.githubusercontent.com/media/SkillCorner/opendata/741bdb798b0c1835057e3fa77244c1571a00e4aa/data/matches/{match_id}/{match_id}_tracking_extrapolated.jsonl"
meta_data_github_url = f"https://raw.githubusercontent.com/SkillCorner/opendata/741bdb798b0c1835057e3fa77244c1571a00e4aa/data/matches/{match_id}/{match_id}_match.json"

dataset = skillcorner.load(
    meta_data=meta_data_github_url,
    raw_data=tracking_data_github_url,
    # Optional parameters
    coordinates="skillcorner",  # or specify a different coordinate system
    sample_rate=1 / 2,  # reduces the data from 10 fps to 5 fps
    limit=100,  # only load the first 100 frames
)
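
The two URLs above differ only in host and file name, and both pin the same commit of the SkillCorner open data repository, so the files stay the same even if the repository is updated later. A small helper makes that structure explicit and makes it easier to switch matches. This is only a sketch: `opendata_urls`, `PINNED_COMMIT` and the parameter names are hypothetical and not part of Kloppy. The tracking file is served from `media.githubusercontent.com`, presumably because it is stored with Git LFS.

```python
# Commit of SkillCorner/opendata used in the cell above
PINNED_COMMIT = "741bdb798b0c1835057e3fa77244c1571a00e4aa"


def opendata_urls(match_id: int, commit: str = PINNED_COMMIT) -> tuple[str, str]:
    """Return (meta_data_url, tracking_data_url) for one match (hypothetical helper)."""
    match_path = f"SkillCorner/opendata/{commit}/data/matches/{match_id}"
    meta_data_url = (
        f"https://raw.githubusercontent.com/{match_path}/{match_id}_match.json"
    )
    tracking_data_url = (
        f"https://media.githubusercontent.com/media/{match_path}"
        f"/{match_id}_tracking_extrapolated.jsonl"
    )
    return meta_data_url, tracking_data_url


meta_data_url, tracking_data_url = opendata_urls(1886347)
print(meta_data_url)
print(tracking_data_url)
```

The returned strings can be passed to `skillcorner.load` as `meta_data` and `raw_data`, in the same way as the hard-coded URLs in the cell above.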

### Basic Kloppy Operations

- Transform the orientation so that the team in possession always attacks from left to right
- Keep only the frames from the first half
- Export the result to a [Polars](https://pola.rs/) DataFrame

In [11]:
df = (
    dataset
    .transform(to_orientation="BALL_OWNING_TEAM")  # The team in possession now always attacks from left to right
    .filter(lambda frame: frame.period.id == 1)  # Only keep frames from the first half
    .to_df(engine="polars")  # Convert to a Polars DataFrame, or use engine="pandas" for a pandas DataFrame
)
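
To make the orientation step concrete, here is a conceptual sketch of what orienting to the ball-owning team means in a coordinate system whose origin is the centre of the pitch, which is assumed here for SkillCorner coordinates. It is not Kloppy's implementation: `Point`, `flip` and `orient_left_to_right` are hypothetical names, and the positions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A position in metres, with (0, 0) assumed to be the centre of the pitch."""
    x: float
    y: float


def flip(point: Point) -> Point:
    """Rotate a point 180 degrees around the centre of the pitch."""
    return Point(-point.x, -point.y)


def orient_left_to_right(points, attacking_left_to_right: bool):
    """Return the points as seen when the team in possession attacks left to right."""
    if attacking_left_to_right:
        return list(points)  # already in the desired orientation
    return [flip(point) for point in points]


# Made-up positions for a frame in which the team in possession attacks right to left
frame_points = [Point(-30.0, 5.0), Point(12.5, -8.0)]
oriented = orient_left_to_right(frame_points, attacking_left_to_right=False)
print(oriented)
```

Flipping both axes rather than only `x` keeps left and right wings consistent from the attacking team's point of view.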

In [8]:
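# The metadata lists the home team first, then the away team
home_team, away_team = dataset.metadata.teams

# Print the jersey number and name of every home-team player
for player in home_team.players:
    print(f"{player.jersey_no} - {player.name}")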

10 - Guillermo Luis May Bartesaghi
17 - Callan Elliot
22 - Jake Brimmer
15 - Francis De Vries
4 - Nando Pijnaker
23 - Daniel Hall
6 - Louis Verstraete
14 - Liam  Gillion
25 - Neyder Stiven Moreno Betancur
5 - Tommy Smith
9 - Max Mata
27 - Logan Rogerson
28 - Luis Felipe Gallegos Leiva
3 - Scott Galloway
1 - Michael Woud
8 - Luis Toomey
7 - Cameron Drew Howieson
12 - Alex Noah Paulsen
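
Because each player object exposes `jersey_no` and `name`, a per-team lookup table takes a single dictionary comprehension. The sketch below uses stand-in objects so that it runs without downloading any data; `jersey_lookup` is a hypothetical helper, and the two stand-in players are copied from the output above.

```python
from types import SimpleNamespace


def jersey_lookup(players) -> dict[int, str]:
    """Map jersey number to player name for any iterable of player-like objects."""
    return {player.jersey_no: player.name for player in players}


# Stand-ins for home_team.players; only the two attributes used above are modelled
stand_in_players = [
    SimpleNamespace(jersey_no=10, name="Guillermo Luis May Bartesaghi"),
    SimpleNamespace(jersey_no=1, name="Michael Woud"),
]

lookup = jersey_lookup(stand_in_players)
print(lookup[1])
```

With the real dataset, `jersey_lookup(home_team.players)` and `jersey_lookup(away_team.players)` would give one table per team.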


### Basic Kloppy Functionality

The Kloppy user guide covers each of the concepts used above in more detail:

- [TrackingDataset](https://kloppy.pysport.org/user-guide/concepts/tracking-data/)
- [Metadata (players, team names etc.)](https://kloppy.pysport.org/user-guide/concepts/metadata/)
- [Coordinate Systems](https://kloppy.pysport.org/user-guide/concepts/coordinates/#built-in-coordinate-systems)
- [Transformations](https://kloppy.pysport.org/user-guide/transformations/coordinates/)
- [Filter](https://kloppy.pysport.org/user-guide/getting-started/#filtering-data)
- [Exporting to pandas / polars DataFrames](https://kloppy.pysport.org/user-guide/exporting-data/dataframes/)

### Plotting

Use `mplsoccer` and `matplotlib` to draw the pitch and plot player and ball positions from the tracking data.

See the [Plotting Examples](https://kloppy.pysport.org/user-guide/getting-started/#exec-51--__tabbed_1_2) in the Kloppy user guide.