This repository provides open-access football event data from Impect, a leading provider of football analytics and event data. The dataset includes detailed match event data, focusing on key performance indicators such as bypassed opponents and other advanced metrics.
- Source: Impect
- Format: JSON
- Coverage:
  - Bundesliga 2023/24
- Data Points:
  - Event Data
  - Event KPIs
  - Team Lineups & Substitutions
  - Match Info
  - Player Info
  - Squad Info
  - Country List
  - Iteration List
  - KPI Definitions
- Clone the Repository
git clone https://github.com/ImpectAPI/open-data.git
- Navigate to the Project Folder
cd open-data
- Load Data in Python (Example)
import pandas as pd

# read data
df = pd.read_json("data/events/events_122838.json")

# print first 5 rows to console
print(df.head())
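Before handing a file to pandas, it can help to check how its top level is laid out, since nested JSON often needs flattening first. The following is a minimal sketch using only the standard library; `describe_json` is a hypothetical helper name, not part of this repository, and the sketch makes no assumption about the actual event schema (see Documentation.pdf for that).

```python
import json
from pathlib import Path


def describe_json(path):
    """Return a short summary of the top-level structure of a JSON file.

    Hypothetical helper for illustration; it only inspects the top level.
    """
    with Path(path).open(encoding="utf-8") as fh:
        payload = json.load(fh)

    if isinstance(payload, list):
        # Collect the union of keys seen on dict records, in first-seen order.
        keys = []
        for record in payload:
            if isinstance(record, dict):
                for key in record:
                    if key not in keys:
                        keys.append(key)
        return {"type": "list", "length": len(payload), "keys": keys}

    if isinstance(payload, dict):
        return {"type": "dict", "length": len(payload), "keys": list(payload)}

    # Scalars (string, number, null) have no length or keys to report.
    return {"type": type(payload).__name__, "length": None, "keys": []}


if __name__ == "__main__":
    # Path taken from the pandas example above; run from the repository root.
    events_path = Path("data/events/events_122838.json")
    if events_path.exists():
        print(describe_json(events_path))
```

If the summary shows nested objects among the keys, `pd.json_normalize` may be a better fit than `pd.read_json` for building a flat table.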
How to Use the Data with Kloppy
Kloppy by PySport is the industry-standard open-source football data standardization package. Kloppy simplifies football data processing by offering a single place to load, filter, transform and export your football data in a standardized way.
To get started with the open dataset:
- Install Kloppy
pip install "kloppy>=3.18.0"
- Load the data
from kloppy import impect

events = impect.load_open_data(match_id=122840)
To load other, non-open data, use impect.load() instead.
- Filter, Transform and Export
df = (
    events.transform(
        to_orientation="STATIC_HOME_AWAY"
    )  # Now, the home team always attacks left to right
    .filter(lambda event: event.period.id == 1)  # Only keep events from the first half
    .to_df(
        engine="polars"
    )  # Convert to a Polars DataFrame, or use engine="pandas" for a Pandas DataFrame
)
The dataset is organized as follows:
open-data/
│-- data/
│ │-- events/ # Contains all in-game events. The filename contains the match ID.
│ │-- events_kpis/ # KPIs on event level for the above event data. The filename contains the match ID.
│ │-- lineups/ # Team lineups and substitutions. The filename contains the match ID.
│ │-- matches/ # Match metadata. The filename contains the iteration ID.
│ │-- players/ # Player master data. The filename contains the iteration ID.
│ │-- squads/ # Squad master data. The filename contains the iteration ID.
│ │-- countries.json # List of countries.
│ │-- iterations.json # Iterations of competitions.
│ │-- kpi_definitions.json # Definitions of key performance indicators (KPIs).
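Because the match-level folders all carry the match ID in the filename, the files for one match can be grouped without opening them. The sketch below assumes the ID is the last run of digits before `.json`, which matches `events_122838.json`; the exact filename patterns in `events_kpis/` and `lineups/` are not stated here, so treat that rule as an assumption. `index_match_files` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: the ID is the last run of digits directly before ".json".
ID_PATTERN = re.compile(r"(\d+)\.json$")

# Folders whose filenames contain a match ID, per the layout above.
MATCH_LEVEL_FOLDERS = ("events", "events_kpis", "lineups")


def index_match_files(data_dir):
    """Map each match ID to the files found for it, keyed by folder name."""
    index = defaultdict(dict)
    for folder in MATCH_LEVEL_FOLDERS:
        folder_path = Path(data_dir) / folder
        if not folder_path.is_dir():
            continue  # Skip folders that are missing from this checkout.
        for path in sorted(folder_path.glob("*.json")):
            found = ID_PATTERN.search(path.name)
            if found:
                index[int(found.group(1))][folder] = path
    return dict(index)


if __name__ == "__main__":
    # Run from the repository root so that "data" resolves correctly.
    for match_id, files in sorted(index_match_files("data").items()):
        print(match_id, sorted(files))
```

The same idea would apply to `matches/`, `players/` and `squads/`, except that those filenames carry the iteration ID rather than the match ID.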
For detailed information on data structure and format, please refer to Documentation.pdf.
- By using this data you agree to our terms and conditions. See LICENSE.pdf for the full terms and conditions.
- If you use this data in your research or projects, please credit Impect as the data provider and use our logo from the img folder.
We encourage users to publish their work using this dataset! Whether it's a blog post, a research paper, a visualization, or an analysis, we'd love to see how you're using the data. Tag Impect on social media to share your insights:
- X (formerly Twitter): @impect_official
- BlueSky: @impect-official.bsky.social
- LinkedIn: Impect on LinkedIn
Join the community and showcase your findings to fellow analysts, researchers, and football enthusiasts!
We welcome contributions to improve data accessibility and documentation! Feel free to submit pull requests or report issues in the Issues section.
For any inquiries regarding this dataset, please reach out to:
- Impect Website: www.impect.com
- Email: thomas.walentin@impect.com
- Email: florian.schmitt@impect.com
🚀 Happy Analyzing!