Add distance and speed to frames as well as extra data when available. #109

Merged
merged 11 commits into from Oct 15, 2021


bdagnino

Hi @koenvo!

Finally, here is the pull request. It includes two main changes:

  • Renamed EPTSSerializer to MetricaEPTSSerializer (and MetricaTrackingSerializer to MetricaCsvTrackingSerializer).
  • Modified the Frame model to hold players_data instead of players_coordinates, and added a new attribute, other_data, for data that is not directly related to a player.

It's one big PR with everything together, but let me know if you want me to split it into separate PRs!

Bruno added 9 commits September 22, 2021 14:50
MetricaTrackingSerializer was the first serializer for Metrica Sports
tracking data included in kloppy. This serializer is for the CSV format
that Metrica Sports used to provide its data in. It's also the format
of 2 of the 3 games with tracking data that Metrica Sports made public.

For quite some time now, Metrica Sports has used a new format for its
tracking data: the FIFA EPTS standard format. That data can be loaded
with the EPTS serializer in kloppy, which will now be renamed to make
clear that it is the serializer for Metrica Sports data.

Thus, to avoid confusion, I propose in this PR that we rename the
original serializer and file to make it clear that they are for
the data in CSV format.

Still pending: any backwards compatibility, or an error we might need
to raise, for users who still use MetricaTrackingSerializer as it was
before.

While a lot of the EPTSSerializer implementation would apply to
the general structure of the EPTS format, certain elements,
such as the periods information or the data types, are specific to
Metrica Sports data.

At the moment no other provider delivers data in EPTS
format. Thus, rather than create a base EPTSSerializer and then build a
MetricaEPTSSerializer on top of it, for now we will rename EPTSSerializer
to make clear that it is a Metrica one.

We will refactor the code in the future if there is a need for another
EPTS serializer.

Before, there was one test for EPTS, another for loading EPTS data from
Metrica Sports, and another for the Metrica Sports metadata.
There was also a separate metadata file for Metrica Sports events
files.

Since there is now only one EPTS serializer and it is for Metrica Sports
files, I have consolidated the tests into testing that serializer
alone. I have also consolidated the test files into just one metadata
file, one tracking data file and one events data file.

Before, for each player on each frame we were only serializing the x and y
coordinates of the tracking data. On top of that, most providers also provide
the speed of each player on each frame, and some also provide the
distance covered.

In this first commit I changed the attribute in Frame from
players_coordinates to players_data. players_data holds the PlayerData
information for each player on that frame.

PlayerData is a new model that has coordinates, distance and speed
attributes.
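
To make the shape of the new model concrete, here is a minimal sketch of what PlayerData and the players_data attribute could look like. Only the attribute names coordinates, distance and speed, and the names Frame, PlayerData and players_data, come from this PR; the Point, Player and frame_id definitions are simplified stand-ins assumed for illustration and are not kloppy's actual classes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Point:
    # Simplified stand-in for a 2D coordinate (assumption, not kloppy's class)
    x: float
    y: float


@dataclass(frozen=True)
class Player:
    # Simplified stand-in; frozen so it is hashable and usable as a dict key
    player_id: str


@dataclass
class PlayerData:
    # Attribute names follow the commit message; distance and speed are
    # optional because not every provider supplies them
    coordinates: Optional[Point]
    distance: Optional[float] = None
    speed: Optional[float] = None


@dataclass
class Frame:
    frame_id: int
    # Replaces the old players_coordinates mapping of Player -> Point
    players_data: Dict[Player, PlayerData]


home_1 = Player("home_1")
frame = Frame(
    frame_id=1,
    players_data={
        home_1: PlayerData(Point(0.3, 0.5), distance=0.12, speed=4.2)
    },
)
```

Making distance and speed default to None keeps frames from providers that only ship x and y valid without extra handling.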

In this commit, these changes are only implemented for the Metrica Sports
EPTS serializer.

I also modified the helper to_pandas so that this extra data is included
on the resulting dataframe as well.
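
As a rough illustration of the to_pandas change, the sketch below flattens one frame's player data into a single row, with one column per player and per value. The function name frame_to_row, the tuple layout and the _x/_y/_d/_s column suffixes are assumptions made for this example; the actual helper in kloppy may name and order things differently.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed layout for this sketch: player_id -> (x, y, distance, speed)
PlayerValues = Tuple[
    Optional[float], Optional[float], Optional[float], Optional[float]
]


def frame_to_row(
    frame_id: int, players_data: Dict[str, PlayerValues]
) -> Dict[str, Any]:
    """Flatten one frame into a flat dict, one column per player value."""
    row: Dict[str, Any] = {"frame_id": frame_id}
    for player_id, (x, y, distance, speed) in players_data.items():
        row[f"{player_id}_x"] = x
        row[f"{player_id}_y"] = y
        # Distance and speed are the columns added on top of the coordinates
        row[f"{player_id}_d"] = distance
        row[f"{player_id}_s"] = speed
    return row


rows = [
    frame_to_row(1, {"home_1": (0.3, 0.5, 0.12, 4.2)}),
    frame_to_row(2, {"home_1": (0.31, 0.5, 0.15, None)}),
]
```

A list of such rows can be passed directly to pandas.DataFrame(rows) to obtain one row per frame; missing values stay as None and show up as empty cells.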
Depending on the data type, Metrica Sports data can contain fields
that do not belong to the PlayerData types (position, distance, speed). These
extra data types need to be deserialized, added to the Frame model in
other_data, and then also included in the dataframe when using to_pandas
on the dataset.

other_data is a dict that can hold any kind of data and is
added to the Frame model. While for now it is only used for the Metrica EPTS
data, it leaves a base for serializers of other providers to add
data other than the player and ball data.
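
A small sketch of how a free-form other_data dict on the frame might be carried through to a flat row. Only the names Frame and other_data come from this PR; the frame_id field, the frame_to_row function and the example_sensor key are made up for illustration and do not correspond to real Metrica fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Frame:
    frame_id: int
    # Free-form bag for values that belong to neither a player nor the ball;
    # default_factory avoids sharing one dict between Frame instances
    other_data: Dict[str, Any] = field(default_factory=dict)


def frame_to_row(frame: Frame) -> Dict[str, Any]:
    """Copy every other_data entry into the flat row as its own column."""
    row: Dict[str, Any] = {"frame_id": frame.frame_id}
    for name, value in frame.other_data.items():
        row[name] = value
    return row


with_extra = Frame(frame_id=1, other_data={"example_sensor": 0.75})
without_extra = Frame(frame_id=2)

row_with_extra = frame_to_row(with_extra)
row_without_extra = frame_to_row(without_extra)
```

Because other_data defaults to an empty dict, serializers for providers that have no such fields do not need to change to produce valid frames.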

The next step is to modify the tests and serializers for all other
providers.
@koenvo
Contributor

koenvo commented Oct 10, 2021

Looks good to me! Only one small addition for backwards compatibility: please add a ‘players_coordinates’ property which returns ‘Dict[Player, Point]’
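
For reference, a minimal sketch of the kind of backwards-compatible property requested here, derived from players_data. The property is named players_coordinates to match the attribute that players_data replaced; Point, Player and PlayerData are simplified stand-ins assumed for this example, and the version that ended up in kloppy may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Point:
    # Simplified stand-in (assumption, not kloppy's class)
    x: float
    y: float


@dataclass(frozen=True)
class Player:
    # Frozen so it is hashable and usable as a dict key
    player_id: str


@dataclass
class PlayerData:
    coordinates: Point
    distance: Optional[float] = None
    speed: Optional[float] = None


@dataclass
class Frame:
    players_data: Dict[Player, PlayerData]

    @property
    def players_coordinates(self) -> Dict[Player, Point]:
        # Read-only view for callers that still expect Player -> Point
        return {
            player: data.coordinates
            for player, data in self.players_data.items()
        }


home_1 = Player("home_1")
frame = Frame({home_1: PlayerData(Point(0.3, 0.5), speed=4.2)})
old_style = frame.players_coordinates
```

Since the property builds a new dict on each access, code that only reads the old attribute keeps working, while writes have to go through players_data.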

@koenvo koenvo added this to the 3.0.0 milestone Oct 10, 2021
@koenvo koenvo merged commit 4f104f8 into PySport:master Oct 15, 2021
@bdagnino
Author

@koenvo sorry about the delay on the addition you requested! I just came to do it and saw it's already implemented :). Thanks!

And great that this is merged to master already!

@koenvo koenvo modified the milestones: 3.0.0, 2.2.0 Oct 31, 2021