# State

When working with event data, it is often useful to know the current score at each event, for example to determine whether a team is winning, losing or drawing. The score is one example of what kloppy calls 'state': context that is attached to every event.
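
As a minimal sketch of the idea, the snippet below derives winning, losing or drawing from a score for either team. The `Score` dataclass and the `game_state` function are hypothetical stand-ins written for this illustration and are not part of kloppy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    # Hypothetical stand-in for a score with home and away goal counts.
    home: int
    away: int


def game_state(score: Score, is_home_team: bool) -> str:
    """Return 'winning', 'losing' or 'drawing' from one team's point of view."""
    goal_difference = score.home - score.away
    if not is_home_team:
        # The away team sees the same difference with the sign flipped.
        goal_difference = -goal_difference
    if goal_difference > 0:
        return "winning"
    if goal_difference < 0:
        return "losing"
    return "drawing"


print(game_state(Score(home=2, away=0), is_home_team=True))
print(game_state(Score(home=2, away=0), is_home_team=False))
print(game_state(Score(home=1, away=1), is_home_team=True))
```

Taking the team's perspective as an explicit argument keeps the helper usable for both sides of the same match.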

In this quickstart we'll look at how to use the `add_state` method of kloppy to add state to the events for later use.

## Loading some StatsBomb data

First we'll load Barcelona - Deportivo Alavés from the StatsBomb open-data project.

In [6]:
from kloppy import datasets
from kloppy.domain import EventType

dataset = datasets.load('statsbomb')
print([team.name for team in dataset.metadata.teams])
print(dataset.events[0].state)

['Barcelona', 'Deportivo Alavés']
{}


## Add state - score

kloppy ships with some default state builders: `score`, `lineup` and `sequence`. Let's start with the `score` state builder.

In [7]:
dataset = dataset.add_state('score')

dataset.events[0].state

{'score': Score(home=0, away=0)}

As you can see, `state` now contains a `Score` object with two attributes: `home` and `away`. Every event carries its own score object, and the score is updated automatically whenever a goal is scored.
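
To make the automatic update concrete, here is a minimal sketch in plain Python of how a running score could be attached to events. It is not kloppy's implementation: the `Event` and `Score` classes and the `add_score_state` function are hypothetical, own goals are ignored, and the sketch assumes each event carries the score as it stood just before that event.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Score:
    home: int = 0
    away: int = 0


@dataclass
class Event:
    # Hypothetical, simplified event: only the fields this sketch needs.
    team: str  # "home" or "away"
    is_goal: bool = False
    state: Dict[str, object] = field(default_factory=dict)


def add_score_state(events: List[Event]) -> List[Event]:
    """Attach the running score to every event, in chronological order."""
    score = Score()
    for event in events:
        # Assumption: the event shows the score from just before it happened.
        event.state["score"] = score
        if event.is_goal:
            if event.team == "home":
                score = Score(score.home + 1, score.away)
            else:
                score = Score(score.home, score.away + 1)
    return events


events = add_score_state([
    Event("home"),
    Event("home", is_goal=True),
    Event("away"),
    Event("away", is_goal=True),
    Event("home"),
])
for event in events:
    print(event.state["score"])
```

Because `Score` is immutable, each goal creates a new object, so earlier events keep the score they were given.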

Now let's have a look at how we can use the score state. First we filter the dataset so that only shots remain.

In [8]:
dataset = dataset.filter(lambda event: event.event_type == EventType.SHOT)
shots = dataset.events
len(shots)

28

In [9]:
for shot in shots:
    print(shot.state['score'], shot.player.team, '-', shot.player, '-', shot.result)

0-0 Barcelona - Lionel Andrés Messi Cuccittini - ShotResult.OFF_TARGET
0-0 Barcelona - Jordi Alba Ramos - ShotResult.OFF_TARGET
0-0 Barcelona - Lionel Andrés Messi Cuccittini - ShotResult.SAVED
0-0 Deportivo Alavés - Rubén Sobrino Pozuelo - ShotResult.OFF_TARGET
0-0 Barcelona - Luis Alberto Suárez Díaz - ShotResult.OFF_TARGET
0-0 Barcelona - Ousmane Dembélé - ShotResult.OFF_TARGET
0-0 Barcelona - Ivan Rakitić - ShotResult.OFF_TARGET
0-0 Barcelona - Lionel Andrés Messi Cuccittini - ShotResult.POST
0-0 Barcelona - Gerard Piqué Bernabéu - ShotResult.OFF_TARGET
0-0 Barcelona - Ousmane Dembélé - ShotResult.SAVED
0-0 Barcelona - Luis Alberto Suárez Díaz - ShotResult.OFF_TARGET
0-0 Barcelona - Ousmane Dembélé - ShotResult.OFF_TARGET
0-0 Barcelona - Jordi Alba Ramos - ShotResult.SAVED
0-0 Deportivo Alavés - Mubarak Wakaso - ShotResult.BLOCKED
0-0 Barcelona - Luis Alberto Suárez Díaz - ShotResult.BLOCKED
0-0 Barcelona - Philippe Coutinho Correia - ShotResult.BLOCKED
0-0 Barcelona - Jordi Alba R

We can also export the shots to a pandas dataframe and add the home and away score as additional columns.

In [15]:
dataframe = dataset.to_pandas(additional_columns={
    'home_score': lambda event: event.state['score'].home,
    'away_score': lambda event: event.state['score'].away
})
dataframe

Unnamed: 0,event_id,event_type,result,success,period_id,timestamp,end_timestamp,ball_state,ball_owning_team,team_id,player_id,coordinates_x,coordinates_y,home_score,away_score
0,65f16e50-7c5d-4293-b2fc-d20887a772f9,SHOT,OFF_TARGET,False,1,149.094,,alive,217,217,5503,111.65,51.65,0,0
1,b0f73423-3990-45ae-9dda-3512c2d1aff3,SHOT,OFF_TARGET,False,1,339.239,,alive,217,217,5211,113.95,26.95,0,0
2,13b1ddab-d22e-43d9-bfe4-12632fea1a27,SHOT,SAVED,False,1,928.625,,alive,217,217,5503,91.95,34.45,0,0
3,391bfb74-07a6-4afe-9568-02a9b23f5bd4,SHOT,OFF_TARGET,False,1,979.616,,alive,206,206,6613,109.05,38.65,0,0
4,5e55f5a5-954f-4cc4-ba6e-a9cf6d6e249e,SHOT,OFF_TARGET,False,1,1095.914,,alive,217,217,5246,106.95,24.95,0,0
5,1c0347cd-14dc-4aa8-91eb-520672a6cfe1,SHOT,OFF_TARGET,False,1,1842.287,,alive,217,217,5477,108.05,27.35,0,0
6,7c3182af-c8a8-4c7c-934e-5c41c7b93c6a,SHOT,OFF_TARGET,False,1,2104.861,,alive,217,217,5470,111.95,43.65,0,0
7,39f231e5-0072-461c-beb0-a9bedb420f83,SHOT,POST,False,1,2248.168,,alive,217,217,5503,96.95,53.95,0,0
8,062cdd08-8773-424f-8fc5-2e3d441c3c5c,SHOT,OFF_TARGET,False,1,2250.989,,alive,217,217,5213,112.25,41.35,0,0
9,c09e904d-6c8e-479d-af2e-c2c5863aca71,SHOT,SAVED,False,1,2308.083,,alive,217,217,5477,102.45,29.15,0,0


Now filter the dataframe. We only want to see shots taken while the home team (Barcelona) is leading by at least two goals.

In [18]:
dataframe[dataframe['home_score'] - dataframe['away_score'] >= 2]

Unnamed: 0,event_id,event_type,result,success,period_id,timestamp,end_timestamp,ball_state,ball_owning_team,team_id,player_id,coordinates_x,coordinates_y,home_score,away_score
26,252b3061-7d3f-4922-b04f-0b37a44c6300,SHOT,SAVED,False,2,2662.638,,alive,217,217,5503,106.05,46.05,2,0
27,55d71847-9511-4417-aea9-6f415e279011,SHOT,GOAL,True,2,2802.77,,alive,217,217,5503,111.95,34.55,2,0


## Add state - lineup

The score is not the only state we can add. In this example we'll look at adding lineup state.

In [2]:
from kloppy import datasets
from kloppy.domain import EventType

dataset = datasets.load('statsbomb')
home_team, away_team = dataset.metadata.teams

Arturo Vidal is a substitute for Barcelona. We add the lineup to all events so we can keep only the events that happen while Arturo Vidal is on the pitch.

In [3]:
arturo_vidal = home_team.get_player_by_id(8206)

In [4]:
dataframe = (
    dataset
    .add_state('lineup')
    .filter(lambda event: arturo_vidal in event.state['lineup'].players)
    .to_pandas()
)
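
The idea behind a lineup state can be sketched in plain Python: start from the starting players and update the set at every substitution. This is an illustration only, not kloppy's implementation. The `Event` class, the `add_lineup_state` function and the player names are hypothetical, and applying the substitution before recording the lineup is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event; a substitution names who leaves and who enters.
    timestamp: float
    player_off: Optional[str] = None
    player_on: Optional[str] = None


def add_lineup_state(
    events: List[Event], starting_players: FrozenSet[str]
) -> List[Tuple[Event, FrozenSet[str]]]:
    """Pair every event with the set of players on the pitch at that moment."""
    on_pitch = starting_players
    result = []
    for event in events:
        if event.player_on is not None:
            # Assumption: the substitution takes effect on the substitution event itself.
            on_pitch = (on_pitch - {event.player_off}) | {event.player_on}
        result.append((event, on_pitch))
    return result


events = [
    Event(10.0),
    Event(20.0, player_off="Player A", player_on="Player B"),
    Event(30.0),
]
with_lineup = add_lineup_state(events, frozenset({"Player A", "Player C"}))

# Keep only the events that happen while "Player B" is on the pitch.
player_b_events = [event for event, lineup in with_lineup if "Player B" in lineup]
print([event.timestamp for event in player_b_events])
```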

In [5]:
print(f"time on pitch: {dataframe['timestamp'].max() - dataframe['timestamp'].min()} seconds")


time on pitch: 490.6479999999997 seconds
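
Subtracting the smallest timestamp from the largest works here because all of these events fall within one period. The sketch below is a hedged generalisation for a player whose events span several periods. It assumes timestamps are in seconds and restart at the beginning of every period, as the period 2 values in the shot table above suggest. The `time_on_pitch` function and the sample values are made up for illustration.

```python
from typing import Dict, Iterable, Tuple


def time_on_pitch(events: Iterable[Tuple[int, float]]) -> float:
    """Sum, per period, the span between a player's first and last event.

    `events` holds (period_id, timestamp) pairs, with timestamps in seconds
    that restart at the beginning of every period.
    """
    spans: Dict[int, Tuple[float, float]] = {}
    for period_id, timestamp in events:
        first, last = spans.get(period_id, (timestamp, timestamp))
        spans[period_id] = (min(first, timestamp), max(last, timestamp))
    return sum(last - first for first, last in spans.values())


# Made-up timestamps: on the pitch late in period 1 and early in period 2.
sample = [(1, 2400.0), (1, 2700.0), (2, 5.0), (2, 605.0)]
print(f"time on pitch: {time_on_pitch(sample)} seconds")
```

Like the dataframe version, this measures the time between the first and last recorded event, so it approximates the actual time on the pitch.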


In [62]:
dataframe = (
    dataset
    .add_state('lineup')
    .filter(lambda event: event.event_type == EventType.PASS and event.team == home_team)
    .to_pandas(additional_columns={
        'vidal_on_pitch': lambda event: arturo_vidal in event.state['lineup'].players
    })
)

In [63]:
dataframe = dataframe.groupby(['vidal_on_pitch'])['success'].agg(['sum', 'count'])
dataframe['percentage'] = dataframe['sum'] / dataframe['count'] * 100
dataframe

vidal_on_pitch,sum,count,percentage
False,709,798,88.847118
True,83,88,94.318182
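
The `groupby` step above counts successful passes (`sum`) and all passes (`count`) per group and divides one by the other. A plain-Python sketch of the same aggregation, using a hypothetical `completion_by_group` function and made-up sample passes, could look like this:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def completion_by_group(
    passes: Iterable[Tuple[bool, bool]]
) -> Dict[bool, Dict[str, float]]:
    """Group (vidal_on_pitch, success) pairs and compute the completion percentage."""
    totals = defaultdict(lambda: {"sum": 0, "count": 0})
    for group, success in passes:
        totals[group]["sum"] += int(success)
        totals[group]["count"] += 1
    return {
        group: {**values, "percentage": values["sum"] / values["count"] * 100}
        for group, values in totals.items()
    }


# Made-up passes: three with the player off the pitch, two with him on it.
sample = [(False, True), (False, True), (False, False), (True, True), (True, True)]
result = completion_by_group(sample)
print(result)
```

Applied to the counts in the table, 709 / 798 and 83 / 88 give the percentages shown.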
