# Wyscout to SPADL Conversion

### What is SPADL?

SPADL (Soccer Player Action Description Language) represents a game as a sequence of on-the-ball actions [a1, a2, ..., am], where m is the total number of actions that happened in the game.

SPADL uses a standardized coordinate system with the origin at the bottom left of the pitch and a uniform field of 105 m x 68 m. For direction of play, SPADL uses the “home team attacks to the right” convention, but the `play_left_to_right()` function can convert this so that lower x-coordinates always represent the own half of the team performing the action.
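
To make the direction-of-play convention concrete, here is a minimal sketch of the mirroring idea. It assumes that actions by the non-home team are flipped through the centre of the 105 x 68 pitch. The function name `mirror_if_away` and its parameters are hypothetical and only illustrate the idea; this is not the library's implementation of `play_left_to_right()`.

```python
FIELD_LENGTH = 105.0  # metres, SPADL pitch length
FIELD_WIDTH = 68.0    # metres, SPADL pitch width


def mirror_if_away(x, y, team_id, home_team_id):
    """Return (x, y) so that the acting team always attacks left to right.

    Hypothetical helper: home-team coordinates are kept as they are,
    away-team coordinates are flipped on both axes.
    """
    if team_id == home_team_id:
        return x, y
    return FIELD_LENGTH - x, FIELD_WIDTH - y


# The home team keeps its coordinates, the away team is mirrored
home_xy = mirror_if_away(10.0, 20.0, team_id=1, home_team_id=1)
away_xy = mirror_if_away(10.0, 20.0, team_id=2, home_team_id=1)
```

After mirroring, a low x value means "close to the acting team's own goal" for both teams, which is what makes actions from the two sides comparable.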

**Action Type**<br>
The action type attribute can have 22 possible values. These are pass, cross, throw-in, crossed free kick, short free kick, crossed corner, short corner, take-on, foul, tackle, interception, shot, penalty shot, free kick shot, keeper save, keeper claim, keeper punch, keeper pick-up, clearance, bad touch, dribble and goal kick. A detailed definition of each action type is available in the socceraction documentation.

**Result**<br>
The result attribute can either have the value success, to indicate that an action achieved its intended result, or the value fail, if this was not the case. An example of a successful action is a pass which reaches a teammate. An example of an unsuccessful action is a pass which goes over the sideline. Some action types can have special results. These are offside (for passes, corners and free kicks), own goal (for shots), and yellow card and red card (for fouls).

**Body Part**<br>
The body part attribute can have 4 possible values. These are foot, head, other and none. For Wyscout, which does not distinguish between the head and other body parts, a special body part head/other is used.
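
The three categorical attributes above can be pictured as a small validated record. The sketch below is only an illustration: the `Action` class is hypothetical, and the value spellings are copied from the lists above rather than from the identifiers the library uses internally.

```python
from dataclasses import dataclass

# Values copied from the prose above; spellings are illustrative only
ACTION_TYPES = (
    "pass", "cross", "throw-in", "crossed free kick", "short free kick",
    "crossed corner", "short corner", "take-on", "foul", "tackle",
    "interception", "shot", "penalty shot", "free kick shot", "keeper save",
    "keeper claim", "keeper punch", "keeper pick-up", "clearance",
    "bad touch", "dribble", "goal kick",
)
RESULTS = ("fail", "success", "offside", "own goal", "yellow card", "red card")
BODY_PARTS = ("foot", "head", "other", "none", "head/other")


@dataclass(frozen=True)
class Action:
    """Hypothetical container for the categorical attributes of one action."""

    action_type: str
    result: str
    body_part: str

    def __post_init__(self):
        # Reject any value that is not in the vocabularies listed above
        if self.action_type not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.action_type!r}")
        if self.result not in RESULTS:
            raise ValueError(f"unknown result: {self.result!r}")
        if self.body_part not in BODY_PARTS:
            raise ValueError(f"unknown body part: {self.body_part!r}")


# A successful pass played with the foot
example = Action("pass", "success", "foot")
```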

### The problem we need to solve: moving to v3

The old Wyscout event format from version 2 of the API looks like this:
```
{
    "tags": [
        {
            "id": 1802,
            "tag": {
                "label": "not accurate"
            }
        }
    ],
    "eventId": 8,
    "eventName": "Pass",
    "eventSec": 1.8496730000000001,
    "id": 663292348,
    "matchId": 2852835,
    "matchPeriod": "1H",
    "playerId": 21123,
    "positions": [
        {
            "x": 52,
            "y": 47
        },
        {
            "x": 60,
            "y": 32
        }
    ],
    "subEventId": 85,
    "subEventName": "Simple pass",
    "teamId": 3185
}
```
The new version 3 format is somewhat more complex:
```
{
    "id": 601919968,
    "matchId": -168770,
    "matchPeriod": "1H",
    "minute": 8,
    "second": 21,
    "matchTimestamp": "00:08:21.568",
    "videoTimestamp": "507.568215",
    "relatedEventId": 601919969,
    "type": {
        "primary": "pass",
        "secondary": [
            "back_pass"
        ]
    },
    "location": {
        "x": 42,
        "y": 83
    },
    "team": {
        "id": 964,
        "name": "Borussia Dortmund",
        "formation": "3-4-3"
    },
    "opponentTeam": {
        "id": 961,
        "name": "Bayern München",
        "formation": "4-2-3-1"
    },
    "player": {
        "id": 156709,
        "name": "T. Hazard",
        "position": "RWF"
    },
    "pass": {
        "accurate": true,
        "length": 9.34,
        "angle": 148,
        "recipient": {
            "id": 419254,
            "name": "A. Hakimi",
            "position": "RWB"
        },
        "endLocation": {
            "x": 34,
            "y": 89
        }
    },
    "shot": null,
    "groundDuel": null,
    "aerialDuel": null,
    "infraction": null,
    "carry": null,
    "possession": {
        "id": 601919966,
        "duration": "3.293842",
        "types": [
            "throw_in"
        ],
        "eventsNumber": 3,
        "eventIndex": 1,
        "startLocation": {
            "x": 29,
            "y": 100
        },
        "endLocation": {
            "x": 43,
            "y": 28
        },
        "team": {
            "id": 964,
            "name": "Borussia Dortmund",
            "formation": "3-4-3"
        },
        "attack": null
    }
}
```

There is currently no converter from the Wyscout v3 format to the SPADL data format, so we are going to build one ourselves.
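
As a first step towards such a converter, the nested v3 event can be flattened into the kind of flat record that the v2 format already provides. The sketch below is a rough illustration that relies only on the fields visible in the v3 example above. The function name `flatten_v3_event` and the output keys are my own choices, not part of any library, and only pass events carry an end location here because that is the only case the example shows.

```python
def flatten_v3_event(event):
    """Flatten one Wyscout v3 event dict into a flat record (hypothetical helper)."""
    location = event.get("location") or {}
    # "pass" is null for non-pass events, so fall back to an empty dict
    pass_info = event.get("pass") or {}
    end_location = pass_info.get("endLocation") or {}
    event_type = event.get("type") or {}
    return {
        "event_id": event["id"],
        "match_id": event["matchId"],
        "period": event["matchPeriod"],
        "team_id": (event.get("team") or {}).get("id"),
        "player_id": (event.get("player") or {}).get("id"),
        "type_primary": event_type.get("primary"),
        "type_secondary": list(event_type.get("secondary") or []),
        "start_x": location.get("x"),
        "start_y": location.get("y"),
        "end_x": end_location.get("x"),
        "end_y": end_location.get("y"),
        "accurate": pass_info.get("accurate"),
    }


# Abbreviated copy of the v3 example event shown above
sample_event = {
    "id": 601919968,
    "matchId": -168770,
    "matchPeriod": "1H",
    "type": {"primary": "pass", "secondary": ["back_pass"]},
    "location": {"x": 42, "y": 83},
    "team": {"id": 964},
    "player": {"id": 156709},
    "pass": {"accurate": True, "endLocation": {"x": 34, "y": 89}},
    "shot": None,
}
flat = flatten_v3_event(sample_event)
```

Keeping the start and end coordinates as separate flat columns mirrors the `positions` pair of the v2 format, which is the shape the existing v2 conversion code consumes.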

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import os
import warnings
import pandas as pd
pd.set_option('display.max_columns', None)
warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)
warnings.filterwarnings(action="ignore", message="credentials were not supplied. open data access only")
import tqdm

In [19]:
# The wildcard import provides convert_to_actions, which is used further down
from socceraction.spadl.wyscout import *
from socceraction.data.wyscout import WyscoutLoader
from socceraction.data.base import _localloadjson
from socceraction.data.wyscout.loader import _convert_events
from socceraction.data.wyscout.schema import WyscoutEventSchema

The Wyscout loader code from the socceraction library also appears to be broken, so I will use some of the functions from that file and the hyperlinks in the WyscoutLoader class to load the data manually in the cell below.

In [26]:
# Events were downloaded from the link in the PublicWyscoutLoader class:
# events="https://ndownloader.figshare.com/files/14464685"

# A modified version of the code from the events method of the PublicWyscoutLoader class
obj = _localloadjson("C:\\Users\\LiamMoore\\Documents\\code\\python\\wyscout-spadl-conversion\\Data\\Events\\events_England.json")

In [25]:
obj

[{'eventId': 8,
  'subEventName': 'Simple pass',
  'tags': [{'id': 1801}],
  'playerId': 25413,
  'positions': [{'y': 49, 'x': 49}, {'y': 78, 'x': 31}],
  'matchId': 2499719,
  'eventName': 'Pass',
  'teamId': 1609,
  'matchPeriod': '1H',
  'eventSec': 2.7586489999999912,
  'subEventId': 85,
  'id': 177959171},
 {'eventId': 8,
  'subEventName': 'High pass',
  'tags': [{'id': 1801}],
  'playerId': 370224,
  'positions': [{'y': 78, 'x': 31}, {'y': 75, 'x': 51}],
  'matchId': 2499719,
  'eventName': 'Pass',
  'teamId': 1609,
  'matchPeriod': '1H',
  'eventSec': 4.946850000000012,
  'subEventId': 83,
  'id': 177959172},
 {'eventId': 8,
  'subEventName': 'Head pass',
  'tags': [{'id': 1801}],
  'playerId': 3319,
  'positions': [{'y': 75, 'x': 51}, {'y': 71, 'x': 35}],
  'matchId': 2499719,
  'eventName': 'Pass',
  'teamId': 1609,
  'matchPeriod': '1H',
  'eventSec': 6.54218800000001,
  'subEventId': 82,
  'id': 177959173},
 {'eventId': 8,
  'subEventName': 'Head pass',
  'tags': [{'id': 180

In [27]:
raw_df = pd.DataFrame(obj)
raw_df.head()

| | eventId | subEventName | tags | playerId | positions | matchId | eventName | teamId | matchPeriod | eventSec | subEventId | id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | Simple pass | [{'id': 1801}] | 25413 | [{'y': 49, 'x': 49}, {'y': 78, 'x': 31}] | 2499719 | Pass | 1609 | 1H | 2.758649 | 85 | 177959171 |
| 1 | 8 | High pass | [{'id': 1801}] | 370224 | [{'y': 78, 'x': 31}, {'y': 75, 'x': 51}] | 2499719 | Pass | 1609 | 1H | 4.94685 | 83 | 177959172 |
| 2 | 8 | Head pass | [{'id': 1801}] | 3319 | [{'y': 75, 'x': 51}, {'y': 71, 'x': 35}] | 2499719 | Pass | 1609 | 1H | 6.542188 | 82 | 177959173 |
| 3 | 8 | Head pass | [{'id': 1801}] | 120339 | [{'y': 71, 'x': 35}, {'y': 95, 'x': 41}] | 2499719 | Pass | 1609 | 1H | 8.143395 | 82 | 177959174 |
| 4 | 8 | Simple pass | [{'id': 1801}] | 167145 | [{'y': 95, 'x': 41}, {'y': 88, 'x': 72}] | 2499719 | Pass | 1609 | 1H | 10.302366 | 85 | 177959175 |


For the rest of the investigation, we keep only a single match from this file.

In [28]:
raw_df = raw_df.loc[raw_df.matchId==2499719]

In [29]:
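from typing import cast
from pandera.typing import DataFrame
# obj is a plain list here, so the condition holds and the events are converted
if not isinstance(obj, dict) or "events" not in obj:
    df_events = _convert_events(pd.DataFrame(raw_df))
    events_df = cast(DataFrame[WyscoutEventSchema], df_events)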

In [31]:
# Check that there is exactly one row per event, i.e. nothing has been exploded
len(events_df), events_df.event_id.nunique()

(1768, 1768)

The core function that does all the conversion work we care about is `convert_to_actions`. Let's look at the output of this function with v2 and v3 event data from Wyscout and see what it returns.

*Note: I'm just using the first team ID as the home team here. It may or may not be the actual home team, but it saves downloading the Teams.json data.*

In [33]:
spadl_actions = convert_to_actions(events_df, 1609)
spadl_actions

| | game_id | period_id | time_seconds | team_id | player_id | start_x | start_y | end_x | end_y | original_event_id | bodypart_id | type_id | result_id | action_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2499719 | 1 | 2.758649 | 1609 | 25413 | 51.45 | 34.68 | 32.55 | 14.96 | 177959171 | 0 | 0 | 1 | 0 |
| 1 | 2499719 | 1 | 4.946850 | 1609 | 370224 | 32.55 | 14.96 | 53.55 | 17.00 | 177959172 | 0 | 0 | 1 | 1 |
| 2 | 2499719 | 1 | 6.542188 | 1609 | 3319 | 53.55 | 17.00 | 36.75 | 19.72 | 177959173 | 1 | 0 | 1 | 2 |
| 3 | 2499719 | 1 | 8.143395 | 1609 | 120339 | 36.75 | 19.72 | 43.05 | 3.40 | 177959174 | 1 | 0 | 1 | 3 |
| 4 | 2499719 | 1 | 10.302366 | 1609 | 167145 | 43.05 | 3.40 | 75.60 | 8.16 | 177959175 | 0 | 0 | 1 | 4 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1283 | 2499719 | 2 | 2990.768201 | 1631 | 8653 | 93.45 | 46.24 | 97.65 | 36.04 | 177961039 | 0 | 0 | 1 | 1283 |
| 1284 | 2499719 | 2 | 2992.491575 | 1631 | 8480 | 97.65 | 36.04 | 56.70 | 33.32 | 177961040 | 0 | 0 | 0 | 1284 |
| 1285 | 2499719 | 2 | 2994.900590 | 1609 | 49876 | 56.70 | 33.32 | 76.65 | 28.56 | 177961035 | 1 | 0 | 1 | 1285 |
| 1286 | 2499719 | 2 | 2997.086392 | 1609 | 7870 | 76.65 | 28.56 | 105.00 | 27.20 | 177961036 | 0 | 11 | 0 | 1286 |
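
Comparing the raw `positions` with the SPADL output suggests how the coordinates are rescaled. The sketch below is inferred purely from the rows shown in this notebook and is not taken from the library source. It assumes Wyscout coordinates are percentages from 0 to 100, that x is scaled to 105 m, and that y is inverted before being scaled to 68 m. The function name `wyscout_to_spadl_xy` is hypothetical.

```python
def wyscout_to_spadl_xy(x_pct, y_pct):
    """Rescale Wyscout percentage coordinates to the 105 x 68 SPADL pitch.

    Assumption inferred from the data above: the y axis is inverted
    because the SPADL origin is at the bottom left of the pitch.
    """
    x = x_pct / 100 * 105
    y = (100 - y_pct) / 100 * 68
    return round(x, 2), round(y, 2)


# First event in the raw data: positions [{'y': 49, 'x': 49}, {'y': 78, 'x': 31}]
start = wyscout_to_spadl_xy(49, 49)
end = wyscout_to_spadl_xy(31, 78)
```

For this event, `start` and `end` match the `start_x`, `start_y`, `end_x` and `end_y` values in the first row of the SPADL output (51.45, 34.68, 32.55, 14.96), which supports the assumed scaling for at least the rows shown here.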
