# Working With Play by Play

Working with play by play can be interesting work in that there's a lot of unknown types of data as well as parsing of strings. In addition, there a ton of cool things that can be done with play be play like sending the feed into a pub\sub model so other systems can interact with it, build your own UI, or a whole host of other ideas.

The goal of this notebook is to walk through the play by play feed examining data such as 

1. `EVENTMSGTYPE` which provides the play type (e.g. FIELD_GOAL_MADE, FIELD_GOAL_MISSED, TIMEOUT, PERIOD_BEGIN, etc.)
2. `EVENTMSGACTIONTYPE`which provides a subcatagorization of `EVENTMSGTYPE` (e.g. REVERSE_LAYUP, 3PT_JUMP_SHOT, HOOK_SHOT, etc.)

This notebook builds on top of the following notebooks: [Finding Games](notebook2.ipynb), [Basics Notebook](Basics.ipynb), and of course, dives into `PlayByPlay` endpoint. Note that the `PlayByPlayV2` endpoint is an extension of `PlayByPlay`.


So with that...let's get started!

The goals are
1. Get the last game the Pacers played (maybe we'll get lucky and get a current game)
2. Examine the feed and the fields that are returned
3. See how Regex can be applied to the play by play
3. Dynamically build and Enum on EVENTMSGTYPE & EVENTMSGACTIONTYPE
4. See what's hiding in the feed...need to get those BLOCKS from Myles Turner

# Let's get started and jump into the game!

First thing's first...get the Pacers team_id

In [2]:
#Get the Pacers team_id
from nba_api.stats.static import teams

nba_teams = teams.get_teams()

# Select the dictionary for the Pacers, which contains their team ID
pacers = [team for team in nba_teams if team['abbreviation'] == 'IND'][0]
pacers_id = pacers['id']
print(f'pacers_id: {pacers_id}')

pacers_id: 1610612754


Searching through the games and get the most recent Pacers game_id

In [4]:
# Query for the last regular season game where the Pacers were playing
from nba_api.stats.endpoints import leaguegamefinder
from nba_api.stats.library.parameters import Season
from nba_api.stats.library.parameters import SeasonType

gamefinder = leaguegamefinder.LeagueGameFinder(team_id_nullable=pacers_id,
                            season_nullable=Season.default,
                            season_type_nullable=SeasonType.regular)  

games_dict = gamefinder.get_normalized_dict()
games = games_dict['LeagueGameFinderResults']
game = games[0]
game_id = game['GAME_ID']
game_matchup = game['MATCHUP']

print(f'Searching through {len(games)} game(s) for the game_id of {game_id} where {game_matchup}')

Searching through 50 game(s) for the game_id of 0021800681 where IND vs. DAL


# Retrieving the play by play data
Now that we've got a game_id, let's pull some play by play data

In [5]:
# Query for the play by play of that most recent regular season game
from nba_api.stats.endpoints import playbyplay
df = playbyplay.PlayByPlay(game_id).get_data_frames()[0]
df.head() #just looking at the head of the data

Unnamed: 0,GAME_ID,EVENTNUM,EVENTMSGTYPE,EVENTMSGACTIONTYPE,PERIOD,WCTIMESTRING,PCTIMESTRING,HOMEDESCRIPTION,NEUTRALDESCRIPTION,VISITORDESCRIPTION,SCORE,SCOREMARGIN
0,21800681,2,12,0,1,7:11 PM,12:00,,,,,
1,21800681,4,10,0,1,7:11 PM,12:00,Jump Ball Turner vs. Jordan: Tip to Doncic,,,,
2,21800681,7,6,1,1,7:11 PM,11:49,Turner P.FOUL (P1.T1) (B.Nansel),,,,
3,21800681,9,5,1,1,7:12 PM,11:37,Turner STEAL (1 STL),,Matthews Bad Pass Turnover (P1.T1),,
4,21800681,11,6,2,1,7:12 PM,11:30,,,Jordan S.FOUL (P1.T1) (C.Blair),,


Optional: Dataframes can become large. In pandas you can set some options to make it more visible if needed

In [39]:
#Since the datset is fairly large you'll see plenty of elipses(...). 
#If that's the case, you can set the following options to expand the data 
#You can adjust these as you'd like
import pandas
pandas.set_option('display.max_colwidth',250)
pandas.set_option('display.max_rows',250)

Some of the most valuable fields of `PlayByPlay`are the following:
`EVENTMSGTYPE`
`EVENTMSGACTIONTYPE`
`HOMEDESCRIPTION`
and `VISITORDESCRIPTION`.

`EVENTMSGTYPE` gives us the type of event that has occurred. This can vary per game. This is why finding these and placing them into an Enum or other type structure is a good idea.

In [33]:
#List unique values in the df['EVENTMSGTYPE'] colum
print(f'EVENTMSGTYPE: {sorted(df.EVENTMSGTYPE.unique())}')

EVENTMSGTYPE: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 18]


In [76]:
#For quick refernce, here's an Enum for `EVENTMSGTYPE`
#This list may be incomplete as a thourogh play by play scan is necessary

from enum import Enum

class EventMsgType(Enum):
    FIELD_GOAL_MADE = 1
    FIELD_GOAL_MISSED = 2
    FREE_THROWfree_throw_attempt = 3
    REBOUND = 4
    TURNOVER = 5
    FOUL = 6
    VIOLATION = 7
    SUBSTITUTION = 8
    TIMEOUT = 9
    JUMP_BALL = 10
    EJECTION = 11
    PERIOD_BEGIN = 12
    PERIOD_END = 13

Using the `EVENTMSGTYPE` field we can begin to examine the event types to see what typical values will be in the `EVENTMSGACTIONTYPE` `HOMEDESCRIPTION` and `VISITORDESCRIPTION` fields.

In [59]:
#pull the data for a specfic EVENTMSGTYPE
df.loc[df['EVENTMSGTYPE'] == 2].head() #hint: use the EVENTMSGTYPE values above to see different data

Unnamed: 0,GAME_ID,EVENTNUM,EVENTMSGTYPE,EVENTMSGACTIONTYPE,PERIOD,WCTIMESTRING,PCTIMESTRING,HOMEDESCRIPTION,NEUTRALDESCRIPTION,VISITORDESCRIPTION,SCORE,SCOREMARGIN
7,21800681,15,2,80,1,7:13 PM,11:13,,,MISS Doncic 30' 3PT Step Back Jump Shot,,
10,21800681,18,2,1,1,7:13 PM,10:51,MISS Collison 24' 3PT Jump Shot,,,,
12,21800681,20,2,6,1,7:14 PM,10:41,Collison BLOCK (1 BLK),,MISS Doncic 3' Driving Layup,,
14,21800681,23,2,6,1,7:14 PM,10:37,MISS Bogdanovic 2' Driving Layup,,Kleber BLOCK (1 BLK),,
20,21800681,35,2,5,1,7:15 PM,9:30,MISS Oladipo 4' Layup,,Kleber BLOCK (2 BLK),,


Now that we've seen what the output of `EVENTMSGTYPE` is, let's dig into `EVENTMSGACTIONTYPE`.

For this next exercise, let's pull all unique `EVENTMSGACTIONTYPE` values for `EVENTMSGTYPE = 1`

_Note: `EVENTMSGACTIONTYPE` ids have a very loose correlation to `EVENTMSGTYPE` ids. This means that `EVENTMSGTYPE` ids share some of the same `EVENTMSGACTIONTYPE` ids. This allows the NBA to have a 'Missed Field Goal' share the same '3PT Jump Shot' with a 'Made Field Goal'_

In [49]:
#List unique values in the df['EVENTMSGTYPE'] column
emt_df = df.loc[df['EVENTMSGTYPE'] == 1]
print(f'EVENTMSGACTIONTYPE: {sorted(emt_df.EVENTMSGACTIONTYPE.unique())}')

EVENTMSGACTIONTYPE: [1, 3, 5, 6, 7, 9, 43, 47, 50, 52, 63, 66, 72, 75, 76, 78, 79, 80, 86, 87, 97, 98, 99, 101, 107, 108]


# So how do we know what each `EVENTMSGACTIONTYPE` is?

Let the fun begin.

Apply some regular expressions, that are `EVENTMSGTYPE` specific, against `HOMEDECSRIPTION` and `VISITORDESCRIPTION` while keeping track of the `EVENTMSGACTIONTYPE`. 

To see the regular expressions in action, take the example listed in the comments, along with the regex, and head on over to https://regex101.com/ or your favorite regex interative tool.

# `EVENTMSGTYPE == 1`
The following regex expression `'\s+([\w+ ]*)\s` is specific to `EVENTMSGTYPE == 1`. It'll look for the type of basket within the `VISITORDESCRIPTION` or `HOMEDESCRIPTION` and tie that to the `EVENTMSGACTIONTYPE`.

Example: Given a `VISITORDESCRIPTION == 'Young Cutting Layup Shot (2 PTS) (Collison 1 AST)'` and a `EVENTMSGACTIONTYPE = 98`, the code will produce an output of `98 = CUTTING_LAYUP_SHOT`

Let's see it in action...

In [24]:
#Mapping out all of the EventMsgActionTypes for EventMsgType 1
import re
import operator

#the following expression is specific to EventMsgType 1
p = re.compile('\s+([\w+ ]*)\s')

#get the PlayByPlay data from the Pacers game_id
plays = playbyplay.PlayByPlay(game_id).get_normalized_dict()['PlayByPlay']

#declare a few variables
description = ''
event_msg_action_types = {}

#loop over the play by play data
for play in plays:
    if play['EVENTMSGTYPE'] == 1:
        description = play['HOMEDESCRIPTION'] if not None else play['VISITORDESCRIPTION']
        if description is not None:
            #do a bit of searching(regex) and a little character magic: underscores and upper case
            event_msg_action = re.sub(' ', '_', p.search(description).groups()[0]).upper()
            #Add it to our dictionary
            event_msg_action_types[play['EVENTMSGACTIONTYPE']] = event_msg_action

#sort it all
event_msg_action_types = sorted(event_msg_action_types.items(), key=operator.itemgetter(0))

#output a class that we could plug into our code base
print('from enum import Enum\n')
print('class EventMsgActionType(Enum):')
for action in event_msg_action_types:
    print(f'\t{action[1]} = {action[0]}')

from enum import Enum

class EventMsgActionType(Enum):
	JUMP_SHOT = 1
	HOOK_SHOT = 3
	LAYUP = 5
	DRIVING_LAYUP = 6
	DUNK = 7
	DRIVING_DUNK = 9
	ALLEY_OOP_LAYUP = 43
	RUNNING_DUNK = 50
	FADEAWAY_JUMPER = 63
	JUMP_BANK_SHOT = 66
	PUTBACK_LAYUP = 72
	DRIVING_FINGER_ROLL_LAYUP = 75
	FLOATING_JUMP_SHOT = 78
	PULLUP_JUMP_SHOT = 79
	STEP_BACK_JUMP_SHOT = 80
	TIP_LAYUP_SHOT = 97
	CUTTING_LAYUP_SHOT = 98
	CUTTING_FINGER_ROLL_LAYUP_SHOT = 99
	DRIVING_FLOATING_JUMP_SHOT = 101
	CUTTING_DUNK_SHOT = 108


# `EVENTMSGTYPE == 2`
The following regex expression `((?:MISS \S* \d*')|(?:MISS \S*))\s*([\w+ ]*)` is specific to `EVENTMSGTYPE == 2`. EventMsgType 2 are missed field goals. Again, it'll look for the type of basket within the `VISITORDESCRIPTION` or `HOMEDESCRIPTION` and tie that to the `EVENTMSGACTIONTYPE`.

Example: Given a `HOMEDESCRIPTION == 'MISS Collison 24' 3PT Jump Shot'` and a `EVENTMSGACTIONTYPE = 1`, the code will produce an output of `1 = 3PT_JUMP_SHOT`

Let's see it in action...

In [22]:
#Mapping out all of the EventMsgActionTypes for EventMsgType 2
import re
import operator

#the following expression is specific to EventMsgType 1
p = re.compile('((?:MISS \S* \d*\')|(?:MISS \S*))\s*([\w+ ]*)')

#get the PlayByPlay data from the Pacers game_id
plays = playbyplay.PlayByPlay(game_id).get_normalized_dict()['PlayByPlay']

#declare a few variables
description = ''
event_msg_action_types = {}

#loop over the play by play data
#do a bit of findall(regex) and a little character magic: underscores and upper case
#we're using a findall here as we have to deal with the extra word MISS at the beginning of the text.
#that extra text means we'll have multiple matches for our regex.
for play in plays:
    if play['EVENTMSGTYPE'] == 2:
        match = list()
        if play['HOMEDESCRIPTION'] is not None: 
            match = p.findall(play['HOMEDESCRIPTION'])
        
        if not match:
            match = p.findall(play['VISITORDESCRIPTION'])

        event_msg_action = re.sub(' ', '_', match[0][1]).upper()
        event_msg_action_types[play['EVENTMSGACTIONTYPE']] = event_msg_action
            
event_msg_action_types = sorted(event_msg_action_types.items(), key=operator.itemgetter(0))

print('from enum import Enum\n')
print('class EventmsgActionType(Enum):')
for action in event_msg_action_types:
    print(f'\t{action[1]} = {action[0]}')

from enum import Enum

class EventmsgActionType(Enum):
	3PT_JUMP_SHOT = 1
	LAYUP = 5
	DRIVING_LAYUP = 6
	DUNK = 7
	RUNNING_LAYUP = 41
	TURNAROUND_JUMP_SHOT = 47
	DRIVING_HOOK_SHOT = 57
	TURNAROUND_HOOK_SHOT = 58
	FADEAWAY_JUMPER = 63
	PUTBACK_LAYUP = 72
	DRIVING_FINGER_ROLL_LAYUP = 75
	FLOATING_JUMP_SHOT = 78
	3PT_PULLUP_JUMP_SHOT = 79
	STEP_BACK_JUMP_SHOT = 80
	TIP_LAYUP_SHOT = 97
	CUTTING_LAYUP_SHOT = 98
	DRIVING_FLOATING_JUMP_SHOT = 101
	DRIVING_FLOATING_BANK_JUMP_SHOT = 102
	3PT_STEP_BACK_BANK_JUMP_SHOT = 104


# `EVENTMSGTYPE == 1 & EVENTMSGTYPE == 2`
So now we're simply putting it all together to build a unique list of `EVENTMSGACTIONTYPE` ids as we inspect each play by play

Let's see it in action...

In [21]:
#Mapping out all of the EventMsgActionTypes for EventMsgType 1 & 2
import re
import operator

re_made = re.compile('\s+([\w+ ]*)\s')
re_missed = re.compile('((?:MISS \S* \d*\')|(?:MISS \S*))\s*([\w+ ]*)')

plays = playbyplay.PlayByPlay(game_id).get_normalized_dict()['PlayByPlay']
description = ''
event_msg_action_types = {}

for play in plays:
    if play['EVENTMSGTYPE'] == 1:
        description = play['HOMEDESCRIPTION'] if not None else play['VISITORDESCRIPTION']
        if description is not None:
            event_msg_action_types[play['EVENTMSGACTIONTYPE']] = \
                re.sub(' ', '_',  re_made.search(description).groups()[0]).upper()


    if play['EVENTMSGTYPE'] == 2:
        match = list()
        if play['HOMEDESCRIPTION'] is not None: 
            match = re_missed.findall(play['HOMEDESCRIPTION'])
        
        if not match:
            match = re_missed.findall(play['VISITORDESCRIPTION'])

        event_msg_action_types[play['EVENTMSGACTIONTYPE']] = re.sub(' ', '_', match[0][1]).upper()

event_msg_action_types = sorted(event_msg_action_types.items(), key=operator.itemgetter(0))

#Mapping out all of the EventMsgActionTypes for EventMsgType 1 & 2
import re
import operator

re_made = re.compile('\s+([\w+ ]*)\s')
re_missed = re.compile('((?:MISS \S* \d*\')|(?:MISS \S*))\s*([\w+ ]*)')

plays = playbyplay.PlayByPlay(game_id).get_normalized_dict()['PlayByPlay']
description = ''
event_msg_action_types = {}

for play in plays:
    if play['EVENTMSGTYPE'] == 1:
        description = play['HOMEDESCRIPTION'] if not None else play['VISITORDESCRIPTION']
        if description is not None:
            event_msg_action_types[play['EVENTMSGACTIONTYPE']] = \
                re.sub(' ', '_',  re_made.search(description).groups()[0]).upper()


    if play['EVENTMSGTYPE'] == 2:
        match = list()
        if play['HOMEDESCRIPTION'] is not None: 
            match = re_missed.findall(play['HOMEDESCRIPTION'])
        
        if not match:
            match = re_missed.findall(play['VISITORDESCRIPTION'])

        event_msg_action_types[play['EVENTMSGACTIONTYPE']] = re.sub(' ', '_', match[0][1]).upper()

event_msg_action_types = sorted(event_msg_action_types.items(), key=operator.itemgetter(0))

print('from enum import Enum\n')
print('class EventMsgActionType(Enum):')
for action in event_msg_action_types:
    print(f'\t{action[1]} = {action[0]}')

from enum import Enum

class EventMsgActionType(Enum):
	3PT_JUMP_SHOT = 1
	HOOK_SHOT = 3
	LAYUP = 5
	DRIVING_LAYUP = 6
	DUNK = 7
	DRIVING_DUNK = 9
	RUNNING_LAYUP = 41
	ALLEY_OOP_LAYUP = 43
	TURNAROUND_JUMP_SHOT = 47
	RUNNING_DUNK = 50
	DRIVING_HOOK_SHOT = 57
	TURNAROUND_HOOK_SHOT = 58
	FADEAWAY_JUMPER = 63
	JUMP_BANK_SHOT = 66
	PUTBACK_LAYUP = 72
	DRIVING_FINGER_ROLL_LAYUP = 75
	FLOATING_JUMP_SHOT = 78
	3PT_PULLUP_JUMP_SHOT = 79
	STEP_BACK_JUMP_SHOT = 80
	TIP_LAYUP_SHOT = 97
	CUTTING_LAYUP_SHOT = 98
	CUTTING_FINGER_ROLL_LAYUP_SHOT = 99
	DRIVING_FLOATING_JUMP_SHOT = 101
	DRIVING_FLOATING_BANK_JUMP_SHOT = 102
	3PT_STEP_BACK_BANK_JUMP_SHOT = 104
	CUTTING_DUNK_SHOT = 108


# What About Blocks?
So if you've taken a close look at the data, especially that where `EVENTMSGTYPE == 2` you may have noticed that a few of the missed field goals were due to some incredible shot blocking players. By adding a few lines of code, we can find these shot blockers. Dealing with this data is a bit beyond the scope of this notebook, but it's worth pointing out that the data is in there. One idea is to play it into it's own play by play block (just a thought).

In [20]:
#Blocks are not included in the event feed but are a part of the EVENTMSGTYPE 2
import re
import operator

print('------------------')

#the following expression is specific to EventMsgType 1
p = re.compile('((?:MISS \S* \d*\')|(?:MISS \S*))\s*([\w+ ]*)')

#get the PlayByPlay data from the Pacers game_id
plays = playbyplay.PlayByPlay(game_id).get_normalized_dict()['PlayByPlay']

#declare a few variables
description = ''
event_msg_action_types = {}

#loop over the play by play data
#do a bit of findall(regex) and a little character magic: underscores and upper case
#we're using a findall here as we have to deal with the extra word MISS at the beginning of the text.
#that extra text means we'll have multiple matches for our regex.
for play in plays:
    if play['EVENTMSGTYPE'] == 2:
        match = list()
        if play['HOMEDESCRIPTION'] is not None: 
            match = p.findall(play['HOMEDESCRIPTION'])

            #looking for blocks
            if len(match) & (play['VISITORDESCRIPTION'] is not None):
                print(play['VISITORDESCRIPTION'])

        if not match:
            match = p.findall(play['VISITORDESCRIPTION'])
            
            #looking for blocks
            if len(match) & (play['HOMEDESCRIPTION'] is not None):
                print(play['HOMEDESCRIPTION'])


        event_msg_action = re.sub(' ', '_', match[0][1]).upper()
        event_msg_action_types[play['EVENTMSGACTIONTYPE']] = event_msg_action
            
event_msg_action_types = sorted(event_msg_action_types.items(), key=operator.itemgetter(0))

print('------------------')
print('\nfrom enum import Enum\n')
print('class EventmsgActionType(Enum):')
for action in event_msg_action_types:
    print(f'\t{action[1]} = {action[0]}')

------------------
Collison BLOCK (1 BLK)
Kleber BLOCK (1 BLK)
Kleber BLOCK (2 BLK)
Turner BLOCK (1 BLK)
Turner BLOCK (2 BLK)
Turner BLOCK (3 BLK)
Kleber BLOCK (3 BLK)
Nowitzki BLOCK (1 BLK)
------------------

from enum import Enum

class EventmsgActionType(Enum):
	3PT_JUMP_SHOT = 1
	LAYUP = 5
	DRIVING_LAYUP = 6
	DUNK = 7
	RUNNING_LAYUP = 41
	TURNAROUND_JUMP_SHOT = 47
	DRIVING_HOOK_SHOT = 57
	TURNAROUND_HOOK_SHOT = 58
	FADEAWAY_JUMPER = 63
	PUTBACK_LAYUP = 72
	DRIVING_FINGER_ROLL_LAYUP = 75
	FLOATING_JUMP_SHOT = 78
	3PT_PULLUP_JUMP_SHOT = 79
	STEP_BACK_JUMP_SHOT = 80
	TIP_LAYUP_SHOT = 97
	CUTTING_LAYUP_SHOT = 98
	DRIVING_FLOATING_JUMP_SHOT = 101
	DRIVING_FLOATING_BANK_JUMP_SHOT = 102
	3PT_STEP_BACK_BANK_JUMP_SHOT = 104


# Scratch Pad

In [34]:
#enum usage example
from enum import Enum

class EventMsgActionType(Enum):
	JUMP_SHOT = 1
	HOOK_SHOT = 3
	LAYUP = 5
	DRIVING_LAYUP = 6
	DUNK = 7
	DRIVING_DUNK = 9
	ALLEY_OOP_LAYUP = 43
	RUNNING_DUNK = 50
	FADEAWAY_JUMPER = 63
	JUMP_BANK_SHOT = 66
	PUTBACK_LAYUP = 72
	DRIVING_FINGER_ROLL_LAYUP = 75
	FLOATING_JUMP_SHOT = 78
	PULLUP_JUMP_SHOT = 79
	STEP_BACK_JUMP_SHOT = 80
	TIP_LAYUP_SHOT = 97
	CUTTING_LAYUP_SHOT = 98
	CUTTING_FINGER_ROLL_LAYUP_SHOT = 99
	DRIVING_FLOATING_JUMP_SHOT = 101
	CUTTING_DUNK_SHOT = 108
 
print(f'Object: {repr(EventMsgActionType.JUMP_BANK_SHOT)}')
print(f'Enum: {EventMsgActionType(98)}')
print(f'Value: {EventMsgActionType.JUMP_BANK_SHOT.value}')


Object: <EventMsgActionType.JUMP_BANK_SHOT: 66>
Enum: EventMsgActionType.CUTTING_LAYUP_SHOT
Value: 66
