# The Brotherly Shove

The Philadelphia Eagles of the National Football League have redefined the Quarterback-Sneak (BQ-Sneak) over the last few seasons. A QB-Sneak is an offensive play where the quarterback, upon receiving the snap, immediately dives forward. The play is designed for short-yardage gains and is stoppable by a strong and aware defensive line.

However, due to a change to the official NFL rules, players with possession of the ball are allowed to be *pushed forward* by their teammates. This change in the rules allows teams to stack players behind the QB and *push* them forward on a QB-Sneak. The Philadelphia Eagles have proven their ability to stack their players around and behind quarterback Jalen Hurts and achieve massive success whenever they need to gain only 1 or 2 yards (and sometimes more!).

Let's take a look at what the Brotherly Shove looks like:

In [2]:
# we can use panel to help embed HTML into our Jupyter notebooks, the video itself just needs to be embeddable!
import panel as pn
pn.extension()

# create an HTML pane using the embed code provided by YouTube
video_embed = pn.pane.HTML('''<iframe width="640" height="370" src="https://www.youtube.com/embed/ATMfvNIQfbk?si=PYPifUkTqBF-x9SF" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>''')
pn.Row(video_embed)

BokehModel(combine_events=True, render_bundle={'docs_json': {'eb9c8a55-21f4-410a-a7cb-83be61ff9211': {'version…

The surge of players spilling into a pile is the flagship mark of a successful Brotherly Shove. Want we want to do is find data on the Brotherly Shove. The website [`Sportradar`](https://sportradar.com/) provides a deeply comprehensive data API for many professional sports leagues, and luckily the NFL is among them!

We want to use this API to retrieve play-by-play data to better understand the Brotherly-Shove and the Philadelphia Eagles success with it. First we need to find all of the games that the Eagles have played in. Sportsradar provides the an endpoint *Current Season Schedule*. We can see some pre-programmed usage of it [*here*](nfl/data.py).

Data API's are usually straight-forward, but the data itself is complicated. The Sportsradar API and data is no different. When we ask for all of the games in a season we get a *ton* of data. It has a format roughly like the following (there are official schemas, but they are difficult to read):

```text
season:
  id: ...
  ...
weeks: [
    id: ...
    games: [
        id: ...
        ...
    ],
    ...
]
```

The ellipses here represent many, many fields. Every section of the data contains a unique ID; these unique IDs can be used to help navigate and partition the data. There is an absolute ton of data recorded for every game. Let's take a look at one game.

1. import the `nfl` module
2. create a variable `season_schedule` and assign the value from calling `nfl.get_season_schedule()`
3. print the first game from the first week

In [16]:
import nfl
season_schedule = nfl.get_season_schedule()
season_schedule['weeks'][0]['games'][0]

{'id': '88b69b59-d7b9-4572-84d8-6ec00f9af626',
 'status': 'closed',
 'scheduled': '2023-09-08T00:20:00+00:00',
 'attendance': 73522,
 'entry_mode': 'LDE',
 'weather': ' Temp: 83 F, Humidity: 36%, Wind: E 6 mph',
 'sr_id': 'sr:match:41205251',
 'venue': {'id': '2ec4c411-dac2-403d-b091-6b6aa4a0a914',
  'name': 'GEHA Field at Arrowhead Stadium',
  'city': 'Kansas City',
  'state': 'MO',
  'country': 'USA',
  'zip': '64129',
  'address': 'One Arrowhead Drive',
  'capacity': 76416,
  'surface': 'turf',
  'roof_type': 'outdoor',
  'sr_id': 'sr:venue:8189'},
 'home': {'id': '6680d28d-d4d2-49f6-aace-5292d3ec02c2',
  'name': 'Kansas City Chiefs',
  'alias': 'KC',
  'game_number': 1,
  'sr_id': 'sr:competitor:4422'},
 'away': {'id': 'c5a59daa-53a7-4de0-851f-fb12be893e9e',
  'name': 'Detroit Lions',
  'alias': 'DET',
  'game_number': 1,
  'sr_id': 'sr:competitor:4419'},
 'broadcast': {'network': 'NBC'},
 'scoring': {'home_points': 20,
  'away_points': 21,
  'periods': [{'period_type': 'quarter',


We can even see the number of attendees for the game, among a ton of other facets of data. Your first task is to find the IDs of every game played/to-be-played by the Philadelphia Eagles.

1. Use a nested loop to iterate over the `"weeks"` field of `season_schedule` and the `"games"` field of every week 

Store the games in a list named `games`. (Note, you can do this with a single list comprehension if you wish!)

In [24]:
games = []
for week in season_schedule['weeks']:
    for game in week['games']:
        if game['away']['alias'] == 'PHI' or game['home']['alias'] == 'PHI' :
            if game['status'] == 'closed':
                games.append(game)
games = games[:-1]


In [25]:
games[-1]

{'id': '836211c3-a784-44d9-a704-3fc0c7364855',
 'status': 'closed',
 'scheduled': '2023-11-26T21:25:00+00:00',
 'attendance': 69879,
 'entry_mode': 'LDE',
 'weather': 'Rain Temp: 47 F, Humidity: 88%, Wind: NE 6 mph',
 'sr_id': 'sr:match:41209503',
 'venue': {'id': '4fa8c29c-6626-464c-8540-314ed7535e1b',
  'name': 'Lincoln Financial Field',
  'city': 'Philadelphia',
  'state': 'PA',
  'country': 'USA',
  'zip': '19148',
  'address': '1020 Pattison Avenue',
  'capacity': 69596,
  'surface': 'turf',
  'roof_type': 'outdoor',
  'sr_id': 'sr:venue:1833'},
 'home': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
  'name': 'Philadelphia Eagles',
  'alias': 'PHI',
  'game_number': 11,
  'sr_id': 'sr:competitor:4428'},
 'away': {'id': '768c92aa-75ff-4a43-bcc0-f2798c2e1724',
  'name': 'Buffalo Bills',
  'alias': 'BUF',
  'game_number': 12,
  'sr_id': 'sr:competitor:4376'},
 'broadcast': {'network': 'CBS'},
 'scoring': {'home_points': 37,
  'away_points': 34,
  'periods': [{'period_type': 'quarter

In [26]:
len(games)

11

Once you have that working, let's filter it down to games where one of the competitors is the `Philadelphia Eagles` and to games that have been played (the season schedule includes games in the future!). This is most easily done by first converting `games` to a `pandas.DataFrame` and operating on the columns via masks. Note: The columns `home` and `away` are columns of dictionaries, and so the easiest way for us to check them is to apply lambda functions over them.

1. create a dataframe `games_df` from our `games` list
2. create a mask `played_games_mask` that checks if the `status` is `"closed"`
3. create a mask `eagles_home` by applying a lambda to the `home` column; the lambda should take an argument `"team"` and checks if the team's `"alias"` field is equal to `"PHI"`
4. create a mask `eagles_away` by applying a lambda to the `away` column; the lambda should take an argument `"team"` and checks if the team's `"alias"` field is equal to `"PHI"`
5. apply the masks such that we are using the pseudo-logic `played_games_mask and (eagles_home or eagles_away)` (do not copy this, this is not valid) and store the resulting dataframe as `played_eagles_games_df`

In [27]:
import pandas as pd
games_df = pd.DataFrame(games)
played_games_mask = games_df # already filtered in previous cell
played_games_mask

Unnamed: 0,id,status,scheduled,attendance,entry_mode,weather,sr_id,venue,home,away,broadcast,scoring
0,7fa08811-7cf6-4f09-a3b6-f35e91ab055e,closed,2023-09-10T20:25:00+00:00,64628,LDE,"Rain Temp: 71 F, Humidity: 96%, Wind: SE 5 mph",sr:match:41209159,"{'id': 'e43310b1-cb82-4df9-8be5-e9b39637031b',...","{'id': '97354895-8c77-4fd4-a860-32e62ea7382a',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'CBS'},"{'home_points': 20, 'away_points': 25, 'period..."
1,c1052576-390a-49b8-9101-88e7d802b464,closed,2023-09-15T00:15:00+00:00,69879,LDE,"Clear Temp: 72 F, Humidity: 43%, Wind: N 8 mph",sr:match:41209165,"{'id': '4fa8c29c-6626-464c-8540-314ed7535e1b',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...","{'id': '33405046-04ee-4058-a950-d606f8c30852',...",{'network': 'Amazon Prime Video'},"{'home_points': 34, 'away_points': 28, 'period..."
2,acf1f2a6-1e48-4971-aa52-1a32e322d63f,closed,2023-09-25T23:15:00+00:00,65426,LDE,"Cloudy Temp: 90 F, Humidity: 60%, Wind: SSW 4 mph",sr:match:41209225,"{'id': '6fccc39c-80bc-4c81-83d9-2d5a848c8c09',...","{'id': '4254d319-1bc7-4f81-b4ab-b5e6f3402b69',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'ABC'},"{'home_points': 11, 'away_points': 25, 'period..."
3,667e5a52-a24a-4f03-8a9b-2bf65d9c80ab,closed,2023-10-01T17:00:00+00:00,69879,LDE,"Temp: 76 F, Humidity: 54%, Wind: NNE 10 mph",sr:match:41209245,"{'id': '4fa8c29c-6626-464c-8540-314ed7535e1b',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...","{'id': '22052ff7-c065-42ee-bc8f-c4691c50e624',...",{'network': 'FOX'},"{'home_points': 34, 'away_points': 31, 'period..."
4,0e853aec-266e-48c8-bbd5-290385f341e1,closed,2023-10-08T20:05:00+00:00,74935,LDE,"Sunny Temp: 71 F, Humidity: 59%, Wind: SW 3 mph",sr:match:41209275,"{'id': '790c1f04-73c6-4f6f-8b1e-78a62260be90',...","{'id': '2eff2a03-54d4-46ba-890e-2bc3925548f3',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'FOX'},"{'home_points': 14, 'away_points': 23, 'period..."
5,6866131a-c15d-4b3c-9f15-aeff08d50e23,closed,2023-10-15T20:25:00+00:00,83068,LDE,"Partly Cloudy Temp: 59 F, Humidity: 54%, Wind:...",sr:match:41209337,"{'id': '5d4c85c7-d84e-4e10-bd6a-8a15ebecca5c',...","{'id': '5fee86ae-74ab-4bdd-8416-42a9dd9964f3',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'FOX'},"{'home_points': 20, 'away_points': 14, 'period..."
6,7b2da75e-2f9e-4a6a-ba4c-ba6d98ac32d2,closed,2023-10-23T00:20:00+00:00,69879,LDE,"Partly Cloudy Temp: 55 F, Humidity: 44%, Wind:...",sr:match:41209365,"{'id': '4fa8c29c-6626-464c-8540-314ed7535e1b',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...","{'id': '4809ecb0-abd3-451d-9c4a-92a90b83ca06',...",{'network': 'NBC'},"{'home_points': 31, 'away_points': 17, 'period..."
7,08bc19a0-57a0-4384-ad54-e1841617eb52,closed,2023-10-29T17:00:00+00:00,64653,LDE,"Cloudy Temp: 69 F, Humidity: 77%, Wind: NNE 4 mph",sr:match:41209387,"{'id': '7c11bb2d-4a53-4842-b842-0f1c63ed78e9',...","{'id': '22052ff7-c065-42ee-bc8f-c4691c50e624',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'FOX'},"{'home_points': 31, 'away_points': 38, 'period..."
8,7e44751b-9d3d-4722-aa34-959d722a7eb4,closed,2023-11-05T21:25:00+00:00,69879,LDE,"Clear Temp: 64 F, Humidity: 41%, Wind: NNW 6 mph",sr:match:41209421,"{'id': '4fa8c29c-6626-464c-8540-314ed7535e1b',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...","{'id': 'e627eec7-bbae-4fa4-8e73-8e1d6bc5c060',...",{'network': 'FOX'},"{'home_points': 28, 'away_points': 23, 'period..."
9,5eedbc61-0130-459e-a640-efcc30708407,closed,2023-11-21T01:15:00+00:00,73754,LDE,"Light Rain Temp: 45 F, Humidity: 90%, Wind: N ...",sr:match:41206611,"{'id': '2ec4c411-dac2-403d-b091-6b6aa4a0a914',...","{'id': '6680d28d-d4d2-49f6-aace-5292d3ec02c2',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",{'network': 'ABC/ESPN'},"{'home_points': 17, 'away_points': 21, 'period..."


Notably from the above dataframe we can access all of the game IDs. Remember that these game IDs are specific to Sportsradar, and so we can use them in other parts of the data API for retrieving more detailed game data. If done correctly you should see the IDs listed below.

In [41]:
played_games_mask.id

0     7fa08811-7cf6-4f09-a3b6-f35e91ab055e
1     c1052576-390a-49b8-9101-88e7d802b464
2     acf1f2a6-1e48-4971-aa52-1a32e322d63f
3     667e5a52-a24a-4f03-8a9b-2bf65d9c80ab
4     0e853aec-266e-48c8-bbd5-290385f341e1
5     6866131a-c15d-4b3c-9f15-aeff08d50e23
6     7b2da75e-2f9e-4a6a-ba4c-ba6d98ac32d2
7     08bc19a0-57a0-4384-ad54-e1841617eb52
8     7e44751b-9d3d-4722-aa34-959d722a7eb4
9     5eedbc61-0130-459e-a640-efcc30708407
10    836211c3-a784-44d9-a704-3fc0c7364855
Name: id, dtype: object

Now that we have identified the games that the `Philadelphia Eagles` have played this season (minus 12/3 game versus the `San Fransisco 49ers`), we can proceed with identifying all of the plays *likely* to have been a Brotherly Shove.

Note the use of word *likely* here. Without watching all of the NFL film from this season, we can only make best guesses as to which plays were shoves. The data is incredibly detailed, which allows us to intelligently filter through it to find shove plays. To begin, we need to write a function that will in a game ID, and use it with the function `get_play_by_play` from our `nfl` module. Once we have this function, we can apply it to the `id` column of `played_eagles_games_df`.

Your function will need to be a fairly involved, as when we get a play-by-play for a game, the resulting structure is complex. When we get a play-by-play using something like `nfl.get_play_by_play(game_id)`, we will get a large JSON structure back. This structure will include a list of periods (quarters), each containing all of the plays in those quarters. Every period will have a `pbp` structure that contains all of the game's *drives*, which includes a list of *events. We will need to drill down to the events level and select only the events that are actual plays and are also rushing plays. *Your function will need multiple nested for-loops and multiple conditionals.*

More specifically we will need to select all of the rushing plays by Jalen Hurts from a starting position under center when there are less than 4 yards to go. While this criteria is not perfect, it should give us a fairly accurate sample of plays that are Brotherly Shoves. There are a number of gotchas here though that need to be maneuvered around.

* When we iterate over the *pbp* of a given *drive*, we need to check that the type of drive is actually a `drive` and not some other event (like a punt, extra point, etc.). If type of the drive is not equal to `drive` then we should skip it.
* When iterating over the plays of a drive, we need to enture that the actual play type was a rushing play. This means that the play event should have a `play_type` key, which is absent whenever any other event transpires (timeouts, 2-minute warning, end-of-quarter, etc.).
* As we iterate over the plays, some of the data may be missing specific components. This will happen when certain gameplay events take precedence over how the data is recorded. Namely, bad snaps to the quarterback, fumbles, etc. can results in missing data. Therefore when we are looking at a specific play we need to make sure that the play has both `start_situation` and `qb_at_snap` fields. We need the former to find the yard-to-down metric, and we need the `qb_at_snap` to help identify the Brotherly Shove formation.
* To ensure that we can trace plays back to their game, we need to artificially add the game_id to the plays that get extracted.

So all in all, your function needs to implement the following pseudocode:

1. get the play-by-play
2. create an empty list named `shoves` that will hold all of the shove plays
3. iterate over the play-by-play's `"period"` field (these are periods) **[LOOP #1 START]**
4. iterate over the period's `"pbp"` field (these are drives) **[LOOP #2 START]**
5. if the `"type"` field of the drive is not equal to "drive" then skip using `continue` **[IF #1 START/END]**
6. iterate over the drive's `"event"` field (these are plays!) **[LOOP #3 START]**
7. if there is a `"play_type"` field and it is equal to `"rush"` AND the fields `"start_situation"`, `"qb_at_snap"`, and `"description"` all exist, then: **[IF #2 START]**
8. get the `"yfd"` field (yards-for-down)
9. get the `"qb_at_snap"` field (location of the quarterback at the beginning of the play)
10. get the `"description"` field (brief description of the play)
11. if the `yfd` are less than 4, the `qb_at_snap` is equal to `"Under Center"`, and the `description` starts with `"J.Hurts rushed"`, then: **[IF #3 START]**
12. add the `game_id` to the play
13. append the play to the `shoves` list **[IF #3 END]** **[IF #2 END]** **[LOOP #3 END]** **[LOOP #2 END]** **[LOOP #1 END]**
14. return `shoves` list

Once we have the function ready, we will define the function `game_shoves` by applying our function to `played_eagles_games_df.id`. This gives us a list-of-lists that we need to flatten.

In [51]:
def game_shoves(game_id):
    play_by_play = nfl.get_play_by_play(game_id)
    shoves = []
    for period in play_by_play['periods']:
        for drive in period['pbp']:
            if drive['type'] != 'drive':
                continue
            for play in drive['events']:
                if 'play_type' in play and play['play_type'] == 'rush':
                    if 'start_situation' in play and 'qb_at_snap' in play and 'description' in play:
                    # if play.str.contains("start_situation&qb_at_snap&description"):
                        yfd = play['start_situation']['yfd'] < 4
                        qb_at_snap = play['qb_at_snap'] == 'Under Center'
                        description = play['description'].startswith('J.Hurts rushed')
                        if yfd and qb_at_snap and description:
                            play['game_id'] = game_id
                            shoves.append(play)
    return shoves

all_shove_plays = played_games_mask.id.apply(game_shoves)
all_shove_plays

shoves = []
for s in all_shove_plays:
    shoves.extend(s)
shoves[0]

{'type': 'play',
 'id': 'd7b85180-5364-11ee-849d-0f0c0bba0048',
 'sequence': 1694740386451.0,
 'clock': '2:46',
 'home_points': 9,
 'away_points': 7,
 'play_type': 'rush',
 'scoring_play': True,
 'goaltogo': True,
 'wall_clock': '2023-09-15T01:12:20+00:00',
 'description': 'J.Hurts rushed to MIN End Zone for 1 yards. J.Hurts for 1 yards, TOUCHDOWN.',
 'scoring_description': 'J.Hurts rushed to MIN End Zone for 1 yards. J.Hurts for 1 yards, TOUCHDOWN.',
 'men_in_box': 9,
 'fake_punt': False,
 'fake_field_goal': False,
 'screen_pass': False,
 'blitz': False,
 'play_direction': 'Middle',
 'left_tightends': 1,
 'right_tightends': 1,
 'hash_mark': 'Right Hash',
 'qb_at_snap': 'Under Center',
 'huddle': 'Huddle',
 'running_lane': 0,
 'play_action': False,
 'run_pass_option': False,
 'start_situation': {'clock': '2:46',
  'down': 2,
  'yfd': 1,
  'possession': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
   'name': 'Eagles',
   'market': 'Philadelphia',
   'alias': 'PHI',
   'sr_id': 'sr:com

We should see 28 instances of the Brotherly Shove executed by the Eagles. Let's take a look at one of these shoves.

In [52]:
shoves[0]

{'type': 'play',
 'id': 'd7b85180-5364-11ee-849d-0f0c0bba0048',
 'sequence': 1694740386451.0,
 'clock': '2:46',
 'home_points': 9,
 'away_points': 7,
 'play_type': 'rush',
 'scoring_play': True,
 'goaltogo': True,
 'wall_clock': '2023-09-15T01:12:20+00:00',
 'description': 'J.Hurts rushed to MIN End Zone for 1 yards. J.Hurts for 1 yards, TOUCHDOWN.',
 'scoring_description': 'J.Hurts rushed to MIN End Zone for 1 yards. J.Hurts for 1 yards, TOUCHDOWN.',
 'men_in_box': 9,
 'fake_punt': False,
 'fake_field_goal': False,
 'screen_pass': False,
 'blitz': False,
 'play_direction': 'Middle',
 'left_tightends': 1,
 'right_tightends': 1,
 'hash_mark': 'Right Hash',
 'qb_at_snap': 'Under Center',
 'huddle': 'Huddle',
 'running_lane': 0,
 'play_action': False,
 'run_pass_option': False,
 'start_situation': {'clock': '2:46',
  'down': 2,
  'yfd': 1,
  'possession': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
   'name': 'Eagles',
   'market': 'Philadelphia',
   'alias': 'PHI',
   'sr_id': 'sr:com

*That is a tremendous amount of data captured for a single play...*. Now we need to dive deeper and determine how sucessful they are! There are a few different ways we can do this. We can do this by looking at the `statistics` structure of each play we have recorded. For example:

In [53]:
shoves[0]['statistics']

[{'stat_type': 'rush',
  'attempt': 1,
  'yards': 1,
  'touchdown': 1,
  'firstdown': 1,
  'inside_20': 1,
  'goaltogo': 0,
  'broken_tackles': 0,
  'kneel_down': 0,
  'scramble': 0,
  'player': {'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',
   'name': 'Jalen Hurts',
   'jersey': '01',
   'position': 'QB',
   'sr_id': 'sr:player:2040065'},
  'team': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
   'name': 'Eagles',
   'market': 'Philadelphia',
   'alias': 'PHI',
   'sr_id': 'sr:competitor:4428'}},
 {'stat_type': 'first_down',
  'category': 'rush',
  'player': {'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',
   'name': 'Jalen Hurts',
   'jersey': '01',
   'position': 'QB',
   'sr_id': 'sr:player:2040065'},
  'team': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
   'name': 'Eagles',
   'market': 'Philadelphia',
   'alias': 'PHI',
   'sr_id': 'sr:competitor:4428'}}]

Again there is a tremendous amount of data here. We need to focus on the *rush* statistic. We can get the rush statistic using some loops, iterating over the `shoves` and inspecting the `stat_type` of the statistics.

1. create an empty list `stats`
2. iterate over `shoves`
3. iterate over `shoves['statistics']`
4. if the statistic's `"stat_type"` is equal to `"rush"`, then
5. add the `game_id` to the stat
6. append the stat to the `stats` list

In [54]:
stats = []
for shove in shoves:
    for stat in shove['statistics']:
        if stat['stat_type'] == 'rush':
            stat['game_id'] = shove['game_id']
            stats.append(stat)

stats[0]

{'stat_type': 'rush',
 'attempt': 1,
 'yards': 1,
 'touchdown': 1,
 'firstdown': 1,
 'inside_20': 1,
 'goaltogo': 0,
 'broken_tackles': 0,
 'kneel_down': 0,
 'scramble': 0,
 'player': {'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',
  'name': 'Jalen Hurts',
  'jersey': '01',
  'position': 'QB',
  'sr_id': 'sr:player:2040065'},
 'team': {'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',
  'name': 'Eagles',
  'market': 'Philadelphia',
  'alias': 'PHI',
  'sr_id': 'sr:competitor:4428'},
 'game_id': 'c1052576-390a-49b8-9101-88e7d802b464'}

The key here is to find the stats that have a first-down marked as a `1`. This is important, because technically both firstdowns and touchdowns will have the first-down marker. This allows us to observe what shoves were successful! We can throw it into a dataframe now and perform some cleanup. We need to ensire that the `firstdown` and `touchdown` columns are booleans with valid values, and we need to ensure that the `yards_after_contact` has valid values.

In [56]:
stats_df = pd.DataFrame(stats)
stats_df

Unnamed: 0,stat_type,attempt,yards,touchdown,firstdown,inside_20,goaltogo,broken_tackles,kneel_down,scramble,player,team,game_id,yards_after_contact
0,rush,1,1,1.0,1,1,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",c1052576-390a-49b8-9101-88e7d802b464,
1,rush,1,1,1.0,1,1,1,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",c1052576-390a-49b8-9101-88e7d802b464,
2,rush,1,3,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",c1052576-390a-49b8-9101-88e7d802b464,
3,rush,1,1,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",acf1f2a6-1e48-4971-aa52-1a32e322d63f,1.0
4,rush,1,0,,0,1,1,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",acf1f2a6-1e48-4971-aa52-1a32e322d63f,
5,rush,1,1,1.0,1,1,1,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",acf1f2a6-1e48-4971-aa52-1a32e322d63f,
6,rush,1,2,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",acf1f2a6-1e48-4971-aa52-1a32e322d63f,
7,rush,1,3,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",667e5a52-a24a-4f03-8a9b-2bf65d9c80ab,1.0
8,rush,1,1,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",667e5a52-a24a-4f03-8a9b-2bf65d9c80ab,1.0
9,rush,1,2,,1,0,0,0,0,0,"{'id': '64bd0f02-6a5d-407e-98f1-fd02048ea21d',...","{'id': '386bdbf9-9eea-4869-bb9a-274b0bc66e80',...",667e5a52-a24a-4f03-8a9b-2bf65d9c80ab,


Now that we can see all of the shoves and their success (again, as indicated by the `firstdown` column)! We can easily just take the mean of the `firstdown` column to get a success rate

In [57]:
import numpy as np
100 * np.mean(stats_df.firstdown)

85.71428571428571

That is an immensely good success rate! Let's try to reason about it by looking at a simple metric: player weight. While there are many players that participate in the play we are going to focus on core defensive and offensive lines. This will allow us to see how the `Philadelpha Eagles` stack up against opposing teams purely from a physical perspective. To do this we need to define a new function that we can apply to games to compute the average weight of the offensive and defensive lines that day. To begin let's just look at the available positions:


In [59]:
positions = pd.read_csv('notebooks/data/positions.csv')
positions

Unnamed: 0,POSITION,ABBR,SIDE
0,CENTER,C,OFFENSE
1,OFFENSIVE GUARD,G,OFFENSE
2,OFFENSIVE TACKLE,T,OFFENSE
3,QUARTERBACK,QB,OFFENSE
4,RUNNING BACK,RB,OFFENSE
5,WIDE RECEIVER,WR,OFFENSE
6,TIGHT END,TE,OFFENSE
7,DEFENSIVE LINE,DL,DEFENSE
8,DEFENSIVE TACKLE,DT,DEFENSE
9,DEFENSE END,DE,DEFENSE


We will start with using core positions:

In [60]:
oline = ['C','G','T']
dline = ['DT','DL']

And now we can define our function. This function will take in a game ID, and use that to pull player data using `nfl.get_game_rosters`. Once we determine which team (home or away) is played by the `Philadelphia Eagles` then we can access the rosters accordingly, pull out players that started/played in the game at the given positions, and record their weights! Ultimately our function should return a dictionary mapping team name to average weight.

The function should:

1. get the rosters for the given `game_id`
2. conditionally set new `home` and `away` variables equal to either `oline` or `dline` depending on the alias of the home team roster
3. create variables `home_data` and `away_data` that are each a list of weights pulled by iterating over the roster's home/away player lists, checking that their status was either `started` or `played` and that their position is in the appropriate position list
4. return a dictionary mapping team aliases to the mean of their weights.

In [63]:
def get_roster_weights(game_id):
    rosters = nfl.get_game_rosters(game_id)
    if rosters['home']['alias'] == 'PHI':
        home = oline
        away = dline
    else:
        home = oline
        away = dline
    
    home_data = [p['weight'] for p in rosters['home']['players'] if (p['position'] in home and p['status'] in ['started', 'played'])]
    away_data = [p['weight'] for p in rosters['away']['players'] if (p['position'] in away and p['status'] in ['started', 'played'])]

    return {
        rosters['home']['alias'] : np.mean(home_data),
        rosters['away']['alias'] : np.mean(away_data)
    }

stats_df.drop_duplicates('game_id').game_id.apply(get_roster_weights)

0                         {'PHI': 323.0, 'MIN': 307.75}
3               {'TB': 312.2, 'PHI': 307.3333333333333}
7                          {'PHI': 323.0, 'WAS': 309.0}
10                          {'LA': 310.0, 'PHI': 303.8}
16                        {'PHI': 322.75, 'MIA': 308.0}
20    {'WAS': 317.42857142857144, 'PHI': 307.3333333...
21                       {'PHI': 322.75, 'DAL': 309.75}
24             {'KC': 317.14285714285717, 'PHI': 311.4}
27            {'PHI': 325.3333333333333, 'BUF': 317.25}
Name: game_id, dtype: object

With some careful observation we can see that the `Philadelphia Eagles` offensive on average ways more than every defensive line they've gone up against! While this does not include linebackers and cornerbacks for the defense, this also does not include tight ends, wide receivers, running backs, and most of all Jalen Hurts on the offense.