# NFL 1st & Future 2021


In this competition we are provided video clips of NFL plays along with player tracking data. Our goal is to create a model that can produce bounding boxes around players' helmets and identify when collisions occur. In addition to the video clips, we are also given some images with labeled bounding boxes to aid our object detection algorithm.

This competition is evaluated using a micro F1 score at an Intersection over Union (IoU) threshold of 0.35.
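
As a rough illustration of the two quantities named above, here is a minimal sketch of IoU for boxes in the `(left, top, width, height)` layout used by the label files, plus an F1 score computed from pooled counts. The function names `iou` and `micro_f1` are my own, and the exact procedure the competition uses to match predictions to ground truth is not described here, so treat this as the formulas only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (left, top, width, height) boxes."""
    a_left, a_top, a_w, a_h = box_a
    b_left, b_top, b_w, b_h = box_b

    # Width and height of the overlapping rectangle (0 if the boxes are disjoint)
    inter_w = max(0, min(a_left + a_w, b_left + b_w) - max(a_left, b_left))
    inter_h = max(0, min(a_top + a_h, b_top + b_h) - max(a_top, b_top))
    inter = inter_w * inter_h

    union = a_w * a_h + b_w * b_h - inter
    return inter / union if union > 0 else 0.0


def micro_f1(tp, fp, fn):
    """F1 from true positive, false positive and false negative counts pooled over all frames."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two 10x10 boxes shifted by half a width overlap with IoU = 50 / 150 = 1/3,
# which falls just below the 0.35 threshold.
is_match = iou((0, 0, 10, 10), (5, 0, 10, 10)) >= 0.35
```

Under this sketch, a predicted box would count as a true positive only when its IoU with a ground-truth impact box reaches 0.35.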

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as patches

import imageio
import cv2
import subprocess

from IPython.display import Video, display
import os

sns.set_style("whitegrid")
colorpal = sns.color_palette("husl", 9)

import warnings
warnings.filterwarnings("ignore")

# Data Overview
- `train_labels.csv` - Helmet tracking and collision labels for the training set.
- `sample_submission.csv` - A valid sample submission file.
- `image_labels.csv` - Bounding boxes corresponding to the images.
- `[train/test]_player_tracking.csv` - Each player wears a sensor that allows us to precisely locate them on the field; that information is reported in these two files.

Folders:
- `/train/` contains the mp4 video files for the training plays. Each play has both an `endzone` and `sideline` view.
- `/test/` contains the videos for the test set. The public dataset shows only 2 videos; these are just examples and are already in the training set. When your model is submitted, it will run on 15 unseen videos. We are told that 20% of the test videos will produce the public LB score and 80% will produce the private score (3 plays public LB, 12 private), so there may be some shakeup on the private leaderboard!
- `/images/` contains the additional annotated images of player helmets.

In [None]:
!ls ../input/nfl-impact-detection/ -GFlash --color

In [None]:
tr_labels = pd.read_csv('../input/nfl-impact-detection/train_labels.csv')
img_labels = pd.read_csv('../input/nfl-impact-detection/image_labels.csv')
ss = pd.read_csv('../input/nfl-impact-detection/sample_submission.csv')

tr_tracking = pd.read_csv('../input/nfl-impact-detection/train_player_tracking.csv')
te_tracking = pd.read_csv('../input/nfl-impact-detection/test_player_tracking.csv')

# Understanding the Label and Metric
- In our submission we are told "Each row in your submission represents a single predicted bounding box for the given frame. Note that it is not required to include labels of which players had an impact, only a bounding box where it occurred."
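
To make "one row per predicted box" concrete, here is a small sketch that turns the detections for one frame into row dictionaries. The helper name `detections_to_rows` and the column names are assumptions borrowed from the training labels; check `ss.columns` for the columns the submission file actually requires.

```python
def detections_to_rows(video, frame, boxes):
    """Build one row per predicted box; boxes are (left, top, width, height) tuples."""
    return [
        {"video": video, "frame": frame,
         "left": left, "width": width, "top": top, "height": height}
        for (left, top, width, height) in boxes
    ]


# Hypothetical detections for a single frame of one video
rows = detections_to_rows("57584_000336_Endzone.mp4", 12,
                          [(100, 50, 20, 22), (300, 80, 18, 19)])
```

A list of such rows can be passed straight to `pd.DataFrame(rows)` before writing the submission file.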

Let's dig deeper into the labels to see some examples from the training data and then gather some statistics about the training set.

We can see that:
- Training videos provided are sideline and endzone views for 60 plays, 120 videos in total.

In [None]:
# Number of unique videos
tr_labels['video'].nunique()
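
Since each play appears once per view, it can be handy to split a video file name into its parts. This is a minimal sketch that assumes names follow the `<gameKey>_<playID>_<view>.mp4` pattern suggested by files such as `57584_000336_Endzone.mp4`; the helper `parse_video_name` is hypothetical.

```python
def parse_video_name(video_name):
    """Split '<gameKey>_<playID>_<view>.mp4' into (gameKey, playID, view)."""
    stem = video_name.rsplit(".", 1)[0]       # drop the file extension
    game_key, play_id, view = stem.split("_")
    return int(game_key), int(play_id), view


game_key, play_id, view = parse_video_name("57584_000336_Endzone.mp4")
```

Counting distinct `(gameKey, playID)` pairs obtained this way should give half the number of videos if every play has both views.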

The length of each play varies. Most plays are around 300 frames, but the longest is over 600 frames.

In [None]:
play_frame_count = tr_labels[['gameKey','playID','frame']] \
    .drop_duplicates()[['gameKey','playID']] \
    .value_counts()

fig, ax = plt.subplots(figsize=(12, 5))
# histplot replaces the deprecated distplot; counts are per play (gameKey, playID)
sns.histplot(play_frame_count, bins=15, kde=True, ax=ax)
ax.set_title('Distribution of frames per play')
plt.show()

The bounding box size depends on a number of factors:
- The distance of a player from the camera.
- The camera's angle and zoom relative to the field.
- One player's helmet may be blocked from view by another player.

We can calculate the area of each bounding box in the training set as `width x height`.

In [None]:
tr_labels['area'] = tr_labels['width'] * tr_labels['height']
fig, ax = plt.subplots(figsize=(12, 5))
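# Plot the areas themselves (not the counts of each unique area value);
# histplot replaces the deprecated distplot
sns.histplot(tr_labels['area'],
             bins=10,
             kde=True,
             color=colorpal[1],
             ax=ax)
ax.set_title('Distribution of bounding box sizes')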
plt.show()

In [None]:
tr_labels['label'].value_counts() \
    .sort_values() \
    .tail(25).plot(kind='barh',
                   figsize=(15, 5),
                   title='Top 25 Box Labels',
                   color=colorpal[3])
plt.show()

The impacts are labeled by type: helmet, shoulder, body, etc. We can see that the majority of impacts are with other helmets, but shoulder and body impacts do occur. Our submission does not need to identify the impact type, but it may be helpful information when training models.

In [None]:
tr_labels['impactType'].value_counts() \
    .plot(kind='bar',
          title='Impact Type Count',
          figsize=(12, 4),
          color=colorpal[4])

plt.show()

# Impact Type by Frame
This plot shows the relationship between the impact type and the time within the video. Notice that helmet and shoulder impacts tend to occur earlier in the plays. Body impacts come next, followed by ground impacts, which commonly occur near the middle or end of the play.

In [None]:
for i, d in tr_labels.groupby('impactType'):
    if len(d) < 10:
        continue
    d['frame'].plot(kind='kde', alpha=1, figsize=(12, 4), label=i,
                    title='Impact Type by Frame')
    plt.legend()

## Impact Occurrence
Next we will look at the occurrence of impact events in the training videos. These events are extremely rare. Out of every 1000 bounding boxes, roughly 2.3 involve an impact.

In [None]:
pct_impact_occurance = tr_labels[['video','impact']] \
    .fillna(0)['impact'].mean() * 100
print(f'Of all bounding boxes, {pct_impact_occurance:0.4f}% of them involve an impact event')

We can also look at the impact percentage by frame in the video.

In [None]:
# Keep only numeric columns: averaging the string 'video' column fails in recent pandas
tr_labels[['impact','frame']] \
    .fillna(0) \
    .groupby(['frame']).mean() \
    .plot(figsize=(12, 5), title='Occurrence of impacts by frame in video.',
          color=colorpal[6])
plt.show()

# Pairplot of Bounding Box, Impact vs Non-Impact
These plots attempt to quickly identify if there is any commonality between the location of the bounding box and where impacts occur. It appears that the locations tend to be 

In [None]:
sns.pairplot(tr_labels[['frame','area',
                        'left','width',
                        'top','height',
                        'impact']] \
                .sample(5000).fillna(0),
             hue='impact')
plt.show()

Similarly we can look at the impact type by bounding box location and area.

In [None]:
sns.pairplot(tr_labels[['frame','area',
                        'left', 'top',
                        'impactType']].dropna() \
             .sample(1000), hue='impactType',
            plot_kws={'alpha': 0.5})
plt.show()

# Confidence Label
1 = Possible, 2 = Definitive, 3 = Definitive and Obvious

In [None]:
tr_labels['confidence'].dropna() \
    .astype('int').value_counts() \
    .plot(kind='bar',
          title='Confidence Type Label Count',
          figsize=(12, 4),
          color=colorpal[5], rot=0)
plt.show()

# Visibility Label
Visibility labels are: 0 = Not Visible from View, 1 = Minimum, 2 = Visible, 3 = Clearly Visible

In [None]:
tr_labels['visibility'].dropna() \
    .astype('int').value_counts() \
    .plot(kind='bar',
          title='Visibility Label Count',
          figsize=(12, 4),
          color=colorpal[6], rot=0)
plt.show()
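
The confidence and visibility codes can be combined to keep only the impacts that are both definitive and visible. The sketch below applies the same condition that the frame-annotation function later in this notebook uses to color boxes red (`impact == 1`, `confidence > 1`, `visibility > 0`). The helper name `definitive_impacts` and the toy data are assumptions for illustration only.

```python
import pandas as pd


def definitive_impacts(labels: pd.DataFrame) -> pd.DataFrame:
    """Keep rows flagged as impacts with confidence > 1 and visibility > 0."""
    mask = (
        (labels["impact"] == 1)
        & (labels["confidence"] > 1)
        & (labels["visibility"] > 0)
    )
    return labels.loc[mask]


# Toy stand-in for tr_labels; rows without an impact have missing values
toy_labels = pd.DataFrame({
    "impact": [1, 1, 1, None],
    "confidence": [3, 1, 2, None],
    "visibility": [2, 3, 0, None],
})
kept = definitive_impacts(toy_labels)
```

Only the first toy row survives: the second is merely "possible" (confidence 1), the third is not visible from the view, and the fourth is not an impact at all.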

# Images

The images are still-photo equivalents of the train/test videos, intended for building a helmet detector. Let's load an image and plot its labels on it.

First we will create a function that highlights the labels on the images, then we can plot 8 example images to get a good understanding of what they look like.

In [None]:
def plot_example_image(img_fn, ax, highlight_color='r', highlight_alpha=0.5):
    img_data = cv2.imread(f'../input/nfl-impact-detection/images/{img_fn}')
    # OpenCV loads images as BGR; convert to RGB so matplotlib shows true colors
    img_data = cv2.cvtColor(img_data, cv2.COLOR_BGR2RGB)
    ax.imshow(img_data)
    ax.grid(False)

    # Draw one semi-transparent rectangle per labeled helmet in this image
    for _, d in img_labels.loc[img_labels['image'] == img_fn].iterrows():
        rect = patches.Rectangle((d['left'], d['top']),
                                 d['width'],
                                 d['height'],
                                 linewidth=1,
                                 edgecolor=highlight_color,
                                 facecolor=highlight_color,
                                 alpha=highlight_alpha)
        ax.add_patch(rect)
    ax.axis('off')
    ax.set_title(img_fn)
    return ax

In [None]:
# Loop through 8 example images
fig, axs = plt.subplots(4, 2, figsize=(14, 16))
axs = axs.flatten()
example_images = img_labels.sample(8, random_state=999)['image']
for ax, example_image in zip(axs, example_images):
    plot_example_image(example_image, ax)
plt.show()

# Training videos

We can pull an example video to see what they look like.

In [None]:
tr_labels.dropna().iloc[50]

Let's read the first example's video file.

In [None]:
# Function modified to annotate a single frame. Original from:
# https://www.kaggle.com/samhuddleston/nfl-1st-and-future-getting-started
def annotate_frame(video_path: str, video_labels: pd.DataFrame, stop_frame: int) -> str:
    VIDEO_CODEC = "MP4V"
    HELMET_COLOR = (0, 0, 0)    # Black
    IMPACT_COLOR = (0, 0, 255)  # Red
    video_name = os.path.basename(video_path)
    
    vidcap = cv2.VideoCapture(video_path)
    fps = vidcap.get(cv2.CAP_PROP_FPS)
    width = int(vidcap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(vidcap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    output_path = "labeled_" + video_name
    tmp_output_path = "tmp_" + output_path
    output_video = cv2.VideoWriter(tmp_output_path, cv2.VideoWriter_fourcc(*VIDEO_CODEC), fps, (width, height))
    frame = 0
    while True:
        it_worked, img = vidcap.read()
        if not it_worked:
            break
        
        # We need to add 1 to the frame count to match the label frame index that starts at 1
        frame += 1
        if frame != stop_frame:
            continue
        
        # Let's add a frame index to the video so we can track where we are
        img_name = f"{video_name}_frame{frame}"
        cv2.putText(img, img_name, (0, 50), cv2.FONT_HERSHEY_SIMPLEX, 1.0, HELMET_COLOR, thickness=2)
    
        # Now, add the boxes
        boxes = video_labels.query("video == @video_name and frame == @frame")
        for box in boxes.itertuples(index=False):
            if box.impact == 1 and box.confidence > 1 and box.visibility > 0:    # Filter for definitive head impacts and turn labels red
                color, thickness = IMPACT_COLOR, 2
            else:
                color, thickness = HELMET_COLOR, 1
            # Add a box around the helmet
            cv2.rectangle(img, (box.left, box.top), (box.left + box.width, box.top + box.height), color, thickness=thickness)
            cv2.putText(img, box.label, (box.left, max(0, box.top - 5)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, thickness=1)
        output_video.write(img)
    output_video.release()
    
    # Not all browsers support the codec, we will re-load the file at tmp_output_path and convert to a codec that is more broadly readable using ffmpeg
    if os.path.exists(output_path):
        os.remove(output_path)
    subprocess.run(["ffmpeg", "-i", tmp_output_path, "-crf", "18", "-preset", "veryfast", "-vcodec", "libx264", output_path])
    os.remove(tmp_output_path)
    
    return output_path

In [None]:
video_name = tr_labels.dropna().reset_index().iloc[50]['video']
video_path = f"/kaggle/input/nfl-impact-detection/train/{video_name}"
display(Video(data=video_path, embed=True))

In [None]:
annotate_frame(video_path, video_labels=tr_labels,
               stop_frame=tr_labels.dropna().iloc[50]['frame'])

# Example Single Frame with annotations

In [None]:
display(Video(data='labeled_57584_000336_Endzone.mp4', embed=True))

# How many boxes per frame?

In [None]:
tr_labels['counter'] = 1
tr_labels.groupby(['frame'])['counter'] \
    .sum().plot(title='Bounding Boxes by Video Frame',
                figsize=(15, 5))
plt.show()
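
Note that grouping by `frame` alone adds up the boxes from every video that has that frame index. To count the boxes within each individual frame, one option is to group by both `video` and `frame`. The toy data below is a stand-in for `tr_labels`, so this is only a sketch of the approach.

```python
import pandas as pd

# Toy stand-in for tr_labels: one row per bounding box
toy_labels = pd.DataFrame({
    "video": ["a.mp4", "a.mp4", "a.mp4", "b.mp4"],
    "frame": [1, 1, 2, 1],
})

# Number of boxes in each (video, frame) pair
boxes_per_frame = toy_labels.groupby(["video", "frame"]).size()

# Average number of boxes visible in a single frame
mean_boxes = boxes_per_frame.mean()
```

Applied to `tr_labels`, `boxes_per_frame` would give a per-frame helmet count that can be plotted as a histogram or compared between the two camera views.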