# NFL Jersey Number Recognition to Track Players

This notebook presents a computer vision approach to tracking NFL players in video frames. Using the helmet boxes provided in the original dataset, a [dataset](https://www.kaggle.com/frlemarchand/nfl-player-numbers) was built by creating larger boxes to train a model to recognise jersey numbers. This notebook shows how to use the dataset, how to train a model, and how to predict on video frames from the test set. I am sharing it mainly to help people interested in this approach, but please bear in mind that it is a work in progress: the current results appear to be below those of other approaches such as the [simple helmet mapping](https://www.kaggle.com/its7171/nfl-baseline-simple-helmet-mapping) by [tito](https://www.kaggle.com/its7171).

In [None]:
import sys
from pathlib import Path
import os
from datetime import datetime
import time
import random
import cv2 as cv
import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.utils import class_weight

import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Dropout, Activation, Input, BatchNormalization, GlobalAveragePooling2D
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.experimental import CosineDecay
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers.experimental.preprocessing import RandomCrop,CenterCrop, RandomRotation

In [None]:
dataset_df = pd.read_csv("../input/nfl-player-numbers/train_player_numbers.csv")
dataset_df

We quickly shuffle the dataframe and keep only the images taken from the endzone. In previous attempts, jersey numbers in frames taken from the sideline were too low-resolution to read.

In [None]:
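# Shuffle the rows so that the positional train/validation split below is random.
dataset_df = dataset_df.sample(frac=1).reset_index(drop=True)
# Prefix each relative path with the dataset folder (vectorised, no iterrows needed).
dataset_df["filepath"] = "../input/nfl-player-numbers/" + dataset_df["filepath"]
# Keep only the frames taken from the endzone.
dataset_df = dataset_df[dataset_df.video_frame.str.contains("Endzone")]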

# Model training

The training process used below is heavily inspired by one of my previous notebooks, available [here](https://www.kaggle.com/frlemarchand/efficientnet-aug-tf-keras-for-cassava-diseases). Please note that I should probably have split the dataset by playID: frames within the same play can be extremely similar, so a random split can put near-duplicate frames in both the training and validation sets and make the validation score look better than it is.

In [None]:
training_percentage = 0.8
training_item_count = int(len(dataset_df)*training_percentage)
validation_item_count = len(dataset_df) - training_item_count
training_df = dataset_df[:training_item_count]
validation_df = dataset_df[training_item_count:]
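
The play-level split mentioned above could look something like the sketch below. It is not what this notebook does, only an illustration in plain Python. It assumes that `video_frame` values start with `<gameKey>_<playID>_`, which is inferred from how `find_team` parses frame names further down. The frame names and the helpers `play_key` and `split_by_play` are hypothetical.

```python
import random

def play_key(video_frame):
    # Assumed naming: "<gameKey>_<playID>_<view>_<frame>"; the first two fields identify the play.
    game_key, play_id = video_frame.split("_")[:2]
    return f"{game_key}_{play_id}"

def split_by_play(video_frames, training_percentage=0.8, seed=42):
    """Return (train_indices, validation_indices) with every play kept on one side only."""
    plays = sorted({play_key(name) for name in video_frames})
    random.Random(seed).shuffle(plays)
    n_train_plays = int(len(plays) * training_percentage)
    train_plays = set(plays[:n_train_plays])
    train_idx = [i for i, name in enumerate(video_frames) if play_key(name) in train_plays]
    val_idx = [i for i, name in enumerate(video_frames) if play_key(name) not in train_plays]
    return train_idx, val_idx

# Hypothetical frame names, for illustration only.
frames = [
    "57583_000082_Endzone_1", "57583_000082_Endzone_2",
    "57584_000336_Endzone_1", "57584_000336_Endzone_7",
    "57586_004152_Endzone_3", "57586_004152_Endzone_9",
    "57594_000923_Endzone_4", "57594_000923_Endzone_5",
    "57680_002206_Endzone_1", "57680_002206_Endzone_2",
]
train_idx, val_idx = split_by_play(frames)
train_plays = {play_key(frames[i]) for i in train_idx}
val_plays = {play_key(frames[i]) for i in val_idx}
print(len(train_idx), len(val_idx), train_plays & val_plays)
```

With the dataframe used here, the returned positions could be passed to `dataset_df.iloc[...]` to build `training_df` and `validation_df`.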

In [None]:
batch_size = 64
image_size = 64
input_shape = (image_size, image_size, 3)
dropout_rate = 0.4
classes_to_predict = sorted(training_df.label.unique())
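# Recent scikit-learn releases require keyword arguments here.
class_weights = class_weight.compute_class_weight(class_weight="balanced", classes=np.array(classes_to_predict), y=training_df.label.values)
# Keras expects class indices as keys; this mapping is only correct if the labels are the integers 0..N-1.
class_weights_dict = {i: class_weights[i] for i in range(len(classes_to_predict))}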

training_data = tf.data.Dataset.from_tensor_slices((training_df.filepath.values, training_df.label.values))
validation_data = tf.data.Dataset.from_tensor_slices((validation_df.filepath.values, validation_df.label.values))
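
For readers wondering what the "balanced" class weights above amount to, here is a minimal plain-Python sketch of the formula scikit-learn uses for that mode: `n_samples / (n_classes * count_of_class)`. The label values and the helper `balanced_class_weights` are hypothetical and only there to show the arithmetic.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight of class c = n_samples / (n_classes * number of samples in c)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in sorted(counts.items())}

# Hypothetical, imbalanced jersey-number labels.
labels = [7, 7, 7, 7, 7, 7, 12, 12, 12, 80]
weights = balanced_class_weights(labels)

# Position of each label in the sorted class list. A dict keyed by this index
# only lines up with the labels themselves when the labels are exactly 0..N-1.
label_to_index = {label: i for i, label in enumerate(sorted(weights))}

for label, weight in weights.items():
    print(label, label_to_index[label], round(weight, 3))
```

Rare classes get larger weights, and the weighted sample count still adds up to the original number of samples.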

In [None]:
def load_image_and_label_from_path(image_path, label):
    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    return img, label

AUTOTUNE = tf.data.experimental.AUTOTUNE

training_data = training_data.map(load_image_and_label_from_path, num_parallel_calls=AUTOTUNE)
validation_data = validation_data.map(load_image_and_label_from_path, num_parallel_calls=AUTOTUNE)

training_data_batches = training_data.shuffle(buffer_size=1000).batch(batch_size).prefetch(buffer_size=AUTOTUNE)
validation_data_batches = validation_data.shuffle(buffer_size=1000).batch(batch_size).prefetch(buffer_size=AUTOTUNE)

In [None]:
data_augmentation_layers = tf.keras.Sequential(
    [
        layers.experimental.preprocessing.RandomRotation(0.25),
        layers.experimental.preprocessing.RandomZoom((-0.2, 0)),
        layers.experimental.preprocessing.RandomContrast((0.2,0.2))
    ]
)

In [None]:
image = Image.open(training_df.filepath.values[1])
plt.imshow(image)
plt.show()

In [None]:
image = tf.expand_dims(np.array(image), 0)
plt.figure(figsize=(10, 10))
for i in range(9):
  augmented_image = data_augmentation_layers(image)
  ax = plt.subplot(3, 3, i + 1)
  plt.imshow(augmented_image[0])
  plt.axis("off")

In [None]:
efficientnet = EfficientNetB0(weights="../input/efficientnet-b0-for-keras-no-top/efficientnetb0_notop.h5", 
                              include_top=False, 
                              input_shape=input_shape, 
                              drop_connect_rate=dropout_rate)

inputs = Input(shape=input_shape)
augmented = data_augmentation_layers(inputs)
efficientnet = efficientnet(augmented)
pooling = layers.GlobalAveragePooling2D()(efficientnet)
dropout = layers.Dropout(dropout_rate)(pooling)
outputs = Dense(len(classes_to_predict), activation="softmax")(dropout)
model = Model(inputs=inputs, outputs=outputs)
    
model.summary()

In [None]:
epochs = 25
decay_steps = int(round(len(training_df)/batch_size))*epochs
cosine_decay = CosineDecay(initial_learning_rate=1e-3, decay_steps=decay_steps, alpha=0.3)

callbacks = [ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True)]

model.compile(loss="sparse_categorical_crossentropy", optimizer=tf.keras.optimizers.Adam(cosine_decay), metrics=["accuracy"])
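
To give an idea of what the learning rate does during training, the sketch below reimplements the cosine decay formula in plain Python, following the formula given in the TensorFlow documentation for `CosineDecay`. The `decay_steps=1000` value is a made-up example; in the notebook it depends on the dataset size, batch size, and number of epochs.

```python
import math

def cosine_decay_lr(step, initial_learning_rate=1e-3, decay_steps=1000, alpha=0.3):
    """Learning rate at a given step for a cosine decay without warmup."""
    step = min(step, decay_steps)  # the rate stays at its floor once decay_steps is reached
    cosine = 0.5 * (1 + math.cos(math.pi * step / decay_steps))
    decayed = (1 - alpha) * cosine + alpha
    return initial_learning_rate * decayed

for step in (0, 250, 500, 750, 1000, 2000):
    print(step, cosine_decay_lr(step))
```

With `alpha=0.3`, the learning rate never drops below 30% of its initial value.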

In [None]:
history = model.fit(training_data_batches,
                  epochs = epochs, 
                  validation_data=validation_data_batches,
                  class_weight=class_weights_dict,
                  callbacks=callbacks)

In [None]:
model.load_weights("best_model.h5")

# Extract video frames from the test set

In [None]:
def mk_images(video_name, video_labels, video_dir, out_dir, only_with_impact=True):
    video_path=f"{video_dir}/{video_name}"
    video_name = os.path.basename(video_path)
    vidcap = cv.VideoCapture(video_path)
    if only_with_impact:
        boxes_all = video_labels.query("video == @video_name")
        print(video_path, boxes_all[boxes_all.impact == 1.0].shape[0])
    else:
        print(video_path)
    frame = 0

    while True:
        it_worked, img = vidcap.read()
        if not it_worked:
            break
        frame += 1
        if only_with_impact:
            boxes = video_labels.query("video == @video_name and frame == @frame")
            boxes_with_impact = boxes[boxes.impact == 1.0]
            if boxes_with_impact.shape[0] == 0:
                continue
        img_name = f"{video_name}_frame{frame}"
        image_path = f'{out_dir}/{video_name}'.replace('.mp4',f'_{frame}.png')

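        try:
            written = cv.imwrite(image_path, img)
        except cv.error:
            written = False
        # cv.imwrite signals most failures by returning False rather than raising.
        if not written:
            print(img_name+" "+image_path)

    # Release the video file handle once all frames have been read.
    vidcap.release()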

In [None]:
test_frames_folder = "../working/test_frames"
# exist_ok avoids an error when the cell is run a second time.
os.makedirs(test_frames_folder, exist_ok=True)

video_dir = '../input/nfl-health-and-safety-helmet-assignment/test'
video_folder = os.listdir(video_dir)
for video_name in video_folder:
    print(video_name)
    mk_images(video_name, pd.DataFrame(), video_dir, test_frames_folder, only_with_impact=False)
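
The file naming used by `mk_images` matters because the prediction cells look up helmet boxes by `video_frame` and read the game and play identifiers from the file name. The sketch below walks through that round trip in plain Python. The video name is hypothetical, and the helpers `frame_image_path`, `video_frame_key` and `game_and_play` are only illustrations of the string handling, not functions used elsewhere in this notebook.

```python
import os

def frame_image_path(out_dir, video_name, frame):
    # Same construction as in mk_images: swap the ".mp4" suffix for "_<frame>.png".
    return f"{out_dir}/{video_name}".replace(".mp4", f"_{frame}.png")

def video_frame_key(image_path):
    # Key used to look up helmet boxes: the file name without its extension.
    return os.path.basename(image_path).replace(".png", "")

def game_and_play(video_frame):
    # Assumed naming: "<gameKey>_<playID>_<view>_<frame>".
    game_id, play_id = video_frame.split("_")[:2]
    return int(game_id), int(play_id)

# Hypothetical video name, for illustration only.
path = frame_image_path("../working/test_frames", "57906_000718_Endzone.mp4", 12)
key = video_frame_key(path)
print(path)
print(key, game_and_play(key))
```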

# Recognise jersey numbers in new video frames

In [None]:
test_helmet_df = pd.read_csv("../input/nfl-health-and-safety-helmet-assignment/test_baseline_helmets.csv")
test_tracking_df = pd.read_csv("../input/nfl-health-and-safety-helmet-assignment/test_player_tracking.csv")

In [None]:
def find_team(jersey_number, video_frame_name):
    '''
    Find which team a player belongs to based on the jersey number.
    Return the matching player code, or None unless exactly one player
    in the play wears that number.
    '''
    
    game_id = int(video_frame_name.split("_")[0])
    play_id = int(video_frame_name.split("_")[1])
    player_list = test_tracking_df.query("gameKey==@game_id and playID==@play_id").player.unique()
    possible_players = [player_code for player_code in player_list if jersey_number==int(player_code[1:])]

    if len(possible_players)==1:
        return possible_players[0]
    else:
        return None

def extract_player_jersey(video_frame_name, display=False):
    '''
    Get the helmet boxes for a frame and apply the model.
    If a player is predicted twice, keeps the prediction
    with the highest confidence score.
    '''
    
    predictions = []
    img = np.array(Image.open(test_frames_folder+"/"+str(video_frame_name)))
    frame_df = test_helmet_df[test_helmet_df["video_frame"]==video_frame_name.replace(".png","")]

    baseline_boxes = np.array([np.array([row.left, row.top, row.left+row.width, row.top+row.height ])  for idx, row in frame_df.iterrows()])
    for idx, box in enumerate(baseline_boxes):
        box_centre = int(box[0]+round((box[2]-box[0])/2))
        jersey_box = img[box[3]-24:box[3]+40,box_centre-32:box_centre+32,:]
        
        if jersey_box.shape==(64,64,3):
            result = model.predict(np.array([np.array(jersey_box)]))
            predicted_jersey_number = np.argmax(result)
            confidence = result[0][np.argmax(result)] 
            confidence_threshold = 0.90

            if confidence>confidence_threshold:
                predicted_player_code = find_team(predicted_jersey_number, video_frame_name)
                
                if predicted_player_code is not None:
                    player_already_detected = [(i, item) for i, item in enumerate(predictions) if item["label"] == predicted_player_code]
                    prediction_data = {"video_frame": video_frame_name.replace(".png", ""),
                                       "label": predicted_player_code,
                                       "left": frame_df.iloc[idx].left,
                                       "width": frame_df.iloc[idx].width,
                                       "top": frame_df.iloc[idx].top,
                                       "height": frame_df.iloc[idx].height,
                                       "confidence": confidence}
                    
                    if player_already_detected==[]:
                        predictions.append(prediction_data)
                    else:
                        if player_already_detected[0][1]['confidence']<confidence:
                            dict_index_to_remove = player_already_detected[0][0]
                            del predictions[dict_index_to_remove]
                            predictions.append(prediction_data)
                    
                    if display:
                        print(predicted_player_code, confidence)
                        plt.imshow(jersey_box)
                        plt.show()
        
    return predictions
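
The jersey crop in `extract_player_jersey` is a 64x64 window centred horizontally on the helmet box, starting 24 pixels above the bottom edge of the helmet and extending 40 pixels below it. The sketch below reproduces that geometry in plain Python with an explicit bounds check. In the notebook, a crop that leaves the frame is only rejected indirectly by the shape test. The frame size of 720x1280 and the helper `jersey_crop_bounds` are assumptions made for the example.

```python
def jersey_crop_bounds(left, top, width, height, frame_height, frame_width,
                       size=64, rows_above_bottom=24):
    """Return (row0, row1, col0, col1) of the jersey crop, or None if it leaves the frame."""
    bottom = top + height
    centre = int(left + round(width / 2))
    row0 = bottom - rows_above_bottom
    row1 = row0 + size
    col0 = centre - size // 2
    col1 = col0 + size
    if row0 < 0 or col0 < 0 or row1 > frame_height or col1 > frame_width:
        return None
    return row0, row1, col0, col1

# Hypothetical helmet boxes (left, top, width, height) in a 720x1280 frame.
inside = jersey_crop_bounds(100, 50, 20, 20, frame_height=720, frame_width=1280)
too_far_left = jersey_crop_bounds(5, 10, 20, 20, frame_height=720, frame_width=1280)
too_low = jersey_crop_bounds(600, 690, 20, 20, frame_height=720, frame_width=1280)
print(inside, too_far_left, too_low)
```

The returned bounds can be used directly as `img[row0:row1, col0:col1, :]`.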

For this run, we only predict on frames taken from the endzone, as the jersey numbers are often easier to see there.

In [None]:
frame_list = os.listdir(test_frames_folder)
frame_list = [x for x in frame_list if "Endzone" in x]
random.seed(42)
frames_to_test = random.sample(frame_list, 6) 

In [None]:
extract_player_jersey(frames_to_test[0], display=True)

In [None]:
extract_player_jersey(frames_to_test[1], display=True)

In [None]:
extract_player_jersey(frames_to_test[2], display=True)

In [None]:
extract_player_jersey(frames_to_test[3], display=True)

In [None]:
extract_player_jersey(frames_to_test[4], display=True)

In [None]:
extract_player_jersey(frames_to_test[5], display=True)

I have written a function to build the submission predictions, but it is still too early to be spamming the leaderboard!

In [None]:
def predict_for_submission(frame_list):
    prediction_list = []
    with tqdm(total=len(frame_list)) as pbar:
        for video_frame in frame_list:
            prediction = extract_player_jersey(video_frame)
            prediction_list += prediction
            pbar.update(1)
    return pd.DataFrame(prediction_list)
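
`extract_player_jersey` keeps, for each frame, only the most confident prediction per player code. As a rough sketch, the same rule can be written in one pass with a dictionary keyed by `(video_frame, label)`, which would also work on the concatenated list built by `predict_for_submission`. The helper `keep_most_confident` and the sample predictions are hypothetical.

```python
def keep_most_confident(predictions):
    """Keep one prediction per (video_frame, label): the one with the highest confidence."""
    best = {}
    for prediction in predictions:
        key = (prediction["video_frame"], prediction["label"])
        current = best.get(key)
        if current is None or prediction["confidence"] > current["confidence"]:
            best[key] = prediction
    return list(best.values())

# Hypothetical predictions, for illustration only.
raw_predictions = [
    {"video_frame": "57906_000718_Endzone_12", "label": "H23", "confidence": 0.93},
    {"video_frame": "57906_000718_Endzone_12", "label": "H23", "confidence": 0.98},
    {"video_frame": "57906_000718_Endzone_12", "label": "V45", "confidence": 0.95},
    {"video_frame": "57906_000718_Endzone_13", "label": "H23", "confidence": 0.91},
]
deduplicated = keep_most_confident(raw_predictions)
for prediction in deduplicated:
    print(prediction["video_frame"], prediction["label"], prediction["confidence"])
```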

In [None]:
predict_for_submission(frame_list[:30])

## Thanks for reading this notebook! If you found this notebook helpful, please give it an upvote. It is always greatly appreciated!

In [None]:
!rm -rf ../working/test_frames