# Starter

# What do we want? (Problem statement)

#### Understanding the problem is a very important step.

#### After reading the information on the competition’s home page, we can easily understand the situation we are dealing with. We can also read more about the NFL and its health and safety efforts at www.NFL.com/PlayerHealthandSafety.


#### Here I am trying to put things simply:

#### In a sense, we can say that
#### we are given an image and we need to detect some objects (bounding boxes, or bboxes) in the image along with their labels.


#### Image (the first image in the images folder)

In [1]:
# Let's see the image

from PIL import Image, ImageDraw

img = Image.open('../input/nfl-health-and-safety-helmet-assignment/images/57503_000116_Endzone_frame443.jpg')

img

#### We can see some players playing football.

#### The first task is to detect the helmets and draw their bounding boxes. Here we draw the boxes given in image_labels.csv.

In [20]:
import pandas as pd 

df = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/image_labels.csv')
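# Keep only the rows (one per helmet) that belong to this image
df = df[df["image"] == "57503_000116_Endzone_frame443.jpg"]

draw_obj = ImageDraw.Draw(img)

# Draw one red rectangle per helmet: (left, top) to (left + width, top + height)
for _, (l, w, t, h) in df[['left', 'width', 'top', 'height']].iterrows():
    draw_obj.rectangle(((l, t), (l + w, t + h)), outline=(255, 0, 0), width=2)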

img

#### We can see that the helmets are marked properly. The next step is to give each helmet the proper label.

#### Every player has one helmet, and the label is the associated player's number.


#### So we can rephrase our sentence as:

#### We are given an image of players playing in the NFL, and we want to detect the helmets in the image and also provide the associated player's number as the label.

#### Let’s have a look at the sample_submission.csv file

In [3]:
sub_df = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/sample_submission.csv')
sub_df

#### Now we have a basic understanding of the problem statement. There is a lot more to the story, which will be covered when we walk through the provided data files in the next section. Have fun :)

In [4]:
#sub_df.info()

# What do we have? (Data analysis)

#### Well, there is a lot in the data, but primarily we have folders and CSV files.

#### Let's start with the folders

* images
* train/test

#### The images folder contains a lot of images from the game (like the one we saw above). This folder is given for training the helmet detector model.

#### The train/test folders contain the video files that we have to use when training and running inference with the helmet and label detector model.

In [5]:
#folders

## Let's have a look at CSV files 

### image_labels.csv

#### This file contains the corresponding bbox for helmets in the images from the images folder 

In [6]:
# load the file

df_image_labels = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/image_labels.csv')
df_image_labels.head()

#### Let's look at one image (the first one in the table above)

In [7]:
img = Image.open('../input/nfl-health-and-safety-helmet-assignment/images/57503_000116_Endzone_frame443.jpg')
img

#### Let's see what the first data point in image_labels.csv says.
#### We are only considering the first row:

57503_000116_Endzone_frame443.jpg	Helmet	**1099	16	456	15**

#### The bold values are left, width, top, and height, in that order.

In [8]:

draw_obj = ImageDraw.Draw(img)

draw_obj.rectangle(((1099, 456), (1099 + 16, 456 + 15)), outline=(255, 0, 0))
img

#### We see that only one helmet is marked.

#### That means every row in image_labels.csv represents a single helmet in a single image,
#### and there are a lot of helmets in the image.

#### In other words: if there are x helmets in an image, then there will be x rows in image_labels.csv for that image.
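
#### A quick way to check the "x helmets means x rows" idea is to count rows per image. This is only a sketch on a tiny made-up table (the file names and box values are invented); the column names follow the ones used in this notebook.

```python
import pandas as pd

# Toy stand-in for image_labels.csv (values are made up for illustration).
toy_labels = pd.DataFrame({
    "image": ["a.jpg", "a.jpg", "a.jpg", "b.jpg"],
    "left": [1099, 1117, 828, 40],
    "width": [16, 15, 16, 20],
    "top": [456, 478, 519, 60],
    "height": [15, 16, 24, 22],
})

# One row per helmet, so the number of rows per image is the helmet count.
helmets_per_image = toy_labels["image"].value_counts()
print(helmets_per_image)
```

#### On the real file, the same call on df_image_labels["image"] should give the number of labelled helmets in each image.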


#### We can simply iterate over all the rows corresponding to our image to mark all the helmets.

In [9]:
# Keep only the rows (one per helmet) that belong to this image
df = df_image_labels[df_image_labels["image"] == "57503_000116_Endzone_frame443.jpg"]
for _, (l, w, t, h) in df[['left', 'width', 'top', 'height']].iterrows():
    draw_obj.rectangle(((l, t), (l + w, t + h)), outline=(255, 0, 0), width=2)

img

#### Now we should be clear about images and image_labels.csv; together, they can be used to train the helmet detector model.
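
#### Many detection tools expect corner coordinates or a box centre instead of (left, width, top, height). The two helper functions below are hypothetical names introduced only for this sketch; they show the conversion on the first row we looked at.

```python
def box_to_corners(left, width, top, height):
    """Convert (left, width, top, height) to (x_min, y_min, x_max, y_max)."""
    return left, top, left + width, top + height


def box_center(left, width, top, height):
    """Return the (x, y) centre of the box."""
    return left + width / 2, top + height / 2


# First row of image_labels.csv shown above: left=1099, width=16, top=456, height=15
corners = box_to_corners(1099, 16, 456, 15)
center = box_center(1099, 16, 456, 15)
print(corners, center)
```

#### The corner form is what ImageDraw.rectangle was given in the cells above.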

## train_labels.csv

#### Before going to the CSV file, let’s take a closer look at the train folder and its videos.



#### For those who have no idea what this game is...

#### Let’s think of it like this:

#### There is a play going on in a closed room. There are two cameras in the room: one on the front wall (Endzone view) and one on the side wall (Sideline view) :)

#### So there are two videos for each play: one recorded everything from the front and the other recorded everything from the side, but both show the same play.


#### Let’s have a look at the first video

#### Endzone

In [10]:

from IPython.display import Video

Video("../input/nfl-health-and-safety-helmet-assignment/train/57583_000082_Endzone.mp4")

# Note: if the video does not start here, just double-click the video in the dataset and enjoy ( $ _ $ )

#### Sideline

In [11]:
Video("../input/nfl-health-and-safety-helmet-assignment/train/57583_000082_Sideline.mp4")

#### Well, that cleared up a lot about the videos...

#### A video is just a sequence of frames (images) shown in quick succession. The number of frames shown per second is called the frame rate.
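
#### With a known frame rate, a frame number can be turned into a time offset. This is a small sketch: the value of fps below is a placeholder rather than the real frame rate of these videos, and frame numbers are assumed to start at 1.

```python
def frame_to_seconds(frame, fps):
    """Time offset (in seconds) of a 1-based frame number from the video start."""
    return (frame - 1) / fps


fps = 60.0  # hypothetical frame rate; read the real one from the video file
offset = frame_to_seconds(443, fps)
print(f"frame 443 starts about {offset:.3f} s into the video")
```

#### For a real file, cv2.VideoCapture(path).get(cv2.CAP_PROP_FPS) reports the frame rate stored in the video.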

#### Previously, in image_labels.csv, we saw that one image contains a lot of helmets and each helmet takes one row of the CSV file, which makes the file large. Here we can think of two images (one from the Endzone video and one from the Sideline video) representing the same moment in the play.


#### To simplify:

#### There is a play.

#### There are two videos for this play.

#### Each video contains x frames.

#### Each frame makes y rows in train_labels.csv (i.e. y helmets).
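
#### The play, video, frame, row hierarchy can be checked with two groupby calls. The table below is a made-up miniature, not real data; only the column names video, frame, and label are taken from train_labels.csv.

```python
import pandas as pd

# Toy stand-in for train_labels.csv: one play seen by two cameras.
toy_train = pd.DataFrame({
    "video": ["play_Endzone.mp4"] * 3 + ["play_Sideline.mp4"] * 2,
    "frame": [1, 1, 2, 1, 1],
    "label": ["A", "B", "A", "A", "B"],
})

# x: number of distinct frames in each video
frames_per_video = toy_train.groupby("video")["frame"].nunique()

# y: number of rows (helmets) in each frame of each video
helmets_per_frame = toy_train.groupby(["video", "frame"]).size()

print(frames_per_video)
print(helmets_per_frame)
```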


#### Now that we have seen what’s in the train folder, we can start understanding train_labels.csv

In [12]:
df_train_labels = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/train_labels.csv')
df_train_labels.head()


#### Let’s go over the column names

* gameKey: ID of the game
* playID: ID of the play
* video: name of the video file

#### Extra note on video file names

#### Take the example 57583_000082_Endzone.mp4. Here:

#### the string before the first '_' is the gameKey (57583)
#### the string between the first and second '_' is the playID (000082, i.e. 82)
#### the string after the second '_' is the view of the video (either Endzone or Sideline)

* view: which camera was used (either Endzone or Sideline)
* video_frame: name of the video frame
* frame: frame number (gives the order of the frames)
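
#### The file-name convention from the note above can be unpacked with a few lines of Python. parse_video_name is a hypothetical helper written for this sketch, and it assumes every file name has exactly the three parts described.

```python
from pathlib import Path


def parse_video_name(filename):
    """Split '<gameKey>_<playID>_<view>.mp4' into its three parts."""
    stem = Path(filename).stem                # drop the ".mp4" extension
    game_key, play_id, view = stem.split("_")
    return int(game_key), int(play_id), view  # int() drops the leading zeros


parsed = parse_video_name("57583_000082_Endzone.mp4")
print(parsed)
```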


#### The values in the columns we have seen so far are repeated across many rows of the dataframe

In [13]:
# let's look at the columns
print(df_train_labels.columns)

#### Remember, every frame (image) has a lot of helmets, and every helmet is represented by four values: left, width, top, and height.


* label: the player wearing the helmet (the player's number)
* left, width, top, height: define the exact position of the helmet in the image


#### (The following descriptions are the same as those given on the competition’s home page.)
* impactType: a description of the type of helmet impact: helmet, shoulder, body, ground, etc.
* isDefinitiveImpact: True/False indicator of definitive impacts. Definitive impact boxes are given additional weight in the scoring algorithm.
* isSidelinePlayer: True/False indicator of if the helmet box is on the sideline of the play. Only rows where this field is False will be used in the scoring.
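
#### Since only non-sideline rows are scored, it can help to filter them out early. A minimal sketch on invented rows, assuming the two flag columns load as booleans:

```python
import pandas as pd

# Toy rows (made up); only the column names come from train_labels.csv.
toy = pd.DataFrame({
    "label": ["A", "B", "C"],
    "isSidelinePlayer": [False, True, False],
    "isDefinitiveImpact": [False, False, True],
})

# Keep the rows that count for scoring.
scored_rows = toy[~toy["isSidelinePlayer"]]

# Definitive impacts carry extra weight, so it is worth knowing how many there are.
n_definitive = int(scored_rows["isDefinitiveImpact"].sum())
print(scored_rows["label"].tolist(), n_definitive)
```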


#### And that's all :)


In [14]:
#end

## train/test_baseline_helmets.csv


#### Let's look at the data

In [15]:

df_train_baseline = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/train_baseline_helmets.csv')

df_train_baseline.head()

#### In this data, we see that there are three categories of columns

1. video_frame: 
    name of the video frame
    
2. left, width, top, height: 
    predicted position of the helmet in the image
    
3. conf: 
    confidence of the model (how sure the model is about the position of the helmet in the image)


#### This file contains imperfect baseline predictions for helmet boxes.

#### So what does "imperfect baseline" mean?

* A model was trained on the images in the images folder, and its predictions are given to us as a baseline. In other words, we should try to develop a model that at least outperforms this one (scores above the baseline).
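
#### Because the baseline boxes are imperfect, one simple thing to try is dropping low-confidence boxes. The rows and the threshold below are made up for this sketch; a sensible cut-off has to be tuned on the real data.

```python
import pandas as pd

# Toy stand-in for train_baseline_helmets.csv (values are invented).
toy_baseline = pd.DataFrame({
    "video_frame": ["v_1", "v_1", "v_1"],
    "left": [10, 50, 90],
    "width": [8, 8, 8],
    "top": [20, 20, 20],
    "height": [9, 9, 9],
    "conf": [0.92, 0.35, 0.71],
})

CONF_THRESHOLD = 0.5  # hypothetical cut-off, not a recommended value

# Keep only the boxes the baseline model is reasonably sure about.
confident = toy_baseline[toy_baseline["conf"] > CONF_THRESHOLD]
print(confident)
```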



In [16]:
#end

## train/test_player_tracking.csv

#### One of the most important CSV files

#### Well, there is a lot going on here


#### All the information we have seen so far is fairly intuitive, which is why it was easy to interpret.

#### Now we have a twist...

#### Each player wears a sensor that helps locate them precisely on the field, and all of that information is stored in train_player_tracking.csv.

#### We know that the position of the player, like pretty much all the data given in the other CSV files, is important for developing and training machine learning algorithms.

#### But the real question is: how can we use this?

* To find the answer, let’s look at the data!





#### let's have a look 

In [17]:
df_train_player_tracking = pd.read_csv('../input/nfl-health-and-safety-helmet-assignment/train_player_tracking.csv')

df_train_player_tracking.head()


#### Here we can see that gameKey, playID, and player are already familiar, but what are the other columns?

#### To find the answer, let's hear the complete story.

#### We know that the players wear sensors, but the sensor readings have to be recorded, otherwise they are useless. And if we are recording the sensor’s readings, when should we record them?

#### Now let’s look at the time column

#### First value: 2018-09-14T00:23:45.500Z

#### This timestamp is in ISO 8601 format. It shows the date and the time down to the millisecond, and the trailing Z means the time is in UTC.

#### date: 2018-09-14 (before the T character in the string)
#### time: 00:23:45.500Z (after the T character in the string)
#### i.e.
#### 00 -> hour
#### 23 -> minute
#### 45 -> second
#### 500 -> millisecond
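
#### The same breakdown can be done in code. This sketch uses only the Python standard library and the first timestamp shown above; pd.to_datetime on the whole time column would be the usual pandas route.

```python
from datetime import datetime, timezone

raw = "2018-09-14T00:23:45.500Z"

# The literal "Z" marks UTC, so attach the UTC time zone after parsing.
ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

print(ts.date())                       # the part before "T"
print(ts.hour, ts.minute, ts.second)   # hour, minute, second
print(ts.microsecond // 1000)          # milliseconds
```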

In [18]:
# look at the first 20 rows
df_train_player_tracking.head(20)

#### Looking at the time difference between consecutive data points, we find that there are 10 recordings per second in the dataset.
#### That is, the recording rate is 10 Hz, which means the sensor data is recorded 10 times per second (very precise).
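
#### The 10 Hz rate can be estimated from the gaps between timestamps. In this sketch only the first timestamp comes from the data; the next three are invented to be 100 ms apart, and all four are assumed to belong to a single player.

```python
import pandas as pd

# Consecutive timestamps for one player (first value from the data, rest made up).
times = pd.Series(pd.to_datetime([
    "2018-09-14T00:23:45.500Z",
    "2018-09-14T00:23:45.600Z",
    "2018-09-14T00:23:45.700Z",
    "2018-09-14T00:23:45.800Z",
]))

# Gap between neighbouring samples, in seconds (the first diff is NaT, so drop it).
step_seconds = times.diff().dt.total_seconds().dropna()

# Samples per second = 1 / typical gap.
rate_hz = 1.0 / step_seconds.median()
print(rate_hz)
```

#### On the real file, the timestamps would first need to be grouped by gameKey, playID, and player so that the gaps are measured within one player's track.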

#### The sensors record position, speed, acceleration, etc.

#### Let’s have a quick look at the columns :

1. x: player position along the long axis of the field.
2. y: player position along the short axis of the field. 
3. s: speed in yards/second.
4. a: acceleration in yards/second^2.
5. dis: distance traveled from prior time point, in yards.
6. o: orientation of player (deg).
7. dir: angle of player motion (deg).
8. event: game events like a snap, whistle, etc.
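
#### To see how x, y, dis, and s relate, we can compute the distance between two consecutive positions by hand. The positions below are made up, and the real dis and s columns may not match this simple estimate exactly.

```python
import math

# Two consecutive tracking samples for one player (invented values, in yards).
prev = {"x": 10.0, "y": 20.0}
curr = {"x": 10.3, "y": 20.4}

# Straight-line distance covered between the two samples (compare with dis).
step = math.hypot(curr["x"] - prev["x"], curr["y"] - prev["y"])

# At 10 Hz the samples are 0.1 s apart, so speed is roughly step / 0.1 (compare with s).
speed_estimate = step / 0.1
print(round(step, 3), round(speed_estimate, 3))
```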


#### All of this suggests different directions from which we can tackle this problem. We should try to find creative ways to use the given data and solve the problem :)

### Thank you! Happy to hear your thoughts/suggestions

In [19]:
#end

In [None]:
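import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from tqdm import tqdm

# NOTE: DeepSort, get_config, compute_color_for_id, plot_one_box and
# NFLAssignmentScorer are not defined in this cell; they must be imported
# or defined before these functions are called.


def deepsort_helmets(video_data,
                     video_dir,
                     deepsort_config='deepsort.yaml',
                     plot=False,
                     plot_frames=()):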
    
    # Setup Deepsort
    cfg = get_config()
    cfg.merge_from_file(deepsort_config)    
    deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                        max_dist=cfg.DEEPSORT.MAX_DIST,
                        min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                        nms_max_overlap=cfg.DEEPSORT.NMS_MAX_OVERLAP,
                        max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                        max_age=cfg.DEEPSORT.MAX_AGE,
                        n_init=cfg.DEEPSORT.N_INIT,
                        nn_budget=cfg.DEEPSORT.NN_BUDGET,
                        use_cuda=True)
    
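    # Run through frames.
    # NOTE: `myvideo` (video name without the .mp4 extension) is not a
    # parameter; it must exist as a global variable before this is called.
    video_data = video_data.sort_values('frame').reset_index(drop=True)
    ds = []
    # Group by the column name (not a one-element list) so `frame` is a scalar.
    for frame, d in tqdm(video_data.groupby('frame'), total=video_data['frame'].nunique()):
        d = d.copy()  # work on a copy so new columns do not touch video_data
        d['x'] = (d['left'] + round(d['width'] / 2))
        d['y'] = (d['top'] + round(d['height'] / 2))

        xywhs = d[['x','y','width','height']].values

        cap = cv2.VideoCapture(f'{video_dir}/{myvideo}.mp4')
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame-1)  # seek to this frame (0-based index)
        success, image = cap.read()
        cap.release()  # free the video handle opened for this frame
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)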

        confs = np.ones([len(d),])
        clss =  np.zeros([len(d),])
        outputs = deepsort.update(xywhs, confs, clss, image)

        if (plot and frame > cfg.DEEPSORT.N_INIT) or (frame in plot_frames):
            im = image  # fallback so there is something to show if no track is returned
            for output in outputs:
                bboxes = output[0:4]
                track_id = output[4]  # avoid shadowing the built-in `id`

                label = f'{track_id}'
                color = compute_color_for_id(track_id)
                im = plot_one_box(bboxes, image, label=label, color=color, line_thickness=2)
            fig, ax = plt.subplots(figsize=(15, 10))
            video_frame = d['video_frame'].values[0]
            ax.set_title(f'Deepsort labels: {video_frame}')
            plt.imshow(im)
            plt.show()

        preds_df = pd.DataFrame(outputs, columns=['left','top','right','bottom','deepsort_cluster','class'])
        if len(preds_df) > 0:
            # TODO Fix this messy merge
            d = pd.merge_asof(d.sort_values(['left','top']),
                              preds_df[['left','top','deepsort_cluster']] \
                              .sort_values(['left','top']), on='left', suffixes=('','_deepsort'),
                              direction='nearest')
        ds.append(d)
    dout = pd.concat(ds)
    return dout

def add_deepsort_label_col(out):
    # Count how often each label occurs within each deepsort_cluster.
    # Renaming the Series (instead of a column) works whether pandas names
    # the value_counts result 'label' or 'count'.
    label_counts = out.groupby('deepsort_cluster')['label'].value_counts() \
        .rename('label_count') \
        .reset_index() \
        .sort_values('label_count', ascending=False)
    # Keep the most frequent label (and its count) for each deepsort_cluster.
    top = label_counts.groupby('deepsort_cluster').first()
    sortlabel_map = top['label'].to_dict()
    sortlabelcount_map = top['label_count'].to_dict()

    out['label_deepsort'] = out['deepsort_cluster'].map(sortlabel_map)
    out['label_count_deepsort'] = out['deepsort_cluster'].map(sortlabelcount_map)

    return out

def score_vs_deepsort(myvideo, out, labels):
    # Score the base predictions compared to the deepsort postprocessed predictions.
    myvideo_mp4 = myvideo + '.mp4'
    labels_video = labels.query('video == @myvideo_mp4')
    scorer = NFLAssignmentScorer(labels_video)
    out_deduped = out.groupby(['video_frame','label']).first().reset_index()
    base_video_score = scorer.score(out_deduped)
    
    out_preds = out.drop('label', axis=1).rename(columns={'label_deepsort':'label'})
    print(out_preds.shape)
    out_preds = out_preds.groupby(['video_frame','label']).first().reset_index()
    print(out_preds.shape)
    deepsort_video_score = scorer.score(out_preds)
    print(f'{base_video_score:0.5f} before --> {deepsort_video_score:0.5f} deepsort')
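
#### The idea behind add_deepsort_label_col is a majority vote: every box in a DeepSORT track gets the label that occurs most often in that track. The sketch below shows that vote on made-up data; it is a simplified stand-in, not the function above.

```python
import pandas as pd

# Toy predictions: cluster 1 is mostly "A" with one stray "B"; cluster 2 is all "C".
toy = pd.DataFrame({
    "deepsort_cluster": [1, 1, 1, 2, 2],
    "label": ["A", "A", "B", "C", "C"],
})

# Most frequent label inside each cluster.
top_label = toy.groupby("deepsort_cluster")["label"].agg(
    lambda s: s.value_counts().idxmax()
)

# Give every row the majority label of its cluster.
toy["label_deepsort"] = toy["deepsort_cluster"].map(top_label)
print(toy)
```

#### The stray "B" in cluster 1 is overwritten with "A", which is the kind of smoothing the DeepSORT post-processing is meant to provide.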