**NFL Data Analysis**

The **National Football League (NFL)** is a professional American football league consisting of 32 teams, divided equally between the National Football Conference (NFC) and the American Football Conference (AFC). The NFL is one of the four major professional sports leagues in North America, and the highest professional level of American football in the world.

![NFL](https://cdn1.thecomeback.com/wp-content/uploads/sites/94/2018/08/10497682-832x447.jpg)

The NFL was formed in 1920 as the American Professional Football Association (APFA) before renaming itself the National Football League for the 1922 season. The NFL agreed to merge with the American Football League (AFL) in 1966, and the first Super Bowl was held at the end of that season; the merger was completed in 1970. Today, the NFL has the highest average attendance (67,591) of any professional sports league in the world and is the most popular sports league in the United States. The Super Bowl is among the biggest club sporting events in the world, and individual Super Bowl games account for many of the most watched television programs in American history, occupying all of Nielsen's top 5 all-time most watched U.S. television broadcasts as of 2015. The NFL's executive officer is the commissioner, who has broad authority in governing the league. The players in the league belong to the National Football League Players Association.

The team with the most NFL championships is the Green Bay Packers with thirteen (nine NFL titles before the Super Bowl era, and four Super Bowl championships afterwards); the team with the most Super Bowl championships is the Pittsburgh Steelers with six. The current NFL champions are the Philadelphia Eagles, who defeated the New England Patriots in Super Bowl LII, their first Super Bowl championship after winning three NFL titles before the Super Bowl era.


**NFL Concussion**

A concussion is a type of traumatic brain injury caused by a blow to the head. Reports show an increasing number of retired NFL players who have suffered concussions have developed memory and cognitive issues such as dementia, Alzheimer's, depression and chronic traumatic encephalopathy (CTE).

In September 2015, researchers with the Department of Veterans Affairs and Boston University announced that they had identified CTE in 96 percent of NFL players that they had examined and in 79 percent of all football players.

NFL players suffered more concussions in 2017 than in each of the previous five years, according to data released by the league. There were 281 reported concussions in the 2017 season, including head injuries suffered in preseason games and practices.

You can learn more [here](https://edition.cnn.com/2013/08/30/us/nfl-concussions-fast-facts/index.html).

![](https://www.playsmartplaysafe.com/wp-content/uploads/2018/06/checklist-june-2018-final1-791x1024.png)

![](https://www.playsmartplaysafe.com/wp-content/uploads/2018/06/9.8.18-concussion-protocol-1024x640.png)

**Improvements to the Concussion Protocol**

For the 2018 season, the Head, Neck and Spine Committee made additional improvements to the Concussion Protocol:

* Added a third Unaffiliated Neurotrauma Consultant (UNC) who will monitor the broadcast video and audio feeds of each game from the spotters’ booth, and notify on-field UNCs of possible head, neck or spine injuries.
* Defined impact seizure and fencing responses as independent signs of potential loss of consciousness, representing “No-Go” criteria under the current Protocol. Players who display either of these signs at any time shall be removed from play and may not return to the game.
* Required an evaluation for all players demonstrating gross motor instability (e.g., stumbling or falling to the ground when trying to stand) to determine the cause of the instability. If the team physician, in consultation with the sideline UNC, determines the instability to be neurologically caused, the player is designated a “No-Go” and may not return to play.
* Officials, teammates, and coaching staffs have been instructed to take an injured player directly to a member of the medical team for appropriate evaluation, including a concussion assessment, if warranted.
* Required all players who undergo any concussion evaluation on game day to have a follow-up evaluation conducted the following day by a member of the medical staff.


**2018 Rule Changes**

* 4-8-2 | Eliminates the requirement that a team that scores a touchdown at the end of regulation kick the extra point or attempt a two-point conversion if it would not affect the outcome of the game.
* 6-1-3, 6-2-1 | Modifies rules for a free kick formation and for blocking on a free kick. 
* 8-1-3 | Changes standard for a catch.
* 11-6-3 | Makes permanent the Playing Rule that changes the spot of the next snap after a touchback resulting from a free kick to the 25-yard line. 
* 12-2-8 | Makes lowering the head to initiate contact with the helmet a foul.
* 12-5-1 | Makes the penalties for Illegal Batting and Kicking the same.
* 15-2-2 | Authorizes the designated member of the Officiating department to instruct on-field game officials to disqualify a player for a flagrant non-football act when a foul for that act is called on the field.
* 16-1-3 | Provides that in overtime, if the team that possesses the ball first scores a field goal on its initial possession and the second team loses possession by an interception or fumble, the down will be permitted to run to its conclusion, including awarding points scored by either team during the down.

In the first section we analyze the data, and in the second section we propose rule changes.




**Exploring the Data**

Data exploration is the most important step in data analytics. In this first section we explore the datasets and try to establish correlations between them so that the data makes more sense.

Let's get a high-level view of the data:
![](https://storage.googleapis.com/kaggle-media/competitions/NFL%20player%20safety%20analytics/key_variables.jpg)

We will start with video review as it contains the concussion data.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
print(os.listdir("../input"))

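# plotly.plotly moved to the separate chart_studio package in plotly 4 and is not
# needed for offline plotting, so only the offline helpers are imported here.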
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go

import matplotlib.pyplot as plt
import seaborn as sns

import tqdm
import gc
import feather


video_review = pd.read_csv('../input/video_review.csv')

# Any results you write to the current directory are saved as output.

In [None]:
game_review = pd.read_csv('../input/game_data.csv')

**Analysis of Concussions**

In [None]:
player_role = pd.read_csv('../input/play_player_role_data.csv')

In [None]:
# Player activity during primary injury causing event
temp = video_review["Player_Activity_Derived"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, 1]},
      "hole": .6,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"Player-Activity during primary injury causing event",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Player activity",
                "x": 0.5,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')


From the graph above, it is evident that most players were concussed while tackling or after being blocked.
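
The shares behind a donut chart like this one can also be read off as numbers. The following is a minimal sketch, assuming a small made-up stand-in for `video_review` (the `toy_review` rows are illustrative and are not taken from the real file):

```python
import pandas as pd

# Hypothetical stand-in for video_review; the values are illustrative only.
toy_review = pd.DataFrame({
    "Player_Activity_Derived": [
        "Tackling", "Blocked", "Tackling", "Blocking", "Tackled", "Blocked",
    ],
})

# normalize=True returns fractions of the total instead of raw counts.
activity_share = (
    toy_review["Player_Activity_Derived"]
    .value_counts(normalize=True)
    .mul(100)
    .round(1)
)
print(activity_share)
```

Printing the percentages next to the chart makes statements such as "most players" easier to check.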

In [None]:
temp = video_review["Primary_Impact_Type"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, 1]},
      "hole": .6,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"Impacting source that caused the concussion",
        "annotations": [
            {
                "font": {
                    "size": 17
                },
                "showarrow": False,
                "text": "Primary Impact type",
                "x": 0.5,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

Most of these injuries were caused by helmet-to-body or helmet-to-helmet contact, the result of high-speed collisions between a helmet and another player during tackles or blocks.

In [None]:
temp = video_review["Primary_Partner_Activity_Derived"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, 1]},
      "hole": .6,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"Primary Partner Activity that caused the concussion",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Primary Partner Activity type",
                "x": 0.5,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

From the graph above, it is evident that most players were concussed while the primary partner was tackling or being blocked.

In [None]:
fig = plt.figure(figsize = (20,10))
ax1 = fig.add_subplot(2,2,1)
ax2 = fig.add_subplot(2,2,2)
ax3 = fig.add_subplot(2,2,3)
ax4 = fig.add_subplot(2,2,4)
df = pd.DataFrame(data=video_review)
sns.heatmap(pd.crosstab(df.Player_Activity_Derived, df.Primary_Partner_Activity_Derived), annot=True, square=True, ax=ax1)
sns.heatmap(pd.crosstab(df.Player_Activity_Derived, df.Primary_Impact_Type), annot=True, square=True, ax=ax2)
sns.heatmap(pd.crosstab(df.Player_Activity_Derived, df.Friendly_Fire), annot=True, square=True, ax=ax3)
sns.heatmap(pd.crosstab(df.Primary_Impact_Type, df.Friendly_Fire), annot=True, square=True, ax=ax4)

**Analysis from the above data**

* Player `Blocked` while the primary partner was `Blocking`: 5 concussions (15%).
* Player `Tackling` while the primary partner was `Tackled`: 8 concussions (22%).
* Player `Blocking` while the primary partner was `Blocked`: 8 concussions (22%).
* Player `Tackled` while the primary partner was `Tackling`: 6 concussions.
* Helmet-to-body contact while tackling: 8 injuries (22%).
* Helmet-to-helmet contact while blocked: 4 injuries (11%).
* Most of the injuries were not friendly fire.
* The largest share of the non-friendly-fire injuries were helmet-to-helmet (43%).
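
Counts and percentages like the ones above can be produced directly with `pd.crosstab`. This is a sketch only, assuming a made-up `toy_review` frame with the same two column names as `video_review`; the rows are invented for illustration:

```python
import pandas as pd

# Hypothetical rows; the real video_review file has more columns and records.
toy_review = pd.DataFrame({
    "Player_Activity_Derived": ["Tackling", "Tackling", "Blocked", "Blocking", "Tackled"],
    "Primary_Impact_Type": [
        "Helmet-to-body", "Helmet-to-helmet", "Helmet-to-helmet",
        "Helmet-to-body", "Helmet-to-body",
    ],
})

# Raw counts per (activity, impact type) cell, as shown in the heatmaps.
counts = pd.crosstab(
    toy_review["Player_Activity_Derived"], toy_review["Primary_Impact_Type"]
)

# normalize="all" divides every cell by the grand total.
percent = pd.crosstab(
    toy_review["Player_Activity_Derived"],
    toy_review["Primary_Impact_Type"],
    normalize="all",
).mul(100).round(1)

print(counts)
print(percent)
```

Passing `percent` to `sns.heatmap` instead of `counts` would annotate each cell with its share of all concussions.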

In [None]:
merged_csv = pd.merge (video_review,player_role)

In [None]:
player_punt_data = pd.read_csv('../input/player_punt_data.csv')
player_punt_data = player_punt_data.sort_values('GSISID', ascending=False).drop_duplicates('GSISID').sort_index()

In [None]:
merged_csv = pd.merge (merged_csv ,player_punt_data[['GSISID','Position']])

In [None]:
del video_review,player_role,player_punt_data
gc.collect()

**Analysis of Player Roles and Positions**

Now we will explore the **Player Role** and **Player Punt Data** to analyze which players receive more concussions. The NFL positions are explained as follows:
![](http://www.sportsfy.com/football-positions.jpg)

**Offensive Positions:**

* Center (C): The offensive player who snaps the ball to start a play. This player also blocks for the runner on a running play and protects the quarterback on a passing play.
* Offensive Guard (OG): An offensive lineman who lines up next to the center. An offensive guard's role is to block for the runner on running plays and protect the quarterback on passing plays.
* Offensive Tackle (OT): An offensive lineman who lines up outside of the offensive guard. An offensive tackle's role is to block for the runner on running plays and protect the quarterback on passing plays.
* Wide Receiver (WR): An offensive player who often lines up away from the line of scrimmage, and whose primary responsibility is to catch the football on a passing play. On running plays, a wide receiver often blocks for the runner.
* Tight End (TE): A tight end often lines up at the end of the offensive line. On a running play, the tight end blocks for the runner. On a passing play, the tight end often runs out and becomes a pass target.
* Quarterback (QB): The quarterback usually lines up behind center and receives the snap. On running plays, the quarterback hands off to the runner. On passing plays, the quarterback throws the pass to a receiver. The quarterback may also gain yards by running.
* Fullback (FB): A fullback lines up between the quarterback and the halfback, and his primary role is to block for the halfback on a running play and to protect the quarterback on a passing play.
* Halfback (HB): Also called the tailback. A halfback is usually smaller and quicker than the fullback, and is the primary ball carrier on running plays. On passing plays, a halfback often becomes a passing target.
* Running Back (RB): Also known as the halfback. This player does it all. Lining up either behind or beside the quarterback, he runs, he catches, he blocks and he’ll even throw a pass from time to time. A running back is normally a player who is a quick runner and thrives on contact.

**Defensive Positions:**

* Nose Tackle (NT): Also called a nose guard. A nose tackle is a defensive lineman who lines up directly across the opponent's center, commonly seen in a 3-4 defense. A nose tackle's primary role is to stop the run.
* Defensive End (DE): A defensive lineman who lines up at the end of the defensive line. The primary role of the defensive end is to rush the quarterback.
* Cornerback (CB): A defensive player whose primary responsibility is to defend against the pass.
* Inside Linebacker (ILB): A defensive player who lines up behind the defensive line and has responsibility to stop plays that take place near the middle of the field.
* Outside Linebacker (OLB): A linebacker who lines up behind the defensive line and has responsibility to stop plays that go outside of the tackles. Outside linebackers sometimes rush the quarterback.
* Strong Safety (SS): A defensive player who lines up on the strong side of the offense (the side where the tight end lines up), usually behind the linebackers. A strong safety supports the run defense on running plays and helps cornerbacks to defend the pass on passing plays.
* Free Safety (FS): A defensive player who lines up behind the linebackers and the cornerbacks. A free safety should be a sure tackler as he is usually the last line of defense, and is expected to help the cornerback on deep passes.
* Middle Linebacker (MLB): A linebacker who plays in the middle of the field behind the defensive line. Middle linebacker is a position in a 4-3 defense (not pictured). A middle linebacker is often the "quarterback" on defense, as he is the one who calls out defensive plays.
* Defensive Tackle (DT): A defensive lineman who lines up on the inside of the defensive line in a 4-3 defense. The primary role of the defensive tackle is to stop the run.

In [None]:
temp = merged_csv["Role"].value_counts()
tempe = merged_csv["Position"].value_counts()
fig = {
    "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, .48]},
      "hole": .4,
      "type": "pie"
    },
      {
      "values": tempe.values,
      "labels": tempe.index,
      "domain": {"x": [.52, 1]},
      "hole": .4,
      "type": "pie"
    },
    
    
    ],
  "layout": {
        "title":"Player Roles and Punt Positions",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Punt Role",
                 "x": 0.19,
                "y": 0.5
            },
              {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Position",
                "x": 0.80,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

Most of the injured players were Punt Returners (PR), PLG and GL by punt role, and Tight Ends, Wide Receivers and Inside Linebackers by position.

A punt returner (PR) has the job of catching the ball after it is punted and giving his team good field position (or a touchdown if possible) by returning it. Before catching the punted ball, the returner must assess the situation on the field while the ball is still in the air. Punt returners sometimes also return kickoffs and usually play other positions, especially wide receiver, cornerback and running back, although sometimes as backups.

Tight end (TE) duties include blocking for both the quarterback and the running backs, but a tight end can also run into the field and catch passes. Tight ends can catch like a receiver but have the strength and size to dominate on the line.

Inside linebackers (ILB) are usually responsible for shadowing RBs, TEs and sometimes WRs; rushing the passer; and tackling ball carriers. Linebackers are typically strong and fast.

In [None]:
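# Count concussions for every (punt role, position) pair
temp = merged_csv.groupby(['Role','Position']).size()
# Flatten the (Role, Position) MultiIndex into plain strings for the pie labels
temp.index = [f"{role} / {position}" for role, position in temp.index]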
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, 1]},
      "hole": .6,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"Player Punt Role and Position Combination",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Position and Punt Role Combined",
                "x": 0.5,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

This plot shows how often each combination of punt role and player position appears among the concussed players.

In [None]:
final_csv = pd.merge (merged_csv,game_review)
play_information = pd.read_csv('../input/play_information.csv')
final_csv = pd.merge (final_csv ,play_information[['GameKey','Quarter','PlayID','Poss_Team','Score_Home_Visiting']],on=['GameKey','PlayID'],how='left')
final_csv['home_poss'] = np.where(final_csv['HomeTeamCode'] == final_csv['Poss_Team'], 'Yes', 'No')  

In [None]:
score_away = []
score_home = []
home_win_loss = []

# The first number in Score_Home_Visiting is treated as the home score
for item in final_csv['Score_Home_Visiting']:
    home_points, visit_points = (int(part) for part in item.split('-'))
    if home_points > visit_points:
        status = "Winning"
    elif home_points == visit_points:
        status = "Draw"
    else:
        status = "Losing"
    score_away.append(visit_points)
    score_home.append(home_points)
    home_win_loss.append(status)

final_csv['home_score'] = score_home
final_csv['visit_score'] = score_away
final_csv['home_win_loss'] = home_win_loss

In [None]:
poss_win_loss = []
for _, row in final_csv.iterrows():
    if row['home_score'] == row['visit_score']:
        status = 'Draw'
    elif row['Poss_Team'] == row['HomeTeamCode']:
        status = 'Winning' if row['home_score'] > row['visit_score'] else 'Losing'
    elif row['Poss_Team'] == row['VisitTeamCode']:
        status = 'Winning' if row['visit_score'] > row['home_score'] else 'Losing'
    else:
        # Possession code matches neither team; keep one entry per row
        # so the list length always equals the number of rows
        status = 'Unknown'
    poss_win_loss.append(status)

final_csv['Poss_Team_Status'] = poss_win_loss

**Analysis of Game (Game Status and Timing)**

In this section we will analyze the Game Status and Game Types.

In [None]:
temp = final_csv["home_win_loss"].value_counts()
tempe = final_csv["Poss_Team_Status"].value_counts()
fig = {
    "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, .48]},
      "hole": .4,
      "type": "pie"
    },
      {
      "values": tempe.values,
      "labels": tempe.index,
      "domain": {"x": [.52, 1]},
      "hole": .4,
      "type": "pie"
    },
    
    
    ],
  "layout": {
        "title":"Game Status during the time of Concussion",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Home Team Status",
                 "x": 0.145,
                "y": 0.5
            },
              {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Poss Team Status",
                "x": 0.858,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

Most of the concussions (approx. 59%) occurred when the team in possession was either losing or level (`Draw`).
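
The possession-team status can also be derived without row-by-row loops. The sketch below is one possible vectorized version, not the notebook's own method. It assumes a made-up `toy_plays` frame (team codes and scores are invented) and assumes `Poss_Team` always equals one of the two team codes:

```python
import numpy as np
import pandas as pd

# Hypothetical plays; codes and scores are invented for illustration.
toy_plays = pd.DataFrame({
    "Score_Home_Visiting": ["7 - 0", "3 - 3", "10 - 14", "0 - 7"],
    "HomeTeamCode": ["NE", "GB", "DAL", "SF"],
    "VisitTeamCode": ["NYJ", "CHI", "PHI", "SEA"],
    "Poss_Team": ["NE", "CHI", "DAL", "SEA"],
})

# Pull the two numbers out of strings such as "10 - 14" (home first).
scores = toy_plays["Score_Home_Visiting"].str.extract(r"(\d+)\s*-\s*(\d+)").astype(int)
toy_plays["home_score"] = scores[0]
toy_plays["visit_score"] = scores[1]

# Score margin from the point of view of the team in possession.
poss_is_home = toy_plays["Poss_Team"] == toy_plays["HomeTeamCode"]
margin = np.where(
    poss_is_home,
    toy_plays["home_score"] - toy_plays["visit_score"],
    toy_plays["visit_score"] - toy_plays["home_score"],
)
toy_plays["Poss_Team_Status"] = np.select(
    [margin > 0, margin < 0], ["Winning", "Losing"], default="Draw"
)

# Share of plays where the possession team was not ahead.
not_winning_share = (toy_plays["Poss_Team_Status"] != "Winning").mean()
print(toy_plays[["Poss_Team", "Poss_Team_Status"]])
print(not_winning_share)
```

The final line corresponds to the "losing or level" share quoted above, computed here on the toy rows only.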

In [None]:
temp = final_csv["Season_Type"].value_counts()
tempe = final_csv["Quarter"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, .48]},
      "hole": .6,
      "type": "pie"
    },
      {
      "values": tempe.values,
      "labels": tempe.index,
      "domain": {"x": [.52, 1]},
      "hole": .4,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"Season and Game Quarter for Concussions",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Season",
                 "x": 0.170,
                "y": 0.5
            },
              {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Game Quarter",
                "x": 0.828,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

About 32.4% of concussions occurred in preseason games and about 67.6% in regular-season games. No concussions occurred during postseason games.

Within a game, Quarter 3 saw the most concussions, followed by Quarter 2.

In [None]:
del merged_csv,game_review,play_information
gc.collect()

**Analysis of Natural Conditions ( Turf , Weather and Temperature)**

In [None]:
# Game Wise Analysis of Injuries 

# Turf Analysis 

temp = final_csv["Turf"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, 1]},
      "hole": .6,
      "type": "pie"
    },
    
    ],
  "layout": {
        "title":"On which turf did most injuries occur?",
        "annotations": [
            {
                "font": {
                    "size": 17
                },
                "showarrow": False,
                "text": "Turf",
                "x": 0.5,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

It looks like most of the injuries occurred on natural grass or grass turf. Perhaps grass makes players slip and fumble more, which could contribute to injuries. These counts are not normalized by how many games were played on each surface, so this remains a hypothesis.
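
Raw counts per surface can mislead when one surface hosts more games than another. The sketch below shows one way to compare concussions per game instead. It uses two made-up frames, `toy_games` and `toy_concussions`, whose values are invented for illustration and say nothing about the real data:

```python
import pandas as pd

# Hypothetical schedule: four games on grass, two on an artificial surface.
toy_games = pd.DataFrame({
    "GameKey": [1, 2, 3, 4, 5, 6],
    "Turf": ["Grass", "Grass", "Grass", "Grass", "Artificial", "Artificial"],
})
# Hypothetical concussion records, one row per concussion.
toy_concussions = pd.DataFrame({"GameKey": [1, 2, 5]})

# Number of games played on each surface (the base rate).
games_per_turf = toy_games["Turf"].value_counts()

# Number of concussions on each surface.
concussions_per_turf = (
    toy_concussions.merge(toy_games, on="GameKey")["Turf"].value_counts()
)

# Concussions per game; surfaces with no concussions get a rate of 0.
rate_per_game = (concussions_per_turf / games_per_turf).fillna(0)
print(concussions_per_turf)
print(rate_per_game)
```

In this toy example grass has twice as many concussions as the artificial surface, yet both surfaces have the same rate per game.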

In [None]:
# Game Wise Analysis of Injuries 

# Weather Analysis
temp = final_csv["OutdoorWeather"].value_counts()
tempe = final_csv["GameWeather"].value_counts()
fig = {
  "data": [
    {
      "values": temp.values,
      "labels": temp.index,
      "domain": {"x": [0, .48]},
      "hole": .4,
      "type": "pie"
    },
      {
      "values": tempe.values,
      "labels": tempe.index,
      "domain": {"x": [.52, 1]},
      "hole": .4,
      "type": "pie"
    },
    
    
    ],
  "layout": {
        "title":"How was the weather during the Match ?",
        "annotations": [
            {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Outdoor Weather",
                 "x": 0.135,
                "y": 0.5
            },
              {
                "font": {
                    "size": 12
                },
                "showarrow": False,
                "text": "Game Weather",
                "x": 0.85,
                "y": 0.5
            }
            
        ]
    }
}
iplot(fig, filename='donut')

The charts show no obvious relationship between weather and concussions.

In [None]:
trace = {"x": final_csv["Stadium"], 
          "y": final_csv["Temperature"], 
          "marker": {"size": 12}, 
          "mode": "markers",  
          "type": "scatter"
}


data = [trace]
layout = {"title": "Games and Temperatures", 
          "xaxis": {"title": "Stadiums", }, 
          "yaxis": {"title": "Temperature (in Fahrenheit)"}}

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='basic_dot-plot')

So there is not much correlation between temperature and concussions.

**Home and Away Teams Analysis**

Let's visualize the relationship between the home and away teams:

In [None]:
trace0 = go.Scatter(
    x=final_csv.Home_Team,
    y=final_csv.Player_Activity_Derived,
    mode='markers',
    marker = dict(
          color = 'rgb(17, 157, 255)',
          size = 20,
          line = dict(
            color = 'rgb(231, 99, 250)',
            width = 2
          ))
)
layout = dict(title='Home Team activity leading to concussion '
)
fig = go.Figure(data=[trace0], layout=layout)
iplot(fig, filename='bubblechart-color')

In [None]:
trace0 = go.Scatter(
    x=final_csv.Visit_Team,
    y=final_csv.Player_Activity_Derived,
    mode='markers',
    marker = dict(
          color = 'rgb(17, 157, 255)',
          size = 20,
          line = dict(
            color = 'rgb(231, 99, 250)',
            width = 2
          ))
)
layout = dict(
            title='Visit Team activity leading to concussion '
)
fig = go.Figure(data=[trace0], layout=layout)
iplot(fig, filename='bubblechart-color')

In [None]:
x = final_csv.Home_Team
y = final_csv.Visit_Team

data = [
    go.Histogram2d(
        x=x,
        y=y
    )
]
layout = dict(
            title='Home Team vs Away Team and Concussions ',
            xaxis=dict(
                    title='Home Team'
                        ),
            yaxis=dict(
                    title='Away Team',
                        )
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)

In [None]:
del temp,tempe,x,y
gc.collect()

In this section, we will make use of **NGS (Next Gen Stats) Data** to visualize the injuries in depth. The NGS data contains every player's movements during each play.

In [None]:
def calculate_speeds(df, dt=None, SI=False):
    # Work on a copy so the unit conversion below does not touch df
    data_selected = df[['Time', 'x', 'y']].copy()
    if SI:
        # Convert yards to meters (1 m = 1.0936132983 yd)
        data_selected.x = data_selected.x / 1.0936132983
        data_selected.y = data_selected.y / 1.0936132983
    # Row-to-row differences in time and position
    data_selected_diff = data_selected.diff()
    if dt is None:
        # Time differences are timedeltas and need to be converted to seconds
        data_selected_diff.Time = data_selected_diff.Time.dt.total_seconds()
        data_selected_diff['Speed'] = (data_selected_diff.x **2 + data_selected_diff.y **2).astype(np.float64).apply(np.sqrt) / data_selected_diff.Time
    else:
        # Use the fixed time step supplied by the caller
        data_selected_diff['Speed'] = (data_selected_diff.x **2 + data_selected_diff.y **2).astype(np.float64).apply(np.sqrt) / dt
    df['TimeDelta'] = data_selected_diff.Time
    df['Speed'] = data_selected_diff.Speed
    # Drop the first row, which has no predecessor to diff against
    return df[1:]

In this phase we only consider data from 2016: we load all the 2016 NGS files, combine them, calculate speeds and visualize the result. We ignore the postseason NGS data, as there were no concussions in that period.
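
The speed calculation can be illustrated on a few rows. The sketch below differs from `calculate_speeds` in one respect: it takes differences within each player (`groupby`) rather than across the whole frame. The `toy_ngs` rows, the `YARDS_PER_METER` constant name and the `Speed_kmh` column are assumptions made for this example:

```python
import pandas as pd

YARDS_PER_METER = 1.0936132983  # same factor as used in calculate_speeds

# Hypothetical tracking rows for two players, sampled every 0.1 s.
toy_ngs = pd.DataFrame({
    "GSISID": [1, 1, 1, 2, 2],
    "Time": pd.to_datetime([
        "2016-08-12 02:07:52.500", "2016-08-12 02:07:52.600",
        "2016-08-12 02:07:52.700", "2016-08-12 02:07:52.500",
        "2016-08-12 02:07:52.600",
    ]),
    "x": [10.0, 10.3, 10.7, 50.0, 50.0],  # yards
    "y": [20.0, 20.4, 20.7, 30.0, 30.5],  # yards
})

toy_ngs = toy_ngs.sort_values(["GSISID", "Time"])
grouped = toy_ngs.groupby("GSISID")

# Displacement in meters and elapsed time in seconds, per player.
dx_m = grouped["x"].diff() / YARDS_PER_METER
dy_m = grouped["y"].diff() / YARDS_PER_METER
dt_s = grouped["Time"].diff().dt.total_seconds()

# Speed in m/s, then converted to km/h (1 m/s = 3.6 km/h).
toy_ngs["Speed_kmh"] = ((dx_m ** 2 + dy_m ** 2) ** 0.5) / dt_s * 3.6
print(toy_ngs)
```

Grouping by player leaves the first sample of each player as `NaN` instead of mixing positions from two different players, which is the situation the notebook handles afterwards by filtering rows.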

In [None]:

dtypes = {'Season_Year': 'int16',
         'GameKey': 'int16',
         'PlayID': 'int16',
         'GSISID': 'float32',
         'Time': 'str',
         'x': 'float32',
         'y': 'float32',
         'dis': 'float32',
         'o': 'float32',
         'dir': 'float32',
         'Event': 'str'}

col_names = list(dtypes.keys())

df_list = []

# 2017 files, listed for reference but not loaded below
buffer = ['NGS-2017-pre.csv',
             'NGS-2017-reg-wk1-6.csv',
             'NGS-2017-reg-wk7-12.csv',
             'NGS-2017-reg-wk13-17.csv',
             'NGS-2017-post.csv']
ngs_files = ['NGS-2016-pre.csv',
             'NGS-2016-reg-wk1-6.csv',
             'NGS-2016-reg-wk7-12.csv','NGS-2016-reg-wk13-17.csv']

for i in tqdm.tqdm(ngs_files):
    df = pd.read_csv(f'../input/{i}', usecols=col_names, dtype=dtypes)
    date_format = '%Y-%m-%d %H:%M:%S.%f'
    sortBy = ['Season_Year', 'GameKey', 'PlayID', 'GSISID', 'Time']
    df.Time = pd.to_datetime(df.Time, format =date_format)
    df.sort_values(sortBy, inplace=True)
    df = calculate_speeds(df, SI=True)
    df_list.append(df)
    del df
    gc.collect()

ngs = pd.concat(df_list)

del df_list
gc.collect()

In [None]:
# Convert positions and distances from yards to meters, and speed from m/s to km/h
ngs['x'] = ngs['x']/1.0936
ngs['y'] = ngs['y']/1.0936
ngs['dis'] = ngs['dis']/1.0936
ngs['Speed'] = ngs['Speed']* 3.6

In [None]:
ngs = ngs[ngs.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] 

In [None]:
def remove_wrong_values(df, tested_columns=('Season_Year', 'GameKey', 'PlayID', 'GSISID', 'TimeDelta'), cutspeed=None):
    cols = list(tested_columns)
    # Keep a row only when the next row has the same value in every tested
    # column, i.e. same play, same player and an equal time step
    mask = (df[cols].shift(-1) == df[cols]).all(axis=1)
    dump = df[mask]
    if cutspeed is not None:
        # Drop physically implausible speeds
        dump = dump[dump.Speed < cutspeed]
    return dump

In [None]:
cut_speed = 44  # km/h cutoff; the 9.857232 m/s record speed quoted for the NFL is about 35.5 km/h
ngs = remove_wrong_values(ngs, cutspeed=cut_speed)

In [None]:
video_review = pd.read_csv('../input/video_review.csv')
final = pd.merge(final_csv,ngs,on=['Season_Year','GameKey','PlayID','GSISID'])


In [None]:
def load_layout():
    """
    Returns a dict for a Football themed Plot.ly layout 
    """
    layout = dict(
        title = "Player Pitch Activity",
        plot_bgcolor='darkseagreen',
        showlegend=True,
        xaxis=dict(
            autorange=False,
            range=[0, 120],
            showgrid=False,
            zeroline=False,
            showline=True,
            linecolor='black',
            linewidth=1,
            mirror=True,
            ticks='',
            tickmode='array',
            tickvals=[10,20, 30, 40, 50, 60, 70, 80, 90, 100, 110],
            ticktext=['Goal', 10, 20, 30, 40, 50, 40, 30, 20, 10, 'Goal'],
            showticklabels=True
        ),
        yaxis=dict(
            title='',
            autorange=False,
            range=[-3.3,56.3],
            showgrid=False,
            zeroline=False,
            showline=True,
            linecolor='black',
            linewidth=1,
            mirror=True,
            ticks='',
            showticklabels=False
        ),
        shapes=[
            dict(
                type='line',
                layer='below',
                x0=0,
                y0=0,
                x1=120,
                y1=0,
                line=dict(
                    color='white',
                    width=2
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=0,
                y0=53.3,
                x1=120,
                y1=53.3,
                line=dict(
                    color='white',
                    width=2
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=10,
                y0=0,
                x1=10,
                y1=53.3,
                line=dict(
                    color='white',
                    width=10
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=20,
                y0=0,
                x1=20,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=30,
                y0=0,
                x1=30,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=40,
                y0=0,
                x1=40,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=50,
                y0=0,
                x1=50,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=60,
                y0=0,
                x1=60,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),dict(
                type='line',
                layer='below',
                x0=70,
                y0=0,
                x1=70,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),dict(
                type='line',
                layer='below',
                x0=80,
                y0=0,
                x1=80,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=90,
                y0=0,
                x1=90,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),dict(
                type='line',
                layer='below',
                x0=100,
                y0=0,
                x1=100,
                y1=53.3,
                line=dict(
                    color='white'
                )
            ),
            dict(
                type='line',
                layer='below',
                x0=110,
                y0=0,
                x1=110,
                y1=53.3,
                line=dict(
                    color='white',
                    width=10
                )
            )
        ]
    )
    return layout
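
The thirteen line shapes above differ only in their coordinates and width, so the same list could be generated in a loop. The sketch below assumes a hypothetical helper `field_line_shapes` (not part of the original notebook) and reproduces the two sidelines and the eleven vertical lines, with the thicker goal lines at x = 10 and x = 110.

```python
def field_line_shapes(field_width=53.3):
    """Build Plotly line shapes for a football field (hypothetical helper).

    Returns the two sidelines plus a vertical line every 10 yards from
    x=10 to x=110; the goal lines (x=10 and x=110) are drawn thicker.
    """
    def line(x0, y0, x1, y1, width=None):
        style = dict(color='white')
        if width is not None:
            style['width'] = width
        return dict(type='line', layer='below',
                    x0=x0, y0=y0, x1=x1, y1=y1, line=style)

    # Sidelines at y=0 and y=field_width
    shapes = [line(0, 0, 120, 0, width=2),
              line(0, field_width, 120, field_width, width=2)]
    # Yard lines; goal lines get width 10 as in load_layout()
    for x in range(10, 111, 10):
        width = 10 if x in (10, 110) else None
        shapes.append(line(x, 0, x, field_width, width=width))
    return shapes


shapes = field_line_shapes()
print(len(shapes))
```

Assigning the result to `layout['shapes']` should give the same field markings as the hand-written list.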

In [None]:
def plot_play(game_df, PlayID, player1=None, player2=None, custom_layout=False):
    """
    Plots player movements on the field for a given game, play, and two players
    """
    game_df = game_df[game_df.PlayID==PlayID]
    finale = final[final.PlayID==PlayID]

    GameKey=str(pd.unique(game_df.GameKey)[0])
    traces=[]   
    listb = []
    
    list1= list(game_df[game_df.GSISID==player1].Event)
    list2= list(game_df[game_df.GSISID==player1].Speed)
    lista = ["Event: "+str(list1[i]) +" + Speed: "+ str(list2[i]) for i in range(len(list1))]
    if not lista:
        lista.append("None")
    
    list3= list(game_df[game_df.GSISID==player2].Event)
    list4= list(game_df[game_df.GSISID==player2].Speed)
    # Build the partner's hover labels from the partner's own samples
    listb = ["Event: "+str(list3[i]) +" + Speed: "+ str(list4[i]) for i in range(len(list3))]
    if not listb:
        listb.append("None")
        
    trace0 = go.Scatter(
                x = game_df[game_df.GSISID==player1].x,
                y = game_df[game_df.GSISID==player1].y,
                name='Primary GSISID '+str(player1),
                mode = 'lines+markers',
                text = lista,
                line = dict(width = 6,smoothing=1.1),
                marker=dict(
                size=12,
                line = dict(
                color= 'rgb(0,0,0)',
                width = 1),
                color = game_df[game_df.GSISID==player1].Speed, #set color equal to a variable
                colorscale='Viridis',
                colorbar=dict(
                title='Primary Speed'
                ),
                showscale=True
    )
            )
    trace1 = go.Scatter(
                x = game_df[game_df.GSISID==player2].x,
                y = game_df[game_df.GSISID==player2].y,
                name='Partner GSISID '+str(player2),
                text = listb,
                line = dict(
                width = 5),
                mode = 'lines+markers',
                marker=dict(
                size=10,
                line = dict(
                color= 'rgb(0,0,0)',
                width = 1),
                color = game_df[game_df.GSISID==player2].Speed, #set color equal to a variable
                colorscale='Portland',
                colorbar=dict(title='Partner Speed', x =-0.14),
                showscale= True
                )
            )
    
    layout = load_layout()
    layout['title'] = 'Player Activity (Concussion) in GameKey ' + GameKey + ' : ' + str(pd.unique(finale.Home_Team)[0]) +' v/s ' + str(pd.unique(finale.Visit_Team)[0])
    layout['legend'] = dict(orientation="h")
    data = [trace0,trace1]
    fig = dict(data=data, layout=layout)
    print(" Play Information")
    print(" Date :" + str(pd.unique(finale.Game_Date)[0]))
    print(" Home Team :" + str(pd.unique(finale.Home_Team)[0])+ ", Visiting Team : " + str(pd.unique(finale.Visit_Team)[0]) )
    print(" Player Activity Derived :" + str(pd.unique(finale.Player_Activity_Derived)[0])+ ", Primary Partner Activity Derived : " + str(pd.unique(finale.Primary_Partner_Activity_Derived)[0]) )
    print(" Primary Impact Type :" + str(pd.unique(finale.Primary_Impact_Type)[0])+ ", Punt Play Player Role : " + str(pd.unique(finale.Role)[0]) + ", Player Position : " + str(pd.unique(finale.Position)[0]))
    print(" Quarter of Play :" + str(pd.unique(finale.Quarter)[0])+ ", Possession Team : " + str(pd.unique(finale.Poss_Team)[0]) + ", Score (Home-Visiting) : " + str(pd.unique(finale.Score_Home_Visiting)[0]))
    print(" Home Team Status :" + str(pd.unique(finale.home_win_loss)[0])+ ", Possession Team Status : " + str(pd.unique(finale.Poss_Team_Status)[0]) )
    print(" Max Speed :" + str(game_df.Speed.max())+ ", Avg Speed :" + str(game_df.Speed.mean()) )
    print(" Stadium :" + str(pd.unique(finale.Stadium)[0])+ ", Turf : " + str(pd.unique(finale.Turf)[0])+", GameWeather :" + str(pd.unique(finale.GameWeather)[0])+ ", Temperature : " + str(pd.unique(finale.Temperature)[0])  )
    del finale
    #print("\n\n\t",play_description)
    iplot(fig)
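
The hover labels in `plot_play` pair each tracking sample's event with its speed. A minimal sketch of that step is shown below, using a hypothetical helper `hover_labels` and made-up sample values. It pairs the two sequences with `zip`, so a length mismatch cannot raise an `IndexError`, and falls back to a single "None" label when a player has no samples.

```python
def hover_labels(events, speeds):
    """Return one 'Event: ... + Speed: ...' label per tracking sample.

    Hypothetical helper; zip() stops at the shorter sequence.
    """
    labels = ["Event: " + str(e) + " + Speed: " + str(s)
              for e, s in zip(events, speeds)]
    # Mirror the notebook's fallback for players without samples
    return labels or ["None"]


# Illustrative values only, not taken from the dataset
print(hover_labels(["punt", "tackle"], [12.5, 24.1]))
print(hover_labels([], []))
```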


**Player and Partner Movements That Caused Concussions - 2016**

In this section, let's visualize the player movements at the time of each concussion, game by game. We will analyze all the games of the Preseason and Regular Weeks 1-6, and a few games from Regular Weeks 7-12.

Field layout and player positions, for reference in case you are not familiar with them:

![](https://storage.googleapis.com/kaggle-media/competitions/NFL%20player%20safety%20analytics/punt_coverage.png)

![](https://storage.googleapis.com/kaggle-media/competitions/NFL%20player%20safety%20analytics/punt_return.png)

The football Plotly layout is taken from [An-Indian-Analysing-American-Football](https://www.kaggle.com/piyush1912/an-indian-analysing-american-football/).

Let's start with some preseason games.

**Pre-Season Game - 2016 - CHI vs DEN**

P.O'Donnell punts 58 yards to DEN 11, Center-P.Scales. B.Addison to DEN 25 for 14 yards (K.Carey). PENALTY on DEN-S.Sulleyman, Illegal Block Above the Waist, 10 yards, enforced at DEN 25.


In [None]:
plot_play(game_df=ngs, PlayID=3129, player1=31057, player2=32482 )

Now let's watch the original footage from the NFL.

*Note: If you are using Google Chrome, please allow insecure HTTP content in your browser by clicking "Load Unsafe Scripts". Some of the official NFL videos are served over "http://" (without an SSL certificate) and cannot be displayed in Chrome without this permission.*

In [None]:
from IPython.display import HTML
# Embed the NFL-hosted video clip
HTML('<iframe width="950" height="600" src="http://a.video.nfl.com//films/vodzilla/153233/Kadeem_Carey_punt_return-Vwgfn5k9-20181119_152809972_5000k.mp4" frameborder="0" allowfullscreen></iframe>')


We can see from the video that K. Carey recklessly tackles the opponent with his helmet. Carey made a hard hit on special teams in the second half of the game against the Broncos and came off wobbling, which prompted him to enter the concussion protocol.

**Pre-Season Game  - 2016 - TEN vs CAR**

K.Redfern punts 36 yards to TEN 9, Center-J.Jansen, downed by CAR-B.Wegher. PENALTY on TEN-M.Huff, Illegal Blindside Block, 5 yards, enforced at TEN 9.

In [None]:
plot_play(game_df=ngs, PlayID=2587, player1=29343, player2=31059 )

**Pre-Season Game  - 2016 - WAS vs NYJ**

L.Edwards punts 51 yards to WAS 27, Center-T.Purdum. J.Crowder MUFFS catch, touched at WAS 27, recovered by WAS-Q.Dunbar at WAS 11. Q.Dunbar to WAS 11 for no gain (M.Williams).

In [None]:
plot_play(game_df=ngs, PlayID=538, player1=31023, player2=31941 )

**Pre-Season Game - 2016 - NYJ vs NYG**

B.Wing punts 44 yards to NYJ 10, Center-T.Ott. J.Ross to NYJ 38 for 28 yards (O.Darkwa; B.Goodson).

In [None]:
plot_play(game_df=ngs, PlayID=1212, player1=33121, player2=28249 )

**Pre-Season Game - 2016 - CAR vs PIT**

Berry punts 45 yards to CAR 31, Center-G.Warren. D.Byrd to CAR 46 for 15 yards (S.Davis).

In [None]:
plot_play(game_df=ngs, PlayID=1045, player1=32444, player2=31756 )

**Regular Game - 2016 - BUF vs SF**


C.Schmidt punts 54 yards to SF 19, Center-G.Sanborn. J.Kerley to SF 31 for 12 yards (M.Gillislee).

In [None]:
plot_play(game_df=ngs, PlayID=2342, player1=32410, player2=23259 )

**Regular Game - 2016 - NO vs CAR**

T.Morstead punts 54 yards to CAR 39, Center-J.Drescher. T.Ginn to 50 for 11 yards (D.Lasco).

In [None]:
plot_play(game_df=ngs, PlayID=3663, player1=28128, player2=29629 )

**Regular Game - 2016 - KC vs JAX**

D.Colquitt punts 54 yards to JAX 31, Center-J.Winchester. B.Walters to JAX 39 for 8 yards (F.Zombo)

In [None]:
plot_play(game_df=ngs, PlayID=3509, player1=27595, player2=31950 )

**Regular Game - 2016 - IND vs TEN**

B.Kern punts 46 yards to IND 37, Center-B.Brinkley. C.Rogers to IND 39 for 2 yards (W.Woodyard).

In [None]:
plot_play(game_df=ngs, PlayID=3468, player1=28987, player2=31950)

**Regular Game - 2016 - BLT vs CIN**

K.Huber punts 58 yards to BLT 14, Center-T.Ott. D.Hester MUFFS catch, and recovers at BLT 12. D.Hester pushed ob at BLT 27 for 15 yards (N.Vigil). CIN-C.Brown was injured during the play. 

In [None]:
plot_play(game_df=ngs, PlayID=1976, player1=32214, player2=32807 )

Now let's see the original footage from the game:

In [None]:
HTML('<iframe width="950" height="600" src="https://nfl-vod.cdn.anvato.net/league/5691/18/11/25/284954/284954_75F12432BA90408C92660A696C1A12C8_181125_284954_huber_punt_3200.mp4" frameborder="0" allowfullscreen></iframe>')

C. Brown suffers the injury after a high-speed helmet-to-helmet collision. The home team was losing, which resulted in a desperate challenge on the opponent.

In [None]:
finale = pd.merge(final_csv,ngs,on=['GSISID'])

In [None]:
# plotly.plotly moved to the chart_studio package in Plotly 4 and is not used here
import plotly.graph_objs as go
temp = finale["Event"].value_counts()
tempe = final["Event"].value_counts()

trace1 = go.Bar(
    x=temp.index,
    y=temp.values,
    name='All Games'
)
trace2 = go.Bar(
    x=tempe.index,
    y=tempe.values,
    name='Concussion'
)

data = [trace1, trace2]
# Define the layout once so the barmode setting is not discarded by a second assignment
layout = go.Layout(
    title='Events over the Games',
    barmode='group'
)
fig = go.Figure(data=data, layout=layout)

iplot(fig, filename='grouped-bar')

**Analysis**

Out of around 2659 punts over all games, 18 involved concussions. Out of 1292 "Punt Received" events, 14 resulted in concussions, which is approximately 1 concussion per 100 punts received. In contrast, out of 603 "Fair Catches", there was only 1 concussion, so the rate was much lower for fair catches.
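
The rates quoted above follow directly from the counts. As a quick check (the helper name `rate_per_100` is illustrative and not from the notebook):

```python
def rate_per_100(concussions, events):
    """Concussions per 100 events."""
    return 100 * concussions / events


# Counts quoted in the 2016 analysis above
counts_2016 = {
    "all punts": (18, 2659),
    "punt received": (14, 1292),
    "fair catch": (1, 603),
}

for label, (concussions, events) in counts_2016.items():
    print(f"{label}: {rate_per_100(concussions, events):.2f} per 100")
```

This gives roughly 1.08 concussions per 100 punts received, against roughly 0.17 per 100 fair catches.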

Let's analyze the speed of the players during these injury events:

In [None]:
speed_during_punt = final.loc[final['Event'].isin(['punt'])]
speed_during_puntrec = final.loc[final['Event'].isin(['punt_received'])]
speed_during_tackle = final.loc[final['Event'].isin(['tackle'])]
speed_during_down = final.loc[final['Event'].isin(['punt_downed'])]
speed_during_fumble = final.loc[final['Event'].isin(['fumble'])]
speed_during_catch = final.loc[final['Event'].isin(['fair_catch'])]

In [None]:
trace0 = go.Box(
    y=finale.Speed,
    name = 'During Whole Game',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(165,42,42)',
    ),
)

trace1 = go.Box(
    y=speed_during_punt.Speed,
    name = 'During Punt',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(214,12,140)',
    ),
)

trace2 = go.Box(
    y=speed_during_puntrec.Speed,
    name = 'During Punt Rec',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(238,130,238)',
    ),
)

trace3 = go.Box(
    y=speed_during_tackle.Speed,
    name = 'During Tackle',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(46,139,87)',
    ),
)


trace4 = go.Box(
    y=speed_during_down.Speed,
    name = 'During Punt Downed',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(255,215,0)',
    ),
)

trace5 = go.Box(
    y=speed_during_fumble.Speed,
    name = 'During Fumble',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(0,191,255)',
    ),
)

trace6 = go.Box(
    y=speed_during_catch.Speed,
    name = 'During Fair Catch',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(176,48,96)',
    ),
)

# Define the layout once so the size and axis title are not discarded by a second assignment
layout = go.Layout(
    title='Speed of the Players and Events 2016',
    width=900,
    height=500,
    yaxis=dict(
        title='Speed of the Concussed Player',
        zeroline=False
    ),
)
data = [trace0,trace1,trace2,trace3,trace4,trace5,trace6]
fig= go.Figure(data=data, layout=layout)
iplot(fig, filename='speed-box-plot')

We can observe that most of the tackles occurred at very high speeds and involved helmets.

In [None]:
density_punt = finale.loc[finale['Event'].isin(['punt'])]
density_punts = final.loc[final['Event'].isin(['punt'])]
density_puntrec = finale.loc[finale['Event'].isin(['punt_received'])]
density_puntrecs = final.loc[final['Event'].isin(['punt_received'])]
density_tackle = final.loc[final['Event'].isin(['tackle'])]
density_tackles = finale.loc[finale['Event'].isin(['tackle'])]

Now let's visualize the player density on the field during these specific events:

In [None]:
trace = go.Histogram2dContour(
        x = density_punt.x,
        y = density_punt.y
)

trace0 = go.Scatter(
    x = density_punts.x,
    y = density_punts.y,
    mode = 'markers',
    name = 'Position of Players (Concussed) during Punts',
    text = list(density_punts.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(25,25,112)',
          size = 14)
)

layout = load_layout()
layout['legend'] = dict(orientation="h")
layout['plot_bgcolor'] = 'rgb(220,220,220)'
layout['title'] = 'Player Density on field during Punt 2016'
data = [trace,trace0]
fig = dict(data=data, layout=layout)
iplot(fig, filename = "Basic Histogram2dContour")
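
`go.Histogram2dContour` draws contours over a two-dimensional histogram of the (x, y) positions. To make the underlying counting step concrete, here is a plain-Python sketch that bins positions into 10 x 10 yard cells. The function name, the cell size, and the sample coordinates are assumptions for illustration; Plotly chooses its own bins by default.

```python
from collections import Counter


def bin_positions(xs, ys, cell=10):
    """Count positions per (cell x cell)-yard square (illustrative helper)."""
    counts = Counter()
    for x, y in zip(xs, ys):
        # Floor division maps a coordinate to the index of its cell
        counts[(int(x // cell), int(y // cell))] += 1
    return counts


# Made-up coordinates, not taken from the tracking data
xs = [12.0, 15.5, 18.2, 47.0, 49.9]
ys = [20.0, 22.3, 28.9, 30.1, 33.3]
density = bin_positions(xs, ys)
print(density.most_common(1))
```

Cells with higher counts correspond to the regions that the contour plot shades as denser.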

In [None]:
trace = go.Histogram2dContour(
        x = density_puntrec.x,
        y = density_puntrec.y
)


trace0 = go.Scatter(
    x = density_puntrecs.x,
    y = density_puntrecs.y,
    mode = 'markers',
    name = 'Position of Players (Concussed) during Punt Recs',
    text = list(density_puntrecs.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(25,25,112)',
          size = 14)
)
layout = load_layout()
layout['plot_bgcolor'] = 'rgb(220,220,220)'
layout['legend'] = dict(orientation="h")
layout['title'] = 'Player Density on field during Punt Rec 2016'
data = [trace,trace0]
fig = dict(data=data, layout=layout)
iplot(fig, filename = "Basic Histogram2dContour")

In [None]:
trace = go.Scatter(
    x = density_tackle.x,
    y = density_tackle.y,
    mode = 'markers',
    name = 'Concussion',
    text = list(density_tackle.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(255, 0, 0)',
          size = 20)
)

trace0 = go.Scatter(
    x = density_tackles.x,
    y = density_tackles.y,
    mode = 'markers',
    name = 'Normal',
    marker = dict(
          color = 'rgb(238,221,130)',
          size = 10)
)

layout = load_layout()
layout['title'] = 'Tackle Points on field during 2016'
data = [trace0,trace]
fig = dict(data=data, layout=layout)
iplot(fig, filename='basic-scatter')

In [None]:
del density_punt,density_puntrec,ngs,final,finale
gc.collect()

In [None]:
del speed_during_punt,speed_during_puntrec,speed_during_tackle,speed_during_down,speed_during_fumble
gc.collect()

**Analysis**


* Most of these events were full-speed events.
* Most of these tackles were made at very high speed.
* Most of these tackles were near the end line.
* Most of these tackles were made while running back.


**Now let's look at some of the concussions in 2017**

In this section we will explore the 2017 data in the same way.

In [None]:
dtypes = {'Season_Year': 'int16',
         'GameKey': 'int16',
         'PlayID': 'int16',
         'GSISID': 'float32',
         'Time': 'str',
         'x': 'float32',
         'y': 'float32',
         'dis': 'float32',
         'o': 'float32',
         'dir': 'float32',
         'Event': 'str'}

col_names = list(dtypes.keys())

df_list = []

ngs_files = ['NGS-2017-pre.csv',
             'NGS-2017-reg-wk1-6.csv',
             'NGS-2017-reg-wk7-12.csv','NGS-2017-reg-wk13-17.csv']

for i in tqdm.tqdm(ngs_files):
    df = pd.read_csv(f'../input/{i}', usecols=col_names, dtype=dtypes)
    date_format = '%Y-%m-%d %H:%M:%S.%f'
    sortBy = ['Season_Year', 'GameKey', 'PlayID', 'GSISID', 'Time']
    df.Time = pd.to_datetime(df.Time, format =date_format)
    df.sort_values(sortBy, inplace=True)
    df = calculate_speeds(df, SI=True)
    df_list.append(df)
    del df
    gc.collect()

ngs = pd.concat(df_list)

del df_list
gc.collect()

In [None]:
# Convert distances from yards to meters and speed to km/h
ngs['x'] = ngs['x']/1.0936
ngs['y'] = ngs['y']/1.0936
ngs['dis'] = ngs['dis']/1.0936
ngs['Speed'] = ngs['Speed']* 3.6
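
The two factors used above are 1 m ≈ 1.0936 yd and 1 m/s = 3.6 km/h. The small check below, with helper names that are illustrative only, also shows that the 9.857232 m/s figure quoted in the notebook's speed-cutoff comment corresponds to about 35.5 km/h, so a cutoff of 44 sits above it when speeds are expressed in km/h.

```python
YARDS_PER_METER = 1.0936   # factor used in the notebook
KMH_PER_MS = 3.6           # 1 m/s = 3.6 km/h


def yards_to_meters(yards):
    """Convert a distance in yards to meters."""
    return yards / YARDS_PER_METER


def ms_to_kmh(speed_ms):
    """Convert a speed in m/s to km/h."""
    return speed_ms * KMH_PER_MS


print(round(yards_to_meters(120), 2))   # full field length including end zones
print(round(ms_to_kmh(9.857232), 2))    # speed quoted in the cutoff comment
```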

In [None]:
ngs = ngs[ngs.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] 

In [None]:
def remove_wrong_values(df, tested_columns=('Season_Year', 'GameKey', 'PlayID', 'GSISID', 'TimeDelta'), cutspeed=None):
    """
    Keeps rows whose next row has the same values in tested_columns,
    and optionally drops rows with Speed >= cutspeed.
    """
    mask = np.ones(len(df), dtype=bool)
    for col in tested_columns:
        # True where the following row has the same value in this column
        mask &= (df[col].shift(-1) == df[col]).to_numpy()
    # Keep rows where the next row is equally spaced
    dump = df[mask]
    if cutspeed is not None:
        dump = dump[dump.Speed < cutspeed]
    return dump
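
`remove_wrong_values` keeps a row only when the next row agrees with it on every tested column, which drops the last sample of each group and any sample followed by a different time step. The same idea is sketched below without pandas so the comparison is explicit. The function name and the toy rows are assumptions for illustration.

```python
def keep_rows_matching_next(rows, keys):
    """Keep rows whose following row has equal values for all keys."""
    kept = []
    # zip pairs each row with its successor; the final row has none and is dropped
    for current, following in zip(rows, rows[1:]):
        if all(current[k] == following[k] for k in keys):
            kept.append(current)
    return kept


# Toy rows: two samples for player 1, then one for player 2
rows = [
    {"GSISID": 1, "TimeDelta": 0.1, "Speed": 5.0},
    {"GSISID": 1, "TimeDelta": 0.1, "Speed": 6.0},
    {"GSISID": 2, "TimeDelta": 0.1, "Speed": 7.0},
]
kept = keep_rows_matching_next(rows, keys=["GSISID", "TimeDelta"])
print([r["Speed"] for r in kept])
```

Only the first row survives here: the second is followed by a different player, and the third has no successor.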

In [None]:
cut_speed = 44  # Upper speed limit in km/h (the quoted record, 9.857232 m/s, is about 35.5 km/h)
ngs = remove_wrong_values(ngs, cutspeed=cut_speed)
ngs.Speed.hist()

In [None]:
video_review = pd.read_csv('../input/video_review.csv')
final = pd.merge(final_csv,ngs,on=['Season_Year','GameKey','PlayID','GSISID'])

**Player and Partner Movements That Caused Concussions - 2017**

In this section, let's visualize the player movements at the time of each concussion, game by game. We will analyze all the games of the Preseason and Regular Weeks 1-6, and a few games from Regular Weeks 7-12.

**Pre Season Game  - 2017 - MIA vs BLT**

M.Haack punts 52 yards to BLT 25, Center-W.Chapman. B.Rainey to BLT 34 for 9 yards (D.Morgan). MIA-C.Pantale was injured during the play. His return is Questionable. 

In [None]:
plot_play(game_df=ngs, PlayID=3630, player1=30171, player2=29384 )

In [None]:
HTML('<iframe width="950" height="600" src="http://a.video.nfl.com//films/vodzilla/153250/52_yard_Punt_by_Matt_Haack-ENsIvMyf-20181119_161418429_5000k.mp4" frameborder="0" allowfullscreen></iframe>')

Another reckless challenge: the injured player, C. Pantale, bulldozes the partner player with his helmet. To be exact, C. Pantale makes contact with the opponent at a speed of more than 25 km/h!

**Pre Season Game - 2017 - WAS vs GB**

J.Vogel punts 43 yards to WAS 48, Center-D.Hart. K.Fuller to GB 40 for 12 yards (J.Hawkins)

In [None]:
plot_play(game_df=ngs, PlayID=2764, player1=32323, player2=31930 )

**Pre Season Game - 2017 - ATL vs JAX**

B.Nortman punts 40 yards to ATL 25, Center-M.Overton. J.Hardy to ATL 32 for 7 yards (B.Brown). JAX-J.Harper was injured during the play.  PENALTY on ATL-J.Keyes, Offensive Holding, 10 yards, enforced at ATL 25.

In [None]:
plot_play(game_df=ngs, PlayID=183, player1=33813, player2=33841 )

**Pre Season Game - 2017 - KC vs TEN**

B.Kern punts 61 yards to KC 24, Center-R.DiSalvo. J.Chesson for 76 yards, TOUCHDOWN. TEN-R.DiSalvo was injured during the play. 

In [None]:
plot_play(game_df=ngs, PlayID=1088, player1=32615, player2=31999 )

**Pre Season Game - 2017 - SF vs LAC**

T.Baker punts 41 yards to SF 11, Center-M.Windt. D.Carter to SF 14 for 3 yards (J.Perry; D.Brown).

In [None]:
plot_play(game_df=ngs, PlayID=1526, player1=32894, player2=31763 )

**Regular Game - 2017 - NE vs KC**

D.Colquitt punts 36 yards to KC 47, Center-J.Winchester. D.Amendola to KC 44 for 3 yards (T.Smith). PENALTY on NE-B.Bolden, Running Into the Kicker, 5 yards, enforced at KC 11 - No Play.

In [None]:
plot_play(game_df=ngs, PlayID=3312, player1=26035, player2=27442 )

**Regular Game - 2017 - DEN vs LAC**

D.Kaser punts 59 yards to DEN 13, Center-M.Windt. I.McKenzie ran ob at DEN 44 for 31 yards.

In [None]:
plot_play(game_df=ngs, PlayID=1262, player1=33941, player2=27442 )

**Regular Game - 2017 - MIA vs NO**

M.Haack punts 42 yards to NO 30, Center-J.Denney. T.Ginn ran ob at NO 39 for 9 yards (J.Denney).

In [None]:
plot_play(game_df=ngs, PlayID=2792, player1=33838, player2=31317 )

**Regular Game - 2017 - OAK vs BLT**

M.King punts 62 yards to BLT 15, Center-J.Condo. M.Campanaro to BLT 24 for 9 yards (J.Condo, J.Cowser).

In [None]:
plot_play(game_df=ngs, PlayID=2072, player1=29492, player2=33445 )

**Regular Game - 2017 - NYG vs KC**

B.Wing punts 37 yards to KC 29, Center-Z.DeOssie. T.Hill pushed ob at KC 49 for 20 yards (N.Berhe). Penalty on KC-F.Zombo, Offensive Holding, declined. PENALTY on KC-T.Smith, Unnecessary Roughness, 15 yards, enforced at KC 33. Officially, a return for 4 yards.

In [None]:
plot_play(game_df=ngs, PlayID=1683, player1=32820, player2=25503 )

In [None]:
HTML('<iframe width="950" height="600" src="http://a.video.nfl.com//films/vodzilla/153280/Wing_37_yard_punt-cPHvctKg-20181119_165941654_5000k.mp4" frameborder="0" allowfullscreen></iframe>')

Another unnecessarily rough challenge using the helmet.

In [None]:
finale = pd.merge(final_csv,ngs,on=['GSISID'])

Let's visualize the frequency of all events and of the concussion events:

In [None]:
# plotly.plotly moved to the chart_studio package in Plotly 4 and is not used here
import plotly.graph_objs as go

temp = finale["Event"].value_counts()
tempe = final["Event"].value_counts()

trace1 = go.Bar(
    x=temp.index,
    y=temp.values,
    name='All Games'
)
trace2 = go.Bar(
    x=tempe.index,
    y=tempe.values,
    name='Concussion'
)

data = [trace1, trace2]
# Define the layout once so the barmode setting is not discarded by a second assignment
layout = go.Layout(
    title='Events over the Games',
    barmode='group'
)
fig = go.Figure(data=data, layout=layout)

iplot(fig, filename='grouped-bar')

**Analysis**

The pattern remains similar to the 2016 season. Out of around 2181 punts over all games, 18 involved concussions. Out of 1048 "Punt Received" events, 16 resulted in concussions, which is approximately 1.5 concussions per 100 punts received. In contrast, out of 509 "Fair Catches", there were only 2 concussions. So the data once again shows that fair catches are safer.
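
One way to put a number on "safer" is the relative risk: the concussion rate for received punts divided by the rate for fair catches. The sketch below uses only the 2017 counts quoted above, and the helper name is illustrative. With so few fair-catch concussions the ratio is very uncertain, so treat it as a rough indication, not a statistical result.

```python
def relative_risk(cases_a, total_a, cases_b, total_b):
    """Ratio of the event rate in group A to the event rate in group B."""
    return (cases_a / total_a) / (cases_b / total_b)


# 2017 counts from the analysis above
rr_2017 = relative_risk(16, 1048, 2, 509)
print(f"Punt received vs fair catch: {rr_2017:.1f}x")
```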

Let's analyze the speed of the players during these injury events:

In [None]:
speed_during_punt = final.loc[final['Event'].isin(['punt'])]
speed_during_puntrec = final.loc[final['Event'].isin(['punt_received'])]
speed_during_tackle = final.loc[final['Event'].isin(['tackle'])]
speed_during_down = final.loc[final['Event'].isin(['punt_downed'])]
speed_during_fumble = final.loc[final['Event'].isin(['fumble'])]
speed_during_catch = final.loc[final['Event'].isin(['fair_catch'])]

In [None]:
trace0 = go.Box(
    y=finale.Speed,
    name = 'During Whole Game',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(165,42,42)',
    ),
)

trace1 = go.Box(
    y=speed_during_punt.Speed,
    name = 'During Punt',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(214,12,140)',
    ),
)

trace2 = go.Box(
    y=speed_during_puntrec.Speed,
    name = 'During Punt Rec',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(138,43,226)',
    ),
)

trace3 = go.Box(
    y=speed_during_tackle.Speed,
    name = 'During Tackle',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(30,144,255)',
    ),
)


trace4 = go.Box(
    y=speed_during_down.Speed,
    name = 'During Punt Downed',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(214,179,140)',
    ),
)

trace5 = go.Box(
    y=speed_during_fumble.Speed,
    name = 'During Fumble',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(254,199,140)',
    ),
)

trace6 = go.Box(
    y=speed_during_catch.Speed,
    name = 'During Fair Catch',
    boxpoints='all',
    jitter=0.3,
    marker = dict(
        color = 'rgb(176,48,96)',
    ),
)

layout = go.Layout(
    width=900,
    height=500,
    title = 'Speed , Concussion and Events',
    yaxis=dict(
        title='Speed of the Concussed Player 2017',
        zeroline=False
    ),
)
data = [trace0,trace1,trace2,trace3,trace4,trace5,trace6]
fig= go.Figure(data=data, layout=layout)
iplot(fig, filename='speed-box-plot')

**Analysis**


* Most of these events were full-speed events.
* Most of these tackles were made at very high speed using their helmets.
* Most of these tackles were near the end line.
* Most of these tackles were made while running back.


In [None]:
density_punt = finale.loc[finale['Event'].isin(['punt'])]
density_punts = final.loc[final['Event'].isin(['punt'])]
density_puntrec = finale.loc[finale['Event'].isin(['punt_received'])]
density_puntrecs = final.loc[final['Event'].isin(['punt_received'])]
density_tackle = final.loc[final['Event'].isin(['tackle'])]
density_tackles = finale.loc[finale['Event'].isin(['tackle'])]

In [None]:
trace = go.Histogram2dContour(
        x = density_punt.x,
        y = density_punt.y
)

trace0 = go.Scatter(
    x = density_punts.x,
    y = density_punts.y,
    mode = 'markers',
    name = 'Position of Players (Concussed) during Punts',
    text = list(density_punts.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(25,25,112)',
          size = 14)
)

layout = load_layout()
layout['legend'] = dict(orientation="h")
layout['plot_bgcolor'] = 'rgb(220,220,220)'
layout['title'] = 'Player Density on field during Punt 2017'
data = [trace,trace0]
fig = dict(data=data, layout=layout)
iplot(fig, filename = "Basic Histogram2dContour")

In [None]:
trace = go.Histogram2dContour(
        x = density_puntrec.x,
        y = density_puntrec.y
)


trace0 = go.Scatter(
    x = density_puntrecs.x,
    y = density_puntrecs.y,
    mode = 'markers',
    name = 'Position of Players (Concussed) during Punt Recs',
    text = list(density_puntrecs.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(25,25,112)',
          size = 14)
)
layout = load_layout()
layout['plot_bgcolor'] = 'rgb(220,220,220)'
layout['legend'] = dict(orientation="h")
layout['title'] = 'Player Density on field during Punt Rec 2017'
data = [trace,trace0]
fig = dict(data=data, layout=layout)
iplot(fig, filename = "Basic Histogram2dContour")

In [None]:
trace = go.Scatter(
    x = density_tackle.x,
    y = density_tackle.y,
    mode = 'markers',
    name = 'Concussion',
    text = list(density_tackle.Role),
    marker = dict(
        symbol='x',
          color = 'rgb(255, 0, 0)',
          size = 20)
)

trace0 = go.Scatter(
    x = density_tackles.x,
    y = density_tackles.y,
    mode = 'markers',
    name = 'Normal',
    marker = dict(
          color = 'rgb(238,221,130)',
          size = 10)
)

layout = load_layout()
layout['title'] = 'Tackle Points on field during 2017'
data = [trace0,trace]
fig = dict(data=data, layout=layout)
iplot(fig, filename='basic-scatter')

In [None]:
del density_punt,density_puntrec,ngs,final,finale,final_csv
gc.collect()

**Rule Changes:**

In this section we will propose rule changes that may reduce the number of concussions.

Let's **summarize** all the **observations** that were made in the above sections:

* Most of the players suffered concussions while tackling or after getting blocked.
* Most of these injuries were caused by helmet-to-body or helmet-to-helmet contact.
* Helmet-to-body contact while tackling accounted for 8 (22%) of the injuries. These resulted from the injured player tackling the other player with his helmet.
* Most of the injured players were Punt Returners, PLG and GL (by punt role) and Tight Ends, Wide Receivers and Inside Linebackers (by position).
* Most of the players who were concussed were in offensive positions.
* Most of the concussions (approx. 59%) occurred when the possession team was either losing or drawing.
* 32.4% of the concussions occurred in Pre-season games, while around 67.6% occurred in Regular games. No concussions occurred during Post Season games.
* Within a game, Quarter 3 saw the most concussions, followed by Quarter 2.
* Most of the injuries tend to occur on natural grass or grass turfs.
* In 2016, out of around 2659 punts (excluding Post Season) over all games, 18 involved concussions.
* In 2016, out of 1292 "Punt Received" events, 14 resulted in concussions, which is approximately 1 concussion per 100 punts received.
* In 2016, out of 603 "Fair Catches", there was only 1 concussion, so the rate was much lower for fair catches.
* Most of these concussion events were full-speed events.
* Most of these tackles were made at very high speed, with the player recklessly challenging the partner with the helmet.
* Most of these tackles were near the end line.
* Most of these tackles were made while running back.



Now we will suggest some rule changes based on these observations.

* **Bonus yards for fair catches.**

From the observations it is quite evident that fair catches are safer than received punts, and thus fair catches should be encouraged. This may be achieved by awarding bonus yards.

* **Penalty and Ejection for Roughness**

Individual analysis of a lot of concussion footage showed that many of the players were involved in unnecessary roughness: they used the helmet to tackle the opposing player. The 2018 rule change made **"lowering the head to initiate contact with the helmet"** a foul, but that is still not enough. The foul should be backed up with soccer-like ejections.
* Players receive a “RED card” or ejection for direct helmet-to-helmet contact, plus a 10-yard penalty. The roughness can be judged using speed sensors.
* Players receive a “YELLOW card” or ejection for direct helmet-to-body contact, plus a 5-yard penalty. The roughness can be judged using speed sensors.

The Internet of Things has matured to the point where sensors exist for almost everything, so sensors could be used to monitor the use of the helmet in tackling other players.
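
The proposed card scheme can be written down as a simple lookup. The sketch below only illustrates the proposal: the function name and the contact labels are hypothetical, and how a sensor system would decide the contact type is outside its scope.

```python
# Proposed sanctions from the rule suggestion above: card and penalty yards
PROPOSED_SANCTIONS = {
    "helmet-to-helmet": ("RED", 10),
    "helmet-to-body": ("YELLOW", 5),
}


def proposed_sanction(contact_type):
    """Return (card, penalty_yards) for a contact type, or None if no foul."""
    return PROPOSED_SANCTIONS.get(contact_type.lower())


print(proposed_sanction("Helmet-to-Helmet"))
print(proposed_sanction("shoulder-to-body"))
```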


Once the 2018 data is available, we will have more insights and can analyze the impact of the 2018 rule changes.

The version submitted for the competition is Version 15.

*Note: I knew nothing about the NFL before this competition, so thanks, Kaggle and NFL.*


