NFL punt analytics for concussion prevention

In the 2018 season, the NFL changed its kickoff rules to limit concussions on kickoffs. In this Kaggle challenge, participants are asked to suggest rule changes that would reduce concussions on punts as well. We first review previous rule changes and regulations aimed at reducing concussions.

**1. Review on NFL concussion safety policy**



**a. Concussion Protocol**

The concussion protocol was developed by the NFL to ensure that concussed players receive adequate medical care.

If a player exhibits signs of concussion, coaches, players or NFL officials can trigger the concussion protocol, and the player is immediately moved to the sideline for a concussion assessment.

If the result is positive or uncertain, the player is examined by team physicians in the locker room. The player is cleared to play only if he passes all assessments.

![](https://www.playsmartplaysafe.com/wp-content/uploads/2018/06/checklist-june-2018-final1-791x1024.png)


Every NFL player diagnosed with a concussion must go through a five-step return-to-participation protocol and be cleared by team physicians before resuming contact practice or playing in a game.

![](https://www.playsmartplaysafe.com/wp-content/uploads/2017/07/nfl_returntoparticipationprotocol.png)




**b. Equipment and technology**

In 2016, the NFL pledged $60 million to a five-year plan called the Engineering Roadmap to improve the understanding of the biomechanics of head injuries. Here are a few examples of how new technology can help promote a safer game:

**Helmet testing**: Every year, the NFL conducts laboratory research to evaluate which helmets best reduce head impact severity, and the findings are shared with NFL teams and players. For the 2018 season, 10 helmets were banned because of their performance.  
![](https://www.playsmartplaysafe.com/wp-content/uploads/2016/05/helmet1-767x1024.png)

**HeadHealthTECH Challenges**: The NFL holds HeadHealthTECH Challenges annually to attract companies and universities to develop improved helmets and protective equipment. Six challenges have been held, with up to $1 million in annual funding.

**Electronic equipment**: NFL teams have access to sideline video monitors, so they can review the cause of an injury and design the best care for the player. Medical staff also have instant access to the player's electronic medical record, as well as electronic tablets for concussion diagnosis and treatment.



**c. Rule Change**

Since 2002, the NFL has made 50 rule changes intended to eliminate potentially dangerous tactics and reduce the risk of injuries. Specifically, in the 2018 season:

**Use of Helmet rule**:

A 15-yard penalty is assessed if a player lowers his head to initiate and make contact with his helmet against an opponent. He may also be ejected from the game.

**Kickoff rule change**

* Kickoff team must have five players on each side of the ball.

* Eight players of the receiving team must be lined up in the 15-yard “setup zone”

* No wedge blocks (shoulder-to-shoulder alignment) are permitted

* No player on either the receiving or kicking team may block within the 15-yard area, or 10-yard for onside-kick

* The ball is dead if it is not touched by the receiving team and touches the ground in the end zone (touchback)

**2. Overview of data**

Here is an overview of the concussion data for punt plays in the 2016 and 2017 NFL seasons.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import HTML
import math
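# The 'seaborn' style was renamed 'seaborn-v0_8' in matplotlib 3.6; use whichever is available
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')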

In [None]:
video_review = pd.read_csv('../input/video_review.csv')
video_review.head()

In [None]:
player_role_data = pd.read_csv('../input/play_player_role_data.csv')
player_role_data.head()

In [None]:
play_information_data = pd.read_csv('../input/play_information.csv')
play_information_data.head()

In [None]:
video_footage_injury_data = pd.read_csv('../input/video_footage-injury.csv')
video_footage_injury_data.head()

First, we look at how many concussions occurred on punt plays in the 2016 and 2017 seasons.

In [None]:
len(video_review)

In the 2016 and 2017 seasons, punt plays accounted for about 6.4% of all plays. According to the [NFL concussion data](https://www.playsmartplaysafe.com/newsroom/reports/2017-injury-data/), 452 concussions occurred during NFL games over those two seasons, so punt plays account for 8.2% of all concussions. In other words, punts carry a somewhat higher concussion risk than the average play. Next, we look at how the concussion data break down.
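The comparison above can be reproduced as a quick calculation. This is only a sketch: it assumes 37 concussed punt plays (the count `len(video_review)` would need to return for the 8.2% figure to hold; the value is not printed in this notebook) and uses the 452 concussions and the 6.4% play share quoted above. The helper name is hypothetical.

```python
# Hypothetical helper: compare punts' share of concussions with their share of plays.
def relative_concussion_rate(punt_concussions, total_concussions, punt_play_share):
    """Return (share of concussions on punts, ratio of that share to the play share)."""
    concussion_share = punt_concussions / total_concussions
    return concussion_share, concussion_share / punt_play_share

# 37 is an assumed count; 452 and 0.064 are the figures quoted in the text.
share, ratio = relative_concussion_rate(37, 452, 0.064)
print("Punt share of concussions = %1.1f%%" % (100 * share))
print("Concussion share / play share = %1.2f" % ratio)
```

A ratio above 1 means punts are over-represented among concussions relative to how often they are run.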

In [None]:
pd.value_counts(video_footage_injury_data['Type']).plot.bar()
plt.title('Concussed play season type')

In [None]:
pd.value_counts(video_footage_injury_data['season']).plot.bar()
plt.title('Concussed play season year')
plt.yticks(np.arange(25, step = 5))

In [None]:
pd.value_counts(video_review['Player_Activity_Derived']).plot.bar()
plt.title('Concussed player action')

In [None]:
pd.value_counts(video_review['Primary_Partner_Activity_Derived']).plot.bar()
plt.title('Concussion partner action')

In [None]:
pd.value_counts(video_review['Primary_Impact_Type']).plot.bar()
plt.title('Impact position')

In [None]:
pd.value_counts(video_review['Friendly_Fire']).plot.bar()
plt.title('Friendly fire?')

Next, we merge the concussion data with the play information and player role data.

In [None]:
merged_data = pd.merge(video_review,play_information_data)
merged_data = pd.merge(merged_data,player_role_data)
merged_data.head()
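
`pd.merge` without an `on` argument joins on every column the two frames share, so it is worth confirming which keys are actually used. A minimal sketch on toy frames (the values are invented; the column names `GameKey`, `PlayID` and `GSISID` follow the competition files) with the keys spelled out and `validate` guarding against accidentally duplicated rows:

```python
import pandas as pd

# Toy stand-ins for video_review and play_player_role_data (values are invented).
review = pd.DataFrame({'GameKey': [1, 2], 'PlayID': [10, 20], 'GSISID': [100, 200]})
roles = pd.DataFrame({'GameKey': [1, 1, 2], 'PlayID': [10, 10, 20],
                      'GSISID': [100, 101, 200], 'Role': ['PR', 'GL', 'PLW']})

# Explicit keys; validate raises MergeError if a key is duplicated on either side.
keys = ['GameKey', 'PlayID', 'GSISID']
merged = pd.merge(review, roles, on=keys, how='inner', validate='one_to_one')
print(merged)
```

Only the concussed players' rows survive the inner join, which is the behaviour the cell above relies on.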

In [None]:
receiving_position = ['PR', 'PFB', 'VR', 'PDR1', 'PDL2']

receive_data = merged_data[merged_data.Role.isin(receiving_position)]
pd.value_counts(receive_data['Role']).plot.bar()
plt.title('Concussed data by receiving team position')

In [None]:
punt_data = merged_data[~merged_data.Role.isin(receiving_position)]
pd.value_counts(punt_data['Role']).plot.bar()
plt.yticks(np.arange(5))
plt.title('Concussed data by punting team position')

We can see that players on the punting team sustain about 7 times as many concussions as non-returners on the receiving team.

In [None]:
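# assign() returns new frames, which avoids SettingWithCopyWarning on the filtered slices
punt_data = punt_data.assign(Concussed_position='kicking')
receive_data = receive_data.assign(Concussed_position='receiving')
pd.value_counts(pd.merge(receive_data,punt_data,how='outer')['Concussed_position']).plot.bar()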
plt.title('Concussed count by team split')

In [None]:
def return_yardline_from_string(pos_team,yardline_string):
    
    if (pos_team in yardline_string): #opponent yardline
        yardline = 100 - float(yardline_string.split(' ')[1])
    else:
        yardline = float(yardline_string.split(' ')[1])
        
    return yardline    
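
A quick check of the helper above on made-up inputs (the team codes and yard lines are illustrative only). The function is repeated in compact form so the snippet runs on its own:

```python
def return_yardline_from_string(pos_team, yardline_string):
    # Same logic as the helper above: flip the number when the string names pos_team.
    number = float(yardline_string.split(' ')[1])
    return 100 - number if pos_team in yardline_string else number

own_side = return_yardline_from_string('NYG', 'NYG 35')    # string names the possession team
other_side = return_yardline_from_string('NYG', 'DAL 35')  # string names the other team
print(own_side, other_side)
```

Note that a bare `50` (midfield) carries no team code, so `split(' ')[1]` would fail on it; this is why the parsing loop below handles 50 separately.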

In [None]:
#merging all data using pivot table
table = pd.pivot_table(player_role_data,index=['GameKey', 'PlayID'],columns=['Role'], aggfunc=lambda x: len(x.unique()))['GSISID'].fillna(0) 

table.reset_index(inplace=True)

all_punt_data = pd.merge(table,play_information_data)
all_punt_data = pd.merge(all_punt_data,video_review,how='outer')

punt_yards_list = []
adjust_yardline_list = []
returned_list = []
return_yards_list = [] 
fair_catch_list = [] 
for i,yards in enumerate(all_punt_data.PlayDescription.str.split('punts ').str[1].str[:2]):
    try:
        if('No Play' in all_punt_data['PlayDescription'][i] or 'Direct snap' in all_punt_data['PlayDescription'][i] or 'pass' in all_punt_data['PlayDescription'][i] or 'FUMBLE' in all_punt_data['PlayDescription'][i]):
            punt_yards_list.append(np.nan)
        else:
            punt_yards_list.append(float(yards))
    except ValueError:
        punt_yards_list.append(np.nan)

    if(all_punt_data['Poss_Team'][i] in all_punt_data['YardLine'][i]):
        adjust_yardline_list.append(float(all_punt_data['YardLine'][i].split(' ')[1]))
    else:
        adjust_yardline_list.append(100-float(all_punt_data['YardLine'][i].split(' ')[1]))
    if(' for ' in all_punt_data['PlayDescription'][i]): #returned kick
        returned_list.append(1)
    else:
        returned_list.append(0)
    if('fair catch' in all_punt_data['PlayDescription'][i]):
        fair_catch_list.append(1)
    else:
        fair_catch_list.append(0)
        
for i,yards in enumerate(all_punt_data.PlayDescription.str.split(' for ').str[1].str[:2]):        
    try:
        if('No Play' in all_punt_data['PlayDescription'][i] or 'fair catch' in all_punt_data['PlayDescription'][i] or 'Direct snap' in all_punt_data['PlayDescription'][i] or 'pass' in all_punt_data['PlayDescription'][i]):
            return_yards_list.append(np.nan)
        elif('MUFFS' in all_punt_data['PlayDescription'][i]):
            return_yards_list.append(0)
        elif(str(yards) == 'no'):
            return_yards_list.append(0)
        elif('PENALTY' in all_punt_data['PlayDescription'][i]):
            if(all_punt_data['PlayDescription'][i].split('PENALTY on ')[1][:3] == all_punt_data['Poss_Team'][i]): #kicking team penalty
                return_yards_list.append(float(yards))
            elif('TOUCHDOWN.' in all_punt_data['PlayDescription'][i]): #penalty after touchdown    
                return_yards_list.append(float(yards))
            else:
                return_end_yardline = return_yardline_from_string(all_punt_data['Poss_Team'][i],all_punt_data['PlayDescription'][i].split('to ')[1][:6])           
                if(float(all_punt_data['PlayDescription'][i].split('enforced at ')[1][:2]) != 50):
                    penalty_yardline = return_yardline_from_string(all_punt_data['Poss_Team'][i],all_punt_data['PlayDescription'][i].split('enforced at ')[1][:6])
                else:
                    penalty_yardline = 50
                return_yards_list.append(return_end_yardline - penalty_yardline)
                
        else:    
            return_yards_list.append(float(yards))
    except ValueError:    
        return_yards_list.append(np.nan)       
all_punt_data['punt_yards'] = punt_yards_list
all_punt_data['adjust_yardline'] = adjust_yardline_list
all_punt_data['returned'] = returned_list
all_punt_data['concussed'] = all_punt_data.GSISID.notnull().astype(int)
all_punt_data['return_yards'] = return_yards_list
all_punt_data['fair_catch'] = fair_catch_list
# Flag a punt as blocked if the description says BLOCKED or it travelled 10 yards or less
all_punt_data['blocked'] = (all_punt_data.PlayDescription.str.contains('BLOCKED', na=False) | (all_punt_data.punt_yards <= 10)).astype(int)
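
The character slicing in the loop above can also be expressed with a regular expression. A hedged sketch on invented play descriptions (the exact wording of real `PlayDescription` strings may differ, so the pattern is an assumption rather than a drop-in replacement):

```python
import pandas as pd

# Invented descriptions in the style of the play-by-play text.
desc = pd.Series([
    'J.Doe punts 45 yards to NYG 20, fair catch by A.Smith.',
    'J.Doe punts 52 yards to DAL 10. B.Jones to DAL 22 for 12 yards.',
    '(Punt formation) PENALTY on NYG, False Start, No Play.',
])

# Capture the number after "punts"; rows without a match become NaN.
punt_yards = pd.to_numeric(desc.str.extract(r'punts (\d+) yard', expand=False))
# Treat "No Play" rows as missing, as the loop above does.
punt_yards = punt_yards.mask(desc.str.contains('No Play'))
fair_catch = desc.str.contains('fair catch').astype(int)
print(pd.DataFrame({'punt_yards': punt_yards, 'fair_catch': fair_catch}))
```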

In [None]:
concussed_split = all_punt_data.groupby(['returned'])['concussed'].mean()

print("Non-return punt concussed percentage = %1.1f%%" % (100*concussed_split[0]))
print("Returned punt concussed percentage = %1.1f%%" % (100*concussed_split[1]))

In [None]:
pd.value_counts(all_punt_data[all_punt_data['concussed'] == 1]['returned']).plot.bar()
plt.xticks([0,1], ('Punt return', 'Others'))
plt.title('Concussion number by punt outcome')

We can see that the concussion rate is much higher on returned punts. As with the kickoff rule change, one of the main ideas behind the suggested punt rule changes is to encourage returners to make more fair catches without eliminating the punt return.

**3. Proposed rule changes**

**a.  Change to rule 9 (Scrimmage Kick)  section 1:**


**At most two players from each of the kicking and receiving teams can line up outside the yard-line numbers, one on each side of the field**


Before discussing the rule change, let's look at punt coverage in football. In the NFL, there is a regulation that constrains the kicking team's formation:

NFL rule 9-1-2:

During a kick from scrimmage, only the end men (eligible receivers) on the line of scrimmage at the time of the snap, or an eligible receiver who is aligned or in motion behind the line 
and is more than one yard outside the end man, are permitted to advance more than one yard beyond the line before the ball is kicked.

Since only eligible receivers can cover the punt before the kick, NFL teams cannot utilize a "shield punt formation", and therefore most punts in the league look like this:

![](https://mgoblog.com/sites/mgoblog.com/files/Oldpunt.jpg)


where the two wide receivers at the ends act as gunners to cover the punt, and the rest of the team protects the punter.

For the receiving team, there are three types of formation, depending on the number of players covering each of the opponent's gunners: 

Single coverage

![](https://imgur.com/7xMiR2j.jpg)

Double coverage

![](https://imgur.com/dKGrWie.jpg)

Hybrid coverage

![](https://imgur.com/2LQiOXe.jpg)


Let's check the effect of coverage on concussion chance:


In [None]:
single_coverage = all_punt_data[(all_punt_data['VR'] == 1) & (all_punt_data['VL'] == 1) ]

double_coverage = all_punt_data[(all_punt_data['VR'] == 0) & (all_punt_data['VL'] == 0) ]

hybrid_coverage = all_punt_data[(all_punt_data['VR'] == 0) ^ (all_punt_data['VL'] == 0) ]
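
The classification above relies on the per-play role counts from the pivot table: a lone `VR` or `VL` means one defender on that gunner, while their absence means the side uses the paired `VRi`/`VRo` or `VLi`/`VLo` roles instead. A small sketch of the same rule on an invented role table, using `pd.crosstab` in place of the pivot table:

```python
import numpy as np
import pandas as pd

# Toy role table: one row per player per play (values are invented).
roles = pd.DataFrame({
    'PlayID': [1, 1, 2, 2, 2, 2, 3, 3, 3],
    'Role':   ['VR', 'VL', 'VRi', 'VRo', 'VLi', 'VLo', 'VR', 'VLi', 'VLo'],
})

# Count each role per play, keeping only VR and VL (filled with 0 when absent).
counts = pd.crosstab(roles['PlayID'], roles['Role']).reindex(columns=['VR', 'VL'], fill_value=0)

# Same rule as above: both present = single, both absent = double, exactly one = hybrid.
conditions = [
    (counts['VR'] == 1) & (counts['VL'] == 1),
    (counts['VR'] == 0) & (counts['VL'] == 0),
    (counts['VR'] == 0) ^ (counts['VL'] == 0),
]
counts['coverage'] = np.select(conditions, ['single', 'double', 'hybrid'], default='other')
print(counts)
```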

In [None]:
print("Number of single coverage punt: %d" % len(single_coverage))
print("Concussions from single coverage punt: %d\n" % single_coverage.Primary_Impact_Type.notna().sum())
print("Number of hybrid coverage punt: %d" % len(hybrid_coverage))
print("Concussions from hybrid coverage punt: %d\n" % hybrid_coverage.Primary_Impact_Type.notna().sum())
print("Number of double coverage punt: %d" % len(double_coverage))
print("Concussions from double coverage punt: %d" % double_coverage.Primary_Impact_Type.notna().sum())

But bear in mind that coaches tend to deploy different coverages depending on the punting line of scrimmage.

In [None]:
print("single coverage average starting yardline = %1.1f" % np.nanmean(np.array(single_coverage.adjust_yardline).astype(float)))
print("hybrid coverage average starting yardline = %1.1f" % np.nanmean(np.array(hybrid_coverage.adjust_yardline).astype(float)))
print("double coverage average starting yardline = %1.1f" % np.nanmean(np.array(double_coverage.adjust_yardline).astype(float)))

So we would like to find the effect of coverage after controlling for the yard line, using the statsmodels package.

In [None]:
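# assign() returns copies, avoiding SettingWithCopyWarning; hybrid (0) is the reference level
hybrid_coverage = hybrid_coverage.assign(coverage_type=0)
single_coverage = single_coverage.assign(coverage_type=1)
double_coverage = double_coverage.assign(coverage_type=2)

regression_df = pd.concat([single_coverage,double_coverage,hybrid_coverage])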

In [None]:
import statsmodels
import statsmodels.api as sm

import statsmodels.formula.api as smf


results = smf.logit(formula='returned ~ C(coverage_type) + adjust_yardline', data=regression_df).fit()

In [None]:
results.summary()

In [None]:
# Hybrid (0) is the reference level; [T.1] is single and [T.2] is double coverage
b = results.params
print("Odds ratio of return for single coverage compared to hybrid coverage = %f" % np.exp(b['C(coverage_type)[T.1]']))
print("Odds ratio of return for single coverage compared to double coverage = %f" % np.exp(b['C(coverage_type)[T.1]'] - b['C(coverage_type)[T.2]']))
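
In a logit model, the odds ratio between two levels of a categorical variable is the exponential of the difference between their coefficients. A worked sketch, assuming the coverage coefficients that are hard-coded in the probability calculation later in this notebook (-0.679 for single, 0.208 for double, hybrid as the reference level with coefficient 0):

```python
import math

# Coefficients as quoted later in this notebook; hybrid coverage is the
# reference level, so its coefficient is 0.
coef_single = -0.679
coef_double = 0.208

# Odds ratio between two levels = exp(difference of their coefficients).
or_single_vs_hybrid = math.exp(coef_single - 0.0)
or_single_vs_double = math.exp(coef_single - coef_double)
print("single vs hybrid: %.3f" % or_single_vs_hybrid)
print("single vs double: %.3f" % or_single_vs_double)
```

Both ratios are below 1, meaning the odds of a return are lower under single coverage than under either alternative.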

This demonstrates that the single coverage formation significantly decreases the chance of a punt being returned, even after adjusting for the starting yard line. 

We would also like to find the effect of coverage on average yards per punt.

In [None]:
regression_df_punt = regression_df.dropna(subset=['punt_yards'])
regression_df_return = regression_df.dropna(subset=['return_yards'])

In [None]:
results = smf.ols(formula='punt_yards ~ adjust_yardline +  C(coverage_type)', data=regression_df_punt).fit()

In [None]:
results.summary()

This shows that hybrid coverage is associated with more yards per punt, but no significant difference is found between single and double coverage. This means the rule change fits our goal: it limits the number of returns, while the receiving team can still try to advance the ball as before.

**b.  Change to rule 10 (Opportunity to catch a kick, fair catch)  section 2 article 4:**


**After a fair catch is made, the receiving team can put the ball in play by a snap from 5 yards in front of the spot of the catch if the catch is off a punt**

Simply put, this awards the receiving team 5 yards after a fair catch off a punt. 


From the first rule change, we can estimate how much less likely the receiving team is to return a punt once the double coverage formation is banned. We use a punt from the kicking team's own 34-yard line as an example, as it is close to the average punting line of scrimmage.

In [None]:
print("Average punting line of scrimmage = %1.3f\n" % np.mean(all_punt_data['adjust_yardline']))

print("Probability of returning a punt from the 34-yard line for single coverage: %f" % (math.exp(-0.679+2.133-0.0623*34) / (1+math.exp(-0.679+2.133-0.0623*34))))
print("Probability of returning a punt from the 34-yard line for hybrid coverage: %f" % (math.exp(0+2.133-0.0623*34) / (1+math.exp(0+2.133-0.0623*34))))
print("Probability of returning a punt from the 34-yard line for double coverage: %f" % (math.exp(0.208+2.133-0.0623*34) / (1+math.exp(0.208+2.133-0.0623*34))))
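
The three expressions above are the inverse-logit of the same linear predictor with a different coverage coefficient each time. A compact sketch of that calculation; the function name is hypothetical and the default intercept and slope are simply the values hard-coded above:

```python
import math

def return_probability(coverage_coef, yardline, intercept=2.133, slope=-0.0623):
    """Inverse-logit of the linear predictor; defaults are the values hard-coded above."""
    z = intercept + coverage_coef + slope * yardline
    return 1.0 / (1.0 + math.exp(-z))

# Hybrid coverage is the reference level, so its coefficient is 0.
coverage_coefs = {'single': -0.679, 'hybrid': 0.0, 'double': 0.208}
probs = {name: return_probability(coef, 34) for name, coef in coverage_coefs.items()}
for name, p in probs.items():
    print("%s coverage: %.3f" % (name, p))
```

Writing it as a function makes it easy to evaluate other yard lines, for example `return_probability(-0.679, 20)`.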


We can also output the punt return data for each coverage type.

In [None]:
print("Number of return of all punt = %d\n"% np.nansum(np.array(all_punt_data.returned).astype(float)) )

print("Single coverage number of return = %d" % np.nansum(np.array(single_coverage.returned).astype(float)))
print("Hybrid coverage number of return = %d" % np.nansum(np.array(hybrid_coverage.returned).astype(float)))
print("Double coverage number of return = %d\n" % np.nansum(np.array(double_coverage.returned).astype(float)))



print("Average yards per return of all punt = %1.2f\n"% np.nanmean(np.array(all_punt_data.return_yards).astype(float)) )

print("Single coverage average yards per return = %1.2f" % np.nanmean(np.array(single_coverage.return_yards).astype(float)))
print("Hybrid coverage average yards per return = %1.2f" % np.nanmean(np.array(hybrid_coverage.return_yards).astype(float)))
print("Double coverage average yards per return = %1.2f" % np.nanmean(np.array(double_coverage.return_yards).astype(float)))

Using linear regression to estimate the decrease in return yards for single coverage:

In [None]:
results = smf.ols(formula='return_yards ~ adjust_yardline +  C(coverage_type)', data=regression_df_return).fit()

In [None]:
results.summary()

After considering the direct decrease in return yards and the decrease in the chance of a return, we can calculate the change in effective punt yards.

Formula: $\%\text{ of plays} \times \left(\text{decrease in return yards} + \text{decrease in return chance} \times \text{average return yards}\right)$



In [None]:
print("Decrease in effective punt yards at own 34 = %1.2f" % ((1865/(1865+1423+3379))*(1.03+(0.503-0.34)*(11-0.0695*34-1.03))+(1423/(1865+1423+3379))*((1.03-0.12)+(0.55-0.34)*(11-0.0695*34-1.03+0.12))))
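
The one-line expression above applies the formula twice and adds the results. The sketch below breaks it into named pieces. All constants are copied from that expression; reading the 1865-play term as hybrid coverage and the 1423-play term as double coverage is an inference from the return probabilities used (0.503 and 0.55 against 0.34 for single), not something stated in the notebook, and the function name is hypothetical.

```python
def expected_yard_loss(play_share, drop_in_return_yards, drop_in_return_chance, avg_return_yards):
    """Share of plays x (yards lost per return + returns lost x yards per return)."""
    return play_share * (drop_in_return_yards + drop_in_return_chance * avg_return_yards)

# All constants below are copied from the hard-coded expression above.
total_plays = 1865 + 1423 + 3379
avg_return = 11 - 0.0695 * 34 - 1.03   # appears to be predicted return yards at the 34 under single coverage

# Assumed mapping: 1865 plays = hybrid coverage, 1423 plays = double coverage.
hybrid_term = expected_yard_loss(1865 / total_plays, 1.03, 0.503 - 0.34, avg_return)
double_term = expected_yard_loss(1423 / total_plays, 1.03 - 0.12, 0.55 - 0.34, avg_return + 0.12)
total_loss = hybrid_term + double_term
print("Decrease in effective punt yards at own 34 = %1.2f" % total_loss)
```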

From this calculation, the receiving team loses about 1.2 yards on average because of the rule change. To compensate, it is suggested that 5 yards be awarded to the receiving team for each fair catch. Similar to the 2016 kickoff rule change that moved the touchback to the 25-yard line, this change may encourage the receiving team to make more fair catches and hence reduce the number of concussions.


**c. Point of Emphasis: Protection of kick returner**

**Special teamers cannot lower their heads to tackle the kick returner**

A point of emphasis is not really a rule change, but a guideline to referees for better game integrity in the coming season. 


In 2016-17, half of the concussions on punt plays were due to tackling, and the punt returner has the highest risk of concussion of all positions. A review of the video shows that quite a number of them were caused by head-first tackles by the kicking team. This tackling technique puts both the tackler and the returner at a higher risk of injury. Here are video examples of how head-first tackles increase concussion risk:


Tackler injury

In [None]:
HTML('<video width="800" height="600" controls> <source src="http://a.video.nfl.com//films/vodzilla/153233/Kadeem_Carey_punt_return-Vwgfn5k9-20181119_152809972_5000k.mp4" type="video/mp4"></video>')

Returner injury

In [None]:
HTML('<video width="800" height="600" controls> <source src="http://a.video.nfl.com//films/vodzilla/153274/Haack_punts_41_yards-SRJMeOc3-20181119_165546590_5000k.mp4" type="video/mp4"></video>')

In the 2018 season, the NFL implemented a rule change making it a foul if a player lowers his head to initiate and make contact with his helmet against an opponent. Contact does not have to be to an opponent’s head or neck area. [nflpenalties.com](http://www.nflpenalties.com/penalty/lowering-the-head-to-initiate-contact?year=2018&view=log) shows the statistics for this penalty:

![](https://imgur.com/7XYCMQf.png)

We can see that of the 16 penalties called through the end of the 2018 regular season, 15 were called on the defense and the remaining one on the offense, which means none was called during special teams play. 
Therefore, referees can pay closer attention to players' tackling technique during punt returns and penalize those who use head-first tackles. Here are some examples of the safe "wrap up" technique from control-group video footage:



In [None]:
HTML('<video width="800" height="600" controls> <source src="http://a.video.nfl.com//films/vodzilla/153511/Lechler_58_yard_punt-8LArhoQg-20181121_123420599_5000k.mp4" type="video/mp4"></video>')

In [None]:
HTML('<video width="800" height="600" controls> <source src="http://a.video.nfl.com//films/vodzilla/153517/Schmidt_57_yard_punt-EMXj28Mw-20181121_124742503_5000k.mp4" type="video/mp4"></video>')

**4. Discussion of other possible rule changes**

Here I discuss other possible rule changes as well as their pros and cons. They are not included in the main sections above because they may be difficult to implement.

**a.  Limit the number of receiving-team players allowed to block the punt**

The rule can be implemented by limiting the number of players who may cross the line of scrimmage when the opponent presents a punt formation.

We can look at how many players attempt to block the punt on each play, which I define as the number of receiving-team players within 10 yards of the punter at the time of the punt.

In [None]:
file_location = ['NGS-2016-pre.csv', 'NGS-2016-reg-wk1-6.csv', 'NGS-2016-reg-wk7-12.csv', 'NGS-2016-reg-wk13-17.csv','NGS-2016-post.csv',
                 'NGS-2017-pre.csv', 'NGS-2017-reg-wk1-6.csv', 'NGS-2017-reg-wk7-12.csv', 'NGS-2017-reg-wk13-17.csv','NGS-2017-post.csv']

tracking_df = pd.DataFrame()

for i in range(10):
    data = pd.read_csv('../input/' + file_location[i])
    data = pd.merge(data,play_information_data,how='outer')
    data = pd.merge(data,player_role_data,how='outer')
    punter_data = data[(data.Event=='punt') & (data.Role == 'P')][['x','y','Season_Year' ,'GameKey','PlayID','Time']]
    punter_data.columns = ['punter_x', 'punter_y','Season_Year','GameKey','PlayID','Time']
    receiving_position = ['PDL1', 'PDL2','PDL3', 'PDL4', 'PDL5', 'PDL6', 'PDM', 'PDR1', 'PDR2', 'PDR3','PDR4', 'PDR5', 'PDR6', 'PFB','PLL', 'PLL1', 'PLL2',
       'PLL3', 'PLM', 'PLM1', 'PLR', 'PLR1', 'PLR2', 'PLR3','VL', 'VLi', 'VLo', 'VR', 'VRi', 'VRo', 'PR']

    punt_time = np.array(punter_data.Time)
    playid = np.array(punter_data.PlayID)
    gamekey = np.array(punter_data.GameKey)

    receiving_team_data = data[(data.Event=='punt') & (data.Role.isin(receiving_position))] #All member of receiving team at time of punt
    
    merged_data = pd.merge(receiving_team_data,punter_data,how='left').dropna()
    # Euclidean distance between each receiving-team player and the punter
    merged_data['distance_to_P'] = ((merged_data['x'] - merged_data['punter_x'])**2 + (merged_data['y'] - merged_data['punter_y'])**2)**0.5
    output_df = merged_data[merged_data['distance_to_P'] < 10].groupby(['Season_Year','GameKey','PlayID'])['Role'].count().reset_index() # count receiving-team players within 10 yards of the punter
    output_df.columns = ['Season_Year', 'GameKey', 'PlayID', 'no_blocker']

    
    tracking_df = pd.concat([tracking_df, output_df])      
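
The core of the cell above is a Euclidean distance followed by a per-play count. A self-contained sketch on invented coordinates (in yards), using `np.hypot` for the distance:

```python
import numpy as np
import pandas as pd

# Toy tracking snapshot at the moment of the punt (coordinates in yards, invented).
players = pd.DataFrame({
    'PlayID': [1, 1, 1, 2, 2],
    'x': [33.0, 40.0, 55.0, 20.0, 21.0],
    'y': [24.0, 26.0, 10.0, 30.0, 30.0],
    'punter_x': [30.0, 30.0, 30.0, 20.0, 20.0],
    'punter_y': [20.0, 20.0, 20.0, 25.0, 25.0],
})

# Euclidean distance: sqrt(dx^2 + dy^2), which is what np.hypot computes.
players['distance_to_P'] = np.hypot(players['x'] - players['punter_x'],
                                    players['y'] - players['punter_y'])

# Number of players within 10 yards of the punter on each play.
no_blocker = (players['distance_to_P'] < 10).groupby(players['PlayID']).sum()
print(no_blocker)
```

Summing the boolean mask per play, rather than filtering first and then grouping, keeps plays where nobody is within 10 yards (they get a count of 0 instead of disappearing).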

In [None]:
all_punt_data = pd.merge(all_punt_data,tracking_df)

In [None]:
pd.value_counts(all_punt_data['no_blocker']).plot.bar()
plt.title('Number of blockers in punt')

We can also check how the number of blockers affects the chance of concussion.

In [None]:
results = smf.logit(formula='concussed ~ adjust_yardline  + no_blocker', data=all_punt_data).fit()

In [None]:
results.summary()

It turns out that the number of blockers may have an impact on the number of concussions.

**Problem:**

**1. Possible abuse on field goals**

In the NFL there is no restriction on when and where a team may present a punt formation. Therefore, if the number of blockers on defense is limited, a team attempting a field goal could present a punt formation to avoid getting blocked, or draw a penalty by faking a field goal and punting instead.

**2. More returns, leading to more concussions:**

Fewer blockers will lead to more punt returns, which presumably would increase the risk of concussion.

In [None]:
results = smf.logit(formula='returned ~ adjust_yardline  + no_blocker', data=all_punt_data).fit()
results.summary()

**b. Limit the number of overload defenders**

Under the current NFL rules, when the opponent presents a field goal or try kick (extra point) formation, the defense cannot put more than six players on either side of the snapper (NFL rulebook 9-3-2(2)). Let's look at the effect of the number of overload defenders (i.e. the imbalance in the number of players on each side of the field) on punt plays.


In [None]:
all_punt_data['overload'] =  ((all_punt_data['PDL1'] + all_punt_data['PDL2'] + all_punt_data['PDL3'] + all_punt_data['PDL4'] + all_punt_data['PDL5'] + all_punt_data['PDL6']) - \
(all_punt_data['PDR1'] + all_punt_data['PDR2'] + all_punt_data['PDR3'] + all_punt_data['PDR4'] + all_punt_data['PDR5'] + all_punt_data['PDR6']) + \
(all_punt_data['PLL1'] + all_punt_data['PLL2'] + all_punt_data['PLL3']) - \
(all_punt_data['PLR1'] + all_punt_data['PLR2'] + all_punt_data['PLR3'])).abs()
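
The same left-minus-right imbalance can be computed without listing every column by hand. A sketch on an invented role-count table; the regular expressions are an assumption that the relevant columns are exactly the numbered `PDL*`/`PLL*` (left) and `PDR*`/`PLR*` (right) roles, as in the expression above:

```python
import pandas as pd

# Toy role-count table: one row per play, one column per role (values invented).
counts = pd.DataFrame({
    'PDL1': [1, 1], 'PDL2': [1, 0], 'PDR1': [1, 1], 'PDR2': [0, 1],
    'PLL1': [1, 0], 'PLR1': [0, 0],
})

# Left side: numbered PDL*/PLL* columns; right side: numbered PDR*/PLR* columns.
left = counts.filter(regex=r'^P[DL]L\d').sum(axis=1)
right = counts.filter(regex=r'^P[DL]R\d').sum(axis=1)
overload = (left - right).abs()
print(overload)
```

Requiring a trailing digit keeps unnumbered roles such as `PLL`, `PLR` and `PLM` out of the sums, matching the columns used above.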

In [None]:
results = smf.logit(formula='concussed ~ adjust_yardline  + overload', data=all_punt_data).fit()
results.summary()

In [None]:
results = smf.logit(formula='returned ~ adjust_yardline  + overload', data=all_punt_data).fit()
results.summary()

**Problem:**

We can see that the number of overload defenders affects neither the concussion chance nor the rate at which punts are returned. The rule change would be easy to implement, but it probably cannot meaningfully decrease the number of concussions.

**c. Kicking-team players must stay at least 5 yards from the punt returner**

In Canadian football, there are no fair catches. Instead, kicking-team players must be at least 5 yards from the returner. Teams that violate this are given a "no yards" penalty. Let's check the data on kicking-team players within 5 yards of the returner.

In [None]:
file_location = ['NGS-2016-pre.csv', 'NGS-2016-reg-wk1-6.csv', 'NGS-2016-reg-wk7-12.csv', 'NGS-2016-reg-wk13-17.csv','NGS-2016-post.csv',
                 'NGS-2017-pre.csv', 'NGS-2017-reg-wk1-6.csv', 'NGS-2017-reg-wk7-12.csv', 'NGS-2017-reg-wk13-17.csv','NGS-2017-post.csv']

tracking_df = pd.DataFrame()

for i in range(10):
    data = pd.read_csv('../input/' + file_location[i])
    data = pd.merge(data,play_information_data,how='outer')
    data = pd.merge(data,player_role_data,how='outer')
    returner_data = data[((data.Event=='fair_catch') | (data.Event=='punt_received')) & (data.Role == 'PR')][['x','y','Season_Year' ,'GameKey','PlayID','Time']]
    returner_data.columns = ['punter_x', 'punter_y','Season_Year','GameKey','PlayID','Time']
    receiving_position = ['PDL1', 'PDL2','PDL3', 'PDL4', 'PDL5', 'PDL6', 'PDM', 'PDR1', 'PDR2', 'PDR3','PDR4', 'PDR5', 'PDR6', 'PFB','PLL', 'PLL1', 'PLL2',
       'PLL3', 'PLM', 'PLM1', 'PLR', 'PLR1', 'PLR2', 'PLR3','VL', 'VLi', 'VLo', 'VR', 'VRi', 'VRo', 'PR']

    return_time = np.array(returner_data.Time)
    playid = np.array(returner_data.PlayID)
    gamekey = np.array(returner_data.GameKey)

    kicking_team_data = data[((data.Event=='fair_catch') | (data.Event=='punt_received')) & ~(data.Role.isin(receiving_position))] # all kicking-team players when the punt is caught
    
    merged_data = pd.merge(kicking_team_data,returner_data,how='left').dropna()
    # punter_x / punter_y hold the returner's coordinates in this cell
    merged_data['distance_to_PR'] = ((merged_data['x'] - merged_data['punter_x'])**2 + (merged_data['y'] - merged_data['punter_y'])**2)**0.5
    output_df = merged_data[merged_data['distance_to_PR'] < 5].groupby(['Season_Year','GameKey','PlayID'])['Role'].count().reset_index() # count kicking-team players within 5 yards of the returner
    output_df.columns = ['Season_Year', 'GameKey', 'PlayID', 'close_defender']

    
    tracking_df = pd.concat([tracking_df, output_df])      

In [None]:
close_defender_data = pd.merge(all_punt_data,tracking_df)

In [None]:
pd.value_counts(close_defender_data['close_defender']).plot.bar()

In [None]:
results = smf.logit(formula='returned ~ close_defender + adjust_yardline', data=close_defender_data).fit()
results.summary()

In [None]:
results = smf.logit(formula='concussed ~ close_defender + adjust_yardline', data=close_defender_data).fit()
results.summary()

Also, for fair catches on NFL punts:

In [None]:
results = smf.logit(formula='concussed ~ fair_catch + adjust_yardline', data=all_punt_data).fit()
results.summary()

We can see that the number of close defenders significantly decreases the number of returns, but this doesn't mean much, since a returner with defenders nearby is more likely to call a fair catch. It doesn't significantly affect the concussion chance either.

**Problem:**

The number of defenders in the returner's proximity is more of a symptom than a meaningful attribute, and there is no evidence that limiting it helps; what we know from the discussion above is that returning a punt is highly associated with concussion. The reason the CFL has this penalty is that it has no fair catch, and it is shown above that fair catches significantly decrease the number of concussions.

**d. Abolish punt return**

Another option is to abolish the punt return and place the ball at the spot where the punt lands.

**Problem:**

**1. Affect ability to score in late game**

It would reduce a team's ability to make a comeback or take the lead late in the game. For example, in the 2010 Week 15 Eagles vs Giants game (dubbed "[Miracle at the New Meadowlands](https://www.youtube.com/watch?v=2PufejLOdzs)"), the Eagles overcame a 21-point deficit with 8 minutes remaining and scored the game-winning touchdown on a punt return as the clock expired. Without punt returns, it would be harder for a trailing team to come back.

**2. Eliminate punter in NFL**

One of the reasons NFL teams carry both a punter and a placekicker is that they provide different skill sets: a kicker needs good accuracy to make field goals, while a punter needs good control of the ball to limit returns and make the opponent start from a disadvantageous position. If the punt return is abolished, it would probably take less skill to be a punter, pushing specialist punters out of the league.

**e. Abolish punting**

It may not fit the theme of the competition, but there have been discussions that if the rule changes do not reduce the number of concussions on kickoffs, the NFL may consider banning the kickoff and placing the ball on a fixed line after a score. The same idea could be applied to punts.

**Problem:**

**1. Concussion risk**

From the data above, punts account for 8.2% of concussions, but the figure is over [18% for kickoffs](https://www.washingtonpost.com/news/early-lead/wp/2018/03/28/could-the-nfl-ban-kickoffs-concussion-concerns-have-idea-moving-closer-to-reality/?utm_term=.06ed4fce4b4f). The inherent risk of concussion on a punt play is much lower than on a kickoff. If the NFL hasn't gotten rid of the kickoff, punting probably won't disappear either.

**2. Implementation problem**

For kickoffs the league can choose an arbitrary yard line to place the ball on, but for punts there isn't one, especially since punt yards depend heavily on the punting team's line of scrimmage.

**3. Affect ability to score in late game**

**4. Eliminate punter in NFL**

See section (d) above

In [None]:
all_punt_data[['punt_yards','adjust_yardline']].groupby('adjust_yardline').mean().plot(kind='bar')
plt.xticks([])
plt.show()

This is the end of the kernel; thanks for reading. Feel free to point out any mistakes I made or offer other suggestions!