# NFL 1st and Future

## Introduction
The National Football League is America's most popular sports league, comprised of 32 franchises that compete each year to win the Super Bowl, the world's biggest annual sporting event. Founded in 1920, the NFL developed the model for the successful modern sports league, including national and international distribution, extensive revenue sharing, competitive excellence, and strong franchises across the country.
The NFL is committed to advancing progress in the diagnosis, prevention and treatment of sports-related injuries. The NFL's ongoing health and safety efforts include support for independent medical research and engineering advancements and a commitment to work to better protect players and make the game safer, including enhancements to medical protocols and improvements to how our game is taught and played.
As more is learned, the league evaluates and changes rules to evolve the game and try to improve protections for players. Since 2002 alone, the NFL has made 50 rules changes intended to eliminate potentially dangerous tactics and reduce the risk of injuries. (NFL.com)

## Kernel Problem
The objective of this kernel is to determine whether playing surfaces affect players' movement and performance in ways that may increase injury risk, and whether injury risk is affected by other aspects of the playing environment.

## Kernel Methodology
A large body of research indicates that injury risk does not arise in a single moment; it depends on the player's past and present environment. The key idea of my analysis is therefore an aggregate comparison between players who have at least one injury on record (100 players in this dataset, 5 of whom have repeated injuries) and players who have no injury on record (150 players in this dataset). Data visualizations are used to identify relationships between injury risk and the playing environment and performance, and logistic regression is used to test whether those relationships are statistically significant.

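As a rough illustration of the significance test used throughout this kernel, the sketch below fits a logistic regression by Newton's method and computes two-sided Wald p-values with NumPy only. It runs on synthetic data; `logit_pvalues` and the simulated variables are hypothetical names, and this is not the statsmodels implementation, only a minimal version of the same idea.

```python
import numpy as np
from math import erfc, sqrt

def logit_pvalues(X, y, n_iter=50):
    """Fit a logistic regression by Newton's method; return (coefficients, Wald p-values)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))            # predicted probabilities
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])   # Fisher information
        step = np.linalg.solve(hessian, X.T @ (y - p))   # Newton update
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    # Covariance of the estimates from the information matrix at the solution
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    z = beta / np.sqrt(np.diag(cov))
    pvalues = np.array([erfc(abs(v) / sqrt(2.0)) for v in z])  # two-sided normal test
    return beta, pvalues

# Synthetic example: 'temperature' drives the outcome, 'noise' does not.
rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(size=n)
noise = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(0.5 + 2.0 * temperature)))
injured = (rng.random(n) < prob).astype(float)

X = np.column_stack([np.ones(n), temperature, noise])  # first column is the intercept
beta, pvalues = logit_pvalues(X, injured)
print(dict(zip(["const", "temperature", "noise"], pvalues.round(4))))
```

Note the explicit intercept column: without one, the model is forced to predict a probability of 0.5 whenever all predictors are zero.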

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from pandas import DataFrame
import matplotlib.colors as mcolors
import matplotlib.patheffects as path_effects
from matplotlib import cm
import itertools
from matplotlib.patches import Rectangle, Polygon
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from PIL import Image
from textwrap import wrap
import re
import warnings

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
        
warnings.filterwarnings('ignore')
%matplotlib inline
plt.style.use('default')
kaggle_color = '#20beff'

In [None]:
playlist=pd.read_csv("/kaggle/input/nfl-playing-surface-analytics/PlayList.csv")
injuryrecord=pd.read_csv("/kaggle/input/nfl-playing-surface-analytics/InjuryRecord.csv")

## Analysis 

### Injury Type

In [None]:
# Map the number of DM_* flags set for each injury to the number of missed training days (1, 7, 28 or 42)
injuryrecord["DayMissingTraining"]=injuryrecord["DM_M1"]+injuryrecord["DM_M7"]+injuryrecord["DM_M28"]+injuryrecord["DM_M42"]
injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].replace(4,42)
injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].replace(3,28)
injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].replace(2,7)
injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].replace(1,1)

# Count injuries per player and body part
injury1=pd.DataFrame(injuryrecord.groupby(["PlayerKey","BodyPart"])["BodyPart"].count().unstack().fillna(0))
# Mean number of missed training days per injured player
injury2=pd.DataFrame(injuryrecord.groupby(["PlayerKey"])["DayMissingTraining"].mean().fillna(0))

injury=pd.concat([injury1,injury2],axis=1)
# Total number of injuries per player
injury["NoInjury"]=injury.Ankle+injury.Foot+injury.Heel+injury.Knee+injury.Toes


In [None]:
injuryrecord["BodyPart"]=injuryrecord["BodyPart"].astype("category")
injuryrecord["BodyPart"]=injuryrecord["BodyPart"].cat.set_categories(["Knee","Ankle","Toes","Foot","Heel"])

injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].astype("category")
injuryrecord["DayMissingTraining"]=injuryrecord["DayMissingTraining"].cat.set_categories([7,1,42,28])
def draw_border_around_axes(this_ax, color="black"):
    for axis in ['top','bottom','left','right']:
        this_ax.spines[axis].set_visible(True)
        this_ax.spines[axis].set_color(color)

def hide_axes(this_ax):
    this_ax.set_frame_on(False)
    this_ax.set_xticks([])
    this_ax.set_yticks([])
    return this_ax

barstyle = {"edgecolor":"black", "linewidth":0.5}

        
f, ax = plt.subplots(nrows=2, ncols=2, figsize=(9,9),gridspec_kw={'height_ratios':[2,5], 'width_ratios':[2,5], 'wspace':0.1, 'hspace':0.1})
this_ax = ax[0,0]
hide_axes(this_ax)

hm_ax = ax[1,1]
# Pass ax explicitly so the heatmap does not depend on which axes happen to be current
sns.heatmap(pd.crosstab(injuryrecord["BodyPart"],injuryrecord["DayMissingTraining"]),annot=True, fmt="d", square=False, 
center = 90, vmin=0, vmax=20, lw=4, cbar=False,color="black",ax=hm_ax)
hm_ax.yaxis.tick_right()
hm_ax.yaxis.set_label_position("right")
draw_border_around_axes(hm_ax)

this_ax = ax[1,0]
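# Sort the counts directly (since pandas 2.0 the value_counts() column is named "count", not "BodyPart"); the keyword is color, not colors
injuryrecord.BodyPart.value_counts().sort_values().plot.barh(ax=this_ax,color='darkseagreen',**barstyle)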
this_ax.set_xlabel("BodyPart")
this_ax.xaxis.tick_top()
this_ax.set_xlim(this_ax.get_xlim()[::-1]);
this_ax.yaxis.set_label_position("right")
this_ax.xaxis.set_label_position("top")


this_ax = ax[0,1]
injuryrecord.DayMissingTraining.value_counts().plot.bar(ax=this_ax,color='peru',**barstyle)
this_ax.set_ylabel("MissingDays")
this_ax.xaxis.set_label_position("bottom")
this_ax.yaxis.set_label_position("left")
this_ax.xaxis.tick_top()


There are two ways to classify the injuries:<br/>
**Body part**: players suffer most from knee and ankle injuries (85.7% of all injuries), whereas toe, foot and heel injuries are much less likely to happen.<br/>
**Severity**: the other point of interest is how serious an injury is. I used the number of missed training days as the indicator; note that 64.8% of injuries are not serious (fewer missed training days).<br/>

Looking deeper into the charts, fortunately 67.8% of the common injuries (knee and ankle) are not serious, whereas 53.3% of the uncommon injuries are serious.<br/>
So I think there is good news for the NFL: serious injuries are less likely to occur.

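Percentages like the ones above can be read off normalized frequency tables. The sketch below uses a hypothetical toy table (not the real `InjuryRecord.csv`) and assumes, purely for illustration, that an injury counts as serious when 28 or more training days are missed.

```python
import pandas as pd

# Hypothetical toy injury table (not the real InjuryRecord.csv)
toy = pd.DataFrame({
    "BodyPart": ["Knee", "Knee", "Ankle", "Ankle", "Foot", "Toes"],
    "DayMissingTraining": [1, 42, 7, 7, 28, 42],
})
# Assumption: an injury is "serious" when 28 or more days are missed
toy["Severity"] = toy["DayMissingTraining"].ge(28).map({True: "serious", False: "not serious"})

# Share of all injuries per body part
body_share = toy["BodyPart"].value_counts(normalize=True)

# Within each body part, the share of serious vs not serious injuries
severity_share = pd.crosstab(toy["BodyPart"], toy["Severity"], normalize="index")

print((body_share * 100).round(1))
print((severity_share * 100).round(1))
```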
In [None]:
# Injuries per surface and body part, with surfaces ordered by knee-injury count
pd.crosstab(injuryrecord["Surface"],injuryrecord["BodyPart"]).sort_values("Knee", ascending=False).plot(kind='barh',
          figsize=(15, 5),
          title='Injury Body Part by Field Type',
          stacked=True)
plt.show()


There are two things to highlight here:<br/>
Players are more likely to be injured in stadiums with a synthetic surface than with a natural one.<br/>
From an injury-type perspective, ankle and toe injuries are more likely to occur on synthetic surfaces, whereas foot and heel injuries are more likely to happen on natural surfaces.


## Playing Environments

In [None]:
# Data cleaning
#1-Stadium Type
playlist.StadiumType=playlist.StadiumType.replace(["Outdoor","Outdoors","Oudoor","Ourdoor","Outdoor Retr Roof-Open","Outddors",
                                                  "Outdor","Outside","Open","cloudy","Cloudy","Bowl","Heinz Field","Indoor, Open Roof"
                                                  ,"Domed, open","Domed, Open","Retr. Roof - Open","Retr. Roof-Open","Open"],"Outdoor")
playlist.StadiumType=playlist.StadiumType.replace(["Indoors","Indoor","Indoor, Roof Closed","Dome","Domed, closed","Dome, closed","Closed Dome",
                                                   "Domed","Retractable Roof","Retr. Roof - Closed","Retr. Roof-Closed","Retr. Roof Closed"],"Indoor")


In [None]:
#2-Weather
# Note: a missing comma after "Cloudy and Cool" previously fused two labels into one string; duplicates removed
playlist.Weather=playlist.Weather.replace(["Cloudy","Partly Cloudy","Mostly Cloudy","Partly Clouidy","Coudy",
                                           "Cloudy with periods of rain, thunder possible. Winds shifting to WNW, 10-20 mph.",
                                          "Party Cloudy","Cloudy, chance of rain","Mostly cloudy","Partly cloudy","cloudy","Cloudy and Cool",
                                          "Cloudy, 50% change of rain","Cloudy, fog started developing in 2nd quarter",
                                           "Cloudy and cold","Cloudy, light snow accumulating 1-3",
                                          "Mostly Coudy","Hazy","Overcast"],"Cloudy")
playlist.Weather=playlist.Weather.replace(["Sunny","Mostly Sunny","Partly Sunny","Mostly sunny","Sunny and clear",
                                           "Sunny and warm","Sunny, highs to upper 80s","Sunny Skies","Sunny, Windy","Partly sunny",
                                           "Clear and Sunny","Sun & clouds","Clear and sunny","Mostly Sunny Skies","Sunny and cold"],"Sunny")
playlist.Weather=playlist.Weather.replace(["Clear","Clear and warm","Clear Skies","Clear skies","Clear and cold","Partly clear",
                                           "Clear and Cool","Clear to Partly Cloudy","Fair"],"Clear")
playlist.Weather=playlist.Weather.replace(["N/A (Indoors)","Indoors","Indoor","N/A Indoor","Controlled Climate"],"Indoor")
playlist.Weather=playlist.Weather.replace(["Light Rain","Scattered Showers","Showers","Rainy","Rain shower","Cloudy, Rain",
                                           "Chance of Rain","Rain likely, temps in low 40s.","10% Chance of Rain","Rain Chance 40%",
                                          "30% Chance of Rain"],"Rain")
playlist.Weather=playlist.Weather.replace(["Heavy lake effect snow","Snow",'Cloudy, light snow accumulating 1-3"'],"Snow")
playlist.Weather=playlist.Weather.replace(["Cold","Heat Index 95"],"Other")

#3-RosterPosition
playlist.RosterPosition=playlist.RosterPosition.replace(["Offensive Lineman"],"Offensive_Lineman")
playlist.RosterPosition=playlist.RosterPosition.replace(["Wide Receiver"],"Wide_Receiver")
playlist.RosterPosition=playlist.RosterPosition.replace(["Defensive Lineman"],"Defensive_Lineman")
playlist.RosterPosition=playlist.RosterPosition.replace(["Running Back"],"Running_Back")
playlist.RosterPosition=playlist.RosterPosition.replace(["Tight End"],"Tight_End")

#4-PlayType
playlist.PlayType=playlist.PlayType.replace(["0"],"None")

#5-PlayType labels: replace spaces with hyphens
playlist.PlayType=playlist.PlayType.replace(["Extra Point"],"Extra-Point")
playlist.PlayType=playlist.PlayType.replace(["Field Goal"],"Field-Goal")
playlist.PlayType=playlist.PlayType.replace(["Kickoff Not Returned"],"Kickoff-Not-Returned")
playlist.PlayType=playlist.PlayType.replace(["Kickoff Returned"],"Kickoff-Returned")
playlist.PlayType=playlist.PlayType.replace(["Punt Not Returned"],"Punt-Not-Returned")
playlist.PlayType=playlist.PlayType.replace(["Punt Returned"],"Punt-Returned")



In [None]:
# Missing data
# Stadium type
playlist["StadiumType"]=playlist["StadiumType"].fillna("Outdoor")
# Weather and temperature (-999 marks a missing temperature)
playlist["Temperature"]=playlist["Temperature"].replace(-999,np.nan)
playlist["Weather"]=playlist.groupby("StadiumType").Weather.transform(lambda x: x.fillna(x.mode()[0]))
playlist["Temperature"]=playlist["Temperature"].fillna(playlist.groupby(["StadiumType","Weather"])["Temperature"].transform("median"))

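The group-wise imputation in the cell above can be checked on a small example. This is a minimal sketch on hypothetical toy rows (not the real `PlayList.csv`): missing weather takes the most frequent label within its stadium type, and missing temperature takes the median of its stadium type and weather group.

```python
import numpy as np
import pandas as pd

# Hypothetical toy play list; -999 marks a missing temperature
toy = pd.DataFrame({
    "StadiumType": ["Outdoor", "Outdoor", "Outdoor", "Indoor", "Indoor", "Indoor"],
    "Weather": ["Sunny", None, "Sunny", "Indoor", "Indoor", None],
    "Temperature": [80.0, 70.0, -999.0, 68.0, -999.0, 72.0],
})

toy["Temperature"] = toy["Temperature"].replace(-999, np.nan)
# Fill missing weather with the most frequent label within each stadium type
toy["Weather"] = toy.groupby("StadiumType")["Weather"].transform(lambda s: s.fillna(s.mode()[0]))
# Fill missing temperature with the median of its (stadium type, weather) group
group_median = toy.groupby(["StadiumType", "Weather"])["Temperature"].transform("median")
toy["Temperature"] = toy["Temperature"].fillna(group_median)
print(toy)
```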

In [None]:
# Split the playlist dataset into two sets: one for injured players and one for non-injured players
Injplaylist=playlist[playlist.PlayerKey.isin(injuryrecord.PlayerKey)]
Injplay=Injplaylist.dropna(axis=0)
# Take an explicit copy so later column assignments do not act on a slice of playlist
play_noinjury=playlist[~playlist.PlayerKey.isin(injuryrecord.PlayerKey)].dropna(axis=0).copy()
play_injury=pd.merge(Injplaylist,injuryrecord,on=["PlayerKey","GameID","PlayKey"],how="outer")


In [None]:
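play_injury=play_injury.drop(["DM_M1","DM_M7","DM_M28","DM_M42","DayMissingTraining"],axis=1)
play_injury["BodyPart"]=play_injury["BodyPart"].astype("category")
play_injury["BodyPart"]=play_injury["BodyPart"].cat.add_categories('None')
play_injury["BodyPart"]=play_injury["BodyPart"].fillna("None")
# Note: Surface comes from injuryrecord and is NaN for every play without a matching injury record,
# so this dropna keeps only the plays on which an injury was recorded.
play_injury.dropna(axis=0,inplace=True)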

In [None]:
play_injury["Injury_Risk"]=1
play_noinjury["Injury_Risk"]=0
injuryrisk=pd.concat([play_injury,play_noinjury],axis=0)
injuryrisk.drop(["BodyPart"],axis=1,inplace=True)

risk1=injuryrisk.groupby(["PlayerKey","RosterPosition"])["RosterPosition"].nunique().unstack().fillna(0).astype('int64')
risk4=injuryrisk.groupby(["PlayerKey","PlayType"])["PlayType"].count().unstack().fillna(0).astype('int64')
risk6=injuryrisk.groupby(["PlayerKey"])["PlayerGame"].max().fillna(0).astype('int64')
risk7=injuryrisk.groupby(["PlayerKey"])["PlayerGamePlay"].max().fillna(0).astype('int64')
risk8=injuryrisk.groupby(["PlayerKey"])["Temperature"].max().fillna(0).astype('int64')
risk9=injuryrisk.groupby(["PlayerKey","FieldType"])["FieldType"].count().unstack().fillna(0).astype('int64')
risk10=injuryrisk.groupby(["PlayerKey","StadiumType"])["StadiumType"].count().unstack().fillna(0).astype('int64')
risk11=injuryrisk.groupby(["PlayerKey","Weather"])["Weather"].count().unstack().fillna(0).astype('int64')
risk12=injuryrisk.groupby(["PlayerKey"])["Injury_Risk"].mean().fillna(0).astype('int64')

from functools import reduce
dfs = [risk1, risk4,risk6, risk7,risk8, risk9, risk10, risk11, risk12] # list of dataframes
df_final = reduce(lambda left,right: pd.merge(left,right,on='PlayerKey'), dfs)


### 1- Roster Position 

In [None]:

plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
text1=" ".join(Injplay['RosterPosition'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text1)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Roster Position game play distribution for player with injury', fontsize=15)
plt.axis("off")

plt.subplot(1,2,2)
text2=" ".join(play_noinjury['RosterPosition'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text2)
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Roster Position game play distribution for player with no injury', fontsize=15)
plt.axis("off")
plt.show()


In [None]:
import statsmodels.api as sm
x=df_final[['Cornerback', 'Defensive_Lineman', 'Kicker', 'Linebacker','Offensive_Lineman', 'Quarterback', 'Running_Back', 'Safety','Tight_End', 'Wide_Receiver']]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

Our graphs suggest that players in the Cornerback, Safety, Wide_Receiver and Linebacker positions are more likely to be injured if they play many games, in contrast with offensive and defensive linemen.<br/>
However, our model fails to support these positions as a significant injury risk (this is for position only, without the number of games played), but it does suggest that offensive and defensive lineman are the safest positions on a football team.




### 2- Play Type

In [None]:
plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
text1=" ".join(Injplay['PlayType'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text1)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Play Type game play distribution for player with injury')
plt.axis("off")

plt.subplot(1,2,2)
text2=" ".join(play_noinjury['PlayType'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text2)
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Play Type  game play distribution for player with no injury')
plt.axis("off")

plt.show()



In [None]:
x=df_final[['Extra-Point', 'Field-Goal', 'Kickoff','Kickoff-Not-Returned', 'Kickoff-Returned', 'None', 'Pass', 'Punt',
           'Punt-Not-Returned', 'Punt-Returned', 'Rush']]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

Not much to say here: our graphs and model suggest no significant association between any play type and injury risk.

### 3- Weather

In [None]:
plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
text1=" ".join(Injplay['Weather'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text1)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Weather distribution for player with injury',fontsize=15)
plt.axis("off")

plt.subplot(1,2,2)
text2=" ".join(play_noinjury['Weather'].str.lower())
wordcloud = WordCloud(stopwords=STOPWORDS,background_color='white', max_words=300,collocations = False).generate(text2)
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Weather distribution for player with no injury',fontsize=15)
plt.axis("off")
plt.show()


In [None]:
x=df_final[['Clear','Cloudy', 'Indoor_y', 'Other', 'Rain', 'Snow', 'Sunny']]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

The model finds no significant association between weather and injury risk. This is surprising, since many previous studies report a strong association between weather and injury occurrence, especially in rain and snow.

### 4- Field Type

In [None]:
plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
ax1=Injplaylist.FieldType.value_counts().plot.barh(color="darkseagreen")
plt.gca().invert_xaxis()
plt.title("Field Type game play distribution for injury players",fontsize=15)

plt.subplot(1,2,2)
ax2=play_noinjury.FieldType.value_counts().plot.barh(color="peru")
plt.title("Field Type game play distribution for non_injury players",fontsize=15)

plt.subplots_adjust(wspace=0.05)

In [None]:
x=df_final[['Natural', 'Synthetic']]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

Although many studies show a strong correlation between playing surface and injury, our results fail to support their findings: field type shows no significant effect on injury risk.

### 5- Stadium Type

In [None]:
plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
ax1=Injplaylist.StadiumType.value_counts().plot.barh(color="darkseagreen")
plt.gca().invert_xaxis()
plt.title("Stadium Type game play distribution for injured players",fontsize=15)

plt.subplot(1,2,2)
ax2=play_noinjury.StadiumType.value_counts().plot.barh(color="peru")
plt.title("Stadium Type game play distribution for non_injured players",fontsize=15)

plt.subplots_adjust(wspace=0.05)

In [None]:
x=df_final[["Indoor_x","Outdoor"]]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

Although outdoor stadiums would logically be associated with more injuries, since they are more exposed to changing weather, our analysis fails to support this assumption.

### 6- Number of Games Played

In [None]:
Nogame_injury=play_injury.groupby(["PlayerKey"])["PlayerGame"].max().to_frame()
Nogame_injury.columns=["No.game.inj"]
Nogame_noinjury=play_noinjury.groupby(["PlayerKey"])["PlayerGame"].max().to_frame()
Nogame_noinjury.columns=["No.game.noinj"]

plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
ax1=Nogame_injury["No.game.inj"].hist(color="darkseagreen")
plt.xlabel("Number of games")
plt.title("Number of games played by injured players",fontsize=15)

ax2=plt.subplot(1,2,2)
Nogame_noinjury["No.game.noinj"].hist(color="peru")
plt.xlabel("Number of games")
plt.title("Number of games played by non_injured players",fontsize=15)

plt.show()


In [None]:
x=df_final[["PlayerGame"]]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

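The number of games played should have a strong impact on injury occurrence, but the negative association found here raises a big question mark: in our charts, most injured players had played fewer than 10 games, whereas most non-injured players had played more than 25. One explanation is that injured players miss many games while recovering or sitting on the bench. Another is that `play_injury` keeps only the plays on which an injury was recorded, because rows without a `Surface` value are dropped, so for injured players `PlayerGame` likely reflects the game of the injury rather than the total number of games played.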

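A side effect worth checking in the preparation cells: `Surface` exists only in `injuryrecord`, so after the outer merge it is missing for every play without a matching injury record, and a blanket `dropna` then removes those plays. The toy sketch below (hypothetical data, not the real files) shows how this changes a per-player `PlayerGame` maximum.

```python
import pandas as pd

# Hypothetical toy data: one player, three plays in three games, one recorded injury
plays = pd.DataFrame({
    "PlayerKey": [1, 1, 1],
    "PlayKey": ["1-1-1", "1-2-1", "1-3-1"],
    "PlayerGame": [1, 2, 3],
    "FieldType": ["Natural", "Synthetic", "Natural"],
})
injuries = pd.DataFrame({
    "PlayerKey": [1],
    "PlayKey": ["1-2-1"],
    "BodyPart": ["Knee"],
    "Surface": ["Synthetic"],
})

merged = pd.merge(plays, injuries, on=["PlayerKey", "PlayKey"], how="outer")
merged["BodyPart"] = merged["BodyPart"].fillna("None")

# Surface is still NaN on the two plays without an injury, so dropna removes them
after_dropna = merged.dropna(axis=0)
# Dropping the injury-only column first keeps every play
all_plays = merged.drop(columns="Surface").dropna(axis=0)

print(after_dropna["PlayerGame"].max())  # game in which the injury occurred
print(all_plays["PlayerGame"].max())     # last game the player appeared in
```

If the intent is to count all games of injured players, dropping or filling the injury-only columns before `dropna` would be one way to keep them.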

### 7- Maximum Number of Plays per Game

In [None]:
Nogameplay_injury=play_injury.groupby(["PlayerKey"])["PlayerGamePlay"].max().to_frame()
Nogameplay_injury.columns=["No.gameplay.inj"]
Nogameplay_noinjury=play_noinjury.groupby(["PlayerKey"])["PlayerGamePlay"].max().to_frame()
Nogameplay_noinjury.columns=["No.gameplay.noinj"]

plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
Nogameplay_injury["No.gameplay.inj"].hist(color="darkseagreen")
plt.xlabel("Maximum number of plays per game")
plt.title("Maximum number of plays per game played by injured player",fontsize=15)


plt.subplot(1,2,2)
Nogameplay_noinjury["No.gameplay.noinj"].hist(color="peru")
plt.xlabel("Maximum number of plays per game")
plt.title("Maximum number of plays per game played by non_injured player",fontsize=15)

plt.show()

In [None]:
x=df_final["PlayerGamePlay"]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

### 8- Temperature

In [None]:
Tem_injury=play_injury.groupby(["PlayerKey"])["Temperature"].mean().to_frame()
Tem_injury.columns=["tem.mean.inj"]
Tem_noinjury=play_noinjury.groupby(["PlayerKey"])["Temperature"].mean().to_frame()
Tem_noinjury.columns=["tem.mean.noinj"]

plt.figure(figsize=(20,8))
plt.subplot(1,2,1)
Tem_injury["tem.mean.inj"].hist(color="darkseagreen")
plt.xlabel("Average Temperature")
plt.title("Temperature distribution in games played by injured players",fontsize=15)

plt.subplot(1,2,2)
Tem_noinjury["tem.mean.noinj"].hist(color="peru")
plt.xlabel("Average Temperature")
plt.title("Temperature distribution in games played by non_injured players",fontsize=15)
plt.show()

In [None]:
x=df_final["Temperature"]
y=df_final.Injury_Risk
sm.Logit(y,x).fit().pvalues

Higher temperature means higher injury risk. Exposure to high temperatures can lead to physiological and psychological changes associated with heat strain, which in turn can decrease workers' performance and lead to impaired concentration, increased distractibility, and fatigue (Kjellstrom et al. 2016), so it is not surprising to see that temperature is significantly associated with injury risk.

### Player Track

In [None]:
import dask.dataframe as dd
track = dd.read_csv("/kaggle/input/nfl-playing-surface-analytics/PlayerTrackData.csv")
track["PlayerKey"]=track["PlayKey"].str[:5]

playtrack=track.groupby(["PlayKey"]).agg({"x":"mean","y":"mean","dir":"mean","dis":"mean","o":"mean","s":"mean"}).compute()
play_track=pd.merge(playtrack,playlist,on="PlayKey")


### Positions

In [None]:

sns.catplot(y="FieldType",x="x",data=play_track,kind="boxen")

sns.catplot(y="FieldType",x="y",data=play_track,kind="boxen")
plt.show()


### Distance and directions

In [None]:

sns.catplot(y="FieldType",x="dir",data=play_track,kind="boxen")

sns.catplot(y="FieldType",x="dis",data=play_track,kind="boxen")

### Orientation and Speed

In [None]:
sns.catplot(y="FieldType",x="o",data=play_track,kind="boxen")

sns.catplot(y="FieldType",x="s",data=play_track,kind="boxen")

In [None]:
playertrack=track.groupby(["PlayerKey"]).agg({"x":"mean","y":"mean","dir":"mean","dis":"mean","o":"mean","s":"mean"}).compute()


In [None]:
playertrack.reset_index(inplace=True)
df_final.reset_index(inplace=True)


In [None]:
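# Join on PlayerKey rather than row position: track keys are strings, df_final keys are integers
playertrack["PlayerKey"]=playertrack["PlayerKey"].astype("int64")
player_track=pd.merge(df_final[["PlayerKey","Injury_Risk"]],playertrack,on="PlayerKey")
player_track.dropna(axis=0,inplace=True)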

In [None]:
x = player_track.drop(["Injury_Risk","PlayerKey"],axis=1)
y = player_track.Injury_Risk
sm.Logit(y,x).fit().pvalues

The plots show no apparent effect of playing surface on players' movements or positions, and none of the movement or position variables has a significant impact on injury risk.

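When two per-player tables are combined, it matters whether rows are paired by position or by key. The sketch below uses hypothetical toy tables to show that `pd.concat(..., axis=1)` aligns on the row index, while `pd.merge` aligns on `PlayerKey` once both key columns share a dtype.

```python
import pandas as pd

# Hypothetical per-player tables; the second one has no row for player 2
risk = pd.DataFrame({"PlayerKey": [1, 2, 3], "Injury_Risk": [0, 1, 0]})
speed = pd.DataFrame({"PlayerKey": ["1", "3"], "s": [1.5, 2.5]})

# concat(axis=1) pairs rows by index position and ignores PlayerKey
by_position = pd.concat([risk["Injury_Risk"], speed], axis=1)

# merge pairs rows by key once both key columns share a dtype
speed["PlayerKey"] = speed["PlayerKey"].astype("int64")
by_key = pd.merge(risk, speed, on="PlayerKey")

print(by_position)  # row 1 pairs player 2's label with player 3's speed
print(by_key)       # only players 1 and 3, each with their own label
```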
## Data Limitations
1. About 27% of injuries, a large proportion, have no available information about their playing environment or player track data, which affects the study findings.
2. Other information about players should be gathered, such as their age, weight, height, medical history, training sessions and many other things.
3. The data should include dates so that we can capture more information, such as how many days pass between games.

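Calendar dates are not available, but `PlayList.csv` does include a `PlayerDay` column. Assuming it is a relative day index per player (an assumption to verify against the dataset description), a rough rest-day measure could be sketched as follows on hypothetical toy rows:

```python
import pandas as pd

# Hypothetical toy rows; PlayerDay is assumed to be a relative day index per player
toy = pd.DataFrame({
    "PlayerKey": [1, 1, 1, 2, 2],
    "GameID": ["1-1", "1-2", "1-3", "2-1", "2-2"],
    "PlayerDay": [1, 8, 22, 1, 5],
})

# One row per player and game, ordered in time
games = toy.drop_duplicates(["PlayerKey", "GameID"]).sort_values(["PlayerKey", "PlayerDay"])
# Days elapsed since the same player's previous game (NaN for a player's first game)
games["DaysSincePrevGame"] = games.groupby("PlayerKey")["PlayerDay"].diff()
print(games)
```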


## Conclusion
Injuries in football have a huge impact on team performance and cost millions for player treatment and for finding proper replacements, so much research is conducted to find the main factors that increase injury risk.<br/>
The most common injuries are knee and ankle injuries, but luckily most of them are not serious. Most of these injuries occurred in stadiums with synthetic surfaces, although we found no significant impact of playing surface on players' movement and positions.<br/>
Our analysis shows that the number of games played, the maximum number of plays per game and temperature have a significant impact on injuries, and we found that offensive and defensive lineman are the safest positions in the squad.


## Recommendation
Although football injuries are inevitable, some considerations should be taken into account to reduce their risk and impact:
1. Examine stadium conditions periodically, especially those with synthetic surfaces.
2. Pay attention to environmental recommendations, especially in relation to excessively hot and humid weather, to help avoid heat illness.
3. Maintain proper fitness: injury rates are higher in athletes who have not adequately prepared physically.
4. Don't hesitate to speak with a sports medicine professional about any concerns regarding injuries or injury prevention strategies.
