# Occurrence of on-ice events

<p>data frames used in this notebook:</p>
<p>&nbsp; &nbsp; 1. all on-ice events prior to a goal.</p>
<p>&nbsp; &nbsp; 2. all even strength on-ice events.</p> 
 

In [115]:
import sys
import os
import pandas as pd
import numpy as np
import datetime, time
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols
from pylab import hist, show
import scipy
import zipfile


pd.set_option('display.max_rows', 50)
pd.set_option('display.max_columns', 200)

In [116]:
pwd

'/Users/stefanostselios/Desktop/nhl_roster_design-master'

In [117]:
da = pd.read_csv('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/pbp_merged.csv')
#da = pd.read_csv('/Users/kevinmongeon/Brock University/Steve Tselios - StephanosShare/out/pbp_merged.csv')
da = da.drop('Unnamed: 0', axis=1)
da = da.rename(columns={'TeamCode': 'EventTeamCode'})

- keep regular season games and relevant on-ice events in **regulation time**. Drop duplicates by season, game number, event number and event team to have one observation per event per game.

In [118]:
da = da[da['GameNumber'] <= 21230]
da = da[da['Period'] <= 3]
da = da[da['Period'] >= 1]
da = da[da['EventType']!='STOP']
da = da[da['EventType']!='EISTR']
da = da[da['EventType']!='EIEND']
da = da[da['EventType'] !='FIGHT']
da = da.dropna(subset=['EventNumber'])
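
The filters in the cell above can also be expressed as one boolean mask. The following is a minimal sketch on a made-up `events` frame: the column names follow the notebook, while the rows and the `drop_types` and `kept` names are assumptions for illustration. It also shows the de-duplication described in the bullet.

```python
import pandas as pd

# Toy play-by-play rows; all values are invented for illustration.
events = pd.DataFrame({
    'Season':        [2010, 2010, 2010, 2010, 2010, 2010],
    'GameNumber':    [20001, 20001, 20001, 20001, 20001, 30001],
    'EventNumber':   [1, 2, 3, 3, 400, 5],
    'Period':        [1, 1, 1, 1, 4, 1],
    'EventType':     ['FAC', 'STOP', 'HIT', 'HIT', 'SHOT', 'SHOT'],
    'EventTeamCode': ['MTL', None, 'TOR', 'TOR', 'MTL', 'BOS'],
})

drop_types = ['STOP', 'EISTR', 'EIEND', 'FIGHT']  # event types excluded in the notebook

mask = (
    (events['GameNumber'] <= 21230)          # same game-number cut-off as the notebook
    & events['Period'].between(1, 3)         # regulation periods only
    & ~events['EventType'].isin(drop_types)  # keep relevant event types
    & events['EventNumber'].notna()
)

# One observation per event per game.
kept = (events[mask]
        .drop_duplicates(['Season', 'GameNumber', 'EventNumber', 'EventTeamCode'])
        .reset_index(drop=True))
print(kept)
```

Only the faceoff and one copy of the hit survive in this toy frame; the stoppage, the period-4 shot, the out-of-range game and the duplicated hit are removed.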

In [119]:
da.head()

Unnamed: 0,Season,GameNumber,EventNumber,Period,AdvantageType,EventTimeFromZero,EventTimeFromTwenty,EventType,EventDetail,VPlayer1,VPosition1,VPlayer2,VPosition2,VPlayer3,VPosition3,VPlayer4,VPosition4,VPlayer5,VPosition5,VPlayer6,VPosition6,HPlayer1,HPosition1,HPlayer2,HPosition2,HPlayer3,HPosition3,HPlayer4,HPosition4,HPlayer5,HPosition5,HPlayer6,HPosition6,GameDate,VTeamCode,HTeamCode,EventTeamCode,PlayerNumber,PlayerName,ShotType,ShotResult,Zone,Length,PenaltyType
0,2010,20001,1,1,,0,1200,FAC,MTL won Neu. Zone - MTL #11 GOMEZ vs TOR #37 B...,11,C,21.0,R,57.0,L,26.0,D,75.0,D,31.0,G,37,C,9.0,R,11.0,L,3.0,D,22.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,11.0,GOMEZ,,,N,,
1,2010,20001,3,1,EV,15,1185,HIT,"TOR #37 BRENT HIT MTL #26 GORGES, Off. Zone",11,C,21.0,R,57.0,L,26.0,D,75.0,D,31.0,G,37,C,9.0,R,11.0,L,3.0,D,22.0,D,35.0,G,2010-10-07,MTL,TOR,TOR,37.0,BRENT,,,O,,
2,2010,20001,4,1,EV,46,1154,HIT,"MTL #14 PLEKANEC HIT TOR #2 SCHENN, Off. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,14.0,PLEKANEC,,,O,,
3,2010,20001,5,1,EV,57,1143,HIT,"MTL #76 SUBBAN HIT TOR #15 KABERLE, Neu. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,76.0,SUBBAN,,,N,,
4,2010,20001,6,1,EV,69,1131,GIVE,"TOR GIVEAWAY - #35 GIGUERE, Def. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,TOR,35.0,GIGUERE,,,D,,


In [120]:
da.shape

(310113, 44)

- create a goal dataframe (dg) that numbers the goals within each game.

In [121]:
df = da[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'VTeamCode', 'HTeamCode', 'EventTeamCode']]
# take an explicit copy so the assignments below do not write to a slice of df
dg = df[df['EventType'] == 'GOAL'].copy()
dg['Goal'] = 1  # every row kept in dg is a goal
# running goal count within each game
dg['GoalNumber'] = dg.groupby(['Season', 'GameNumber']).cumcount()+1
dg = dg[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber']]


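- merge dg onto df to attach the goal number to each goal event. Backward-fill advantage type within each season and game, and goal number within each season, game and period, so that every event carries the number of the next goal in its period.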

In [122]:
df = pd.merge(df, dg, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode'], how='left')
# groupby(...).bfill() keeps the original row index, so the result aligns with df
df['AdvantageType'] = df.groupby(['Season', 'GameNumber'])['AdvantageType'].bfill()
df['GoalNumber'] = df.groupby(['Season', 'GameNumber', 'Period'])['GoalNumber'].bfill()
df.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber
0,2010,20001,1,EV,1,FAC,0,MTL,TOR,MTL,1.0
1,2010,20001,3,EV,1,HIT,15,MTL,TOR,TOR,1.0
2,2010,20001,4,EV,1,HIT,46,MTL,TOR,MTL,1.0
3,2010,20001,5,EV,1,HIT,57,MTL,TOR,MTL,1.0
4,2010,20001,6,EV,1,GIVE,69,MTL,TOR,TOR,1.0
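
The goal numbering and the backward fill can be traced on a small example. This is a sketch under assumptions: the `ev`, `goals` and `out` frames and their rows are invented, and only the column names come from the notebook.

```python
import pandas as pd

# Toy events for one game; values are invented for illustration.
ev = pd.DataFrame({
    'Season':      [2010] * 6,
    'GameNumber':  [20001] * 6,
    'Period':      [1, 1, 1, 1, 2, 2],
    'EventNumber': [1, 2, 3, 4, 5, 6],
    'EventType':   ['FAC', 'SHOT', 'GOAL', 'HIT', 'SHOT', 'GOAL'],
})

# Number the goals within each game (1, 2, ...).
goals = ev[ev['EventType'] == 'GOAL'].copy()
goals['GoalNumber'] = goals.groupby(['Season', 'GameNumber']).cumcount() + 1

# Attach the goal number to the goal rows, then pull it backwards so every
# event carries the number of the next goal scored in the same period.
keys = ['Season', 'GameNumber', 'EventNumber']
out = ev.merge(goals[keys + ['GoalNumber']], on=keys, how='left')
out['GoalNumber'] = out.groupby(['Season', 'GameNumber', 'Period'])['GoalNumber'].bfill()
print(out)
```

Event 4 keeps a missing `GoalNumber` because no goal follows it in its period; rows like this are the ones a later `dropna(subset=['GoalNumber'])` removes.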


## all on-ice events prior to a goal

- display the home goal number and visitor goal number by game number and season. Keep all on-ice events that happened prior to a goal when the score differential was between -1 and 1. Exclude all other events.

In [123]:
dz = dg[dg['EventTeamCode'] == dg['HTeamCode']].copy()  # goals scored by the home team
dz['HGoalNumber'] = dz.groupby(['Season', 'GameNumber']).cumcount()+1
dy = dg[dg['EventTeamCode'] == dg['VTeamCode']].copy()  # goals scored by the visiting team
dy['VGoalNumber'] = dy.groupby(['Season', 'GameNumber']).cumcount()+1


- merge visitor goal number dataframe (dy) and home goal number dataframe (dz) onto goal dataframe (dg). 

In [124]:
dg = pd.merge(dg, dy, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber'], how='left')
dg = pd.merge(dg, dz, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber'], how='left')
dg.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,EventTeamCode,VTeamCode,HTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,35,EV,1,GOAL,402,TOR,MTL,TOR,1,,1.0
1,2010,20001,49,EV,1,GOAL,537,TOR,MTL,TOR,2,,2.0
2,2010,20001,68,EV,1,GOAL,739,MTL,MTL,TOR,3,1.0,
3,2010,20001,223,EV,3,GOAL,96,TOR,MTL,TOR,4,,3.0
4,2010,20001,232,EV,3,GOAL,148,MTL,MTL,TOR,5,2.0,


- forward fill home goal number and visitor goal number by season and game number. Fill in 'NaN' values with zero for home and visitor goal number.

In [125]:
# groupby(...).ffill() keeps the original row index, so the result aligns with dg
dg['HGoalNumber'] = dg.groupby(['Season', 'GameNumber'])['HGoalNumber'].ffill()
dg['VGoalNumber'] = dg.groupby(['Season', 'GameNumber'])['VGoalNumber'].ffill()
dg['VGoalNumber'] = dg['VGoalNumber'].fillna(0)
dg['HGoalNumber'] = dg['HGoalNumber'].fillna(0)
dg.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,EventTeamCode,VTeamCode,HTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,35,EV,1,GOAL,402,TOR,MTL,TOR,1,0.0,1.0
1,2010,20001,49,EV,1,GOAL,537,TOR,MTL,TOR,2,0.0,2.0
2,2010,20001,68,EV,1,GOAL,739,MTL,MTL,TOR,3,1.0,2.0
3,2010,20001,223,EV,3,GOAL,96,TOR,MTL,TOR,4,1.0,3.0
4,2010,20001,232,EV,3,GOAL,148,MTL,MTL,TOR,5,2.0,3.0
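
The split, merge and forward-fill sequence above yields a running score for each side. An alternative way to reach the same numbers, not the notebook's method, is a grouped cumulative sum over indicator columns. The sketch below uses the five goals of game 20001 shown above; the `g`, `home_scored` and `visitor_scored` names are assumptions.

```python
import pandas as pd

# Goal rows of game 20001 as displayed above, in chronological order.
g = pd.DataFrame({
    'Season':        [2010] * 5,
    'GameNumber':    [20001] * 5,
    'EventTeamCode': ['TOR', 'TOR', 'MTL', 'TOR', 'MTL'],
    'VTeamCode':     ['MTL'] * 5,
    'HTeamCode':     ['TOR'] * 5,
})

# 1 when the home (visiting) team scored the goal, 0 otherwise.
g['home_scored'] = (g['EventTeamCode'] == g['HTeamCode']).astype(int)
g['visitor_scored'] = (g['EventTeamCode'] == g['VTeamCode']).astype(int)

# Running totals within each game.
by_game = ['Season', 'GameNumber']
g['HGoalNumber'] = g.groupby(by_game)['home_scored'].cumsum()
g['VGoalNumber'] = g.groupby(by_game)['visitor_scored'].cumsum()
print(g[['EventTeamCode', 'HGoalNumber', 'VGoalNumber']])
```

The cumulative sums are integers and never missing, so the `fillna(0)` step is not needed in this variant.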


- build the event dataframe (dk) from da, merge the goal dataframe (dg) onto it, and backward-fill advantage type, goal number, home goal number and visitor goal number.

In [126]:
dk = da[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'VTeamCode', 'HTeamCode', 'EventTeamCode']]
dk = pd.merge(dk, dg, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode'], how='left')
# groupby(...).bfill() keeps the original row index, so the result aligns with dk
dk['AdvantageType'] = dk.groupby(['Season', 'GameNumber'])['AdvantageType'].bfill()
dk['GoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['GoalNumber'].bfill()
dk['HGoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['HGoalNumber'].bfill()
dk['VGoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['VGoalNumber'].bfill()
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,1,EV,1,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0
1,2010,20001,3,EV,1,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0
2,2010,20001,4,EV,1,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0
3,2010,20001,5,EV,1,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0
4,2010,20001,6,EV,1,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0


### **even strength situations only!!**

In [127]:
#dk = dk[dk['AdvantageType'] == 'EV']

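- compute the goal differential (GD) for each event from the perspective of the event team: home minus visitor goals when the event team is the home team, visitor minus home goals otherwise.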

In [128]:
dk['GD'] = dk.apply(lambda x: x['HGoalNumber'] - x['VGoalNumber'] if (x['EventTeamCode'] == x['HTeamCode']) else x['VGoalNumber'] - x['HGoalNumber'], axis=1)
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,GD
0,2010,20001,1,EV,1,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
1,2010,20001,3,EV,1,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,1.0
2,2010,20001,4,EV,1,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
3,2010,20001,5,EV,1,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
4,2010,20001,6,EV,1,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,1.0
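
The row-wise `apply` used for `GD` can be slow on a frame of this size. A vectorised equivalent with `np.where` is sketched below on toy rows; the frame `t` and the helper names `home_margin` and `is_home_event` are assumptions.

```python
import numpy as np
import pandas as pd

# Three toy rows laid out like dk; the values are invented.
t = pd.DataFrame({
    'EventTeamCode': ['MTL', 'TOR', 'MTL'],
    'HTeamCode':     ['TOR', 'TOR', 'TOR'],
    'HGoalNumber':   [1.0, 1.0, 2.0],
    'VGoalNumber':   [0.0, 0.0, 3.0],
})

home_margin = t['HGoalNumber'] - t['VGoalNumber']
is_home_event = t['EventTeamCode'] == t['HTeamCode']

# Home events see the home margin, visitor events see its negative.
t['GD'] = np.where(is_home_event, home_margin, -home_margin)
print(t)
```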


In [129]:
dk.shape

(310113, 14)

- On-ice events that occurred in a different period from a goal or after a goal are excluded from the dataframe.

In [130]:
dk = dk.dropna(subset=['GoalNumber'])
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.drop_duplicates(['Season', 'GameNumber', 'EventNumber', 'EventTeamCode'])
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,GD
0,2010,20001,1,EV,1,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
1,2010,20001,3,EV,1,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,1.0
2,2010,20001,4,EV,1,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
3,2010,20001,5,EV,1,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,-1.0
4,2010,20001,6,EV,1,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,1.0


In [131]:
dk.shape

(178759, 14)

- Assign a value of 1 if an on-ice event is a goal, NaN if not. Follow the same procedure for block, faceoff, giveaway, hit, miss, penalty, shot and takeaway. Group by season, game number, event team, event type and goal number to find the sum of each on-ice event per team over the events leading up to each goal.

In [132]:
dk['Goal'] = dk.apply(lambda x: 1 if (x['EventType'] == 'GOAL') else np.nan, axis=1)
dk['Block'] = dk.apply(lambda x: 1 if (x['EventType'] == 'BLOCK') else np.nan, axis=1)
dk['Faceoff'] = dk.apply(lambda x: 1 if (x['EventType'] == 'FAC') else np.nan, axis=1)
dk['Giveaway'] = dk.apply(lambda x: 1 if (x['EventType'] == 'GIVE') else np.nan, axis=1)
dk['Hit'] = dk.apply(lambda x: 1 if (x['EventType'] == 'HIT') else np.nan, axis=1)
dk['Miss'] = dk.apply(lambda x: 1 if (x['EventType'] == 'MISS') else np.nan, axis=1)
dk['Penalty'] = dk.apply(lambda x: 1 if (x['EventType'] == 'PENL') else np.nan, axis=1)
dk['Shot'] = dk.apply(lambda x: 1 if (x['EventType'] == 'SHOT') else np.nan, axis=1)
dk['Takeaway'] = dk.apply(lambda x: 1 if (x['EventType'] == 'TAKE') else np.nan, axis=1)

In [133]:
dk['Blocks'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Block'].transform('sum')
dk['Faceoffs'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Faceoff'].transform('sum')
dk['Giveaways'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Giveaway'].transform('sum')
dk['Goals'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Goal'].transform('sum')
dk['Hits'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Hit'].transform('sum')
dk['Misses'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Miss'].transform('sum')
dk['Penalties'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Penalty'].transform('sum')
dk['Shots'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Shot'].transform('sum')
dk['Takeaways'] = dk.groupby(['Season','GameNumber', 'EventTeamCode', 'EventType', 'GoalNumber'])['Takeaway'].transform('sum')

In [134]:
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,GD,Goal,Block,Faceoff,Giveaway,Hit,Miss,Penalty,Shot,Takeaway,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways
0,2010,20001,1,EV,1,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,-1.0,,,1.0,,,,,,,,2.0,,,,,,,
1,2010,20001,3,EV,1,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,1.0,,,,,1.0,,,,,,,,,3.0,,,,
2,2010,20001,4,EV,1,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,-1.0,,,,,1.0,,,,,,,,,7.0,,,,
3,2010,20001,5,EV,1,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,-1.0,,,,,1.0,,,,,,,,,7.0,,,,
4,2010,20001,6,EV,1,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,1.0,,,,1.0,,,,,,,,2.0,,,,,,


In [135]:
dk.shape

(178759, 32)
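
The indicator columns and grouped sums above amount to counting events by type for each team in the run-up to each goal. The sketch below shows a more compact route to the same counts with `groupby(...).size().unstack()`. It is an alternative formulation, not the notebook's code, and the `seg` and `counts` names and all rows are made up.

```python
import pandas as pd

# Toy events leading up to the first goal of one game.
seg = pd.DataFrame({
    'Season':        [2010] * 6,
    'GameNumber':    [20001] * 6,
    'GoalNumber':    [1.0] * 6,
    'EventTeamCode': ['MTL', 'TOR', 'MTL', 'MTL', 'TOR', 'TOR'],
    'EventType':     ['FAC', 'HIT', 'HIT', 'HIT', 'SHOT', 'GOAL'],
})

keys = ['Season', 'GameNumber', 'GoalNumber', 'EventTeamCode']

# One row per team and goal number, one column per event type.
counts = (seg.groupby(keys + ['EventType'])
             .size()
             .unstack('EventType', fill_value=0))
print(counts)
```

Unlike the transform-based columns, this result has one row per team and goal number rather than one row per event, and event types that never occur in the data get no column.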

- reshape data wide to long.

In [136]:
dk = dk.rename(columns={'EventTeamCode': 'EventTeam'})
a = [col for col in dk.columns if 'TeamCode' in col]
dk = pd.lreshape(dk, {'TeamCode' : a})
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.rename(columns={'EventTeam': 'EventTeamCode'})
dk.head()

Unnamed: 0,AdvantageType,Block,Blocks,EventNumber,EventTeamCode,EventTimeFromZero,EventType,Faceoff,Faceoffs,GD,GameNumber,Giveaway,Giveaways,Goal,GoalNumber,Goals,HGoalNumber,Hit,Hits,Miss,Misses,Penalties,Penalty,Period,Season,Shot,Shots,Takeaway,Takeaways,VGoalNumber,TeamCode
0,EV,,,1,MTL,0,FAC,1.0,2.0,-1.0,20001,,,,1.0,,1.0,,,,,,,1,2010,,,,,0.0,MTL
178759,EV,,,1,MTL,0,FAC,1.0,2.0,-1.0,20001,,,,1.0,,1.0,,,,,,,1,2010,,,,,0.0,TOR
1,EV,,,3,TOR,15,HIT,,,1.0,20001,,,,1.0,,1.0,1.0,3.0,,,,,1,2010,,,,,0.0,MTL
178760,EV,,,3,TOR,15,HIT,,,1.0,20001,,,,1.0,,1.0,1.0,3.0,,,,,1,2010,,,,,0.0,TOR
2,EV,,,4,MTL,46,HIT,,,-1.0,20001,,,,1.0,,1.0,1.0,7.0,,,,,1,2010,,,,,0.0,MTL
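
The reshape duplicates each event so that it appears once for the visiting team and once for the home team. The same idea is sketched below with `DataFrame.melt`, which is swapped in here for `pd.lreshape` purely for illustration; the `wide` and `long_df` frames and the `Side` column are assumptions.

```python
import pandas as pd

# Two toy events with the visitor and home team codes in separate columns.
wide = pd.DataFrame({
    'GameNumber':    [20001, 20001],
    'EventNumber':   [1, 3],
    'EventTeamCode': ['MTL', 'TOR'],
    'VTeamCode':     ['MTL', 'MTL'],
    'HTeamCode':     ['TOR', 'TOR'],
})

# Stack VTeamCode and HTeamCode into a single TeamCode column.
long_df = (wide.melt(id_vars=['GameNumber', 'EventNumber', 'EventTeamCode'],
                     value_vars=['VTeamCode', 'HTeamCode'],
                     var_name='Side', value_name='TeamCode')
               .sort_values(['GameNumber', 'EventNumber'])
               .reset_index(drop=True))
print(long_df)
```

`melt` also records which source column each row came from (`Side`), which `lreshape` does not keep.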


- drop duplicates by season, game number, team code, event team code, event type and goal number.

In [137]:
dk = dk.drop_duplicates(['Season', 'GameNumber', 'TeamCode', 'EventTeamCode', 'EventType', 'GoalNumber'])
dk = dk[['Season', 'GameNumber', 'AdvantageType', 'Period', 'TeamCode', 'EventNumber', 'EventType', 'EventTeamCode', 'GoalNumber', 'GD', 'Blocks', 'Faceoffs', 'Giveaways', 'Goals', 'Hits', 'Misses', 'Penalties', 'Shots', 'Takeaways']]
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk.shape

(138546, 19)

- assign all on-ice events to their respective teams. If team code is the same as event team code, then the on-ice event is assigned to that team. If not, it is assigned to the opposing team. Each on-ice event generates two variables per team: For (F) and Against (A).

In [138]:
dk['Blocks_F'] = dk.apply(lambda x: x['Blocks'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Blocks_A'] = dk.apply(lambda x: x['Blocks'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Faceoffs_F'] = dk.apply(lambda x: x['Faceoffs'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Faceoffs_A'] = dk.apply(lambda x: x['Faceoffs'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Giveaways_F'] = dk.apply(lambda x: x['Giveaways'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Giveaways_A'] = dk.apply(lambda x: x['Giveaways'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Goals_F'] = dk.apply(lambda x: x['Goals'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Goals_A'] = dk.apply(lambda x: x['Goals'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Hits_F'] = dk.apply(lambda x: x['Hits'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Hits_A'] = dk.apply(lambda x: x['Hits'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Miss_F'] = dk.apply(lambda x: x['Misses'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Miss_A'] = dk.apply(lambda x: x['Misses'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Penalties_F'] = dk.apply(lambda x: x['Penalties'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Penalties_A'] = dk.apply(lambda x: x['Penalties'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Shots_F'] = dk.apply(lambda x: x['Shots'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Shots_A'] = dk.apply(lambda x: x['Shots'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Takeaways_F'] = dk.apply(lambda x: x['Takeaways'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Takeaways_A'] = dk.apply(lambda x: x['Takeaways'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk.head()

Unnamed: 0,Season,GameNumber,AdvantageType,Period,TeamCode,EventNumber,EventType,EventTeamCode,GoalNumber,GD,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,EV,1,MTL,1,FAC,MTL,1.0,-1.0,,2.0,,,,,,,,,,2.0,,,,,,,,,,,,,,,
178759,2010,20001,EV,1,TOR,1,FAC,MTL,1.0,-1.0,,2.0,,,,,,,,,,,2.0,,,,,,,,,,,,,,
1,2010,20001,EV,1,MTL,3,HIT,TOR,1.0,1.0,,,,,3.0,,,,,,,,,,,,,,3.0,,,,,,,,
178760,2010,20001,EV,1,TOR,3,HIT,TOR,1.0,1.0,,,,,3.0,,,,,,,,,,,,,3.0,,,,,,,,,
2,2010,20001,EV,1,MTL,4,HIT,MTL,1.0,-1.0,,,,,7.0,,,,,,,,,,,,,7.0,,,,,,,,,
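
The For/Against assignment can be written without a row-wise `apply` by masking with `Series.where`. The sketch below uses one event type on toy rows; the frame `d` and the `own_event` name are assumptions.

```python
import pandas as pd

# Each event appears once per team after the reshape; values are invented.
d = pd.DataFrame({
    'TeamCode':      ['MTL', 'TOR', 'MTL', 'TOR'],
    'EventTeamCode': ['MTL', 'MTL', 'TOR', 'TOR'],
    'Hits':          [7.0, 7.0, 3.0, 3.0],
})

own_event = d['TeamCode'] == d['EventTeamCode']

d['Hits_F'] = d['Hits'].where(own_event)    # kept when the team made the hit
d['Hits_A'] = d['Hits'].where(~own_event)   # kept when the opponent made the hit
print(d)
```

Rows that fail the condition become NaN, matching the `np.nan` branch of the original lambdas.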


- forward and backward fill the For and Against counts by season, game number, team code and goal number, then replace the remaining NaN values with zero.

In [139]:
fill_keys = ['Season', 'GameNumber', 'TeamCode', 'GoalNumber']
fill_cols = ['Blocks_F', 'Faceoffs_F', 'Giveaways_F', 'Goals_F', 'Hits_F', 'Miss_F', 'Penalties_F', 'Shots_F', 'Takeaways_F',
             'Blocks_A', 'Faceoffs_A', 'Giveaways_A', 'Goals_A', 'Hits_A', 'Miss_A', 'Penalties_A', 'Shots_A', 'Takeaways_A']
# spread each count over all rows of the same team and goal number;
# transform keeps the original row index, so the result aligns with dk
for col in fill_cols:
    dk[col] = dk.groupby(fill_keys)[col].transform(lambda x: x.ffill().bfill())
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.fillna(0)
dk.head()

Unnamed: 0,Season,GameNumber,AdvantageType,Period,TeamCode,EventNumber,EventType,EventTeamCode,GoalNumber,GD,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,EV,1,MTL,1,FAC,MTL,1.0,-1.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0
178759,2010,20001,EV,1,TOR,1,FAC,MTL,1.0,-1.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0
1,2010,20001,EV,1,MTL,3,HIT,TOR,1.0,1.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0
178760,2010,20001,EV,1,TOR,3,HIT,TOR,1.0,1.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0
2,2010,20001,EV,1,MTL,4,HIT,MTL,1.0,-1.0,0.0,0.0,0.0,0.0,7.0,0.0,0.0,0.0,0.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0
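
What the forward-then-backward fill does within a group is easiest to see on a few rows. A minimal sketch, assuming a toy frame `s` with invented values:

```python
import numpy as np
import pandas as pd

# Each count is present on a single row of its group before filling.
s = pd.DataFrame({
    'TeamCode':   ['MTL', 'MTL', 'MTL', 'TOR', 'TOR', 'TOR'],
    'GoalNumber': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    'Hits_F':     [np.nan, 7.0, np.nan, 3.0, np.nan, np.nan],
    'Shots_F':    [np.nan, np.nan, np.nan, np.nan, 3.0, np.nan],
})

keys = ['TeamCode', 'GoalNumber']
for col in ['Hits_F', 'Shots_F']:
    # ffill covers rows after the value, bfill covers rows before it
    s[col] = s.groupby(keys)[col].transform(lambda x: x.ffill().bfill())

# Groups with no value at all (MTL shots here) end up as zero.
s = s.fillna(0)
print(s)
```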


- keep only relevant columns and drop duplicates by season, game number, team code, goal number and goal differential, leaving one observation per team for each goal number and goal differential within a game.

In [140]:
dk = dk[['Season', 'GameNumber', 'TeamCode', 'GoalNumber', 'GD', 'Blocks_F', 'Blocks_A', 'Faceoffs_F', 'Faceoffs_A', 'Giveaways_F', 'Giveaways_A', 'Goals_F', 'Goals_A', 'Hits_F', 'Hits_A', 'Miss_F', 'Miss_A', 'Penalties_F', 'Penalties_A', 'Shots_F', 'Shots_A', 'Takeaways_F', 'Takeaways_A']]
dk = dk.sort_values(['Season', 'GameNumber'], ascending=[True, True])
dk = dk.drop_duplicates(['Season', 'GameNumber', 'TeamCode', 'GoalNumber', 'GD'])
dk.head()

Unnamed: 0,Season,GameNumber,TeamCode,GoalNumber,GD,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,MTL,1.0,-1.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0
178759,2010,20001,TOR,1.0,-1.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0
1,2010,20001,MTL,1.0,1.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0
178760,2010,20001,TOR,1.0,1.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0
32,2010,20001,MTL,2.0,-2.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,1.0,1.0,0.0,0.0,0.0,1.0,1.0,1.0


In [141]:
dk.shape

(23452, 23)

In [142]:
dk.isnull().sum()

Season         0
GameNumber     0
TeamCode       0
GoalNumber     0
GD             0
Blocks_F       0
Blocks_A       0
Faceoffs_F     0
Faceoffs_A     0
Giveaways_F    0
Giveaways_A    0
Goals_F        0
Goals_A        0
Hits_F         0
Hits_A         0
Miss_F         0
Miss_A         0
Penalties_F    0
Penalties_A    0
Shots_F        0
Shots_A        0
Takeaways_F    0
Takeaways_A    0
dtype: int64

- group by season, team code and goal differential to compute the season total of each on-ice event at each score differential. The columns keep the `M` prefix from the earlier mean-based version, which is left commented out below.

In [143]:
#dk['MBlocks_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Blocks_F'].transform('mean')
#dk['MFaceoffs_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Faceoffs_F'].transform('mean')
#dk['MGiveaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_F'].transform('mean')
#dk['MGoals_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Goals_F'].transform('mean')
#dk['MHits_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Hits_F'].transform('mean')
#dk['MMiss_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Miss_F'].transform('mean')
#dk['MPenalties_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Penalties_F'].transform('mean')
#dk['MShots_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Shots_F'].transform('mean')
#dk['MTakeaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Takeaways_F'].transform('mean')
#dk['MBlocks_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Blocks_A'].transform('mean')
#dk['MFaceoffs_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Faceoffs_A'].transform('mean')
#dk['MGiveaways_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_A'].transform('mean')
#dk['MGoals_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Goals_A'].transform('mean')
#dk['MHits_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Hits_A'].transform('mean')
#dk['MMiss_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Miss_A'].transform('mean')
#dk['MPenalties_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Penalties_A'].transform('mean')
#dk['MShots_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Shots_A'].transform('mean')
#dk['MTakeaways_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Takeaways_A'].transform('mean')
#dk.head()

In [144]:
dk['MBlocks_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Blocks_F'].transform('sum')
dk['MFaceoffs_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Faceoffs_F'].transform('sum')
dk['MGiveaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_F'].transform('sum')
dk['MGoals_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Goals_F'].transform('sum')
dk['MHits_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Hits_F'].transform('sum')
dk['MMiss_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Miss_F'].transform('sum')
dk['MPenalties_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Penalties_F'].transform('sum')
dk['MShots_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Shots_F'].transform('sum')
dk['MTakeaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Takeaways_F'].transform('sum')
dk['MBlocks_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Blocks_A'].transform('sum')
dk['MFaceoffs_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Faceoffs_A'].transform('sum')
dk['MGiveaways_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_A'].transform('sum')
dk['MGoals_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Goals_A'].transform('sum')
dk['MHits_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Hits_A'].transform('sum')
dk['MMiss_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Miss_A'].transform('sum')
dk['MPenalties_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Penalties_A'].transform('sum')
dk['MShots_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Shots_A'].transform('sum')
dk['MTakeaways_A'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Takeaways_A'].transform('sum')
dk.head()

Unnamed: 0,Season,GameNumber,TeamCode,GoalNumber,GD,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A
0,2010,20001,MTL,1.0,-1.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0,294.0,592.0,218.0,84.0,472.0,274.0,101.0,643.0,137.0,324.0,589.0,178.0,84.0,497.0,219.0,95.0,538.0,157.0
178759,2010,20001,TOR,1.0,-1.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0,326.0,603.0,242.0,84.0,519.0,226.0,86.0,488.0,150.0,342.0,545.0,231.0,99.0,526.0,311.0,89.0,606.0,158.0
1,2010,20001,MTL,1.0,1.0,3.0,4.0,2.0,1.0,3.0,2.0,0.0,1.0,7.0,3.0,1.0,0.0,0.0,0.0,2.0,3.0,0.0,0.0,294.0,593.0,218.0,85.0,472.0,275.0,101.0,643.0,137.0,324.0,590.0,178.0,85.0,497.0,219.0,95.0,539.0,157.0
178760,2010,20001,TOR,1.0,1.0,4.0,3.0,1.0,2.0,2.0,3.0,1.0,0.0,3.0,7.0,0.0,1.0,0.0,0.0,3.0,2.0,0.0,0.0,326.0,606.0,242.0,86.0,520.0,226.0,86.0,489.0,150.0,342.0,546.0,231.0,99.0,526.0,311.0,89.0,604.0,158.0
32,2010,20001,MTL,2.0,-2.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,1.0,1.0,0.0,0.0,0.0,1.0,1.0,1.0,173.0,344.0,120.0,51.0,259.0,132.0,58.0,331.0,66.0,170.0,365.0,95.0,47.0,258.0,134.0,63.0,374.0,71.0


- drop duplicates by season, team code and goal differential.

In [145]:
dk = dk.drop_duplicates(['Season', 'TeamCode', 'GD'])
dk = dk[['Season', 'TeamCode', 'GD','MBlocks_F', 'MFaceoffs_F', 'MGiveaways_F', 'MGoals_F','MHits_F', 'MMiss_F', 'MPenalties_F', 'MShots_F', 'MTakeaways_F','MBlocks_A', 'MFaceoffs_A', 'MGiveaways_A', 'MGoals_A', 'MHits_A','MMiss_A', 'MPenalties_A', 'MShots_A', 'MTakeaways_A']]
dk.head()

Unnamed: 0,Season,TeamCode,GD,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A
0,2010,MTL,-1.0,294.0,592.0,218.0,84.0,472.0,274.0,101.0,643.0,137.0,324.0,589.0,178.0,84.0,497.0,219.0,95.0,538.0,157.0
178759,2010,TOR,-1.0,326.0,603.0,242.0,84.0,519.0,226.0,86.0,488.0,150.0,342.0,545.0,231.0,99.0,526.0,311.0,89.0,606.0,158.0
1,2010,MTL,1.0,294.0,593.0,218.0,85.0,472.0,275.0,101.0,643.0,137.0,324.0,590.0,178.0,85.0,497.0,219.0,95.0,539.0,157.0
178760,2010,TOR,1.0,326.0,606.0,242.0,86.0,520.0,226.0,86.0,489.0,150.0,342.0,546.0,231.0,99.0,526.0,311.0,89.0,604.0,158.0
32,2010,MTL,-2.0,173.0,344.0,120.0,51.0,259.0,132.0,58.0,331.0,66.0,170.0,365.0,95.0,47.0,258.0,134.0,63.0,374.0,71.0
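
The `transform('sum')` followed by `drop_duplicates` produces one row per season, team and goal differential. A plain grouped aggregation reaches the same shape in one step. The sketch below is an alternative formulation on toy data; the `seg` and `totals` names and all values are assumptions.

```python
import pandas as pd

# Toy per-game rows for two teams; values are invented.
seg = pd.DataFrame({
    'Season':   [2010, 2010, 2010, 2010],
    'TeamCode': ['MTL', 'MTL', 'MTL', 'TOR'],
    'GD':       [-1.0, -1.0, 1.0, 1.0],
    'Hits_F':   [7.0, 2.0, 4.0, 3.0],
    'Hits_A':   [3.0, 1.0, 5.0, 7.0],
})

# One row per (Season, TeamCode, GD) with the summed counts.
totals = (seg.groupby(['Season', 'TeamCode', 'GD'], as_index=False)[['Hits_F', 'Hits_A']]
             .sum()
             .rename(columns={'Hits_F': 'MHits_F', 'Hits_A': 'MHits_A'}))
print(totals)
```

The aggregated result is sorted by the grouping keys, whereas the `drop_duplicates` version keeps the order of first appearance.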


### summary analysis

In [146]:
#dk['TBlocks_F'] = dk.groupby(['Season', 'GD'])['MBlocks_F'].transform('mean')
#dk['TFaceoffs_F'] = dk.groupby(['Season', 'GD'])['MFaceoffs_F'].transform('mean')
#dk['TGiveaways_F'] = dk.groupby(['Season', 'GD'])['MGiveaways_F'].transform('mean')
#dk['TGoals_F'] = dk.groupby(['Season', 'GD'])['MGoals_F'].transform('mean')
#dk['THits_F'] = dk.groupby(['Season', 'GD'])['MHits_F'].transform('mean')
#dk['TMiss_F'] = dk.groupby(['Season', 'GD'])['MMiss_F'].transform('mean')
#dk['TPenalties_F'] = dk.groupby(['Season', 'GD'])['MPenalties_F'].transform('mean')
#dk['TShots_F'] = dk.groupby(['Season', 'GD'])['MShots_F'].transform('mean')
#dk['TTakeaways_F'] = dk.groupby(['Season', 'GD'])['MTakeaways_F'].transform('mean')
#dk['TBlocks_A'] = dk.groupby(['Season', 'GD'])['MBlocks_A'].transform('mean')
#dk['TFaceoffs_A'] = dk.groupby(['Season', 'GD'])['MFaceoffs_A'].transform('mean')
#dk['TGiveaways_A'] = dk.groupby(['Season', 'GD'])['MGiveaways_A'].transform('mean')
#dk['TGoals_A'] = dk.groupby(['Season', 'GD'])['MGoals_A'].transform('mean')
#dk['THits_A'] = dk.groupby(['Season', 'GD'])['MHits_A'].transform('mean')
#dk['TMiss_A'] = dk.groupby(['Season', 'GD'])['MMiss_A'].transform('mean')
#dk['TPenalties_A'] = dk.groupby(['Season', 'GD'])['MPenalties_A'].transform('mean')
#dk['TShots_A'] = dk.groupby(['Season', 'GD'])['MShots_A'].transform('mean')
#dk['TTakeaways_A'] = dk.groupby(['Season', 'GD'])['MTakeaways_A'].transform('mean')
#dk.head()

In [147]:
# Total each event count within every Season/GD group, for (F) and against (A).
# transform('sum') repeats the group total on every row of the group.
events = ['Blocks', 'Faceoffs', 'Giveaways', 'Goals', 'Hits', 'Miss', 'Penalties', 'Shots', 'Takeaways']
for side in ['F', 'A']:
    for event in events:
        dk[f'T{event}_{side}'] = dk.groupby(['Season', 'GD'])[f'M{event}_{side}'].transform('sum')
dk.head()

Unnamed: 0,Season,TeamCode,GD,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A,TBlocks_F,TFaceoffs_F,TGiveaways_F,TGoals_F,THits_F,TMiss_F,TPenalties_F,TShots_F,TTakeaways_F,TBlocks_A,TFaceoffs_A,TGiveaways_A,TGoals_A,THits_A,TMiss_A,TPenalties_A,TShots_A,TTakeaways_A
0,2010,MTL,-1.0,294.0,592.0,218.0,84.0,472.0,274.0,101.0,643.0,137.0,324.0,589.0,178.0,84.0,497.0,219.0,95.0,538.0,157.0,8703.0,17424.0,5230.0,2678.0,14605.0,7086.0,2603.0,17005.0,4457.0,8703.0,17424.0,5230.0,2678.0,14605.0,7086.0,2603.0,17005.0,4457.0
178759,2010,TOR,-1.0,326.0,603.0,242.0,84.0,519.0,226.0,86.0,488.0,150.0,342.0,545.0,231.0,99.0,526.0,311.0,89.0,606.0,158.0,8703.0,17424.0,5230.0,2678.0,14605.0,7086.0,2603.0,17005.0,4457.0,8703.0,17424.0,5230.0,2678.0,14605.0,7086.0,2603.0,17005.0,4457.0
1,2010,MTL,1.0,294.0,593.0,218.0,85.0,472.0,275.0,101.0,643.0,137.0,324.0,590.0,178.0,85.0,497.0,219.0,95.0,539.0,157.0,8709.0,17476.0,5235.0,2722.0,14611.0,7088.0,2603.0,17015.0,4457.0,8709.0,17476.0,5235.0,2722.0,14611.0,7088.0,2603.0,17015.0,4457.0
178760,2010,TOR,1.0,326.0,606.0,242.0,86.0,520.0,226.0,86.0,489.0,150.0,342.0,546.0,231.0,99.0,526.0,311.0,89.0,604.0,158.0,8709.0,17476.0,5235.0,2722.0,14611.0,7088.0,2603.0,17015.0,4457.0,8709.0,17476.0,5235.0,2722.0,14611.0,7088.0,2603.0,17015.0,4457.0
32,2010,MTL,-2.0,173.0,344.0,120.0,51.0,259.0,132.0,58.0,331.0,66.0,170.0,365.0,95.0,47.0,258.0,134.0,63.0,374.0,71.0,4096.0,8657.0,2367.0,1383.0,6446.0,3443.0,1389.0,8191.0,2089.0,4096.0,8657.0,2367.0,1383.0,6446.0,3443.0,1389.0,8191.0,2089.0
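
The repeated totals in the output above (every row of a Season/GD group carries the same `T*` value) follow from how `transform('sum')` works. A minimal sketch on toy data may make this clearer; the frame `toy` and its values are hypothetical and are not taken from the play-by-play data.

```python
import pandas as pd

# Toy data: hypothetical values, NOT taken from the notebook's data set.
toy = pd.DataFrame({
    'Season': [2010, 2010, 2010, 2010],
    'TeamCode': ['MTL', 'TOR', 'MTL', 'TOR'],
    'GD': [-1.0, -1.0, 1.0, 1.0],
    'MShots_F': [10.0, 20.0, 30.0, 40.0],
})

# transform('sum') returns one value per input row, so the group total
# is repeated on every row of the group and the frame keeps its shape.
toy['TShots_F'] = toy.groupby(['Season', 'GD'])['MShots_F'].transform('sum')

# A plain grouped sum, by contrast, collapses to one row per (Season, GD).
collapsed = toy.groupby(['Season', 'GD'], as_index=False)['MShots_F'].sum()

print(toy)
print(collapsed)
```

Because the totals are repeated on every row, a later `drop_duplicates` on the grouping key is enough to reduce the frame to one row per group.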


In [148]:
dk = dk[['GD', 'TBlocks_F', 'TBlocks_A', 'TFaceoffs_F', 'TFaceoffs_A',  'TGiveaways_F', 'TGiveaways_A', 'TGoals_F', 'TGoals_A', 'THits_F', 'THits_A',  'TMiss_F', 'TMiss_A', 'TPenalties_F', 'TPenalties_A', 'TShots_F', 'TShots_A', 'TTakeaways_F', 'TTakeaways_A']]
dk = dk.drop_duplicates(['GD'])
dk = dk.sort_values(['GD'], ascending=[False])
dk.set_index(['GD'])

GD,TBlocks_F,TBlocks_A,TFaceoffs_F,TFaceoffs_A,TGiveaways_F,TGiveaways_A,TGoals_F,TGoals_A,THits_F,THits_A,TMiss_F,TMiss_A,TPenalties_F,TPenalties_A,TShots_F,TShots_A,TTakeaways_F,TTakeaways_A
8.0,5.0,5.0,13.0,13.0,1.0,1.0,2.0,2.0,4.0,4.0,5.0,5.0,2.0,2.0,13.0,13.0,1.0,1.0
7.0,52.0,52.0,111.0,111.0,16.0,16.0,20.0,20.0,61.0,61.0,33.0,33.0,18.0,18.0,98.0,98.0,26.0,26.0
6.0,92.0,92.0,244.0,244.0,53.0,53.0,50.0,50.0,152.0,152.0,109.0,109.0,65.0,65.0,220.0,220.0,48.0,48.0
5.0,227.0,227.0,548.0,548.0,125.0,125.0,114.0,114.0,373.0,373.0,192.0,192.0,153.0,153.0,487.0,487.0,114.0,114.0
4.0,645.0,645.0,1411.0,1411.0,339.0,339.0,264.0,264.0,1009.0,1009.0,561.0,561.0,359.0,359.0,1266.0,1266.0,331.0,331.0
3.0,2054.0,2054.0,4459.0,4459.0,1144.0,1144.0,703.0,703.0,3007.0,3007.0,1765.0,1765.0,699.0,699.0,4129.0,4129.0,1075.0,1075.0
2.0,4105.0,4105.0,8702.0,8702.0,2368.0,2368.0,1419.0,1419.0,6452.0,6452.0,3450.0,3450.0,1390.0,1390.0,8197.0,8197.0,2092.0,2092.0
1.0,8709.0,8709.0,17476.0,17476.0,5235.0,5235.0,2722.0,2722.0,14611.0,14611.0,7088.0,7088.0,2603.0,2603.0,17015.0,17015.0,4457.0,4457.0
0.0,3388.0,3388.0,7216.0,7216.0,1854.0,1854.0,1251.0,1251.0,5527.0,5527.0,2779.0,2779.0,1102.0,1102.0,6641.0,6641.0,1745.0,1745.0
-1.0,8703.0,8703.0,17424.0,17424.0,5230.0,5230.0,2678.0,2678.0,14605.0,14605.0,7086.0,7086.0,2603.0,2603.0,17005.0,17005.0,4457.0,4457.0


In [149]:
# dk holds one row per GD at this point, so pivot_table's default aggfunc ('mean') leaves the totals unchanged.
shot_cols = ['TGoals_F', 'TGoals_A', 'TShots_F', 'TShots_A', 'TMiss_F', 'TMiss_A', 'TBlocks_F', 'TBlocks_A']
dz = pd.pivot_table(dk, values=shot_cols, index=['GD'])
dz = dz[shot_cols]  # restore the intended column order (pivot_table sorts columns alphabetically)
dz.head()

GD,TGoals_F,TGoals_A,TShots_F,TShots_A,TMiss_F,TMiss_A,TBlocks_F,TBlocks_A
-8.0,2.0,2.0,13.0,13.0,5.0,5.0,5.0,5.0
-7.0,18.0,18.0,96.0,96.0,33.0,33.0,52.0,52.0
-6.0,49.0,49.0,220.0,220.0,109.0,109.0,91.0,91.0
-5.0,106.0,106.0,485.0,485.0,192.0,192.0,226.0,226.0
-4.0,261.0,261.0,1268.0,1268.0,560.0,560.0,646.0,646.0


In [150]:
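# Minimal LaTeX wrapper so the exported table compiles as a standalone document.
beginningtex = """\\documentclass{report}
\\usepackage{booktabs}
\\begin{document}
"""
endtex = "\\end{document}\n"  # escaped backslash: "\e" is not a valid Python escape sequence

# The context manager closes the file even if a write fails.
with open('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/latex/events/sum_all_on_ice_events_1.tex', 'w') as f:
    f.write(beginningtex)
    f.write(dz.to_latex())
    f.write(endtex)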

In [151]:
# dk holds one row per GD at this point, so pivot_table's default aggfunc ('mean') leaves the totals unchanged.
other_cols = ['TFaceoffs_F', 'TFaceoffs_A', 'TPenalties_F', 'TPenalties_A', 'THits_F', 'THits_A', 'TGiveaways_F', 'TGiveaways_A', 'TTakeaways_F', 'TTakeaways_A']
dy = pd.pivot_table(dk, values=other_cols, index=['GD'])
dy = dy[other_cols]  # restore the intended column order (pivot_table sorts columns alphabetically)
dy.head()

GD,TFaceoffs_F,TFaceoffs_A,TPenalties_F,TPenalties_A,THits_F,THits_A,TGiveaways_F,TGiveaways_A,TTakeaways_F,TTakeaways_A
-8.0,13.0,13.0,2.0,2.0,4.0,4.0,1.0,1.0,1.0,1.0
-7.0,109.0,109.0,18.0,18.0,61.0,61.0,16.0,16.0,26.0,26.0
-6.0,243.0,243.0,65.0,65.0,152.0,152.0,53.0,53.0,48.0,48.0
-5.0,537.0,537.0,153.0,153.0,373.0,373.0,124.0,124.0,113.0,113.0
-4.0,1408.0,1408.0,359.0,359.0,1010.0,1010.0,340.0,340.0,331.0,331.0


In [152]:
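# Minimal LaTeX wrapper so the exported table compiles as a standalone document.
beginningtex = """\\documentclass{report}
\\usepackage{booktabs}
\\begin{document}
"""
endtex = "\\end{document}\n"  # escaped backslash: "\e" is not a valid Python escape sequence

# The context manager closes the file even if a write fails.
with open('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/latex/events/sum_all_on_ice_events_2.tex', 'w') as f:
    f.write(beginningtex)
    f.write(dy.to_latex())
    f.write(endtex)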
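
The two export cells above repeat the same preamble and file handling. One possible way to factor that out is sketched below. The helper names `wrap_latex_document` and `write_latex_table` are hypothetical and do not appear in the notebook; the sketch takes the table body as a plain string, so it does not depend on how the table was produced.

```python
def wrap_latex_document(table_body: str) -> str:
    """Wrap a LaTeX table body in a minimal standalone document.

    Hypothetical helper: mirrors the preamble used in the export cells.
    """
    preamble = (
        "\\documentclass{report}\n"
        "\\usepackage{booktabs}\n"
        "\\begin{document}\n"
    )
    closing = "\\end{document}\n"
    # Make sure the table body ends with a newline before the closing line.
    if not table_body.endswith("\n"):
        table_body += "\n"
    return preamble + table_body + closing


def write_latex_table(table_body: str, path: str) -> None:
    """Write the wrapped document to `path`; the file is closed automatically."""
    with open(path, "w") as f:
        f.write(wrap_latex_document(table_body))


# Stand-in table body for illustration only.
print(wrap_latex_document("\\begin{tabular}{l}\nx\n\\end{tabular}"))
```

In the notebook, the table body would be the string returned by `dz.to_latex()` or `dy.to_latex()`, and `path` would be the corresponding output file.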