# Occurrence of on-ice events by zone

<p>Data frames used in this notebook:</p>
<p>&nbsp; &nbsp; 1. all on-ice events prior to a goal.</p>
<p>&nbsp; &nbsp; 2. all even-strength on-ice events.</p>
 

In [44]:
import sys
import os
import pandas as pd
import numpy as np
import datetime, time
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols
from pylab import hist, show
import scipy
import zipfile


pd.set_option('display.max_rows', 50)
pd.set_option('display.max_columns', 200)

In [45]:
pwd

'/Users/stefanostselios/Desktop/nhl_roster_design-master'

In [46]:
da = pd.read_csv('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/pbp_merged.csv')
#da = pd.read_csv('/Users/kevinmongeon/Brock University/Steve Tselios - StephanosShare/out/pbp_merged.csv')
da = da.drop('Unnamed: 0', axis=1)
da = da.rename(columns={'TeamCode': 'EventTeamCode'})

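- keep regular-season games (game number up to 21230) and relevant on-ice events in **regulation time** (periods 1 to 3). Stoppage, early-intermission and fight events (STOP, EISTR, EIEND, FIGHT) and rows with a missing event number are removed. Duplicates by season, game number, event number and event team are dropped in a later cell to leave one observation per event per game.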

In [47]:
da = da[da['GameNumber'] <= 21230]
da = da[da['Period'] <= 3]
da = da[da['Period'] >= 1]
da = da[da['EventType']!='STOP']
da = da[da['EventType']!='EISTR']
da = da[da['EventType']!='EIEND']
da = da[da['EventType'] !='FIGHT']
da = da.dropna(subset=['EventNumber'])
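
The row filters above can also be written as one boolean mask. The following is a minimal sketch on a made-up frame, not the notebook's data; `events`, `dropped_types` and `filtered` are illustrative names.

```python
import pandas as pd

# Toy play-by-play rows; the values are invented for illustration.
events = pd.DataFrame({
    'GameNumber': [20001, 20001, 20001, 30111, 20002],
    'Period': [1, 4, 2, 1, 3],
    'EventType': ['FAC', 'SHOT', 'STOP', 'HIT', 'GOAL'],
    'EventNumber': [1.0, 2.0, 3.0, 4.0, None],
})

# Event types excluded by the cell above.
dropped_types = ['STOP', 'EISTR', 'EIEND', 'FIGHT']

mask = (
    (events['GameNumber'] <= 21230)            # regular-season game numbers
    & events['Period'].between(1, 3)           # regulation time only
    & ~events['EventType'].isin(dropped_types)
)
filtered = events[mask].dropna(subset=['EventNumber'])
print(filtered)
```

Only the first toy row passes every condition. A single mask avoids building an intermediate frame for each condition.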

In [48]:
da.head()

Unnamed: 0,Season,GameNumber,EventNumber,Period,AdvantageType,EventTimeFromZero,EventTimeFromTwenty,EventType,EventDetail,VPlayer1,VPosition1,VPlayer2,VPosition2,VPlayer3,VPosition3,VPlayer4,VPosition4,VPlayer5,VPosition5,VPlayer6,VPosition6,HPlayer1,HPosition1,HPlayer2,HPosition2,HPlayer3,HPosition3,HPlayer4,HPosition4,HPlayer5,HPosition5,HPlayer6,HPosition6,GameDate,VTeamCode,HTeamCode,EventTeamCode,PlayerNumber,PlayerName,ShotType,ShotResult,Zone,Length,PenaltyType
0,2010,20001,1,1,,0,1200,FAC,MTL won Neu. Zone - MTL #11 GOMEZ vs TOR #37 B...,11,C,21.0,R,57.0,L,26.0,D,75.0,D,31.0,G,37,C,9.0,R,11.0,L,3.0,D,22.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,11.0,GOMEZ,,,N,,
1,2010,20001,3,1,EV,15,1185,HIT,"TOR #37 BRENT HIT MTL #26 GORGES, Off. Zone",11,C,21.0,R,57.0,L,26.0,D,75.0,D,31.0,G,37,C,9.0,R,11.0,L,3.0,D,22.0,D,35.0,G,2010-10-07,MTL,TOR,TOR,37.0,BRENT,,,O,,
2,2010,20001,4,1,EV,46,1154,HIT,"MTL #14 PLEKANEC HIT TOR #2 SCHENN, Off. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,14.0,PLEKANEC,,,O,,
3,2010,20001,5,1,EV,57,1143,HIT,"MTL #76 SUBBAN HIT TOR #15 KABERLE, Neu. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,MTL,76.0,SUBBAN,,,N,,
4,2010,20001,6,1,EV,69,1131,GIVE,"TOR&nbsp;GIVEAWAY - #35 GIGUERE, Def. Zone",14,C,81.0,C,46.0,L,6.0,D,76.0,D,31.0,G,42,C,81.0,C,32.0,R,2.0,D,15.0,D,35.0,G,2010-10-07,MTL,TOR,TOR,35.0,GIGUERE,,,D,,


In [49]:
da.shape

(310113, 44)

- create a goal dataframe (dg) that numbers the goals within each game.

In [50]:
df = da[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'VTeamCode', 'HTeamCode', 'EventTeamCode']]
dg = df[df['EventType'] == 'GOAL']
dg['Goal'] = dg.apply(lambda x: 1 if (x['EventType'] == 'GOAL') else 0, axis=1)
dg['GoalNumber'] = dg.groupby(['Season', 'GameNumber']).cumcount()+1
dg.head()
dg = dg[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber']]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  app.launch_new_instance()
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
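
The warning above is pandas' `SettingWithCopyWarning`, usually raised when a column is assigned on a filtered slice of another frame. One common way to avoid it is to take an explicit copy before adding columns. This is a sketch on toy data with hypothetical names (`events`, `goals`).

```python
import pandas as pd

# Toy events: two games, three goals in total.
events = pd.DataFrame({
    'Season': [2010] * 5,
    'GameNumber': [20001, 20001, 20001, 20002, 20002],
    'EventNumber': [1, 2, 3, 1, 2],
    'EventType': ['FAC', 'GOAL', 'GOAL', 'GOAL', 'HIT'],
})

# .copy() makes `goals` an independent frame, so the assignment below
# does not write to a view of `events`.
goals = events[events['EventType'] == 'GOAL'].copy()

# Sequence number of each goal within its game (1, 2, ...).
goals['GoalNumber'] = goals.groupby(['Season', 'GameNumber']).cumcount() + 1
print(goals)
```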


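- merge dg onto df to attach the goal number to each event. Backward-fill advantage type within each season and game, and goal number within each season, game and period, so that every event carries the number of the next goal scored in its period.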

In [51]:
df = pd.merge(df, dg, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode'], how='left')
df['AdvantageType'] = df.groupby(['Season', 'GameNumber'])['AdvantageType'].apply(lambda x: x.bfill())
df['GoalNumber'] = df.groupby(['Season', 'GameNumber', 'Period'])['GoalNumber'].apply(lambda x: x.bfill())
df.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0
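
To illustrate the backward fill, here is a small sketch on invented data (`events` is a hypothetical frame). It uses the grouped `bfill` method directly. Events after the last goal of a period keep a missing goal number.

```python
import numpy as np
import pandas as pd

# One toy game: a goal in period 1, a shot after it, and a goal in period 2.
events = pd.DataFrame({
    'Season': [2010] * 6,
    'GameNumber': [20001] * 6,
    'Period': [1, 1, 1, 1, 2, 2],
    'EventType': ['FAC', 'HIT', 'GOAL', 'SHOT', 'HIT', 'GOAL'],
    'GoalNumber': [np.nan, np.nan, 1.0, np.nan, np.nan, 2.0],
})

# Within each period, propagate the next goal's number back to earlier events.
keys = ['Season', 'GameNumber', 'Period']
events['GoalNumber'] = events.groupby(keys)['GoalNumber'].bfill()
print(events)
```

The shot that follows goal 1 in period 1 stays missing because no later goal exists in that period.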


- number the home goals and the visitor goals within each season and game (a running count per team). Keep all on-ice events that happened prior to a goal when the score differential was between -1 and 1. Exclude all other events.

In [52]:
dz = dg[dg['EventTeamCode'] == dg['HTeamCode']]
dz['HGoalNumber'] = dz.groupby(['Season', 'GameNumber']).cumcount()+1
dy = dg[dg['EventTeamCode'] == dg['VTeamCode']]
dy['VGoalNumber'] = dy.groupby(['Season', 'GameNumber']).cumcount()+1

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


- merge visitor goal number dataframe (dy) and home goal number dataframe (dz) onto goal dataframe (dg). 

In [53]:
dg = pd.merge(dg, dy, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber'], how='left')
dg = pd.merge(dg, dz, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode', 'GoalNumber'], how='left')
dg.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,EventTeamCode,VTeamCode,HTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,35,EV,1,O,GOAL,402,TOR,MTL,TOR,1,,1.0
1,2010,20001,49,EV,1,O,GOAL,537,TOR,MTL,TOR,2,,2.0
2,2010,20001,68,EV,1,O,GOAL,739,MTL,MTL,TOR,3,1.0,
3,2010,20001,223,EV,3,O,GOAL,96,TOR,MTL,TOR,4,,3.0
4,2010,20001,232,EV,3,O,GOAL,148,MTL,MTL,TOR,5,2.0,


- forward-fill home goal number and visitor goal number by season and game number. Fill the remaining 'NaN' values with zero for home and visitor goal number.

In [54]:
dg['HGoalNumber'] = dg.groupby(['Season', 'GameNumber'])['HGoalNumber'].apply(lambda x: x.ffill())
dg['VGoalNumber'] = dg.groupby(['Season', 'GameNumber'])['VGoalNumber'].apply(lambda x: x.ffill())
dg['VGoalNumber'] = dg['VGoalNumber'].fillna(0)
dg['HGoalNumber'] = dg['HGoalNumber'].fillna(0)
dg.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,EventTeamCode,VTeamCode,HTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,35,EV,1,O,GOAL,402,TOR,MTL,TOR,1,0.0,1.0
1,2010,20001,49,EV,1,O,GOAL,537,TOR,MTL,TOR,2,0.0,2.0
2,2010,20001,68,EV,1,O,GOAL,739,MTL,MTL,TOR,3,1.0,2.0
3,2010,20001,223,EV,3,O,GOAL,96,TOR,MTL,TOR,4,1.0,3.0
4,2010,20001,232,EV,3,O,GOAL,148,MTL,MTL,TOR,5,2.0,3.0
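
The running home and visitor scores can also be obtained with a grouped cumulative sum, without the intermediate merges. This is an alternative sketch, not the notebook's method; `goals`, `HomeScored` and `VisitorScored` are illustrative names, and the toy rows mimic the first five goals shown above.

```python
import pandas as pd

# One row per goal, in game order (toy data).
goals = pd.DataFrame({
    'Season': [2010] * 5,
    'GameNumber': [20001] * 5,
    'EventTeamCode': ['TOR', 'TOR', 'MTL', 'TOR', 'MTL'],
    'VTeamCode': ['MTL'] * 5,
    'HTeamCode': ['TOR'] * 5,
})

keys = ['Season', 'GameNumber']

# 1 when the home (visitor) team scored this goal, 0 otherwise.
goals['HomeScored'] = (goals['EventTeamCode'] == goals['HTeamCode']).astype(int)
goals['VisitorScored'] = (goals['EventTeamCode'] == goals['VTeamCode']).astype(int)

# Cumulative sums within each game give the score after each goal.
goals['HGoalNumber'] = goals.groupby(keys)['HomeScored'].cumsum()
goals['VGoalNumber'] = goals.groupby(keys)['VisitorScored'].cumsum()
print(goals[['EventTeamCode', 'VGoalNumber', 'HGoalNumber']])
```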


- merge the goal dataframe (dg) onto dk and backward-fill goal number, home goal number and visitor goal number within each season, game and period.

In [55]:
dk = da[['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'VTeamCode', 'HTeamCode', 'EventTeamCode']]
dk = pd.merge(dk, dg, on=['Season', 'GameNumber', 'EventNumber', 'AdvantageType', 'Period', 'Zone', 'EventType', 'EventTimeFromZero', 'EventTeamCode', 'VTeamCode', 'HTeamCode'], how='left')
dk['AdvantageType'] = dk.groupby(['Season', 'GameNumber'])['AdvantageType'].apply(lambda x: x.bfill())
dk['GoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['GoalNumber'].apply(lambda x: x.bfill())
dk['HGoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['HGoalNumber'].apply(lambda x: x.bfill())
dk['VGoalNumber'] = dk.groupby(['Season', 'GameNumber', 'Period'])['VGoalNumber'].apply(lambda x: x.bfill())
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0


- create a zone variable for the home team (HZone) and the visitor team (VZone). The offensive zone for the home team is the defensive zone for the visitor team, and the defensive zone for the home team is the offensive zone for the visitor team. The neutral zone is the same for both teams.

In [56]:
dk['VZone'] = dk.apply(lambda x: x['Zone'] if (x['EventTeamCode'] == x['VTeamCode']) else 'D' if ((x['EventTeamCode'] ==x['HTeamCode']) & (x['Zone'] == 'O')) else 'O' if ((x['EventTeamCode'] == x['HTeamCode']) & (x['Zone'] == 'D')) else 'N', axis=1)
dk['HZone'] = dk.apply(lambda x: x['Zone'] if (x['EventTeamCode'] == x['HTeamCode']) else 'D' if ((x['EventTeamCode'] ==x['VTeamCode']) & (x['Zone'] == 'O')) else 'O' if ((x['EventTeamCode'] == x['VTeamCode']) & (x['Zone'] == 'D')) else 'N', axis=1)
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,VZone,HZone
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,N,N
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,D,O
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,O,D
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,N,N
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,O,D
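
The zone flip can be expressed without a row-wise `apply`. This is a sketch under the assumption that every event belongs to either the home or the visitor team; `flip`, `flipped`, `by_visitor` and `by_home` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Toy events mirroring the five rows shown above.
events = pd.DataFrame({
    'Zone': ['N', 'O', 'O', 'N', 'D'],
    'EventTeamCode': ['MTL', 'TOR', 'MTL', 'MTL', 'TOR'],
    'VTeamCode': ['MTL'] * 5,
    'HTeamCode': ['TOR'] * 5,
})

# The opposing team's view of each zone.
flip = {'O': 'D', 'D': 'O', 'N': 'N'}
flipped = events['Zone'].map(flip)

by_visitor = events['EventTeamCode'] == events['VTeamCode']
by_home = events['EventTeamCode'] == events['HTeamCode']

# Keep the recorded zone for the event team, flip it for the opponent.
events['VZone'] = np.where(by_visitor, events['Zone'], flipped)
events['HZone'] = np.where(by_home, events['Zone'], flipped)
print(events[['Zone', 'EventTeamCode', 'VZone', 'HZone']])
```

On large frames, column-wise operations like these are generally much faster than `apply(..., axis=1)`.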


### even strength situations only (the filter below is commented out, so all advantage types are kept)

In [57]:
#dk = dk[dk['AdvantageType'] == 'EV']

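- compute the goal differential (GD) from the perspective of the event team: home goals minus visitor goals when the home team generated the event, visitor goals minus home goals otherwise.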

In [58]:
dk['GD'] = dk.apply(lambda x: x['HGoalNumber'] - x['VGoalNumber'] if (x['EventTeamCode'] == x['HTeamCode']) else x['VGoalNumber'] - x['HGoalNumber'], axis=1)
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,VZone,HZone,GD
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,D,O,1.0
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,O,D,-1.0
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,O,D,1.0
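
The same goal differential can be computed column-wise with `np.where`. This is a minimal sketch on two invented rows; `home_margin` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Two toy events in a game where the home team (TOR) leads 1-0.
events = pd.DataFrame({
    'EventTeamCode': ['MTL', 'TOR'],
    'HTeamCode': ['TOR', 'TOR'],
    'VTeamCode': ['MTL', 'MTL'],
    'HGoalNumber': [1.0, 1.0],
    'VGoalNumber': [0.0, 0.0],
})

# Home-minus-visitor margin, negated when the visitor generated the event.
home_margin = events['HGoalNumber'] - events['VGoalNumber']
is_home_event = events['EventTeamCode'] == events['HTeamCode']
events['GD'] = np.where(is_home_event, home_margin, -home_margin)
print(events[['EventTeamCode', 'GD']])
```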


In [59]:
dk.shape

(310113, 17)

- On-ice events that occurred after the last goal of a period, or in a period without a goal, are excluded from the dataframe. Duplicates by season, game number, event number and event team are then dropped.

In [60]:
dk = dk.dropna(subset=['GoalNumber'])
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.drop_duplicates(['Season', 'GameNumber', 'EventNumber', 'EventTeamCode'])
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,VZone,HZone,GD
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,D,O,1.0
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,O,D,-1.0
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,O,D,1.0


In [61]:
dk.shape

(178759, 17)

- Assign a value of 1 if an on-ice event is a goal and NaN otherwise. Follow the same procedure for blocks, faceoffs, giveaways, hits, misses, penalties, shots and takeaways. Group by season, game number, zone, event team, event type and goal number to sum each on-ice event.

In [62]:
dk['Goal'] = dk.apply(lambda x: 1 if (x['EventType'] == 'GOAL') else np.nan, axis=1)
dk['Block'] = dk.apply(lambda x: 1 if (x['EventType'] == 'BLOCK') else np.nan, axis=1)
dk['Faceoff'] = dk.apply(lambda x: 1 if (x['EventType'] == 'FAC') else np.nan, axis=1)
dk['Giveaway'] = dk.apply(lambda x: 1 if (x['EventType'] == 'GIVE') else np.nan, axis=1)
dk['Hit'] = dk.apply(lambda x: 1 if (x['EventType'] == 'HIT') else np.nan, axis=1)
dk['Miss'] = dk.apply(lambda x: 1 if (x['EventType'] == 'MISS') else np.nan, axis=1)
dk['Penalty'] = dk.apply(lambda x: 1 if (x['EventType'] == 'PENL') else np.nan, axis=1)
dk['Shot'] = dk.apply(lambda x: 1 if (x['EventType'] == 'SHOT') else np.nan, axis=1)
dk['Takeaway'] = dk.apply(lambda x: 1 if (x['EventType'] == 'TAKE') else np.nan, axis=1)

In [63]:
dk['Blocks'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Block'].transform('sum')
dk['Faceoffs'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Faceoff'].transform('sum')
dk['Giveaways'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Giveaway'].transform('sum')
dk['Goals'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Goal'].transform('sum')
dk['Hits'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Hit'].transform('sum')
dk['Misses'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Miss'].transform('sum')
dk['Penalties'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Penalty'].transform('sum')
dk['Shots'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Shot'].transform('sum')
dk['Takeaways'] = dk.groupby(['Season','GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber'])['Takeaway'].transform('sum')

In [64]:
dk.head()

Unnamed: 0,Season,GameNumber,EventNumber,AdvantageType,Period,Zone,EventType,EventTimeFromZero,VTeamCode,HTeamCode,EventTeamCode,GoalNumber,VGoalNumber,HGoalNumber,VZone,HZone,GD,Goal,Block,Faceoff,Giveaway,Hit,Miss,Penalty,Shot,Takeaway,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways
0,2010,20001,1,EV,1,N,FAC,0,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0,,,1.0,,,,,,,,1.0,,,,,,,
1,2010,20001,3,EV,1,O,HIT,15,MTL,TOR,TOR,1.0,0.0,1.0,D,O,1.0,,,,,1.0,,,,,,,,,3.0,,,,
2,2010,20001,4,EV,1,O,HIT,46,MTL,TOR,MTL,1.0,0.0,1.0,O,D,-1.0,,,,,1.0,,,,,,,,,5.0,,,,
3,2010,20001,5,EV,1,N,HIT,57,MTL,TOR,MTL,1.0,0.0,1.0,N,N,-1.0,,,,,1.0,,,,,,,,,1.0,,,,
4,2010,20001,6,EV,1,D,GIVE,69,MTL,TOR,TOR,1.0,0.0,1.0,O,D,1.0,,,,1.0,,,,,,,,2.0,,,,,,
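
The indicator and sum columns follow one pattern, so they can be generated in a loop. This is a sketch on toy data covering three event types; the `labels` mapping and the `events` frame are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy events leading up to the first goal of one game.
events = pd.DataFrame({
    'Season': [2010] * 5,
    'GameNumber': [20001] * 5,
    'Zone': ['O', 'O', 'O', 'N', 'O'],
    'EventTeamCode': ['TOR', 'TOR', 'MTL', 'MTL', 'TOR'],
    'EventType': ['HIT', 'HIT', 'HIT', 'FAC', 'SHOT'],
    'GoalNumber': [1.0] * 5,
})

# Event code -> (indicator column, per-group count column).
labels = {
    'HIT': ('Hit', 'Hits'),
    'FAC': ('Faceoff', 'Faceoffs'),
    'SHOT': ('Shot', 'Shots'),
}
keys = ['Season', 'GameNumber', 'Zone', 'EventTeamCode', 'EventType', 'GoalNumber']

for code, (flag, total) in labels.items():
    # 1.0 for matching events, NaN otherwise.
    events[flag] = np.where(events['EventType'] == code, 1.0, np.nan)
    # Count of matching events within each group, broadcast to its rows.
    events[total] = events.groupby(keys)[flag].transform('sum')

print(events[['EventTeamCode', 'EventType', 'Hits', 'Faceoffs', 'Shots']])
```

In recent pandas versions the sum of an all-NaN group is 0 rather than NaN, so the non-matching rows may show 0 where the output above shows missing values.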


In [65]:
dk.shape

(178759, 35)

### reshape data wide to long

In [66]:
dk = dk.rename(columns={'EventTeamCode': 'EventTeam', 'Zone': 'Z'})
a = [col for col in dk.columns if 'TeamCode' in col]
b = [col for col in dk.columns if 'Zone' in col]
dk = pd.lreshape(dk, {'TeamCode' : a, 'Zone':b})
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.rename(columns={'EventTeam': 'EventTeamCode'})
dk.head()

Unnamed: 0,AdvantageType,Block,Blocks,EventNumber,EventTeamCode,EventTimeFromZero,EventType,Faceoff,Faceoffs,GD,GameNumber,Giveaway,Giveaways,Goal,GoalNumber,Goals,HGoalNumber,Hit,Hits,Miss,Misses,Penalties,Penalty,Period,Season,Shot,Shots,Takeaway,Takeaways,VGoalNumber,Z,TeamCode,Zone
0,EV,,,1,MTL,0,FAC,1.0,1.0,-1.0,20001,,,,1.0,,1.0,,,,,,,1,2010,,,,,0.0,N,MTL,N
178759,EV,,,1,MTL,0,FAC,1.0,1.0,-1.0,20001,,,,1.0,,1.0,,,,,,,1,2010,,,,,0.0,N,TOR,N
1,EV,,,3,TOR,15,HIT,,,1.0,20001,,,,1.0,,1.0,1.0,3.0,,,,,1,2010,,,,,0.0,O,MTL,D
178760,EV,,,3,TOR,15,HIT,,,1.0,20001,,,,1.0,,1.0,1.0,3.0,,,,,1,2010,,,,,0.0,O,TOR,O
2,EV,,,4,MTL,46,HIT,,,-1.0,20001,,,,1.0,,1.0,1.0,5.0,,,,,1,2010,,,,,0.0,O,MTL,O


In [67]:
dk.shape

(357518, 33)

- drop duplicates by season, game number, team code, event team, event type and goal number, then keep only the relevant columns.

In [68]:
dk = dk.drop_duplicates(['Season', 'GameNumber', 'TeamCode', 'EventTeamCode', 'EventType', 'GoalNumber'])
dk = dk [['Season', 'GameNumber', 'AdvantageType', 'Zone', 'Period', 'TeamCode', 'EventNumber', 'EventType', 'EventTeamCode', 'GoalNumber', 'GD',  'Blocks', 'Faceoffs', 'Giveaways', 'Goals', 'Hits', 'Misses', 'Penalties', 'Shots', 'Takeaways']]
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk.shape

(138546, 20)

- assign all on-ice events to their respective teams by zone. If team code is the same as event team code, the on-ice event is counted for that team; if not, it is counted against it. Each on-ice event therefore generates two variables per team: For (F) and Against (A).

In [69]:
dk['Blocks_F'] = dk.apply(lambda x: x['Blocks'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Blocks_A'] = dk.apply(lambda x: x['Blocks'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Faceoffs_F'] = dk.apply(lambda x: x['Faceoffs'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Faceoffs_A'] = dk.apply(lambda x: x['Faceoffs'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Giveaways_F'] = dk.apply(lambda x: x['Giveaways'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Giveaways_A'] = dk.apply(lambda x: x['Giveaways'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Goals_F'] = dk.apply(lambda x: x['Goals'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Goals_A'] = dk.apply(lambda x: x['Goals'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Hits_F'] = dk.apply(lambda x: x['Hits'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Hits_A'] = dk.apply(lambda x: x['Hits'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Miss_F'] = dk.apply(lambda x: x['Misses'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Miss_A'] = dk.apply(lambda x: x['Misses'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Penalties_F'] = dk.apply(lambda x: x['Penalties'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Penalties_A'] = dk.apply(lambda x: x['Penalties'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Shots_F'] = dk.apply(lambda x: x['Shots'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Shots_A'] = dk.apply(lambda x: x['Shots'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk['Takeaways_F'] = dk.apply(lambda x: x['Takeaways'] if (x['TeamCode'] == x['EventTeamCode']) else np.nan, axis=1)
dk['Takeaways_A'] = dk.apply(lambda x: x['Takeaways'] if (x['TeamCode'] != x['EventTeamCode']) else np.nan, axis=1)
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk.head()

Unnamed: 0,Season,GameNumber,AdvantageType,Zone,Period,TeamCode,EventNumber,EventType,EventTeamCode,GoalNumber,GD,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,EV,N,1,MTL,1,FAC,MTL,1.0,-1.0,,1.0,,,,,,,,,,1.0,,,,,,,,,,,,,,,
178759,2010,20001,EV,N,1,TOR,1,FAC,MTL,1.0,-1.0,,1.0,,,,,,,,,,,1.0,,,,,,,,,,,,,,
1,2010,20001,EV,D,1,MTL,3,HIT,TOR,1.0,1.0,,,,,3.0,,,,,,,,,,,,,,3.0,,,,,,,,
178760,2010,20001,EV,O,1,TOR,3,HIT,TOR,1.0,1.0,,,,,3.0,,,,,,,,,,,,,3.0,,,,,,,,,
2,2010,20001,EV,O,1,MTL,4,HIT,MTL,1.0,-1.0,,,,,5.0,,,,,,,,,,,,,5.0,,,,,,,,,
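
The For/Against split can be written with `Series.where`, which keeps a value where a condition holds and sets it to NaN elsewhere. This is a sketch on toy data with two count columns; `own` is a hypothetical name, and the suffix naming is simplified relative to the notebook (which uses `Miss_F` for `Misses`).

```python
import numpy as np
import pandas as pd

# One toy hit by TOR, seen from each team's row after the reshape.
events = pd.DataFrame({
    'TeamCode': ['MTL', 'TOR'],
    'EventTeamCode': ['TOR', 'TOR'],
    'Hits': [3.0, 3.0],
    'Shots': [np.nan, np.nan],
})

# True where the row's team generated the event.
own = events['TeamCode'] == events['EventTeamCode']

for col in ['Hits', 'Shots']:
    events[col + '_F'] = events[col].where(own)    # event by this team
    events[col + '_A'] = events[col].where(~own)   # event by the opponent

print(events)
```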


- forward- and backward-fill the on-ice event counts by season, game number, team code, zone and goal number. Remaining missing values are set to zero.

In [70]:
dk['Blocks_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Blocks_F'].apply(lambda x: x.ffill().bfill())
dk['Faceoffs_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Faceoffs_F'].apply(lambda x: x.ffill().bfill())
dk['Giveaways_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Giveaways_F'].apply(lambda x: x.ffill().bfill())
dk['Goals_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Goals_F'].apply(lambda x: x.ffill().bfill())
dk['Hits_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Hits_F'].apply(lambda x: x.ffill().bfill())
dk['Miss_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Miss_F'].apply(lambda x: x.ffill().bfill())
dk['Penalties_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Penalties_F'].apply(lambda x: x.ffill().bfill())
dk['Shots_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Shots_F'].apply(lambda x: x.ffill().bfill())
dk['Takeaways_F'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Takeaways_F'].apply(lambda x: x.ffill().bfill())
dk['Blocks_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Blocks_A'].apply(lambda x: x.ffill().bfill())
dk['Faceoffs_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Faceoffs_A'].apply(lambda x: x.ffill().bfill())
dk['Giveaways_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Giveaways_A'].apply(lambda x: x.ffill().bfill())
dk['Goals_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Goals_A'].apply(lambda x: x.ffill().bfill())
dk['Hits_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Hits_A'].apply(lambda x: x.ffill().bfill())
dk['Miss_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Miss_A'].apply(lambda x: x.ffill().bfill())
dk['Penalties_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Penalties_A'].apply(lambda x: x.ffill().bfill())
dk['Shots_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Shots_A'].apply(lambda x: x.ffill().bfill())
dk['Takeaways_A'] = dk.groupby(['Season','GameNumber', 'TeamCode', 'Zone', 'GoalNumber'])['Takeaways_A'].apply(lambda x: x.ffill().bfill())
dk = dk.sort_values(['Season', 'GameNumber', 'EventNumber'], ascending=[True, True, True])
dk = dk.fillna(0)
dk.head()

Unnamed: 0,Season,GameNumber,AdvantageType,Zone,Period,TeamCode,EventNumber,EventType,EventTeamCode,GoalNumber,GD,Blocks,Faceoffs,Giveaways,Goals,Hits,Misses,Penalties,Shots,Takeaways,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,EV,N,1,MTL,1,FAC,MTL,1.0,-1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
178759,2010,20001,EV,N,1,TOR,1,FAC,MTL,1.0,-1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2010,20001,EV,D,1,MTL,3,HIT,TOR,1.0,1.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0
178760,2010,20001,EV,O,1,TOR,3,HIT,TOR,1.0,1.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0
2,2010,20001,EV,O,1,MTL,4,HIT,MTL,1.0,-1.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0,0.0,4.0,0.0,1.0,0.0,2.0,0.0,0.0,5.0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0
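
The fill step can be applied to several columns at once with `transform`, which keeps the original row index. This is a sketch on toy data with a reduced key list; `keys` and `cols` are illustrative. In recent pandas releases, `groupby(...).apply` may prepend the group keys to the index, which can break the column assignment, so `transform` is a safer choice there.

```python
import numpy as np
import pandas as pd

# Toy counts: values known on one row of a group, missing on the others.
events = pd.DataFrame({
    'TeamCode': ['MTL', 'MTL', 'MTL', 'TOR', 'TOR'],
    'Zone': ['O', 'O', 'O', 'O', 'O'],
    'Hits_F': [np.nan, 5.0, np.nan, np.nan, np.nan],
    'Shots_F': [2.0, np.nan, np.nan, np.nan, 4.0],
})

keys = ['TeamCode', 'Zone']
cols = ['Hits_F', 'Shots_F']

# Spread each known value across the rows of its group.
events[cols] = events.groupby(keys)[cols].transform(lambda s: s.ffill().bfill())

# Groups with no value at all become zero.
events[cols] = events[cols].fillna(0)
print(events)
```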


- keep only relevant columns and drop duplicates by season, game number, team code, goal number, goal differential and zone.

In [71]:
dk = dk[['Season', 'GameNumber', 'Zone', 'TeamCode', 'GoalNumber', 'GD', 'Blocks_F', 'Blocks_A', 'Faceoffs_F', 'Faceoffs_A', 'Giveaways_F', 'Giveaways_A', 'Goals_F', 'Goals_A', 'Hits_F', 'Hits_A', 'Miss_F', 'Miss_A', 'Penalties_F', 'Penalties_A', 'Shots_F', 'Shots_A', 'Takeaways_F', 'Takeaways_A']]
dk = dk.sort_values(['Season', 'GameNumber'], ascending=[True, True])
dk = dk.drop_duplicates(['Season', 'GameNumber', 'TeamCode', 'GoalNumber', 'GD', 'Zone'])
dk.head()

Unnamed: 0,Season,GameNumber,Zone,TeamCode,GoalNumber,GD,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A
0,2010,20001,N,MTL,1.0,-1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
178759,2010,20001,N,TOR,1.0,-1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2010,20001,D,MTL,1.0,1.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0
178760,2010,20001,O,TOR,1.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0
2,2010,20001,O,MTL,1.0,-1.0,0.0,4.0,0.0,1.0,0.0,2.0,0.0,0.0,5.0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0


In [72]:
dk.shape

(58544, 24)

In [73]:
dk.isnull().sum()

Season         0
GameNumber     0
Zone           0
TeamCode       0
GoalNumber     0
GD             0
Blocks_F       0
Blocks_A       0
Faceoffs_F     0
Faceoffs_A     0
Giveaways_F    0
Giveaways_A    0
Goals_F        0
Goals_A        0
Hits_F         0
Hits_A         0
Miss_F         0
Miss_A         0
Penalties_F    0
Penalties_A    0
Shots_F        0
Shots_A        0
Takeaways_F    0
Takeaways_A    0
dtype: int64

- group by season, team code, zone and goal differential to aggregate each on-ice event over the season at a given score differential. The mean version (first cell) is commented out; the active cell computes sums, stored in the M-prefixed columns.

In [74]:
#dk['MBlocks_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Blocks_F'].transform('mean')
#dk['MFaceoffs_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Faceoffs_F'].transform('mean')
#dk['MGiveaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_F'].transform('mean')
#dk['MGoals_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Goals_F'].transform('mean')
#dk['MHits_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Hits_F'].transform('mean')
#dk['MMiss_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Miss_F'].transform('mean')
#dk['MPenalties_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Penalties_F'].transform('mean')
#dk['MShots_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Shots_F'].transform('mean')
#dk['MTakeaways_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Takeaways_F'].transform('mean')
#dk['MBlocks_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Blocks_A'].transform('mean')
#dk['MFaceoffs_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Faceoffs_A'].transform('mean')
#dk['MGiveaways_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Giveaways_A'].transform('mean')
#dk['MGoals_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Goals_A'].transform('mean')
#dk['MHits_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Hits_A'].transform('mean')
#dk['MMiss_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Miss_A'].transform('mean')
#dk['MPenalties_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Penalties_A'].transform('mean')
#dk['MShots_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Shots_A'].transform('mean')
#dk['MTakeaways_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Takeaways_A'].transform('mean')
#dk.head()

In [75]:
dk['MBlocks_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Blocks_F'].transform('sum')
dk['MFaceoffs_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Faceoffs_F'].transform('sum')
dk['MGiveaways_F'] = dk.groupby(['Season', 'TeamCode', 'GD'])['Giveaways_F'].transform('sum')
dk['MGoals_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Goals_F'].transform('sum')
dk['MHits_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Hits_F'].transform('sum')
dk['MMiss_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Miss_F'].transform('sum')
dk['MPenalties_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Penalties_F'].transform('sum')
dk['MShots_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Shots_F'].transform('sum')
dk['MTakeaways_F'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Takeaways_F'].transform('sum')
dk['MBlocks_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Blocks_A'].transform('sum')
dk['MFaceoffs_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Faceoffs_A'].transform('sum')
dk['MGiveaways_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Giveaways_A'].transform('sum')
dk['MGoals_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Goals_A'].transform('sum')
dk['MHits_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Hits_A'].transform('sum')
dk['MMiss_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Miss_A'].transform('sum')
dk['MPenalties_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Penalties_A'].transform('sum')
dk['MShots_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Shots_A'].transform('sum')
dk['MTakeaways_A'] = dk.groupby(['Season', 'TeamCode', 'Zone', 'GD'])['Takeaways_A'].transform('sum')
dk.head()

Unnamed: 0,Season,GameNumber,Zone,TeamCode,GoalNumber,GD,Blocks_F,Blocks_A,Faceoffs_F,Faceoffs_A,Giveaways_F,Giveaways_A,Goals_F,Goals_A,Hits_F,Hits_A,Miss_F,Miss_A,Penalties_F,Penalties_A,Shots_F,Shots_A,Takeaways_F,Takeaways_A,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A
0,2010,20001,N,MTL,1.0,-1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,143.0,189.0,1.0,28.0,1.0,24.0,8.0,22.0,1.0,138.0,8.0,1.0,29.0,3.0,17.0,4.0,26.0
178759,2010,20001,N,TOR,1.0,-1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,126.0,177.0,0.0,19.0,3.0,23.0,2.0,25.0,0.0,127.0,17.0,1.0,15.0,7.0,23.0,4.0,25.0
1,2010,20001,D,MTL,1.0,1.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,292.0,59.0,189.0,0.0,94.0,1.0,37.0,0.0,51.0,1.0,39.0,15.0,82.0,162.0,204.0,30.0,504.0,43.0
178760,2010,20001,O,TOR,1.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,1.0,49.0,179.0,83.0,157.0,214.0,15.0,467.0,30.0,333.0,45.0,130.0,1.0,156.0,1.0,35.0,5.0,59.0
2,2010,20001,O,MTL,1.0,-1.0,0.0,4.0,0.0,1.0,0.0,2.0,0.0,0.0,5.0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,1.0,29.0,189.0,76.0,131.0,258.0,20.0,582.0,37.0,315.0,71.0,120.0,0.0,79.0,1.0,32.0,3.0,37.0


- drop duplicates by season, team code, zone and goal differential.

In [76]:
dk = dk.drop_duplicates(['Season', 'TeamCode', 'Zone', 'GD'])
dk = dk [['Season', 'TeamCode', 'Zone', 'GD','MBlocks_F', 'MFaceoffs_F', 'MGiveaways_F', 'MGoals_F','MHits_F', 'MMiss_F', 'MPenalties_F', 'MShots_F', 'MTakeaways_F','MBlocks_A', 'MFaceoffs_A', 'MGiveaways_A', 'MGoals_A', 'MHits_A','MMiss_A', 'MPenalties_A', 'MShots_A', 'MTakeaways_A']]
dk.head()

Unnamed: 0,Season,TeamCode,Zone,GD,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A
0,2010,MTL,N,-1.0,0.0,143.0,189.0,1.0,28.0,1.0,24.0,8.0,22.0,1.0,138.0,8.0,1.0,29.0,3.0,17.0,4.0,26.0
178759,2010,TOR,N,-1.0,0.0,126.0,177.0,0.0,19.0,3.0,23.0,2.0,25.0,0.0,127.0,17.0,1.0,15.0,7.0,23.0,4.0,25.0
1,2010,MTL,D,1.0,292.0,59.0,189.0,0.0,94.0,1.0,37.0,0.0,51.0,1.0,39.0,15.0,82.0,162.0,204.0,30.0,504.0,43.0
178760,2010,TOR,O,1.0,1.0,49.0,179.0,83.0,157.0,214.0,15.0,467.0,30.0,333.0,45.0,130.0,1.0,156.0,1.0,35.0,5.0,59.0
2,2010,MTL,O,-1.0,1.0,29.0,189.0,76.0,131.0,258.0,20.0,582.0,37.0,315.0,71.0,120.0,0.0,79.0,1.0,32.0,3.0,37.0
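
The `transform('sum')` followed by `drop_duplicates` pattern above yields one row per group. A grouped aggregation gives that shape directly. This sketch uses toy data and hypothetical names (`per_game`, `season_totals`). It groups every column by the same four keys, whereas the notebook cell groups `Giveaways_F` by season, team code and goal differential only.

```python
import pandas as pd

# Toy per-game counts for two teams.
per_game = pd.DataFrame({
    'Season': [2010] * 4,
    'TeamCode': ['MTL', 'MTL', 'MTL', 'TOR'],
    'Zone': ['O', 'O', 'D', 'O'],
    'GD': [1.0, 1.0, 1.0, -1.0],
    'Hits_F': [3.0, 2.0, 1.0, 4.0],
    'Shots_F': [5.0, 1.0, 0.0, 2.0],
})

keys = ['Season', 'TeamCode', 'Zone', 'GD']
cols = ['Hits_F', 'Shots_F']

# One row per (season, team, zone, goal differential) with summed counts.
season_totals = (
    per_game.groupby(keys, as_index=False)[cols]
    .sum()
    .rename(columns={c: 'M' + c for c in cols})
)
print(season_totals)
```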


### summary analysis

In [77]:
#dk['TBlocks_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MBlocks_F'].transform('mean')
#dk['TFaceoffs_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MFaceoffs_F'].transform('mean')
#dk['TGiveaways_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MGiveaways_F'].transform('mean')
#dk['TGoals_F'] = dk.groupby(['Season','Zone', 'GD'])['MGoals_F'].transform('mean')
#dk['THits_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MHits_F'].transform('mean')
#dk['TMiss_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MMiss_F'].transform('mean')
#dk['TPenalties_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MPenalties_F'].transform('mean')
#dk['TShots_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MShots_F'].transform('mean')
#dk['TTakeaways_F'] = dk.groupby(['Season', 'Zone', 'GD'])['MTakeaways_F'].transform('mean')
#dk['TBlocks_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MBlocks_A'].transform('mean')
#dk['TFaceoffs_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MFaceoffs_A'].transform('mean')
#dk['TGiveaways_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MGiveaways_A'].transform('mean')
#dk['TGoals_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MGoals_A'].transform('mean')
#dk['THits_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MHits_A'].transform('mean')
#dk['TMiss_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MMiss_A'].transform('mean')
#dk['TPenalties_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MPenalties_A'].transform('mean')
#dk['TShots_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MShots_A'].transform('mean')
#dk['TTakeaways_A'] = dk.groupby(['Season', 'Zone', 'GD'])['MTakeaways_A'].transform('mean')
#dk.head()

In [78]:
# Totals across teams: sum each team-level M* column within (Season, Zone, GD)
# and broadcast the total back to every row as the matching T* column.
keys = ['Season', 'Zone', 'GD']
m_cols = [c for c in dk.columns if c.startswith('M')]  # MBlocks_F ... MTakeaways_A
for col in m_cols:
    dk['T' + col[1:]] = dk.groupby(keys)[col].transform('sum')
dk.head()

Unnamed: 0,Season,TeamCode,Zone,GD,MBlocks_F,MFaceoffs_F,MGiveaways_F,MGoals_F,MHits_F,MMiss_F,MPenalties_F,MShots_F,MTakeaways_F,MBlocks_A,MFaceoffs_A,MGiveaways_A,MGoals_A,MHits_A,MMiss_A,MPenalties_A,MShots_A,MTakeaways_A,TBlocks_F,TFaceoffs_F,TGiveaways_F,TGoals_F,THits_F,TMiss_F,TPenalties_F,TShots_F,TTakeaways_F,TBlocks_A,TFaceoffs_A,TGiveaways_A,TGoals_A,THits_A,TMiss_A,TPenalties_A,TShots_A,TTakeaways_A
0,2010,MTL,N,-1.0,0.0,143.0,189.0,1.0,28.0,1.0,24.0,8.0,22.0,1.0,138.0,8.0,1.0,29.0,3.0,17.0,4.0,26.0,3.0,3974.0,3861.0,7.0,632.0,45.0,495.0,152.0,587.0,3.0,3974.0,330.0,7.0,632.0,45.0,495.0,152.0,587.0
178759,2010,TOR,N,-1.0,0.0,126.0,177.0,0.0,19.0,3.0,23.0,2.0,25.0,0.0,127.0,17.0,1.0,15.0,7.0,23.0,4.0,25.0,3.0,3974.0,3861.0,7.0,632.0,45.0,495.0,152.0,587.0,3.0,3974.0,330.0,7.0,632.0,45.0,495.0,152.0,587.0
1,2010,MTL,D,1.0,292.0,59.0,189.0,0.0,94.0,1.0,37.0,0.0,51.0,1.0,39.0,15.0,82.0,162.0,204.0,30.0,504.0,43.0,8563.0,1729.0,3871.0,9.0,3637.0,21.0,1100.0,46.0,1440.0,33.0,1490.0,1050.0,2624.0,3932.0,6692.0,553.0,15569.0,1070.0
178760,2010,TOR,O,1.0,1.0,49.0,179.0,83.0,157.0,214.0,15.0,467.0,30.0,333.0,45.0,130.0,1.0,156.0,1.0,35.0,5.0,59.0,33.0,1490.0,3871.0,2624.0,3932.0,6692.0,553.0,15569.0,1070.0,8563.0,1729.0,2500.0,9.0,3637.0,21.0,1100.0,46.0,1440.0
2,2010,MTL,O,-1.0,1.0,29.0,189.0,76.0,131.0,258.0,20.0,582.0,37.0,315.0,71.0,120.0,0.0,79.0,1.0,32.0,3.0,37.0,36.0,1493.0,3861.0,2343.0,3955.0,6685.0,564.0,15472.0,1066.0,8530.0,1724.0,2478.0,7.0,3618.0,22.0,1099.0,47.0,1430.0
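
The difference between `transform('sum')` and a plain grouped sum may be worth seeing side by side. The sketch below uses a hypothetical three-row frame with `MShots_F` values taken from the output above: `transform` returns one value per original row, while `sum` collapses each group to a single row.

```python
import pandas as pd

# Hypothetical three-row frame; MShots_F values are taken from the output above.
toy = pd.DataFrame({
    'Season':   [2010, 2010, 2010],
    'TeamCode': ['MTL', 'TOR', 'MTL'],
    'Zone':     ['N', 'N', 'D'],
    'GD':       [-1.0, -1.0, 1.0],
    'MShots_F': [8.0, 2.0, 0.0],
})
keys = ['Season', 'Zone', 'GD']

# transform: same length as toy, each row carries its group's total.
toy['TShots_F'] = toy.groupby(keys)['MShots_F'].transform('sum')

# sum: one row per (Season, Zone, GD) group.
collapsed = toy.groupby(keys, as_index=False)['MShots_F'].sum()

print(toy)
print(collapsed)
```

Because `transform` repeats the group total on every team row, the notebook has to drop duplicates again before pivoting; the collapsed form would skip that step.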


In [79]:
dk = dk[['GD', 'Zone', 'TBlocks_F', 'TFaceoffs_F', 'TGiveaways_F', 'TGoals_F', 'THits_F', 'TMiss_F', 'TPenalties_F', 'TShots_F', 'TTakeaways_F', 'TBlocks_A', 'TFaceoffs_A', 'TGiveaways_A', 'TGoals_A', 'THits_A', 'TMiss_A', 'TPenalties_A', 'TShots_A', 'TTakeaways_A']]
dk = dk.drop_duplicates(['GD', 'Zone'])
dk = dk.sort_values(['GD'], ascending=[False])
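
One point to check here: the totals were computed within `Season`, but duplicates are dropped on `GD` and `Zone` only. If the data cover more than one season (the notebook does not say), this keeps only the first season's total for each pair. A minimal sketch of the difference, using a hypothetical two-season frame in which the second season's value is made up:

```python
import pandas as pd

toy = pd.DataFrame({
    'Season':   [2010, 2011],
    'Zone':     ['O', 'O'],
    'GD':       [1.0, 1.0],
    'TShots_F': [15569.0, 15000.0],  # second value is made up for illustration
})

# Keeps the first row per (GD, Zone): only the 2010 total survives.
first_only = toy.drop_duplicates(['GD', 'Zone'])

# Alternative: add the season totals together for each (GD, Zone).
all_seasons = toy.groupby(['GD', 'Zone'], as_index=False)['TShots_F'].sum()

print(first_only)
print(all_seasons)
```

With a single season in the data the two approaches give the same result.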

In [80]:
#dk = pd.pivot_table(dk, values=(['TBlocks_F', 'TFaceoffs_F', 'TGiveaways_F', 'TGoals_F', 'THits_F', 'TMiss_F', 'TPenalties_F', 'TShots_F', 'TTakeaways_F', 'TBlocks_A', 'TFaceoffs_A', 'TGiveaways_A', 'TGoals_A', 'THits_A', 'TMiss_A', 'TPenalties_A', 'TShots_A', 'TTakeaways_A']), index=['GD'], columns=['Zone'])
#dk.head()

In [81]:
dz = pd.pivot_table(dk, values=(['TGoals_F', 'TGoals_A', 'TShots_F', 'TShots_A', 'TMiss_F', 'TMiss_A']), index=['GD'], columns=['Zone'])
dz.head(20)

Unnamed: 0_level_0,TGoals_F,TGoals_F,TGoals_F,TGoals_A,TGoals_A,TGoals_A,TShots_F,TShots_F,TShots_F,TShots_A,TShots_A,TShots_A,TMiss_F,TMiss_F,TMiss_F,TMiss_A,TMiss_A,TMiss_A
Zone,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O
GD,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2
-8.0,0.0,0.0,2.0,2.0,0.0,0.0,0.0,1.0,8.0,8.0,1.0,0.0,0.0,0.0,5.0,5.0,0.0,0.0
-7.0,0.0,0.0,14.0,14.0,0.0,0.0,0.0,2.0,87.0,87.0,2.0,0.0,0.0,0.0,30.0,30.0,0.0,0.0
-6.0,1.0,1.0,35.0,35.0,1.0,1.0,3.0,2.0,189.0,189.0,2.0,3.0,1.0,2.0,96.0,96.0,2.0,1.0
-5.0,1.0,1.0,89.0,89.0,1.0,1.0,2.0,7.0,439.0,439.0,7.0,2.0,0.0,2.0,178.0,178.0,2.0,0.0
-4.0,4.0,2.0,192.0,192.0,2.0,4.0,3.0,8.0,1146.0,1146.0,8.0,3.0,5.0,6.0,522.0,522.0,6.0,5.0
-3.0,10.0,21.0,531.0,531.0,21.0,10.0,23.0,30.0,3717.0,3717.0,30.0,23.0,5.0,13.0,1637.0,1637.0,13.0,5.0
-2.0,23.0,39.0,1132.0,1132.0,39.0,23.0,26.0,61.0,7437.0,7437.0,61.0,26.0,8.0,22.0,3244.0,3244.0,22.0,8.0
-1.0,7.0,7.0,2343.0,2343.0,7.0,7.0,47.0,152.0,15472.0,15472.0,152.0,47.0,22.0,45.0,6685.0,6685.0,45.0,22.0
0.0,3.0,3.0,1245.0,1245.0,3.0,3.0,18.0,72.0,6201.0,6201.0,72.0,18.0,11.0,23.0,2680.0,2680.0,23.0,11.0
1.0,9.0,9.0,2624.0,2624.0,9.0,9.0,46.0,145.0,15569.0,15569.0,145.0,46.0,21.0,49.0,6692.0,6692.0,49.0,21.0
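
The pivoted table has two-level column labels (event total, then zone). The sketch below shows one way to select from such a table; it assumes a hypothetical frame holding four rows of values copied from the output above. Note that `pd.pivot_table` aggregates with the mean by default, which leaves the values unchanged only when each (`GD`, `Zone`) pair has a single row.

```python
import pandas as pd

# Hypothetical frame with values copied from the pivot output above.
toy = pd.DataFrame({
    'GD':       [-1.0, -1.0, 1.0, 1.0],
    'Zone':     ['D', 'O', 'D', 'O'],
    'TGoals_F': [7.0, 2343.0, 9.0, 2624.0],
    'TShots_F': [47.0, 15472.0, 46.0, 15569.0],
})

wide = pd.pivot_table(toy, values=['TGoals_F', 'TShots_F'],
                      index=['GD'], columns=['Zone'])

# A single column is addressed by a (value name, zone) tuple.
o_zone_shots = wide[('TShots_F', 'O')]

# xs selects one zone across all value columns and drops the Zone level.
o_zone = wide.xs('O', axis=1, level='Zone')

print(wide)
print(o_zone)
```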


In [82]:
beginningtex = """\\documentclass{report}
\\usepackage{booktabs}
\\begin{document}
"""
endtex = "\\end{document}\n"  # escaped backslash: "\e" is an invalid escape sequence

# Write the pivot table as a standalone LaTeX document; the with-block closes the file.
with open('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/latex/events/zones/sum_all_on_ice_events_with_zones_1.tex', 'w') as f:
    f.write(beginningtex)
    f.write(dz.to_latex())
    f.write(endtex)
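
The same wrapper is written three times in this notebook, so a small helper could remove the repetition. This is a sketch only: `write_standalone_tex`, `PREAMBLE`, `POSTAMBLE` and the demo path are hypothetical names, not part of the original notebook.

```python
import os
import tempfile

PREAMBLE = "\\documentclass{report}\n\\usepackage{booktabs}\n\\begin{document}\n"
POSTAMBLE = "\\end{document}\n"

def write_standalone_tex(table_tex, path):
    """Wrap a LaTeX table fragment in a minimal document and write it to path."""
    with open(path, 'w') as f:
        f.write(PREAMBLE)
        f.write(table_tex)
        if not table_tex.endswith('\n'):
            f.write('\n')  # keep \end{document} on its own line
        f.write(POSTAMBLE)

# Demo with a hand-written table fragment and a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_table.tex')
write_standalone_tex("\\begin{tabular}{lr}\nGD & Goals \\\\\n\\end{tabular}", demo_path)
with open(demo_path) as f:
    written = f.read()
print(written)
```

In the notebook, the call would look like `write_standalone_tex(dz.to_latex(), out_path)`, with `out_path` set to the target `.tex` file.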

In [83]:
dz2 = pd.pivot_table(dk, values=(['TBlocks_F', 'TBlocks_A', 'THits_F', 'THits_A', 'TPenalties_F', 'TPenalties_A']), index=['GD'], columns=['Zone'])
dz2.head()

Unnamed: 0_level_0,TBlocks_F,TBlocks_F,TBlocks_F,TBlocks_A,TBlocks_A,TBlocks_A,THits_F,THits_F,THits_F,THits_A,THits_A,THits_A,TPenalties_F,TPenalties_F,TPenalties_F,TPenalties_A,TPenalties_A,TPenalties_A
Zone,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O
GD,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2
-8.0,5.0,0.0,0.0,0.0,0.0,5.0,2.0,0.0,1.0,1.0,0.0,2.0,1.0,1.0,0.0,0.0,1.0,1.0
-7.0,51.0,0.0,0.0,0.0,0.0,51.0,18.0,3.0,13.0,13.0,3.0,18.0,8.0,1.0,3.0,3.0,1.0,8.0
-6.0,89.0,0.0,1.0,1.0,0.0,89.0,56.0,7.0,49.0,49.0,7.0,56.0,19.0,16.0,12.0,12.0,16.0,19.0
-5.0,219.0,0.0,1.0,1.0,0.0,219.0,108.0,22.0,95.0,95.0,22.0,108.0,45.0,56.0,25.0,25.0,56.0,45.0
-4.0,634.0,1.0,0.0,0.0,1.0,634.0,290.0,45.0,277.0,277.0,45.0,290.0,120.0,71.0,89.0,89.0,71.0,120.0


In [84]:
beginningtex = """\\documentclass{report}
\\usepackage{booktabs}
\\begin{document}
"""
endtex = "\\end{document}\n"  # escaped backslash: "\e" is an invalid escape sequence

# Write the pivot table as a standalone LaTeX document; the with-block closes the file.
with open('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/latex/events/zones/sum_all_on_ice_events_with_zones_2.tex', 'w') as f:
    f.write(beginningtex)
    f.write(dz2.to_latex())
    f.write(endtex)

In [85]:
dy = pd.pivot_table(dk, values=(['TFaceoffs_F', 'TFaceoffs_A', 'TGiveaways_F', 'TGiveaways_A', 'TTakeaways_F', 'TTakeaways_A' ]), index=['GD'], columns=['Zone'])
dy.head()

Unnamed: 0_level_0,TFaceoffs_F,TFaceoffs_F,TFaceoffs_F,TFaceoffs_A,TFaceoffs_A,TFaceoffs_A,TGiveaways_F,TGiveaways_F,TGiveaways_F,TGiveaways_A,TGiveaways_A,TGiveaways_A,TTakeaways_F,TTakeaways_F,TTakeaways_F,TTakeaways_A,TTakeaways_A,TTakeaways_A
Zone,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O,D,N,O
GD,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2
-8.0,0.0,2.0,1.0,1.0,2.0,0.0,1.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0
-7.0,10.0,26.0,8.0,8.0,26.0,10.0,12.0,11.0,12.0,1.0,4.0,7.0,9.0,3.0,5.0,5.0,3.0,9.0
-6.0,29.0,84.0,19.0,19.0,84.0,29.0,42.0,41.0,42.0,15.0,4.0,23.0,21.0,7.0,11.0,11.0,7.0,21.0
-5.0,53.0,133.0,58.0,58.0,133.0,53.0,95.0,95.0,95.0,24.0,12.0,59.0,49.0,19.0,20.0,20.0,19.0,49.0
-4.0,124.0,361.0,130.0,130.0,361.0,124.0,264.0,264.0,264.0,88.0,22.0,154.0,135.0,29.0,85.0,85.0,29.0,135.0


In [86]:
beginningtex = """\\documentclass{report}
\\usepackage{booktabs}
\\begin{document}
"""
endtex = "\\end{document}\n"  # escaped backslash: "\e" is an invalid escape sequence

# Write the pivot table as a standalone LaTeX document; the with-block closes the file.
with open('/Users/stefanostselios/Brock University/Kevin Mongeon - StephanosShare/out/latex/events/zones/sum_all_on_ice_events_with_zones_3.tex', 'w') as f:
    f.write(beginningtex)
    f.write(dy.to_latex())
    f.write(endtex)