# Player Evaluation

Player evaluation is conducted prior to roster design, as it weighs the overall performance of each player in each game over the course of a season. Total Hockey Rating (THoR) is an all-inclusive statistical rating of all NHL defensemen and forwards based on all on-ice events. Every event of a game is documented and assigned a value determined by the probability that the event led to a goal.

### purpose of notebook:

- determine roster position for both home and visitor team.

- reshape data set

- generate a variable that will show the time difference between a goal and all events that happened prior.


- keep only events that happened 20 seconds prior to a goal.

- group events by goal number to count the occurrence of each event prior to a goal.

- sum by event type to display the incidence of each event in two games.

- determine the value of home ice advantage ( μ )

- estimate the value each zone has on an event ( γ )

- determine if zone start has a positive or negative impact on each team for a given on-ice event (offensive, neutral and defensive)

- establish the impact of each event on a goal.

- determine if events have a positive or negative impact on each team.

- assign values to players based on their participation in events that led to a goal.

##  import modules

In [3]:
import sys
import os
import pandas as pd
import numpy as np
import datetime, time
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols
from pylab import hist, show
import scipy

## import data frame

The merged data frame created in the roster_design_stephanos notebook is imported and used for player evaluation.

In [4]:
dm = pd.read_csv('pbpmerge.csv')

## drop unnamed column (irrelevant)

In [5]:
dm = dm.drop('Unnamed: 0', axis=1)
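
A column named `Unnamed: 0` usually appears when a data frame was saved together with its row index and then read back without declaring that index. The following is a minimal sketch of that round trip on a toy frame; the frame and the in-memory buffer are illustrative assumptions, not the contents of `pbpmerge.csv`.

```python
import io
import pandas as pd

# Toy frame standing in for the merged play-by-play data (illustrative only).
toy = pd.DataFrame({'Season': [2015, 2015], 'EventType': ['FAC', 'SHOT']})

# Writing with the default index=True stores the row index as an unnamed first column.
buffer = io.StringIO()
toy.to_csv(buffer)
buffer.seek(0)

# Reading back without index_col turns that column into 'Unnamed: 0'.
restored = pd.read_csv(buffer)
leftover = [col for col in restored.columns if col.startswith('Unnamed')]

# Dropping the leftover column recovers the original columns.
cleaned = restored.drop(columns=leftover)
```

Writing the file with `index=False`, or reading it with `index_col=0`, would avoid the extra column in the first place.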

## rename visitor and home player position column

In [6]:
dm.columns

Index(['Season', 'GameNumber', 'EventNumber', 'Period', 'AdvantageType',
       'EventTimeFromZero', 'EventTimeFromTwenty', 'EventType', 'EventDetail',
       'VPlayer1', 'VPosition1', 'VPlayer2', 'VPosition2', 'VPlayer3',
       'VPosition3', 'VPlayer4', 'VPosition4', 'VPlayer5', 'VPosition5',
       'VPlayer6', 'VPosition6', 'HPlayer1', 'HPosition1', 'HPlayer2',
       'HPosition2', 'HPlayer3', 'HPosition3', 'HPlayer4', 'HPosition4',
       'HPlayer5', 'HPosition5', 'HPlayer6', 'HPosition6', 'TeamCode',
       'PlayerNumber', 'PlayerName', 'ShotType', 'Zone', 'Length',
       'ShotResult', 'ShotTeamCode', 'ShotPlayerNumber', 'ShotPlayerName',
       'WinTeamCode', 'VTeamCode', 'VNumber', 'VName', 'HTeamCode', 'HNumber',
       'HName', 'HitterTeamCode', 'HitterPlayerNumber', 'HitterPlayerName',
       'HitteeTeamCode', 'HitteePlayerNumber', 'HitteePlayerName',
       'PenaltyTeamCode', 'PenaltyPlayerNumber', 'PenaltyPlayerName',
       'PenaltyType', 'DrawnByTeamCode', 'DrawnByPlayer

## roster position

Play-by-play data reports each on-ice event along with all 12 players who were on the ice during that event, together with the outcome of the event. There are 6 players for the visitor team and 6 for the home team. Positions 1, 2 and 3 are the forward positions, 4 and 5 are the defense positions and 6 is the goaltender position. Each position is categorized below.

### a) for visitor team:

**position 1** is the **centre position** of forward lines

In [7]:
dm['VPosition1'] = 'C'

**position 2** is the **right wing** position of forward lines

In [8]:
dm['VPosition2'] = 'RW'

**position 3** is the **left wing** position of forward lines

In [9]:
dm['VPosition3'] = 'LW'

**position 4** is the **right defense** position of defensive pairings

In [10]:
dm['VPosition4'] = 'RD'

**position 5** is the **left defense** position of defensive pairings

In [11]:
dm['VPosition5'] = 'LD'

### b) for home team: 

**position 1** is the **centre position** of forward lines

In [12]:
dm['HPosition1'] = 'C'

**position 2** is the **right wing** position of forward lines

In [13]:
dm['HPosition2'] = 'RW'

**position 3** is the **left wing** position of forward lines

In [14]:
dm['HPosition3'] = 'LW'

**position 4** is the **right defense** position of defensive pairings

In [15]:
dm['HPosition4'] = 'RD'

**position 5** is the **left defense** position of defensive pairings

In [16]:
dm['HPosition5'] = 'LD'
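
The ten assignments above follow one pattern, so they can also be written as a loop. This is a sketch on a toy one-row frame; the `position_labels` dictionary is a name introduced here for illustration, and slot 6 (goaltender) is left untouched, as in the cells above.

```python
import pandas as pd

# Toy frame with a single event row (illustrative only).
toy = pd.DataFrame({'EventNumber': [1]})

# Slot-to-label mapping taken from the assignments above.
position_labels = {1: 'C', 2: 'RW', 3: 'LW', 4: 'RD', 5: 'LD'}

for side in ('V', 'H'):  # visitor columns first, then home columns
    for slot, label in position_labels.items():
        toy[f'{side}Position{slot}'] = label
```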

## reshape from wide to long

Once each roster position has been determined, the next step is to reshape the data set from wide to long. Instead of having 2 columns for each roster position (24 total), all players are listed in 4 columns: 2 columns for the visitor team, **'VPlayer' & 'VPosition'**, and 2 columns for the home team, **'HPlayer' & 'HPosition'**.

In [17]:
a = [col for col in dm.columns if 'VPlayer' in col]
b = [col for col in dm.columns if 'HPlayer' in col]
c = [col for col in dm.columns if 'VPosition' in col]
d = [col for col in dm.columns if 'HPosition' in col]
dm = pd.lreshape(dm, {'VPlayer' : a, 'HPlayer' : b, 'VPosition' : c, 'HPosition': d})
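
To make the effect of `pd.lreshape` concrete, here is a small sketch with two events and two visitor slots. The toy values are assumptions chosen for illustration; only the column naming pattern comes from the notebook.

```python
import pandas as pd

# Toy wide frame: two events, two visitor slots each (illustrative data).
wide = pd.DataFrame({
    'EventNumber': [1, 2],
    'VPlayer1': ['A', 'C'], 'VPosition1': ['C', 'C'],
    'VPlayer2': ['B', 'D'], 'VPosition2': ['RW', 'RW'],
})

# Each group of wide columns is stacked into one long column.
long_df = pd.lreshape(wide, {
    'VPlayer': ['VPlayer1', 'VPlayer2'],
    'VPosition': ['VPosition1', 'VPosition2'],
})

# Sort so that the rows of each event sit next to each other.
long_df = long_df.sort_values(['EventNumber', 'VPlayer']).reset_index(drop=True)
```

Each event now occupies one row per player slot. By default `pd.lreshape` drops rows in which any stacked column is missing (`dropna=True`), which matters when a slot is empty for some events.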

In [18]:
dm = dm.sort_values(['Season', 'GameNumber', 'Period', 'EventNumber'], ascending=[True, True, True, True])

In [19]:
dm.columns

Index(['AdvantageType', 'Assist1Player', 'Assist2Player', 'DrawnByPlayerName',
       'DrawnByPlayerNumber', 'DrawnByTeamCode', 'EventDetail', 'EventNumber',
       'EventTimeFromTwenty', 'EventTimeFromZero', 'EventType', 'GameDate',
       'GameNumber', 'GoalNumber', 'GoalTime', 'HName', 'HNumber', 'HTeamCode',
       'HitteePlayerName', 'HitteePlayerNumber', 'HitteeTeamCode',
       'HitterPlayerName', 'HitterPlayerNumber', 'HitterTeamCode', 'Length',
       'PenaltyPlayerName', 'PenaltyPlayerNumber', 'PenaltyTeamCode',
       'PenaltyType', 'Period', 'PlayerName', 'PlayerNumber', 'Season',
       'ShotPlayerName', 'ShotPlayerNumber', 'ShotResult', 'ShotTeamCode',
       'ShotType', 'TeamCode', 'VName', 'VNumber', 'VTeamCode', 'WinTeamCode',
       'Zone', 'VPlayer', 'HPlayer', 'VPosition', 'HPosition'],
      dtype='object')

In [20]:
dm = dm[['Season', 'GameNumber', 'EventNumber', 'Period', 'AdvantageType', 'EventTimeFromZero', 'EventTimeFromTwenty', 'EventType', 'EventDetail', 'TeamCode', 'VPlayer', 'VPosition', 'HPlayer', 'HPosition', 'PlayerNumber', 'PlayerName', 'ShotType', 'Zone', 'Length', 'ShotResult', 'ShotTeamCode', 'ShotPlayerNumber', 'ShotPlayerName', 'WinTeamCode', 'VTeamCode', 'HTeamCode', 'HitterTeamCode', 'HitterPlayerNumber', 'HitterPlayerName', 'HitteeTeamCode', 'HitteePlayerNumber', 'HitteePlayerName', 'PenaltyTeamCode', 'PenaltyPlayerNumber', 'PenaltyPlayerName', 'PenaltyType', 'DrawnByTeamCode', 'DrawnByPlayerNumber', 'DrawnByPlayerName', 'GameDate', 'GoalNumber', 'GoalTime', 'Assist1Player', 'Assist2Player']]
           

## fill in team code for all type of events

So that the team code column has no missing data, **numpy where** is used. It assigns a value to team code based on the event type and the outcome of that event.

- if an event is a faceoff, the team that won the faceoff will be assigned to 'TeamCode'. 

- if an event is a hit, the team that registered a hit will be assigned to 'TeamCode'. 

- if an event is a penalty, the team that committed the penalty will be assigned to 'TeamCode'.

In [21]:
dm['TeamCode'] = np.where(dm['EventType'] == 'FAC', dm['WinTeamCode'],
                             (np.where(dm['EventType'] == 'HIT', dm['HitterTeamCode'],
                                       (np.where(dm['EventType'] == 'PENL', dm['PenaltyTeamCode'], dm['TeamCode'])))))
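
The same three rules can be expressed with `np.select`, which takes a list of conditions and a matching list of choices instead of nested calls. This is an alternative sketch on toy data, not the cell used in this notebook; the team codes are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy events: only the last row already has a team code (illustrative data).
events = pd.DataFrame({
    'EventType': ['FAC', 'HIT', 'PENL', 'SHOT'],
    'TeamCode': [None, None, None, 'MTL'],
    'WinTeamCode': ['TOR', None, None, None],
    'HitterTeamCode': [None, 'MTL', None, None],
    'PenaltyTeamCode': [None, None, 'TOR', None],
})

# One condition per rule, in the same order as the bullets above.
conditions = [
    events['EventType'] == 'FAC',
    events['EventType'] == 'HIT',
    events['EventType'] == 'PENL',
]
choices = [
    events['WinTeamCode'],
    events['HitterTeamCode'],
    events['PenaltyTeamCode'],
]

# Rows matching no condition keep their existing team code.
events['TeamCode'] = np.select(conditions, choices, default=events['TeamCode'])
```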

## fill in home and visitor team code 

To ensure there are no missing data, home and visitor team codes are filled in *backwards*.

 - visitor team code for all events prior to a goal filled in backwards

In [22]:
dm['VTeamCode'] = dm['VTeamCode'].bfill()

 - home team code for all events prior to a goal filled in backwards

In [23]:
dm['HTeamCode'] = dm['HTeamCode'].bfill()

##  fill in variables goal number and goal time with values

Goal number and goal time values are assigned to every event, depending on the number of goals scored in a game and the time (from zero) at which they happened. Since events that occurred **prior to a goal** are being examined, the *backward fill* method is used. This supports the calculation of the time difference between a goal and a given event.

 - goal number for all events prior to a goal filled in backwards

In [24]:
dm['GoalNumber'] = dm['GoalNumber'].bfill()

- goal time for all events prior to a goal filled in backwards

In [25]:
dm['GoalTime'] = dm['GoalTime'].bfill()
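
A backward fill copies the next available value upward, so an event after the last goal of one game can inherit the first goal of the following game. Whether this occurs in the source data depends on how the games are ordered, which is an assumption here. The sketch below contrasts a plain backward fill with one applied per game on toy values.

```python
import numpy as np
import pandas as pd

# Toy data: game 1 has one event after its only goal (illustrative values).
toy = pd.DataFrame({
    'GameNumber': [1, 1, 1, 2, 2],
    'GoalTime': [np.nan, 300.0, np.nan, np.nan, 120.0],
})

# Plain backward fill: the last event of game 1 inherits the goal from game 2.
plain = toy['GoalTime'].bfill()

# Per-game backward fill: events after a game's last goal stay missing.
per_game = toy.groupby('GameNumber')['GoalTime'].bfill()
```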

## generate a variable that will calculate the time difference between goal and events

The time difference between a goal and an event is calculated as follows:

In [26]:
dm['TBGoalandEvent'] = dm['GoalTime'] - dm['EventTimeFromZero']

## keep only events that happened 20 seconds prior to goal

The player evaluation model uses only events that happened within the 20 seconds prior to a goal. If the time between a goal and an event exceeds 20 seconds, the event is not included in the data frame. Thus:

In [27]:
dm = dm[dm['TBGoalandEvent'] <= 20]

In [28]:
dm = dm[dm['TBGoalandEvent'] >= 0]
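
The two filters above keep rows whose time difference lies between 0 and 20 seconds inclusive. A compact sketch of the same window on toy data, using `Series.between`, is shown here; the event times are made up for illustration.

```python
import pandas as pd

# Toy events leading up to a goal scored at 300 seconds (illustrative values).
toy = pd.DataFrame({
    'EventType': ['FAC', 'HIT', 'SHOT', 'GOAL'],
    'EventTimeFromZero': [270, 285, 295, 300],
    'GoalTime': [300, 300, 300, 300],
})

# Seconds between each event and the goal that follows it.
toy['TBGoalandEvent'] = toy['GoalTime'] - toy['EventTimeFromZero']

# between() is inclusive on both ends by default, matching <= 20 and >= 0.
window = toy[toy['TBGoalandEvent'].between(0, 20)]
```

The face-off at 270 seconds is 30 seconds before the goal and is dropped; the other three rows remain.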

## create a column that will show the total observations for two games

The data is grouped by season to count the total occurrence of events that happened within 20 seconds prior to a goal in the first two games of the season.

In [29]:
dm['counts'] = dm.groupby('Season')['EventType'].transform('count')

## display of each event leading to a goal for two games

The table below lists the occurrence of each event type prior to each goal.

In [30]:
dy = dm.groupby(['Season','GameNumber', 'GoalNumber', 'EventType', 'Zone']).size()
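
One detail of `groupby(...).size()` is worth illustrating: rows with a missing value in any grouping key are left out by default. If a key such as `Zone` is missing for some events (an assumption about the data, not something shown above), those events will not appear in the counts unless `dropna=False` is passed. A sketch on toy data:

```python
import numpy as np
import pandas as pd

# Toy events before one goal; the goal row has no zone (illustrative data).
toy = pd.DataFrame({
    'GoalNumber': [1, 1, 1, 1],
    'EventType': ['SHOT', 'SHOT', 'HIT', 'GOAL'],
    'Zone': ['O', 'O', 'D', np.nan],
})

keys = ['GoalNumber', 'EventType', 'Zone']

# Default behaviour: the row with a missing Zone is excluded.
default_counts = toy.groupby(keys).size()

# dropna=False keeps missing keys as their own group.
all_counts = toy.groupby(keys, dropna=False).size()
```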

## create a variable that will establish the value of home ice advantage (μ)

The home team advantage is a phenomenon whereby sports teams experience a competitive benefit from playing at their home venue. Home team ice advantage is defined as the **mean goal differential in home games less the mean goal differential in road games of a given team**. This is strictly a measure of home ice advantage and does not depend on the quality of the team.

- assign value of 1 for a goal being scored for the home team.

In [31]:
dm['hgd'] = np.where((dm['Season'] == dm['Season']) & (dm['TeamCode'] == dm['HTeamCode']) & (dm['TeamCode'] == dm['TeamCode']) & (dm['EventType'] == 'GOAL'), 1, 0)

- assign a value of 1 for a goal being scored for the visitor team.

In [32]:
dm['vgd'] = np.where((dm['Season'] == dm['Season']) & (dm['TeamCode'] == dm['VTeamCode']) & (dm['TeamCode'] == dm['TeamCode']) & (dm['EventType'] == 'GOAL'), 1, 0)

- goal differential for a given team is the difference between mean goals as the home team and mean goals as the visitor team.

In [33]:
dm['gd'] = np.where((dm['Season'] == dm['Season']) & (dm['TeamCode'] == dm['TeamCode']), dm['hgd'].mean() - dm['vgd'].mean(), 0)

The variable generated is then divided by the mean total goals per game in the season, which is calculated by averaging the total goals scored by all teams in the league over all games. This statistic is the quantity of interest in assessing differential home ice advantage in the NHL.

In [34]:
dm['homeadv'] = np.where((dm['Season'] == dm['Season']), dm['gd']/(dm['hgd'].mean() + dm['vgd'].mean()), 0)
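
The arithmetic behind `homeadv` can be followed on a toy frame: the difference between the mean home-goal and mean visitor-goal indicators, divided by their sum. The team codes and events below are made up for illustration.

```python
import pandas as pd

# Toy events from one game with TOR at home (illustrative data).
toy = pd.DataFrame({
    'EventType': ['GOAL', 'SHOT', 'GOAL', 'GOAL'],
    'TeamCode': ['TOR', 'MTL', 'MTL', 'TOR'],
    'HTeamCode': ['TOR', 'TOR', 'TOR', 'TOR'],
    'VTeamCode': ['MTL', 'MTL', 'MTL', 'MTL'],
})

is_goal = toy['EventType'] == 'GOAL'
toy['hgd'] = (is_goal & (toy['TeamCode'] == toy['HTeamCode'])).astype(int)
toy['vgd'] = (is_goal & (toy['TeamCode'] == toy['VTeamCode'])).astype(int)

home_mean = toy['hgd'].mean()      # 2 home goals over 4 rows
visitor_mean = toy['vgd'].mean()   # 1 visitor goal over 4 rows

# Home advantage as a share of all goals.
homeadv = (home_mean - visitor_mean) / (home_mean + visitor_mean)
```

Because both means share the same denominator (the number of rows), the ratio does not change when every row is repeated the same number of times, as happens after the wide-to-long reshape.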

## create a variable that will display the value of each zone

The impact of zone start on each event that occurred within 20 seconds prior to a goal needs to be calculated. Values are assigned to three zone starts: offensive, neutral and defensive.

- create a column that assigns a value of 1 to **offensive** zone start events and a value of 0 to **non offensive** zone start events.

In [35]:
dm['ozone'] = np.where(dm['Zone'] == 'O', 1, 0)

- create a column that assigns a value of 1 to **neutral** zone start events and a value of 0 to **non neutral** zone start events.

In [36]:
dm['nzone'] = np.where(dm['Zone'] == 'N', 1, 0)

- create a column that assigns a value of 1 to **defensive** zone start events and a value of 0 to **non defensive** zone start events.

In [37]:
dm['dzone'] = np.where(dm['Zone'] == 'D', 1, 0)

## create a variable that will display the value of each zone start

Values have been assigned to all three zone starts. The **mean** of each zone start is used to estimate the effect each zone start has on a goal.

In [38]:
dm['zsvalue'] = np.where((dm['Season'] == dm['Season']) & (dm['Zone'] == 'O'), dm['ozone'].mean(),
                             (np.where((dm['Season'] == dm['Season']) & (dm['Zone'] == 'N'), dm['nzone'].mean(),
                                       (np.where((dm['Season'] == dm['Season']) & (dm['Zone'] == 'D'), (dm['dzone'].mean()), 0)))))

All zone starts have been assigned a value based on the impact they had on a goal. 
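
When comparisons are combined with `&`, each comparison needs its own parentheses, because `&` binds more tightly than `==` in Python. The sketch below shows the difference on two small arrays; the values are illustrative.

```python
import numpy as np

season = np.array([2015, 2015])
is_offensive = np.array([True, False])

# Without parentheses, `season & is_offensive` is evaluated first (a bitwise AND),
# and only then compared with `season`.
unparenthesized = season == season & is_offensive

# With parentheses, the comparison is evaluated first and then combined.
parenthesized = (season == season) & is_offensive
```

The unparenthesized form marks no row as offensive, while the parenthesized form marks the first row, which is the intended result.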

## create zone value for home and visitor teams (γ)

Every zone start has an effect on both home and visitor team. A zone start that happened in the **offensive zone of the home team,** will be in the defensive zone of the visitor team. A zone start that happened in the **defensive zone of the home team,** will be in the offensive zone for the visitor team. Neutral zone is the same for both teams. 

If zone start is **offensive**, it has a positive impact and is **assigned a value of +1**. If zone start is **neutral**, it has no impact and is assigned a **value of 0**. If zone start is **defensive**, it has a negative impact and is assigned a **value of -1**.

### a) zone value for home team

First, we must determine whether the event refers to the home or visitor team. To estimate zone starts relevant to the home team, the team code of the event must be the same as the home team code. The next step is to assign a value based on the zone start of a given event.

- if zone start is offensive for the home team, the mean value of offensive zone start is assigned to the home team.

- if zone start is offensive for the visitor team, the **negative** mean value of defensive zone start is assigned to the home team.

- if zone start is defensive for the home team, the **negative** mean value of defensive zone start is assigned to the home team.

- if zone start is defensive for the visitor team, the mean value of offensive zone start is assigned to the home team.

- if zone start is neutral, it has no impact on the given event and is assigned a value of zero.

In [39]:
dm['hzsvalue'] = np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] == dm['HTeamCode']) & (dm['Zone'] == 'O'), dm['ozone'].mean(),
                          (np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] != dm['HTeamCode']) & (dm['Zone'] == 'O'), -dm['dzone'].mean(),
                                   (np.where((dm['Season']== dm['Season']) &(dm['TeamCode'] == dm['HTeamCode']) & (dm['Zone'] == 'D'), -dm['dzone'].mean(),
                                          (np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] != dm['HTeamCode']) & (dm['Zone'] == 'D'), dm['ozone'].mean(), 0)))))))
                                                                                              

### b) zone value for visitor team

First, we must determine whether the event refers to the home or visitor team. To estimate zone starts relevant to the visitor team, the team code of a given event must be the same as the visitor team code. The next step is to assign a value based on the zone start of a given event.

- if zone start is offensive for the visitor team, the mean value of offensive zone start is assigned to the visitor team.

- if zone start is offensive for the home team, the **negative** mean value of defensive zone start is assigned to the visitor team.

- if zone start is defensive for the visitor team, the **negative** mean value of defensive zone start is assigned to the visitor team.

- if zone start is defensive for the home team, the mean value of offensive zone start is assigned to the visitor team.

- if zone start is neutral, it has no impact on the given event and is assigned a value of zero.

In [40]:
dm['vzsvalue'] = np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] == dm['VTeamCode']) & (dm['Zone'] == 'O'), dm['ozone'].mean(),
                          (np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] != dm['VTeamCode']) & (dm['Zone'] == 'O'), -dm['dzone'].mean(),
                                (np.where((dm['Season']== dm['Season']) & (dm['TeamCode'] == dm['VTeamCode']) & (dm['Zone'] == 'D'), -dm['dzone'].mean(),
                                        (np.where((dm['Season']== dm['Season']) &(dm['TeamCode'] != dm['VTeamCode']) & (dm['Zone'] == 'D'), dm['ozone'].mean(), 0)))))))
                                                                                              

## zone start (ZS)

Using the zone variable, a single zone start variable covering offensive, neutral and defensive zone starts is created.

**zone start variable:** 

- a value of 1 will be assigned if the on-ice event happened in the offensive zone.

- a value of 0 will be assigned if the on-ice event happened in the neutral zone.

- a value of -1 will be assigned if the on-ice event happened in the defensive zone of the respective team.

In [41]:
dm['zs'] = np.where(dm['Zone'] == 'O', 1,
                    (np.where(dm['Zone'] == 'D', -1, 0)))

## home and visitor zone start

- **visitor team zone start (vzs)**

If the team code of the event is the same as the visitor team code, the visitor zone start variable is assigned the same value as zone start. If not, it is assigned the opposite (negative) value of zone start.

In [42]:
dm['vzs'] = np.where(dm['TeamCode'] == dm['VTeamCode'], dm['zs'], -dm['zs'] )

- **home team zone start (hzs)**

If the team code of the event is the same as the home team code, the home zone start variable is assigned the same value as zone start. If not, it is assigned the opposite (negative) value of zone start.

In [43]:
dm['hzs'] = np.where(dm['TeamCode'] == dm['HTeamCode'], dm['zs'], -dm['zs'] )
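
A toy example makes the sign flip visible: when every event belongs to either the home or the visitor team, `hzs` and `vzs` are mirror images of each other. The team codes below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy events from one game with TOR at home (illustrative data).
toy = pd.DataFrame({
    'TeamCode': ['TOR', 'MTL', 'TOR'],
    'HTeamCode': ['TOR', 'TOR', 'TOR'],
    'VTeamCode': ['MTL', 'MTL', 'MTL'],
    'Zone': ['O', 'O', 'N'],
})

# +1 for offensive, -1 for defensive, 0 for neutral, from the acting team's view.
toy['zs'] = np.where(toy['Zone'] == 'O', 1, np.where(toy['Zone'] == 'D', -1, 0))

# Keep the sign for the acting team and flip it for the opponent.
toy['hzs'] = np.where(toy['TeamCode'] == toy['HTeamCode'], toy['zs'], -toy['zs'])
toy['vzs'] = np.where(toy['TeamCode'] == toy['VTeamCode'], toy['zs'], -toy['zs'])
```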

## create columns for each type of event and assign values to determine the impact they have on a goal

Values are assigned to nine even-strength events: blocked shot, face-off, shot on goal, missed shot, penalty, hit, takeaway, giveaway and goal.

 - create a column that assigns a value of 1 to block events and a value of 0 to every non block event

In [44]:
dm['block'] = np.where(dm['EventType'] == 'BLOCK', 1, 0)

 - create a column that assigns a value of 1 to faceoff events and a value of 0 to every non faceoff event

In [45]:
dm['faceoff'] = np.where(dm['EventType'] == 'FAC', 1, 0)

 - create a column that assigns a value of 1 to giveaway events and a value of 0 to every non giveaway event

In [46]:
dm['giveaway'] = np.where(dm['EventType'] == 'GIVE', 1, 0)

- create a column that assigns a value of 1 to goal events and a value of 0 to every non goal event

In [47]:
dm['goal'] = np.where(dm['EventType'] == 'GOAL', 1, 0)

- create a column that assigns a value of 1 to hit events and a value of 0 to every non hit event

In [48]:
dm['hit'] = np.where(dm['EventType'] == 'HIT', 1, 0)

- create a column that assigns a value of 1 to missed shot events and a value of 0 to every non missed shot event

In [49]:
dm['miss'] = np.where(dm['EventType'] == 'MISS', 1, 0)

 - create a column that assigns a value of 1 to penalty events and a value of 0 to every non penalty event

In [50]:
dm['penalty'] = np.where(dm['EventType'] == 'PENL', 1, 0)

- create a column that assigns a value of 1 to shot events and a value of 0 to every non shot event

In [51]:
dm['shot'] = np.where(dm['EventType'] == 'SHOT', 1, 0)

 - create a column that assigns a value of 1 to takeaway events and a value of 0 to non takeaway events

In [52]:
dm['takeaway'] = np.where(dm['EventType'] == 'TAKE', 1, 0)
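
The nine indicator columns above share one pattern and can be generated from a mapping of column name to event code. This is a sketch on toy data; the `event_codes` dictionary is a name introduced here, with codes copied from the cells above.

```python
import pandas as pd

# Toy events (illustrative data).
toy = pd.DataFrame({'EventType': ['FAC', 'SHOT', 'GOAL', 'SHOT']})

# Column name -> event code, as used in the cells above.
event_codes = {
    'block': 'BLOCK', 'faceoff': 'FAC', 'giveaway': 'GIVE',
    'goal': 'GOAL', 'hit': 'HIT', 'miss': 'MISS',
    'penalty': 'PENL', 'shot': 'SHOT', 'takeaway': 'TAKE',
}

# One 0/1 column per event type.
for column, code in event_codes.items():
    toy[column] = (toy['EventType'] == code).astype(int)
```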

## create a variable that will display the value of each event 

All events that happened 20 seconds prior to a goal are counted. The **mean** is used to establish the impact each event has on a goal.

The first step is to determine if an event has a positive or negative impact on a goal:

 - giveaway has a negative impact on the team that lost possession.

 - faceoff has a positive impact on the team that won the faceoff and a negative impact on the team that lost. 

- hit has a positive impact for the team that delivered the hit and a negative impact on the team that received the hit.

- penalty has a positive impact on the team that drew the penalty and a negative impact on the team serving. 

 - takeaway has a positive impact on the team that stole the puck and gained possession.

In [53]:
dm['eventvalue'] = np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'BLOCK'), dm['block'].mean(),
                             (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'FAC'), dm['faceoff'].mean(),
                                       (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'GIVE'), -(dm['giveaway'].mean()),
                                                 (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'GOAL'), dm['goal'].mean(),
                                                          (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'HIT'), dm['hit'].mean(),
                                                                   (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'MISS'), dm['miss'].mean(),
                                                                            (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'PENL'), -(dm['penalty'].mean()),
                                                                                     (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'SHOT'), dm['shot'].mean(),
                                                                                              (np.where((dm['Season']== dm['Season']) & (dm['EventType'] == 'TAKE'), dm['takeaway'].mean(), 0)))))))))))))))))

All event types have been assigned a value based on the impact they had on a goal. 
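
The mean of a 0/1 indicator column equals the share of rows with that event type, so the event value is that share with a sign attached. The sketch below reproduces this on toy data with `value_counts(normalize=True)`; it is an alternative formulation, and unlike the cell above it would also give a non-zero value to event types outside the nine listed.

```python
import pandas as pd

# Toy events within the window before a goal (illustrative data).
toy = pd.DataFrame({'EventType': ['SHOT', 'SHOT', 'GIVE', 'GOAL']})

# Share of rows per event type; equal to the mean of the matching indicator column.
share = toy['EventType'].value_counts(normalize=True)

# Event types counted against the acting team in the cell above.
sign = {'GIVE': -1, 'PENL': -1}

# Signed share per row; event types not listed in `sign` keep a positive sign.
toy['eventvalue'] = toy['EventType'].map(share) * toy['EventType'].map(sign).fillna(1)
```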

##  create event value for home and visitor teams

Each event has an effect on both home and visitor team. An event that has a positive impact on the home team will have a negative impact on the visitor team. Similarly, an event that has a negative effect on the home team, will have a positive effect on the visitor team.

- If an event has a positive impact on the **home team**, the mean will be positive. If an event has a negative impact on the home team, the mean will be negative.

In [54]:
dm['heventvalue'] = np.where((dm['Season'] == dm['Season']) & (dm['TeamCode'] == dm['HTeamCode']), dm['eventvalue'], -(dm['eventvalue']))

- If an event has a positive impact on the **visitor team**, the mean will be positive. If an event has a negative impact on the visitor team, the mean will be negative.

In [55]:
dm['veventvalue'] = np.where((dm['Season'] == dm['Season']) & (dm['TeamCode'] == dm['VTeamCode']), dm['eventvalue'], -(dm['eventvalue']))

## assign values to each player 

The value of an event is assigned to all players that were on ice, a total of 12 players (6 per team). The overall contribution of each player is the total (sum) of events they participated in. 

### a) overall contribution of each player from the visitor team in all (6) positions 

Group the data frame by season, visitor team code and visitor player to separate players that play in the same position.

In [56]:
dm['vp'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer'])['veventvalue'].transform('sum')

### b) overall contribution of each player from the home team in all (6) positions

Group the data frame by season, home team code and home player to separate players that play in the same position.

In [57]:
dm['hp'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer'])['heventvalue'].transform('sum')
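
`transform('sum')` returns one value per row, so every row of a player carries that player's total. A sketch on toy data, with made-up players and values:

```python
import pandas as pd

# Toy visitor rows for two players (illustrative data).
toy = pd.DataFrame({
    'Season': [2015, 2015, 2015, 2015],
    'VTeamCode': ['MTL', 'MTL', 'MTL', 'MTL'],
    'VPlayer': ['A', 'B', 'A', 'B'],
    'veventvalue': [0.5, 0.5, -0.25, 0.25],
})

# Each row receives the sum over all rows of the same player.
keys = ['Season', 'VTeamCode', 'VPlayer']
toy['vp'] = toy.groupby(keys)['veventvalue'].transform('sum')

# One row per player, for inspection.
per_player = toy.drop_duplicates(keys)[['VPlayer', 'vp']]
```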

## games played

Create a variable that calculates the number of games each player played in. The total contribution of a given player is the sum of the events he participated in divided by the number of games he played.

- **a) games played per player for visitor team:**

- create a variable that counts the number of games each player from the **visitor team** played.

In [58]:
dm['vgp'] = dm.groupby(['Season', 'VTeamCode', 'EventNumber', 'VPlayer'])['GameNumber'].transform('count')

- **b) games played per player for home team:**

- create a variable that counts the number of games each player from the **home team** played.

In [59]:
dm['hgp'] = dm.groupby(['Season', 'HTeamCode', 'EventNumber', 'HPlayer'])['GameNumber'].transform('count')
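
For comparison, games played can also be read as the number of distinct game numbers in which a player appears. The sketch below shows that reading on toy data with `transform('nunique')`. It is an alternative interpretation offered for illustration, not the calculation used in the cells above, which count rows within each event number; the column name `vgp_distinct` is hypothetical.

```python
import pandas as pd

# Toy visitor rows: player A appears in two games, player B in one (illustrative data).
toy = pd.DataFrame({
    'Season': [2015, 2015, 2015, 2015, 2015],
    'VTeamCode': ['MTL', 'MTL', 'MTL', 'MTL', 'MTL'],
    'GameNumber': [1, 1, 1, 2, 2],
    'EventNumber': [10, 11, 12, 10, 11],
    'VPlayer': ['A', 'A', 'B', 'A', 'A'],
})

# Number of distinct games per player, repeated on each of the player's rows.
toy['vgp_distinct'] = (
    toy.groupby(['Season', 'VTeamCode', 'VPlayer'])['GameNumber'].transform('nunique')
)
```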

## overall games played

The number of games each player played has so far been calculated separately for games at home and games away. The **total games** of each player is the sum of the home and away games he participated in over a whole season.

- create a variable that adds up the home games played and away games played for all players of a given team.

In [60]:
dm['gp'] = np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] == dm['VTeamCode']) & (dm['HPlayer'] == dm['VPlayer']), (dm['hgp'] + dm['vgp']),
                   (np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] != dm['VTeamCode']) & (dm['HPlayer'] != dm['VPlayer']), dm['hgp'],
                   (np.where((dm['Season'] == dm['Season']) &(dm['VTeamCode'] == dm['HTeamCode']) & (dm['VPlayer'] == dm['HPlayer']), (dm['vgp'] + dm['hgp']), dm['vgp'])))))

## overall player contribution

The impact of each player has so far been calculated separately for games at home and games away, since home event value and visitor event value were used. The **total contribution** of each player is the total of the events he participated in over a whole season. Thus, the sum of both home and away event values must be computed.

- create a variable that adds up the home event value and away event value for all players of a given team and divides the result by games played.

In [61]:
dm['plyr'] = np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] == dm['VTeamCode']) & (dm['HPlayer'] == dm['VPlayer']), (dm['hp'] + dm['vp'])/dm['gp'],
                   (np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] != dm['VTeamCode']) & (dm['HPlayer'] != dm['VPlayer']), dm['hp']/dm['hgp'],
                   (np.where((dm['Season'] == dm['Season']) &(dm['VTeamCode'] == dm['HTeamCode']) & (dm['VPlayer'] == dm['HPlayer']), (dm['vp'] + dm['hp'])/dm['gp'], dm['vp']/dm['vgp'])))))

The total contribution of each player for the duration of a season has been measured.
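
One way to cross-check a per-game contribution is to stack the home and visitor columns into a single player column and aggregate once. The sketch below does this on toy data. It is an alternative formulation under the assumption that a player is identified by name alone; the names `stacked`, `summary`, `Player` and `value` are introduced here for illustration.

```python
import pandas as pd

# Toy rows: A is a visitor in game 1 and at home in game 2; B the reverse.
toy = pd.DataFrame({
    'GameNumber': [1, 1, 2, 2],
    'VPlayer': ['A', 'A', 'B', 'B'],
    'HPlayer': ['B', 'B', 'A', 'A'],
    'veventvalue': [0.5, -0.25, 0.25, 0.25],
    'heventvalue': [-0.5, 0.25, -0.25, -0.25],
})

# Bring both sides to the same column names, then stack them.
visitor = toy[['GameNumber', 'VPlayer', 'veventvalue']].rename(
    columns={'VPlayer': 'Player', 'veventvalue': 'value'})
home = toy[['GameNumber', 'HPlayer', 'heventvalue']].rename(
    columns={'HPlayer': 'Player', 'heventvalue': 'value'})
stacked = pd.concat([visitor, home], ignore_index=True)

# Total value and number of distinct games per player.
summary = stacked.groupby('Player').agg(
    total=('value', 'sum'),
    games=('GameNumber', 'nunique'),
)
summary['per_game'] = summary['total'] / summary['games']
```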

## store player evaluation data frame

The player evaluation data frame is stored and used for the next stage of analysis, player allocation.

In [62]:
dm.to_csv('player_evaluation.csv', index=False, sep=',')

The next step is to allocate players to their respective roster positions.