# Player Allocation

Once overall performance has been evaluated for all skaters, each player is categorized and assigned to their respective position within their team's roster. The fundamental statistic used to separate top players from bottom players is zone start percentage. A play can start in one of three zones: offensive, neutral, or defensive. Offensive zone start percentage is the number of face-offs held in the attacking zone divided by the total number of face-offs a player was on the ice for. Likewise, defensive zone start percentage is the number of face-offs taken in the player's own zone divided by the total number of face-offs that player was on the ice for. Skaters who are talented at creating opportunities and producing goals are expected to have a higher percentage of offensive zone starts than defensive zone starts. Conversely, players who are skilled at preventing chances and goals against are expected to have a higher defensive zone start percentage than offensive zone start percentage. In other words, top-six forwards and top-four defensemen start many of their shifts in the opponent's zone, whereas bottom-six forwards and bottom-pairing defensemen start many of theirs in their own zone.
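The two percentages defined above can be illustrated with a minimal sketch. The face-off counts, the player labels, and the column names (`oz`, `nz`, `dz`, `oz_pct`, `dz_pct`) are made up for illustration and are not taken from the notebook's data.

```python
import pandas as pd

# Hypothetical face-off counts per zone for two players (illustrative numbers only).
faceoffs = pd.DataFrame({
    "player": ["A", "B"],
    "oz": [60, 20],  # face-offs in the offensive zone
    "nz": [30, 30],  # face-offs in the neutral zone
    "dz": [10, 50],  # face-offs in the defensive zone
})

# Total face-offs each player was on the ice for.
total = faceoffs[["oz", "nz", "dz"]].sum(axis=1)

# Zone start percentages as defined in the text.
faceoffs["oz_pct"] = faceoffs["oz"] / total
faceoffs["dz_pct"] = faceoffs["dz"] / total

print(faceoffs)
```

Under these assumed numbers, player A starts 60% of shifts in the offensive zone and would lean toward a top-of-roster role, while player B starts 50% in the defensive zone.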


### purpose of notebook:

a) create two variables: offensive zone start and defensive zone start.

b) sum up the total offensive and defensive zone start per player.

c) categorize forwards into top and bottom six.

d) categorize defensemen into top four and bottom pairing.

e) determine each player's roster depth position

## import modules

In [1]:
import sys
import os
import pandas as pd
import numpy as np
import datetime, time
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols
from pylab import hist, show
import scipy

## import data frame

The player evaluation data frame is used for player allocation.

In [2]:
dm = pd.read_csv('plyreval.csv')

## drop unnamed column (irrelevant)

In [3]:
dm = dm.drop('Unnamed: 0', axis=1)

## zone start

Using the `Zone` variable, a zone start variable is created that encodes offensive, neutral, and defensive zone starts.

**zone start variable:** 

- a value of 1 will be assigned if the on-ice event happened in the offensive zone.

- a value of 0 will be assigned if the on-ice event happened in the neutral zone.

- a value of -1 will be assigned if the on-ice event happened in the defensive zone of the respective team.

In [4]:
dm['zs'] = np.where(dm['Zone'] == 'O', 1,
                    (np.where(dm['Zone'] == 'D', -1, 0)))
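As a quick check of the encoding, the sketch below applies the same nested `np.where` to a toy frame and compares it with a dictionary lookup. The toy rows and the `'N'` code for the neutral zone are assumptions; the notebook only tests for `'O'` and `'D'` and sends everything else to 0.

```python
import numpy as np
import pandas as pd

# Toy events; 'N' as the neutral-zone code is an assumption for illustration.
toy = pd.DataFrame({"Zone": ["O", "N", "D", "O"]})

# Same nested np.where as in the notebook cell.
toy["zs"] = np.where(toy["Zone"] == "O", 1,
                     np.where(toy["Zone"] == "D", -1, 0))

# Equivalent lookup: unmapped zones become NaN, then fall back to 0.
toy["zs_map"] = toy["Zone"].map({"O": 1, "D": -1}).fillna(0).astype(int)

print(toy)
```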

## home and visitor zone start

- visitor team zone start (vzs)

If the team code of the event matches the visitor team code, the visitor zone start variable takes the same value as zone start. Otherwise, it takes the opposite (negated) value of zone start.

In [5]:
dm['vzs'] = np.where(dm['TeamCode'] == dm['VTeamCode'], dm['zs'], -dm['zs'] )

- home team zone start (hzs) 

If the team code of the event matches the home team code, the home zone start variable takes the same value as zone start. Otherwise, it takes the opposite (negated) value of zone start.

In [6]:
dm['hzs'] = np.where(dm['TeamCode'] == dm['HTeamCode'], dm['zs'], -dm['zs'] )
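The sign flip can be seen on a two-row toy example. The team codes `BOS` and `MTL` and the event rows are invented for illustration; only the column names follow the notebook.

```python
import numpy as np
import pandas as pd

# Two offensive-zone events in one hypothetical game (BOS visiting MTL).
toy = pd.DataFrame({
    "TeamCode":  ["BOS", "MTL"],  # team the event is recorded for
    "VTeamCode": ["BOS", "BOS"],
    "HTeamCode": ["MTL", "MTL"],
    "zs":        [1, 1],
})

# Same logic as the notebook cells: keep the sign for the event's own team, negate it for the opponent.
toy["vzs"] = np.where(toy["TeamCode"] == toy["VTeamCode"], toy["zs"], -toy["zs"])
toy["hzs"] = np.where(toy["TeamCode"] == toy["HTeamCode"], toy["zs"], -toy["zs"])

print(toy[["TeamCode", "zs", "vzs", "hzs"]])
```

In this toy case an offensive zone start for one team is recorded as a defensive zone start for the other, so `hzs` is the negation of `vzs` in every row.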

## assign zone start to players

The zone start value is assigned to all players who were on the ice, a total of 12 players (6 per team). Each player's overall zone start variable is the sum over all events they participated in.

### a) overall zone start of each player from the visitor team in all (6) positions 

Group the data frame by season, visitor team code, and visitor player position to separate players who play in the same position.

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer1"

In [7]:
dm['zvp1'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer1'])['vzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer2"

In [8]:
dm['zvp2'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer2'])['vzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer3"

In [9]:
dm['zvp3'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer3'])['vzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer4"

In [10]:
dm['zvp4'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer4'])['vzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer5"

In [11]:
dm['zvp5'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer5'])['vzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the visitor team that is listed as "VPlayer6"

In [12]:
dm['zvp6'] = dm.groupby(['Season', 'VTeamCode', 'VPlayer6'])['vzs'].transform('sum')

### b) overall zone start of each player from the home team in all (6) positions

Group the data frame by season, home team code, and home player position to separate players who play in the same position.

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer1"

In [13]:
dm['zhp1'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer1'])['hzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer2"

In [14]:
dm['zhp2'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer2'])['hzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer3"

In [15]:
dm['zhp3'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer3'])['hzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer4"

In [16]:
dm['zhp4'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer4'])['hzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer5"

In [17]:
dm['zhp5'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer5'])['hzs'].transform('sum')

- create variable that sums up the overall zone start of each player from the home team that is listed as "HPlayer6"

In [18]:
dm['zhp6'] = dm.groupby(['Season', 'HTeamCode', 'HPlayer6'])['hzs'].transform('sum')
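The twelve cells above follow one pattern, so they could also be written as a loop. The sketch below is one possible compact form, not the notebook's own code: the helper name `add_slot_totals` and the toy data are hypothetical, while the column names match those used above.

```python
import pandas as pd

def add_slot_totals(df):
    """Add zvp1..zvp6 and zhp1..zhp6, mirroring the twelve cells above."""
    df = df.copy()
    for side, zs_col in (("V", "vzs"), ("H", "hzs")):
        for slot in range(1, 7):
            player_col = f"{side}Player{slot}"
            out_col = f"z{side.lower()}p{slot}"
            # Sum the team-relative zone start over all events that share
            # the same season, team, and player in this slot.
            df[out_col] = (
                df.groupby(["Season", f"{side}TeamCode", player_col])[zs_col]
                  .transform("sum")
            )
    return df

# Toy data: three events in one hypothetical game, made-up player ids.
toy = pd.DataFrame({
    "Season": [2012, 2012, 2012],
    "VTeamCode": ["BOS"] * 3,
    "HTeamCode": ["MTL"] * 3,
    "vzs": [1, -1, 1],
    "hzs": [-1, 1, -1],
})
for side in ("V", "H"):
    for slot in range(1, 7):
        # The same player fills a slot in the first two events; another one in the third.
        toy[f"{side}Player{slot}"] = [10 * slot, 10 * slot, 10 * slot + 1]

result = add_slot_totals(toy)
print(result[["VPlayer1", "vzs", "zvp1", "HPlayer1", "hzs", "zhp1"]])
```

Note that each total only covers events in which the player was listed in that particular slot, which is the same behaviour as the individual cells.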

## allocate players per position to forward lines and defensive pairings

- If a player's zone start variable is the highest on their team in that specific position, the player had the highest net number of offensive zone starts (offensive minus defensive). These skaters will be identified as **first line forwards and top defensive pairing**.

- If a player's zone start variable is the second highest in that specific position, the player had the second highest net number of offensive zone starts. These skaters will be identified as **second line forwards and second defensive pairing**.

- If a player's zone start variable is the third highest in that specific position, the player had the third highest net number of offensive zone starts. These skaters will be identified as **third line forwards and bottom defensive pairing**.

- If a player's zone start variable is the lowest in that specific position, the player had the lowest net number of offensive zone starts. These skaters will be identified as **fourth line forwards**.
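The ranking rule in the list above could be expressed with a grouped rank. This is only a sketch under assumptions: the notebook does not show its own allocation code here, and the frame `totals` with columns `TeamCode`, `Player`, `zone_start`, and `line` is hypothetical, holding one row per player for a single position.

```python
import pandas as pd

# Hypothetical per-player totals for one position on one team (made-up values).
totals = pd.DataFrame({
    "Season": [2012] * 4,
    "TeamCode": ["BOS"] * 4,
    "Player": [11, 12, 13, 14],
    "zone_start": [40, 15, -5, -30],
})

# Rank within season and team: the highest zone start gets rank 1 (first line),
# the next gets rank 2, and so on. Dense ranking gives tied players the same line.
totals["line"] = (
    totals.groupby(["Season", "TeamCode"])["zone_start"]
          .rank(method="dense", ascending=False)
          .astype(int)
)

print(totals)
```

With real data a team may have more than four players at a position, so ranks above 4 (or above 3 for defensemen) would need an explicit rule; that choice is left open here.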

### a) visitor team

### b) home team