# Faceoff Analysis

The root question is: "Who does Patrice Bergeron lose faceoffs to most?"

From there I'm sure I'll want to explore:
- Which zone are most faceoff losses in?
- How often is Bergy waved off the dot?
- What are his percentages in each zone?
- How do period and +/- impact faceoff percentage?

But first I need something to parse a game file.

In [91]:


# Set up the environment and pull in the game data
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; the old pandas.io.json import path is deprecated
import json

# Pulled from https://statsapi.web.nhl.com/api/v1/game/{1}/feed/live
filename = 'api_data/game_2019020010.json'

with open(filename) as json_file:
    game_data = json.load(json_file)

## Get Faceoff data

Pull the faceoff plays out of the `liveData` section of the game feed and flatten them into one row per faceoff, which is easier to work with.

- [ ] Push the PlayID into the index for this DataFrame
- [ ] Add GameID to the data so that we can export this data to a tracking dataset
  - We'll build up a dataset of total faceoffs, wins, and who `select_player` lost to
- [ ] Add Period to the data so that we can analyze win/loss by Period
- [ ] Add Zone to the data so that we can analyze win/loss on Zone
- [ ] Add losing/winning/event to the data so that we can analyze win/loss on that dimension as well
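To make the flattening concrete, here is a minimal sketch that works on one raw (un-normalized) play dict instead of a `json_normalize` row. The helper name `flatten_faceoff` and the hand-built `sample_play` are hypothetical, and the `'Loser'` label in the sample is an assumption, since the notebook only checks for `'Winner'`.

```python
def flatten_faceoff(play, game_id):
    """Flatten one nested faceoff play into a single-level dict."""
    row = {
        'winner': None,
        'loser': None,
        'gameId': game_id,
        'period': play['about']['period'],
    }
    for entry in play['players']:
        name = entry['player']['fullName']
        # Same rule as the notebook: anyone not tagged 'Winner' is the loser
        if entry['playerType'] == 'Winner':
            row['winner'] = name
        else:
            row['loser'] = name
    return row


# Hand-built play, trimmed to the fields the notebook reads.
# 'Loser' as a playerType value is assumed for illustration.
sample_play = {
    'result': {'event': 'Faceoff'},
    'about': {'period': 1},
    'players': [
        {'player': {'fullName': 'Patrice Bergeron'}, 'playerType': 'Winner'},
        {'player': {'fullName': 'Tyler Seguin'}, 'playerType': 'Loser'},
    ],
}

flat = flatten_faceoff(sample_play, 2019020010)
print(flat)
```

Keeping the per-play logic in one function should make it easier to add the Zone and event columns from the checklist later without touching the loop that walks the plays.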

In [54]:


gameId = game_data['gamePk']

plays = json_normalize(game_data['liveData']['plays']['allPlays'])

plays_list = plays.loc[lambda r: r['result.event'] == 'Faceoff']

faceoff_list = []

# Build one flat record per faceoff play
for i, p in plays_list.iterrows():
    play_object = {
        'winner': None,
        'loser': None,
        'gameId': gameId,
        'period': p['about.period']
    }

    # Each faceoff lists two players; anyone not tagged 'Winner' is treated as the loser
    players_list = p['players']

    for player in players_list:
        if player['playerType'] == 'Winner':
            play_object['winner'] = player['player']['fullName']
        else:
            play_object['loser'] = player['player']['fullName']

    faceoff_list.append(play_object)

faceoff_df = pd.DataFrame(faceoff_list)

faceoff_df.head(5)

| | winner | loser | gameId | period |
|---|---|---|---|---|
| 0 | Patrice Bergeron | Tyler Seguin | 2019020010 | 1 |
| 1 | Mattias Janmark | Sean Kuraly | 2019020010 | 1 |
| 2 | Roope Hintz | Par Lindholm | 2019020010 | 1 |
| 3 | Par Lindholm | Roope Hintz | 2019020010 | 1 |
| 4 | Blake Comeau | Patrice Bergeron | 2019020010 | 1 |


## Faceoff data for a `select_player`

Take the flattened faceoff data and filter it down to the faceoffs that `select_player` took, whether he won or lost.

- [ ] [TODO] Figure out IPython controls and make this a data-driven control (all players who took faceoffs in the game)
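As a starting point for that control, the list of candidate players can be derived from the data itself. This is only a sketch: `sample_df` is a hypothetical stand-in built from the five rows shown in the output above, and wiring the list into an actual widget is left open.

```python
import pandas as pd

# Stand-in for faceoff_df, copied from the head(5) output above
sample_df = pd.DataFrame([
    {'winner': 'Patrice Bergeron', 'loser': 'Tyler Seguin', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Mattias Janmark', 'loser': 'Sean Kuraly', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Roope Hintz', 'loser': 'Par Lindholm', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Par Lindholm', 'loser': 'Roope Hintz', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Blake Comeau', 'loser': 'Patrice Bergeron', 'gameId': 2019020010, 'period': 1},
])

# Everyone who appears on either side of a faceoff, de-duplicated and sorted
players = sorted(set(sample_df['winner']) | set(sample_df['loser']))
print(players)
```

The resulting list could then feed whatever dropdown or selection control ends up being used.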

In [55]:


select_player = 'Patrice Bergeron'

player_faceoffs = faceoff_df.loc[lambda p: (p['winner'] == select_player) | (p['loser'] == select_player)]

player_faceoffs.head(5)

| | winner | loser | gameId | period |
|---|---|---|---|---|
| 0 | Patrice Bergeron | Tyler Seguin | 2019020010 | 1 |
| 4 | Blake Comeau | Patrice Bergeron | 2019020010 | 1 |
| 6 | Patrice Bergeron | Radek Faksa | 2019020010 | 1 |
| 9 | Radek Faksa | Patrice Bergeron | 2019020010 | 1 |
| 12 | Jamie Benn | Patrice Bergeron | 2019020010 | 1 |


## Faceoff Wins Count

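Group the selected player's faceoffs by winner, period, and game, then count the wins in each group. Opponents who beat `select_player` show up as winners here too.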

In [85]:


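# Count rows per (winner, period, gameId); after grouping, 'loser' is the only column left
faceoff_count = player_faceoffs.groupby(['winner','period','gameId']).count()
#faceoff_count.drop(columns=['loser'], inplace=True)

normalized_faceoffs = []

# Flatten the grouped index back into ordinary columns
for i, p in faceoff_count.iterrows():
    # i is the (winner, period, gameId) index tuple; p.values[0] is the count
    row = { 'Name': i[0], 'faceoff_win_count': p.values[0], 'gameId': i[2], 'period': i[1] }
    normalized_faceoffs.append(row)

normalized_faceoffs_df = pd.DataFrame(normalized_faceoffs)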

normalized_faceoffs_df.head(5)

| | Name | faceoff_win_count | gameId | period |
|---|---|---|---|---|
| 0 | Andrew Cogliano | 1 | 2019020010 | 3 |
| 1 | Blake Comeau | 1 | 2019020010 | 1 |
| 2 | Jamie Benn | 1 | 2019020010 | 1 |
| 3 | Jamie Benn | 1 | 2019020010 | 2 |
| 4 | Jamie Benn | 1 | 2019020010 | 3 |
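The same filtered data can also answer the root question more directly. A minimal sketch, assuming a frame shaped like `player_faceoffs`: keep only the rows where `select_player` is the loser and count the winners. `sample_faceoffs` is a hypothetical stand-in built from the five rows of `player_faceoffs` shown earlier, so the counts reflect only that slice of the game.

```python
import pandas as pd

# Stand-in for player_faceoffs, copied from its head(5) output
sample_faceoffs = pd.DataFrame([
    {'winner': 'Patrice Bergeron', 'loser': 'Tyler Seguin', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Blake Comeau', 'loser': 'Patrice Bergeron', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Patrice Bergeron', 'loser': 'Radek Faksa', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Radek Faksa', 'loser': 'Patrice Bergeron', 'gameId': 2019020010, 'period': 1},
    {'winner': 'Jamie Benn', 'loser': 'Patrice Bergeron', 'gameId': 2019020010, 'period': 1},
])

select_player = 'Patrice Bergeron'

# Faceoffs the selected player lost
losses = sample_faceoffs[sample_faceoffs['loser'] == select_player]

# Number of losses to each opponent, most frequent first
lost_to = losses['winner'].value_counts()
print(lost_to)
```

Run over a full season of appended game files, the top of `lost_to` would be the opponents Bergeron loses to most.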


In [90]:


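import os

outpath = "data/faceoff_normalized.csv"
# Append to the tracking file, writing the header row only when the file is new or empty
write_header = not os.path.exists(outpath) or os.path.getsize(outpath) == 0
normalized_faceoffs_df.to_csv(outpath, mode="a", header=write_header, index=False)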

## Faceoff Wins Percent

Calculate the percentage of the selected player's faceoffs that each winner won.

In [48]:
#faceoff_sum = faceoff_count.winner.sum()
#faceoff_perc = faceoff_count.winner.apply(lambda f: (f / faceoff_sum) * 100)
#faceoff_perc