Collecting Info on a Specific Team's Games Over a Specific Period
===

This vignette shows how to collect game IDs for a specific team over a specific period and aggregate the corresponding box score information.

Specifically, it walks you through collecting information on all Phoenix Suns games that took place during December and January of the 2016-17 season.

In [68]:
import numpy as np
import goldsberry
import pandas as pd
from datetime import datetime
pd.set_option("display.max_columns",50) # Change Pandas Display Options
goldsberry.__version__

'1.0.1'

In [69]:
gameids = goldsberry.GameIDs()
gameids2015 = pd.DataFrame(gameids.game_list())
gameids2015.head(n=5)[["TEAM_NAME","MATCHUP","PTS","PLUS_MINUS"]]
# gameids2015.columns.values

# list(gameids2015)

Unnamed: 0,TEAM_NAME,MATCHUP,PTS,PLUS_MINUS
0,Minnesota Timberwolves,MIN vs. NOP,144,35
1,Orlando Magic,ORL vs. BKN,139,34
2,Detroit Pistons,DET @ CHI,147,3
3,Golden State Warriors,GSW @ ORL,130,16
4,Sacramento Kings,SAC vs. PHX,142,23


Like the PlayerList() class, the GameIDs() class defaults to the current season. If we want the 2016-17 season, we need to identify and change the proper parameters. The available parameters are listed in the `api_params` attribute.

In [70]:
gameids.api_params

{'Direction': 'DESC',
 'LeagueID': '00',
 'PlayerOrTeam': 'T',
 'Season': '2015-16',
 'SeasonType': 'Regular Season',
 'Sorter': 'FGM'}

From there, we can see that we should set the `Season` value to `2016-17`. Once the parameter is set, we fetch the new data and save it as a data frame in a new object.

In [71]:
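gameids.get_new_data(Season='2016-17', PlayerOrTeam="T")
# Set after the fetch above, so this probably only affects the next
# get_new_data() call; note that the dates printed below are not in date order.
gameids._set_api_parameters(Sorter="GAME_DATE")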
gameids2017 = pd.DataFrame(gameids.game_list())
gameids2017.head()["GAME_DATE"]

0    2017-01-28
1    2017-03-04
2    2017-02-10
3    2017-01-08
4    2016-12-10
Name: GAME_DATE, dtype: object
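
The dates above come back as strings and are not in chronological order. A minimal sketch of sorting them locally with pandas, using only the five values printed above (the names `dates` and `by_date` are illustrative and not part of goldsberry):

```python
import pandas as pd

# GAME_DATE values copied from the output above
dates = pd.DataFrame({"GAME_DATE": ["2017-01-28", "2017-03-04", "2017-02-10",
                                     "2017-01-08", "2016-12-10"]})

# Parse the strings so that sorting and comparisons work on real dates
dates["GAME_DATE"] = pd.to_datetime(dates["GAME_DATE"])

# Sort from earliest to latest and renumber the rows
by_date = dates.sort_values("GAME_DATE").reset_index(drop=True)
print(by_date)
```

Sorting locally avoids depending on how the API interprets the `Sorter` parameter.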

A quick filter for team names that contain 'Suns' returns all of the Suns' games for the 2016-17 season.

In [72]:
suns_logs = gameids2017.loc[gameids2017['TEAM_NAME'].str.contains('Suns')]
suns_logs.head()

Unnamed: 0,AST,BLK,DREB,FG3A,FG3M,FG3_PCT,FGA,FGM,FG_PCT,FTA,FTM,FT_PCT,GAME_DATE,GAME_ID,MATCHUP,MIN,OREB,PF,PLUS_MINUS,PTS,REB,SEASON_ID,STL,TEAM_ABBREVIATION,TEAM_ID,TEAM_NAME,TOV,VIDEO_AVAILABLE,WL
54,24,2,30,23,7,0.304,92,49,0.533,24,13,0.542,2017-03-30,21601124,PHX vs. LAC,240,13,27,-6,118,43,22016,6,PHX,1610612756,Phoenix Suns,14,1,L
65,27,6,35,24,8,0.333,93,49,0.527,25,15,0.6,2017-02-24,21600859,PHX @ CHI,265,8,23,-7,121,43,22016,8,PHX,1610612756,Phoenix Suns,19,1,L
67,32,10,47,24,10,0.417,88,49,0.557,34,29,0.853,2017-02-15,21600844,PHX vs. LAL,240,12,20,36,137,59,22016,7,PHX,1610612756,Phoenix Suns,15,1,W
90,25,8,30,19,8,0.421,95,48,0.505,13,11,0.846,2017-02-10,21600807,PHX vs. CHI,240,13,18,18,115,43,22016,4,PHX,1610612756,Phoenix Suns,9,1,W
98,21,2,20,24,10,0.417,89,48,0.539,17,14,0.824,2017-01-26,21600692,PHX @ DEN,240,8,26,-7,120,28,22016,8,PHX,1610612756,Phoenix Suns,11,1,L


We can verify that all of the games are there by checking the shape of the data frame. An NBA regular season has 82 games per team, so we expect 82 rows.

In [37]:
suns_logs.shape

(82, 29)
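
To narrow the Suns log to December and January, one option is to filter on the month of `GAME_DATE`. The sketch below is one possible approach rather than anything provided by goldsberry. It rebuilds a small frame from the five Suns rows printed above (a subset of columns) so that it runs without a network call, and the names `suns_sample`, `in_window` and `dec_jan` are illustrative.

```python
import pandas as pd

# Five Suns rows copied from the output above (subset of columns)
suns_sample = pd.DataFrame({
    "GAME_DATE": ["2017-03-30", "2017-02-24", "2017-02-15",
                  "2017-02-10", "2017-01-26"],
    "MATCHUP": ["PHX vs. LAC", "PHX @ CHI", "PHX vs. LAL",
                "PHX vs. CHI", "PHX @ DEN"],
    "PTS": [118, 121, 137, 115, 120],
})

# Parse the date strings, then keep rows whose month is December or January
game_dates = pd.to_datetime(suns_sample["GAME_DATE"])
in_window = game_dates.dt.month.isin([12, 1])
dec_jan = suns_sample[in_window]
print(dec_jan)
```

Filtering on the month alone is enough here because the frame holds a single season, so December and January each occur only once. With several seasons in one frame, an explicit start and end date would be safer.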

In [38]:
gameids2017.columns.values

array([u'AST', u'BLK', u'DREB', u'FG3A', u'FG3M', u'FG3_PCT', u'FGA',
       u'FGM', u'FG_PCT', u'FTA', u'FTM', u'FT_PCT', u'GAME_DATE',
       u'GAME_ID', u'MATCHUP', u'MIN', u'OREB', u'PF', u'PLUS_MINUS',
       u'PTS', u'REB', u'SEASON_ID', u'STL', u'TEAM_ABBREVIATION',
       u'TEAM_ID', u'TEAM_NAME', u'TOV', u'VIDEO_AVAILABLE', u'WL'], dtype=object)

In [202]:
# game_team_2017 = gameids2017.groupby("TEAM_NAME")
# team_2017 = game_team_2017.apply(lambda x: x.sort_values('GAME_DATE'))

In [203]:
# num_team = 30; num_games = 82
# matrix = []
# names = []
# for i in range(0,num_team):
#     matrix.append(team_2017.iloc[0+num_games*i:num_games*(i+1)].to_dict('records'))
#     names.append(team_2017.iloc[num_games*i]["TEAM_NAME"])

In [204]:
# new = pd.DataFrame(matrix,index = names,columns = range(1,num_games+1))

In [205]:
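def To_matrix(game_df):
    """Reshape a season game log into a team-by-game-number table.

    Each cell holds one game's box score as a dict. Assumes 30 teams
    that each played a full 82-game regular season.
    """
    num_team = 30
    num_games = 82
    # Order rows by team, then chronologically within each team
    game_sort = game_df.sort_values(["TEAM_NAME", "GAME_DATE"])
    matrix = []
    names = []
    for i in range(num_team):
        # Rows for team i occupy one contiguous block of num_games rows
        team_games = game_sort.iloc[num_games*i:num_games*(i+1)]
        matrix.append(team_games.to_dict('records'))
        names.append(team_games.iloc[0]["TEAM_NAME"])
    new_game_df = pd.DataFrame(matrix, index=names, columns=range(1, num_games+1))
    return new_game_df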

In [206]:
game_2014 = goldsberry.GameIDs(Season = "2014-15")
new_game_2014 = To_matrix(pd.DataFrame(game_2014.game_list()))

In [207]:
new_game_2014.head().loc["Atlanta Hawks",1]

{u'AST': 26,
 u'BLK': 8,
 u'DREB': 32,
 u'FG3A': 22,
 u'FG3M': 13,
 u'FG3_PCT': 0.591,
 u'FGA': 80,
 u'FGM': 40,
 u'FG_PCT': 0.5,
 u'FTA': 17,
 u'FTM': 9,
 u'FT_PCT': 0.529,
 u'GAME_DATE': u'2014-10-29',
 u'GAME_ID': u'0021400008',
 u'MATCHUP': u'ATL @ TOR',
 u'MIN': 240,
 u'OREB': 10,
 u'PF': 24,
 u'PLUS_MINUS': -7,
 u'PTS': 102,
 u'REB': 42,
 u'SEASON_ID': u'22014',
 u'STL': 6,
 u'TEAM_ABBREVIATION': u'ATL',
 u'TEAM_ID': 1610612737,
 u'TEAM_NAME': u'Atlanta Hawks',
 u'TOV': 19,
 u'VIDEO_AVAILABLE': 1,
 u'WL': u'L'}
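
`To_matrix` relies on every team having exactly 82 games. A sketch of an alternative that derives the game number from the data itself is shown below, using `groupby(...).cumcount()` and `pivot` on a single statistic. The toy frame and its values are hypothetical and exist only to make the example runnable; this is not how goldsberry or the function above does it.

```python
import pandas as pd

# Hypothetical toy game log: two teams with different numbers of games
toy_logs = pd.DataFrame({
    "TEAM_NAME": ["Team A", "Team B", "Team A", "Team B", "Team A"],
    "GAME_DATE": ["2014-11-01", "2014-10-30", "2014-10-29",
                  "2014-11-02", "2014-11-05"],
    "PTS": [101, 95, 102, 110, 99],
})

# Sort chronologically within each team, then number each team's games 1, 2, 3, ...
ordered = toy_logs.sort_values(["TEAM_NAME", "GAME_DATE"]).copy()
ordered["GAME_NUM"] = ordered.groupby("TEAM_NAME").cumcount() + 1

# One row per team, one column per game number; missing games become NaN
pts_matrix = ordered.pivot(index="TEAM_NAME", columns="GAME_NUM", values="PTS")
print(pts_matrix)
```

Unlike the dict-per-cell layout, this keeps one numeric value per cell, which makes column-wise aggregation (for example, the mean score in each team's n-th game) straightforward.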