# Imports
General imports only.  API imports are found later in the notebook for clarity.

In [1]:
import pandas as pd

# Resources

This project uses the NBA stats API.  The following links 
were critical to the construction and implementation of this
project:

1) https://pypi.org/project/nba-api/

    * "An API Client package to access the APIs for NBA.com"

2) https://github.com/swar/nba_api

    * The GitHub repository for the NBA API package above

3) https://github.com/seemethere/nba_py

    * Another helpful GitHub repository that shows how other people retrieved the data they wanted

4) https://www.espn.com/nba/stats

    * Here you can cross-check your own queries and numbers to make sure they agree with a professionally maintained stats source.
    
5) https://github.com/swar/nba_api/tree/master/docs/nba_api/stats/endpoints

    * List of NBA API endpoints and their documentation.

# API Access Download Test

In [2]:
'''

Everything we do in this project goes through nba_api.stats.endpoints.
The import "teamgamelogs" is one endpoint that can be used.  The NBA
API groups its data in several different ways across the endpoints.
This was the one I found most useful/appropriate for my project, but
the same work could have been done using other endpoints.  Choose your own
adventure.

A list of endpoints can be found here:

* https://github.com/swar/nba_api/tree/master/docs/nba_api/stats/endpoints


'''

from nba_api.stats.endpoints import teamgamelogs
team = teamgamelogs.TeamGameLogs(season_nullable='2001-02', game_segment_nullable='First Half')
a = team.get_data_frames()[0]

#NOTE: The module name after 'import' is LOWERCASE ONLY (teamgamelogs).  The class
# called after the '.' following 'team =' requires uppercase letters (TeamGameLogs).

#NOTE: The code for accessing more than one season at a time is located
# farther down this notebook.

#NOTE: We use "game_segment_nullable" = "First Half" because of the nature
# and focus of this study, but other segments can be targeted from this endpoint.

#NOTE: The "_nullable" suffix is part of the parameter names here.  This is seemingly not
# the case for other endpoints or query types(?) from the NBA API.  But it was a
# point of frustration for me so I wanted to make sure I pointed that out.

#NOTE: get_data_frames() returns a list of dataframes.  Change the index in [0]
# to see the other result sets, if any.  The endpoint documentation from
# link (5) above describes what each one contains.

In [3]:
#get_data_frames() already returns dataframes; this wraps the same data under the name df.

df = pd.DataFrame(a)

In [None]:
#This is an optional setting to make all columns visible in outputs.

pd.set_option('display.max_columns', None)

In [4]:
#Test and view output dataframe.

df

Unnamed: 0,SEASON_YEAR,TEAM_ID,TEAM_ABBREVIATION,TEAM_NAME,GAME_ID,GAME_DATE,MATCHUP,WL,MIN,FGM,...,REB_RANK,AST_RANK,TOV_RANK,STL_RANK,BLK_RANK,BLKA_RANK,PF_RANK,PFD_RANK,PTS_RANK,PLUS_MINUS_RANK
0,2001-02,1610612765,DET,Detroit Pistons,0020101181,2002-04-17T00:00:00,DET vs. MIL,W,24.0,25,...,172,168,1336,494,1673,707,903,80,50,50
1,2001-02,1610612749,MIL,Milwaukee Bucks,0020101181,2002-04-17T00:00:00,MIL @ DET,L,24.0,19,...,2077,1963,986,856,1101,218,1555,80,1580,2320
2,2001-02,1610612758,SAC,Sacramento Kings,0020101188,2002-04-17T00:00:00,SAC @ LAL,L,24.0,20,...,525,2144,1336,2146,1101,218,140,80,462,1318
3,2001-02,1610612752,NYK,New York Knicks,0020101179,2002-04-17T00:00:00,NYK vs. NJN,L,24.0,18,...,271,1753,1336,1771,1673,1,1555,1,1723,790
4,2001-02,1610612737,ATL,Atlanta Hawks,0020101176,2002-04-17T00:00:00,ATL @ BOS,L,24.0,13,...,172,1963,1336,41,1101,1743,19,80,2076,1943
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2373,2001-02,1610612762,UTA,Utah Jazz,0020100009,2001-10-30T00:00:00,UTA vs. MIL,L,24.0,21,...,1757,55,1915,113,637,1,903,80,462,894
2374,2001-02,1610612759,SAS,San Antonio Spurs,0020100008,2001-10-30T00:00:00,SAS vs. LAC,W,24.0,17,...,525,1166,1652,1287,1673,1,2275,80,1068,790
2375,2001-02,1610612754,IND,Indiana Pacers,0020100003,2001-10-30T00:00:00,IND @ NJN,L,24.0,23,...,525,648,1915,1287,1101,1743,903,80,554,501
2376,2001-02,1610612764,WAS,Washington Wizards,0020100002,2001-10-30T00:00:00,WAS @ NYK,L,24.0,19,...,1757,648,171,244,317,218,1839,80,1723,972


# Target Seasons for-loop

In [4]:
'''
Now we need a for-loop that builds the parameter "season_nullable" in the
right format ('YYYY-YY').  After this loop I also have a cell that is constructed
manually as an alternative method.

NOTE: Here, the target number of seasons is 20 (line 12).  Change this depending on the 
focus of your study.
'''

start_season = '2000-01'  #The loop increments before each request, so the first season downloaded is 2001-02.
seasons_all_list=[]
num_years = 20
for i in range(num_years):
    list1=start_season.split('-')
    first_part=int(list1[0])+1
    second_part=int(list1[0])+2
    second_part=str(second_part)[2:]
    season_years='-'.join([str(first_part), str(second_part)])
    start_season=season_years
    team = teamgamelogs.TeamGameLogs(season_nullable=season_years, game_segment_nullable='First Half')
    team = team.get_data_frames()[0]
    seasons_all_list.append(team)
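
The season strings built inside the loop above can also be generated up front, which makes it easy to check the range before any request is sent. This is only a minimal sketch: `season_labels` is a hypothetical helper name (not part of nba_api), and it takes the first season to download directly instead of incrementing from a starting value.

```python
def season_labels(first_start_year, num_seasons):
    """Return labels such as '2001-02' for consecutive seasons."""
    labels = []
    for start_year in range(first_start_year, first_start_year + num_seasons):
        # Keep only the last two digits of the following year, e.g. 2002 -> '02'.
        end_suffix = str(start_year + 1)[-2:]
        labels.append(f"{start_year}-{end_suffix}")
    return labels


# Same 20 seasons the loop above requests: 2001-02 through 2020-21.
labels = season_labels(2001, 20)
print(labels[0], labels[-1])
```

Each label could then be passed as `season_nullable` in the same way the loop does.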

In [5]:
#This combines (concatenates) the per-season dataframes into a single dataframe.

all_seasons_df = pd.concat(seasons_all_list, ignore_index=True)

In [6]:
#Check the dataframe.

all_seasons_df

Unnamed: 0,SEASON_YEAR,TEAM_ID,TEAM_ABBREVIATION,TEAM_NAME,GAME_ID,GAME_DATE,MATCHUP,WL,MIN,FGM,...,REB_RANK,AST_RANK,TOV_RANK,STL_RANK,BLK_RANK,BLKA_RANK,PF_RANK,PFD_RANK,PTS_RANK,PLUS_MINUS_RANK
0,2001-02,1610612746,LAC,Los Angeles Clippers,0020101189,2002-04-17T00:00:00,LAC @ GSW,L,24.0,17,...,711,1475,647,1287,317,2369,1220,80,1723,2275
1,2001-02,1610612761,TOR,Toronto Raptors,0020101178,2002-04-17T00:00:00,TOR vs. CLE,W,24.0,19,...,1930,648,171,244,637,1,1220,80,832,1318
2,2001-02,1610612739,CLE,Cleveland Cavaliers,0020101178,2002-04-17T00:00:00,CLE @ TOR,L,24.0,19,...,1126,1475,1336,1771,2162,1279,1220,80,628,972
3,2001-02,1610612758,SAC,Sacramento Kings,0020101188,2002-04-17T00:00:00,SAC @ LAL,L,24.0,20,...,525,2144,1336,2146,1101,218,140,80,462,1318
4,2001-02,1610612762,UTA,Utah Jazz,0020101184,2002-04-17T00:00:00,UTA vs. SAS,L,24.0,13,...,711,1753,171,856,73,2063,1555,80,1826,1486
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
46443,2020-21,1610612761,TOR,Toronto Raptors,0022000014,2020-12-23T00:00:00,TOR vs. NOP,L,24.0,21,...,524,159,739,156,190,440,701,129,347,199
46444,2020-21,1610612747,LAL,Los Angeles Lakers,0022000002,2020-12-22T00:00:00,LAL vs. LAC,L,24.0,17,...,256,589,702,402,719,1,651,3,449,432
46445,2020-21,1610612751,BKN,Brooklyn Nets,0022000001,2020-12-22T00:00:00,BKN vs. GSW,W,24.0,23,...,72,497,762,156,17,229,453,228,161,52
46446,2020-21,1610612744,GSW,Golden State Warriors,0022000001,2020-12-22T00:00:00,GSW @ BKN,L,24.0,17,...,320,589,563,266,340,748,453,228,698,723


## Export to CSV

In [None]:
#Change the filename to whatever you prefer.
#Change to 'index=True' to retain an index column in the export.

all_seasons_df.to_csv('all_seasons_df.csv', index=False)
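
One thing to watch when reading the export back in: the GAME_ID values start with zeros, and pandas will parse them as integers unless told otherwise. The sketch below uses a tiny stand-in dataframe (two GAME_ID values copied from the output above) and an in-memory buffer instead of a real file, so it runs without the API.

```python
import io

import pandas as pd

# Tiny stand-in for all_seasons_df; the real frame has many more columns.
sample = pd.DataFrame({
    "GAME_ID": ["0020101181", "0020101188"],
    "TEAM_ABBREVIATION": ["DET", "SAC"],
})

# Write to an in-memory buffer in place of a file on disk.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

# Default type inference reads GAME_ID as an integer and drops the leading zeros.
buffer.seek(0)
inferred = pd.read_csv(buffer)

# Forcing a string dtype keeps the IDs exactly as they were exported.
buffer.seek(0)
preserved = pd.read_csv(buffer, dtype={"GAME_ID": str})

print(inferred["GAME_ID"].tolist())
print(preserved["GAME_ID"].tolist())
```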

# Alternative 'season_nullable' Data Format Parameter Construction

In [None]:
#Define each season manually (2000-01 through 2019-20).

team00_01 = teamgamelogs.TeamGameLogs(season_nullable='2000-01', game_segment_nullable='First Half')
team00_01 = team00_01.get_data_frames()[0]

team01_02 = teamgamelogs.TeamGameLogs(season_nullable='2001-02', game_segment_nullable='First Half')
team01_02 = team01_02.get_data_frames()[0]

team02_03 = teamgamelogs.TeamGameLogs(season_nullable='2002-03', game_segment_nullable='First Half')
team02_03 = team02_03.get_data_frames()[0]

team03_04 = teamgamelogs.TeamGameLogs(season_nullable='2003-04', game_segment_nullable='First Half')
team03_04 = team03_04.get_data_frames()[0]

team04_05 = teamgamelogs.TeamGameLogs(season_nullable='2004-05', game_segment_nullable='First Half')
team04_05 = team04_05.get_data_frames()[0]

team05_06 = teamgamelogs.TeamGameLogs(season_nullable='2005-06', game_segment_nullable='First Half')
team05_06 = team05_06.get_data_frames()[0]

team06_07 = teamgamelogs.TeamGameLogs(season_nullable='2006-07', game_segment_nullable='First Half')
team06_07 = team06_07.get_data_frames()[0]

team07_08 = teamgamelogs.TeamGameLogs(season_nullable='2007-08', game_segment_nullable='First Half')
team07_08 = team07_08.get_data_frames()[0]

team08_09 = teamgamelogs.TeamGameLogs(season_nullable='2008-09', game_segment_nullable='First Half')
team08_09 = team08_09.get_data_frames()[0]

team09_10 = teamgamelogs.TeamGameLogs(season_nullable='2009-10', game_segment_nullable='First Half')
team09_10 = team09_10.get_data_frames()[0]

team10_11 = teamgamelogs.TeamGameLogs(season_nullable='2010-11', game_segment_nullable='First Half')
team10_11 = team10_11.get_data_frames()[0]

team11_12 = teamgamelogs.TeamGameLogs(season_nullable='2011-12', game_segment_nullable='First Half')
team11_12 = team11_12.get_data_frames()[0]

team12_13 = teamgamelogs.TeamGameLogs(season_nullable='2012-13', game_segment_nullable='First Half')
team12_13 = team12_13.get_data_frames()[0]

team13_14 = teamgamelogs.TeamGameLogs(season_nullable='2013-14', game_segment_nullable='First Half')
team13_14 = team13_14.get_data_frames()[0]

team14_15 = teamgamelogs.TeamGameLogs(season_nullable='2014-15', game_segment_nullable='First Half')
team14_15 = team14_15.get_data_frames()[0]

team15_16 = teamgamelogs.TeamGameLogs(season_nullable='2015-16', game_segment_nullable='First Half')
team15_16 = team15_16.get_data_frames()[0]

team16_17 = teamgamelogs.TeamGameLogs(season_nullable='2016-17', game_segment_nullable='First Half')
team16_17 = team16_17.get_data_frames()[0]

team17_18 = teamgamelogs.TeamGameLogs(season_nullable='2017-18', game_segment_nullable='First Half')
team17_18 = team17_18.get_data_frames()[0]

team18_19 = teamgamelogs.TeamGameLogs(season_nullable='2018-19', game_segment_nullable='First Half')
team18_19 = team18_19.get_data_frames()[0]

team19_20 = teamgamelogs.TeamGameLogs(season_nullable='2019-20', game_segment_nullable='First Half')
team19_20 = team19_20.get_data_frames()[0]

In [None]:
#Concatenate them into one dataframe.

seasons_00_20_df = pd.concat([team00_01, team01_02, team02_03, team03_04, team04_05,
                        team05_06, team06_07, team07_08, team08_09, team09_10,
                        team10_11, team11_12, team12_13, team13_14, team14_15,
                        team15_16, team16_17, team17_18, team18_19, team19_20],
                        ignore_index=True)

In [None]:
seasons_00_20_df
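
If the manual route is preferred but twenty separate variables feel unwieldy, the results can be kept in a dictionary keyed by season instead. A minimal sketch, assuming a hypothetical `collect_seasons` helper and a stand-in `fake_fetch` function so that it runs offline; in the notebook, the fetch function would wrap the `teamgamelogs.TeamGameLogs(...).get_data_frames()[0]` call used above.

```python
def collect_seasons(labels, fetch):
    """Call fetch(label) once per season label and key the results by label."""
    return {label: fetch(label) for label in labels}


def fake_fetch(label):
    # Hypothetical stand-in for the API call; returns a placeholder instead of a dataframe.
    return [f"rows for {label}"]


by_season = collect_seasons(["2000-01", "2001-02", "2002-03"], fake_fetch)
print(list(by_season))
```

With real dataframes as the values, `pd.concat(list(by_season.values()), ignore_index=True)` would combine them in the same way as the cell above.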