# Assignment 2

__Due date__: April 25, 2018 at 10 pm
__Submission__: IPython notebook to GauchoSpace

We have been building up the components for a dashboard app in a Jupyter notebook:

- Data downloading function: `get_nba_data()` in `03-Data-collection-and-manipulation.ipynb`.
- Accessing pandas data frames and creating dictionary structures in `04-Pandas-Data-Frame.ipynb` and `05-Data-Frame-and-Visualization.ipynb`.
- Creating widgets for interactivity: `05-Data-Frame-and-Visualization.ipynb`.
- Plotting visualizations with Matplotlib and Seaborn: `05-Data-Frame-and-Visualization.ipynb`.

We can put these components together to create an interactive dashboard similar to the R package: https://github.com/toddwschneider/ballr

Your assignment is to create an interactive dashboard.

It doesn't have to be exactly the same as the package or what I have proposed. If you would like to create another visualization, that would be great as well. I will refer to the __default option__ as continuing what we started in class: a shot chart dashboard similar to what the [BallR package](https://github.com/toddwschneider/ballr) does. I will refer to the __open-ended option__ as creating a dashboard of your choice.

Below, I specify some necessary components of your dashboard.

## Problem 1: Data Download

__Default option__: you can use the `get_nba_data()` function. No additional work is needed.

__Open-ended option__: you can choose to create a different dashboard. 

If you are familiar with http://stats.nba.com/, some stats pages will directly tell you how the data can be retrieved. For example, the data needed for the [Tracking Shots Dashboard](http://stats.nba.com/player/201935/shots-dash/) comes from [this URL](http://stats.nba.com/stats/playerdashptshots?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PerMode=PerGame&Period=0&PlayerID=201935&Season=2017-18&SeasonSegment=&SeasonType=Playoffs&TeamID=0&VsConference=&VsDivision=). This link can be found under the Tools > Developer tools menu (Control-Shift-I) if you are using [Google Chrome](https://www.google.com/chrome/). Once you open Developer tools, reload the page (press F5). Then go to the `Network` tab in the Developer tools pane and type `stats/` into the `Filter` text box. This will list the GET request URLs (if any) that we can use. If you are feeling adventurous, you can use other data to create your dashboard.

There are other interesting data sources: https://schoolofdata.org/2013/11/18/web-apis-for-non-programmers/ (note that some of them may be out of date, since this is from 5 years ago!). If you would like to pursue a completely different data source, you are encouraged to. Keep in mind that you want to create a dashboard that automatically updates its information. I can help you determine whether a site can be reverse engineered relatively easily if you choose to pursue this option.

Determine the set of parameters needed to create an appropriate `params` dictionary. Test whether your data download function works as intended.
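
One way to derive the `params` dictionary is to parse a request URL copied from Developer tools. The following is a minimal sketch using only the standard library; the helper name `split_stats_url` is hypothetical, and the shortened URL is only an illustration in the same shape as the one linked above.

```python
from urllib.parse import urlparse, parse_qsl, urlencode

def split_stats_url(url):
    """Split a stats.nba.com request URL into (endpoint, params)."""
    parsed = urlparse(url)
    # The endpoint is the last path component, e.g. 'playerdashptshots'
    endpt = parsed.path.rstrip('/').split('/')[-1]
    # keep_blank_values=True keeps empty parameters such as 'DateFrom='
    url_params = dict(parse_qsl(parsed.query, keep_blank_values=True))
    return endpt, url_params

# Shortened example URL (illustration only)
example_url = ("http://stats.nba.com/stats/playerdashptshots"
               "?DateFrom=&LeagueID=00&PlayerID=201935"
               "&Season=2017-18&SeasonType=Playoffs")

endpt, url_params = split_stats_url(example_url)
print(endpt)
print(url_params)

# Encoding the dictionary again should reproduce the original query string
print(urlencode(url_params))
```

The resulting pair could then be passed on as `get_nba_data(endpt, url_params)`, and `return_url=True` gives a quick way to compare the rebuilt URL with the one shown in the browser.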

In [None]:
import pandas as pd

def get_nba_data(endpt, params, return_url=False):

    ## endpt: https://github.com/seemethere/nba_py/wiki/stats.nba.com-Endpoint-Documentation
    ## params: dictionary of parameters: i.e., {'LeagueID':'00'}
    
    from pandas import DataFrame
    from urllib.parse import urlencode
    import json
    
    useragent = "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9\""
    dataurl = "\"" + "http://stats.nba.com/stats/" + endpt + "?" + urlencode(params) + "\""
    
    # for debugging: just return the url
    if return_url:
        return(dataurl)
    
    jsonstr = !wget -q -O - --user-agent={useragent} {dataurl}
    
    data = json.loads(jsonstr[0])
    
    h = data['resultSets'][0]['headers']
    d = data['resultSets'][0]['rowSet']
    
    return(DataFrame(d, columns=h))

In [None]:
params = {"MeasureType":"Base",
          "PerMode":"PerGame",
          "PlusMinus":"N",
          "PaceAdjust":"N",
          "Rank":"N",
          "LeagueID":"00",
          "Season":"2017-18",
          "SeasonType":"Playoffs",
          "PORound":"0",
          "Outcome":"",
          "Location":"",
          "Month":"0",
          "SeasonSegment":"",
          "DateFrom":"",
          "DateTo":"",
          "OpponentTeamID":"0",
          "VsConference":"",
          "VsDivision":"",
          "TeamID":"0",
          "Conference":"",
          "Division":"",
          "GameSegment":"",
          "Period":"0",
          "ShotClockRange":"",
          "LastNGames":"0",
          "GameScope":"",
          "PlayerExperience":"",
          "PlayerPosition":"",
          "StarterBench":""}

team_Playoffs = get_nba_data('leaguedashteamstats', params)
team_Playoffs.head()

__We downloaded the general scoreboard of all teams for the Playoffs season type.__
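
`get_nba_data()` reads the column names and the rows from `resultSets[0]` of the response. The sketch below shows that layout on a made-up payload (the table name, team names, and numbers are placeholders, not real statistics). Listing the result set names first can help, since an endpoint may return more than one table; whether it does is something to check for the endpoint you choose.

```python
import json

# Made-up payload in the shape get_nba_data() expects; values are placeholders
jsonstr = '''
{"resultSets": [
   {"name": "ExampleTable",
    "headers": ["TEAM_ID", "TEAM_NAME", "W_PCT"],
    "rowSet": [[1, "Team A", 0.5], [2, "Team B", 0.25]]}
]}
'''

data = json.loads(jsonstr)

# List the available result sets before picking one by position
table_names = [rs["name"] for rs in data["resultSets"]]

first = data["resultSets"][0]
# Pair each row with the headers to get one dictionary per row
records = [dict(zip(first["headers"], row)) for row in first["rowSet"]]

print(table_names)
print(records[0])
```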

## Problem 2: Creating interactive widgets

__Default option__: create at least one more widget (three total) for specifying data downloads. For example, you can change year, opposing team, etc.

__Open-ended option__: create at least one widget for specifying data downloads.

In this problem, you will simply create the widget(s). The next section shows how you can combine them.

In [None]:
## get all teams
params = {'LeagueID':'00'}
teams = get_nba_data('commonTeamYears', params)

## get all players
params = {'LeagueID':'00', 'Season': '2016-17', 'IsOnlyCurrentSeason': '0'}
players = get_nba_data('commonallplayers', params)

In [None]:
teams.head()

In [None]:
players.head()

In [None]:
team_names = players[['TEAM_ABBREVIATION', 'TEAM_CODE']].drop_duplicates()
teams_clean = teams.copy()
teams = pd.merge(teams_clean, team_names, left_on='ABBREVIATION', right_on='TEAM_ABBREVIATION')

In [None]:
teams.TEAM_CODE = teams.TEAM_CODE.str.capitalize() # returns values so needs to be reassigned
teams.sort_values('ABBREVIATION', inplace=True)    # modifies object
teams.tail()

In [None]:
team_dd_text = teams.ABBREVIATION+', '+teams.TEAM_CODE
team_dd = dict(zip(team_dd_text, teams.TEAM_ID))
team_dd

In [None]:
plyr_by_team_dd = dict()

for t, p in players.groupby('TEAM_ID'):
    
    plyr_by_team_dd[t] = dict(zip(p.DISPLAY_LAST_COMMA_FIRST, p.PERSON_ID))
    
plyr_by_team_dd

__We download the scoreboards for four different season types and combine them using `pd.concat()`.__
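
Since only `SeasonType` changes from one download to the next, the parameter dictionaries could be derived from one shared base instead of being retyped. A minimal sketch, with a shortened base dictionary for brevity (`BASE_PARAMS` and `params_for` are hypothetical names):

```python
# Shortened for illustration; a real request needs the full parameter set
BASE_PARAMS = {"MeasureType": "Base",
               "PerMode": "PerGame",
               "LeagueID": "00",
               "Season": "2017-18",
               "SeasonType": "Regular Season"}

def params_for(season_type, base=BASE_PARAMS):
    """Return a copy of the base parameters with SeasonType replaced."""
    return dict(base, SeasonType=season_type)

season_types = ["Pre Season", "Regular Season", "Playoffs", "All Star"]
all_params = {s: params_for(s) for s in season_types}

print(all_params["Playoffs"]["SeasonType"])
print(BASE_PARAMS["SeasonType"])   # the base dictionary is left unchanged
```

Copying the base dictionary for each request avoids accidentally carrying a changed value over into the next download.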

In [None]:
Season_Type = ["Pre Season","Regular Season","Playoffs","All Star"]
frames = [] # empty list that collects one data frame per season type

for i in Season_Type:
    params = {"MeasureType":"Base", "PerMode":"PerGame", "PlusMinus":"N", "PaceAdjust":"N",
              "Rank":"N", "LeagueID":"00", "Season":"2017-18", "SeasonType": i, "PORound":"0",
              "Outcome":"", "Location":"", "Month":"0", "SeasonSegment":"", "DateFrom":"", 
              "DateTo":"", "OpponentTeamID":"0", "VsConference":"", "VsDivision":"", "TeamID":"0",
              "Conference":"", "Division":"", "GameSegment":"", "Period":"0", "ShotClockRange":"",
              "LastNGames":"0", "GameScope":"", "PlayerExperience":"", "PlayerPosition":"",
              "StarterBench":""}

    team_data = get_nba_data('leaguedashteamstats', params)
    team_data["SEASON_TYPE"] = i
    frames.append(team_data)

season_dd = pd.concat(frames) # combine data of different season type
season_dd.head()

In [None]:
params = {"MeasureType":"Base", "PerMode":"PerGame", "PlusMinus":"N", "PaceAdjust":"N",
              "Rank":"N", "LeagueID":"00", "Season":"2017-18", "SeasonType": "All Star", "PORound":"0",
              "Outcome":"", "Location":"", "Month":"0", "SeasonSegment":"", "DateFrom":"", 
              "DateTo":"", "OpponentTeamID":"0", "VsConference":"", "VsDivision":"", "TeamID":"0",
              "Conference":"", "Division":"", "GameSegment":"", "Period":"0", "ShotClockRange":"",
              "LastNGames":"0", "GameScope":"", "PlayerExperience":"", "PlayerPosition":"",
              "StarterBench":""}

team_allstar = get_nba_data('leaguedashteamstats', params)
team_allstar # We find that there is no All-Star data

__However, we find that there is no All-Star data for this season. Next, we change the data type of `TEAM_ID` so that it can be combined with the other data later.__

In [None]:
print(season_dd.TEAM_ID.dtype) 

In [None]:
season_dd.TEAM_ID = season_dd.TEAM_ID.astype('int')
print(season_dd.TEAM_ID.dtype)

__We keep only the rows of `season_dd` whose `TEAM_ID` also appears in `teams`.__
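
The cast above matters because `isin()` compares values exactly, so identifiers stored as strings do not match integer identifiers. A toy illustration, assuming (hypothetically) that the IDs arrived as strings; the frames and values below are made up:

```python
import pandas as pd

# Made-up data: IDs stored as strings in one frame and as integers in the other
season_toy = pd.DataFrame({"TEAM_ID": ["1", "2", "3"],
                           "W_PCT": [0.5, 0.25, 0.75]})
teams_toy = pd.DataFrame({"TEAM_ID": [1, 2]})

# Number of matches before the cast: the string '1' is not equal to the integer 1
before = season_toy.TEAM_ID.isin(teams_toy.TEAM_ID).sum()

# After the cast, the filter keeps the rows whose ID appears in teams_toy
season_toy["TEAM_ID"] = season_toy.TEAM_ID.astype("int")
kept = season_toy[season_toy.TEAM_ID.isin(teams_toy.TEAM_ID)]

print(before, len(kept))
```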

In [None]:
season_dd = season_dd[season_dd.TEAM_ID.isin(teams.TEAM_ID)]
season_dd.tail()

__We want to create a widget that shows the win rate of each team for the different season types.__

In [None]:
W_PCT_by_season = dict()

for i, j in season_dd.groupby('TEAM_ID'):
    
    W_PCT_by_season[i] = dict(zip(j.SEASON_TYPE, j.W_PCT)) # Win rate by different season type

W_PCT_by_season

In [None]:
from IPython.display import display
from ipywidgets import interact, FloatSlider, Dropdown, Button

selected = 'LAC, Clippers'

team_menu = Dropdown(options=team_dd, label=selected)
plyr_menu = Dropdown(options=plyr_by_team_dd[team_dd[selected]])
W_PCT_menu = Dropdown(options=W_PCT_by_season[team_dd[selected]])

display(team_menu, plyr_menu, W_PCT_menu)

In [None]:
W_PCT_by_season[team_dd[selected]]

## Problem 3: Downloading data with changing widget states

__Both options__: Add event handlers (`observe`, `on_click`, etc.) that are called when a widget changes state. Make sure this works as expected.

In [None]:
selected = 'LAC, Clippers'

team_menu = Dropdown(options=team_dd, label=selected)
plyr_menu = Dropdown(options=plyr_by_team_dd[team_dd[selected]])
W_PCT_menu = Dropdown(options=W_PCT_by_season[team_dd[selected]])
fetch_button = Button(description='Get Data!', icon='check')

display(team_menu, plyr_menu, W_PCT_menu, fetch_button)

## update the player and win-rate menus when the team changes
def update_team(change):
    plyr_menu.index = None
    W_PCT_menu.index = None
    plyr_menu.options = plyr_by_team_dd[change['new']]
    W_PCT_menu.options = W_PCT_by_season[change['new']]

team_menu.observe(update_team, names='value')

## get data action (on_click passes the clicked button to the handler)
def get_data(button):
    print(team_menu.value, plyr_menu.value)
    print('WIN RATE IS:', W_PCT_menu.value)

fetch_button.on_click(get_data)

## Problem 4: Data transformation and visualization

__Default option__: create at least two data transformations using the split-apply-combine approach. Some ideas are:

- What is the shooting average against different teams? You would split based on opposing team, compute the average, and plot a bar chart. 

- What is the shooting average over different periods per game? You would split based on periods and game, then plot the changing shooting average over periods. Is this helpful? Why? Why not?

- Any other setting in which you would need to split-apply-combine to calculate a summary statistic.

- A setting of your choosing

Plot your result

__Open-ended option__: create at least one data transformation using the split-apply-combine approach.

Plot your result
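
The split-apply-combine pattern can be sketched on toy data as follows. The shots are made up, and `OPPONENT` is a hypothetical column name chosen for illustration; the real data may name or encode the opposing team differently.

```python
import pandas as pd

# Made-up shots; OPPONENT is a hypothetical column name
shots = pd.DataFrame({
    "OPPONENT": ["AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "SHOT_MADE_FLAG": [1, 0, 1, 1, 0, 1],
})

summary = (
    shots
    .groupby("OPPONENT")["SHOT_MADE_FLAG"]              # split by opponent
    .agg(attempts="count", made="sum", fg_pct="mean")   # apply per group
    .reset_index()                                      # combine into one table
)

print(summary)
```

Here `count` gives the number of attempts, `sum` the number of made shots, and `mean` the fraction made, so the three are easy to keep apart when labeling a plot. A call such as `summary.plot.bar(x="OPPONENT", y="fg_pct")` would then draw the bar chart (this needs Matplotlib).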

In [None]:
params = {'PlayerID':'202344',
          'PlayerPosition':'',
          'Season':'2016-17',
          'ContextMeasure':'FGA',
          'DateFrom':'',
          'DateTo':'',
          'GameID':'',
          'GameSegment':'',
          'LastNGames':'0',
          'LeagueID':'00',
          'Location':'',
          'Month':'0',
          'OpponentTeamID':'0',
          'Outcome':'',
          'Period':'0',
          'Position':'',
          'RookieYear':'',
          'SeasonSegment':'',
          'SeasonType':'Regular Season',
          'TeamID':'0',
          'VsConference':'',
          'VsDivision':''}

shotdata = get_nba_data('shotchartdetail', params)
shotdata.head()

In [None]:
list(shotdata)

__We split the data by `GAME_DATE`, calculate the fraction of shots made by applying `mean()` to `SHOT_MADE_FLAG`, and plot the transformed data.__

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(figsize=(12,11))

SHOT_MADE_PCT = shotdata.groupby("GAME_DATE")['SHOT_MADE_FLAG'].mean()

plt.title("Fraction of Shots Made by Date")
plt.xlabel("Date")
plt.ylabel("Fraction of shots made")
SHOT_MADE_PCT.plot.bar()
plt.show()

__According to this bar chart, Trevor Booker had his highest shooting rate on April 2, 2017.__

__The second transformation counts the shots made (`SHOT_MADE_FLAG`) by action type.__

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(figsize=(12,11))

# Summing SHOT_MADE_FLAG counts the shots made, not the shots attempted
NUM_OF_SHOTS_MADE = shotdata.groupby("ACTION_TYPE")['SHOT_MADE_FLAG'].sum()

NUM_OF_SHOTS_MADE.plot.bar()
plt.title("Number of Shots Made by Action Type")
plt.xlabel("Action Type")
plt.ylabel("Number of Shots Made")
plt.show()

__According to the bar chart, we conclude that Trevor Booker makes most of his shots with jump shots and layups.__

## Notes

- The open-ended option will be a lot more work; however, you can use it towards building up your final project if you so choose.

- Label figures, and explain your steps. PSTAT 234 students' work is expected to be more refined.

- Exceptional assignments will receive extra credit.