# Sample code for Question 1

This notebook shows how to use the provided Python function to pull player stats from a website.
This function lives in a custom package that is provided to you in this repository.
You are encouraged to leverage this package as a skeleton and add all of your reusable code, functions, etc. into relevant modules.
This makes collaboration much easier, because the package can serve as a "single source of truth" for pulling data, creating visualizations, etc., rather than relying on a jumble of notebooks.
You can still run into trouble if branches are not merged frequently as work progresses, so try not to let your branches diverge too much.

In [1]:
from ift6758.data import get_player_stats

If the above doesn't work for you, make sure you've installed the repo as specified in the readme file.
Essentially, you must make sure that your environment is set up (through either conda or virtualenv), and then install the package using:

```bash
pip install -e /path/to/repo 
```

The nice thing about this approach is that, as long as your environment is activated, you can import the package's modules from anywhere on your system!
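If you are unsure whether the editable install worked, a quick check like the one below can help. This is only a minimal sketch: `is_installed` is a hypothetical helper (not part of the provided package), and it relies on the standard-library `importlib.util.find_spec` to ask the import system whether a top-level package is visible from the current environment.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate a top-level package.

    Hypothetical helper for illustration; it is not part of ift6758.
    """
    return importlib.util.find_spec(package_name) is not None


# "ift6758" is the package name used in the import cell above.
if not is_installed("ift6758"):
    print(
        "ift6758 not found: activate your environment and run "
        "`pip install -e /path/to/repo`"
    )
```

Running this from any directory should print nothing once the package is installed in the active environment.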

In [2]:
df = get_player_stats(2016, 'goalies')

Retrieving data from 'https://www.hockey-reference.com/leagues/NHL_2016_goalies.html'...


If you're curious, this function uses the `pd.read_html()` function ([doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html)), which parses the HTML tables on a page into DataFrames. Depending on which parsers are installed, it relies on lxml or on [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) (with html5lib) under the hood.
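To make this more concrete, here is a rough sketch of what such a function could look like. It is not the actual implementation of `get_player_stats`: the helper names `build_stats_url` and `fetch_stats_table` are hypothetical, the URL pattern is inferred from the message printed above, and taking the first table on the page is an assumption.

```python
def build_stats_url(year: int, player_type: str) -> str:
    """Build the stats page URL (pattern inferred from the printed message)."""
    return f"https://www.hockey-reference.com/leagues/NHL_{year}_{player_type}.html"


def fetch_stats_table(year: int, player_type: str):
    """Download the stats page and return its first HTML table as a DataFrame.

    Hypothetical sketch; the real get_player_stats may clean the data further.
    """
    # Imported here so the URL helper above can be used without pandas installed.
    import pandas as pd

    url = build_stats_url(year, player_type)
    print(f"Retrieving data from '{url}'...")
    tables = pd.read_html(url)  # returns a list with one DataFrame per <table>
    return tables[0]  # assumption: the first table holds the player stats
```

Calling `fetch_stats_table(2016, 'goalies')` requires network access and an HTML parser such as lxml; the site may also reject or rate-limit automated requests.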

In [3]:
df.head()

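| | Rk | Player | Age | Tm | GP | GS | W | L | T/O | GA | ... | MIN | QS | QS% | RBS | GA%- | GSAA | G | A | PTS | PIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Jake Allen | 25 | STL | 47 | 44 | 26 | 15 | 3 | 101 | ... | 2583 | 26 | 0.591 | 6 | 94.0 | 6.28 | 0 | 0 | 0 | 0 |
| 1 | 2 | Frederik Andersen | 26 | ANA | 43 | 37 | 22 | 9 | 7 | 88 | ... | 2298 | 24 | 0.649 | 5 | 95.0 | 4.46 | 0 | 1 | 1 | 2 |
| 2 | 3 | Craig Anderson | 34 | OTT | 60 | 60 | 31 | 23 | 5 | 161 | ... | 3477 | 31 | 0.517 | 8 | 99.0 | 2.05 | 0 | 2 | 2 | 0 |
| 3 | 4 | Richard Bachman | 28 | VAN | 1 | 1 | 1 | 0 | 0 | 3 | ... | 60 | 0 | 0.0 | 0 | | | 0 | 0 | 0 | 0 |
| 4 | 5 | Niklas Bäckström | 37 | CGY | 4 | 3 | 2 | 2 | 0 | 13 | ... | 233 | 2 | 0.667 | 1 | | | 0 | 0 | 0 | 0 |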


In [1]:
from ift6758.data.acquisition import Season

season_list = {}
for year in range(2016, 2021):
    new_season = Season(year=year, data_path=f'./cache/pickleFiles/{year}.pkl')
    season_list[year] = new_season

Loading cached data in cache\pickleFiles\2016.pkl
1332 cached games found for year 2016
Loading cached data in cache\pickleFiles\2017.pkl
1376 cached games found for year 2017
Loading cached data in cache\pickleFiles\2018.pkl
1376 cached games found for year 2018
Loading cached data in cache\pickleFiles\2019.pkl
1428 cached games found for year 2019
Loading cached data in cache\pickleFiles\2020.pkl
973 cached games found for year 2020
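The log above shows each season being read from a pickle file instead of being downloaded again. The internals of `Season` are not shown in this notebook, so the following is only a sketch of a common load-or-build caching pattern under that assumption; `load_or_build` and its `build` callback are hypothetical names, not part of the provided package.

```python
import pickle
from pathlib import Path
from typing import Callable


def load_or_build(cache_path: str, build: Callable[[], list]) -> list:
    """Return cached data if the pickle exists, else build it and cache it.

    Hypothetical helper: `build` stands in for whatever downloads the games.
    """
    path = Path(cache_path)
    if path.exists():
        print(f"Loading cached data in {path}")
        with path.open("rb") as f:
            return pickle.load(f)

    data = build()
    # Create the cache folder on first use, then persist the result.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(data, f)
    return data
```

With this pattern the expensive download runs once per season, and later runs of the notebook only pay the cost of reading the file. Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code.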


In [53]:
from ift6758.data.acquisition import SeasonType
import ipywidgets as widgets
import json

def get_games_for_season_and_seasonType(season_type: SeasonType, season:int):
    return [item for item in season_list[season].data if season_type.value in str(item['gamePk'])[4:6]]

year_slider = widgets.IntSlider(
    min=2016, 
    max=2020, 
    description='Season :')

seasonType_selector = widgets.Dropdown(
    options=[item.name.capitalize() for item in SeasonType],
    value=SeasonType.REGULAR.name.capitalize(),
    description='Season Type :',
    disabled=False,
)

game_id_slider = widgets.IntSlider(
    min=0, 
    max=1350, 
    description='Game ID :')

game_summary_desc = widgets.HTML(
    value="Game date",
)

event_id_slider = widgets.IntSlider(
    min=0, 
    max=1350, 
    description='Event ID :')

event_desc = widgets.HTML(
    value = "Event data"
)

display(
    seasonType_selector, 
    year_slider, 
    game_id_slider,
    game_summary_desc,
    event_id_slider,
    event_desc)

all_games = []
selectedGame = None

def on_value_change(year, season_type):
    global all_games 
    global selectedGame
    all_games = get_games_for_season_and_seasonType(SeasonType[season_type.upper()], year)
    game_id_slider.max = (len(all_games) - 1)
    game_id_slider.value = 0
    update_game_summary(0)
    events = selectedGame['liveData']['plays']['allPlays'] 
    event_id_slider.max = (len(events) - 1)
    event_id_slider.value = 0
    update_event_info(0)

def on_year_change(change):
    on_value_change(year=change['new'], season_type=seasonType_selector.value)

def on_seasonType_change(change):
    on_value_change(year=year_slider.value, season_type=change['new'])

def update_game_summary(game_id):
    global all_games
    global selectedGame
    selectedGame = all_games[game_id]
    dateTime = selectedGame['gameData']['datetime']
    game_summary_desc.value = (f'{dateTime["dateTime"]}<br>' 
        f'Game ID : {game_id} &nbsp; {selectedGame["gameData"]["teams"]["home"]["abbreviation"]} (home) vs {selectedGame["gameData"]["teams"]["away"]["abbreviation"]} (away) <br>'
        'Summary :'
        '<table>'
        '   <tr>'
        '       <th></th>'
        '       <th>Home</th>'
        '       <th>Away</th>'
        '   </tr>'
        '   <tr>'
        '       <td>Teams</td>'
        f'       <td>{selectedGame["gameData"]["teams"]["home"]["abbreviation"]}</td>'
        f'       <td>{selectedGame["gameData"]["teams"]["away"]["abbreviation"]}</td>'
        '   </tr>'
        '   <tr>'
        '       <td>Goals</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["teams"]["home"]["goals"]}</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["teams"]["away"]["goals"]}</td>'
        '   </tr>'
        '   <tr>'
        '       <td>SoG</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["teams"]["home"]["shotsOnGoal"]}</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["teams"]["away"]["shotsOnGoal"]}</td>'
        '   </tr>'
        '   <tr>'
        '       <td>SO Goals</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["shootoutInfo"]["home"]["scores"]}</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["shootoutInfo"]["away"]["scores"]}</td>'
        '   </tr>'
        '   <tr>'
        '       <td>SO Attempts</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["shootoutInfo"]["home"]["attempts"]}</td>'
        f'       <td>{selectedGame["liveData"]["linescore"]["shootoutInfo"]["away"]["attempts"]}</td>'
        '   </tr>'
        '</table>'
        )

def update_event_info(event_id):
    global selectedGame
    event = selectedGame['liveData']['plays']['allPlays'][event_id]
    # json.dumps already emits the enclosing braces, so none are added here.
    event_desc.value = '<pre id="json">' + json.dumps(event, indent=2, sort_keys=True) + '</pre>'

def on_event_id_change(change):
    update_event_info(change['new'])

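def on_game_id_change(change):
    update_game_summary(change['new'])
    # Each game has its own number of events, so resize and reset the event slider.
    events = selectedGame['liveData']['plays']['allPlays']
    event_id_slider.max = (len(events) - 1)
    event_id_slider.value = 0
    update_event_info(0)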

year_slider.observe(on_year_change, names='value')
seasonType_selector.observe(on_seasonType_change, names='value')
game_id_slider.observe(on_game_id_change, names='value')
event_id_slider.observe(on_event_id_change, names='value')

on_value_change(year_slider.value, seasonType_selector.value)

Dropdown(description='Season Type :', options=('Regular', 'Playoff'), value='Regular')

IntSlider(value=2016, description='Season :', max=2020, min=2016)

IntSlider(value=0, description='Game ID :', max=1350)

HTML(value='Game date')

IntSlider(value=0, description='Event ID :', max=1350)

HTML(value='Event data')
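The filter in `get_games_for_season_and_seasonType` looks at characters 4 to 6 of the `gamePk` string. The sketch below spells out the ID layout this appears to rely on. It assumes the usual 10-digit NHL game ID format (4-digit season start year, 2-digit game type, 4-digit game number), with `"02"` for regular season and `"03"` for playoffs. The names `GameKey` and `parse_game_pk` are hypothetical, and the actual values stored in `SeasonType` are not shown in this notebook.

```python
from typing import NamedTuple


class GameKey(NamedTuple):
    season: int       # season start year, e.g. 2016
    game_type: str    # assumed: "02" regular season, "03" playoffs
    game_number: int  # game number within the season and type


def parse_game_pk(game_pk: int) -> GameKey:
    """Split a 10-digit game ID into its parts (assumed layout, see above)."""
    text = str(game_pk)
    if len(text) != 10 or not text.isdigit():
        raise ValueError(f"Expected a 10-digit game ID, got {game_pk!r}")
    return GameKey(
        season=int(text[:4]),
        game_type=text[4:6],  # same slice as in the filter above
        game_number=int(text[6:]),
    )
```

For example, `parse_game_pk(2016020001)` would yield season `2016`, game type `"02"`, and game number `1`. Comparing the game type with `==` rather than `in` avoids accidental substring matches if a season type value is ever a single character.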