# Sample code for Question 1

This notebook shows how you can use the provided Python function to pull player stats from a website.
This function lives in a custom package that is provided to you in this repository.
You are encouraged to leverage this package as a skeleton and add all of your reusable code, functions, etc. into relevant modules.
This makes collaboration much easier as the package could be seen as a "single source of truth" to pull data, create visualizations, etc. rather than relying on a jumble of notebooks.
You can still run into trouble if branches are not frequently merged as work progresses, so try to not let your branches diverge too much.

In [19]:
from ift6758.data import get_player_stats
import sys

If the above doesn't work for you, make sure you've installed the repo as specified in the readme file.
Essentially, you must make sure that your environment is set up (either through conda or virtualenv), and then install the package using:

```bash
pip install -e /path/to/repo 
```

The nice thing about this approach is that, as long as your environment is activated, you can import the package's modules from anywhere on your system!
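A quick way to confirm that the editable install worked is to ask Python whether it can locate the package. The snippet below is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper name, not part of the provided package.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once `pip install -e /path/to/repo` has been run in the active environment.
print(is_importable("ift6758"))
```

If this prints `False`, the environment you installed into is probably not the one that is currently activated.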

In [7]:
df = get_player_stats(2016, 'goalies')

Retrieving data from 'https://www.hockey-reference.com/leagues/NHL_2016_goalies.html'...


If you're curious, this function uses the `pd.read_html()` function ([doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_html.html)), which by default tries to parse the HTML with `lxml` and falls back to [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) (with `html5lib`) if that fails.

In [8]:
df.head()

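| | Rk | Player | Age | Tm | GP | GS | W | L | T/O | GA | ... | MIN | QS | QS% | RBS | GA%- | GSAA | G | A | PTS | PIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Jake Allen | 25 | STL | 47 | 44 | 26 | 15 | 3 | 101 | ... | 2583 | 26 | 0.591 | 6 | 94.0 | 6.28 | 0 | 0 | 0 | 0 |
| 1 | 2 | Frederik Andersen | 26 | ANA | 43 | 37 | 22 | 9 | 7 | 88 | ... | 2298 | 24 | 0.649 | 5 | 95.0 | 4.46 | 0 | 1 | 1 | 2 |
| 2 | 3 | Craig Anderson | 34 | OTT | 60 | 60 | 31 | 23 | 5 | 161 | ... | 3477 | 31 | 0.517 | 8 | 99.0 | 2.05 | 0 | 2 | 2 | 0 |
| 3 | 4 | Richard Bachman | 28 | VAN | 1 | 1 | 1 | 0 | 0 | 3 | ... | 60 | 0 | 0.0 | 0 | | | 0 | 0 | 0 | 0 |
| 4 | 5 | Niklas Bäckström | 37 | CGY | 4 | 3 | 2 | 2 | 0 | 13 | ... | 233 | 2 | 0.667 | 1 | | | 0 | 0 | 0 | 0 |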


## Q1 - Attempt

In [2]:
sys.path.append('../ift6758/data/')
from get_data import get_games_data

In [None]:
data = get_games_data(2017, 2018, './json')

Game IDs

The first 4 digits identify the season of the game (e.g. 2017 for the 2017-2018 season). The next 2 digits give the type of game, where 01 = preseason, 02 = regular season, 03 = playoffs, 04 = all-star. The final 4 digits identify the specific game number. For regular season and preseason games, this ranges from 0001 to the number of games played (1271 for seasons with 31 teams, i.e. 2017 and onwards, and 1230 for seasons with 30 teams). For playoff games, the 2nd digit of the game number gives the round of the playoffs, the 3rd digit specifies the matchup, and the 4th digit specifies the game (out of 7).
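The ID layout described above can be decoded with a few string slices. This is a minimal sketch: `parse_game_id` and `GAME_TYPES` are hypothetical names introduced here for illustration and are not part of the provided package.

```python
# Two-digit game type codes, as listed in the description above.
GAME_TYPES = {
    "01": "preseason",
    "02": "regular season",
    "03": "playoffs",
    "04": "all-star",
}


def parse_game_id(game_id: str) -> dict:
    """Split a 10-digit game ID into season, game type, and game number."""
    if len(game_id) != 10 or not game_id.isdigit():
        raise ValueError(f"expected a 10-digit game ID, got {game_id!r}")

    type_code = game_id[4:6]
    number = game_id[6:]
    info = {
        "season": int(game_id[:4]),  # 2017 means the 2017-2018 season
        "game_type": GAME_TYPES.get(type_code, "unknown"),
        "game_number": number,
    }
    if type_code == "03":
        # Playoff numbers encode round, matchup, and game in digits 2 to 4.
        info["round"] = int(number[1])
        info["matchup"] = int(number[2])
        info["game"] = int(number[3])
    return info


print(parse_game_id("2017030146"))
```

For `'2017030146'`, this reads as the 2017-2018 playoffs, round 1, matchup 4, game 6.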

In [51]:
a = get_game_data('2017030146')

Retrieving data from 'http://statsapi.web.nhl.com/api/v1/game/2017030146/feed/live'...


In [89]:
a;

In [53]:
a.keys()

dict_keys(['copyright', 'gamePk', 'link', 'metaData', 'gameData', 'liveData'])

In [61]:
a['gameData'].keys()

dict_keys(['game', 'datetime', 'status', 'teams', 'players', 'venue'])

In [66]:
a['liveData'].keys()

dict_keys(['plays', 'linescore', 'boxscore', 'decisions'])

In [67]:
a['liveData']['plays'].keys()

dict_keys(['allPlays', 'scoringPlays', 'penaltyPlays', 'playsByPeriod', 'currentPlay'])

In [75]:
b = a['liveData']['plays']['allPlays']

In [76]:
type(b)

list

In [77]:
len(b)

321

In [90]:
b;

In [88]:
b[5]['result']['event']

'Blocked Shot'

In [130]:
b[5]['coordinates']

{'x': -46.0, 'y': 18.0}

In [129]:
b[5]['coordinates']['x']

-46.0
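Rather than indexing one play at a time, the event type and coordinates of every play can be collected in a single loop over `allPlays`. The sketch below assumes each play has the `result` and `coordinates` keys seen above. `extract_events` and `sample_plays` are hypothetical names, and the sample is a hand-written stand-in for `b` that only mirrors the keys shown in this notebook. Some plays appear to carry an empty `coordinates` dict, so the sketch falls back to `None` for those.

```python
def extract_events(plays):
    """Return one flat record (event name, x, y) per play."""
    rows = []
    for play in plays:
        # Not every play has a location; default to an empty dict.
        coords = play.get("coordinates", {})
        rows.append(
            {
                "event": play["result"]["event"],
                "x": coords.get("x"),
                "y": coords.get("y"),
            }
        )
    return rows


# Hand-written stand-in for `b`; only the keys used above are included.
sample_plays = [
    {"result": {"event": "Game Scheduled"}, "coordinates": {}},
    {"result": {"event": "Blocked Shot"}, "coordinates": {"x": -46.0, "y": 18.0}},
]

rows = extract_events(sample_plays)
for row in rows:
    print(row)
```

A list of flat dicts like this can be passed directly to `pd.DataFrame(rows)` to get one row per event.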

In [92]:
c = b[0]

In [93]:
type(c)

dict

In [101]:
c.keys()

dict_keys(['result', 'about', 'coordinates'])

In [104]:
import pandas as pd  # needed for pd.DataFrame below
df2 = pd.DataFrame.from_dict(c, orient="index")

In [105]:
df2

| | event | eventCode | eventTypeId | description | eventIdx | eventId | period | periodType | ordinalNum | periodTime | periodTimeRemaining | dateTime | goals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| result | Game Scheduled | PHI1 | GAME_SCHEDULED | Game Scheduled | | | | | | | | | |
| about | | | | | 0.0 | 1.0 | 1.0 | REGULAR | 1st | 00:00 | 20:00 | 2018-04-22T18:17:18Z | {'away': 0, 'home': 0} |
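With `orient="index"`, each top-level key of the play becomes its own row, which leaves most cells empty. One alternative is to flatten the nested dict into a single record with dotted keys, so each play becomes one row. The sketch below uses only the standard library; `flatten` is a hypothetical helper, and `play` is an abbreviated copy of the values shown in the table above.

```python
def flatten(nested, parent_key="", sep="."):
    """Recursively flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-dicts; empty dicts contribute no keys.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Abbreviated copy of the first play, using values from the table above.
play = {
    "result": {"event": "Game Scheduled", "eventTypeId": "GAME_SCHEDULED"},
    "about": {"period": 1, "periodTime": "00:00", "goals": {"away": 0, "home": 0}},
    "coordinates": {},
}

record = flatten(play)
print(record)
```

pandas also provides `pd.json_normalize`, which performs a similar flattening on a list of nested dicts and returns a DataFrame directly.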


In [118]:
import requests

teams_url = 'http://statsapi.web.nhl.com/api/v1/game/2017030178/feed/live'

team_response = requests.get(teams_url)

In [120]:
ej = team_response.status_code

In [121]:
type(ej)

int
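Since the status code is a plain `int`, it can be compared against 200 before trying to parse the response. The sketch below wraps the URL pattern used above in a small helper and returns `None` when the request fails. It is only an illustration: `build_feed_url` and `fetch_game` are hypothetical names, and the standard library's `urllib` is swapped in for `requests` so that the block has no third-party dependencies.

```python
import json
import urllib.request

# URL pattern taken from the cells above; the game ID is filled in per request.
BASE_URL = "http://statsapi.web.nhl.com/api/v1/game/{game_id}/feed/live"


def build_feed_url(game_id: str) -> str:
    """Return the live-feed URL for a given game ID."""
    return BASE_URL.format(game_id=game_id)


def fetch_game(game_id: str, timeout: float = 10.0):
    """Return the parsed JSON for a game, or None if the request fails."""
    url = build_feed_url(game_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return None
            return json.load(response)
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return None


print(build_feed_url("2017030178"))
```

Calling `fetch_game('2017030178')` would then either return a dict like `a` above or `None`, which makes it easy to skip game IDs that do not exist when looping over a season.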