Added functionality to scrape NWHL data
HarryShomer committed Jan 7, 2019
1 parent 2f81359 commit 8c74c95
Showing 41 changed files with 1,678 additions and 308 deletions.
Binary file modified .DS_Store
Binary file not shown.
6 changes: 6 additions & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
@@ -15,3 +15,9 @@ v1.2.7

* Added functionality to more easily scrape live games
* Fixed user warnings


v1.3
----

* Added functionality to scrape NWHL data
3 changes: 1 addition & 2 deletions LICENSE.txt
@@ -1,9 +1,8 @@
License
=======

The MIT License (MIT)

Copyright (c) 2018 Harry Shomer
Copyright (c) 2019 Harry Shomer

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use,
44 changes: 40 additions & 4 deletions README.rst
@@ -14,8 +14,10 @@ Hockey-Scraper
Purpose
-------

This package is designed to allow people to scrape the Play by Play and Shift data off of the National Hockey League
(NHL) API and website for all preseason, regular season, and playoff games since the 2007-2008 season.
This package scrapes both National Hockey League (NHL) and National Women's Hockey League (NWHL) data. For the NHL, it
scrapes the Play by Play and Shift data from the NHL API and website for all preseason, regular season, and playoff
games since the 2007-2008 season. For the NWHL, it scrapes the Play by Play data from the NWHL API and website for all
preseason, regular season, and playoff games since the 2015-2016 season.

Prerequisites
-------------
@@ -38,8 +40,8 @@ To install all you need to do is open up your terminal and type in:



Usage
-----
NHL Usage
---------

Standard Scrape Functions
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -168,6 +170,40 @@ Here is a simple example of a way to setup live scraping. I strongly suggest che
to_csv(game)


NWHL Usage
----------

Scrape data on a season by season level:

::

    import hockey_scraper

    # Scrapes the 2015 & 2016 seasons and stores the data in a Csv file
    hockey_scraper.nwhl.scrape_seasons([2015, 2016])

    # Scrapes the 2017 season and returns a Pandas DataFrame containing the pbp
    scraped_data = hockey_scraper.nwhl.scrape_seasons([2017], data_format='Pandas')

Scrape a list of games:

::

    import hockey_scraper

    # Scrape some games and store the results in a Csv file
    # Also saves the scraped pages
    hockey_scraper.nwhl.scrape_games([14694271, 14814946, 14689491], docs_dir="...Path you specified")

Scrape all games in a given date range:

::

    import hockey_scraper

    # Scrapes all games between 2016-10-10 and 2017-01-01 and returns a Pandas DataFrame containing the pbp
    scraped_data = hockey_scraper.nwhl.scrape_date_range('2016-10-10', '2017-01-01', data_format='Pandas')


The full documentation can be found `here <http://hockey-scraper.readthedocs.io/en/latest/>`_.

Binary file modified docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/build/doctrees/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/license_link.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/live_scrape.doctree
Binary file not shown.
Binary file added docs/build/doctrees/nhl_scrape_functions.doctree
Binary file not shown.
Binary file added docs/build/doctrees/nwhl_scrape_functions.doctree
Binary file not shown.
Binary file removed docs/build/doctrees/scrape_functions.doctree
Binary file not shown.
3 changes: 2 additions & 1 deletion docs/build/html/_sources/index.rst.txt
@@ -6,7 +6,8 @@ Contents
.. toctree::
   :maxdepth: 1

   scrape_functions
   nhl_scrape_functions
   nwhl_scrape_functions
   live_scrape
   license_link

2 changes: 2 additions & 0 deletions docs/build/html/_sources/license_link.rst.txt
@@ -1 +1,3 @@
License
=======
.. include:: ../../LICENSE.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Scraping Functions
==================
NHL Scraping Functions
======================

Scraping
--------
62 changes: 62 additions & 0 deletions docs/build/html/_sources/nwhl_scrape_functions.rst.txt
@@ -0,0 +1,62 @@
NWHL Scraping Functions
=======================

Scraping
--------

There are three ways to scrape games:

\1. *Scrape by Season*:

Scrape games on a season by season level (Note: A given season is referred to by the first of the two years it spans.
So you would refer to the 2016-2017 season as 2016).
::

    import hockey_scraper

    # Scrapes the 2015 & 2016 seasons and stores the data in a Csv file (both are equivalent!!!)
    hockey_scraper.nwhl.scrape_seasons([2015, 2016])
    hockey_scraper.nwhl.scrape_seasons([2015, 2016], data_format='Csv')

    # Scrapes the 2017 season and returns a Pandas DataFrame
    scraped_data = hockey_scraper.nwhl.scrape_seasons([2017], data_format='Pandas')
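
The season naming convention described above can be captured in a small helper. This is only a sketch: ``season_start_year`` is a hypothetical name and is not part of hockey_scraper. It converts a label such as "2016-2017" into the integer start year that the season list expects.

```python
def season_start_year(label):
    """Return the first year of a season label such as '2016-2017'.

    Hypothetical helper for illustration; not part of hockey_scraper.
    """
    start, end = (int(part) for part in label.split("-"))
    # A season spans two consecutive calendar years
    if end != start + 1:
        raise ValueError(f"Season must span consecutive years: {label!r}")
    return start


# Build the list of start years to pass as the seasons argument
seasons = [season_start_year(s) for s in ("2015-2016", "2016-2017")]
print(seasons)
```

The resulting list has the same form as the ``[2015, 2016]`` argument used in the examples above.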


\2. *Scrape by Game*:

Scrape a list of games provided.
::

    import hockey_scraper

    # Scrapes games and stores them in a Csv file
    hockey_scraper.nwhl.scrape_games([14694271, 14814946, 14689491], True)

    # Scrapes games and returns a DataFrame with the data
    scraped_data = hockey_scraper.nwhl.scrape_games([14689624, 18507470, 20575219, 22207005], data_format='Pandas')

\3. *Scrape by Date Range*:

Scrape all games within a specified date range. All dates must be written in a "yyyy-mm-dd" format.
::

    import hockey_scraper

    # Scrapes all games between 2016-10-10 and 2017-01-01 and returns a Pandas DataFrame containing the pbp
    scraped_data = hockey_scraper.nwhl.scrape_date_range('2016-10-10', '2017-01-01', data_format='Pandas')
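
Because the dates are passed as strings, it can help to check them before starting a scrape. The sketch below uses only the standard library; ``parse_date_range`` is a hypothetical helper, not a hockey_scraper function, and whether the library performs its own validation is not assumed here.

```python
from datetime import datetime


def parse_date_range(from_date, to_date):
    """Validate two 'yyyy-mm-dd' strings and return them as date objects.

    Hypothetical helper for illustration; not part of hockey_scraper.
    """
    # strptime raises ValueError if a string cannot be parsed with this format
    start = datetime.strptime(from_date, "%Y-%m-%d").date()
    end = datetime.strptime(to_date, "%Y-%m-%d").date()
    if start > end:
        raise ValueError("from_date must not be later than to_date")
    return start, end


start, end = parse_date_range("2016-10-10", "2017-01-01")
print((end - start).days)
```

If both strings parse and are in order, the original strings can then be passed to the date range call shown above.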


Scrape Functions
~~~~~~~~~~~~~~~~
.. automodule:: hockey_scraper.nwhl.scrape_functions
   :members:

Html Schedule
~~~~~~~~~~~~~
.. automodule:: hockey_scraper.nwhl.html_schedule
   :members:

Json PBP
~~~~~~~~
.. automodule:: hockey_scraper.nwhl.json_pbp
   :members:
