fastRhockey-data

PHF & NHL Data

This repository holds historical boxscore and play-by-play data for the Premier Hockey Federation (PHF, formerly the NWHL). The data was compiled with the fastRhockey package, which is available on GitHub.

You can find fastRhockey here: BenHowell71/fastRhockey

The scraper was created to increase access to play-by-play and boxscore data for the PHF. Limited access to that data has historically been one of the bigger barriers to entry in women’s hockey analytics.


Data

This repo contains three main CSV files, each outlined below.

  • phf_meta_data.csv: one row per game, holding the game-level data: home/away teams, arena information, game IDs, league IDs, and more.
  • boxscore.csv: the PHF boxscore information for the games in phf_meta_data.csv. It includes the game ID, scoring by period, shots by period, power play numbers, and more, all broken down by each team involved in a game.
  • play_by_play.csv: all the play-by-play data from the PHF. It includes information on events, how many skaters were on the ice, penalties, shots, etc. This data is essentially complete for the more recent PHF seasons, but it is spottier for the early seasons of the league, where it usually covers just goals and penalties.
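Since boxscore.csv is broken down by team while phf_meta_data.csv holds one row per game, a common first step is to attach the game-level fields to each boxscore row via the game ID. The following is a minimal sketch of that join using only the Python standard library. The column names (`game_id`, `home_team`, `away_team`, `team`, `goals`) and the sample rows are placeholders invented for illustration; check the headers of the actual files before relying on them.

```python
import csv
import io

# Placeholder rows standing in for the real files. The column names
# (game_id, home_team, away_team, team, goals) are assumptions, and the
# values are made up; they are not taken from the repository.
META_CSV = """game_id,home_team,away_team
1,Team A,Team B
2,Team C,Team A
"""

BOXSCORE_CSV = """game_id,team,goals
1,Team A,3
1,Team B,2
2,Team C,1
2,Team A,4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_boxscore_to_meta(meta_rows, box_rows, key="game_id"):
    """Attach game-level metadata to each per-team boxscore row."""
    # Index the metadata by game ID so each lookup is a dict access.
    meta_by_game = {row[key]: row for row in meta_rows}
    joined = []
    for box in box_rows:
        meta = meta_by_game.get(box[key])
        if meta is None:
            # Skip boxscore rows whose game has no metadata row.
            continue
        joined.append({**meta, **box})
    return joined


joined = join_boxscore_to_meta(read_rows(META_CSV), read_rows(BOXSCORE_CSV))
for row in joined:
    print(row["game_id"], row["team"], row["goals"], "home:", row["home_team"])
```

To run this against downloaded files, replace `io.StringIO(text)` with a file handle such as `open("boxscore.csv", newline="")` and adjust `key` to whatever the game ID column is actually called.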

The best way to get familiar with this data is to use it! You can either download directly from this repo or use fastRhockey to scrape the data yourself.


Follow SportsDataverse on Twitter and star fastRhockey

Our Authors

Our Contributors (they’re awesome)

Citations

To cite the fastRhockey R package in publications, use:

BibTeX Citation

@misc{howell_fastRhockey_2021,
  author = {Ben Howell},
  title = {fastRhockey: The SportsDataverse's R Package for Women's Hockey Data.},
  url = {https://benhowell71.github.io/fastRhockey/},
  year = {2021}
}