This is version 2 of my Fantasy Hockey Analyzer. This notebook predicts how many fantasy points each hockey player in the league will score, based on performance in previous years.

This notebook primarily uses data from moneypuck.com for analysis, and it also uses data from rotowire.com to get +/- for each player.

Section 1: Parameters and Modules

These are the variables that can be adjusted. My model is an ensemble of neural nets and random forests, with separate groups of models that use one, two, and three years of previous data.

In [1]:
# Set these values to the appropriate amounts

current_year = 2025
common_number = 100  # count assigned to each of the six model groups below
number_of_one_year_neural_nets = common_number
number_of_two_year_neural_nets = common_number
number_of_three_year_neural_nets = common_number
number_of_one_year_random_forests = common_number
number_of_two_year_random_forests = common_number
number_of_three_year_random_forests = common_number

# This is the breakdown of how many fantasy points a player gets for each category
points_dict = {
    "Goals": 5,
    "Assists": 3,
    "+/-": 1.5,
    "PIM": -0.25,
    "PP_Goals": 4,
    "PP_Assists": 2,
    "SH_Goals": 6,
    "SH_Assists": 4,
    "Faceoffs_Won": 0.25,
    "Faceoffs_Lost": -0.15,
    "Hits": 0.5,
    "Blocked_Shots": 0.75,
}
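
As a rough illustration of how the weights above might be applied, the sketch below scores one made-up stat line as a weighted sum over the categories. The `fantasy_points` function and the `example_player` numbers are hypothetical and are not part of the notebook or of "my_module.py". The sketch also assumes each category is counted independently; the league's actual rules (for example, whether a power-play goal also earns the "Goals" points) are not stated here.

```python
# Scoring weights copied from the parameter cell above.
points_dict = {
    "Goals": 5, "Assists": 3, "+/-": 1.5, "PIM": -0.25,
    "PP_Goals": 4, "PP_Assists": 2, "SH_Goals": 6, "SH_Assists": 4,
    "Faceoffs_Won": 0.25, "Faceoffs_Lost": -0.15,
    "Hits": 0.5, "Blocked_Shots": 0.75,
}


def fantasy_points(stat_line, weights=points_dict):
    """Hypothetical helper: weighted sum of a player's category totals.

    Categories missing from stat_line are treated as zero.
    """
    return sum(weight * stat_line.get(category, 0)
               for category, weight in weights.items())


# Made-up season totals, for illustration only.
example_player = {
    "Goals": 30, "Assists": 40, "+/-": 10, "PIM": 20,
    "PP_Goals": 8, "PP_Assists": 12, "SH_Goals": 1, "SH_Assists": 0,
    "Faceoffs_Won": 100, "Faceoffs_Lost": 80,
    "Hits": 50, "Blocked_Shots": 20,
}

total = fantasy_points(example_player)
print(round(total, 2))  # 395.0
```

Keeping the weights in a single dictionary means a different league's scoring can be tried by editing only `points_dict`.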


The following is a list of modules that I used and the reason why they were used:

-os: to allow the program to read data in the repository

-numpy: basic math operations

-pandas: all dataframe operations/data storage/data cleaning

-sklearn (various submodules): all machine learning operations/analysis

In addition to these modules, I also use a custom module of helper functions for data cleaning and accuracy evaluation. These functions are contained in the "my_module.py" file in the repository, which is available at https://github.com/chrisberry888/FantasyHockeyAnalyzer if you are interested in taking a look.

In [2]:
# Import block
import os
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.base import clone

import my_module_v2 as mx
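
To make the ensemble idea from Section 1 concrete, here is a minimal sketch that uses only the sklearn pieces imported above. It is not the notebook's actual training code. The synthetic data from `make_regression`, the model hyperparameters, the small `n_models` value, and the equal-weight averaging of predictions are all assumptions made for illustration. It also ignores the one, two, and three year split of the input data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for player features and fantasy-point targets (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Kept small so the sketch runs quickly; the notebook sets common_number = 100.
n_models = 3

# One template per model family; hyperparameters are illustrative guesses.
base_models = [
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=500),
    RandomForestRegressor(n_estimators=25),
]

# Fit several copies of each template, varying only the random seed.
fitted = []
for base in base_models:
    for seed in range(n_models):
        model = clone(base).set_params(random_state=seed)
        fitted.append(model.fit(X_train, y_train))

# Equal-weight average of every model's predictions (assumption).
ensemble_pred = np.mean([m.predict(X_test) for m in fitted], axis=0)
mae = mean_absolute_error(y_test, ensemble_pred)
print(f"Ensemble MAE on held-out data: {mae:.2f}")
```

`clone` copies a model's settings without its fitted state, so each copy starts untrained and the copies differ only by seed.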

Section 2: Data Gathering and Cleaning

This section compiles the Moneypuck and Rotowire data into a format that is usable by the ML models.

In [None]:
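# Placeholder for per-season player data (not filled in by this cell)
yearly_player_data = []

# Every team abbreviation seen in each source, in order of first appearance
mp_teams = []
rw_teams = []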

for year in range(2010, 2025):

    moneypuck_data = mx.get_moneypuck_data(year)
    rotowire_data = mx.get_rotowire_data(year)

    #combined_data = mx.combine_dataframes(moneypuck_data, rotowire_data)
    #display(moneypuck_data.head())
    #display(rotowire_data.head())

    # Collect this season's team abbreviations; the two sources name the column differently
    mp_year_teams = moneypuck_data['team'].unique()
    rw_year_teams = rotowire_data['Team'].unique()

    for team in mp_year_teams:
        if team not in mp_teams:
            mp_teams.append(team)

    for team in rw_year_teams:
        if team not in rw_teams:
            rw_teams.append(team)

# Print the Moneypuck abbreviations, two blank lines, then the Rotowire abbreviations
for team in mp_teams:
    print(team)
print()
print()
for team in rw_teams:
    print(team)


NYI
MTL
MIN
COL
FLA
STL
PIT
NSH
CHI
WSH
S.J
N.J
CBJ
BUF
ARI
NYR
ANA
VAN
CGY
TOR
DET
T.B
DAL
EDM
L.A
BOS
OTT
ATL
PHI
CAR
WPG
VGK
LAK
SJS
TBL
NJD
SEA
UTA


TB
FA
PIT
FLA
WAS
CHI
VGK
SJ
BUF
CGY
TOR
ARI
NSH
LA
OTT
DAL
BOS
CAR
PHI
EDM
NYI
SEA
WPG
CLS
MIN
STL
COL
ANH
VAN
NJ
MON
DET
NYR
TBL
ANA
UTA
SJS
MTL
LAK
WSH
NJD
CBJ
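
The output above shows that the two sources, and even different seasons within one source, spell some teams differently (for example `S.J`, `SJS`, and `SJ`). One way this might be handled before combining the data is an alias table that maps every variant to a single abbreviation. The sketch below is an assumption, not the notebook's method: `TEAM_ALIASES` and `normalize_teams` are hypothetical names, the pairings are inferred from the printed lists, and codes such as `FA`, `ATL`, and `ARI` are left unchanged because the output alone does not say how they should be treated.

```python
import pandas as pd

# Hypothetical alias table: variant spelling -> single canonical abbreviation.
# Pairings are inferred from the printed lists above, not taken from my_module.
TEAM_ALIASES = {
    "S.J": "SJS", "SJ": "SJS",
    "N.J": "NJD", "NJ": "NJD",
    "T.B": "TBL", "TB": "TBL",
    "L.A": "LAK", "LA": "LAK",
    "WAS": "WSH",
    "CLS": "CBJ",
    "ANH": "ANA",
    "MON": "MTL",
}


def normalize_teams(teams: pd.Series) -> pd.Series:
    """Map known variants to one abbreviation; leave everything else as-is."""
    return teams.map(lambda abbr: TEAM_ALIASES.get(abbr, abbr))


# Tiny made-up samples standing in for the 'team' and 'Team' columns.
moneypuck_teams = pd.Series(["S.J", "T.B", "NYI", "LAK"])
rotowire_teams = pd.Series(["SJ", "TB", "NYI", "LA"])

print(normalize_teams(moneypuck_teams).tolist())
print(normalize_teams(rotowire_teams).tolist())
```

After this step both samples use the same four abbreviations, so rows from the two sources could be matched on team without missing players whose team was spelled differently.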
