To analyse the teams in the league, we need to get the results in a format we can use.

The results are hosted on Yahoo's fantasy hockey site. To date, I've manually retrieved each week's results and saved them as a CSV file (I'll automate it someday). Each week's results are in a separate file, and all the files sit in the same directory.

I need to retrieve the results from the CSV files and put them into a data structure that is easy to work with.

Prepare this workspace

In [1]:
import numpy as np
import csv
import pandas as pd
#import matplotlib.pyplot as plt
#%matplotlib qt
#%matplotlib inline

There are 12 teams in the league. Their full and abbreviated names will come in handy when displaying results, so let's code them in.

The category names are also coded in (Goals, Assists, Plus/Minus, Hits, Blocked Shots, Goalie Wins, Goalie Save Percentage).

In [2]:
names = ['Basement Dwellers','Chotchmahoneless','Dice-n-Draft','Dont Toews Me Bro','Happys Hustlers','Hard Off the Glass','Neals Neat Team','Newfie Rockers','RyansNOTsoRandomTeam','The Gallows Pole', 'TopShelf','Tylers Tilers']
shortnames = ['BDwell','Chotch','D-n-D','Toews','Hustle','HrdGls','NNeatT','Newfie','RnsRT','T G P','TpShlf','Tilers']

cats = ['G', 'A', '+/-', 'Hits', 'Blk', 'W', "SV%"]
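Since the results will be addressed by position rather than by name, it can help to build lookups from the labels to their positions. The sketch below is one way to do that; `team_index` and `cat_index` are hypothetical helper names that do not appear elsewhere in this notebook.

```python
# Labels copied from the cell above.
names = ['Basement Dwellers','Chotchmahoneless','Dice-n-Draft','Dont Toews Me Bro','Happys Hustlers','Hard Off the Glass','Neals Neat Team','Newfie Rockers','RyansNOTsoRandomTeam','The Gallows Pole', 'TopShelf','Tylers Tilers']
shortnames = ['BDwell','Chotch','D-n-D','Toews','Hustle','HrdGls','NNeatT','Newfie','RnsRT','T G P','TpShlf','Tilers']
cats = ['G', 'A', '+/-', 'Hits', 'Blk', 'W', "SV%"]

# Hypothetical helpers: map each label to its position in the list.
team_index = {name: i for i, name in enumerate(names)}
cat_index = {cat: i for i, cat in enumerate(cats)}

# Look up a team's position, its short name, and a category's position.
gallows = team_index['The Gallows Pole']
print(gallows, shortnames[gallows], cat_index['G'])
```

With these in place, a position can be found by name instead of by counting through the lists.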

We want to load the first week's results into a NumPy array.

I've uploaded the league results for weeks 1 through 19 to a GitHub repository, so we'll grab them from there.

In [4]:
# define a function to load one week's results from GitHub
def loadweekonline(weeknumber):
    """Return one week's results as a 2D array (rows: teams, columns: categories)."""
    url = 'https://raw.githubusercontent.com/scibbatical/fan_hockey/master/w%s.csv' % weeknumber
    results = np.array(pd.read_csv(url))

    return results

# load a file
w1 = loadweekonline(1)
        
# show results
print(w1)

[[  17.      32.       4.      40.      41.       4.       0.922]
 [  12.      18.     -14.      57.      42.       5.       0.908]
 [   6.      22.      -7.      71.      42.       5.       0.893]
 [   9.       9.       4.     113.      54.       4.       0.906]
 [  13.      25.       1.      65.      47.       1.       0.897]
 [   7.      16.      -9.      84.      56.       2.       0.916]
 [  11.      27.      -2.      54.      65.       3.       0.891]
 [  11.      35.      16.      95.      58.       6.       0.907]
 [  14.      33.      -1.      30.      39.       5.       0.898]
 [  10.      20.      14.      46.      38.       4.       0.897]
 [   9.      17.      -2.     101.      51.       2.       0.906]]


This is great. A week's results are a 2D array. Each row is a team's stats for week 1, and each column is a stat category.
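The printed array has 11 rows, while 12 teams are listed above. One possible explanation, not verified against the actual files, is that the CSV files have no header row: by default `pd.read_csv` treats the first line as column names, so that line would not show up as data. The sketch below illustrates the behaviour on a made-up in-memory sample (the `sample` string is invented for the demonstration).

```python
import io

import numpy as np
import pandas as pd

# Made-up stand-in for a weekly file: three rows of stats, no header row.
sample = "17,32,4\n12,18,-14\n6,22,-7\n"

# Default behaviour: the first line is consumed as the column names.
with_default = np.array(pd.read_csv(io.StringIO(sample)))

# header=None tells pandas that every line is data.
with_no_header = np.array(pd.read_csv(io.StringIO(sample), header=None))

print(with_default.shape)
print(with_no_header.shape)
```

If the real files turn out to lack a header row, passing `header=None` in `loadweekonline` would keep the first team's line. If they do have one, the row count needs another explanation.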

We can add another dimension to the data: time. Each week will be represented by a layer, and each layer will be a 2D array of results.

Let's define a function to create our 3D result array, then use it to load 19 weeks of results.

In [5]:
def compresultsonline(uptoweek):
    """Return a 3D array of results for weeks 1 through uptoweek.

    The axes are (week, team, category).
    """
    # load each week as a 2D array
    weeks = [loadweekonline(week) for week in range(1, uptoweek + 1)]

    # stack the weekly arrays along a new first axis
    return np.stack(weeks)


BHLresults = compresultsonline(19)
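A quick way to confirm the layout is to check the array's shape. The sketch below runs offline by using a hypothetical `fake_week` function in place of `loadweekonline`, and it assumes 12 teams and 7 categories per week. It builds the 3D array with `np.stack`, which joins equally shaped 2D arrays along a new first axis.

```python
import numpy as np

def fake_week(weeknumber, n_teams=12, n_cats=7):
    # Hypothetical stand-in for loadweekonline: every entry equals the week number.
    return np.full((n_teams, n_cats), float(weeknumber))

# One 2D array per week, stacked into (week, team, category).
stacked = np.stack([fake_week(week) for week in range(1, 20)])

print(stacked.shape)
```

The first axis should match the number of weeks loaded, the second the number of teams, and the third the number of categories.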

The data is now in a 3D array, from which we can grab values by week, team, and category index:

BHLresults[(week),(team),(category)]

All three indices are zero-based, so week 1 is index 0. For example, The Gallows Pole's (team index 9) performance in Goals (category index 0) over the season can be found with:

In [11]:
print(BHLresults[:,9,0])

[ 10.    4.    7.    7.    4.    5.    7.   10.    8.    6.    5.   12.
   8.    6.    9.5   8.    6.    7.    7. ]
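The same indexing pattern extends to other slices. The sketch below uses a small made-up array, `demo`, with 3 weeks, 2 teams, and 2 categories, so the values are illustrative and not league data.

```python
import numpy as np

# Made-up stand-in for BHLresults: axes are (week, team, category).
demo = np.array([
    [[10, 20], [5, 8]],
    [[4, 15], [7, 9]],
    [[7, 18], [6, 10]],
], dtype=float)

week, team, cat = 0, 1, 0

single_value = demo[week, team, cat]           # one team, one category, one week
team_over_time = demo[:, team, cat]            # one team's category across all weeks
all_teams_one_week = demo[week, :, cat]        # every team's category in one week
season_totals = demo[:, :, cat].sum(axis=0)    # per-team total over all weeks

print(single_value)
print(team_over_time)
print(all_teams_one_week)
print(season_totals)
```

A colon in any position selects everything along that axis, and reducing over axis 0 collapses the weeks into a single per-team figure.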
