## Test Final Model

## Code Setup

In [1]:
%load_ext autoreload

In [10]:
%autoreload 2

import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error


from augury.ml_estimators import StackingEstimator
from augury.sklearn.metrics import match_accuracy_scorer
from augury.ml_data import MLData
from augury.settings import TEST_YEAR_RANGE, SEED

np.random.seed(SEED)

In [5]:
data = MLData(train_year_range=(min(TEST_YEAR_RANGE),), test_year_range=TEST_YEAR_RANGE)
data.data

2020-03-17 07:36:04,026 - kedro.io.data_catalog - INFO - Loading data from `model_data` (JSONLocalDataSet)...


|  |  |  | team | oppo_team | round_type | venue | prev_match_oppo_team | oppo_prev_match_oppo_team | date | team_goals | team_behinds | score | ... | oppo_rolling_prev_match_goals_divided_by_rolling_prev_match_goals_plus_rolling_prev_match_behinds | win_odds | oppo_win_odds | line_odds | oppo_line_odds | betting_pred_win | rolling_betting_pred_win_rate | oppo_betting_pred_win | oppo_rolling_betting_pred_win_rate | win_odds_multiplied_by_ladder_position |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Adelaide | 1991 | 1 | Adelaide | Hawthorn | Regular | Football Park | 0 | Melbourne | 1991-03-22 03:56:00+00:00 | 24.0 | 11.0 | 155.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Adelaide | 1991 | 2 | Adelaide | Carlton | Regular | Football Park | Hawthorn | Fitzroy | 1991-03-31 03:56:00+00:00 | 12.0 | 9.0 | 81.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Adelaide | 1991 | 3 | Adelaide | Sydney | Regular | S.C.G. | Carlton | Hawthorn | 1991-04-07 03:05:00+00:00 | 19.0 | 18.0 | 132.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Adelaide | 1991 | 4 | Adelaide | Essendon | Regular | Windy Hill | Sydney | North Melbourne | 1991-04-13 03:30:00+00:00 | 6.0 | 11.0 | 47.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Adelaide | 1991 | 5 | Adelaide | West Coast | Regular | Subiaco | Essendon | North Melbourne | 1991-04-21 05:27:00+00:00 | 9.0 | 11.0 | 65.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Western Bulldogs | 2020 | 19 | Western Bulldogs | Richmond | Regular | M.C.G. | St Kilda | Gold Coast | 2020-07-26 03:30:00+00:00 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Western Bulldogs | 2020 | 20 | Western Bulldogs | Geelong | Regular | Kardinia Park | Richmond | Melbourne | 2020-08-01 06:55:00+00:00 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Western Bulldogs | 2020 | 21 | Western Bulldogs | North Melbourne | Regular | Docklands | Geelong | Carlton | 2020-08-09 05:40:00+00:00 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Western Bulldogs | 2020 | 22 | Western Bulldogs | Port Adelaide | Regular | Eureka Stadium | North Melbourne | Essendon | 2020-08-15 04:05:00+00:00 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


In [8]:
X_train, y_train = data.train_data

In [9]:
se = StackingEstimator()
se

StackingEstimator(min_year=1965, name='stacking_estimator',
                  pipeline=StackingRegressor(meta_regressor=Pipeline(memory=None,
                                                                     steps=[('standardscaler',
                                                                             StandardScaler(copy=True,
                                                                                            with_mean=True,
                                                                                            with_std=True)),
                                                                            ('extratreesregressor',
                                                                             ExtraTreesRegressor(bootstrap=False,
                                                                                                 ccp_alpha=0.0,
                                                                                                 criterion='mse',
   

## Train and predict for 2019 season

Until now, I've held out the 2019 season as the final test set. Hyperparameters were tuned on data up through the 2018 season, so the model hasn't "seen" 2019 yet. This gives us a good point of comparison with the 2019 vintage of Tipresias (a bagging ensemble composed of `XGBRegressor` estimators).

In [11]:
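# Fit on the training data, then predict the held-out test data.
se.fit(X_train, y_train)
X_test, y_test = data.test_data
y_pred = se.predict(X_test)

# Score the held-out season on both metrics.
print(f"Match Accuracy: {match_accuracy_scorer(se, X_test, y_test)}")
print(f"MAE: {mean_absolute_error(y_test, y_pred)}")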

Match Accuracy: 0.6763285024154589
MAE: 27.143333614200653
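
To make the two metrics concrete, here is a minimal sketch on made-up margins. It assumes the target is the match margin and that "match accuracy" means the share of matches where the predicted margin picks the same winner as the actual margin. The helper names `winner_accuracy` and `mean_abs_error` and the toy numbers are hypothetical; this is not how `match_accuracy_scorer` is implemented in `augury`, only an illustration of why a model can improve on one metric while slipping on the other.

```python
import numpy as np


def winner_accuracy(y_true, y_pred):
    """Share of matches where predicted and actual margins have the same sign.

    Assumption: a positive margin means the team won, a negative one means it lost.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


def mean_abs_error(y_true, y_pred):
    """Average absolute gap between predicted and actual margins."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Toy data: four matches with actual and predicted margins (hypothetical values).
actual_margins = [12, -30, 5, -8]
predicted_margins = [20, -4, -3, -15]

accuracy = winner_accuracy(actual_margins, predicted_margins)
mae = mean_abs_error(actual_margins, predicted_margins)

print(f"Match Accuracy: {accuracy}")
print(f"MAE: {mae}")
```

In this toy case the third match is tipped wrong even though the prediction is only 8 points off, while the second match is tipped right despite a 26-point error, so accuracy and MAE reward different things.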


## Conclusion

Tipresias 2019 got an accuracy of 0.6667 and an MAE of 27.087 on the 2019 season, against 0.6763 and 27.143 for the new model. Given that I optimised Tipresias 2019 for MAE and Tipresias 2020 for accuracy, this split (the new model slightly ahead on accuracy, slightly behind on MAE) makes sense. It also suggests that the overall performance of the two models isn't very different. That makes sense too: despite some big structural changes to the model, the shape of the data is the same as last year, and the addition of the ARIMA model has a very small impact on overall performance. This offseason was mostly about refactoring and cleaning up the code, so maybe next offseason I'll have more time to tinker with the model.
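
As a rough check on how small the gap is, the reported figures can be converted into match counts. The sketch below assumes the 2019 test set covers 207 matches (198 home-and-away games plus 9 finals); the notebook never prints that count, so `N_MATCHES` is an assumption, and the variable names are hypothetical.

```python
# Metrics reported in this notebook.
new_accuracy, new_mae = 0.6763285024154589, 27.143333614200653
old_accuracy, old_mae = 0.6667, 27.087

# Assumption: 207 matches in the 2019 season (not confirmed by the notebook output).
N_MATCHES = 207

accuracy_gain = new_accuracy - old_accuracy
mae_change = new_mae - old_mae

# Convert each accuracy into an implied number of correct tips.
new_correct = round(new_accuracy * N_MATCHES)
old_correct = round(old_accuracy * N_MATCHES)
extra_correct_tips = new_correct - old_correct

print(f"Accuracy gain: {accuracy_gain:.4f}")
print(f"MAE change: {mae_change:+.3f} points")
print(f"Extra correct tips over {N_MATCHES} matches: {extra_correct_tips}")
```

Under that assumption, the accuracy difference amounts to a couple of matches over the season, and the MAE difference is well under a tenth of a point, which fits the reading that the two models perform about the same.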