# Inference

We previously trained a model. Here we give it new data and attempt to generate some predictions.

First, load the previously trained model

In [1]:
import models

mod = models.loadModel("model.joblib")
mod

Then load the new dataset

In [2]:
trainData, testData, cols = models.loadData("cleaned_results.csv")

print(f"Columns: {', '.join(cols)}")
trainData["x"].head()

Columns: Home_Team, Away_Team, Season, Round, Elo_home, Elo_away, awayGoal, homeGoal, awayGoalTotal, homeGoalTotal, homeStreak, awayStreak, homeStreakTotal, awayStreakTotal


Unnamed: 0,Home_Team,Away_Team,Season,Round,Elo_home,Elo_away,awayGoal,homeGoal,awayGoalTotal,homeGoalTotal,homeStreak,awayStreak,homeStreakTotal,awayStreakTotal
668,355,372,2022,14,48,59,13,11,13,11,2,0,2,2
3745,494,432,2022,12,71,76,11,10,11,10,1,0,2,2
1284,249,407,2022,31,52,65,34,30,34,30,0,0,3,3
70,225,431,2022,8,54,52,6,3,6,3,0,1,2,1
1105,231,24,2022,13,49,58,11,7,11,7,0,1,2,2


We also need to reduce our dataset to the selected features

In [3]:
import numpy as np
feats = list(np.load("selectedFeatures.npy"))

trainData = models.subData(trainData, feats)
testData = models.subData(testData, feats)

print(f"Selected features: {', '.join(feats)}")
trainData["x"].head()

Selected features: Home_Team, Away_Team, Season, Round, Elo_home, Elo_away


Unnamed: 0,Away_Team,Elo_away,Elo_home,Home_Team,Round,Season
668,372,59,48,355,14,2022
3745,432,76,71,494,12,2022
1284,407,65,52,249,31,2022
70,431,52,54,225,8,2022
1105,24,58,49,231,13,2022
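The `models` module is not shown in this notebook, so the following is only a minimal sketch of what a helper like `models.subData` might do, assuming the data is a dict holding a pandas DataFrame under `"x"`. The name `sub_data` and the sorted column order are assumptions, the latter inferred from the alphabetical column order in the output above.

```python
import pandas as pd


def sub_data(data, features):
    """Hypothetical stand-in for models.subData: keep only `features` in data["x"]."""
    reduced = dict(data)  # shallow copy so the caller's dict is left untouched
    # Sorted column order, mirroring the alphabetical order seen in the output above
    reduced["x"] = data["x"][sorted(features)]
    return reduced


# Toy frame built from the first two rows displayed earlier (one extra column kept)
toy = {
    "x": pd.DataFrame(
        {
            "Home_Team": [355, 494],
            "Away_Team": [372, 432],
            "Season": [2022, 2022],
            "Round": [14, 12],
            "Elo_home": [48, 71],
            "Elo_away": [59, 76],
            "awayGoal": [13, 11],
        },
        index=[668, 3745],
    )
}
feats = ["Home_Team", "Away_Team", "Season", "Round", "Elo_home", "Elo_away"]

reduced = sub_data(toy, feats)
print(list(reduced["x"].columns))
```

Fixing the column order in one place means the training, test, and prediction frames all present features to the model in the same order.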


We can then retrain this model on the newer data and evaluate it

In [4]:
models.trainAndScore(mod, trainData, testData)
models.performace(mod, trainData, testData)

Training model: LinearDiscriminantAnalysis
Performance summary for LinearDiscriminantAnalysis
Score:
- Training:  0.4685
- Testing:   0.4595
- Difference:-0.0091
Performance summary for LinearDiscriminantAnalysis
Score:
- Training:  0.4685
- Testing:   0.4595
- Difference:-0.0091


(0.4594594594594595, 0.46854791299235743)
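The `Difference` line appears to be the test score minus the training score. The short check below reproduces it from the returned tuple, assuming the tuple is ordered (test, train) as the printed values suggest. The chance level is an added point of comparison and assumes the four outcome codes (0 to 3) used later in this notebook.

```python
# Scores copied from the tuple returned above; the (test, train) order is an assumption
test_score, train_score = 0.4594594594594595, 0.46854791299235743

# A small gap between the two suggests the model is not overfitting
gap = test_score - train_score
print(f"Difference: {gap:.4f}")

# Expected accuracy of guessing uniformly at random among four outcome classes
n_classes = 4
chance = 1 / n_classes
print(f"Test score above uniform guessing by {test_score - chance:.4f}")
```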

# Predictions

Now that the model has been trained on the newest dataset, we can start to generate predictions

First we load the new features and ensure they are reduced to our selected features

In [5]:
newData, _, newCols = models.loadData("to_predict.csv", 0, hasY=False)
newData = models.subData(newData, feats)

newData["x"].head()

Unnamed: 0,Away_Team,Elo_away,Elo_home,Home_Team,Round,Season
64,215,34,44,148,33.0,2022.0
83,69,63,74,293,30.0,2022.0
129,124,51,42,127,34.0,2022.0
131,132,47,55,199,34.0,2022.0
130,419,53,50,264,34.0,2022.0
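Before predicting, it can be worth confirming that the new frame has the same columns, in the same order, as the training frame. The sketch below is one way to do that. `check_features` is a hypothetical helper, not part of the `models` module, and the toy frames reuse the first displayed row of each table. It also reports dtype differences, such as `Round` and `Season` appearing as floats in the new data.

```python
import pandas as pd


def check_features(train_x, new_x):
    """Raise if column names or order differ; return any dtype mismatches."""
    if list(train_x.columns) != list(new_x.columns):
        raise ValueError(
            f"Column mismatch: {list(train_x.columns)} vs {list(new_x.columns)}"
        )
    # Map each column whose dtype differs to its (training dtype, new dtype) pair
    return {
        col: (str(train_x[col].dtype), str(new_x[col].dtype))
        for col in train_x.columns
        if train_x[col].dtype != new_x[col].dtype
    }


# First displayed row of the training table and of the prediction table
train_x = pd.DataFrame(
    {"Away_Team": [372], "Elo_away": [59], "Elo_home": [48],
     "Home_Team": [355], "Round": [14], "Season": [2022]},
    index=[668],
)
new_x = pd.DataFrame(
    {"Away_Team": [215], "Elo_away": [34], "Elo_home": [44],
     "Home_Team": [148], "Round": [33.0], "Season": [2022.0]},
    index=[64],
)

mismatches = check_features(train_x, new_x)
print(mismatches)
```

A dtype difference like this does not necessarily affect the prediction, but it is a cheap signal that the two files were processed differently.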


And then make a set of predictions

In [7]:
prediction = mod.predict(newData["x"])
prediction

array([2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 3, 2, 2, 3, 2, 3, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 2, 2, 2, 2, 2, 2, 3, 2,
       2, 3, 3, 2, 3, 2, 3, 2, 2, 2, 2, 2, 3, 2, 2, 3, 2, 2, 2, 2, 2, 2,
       3, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2,
       3, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 2, 3, 2, 2, 3, 2, 2, 3, 2, 3, 2,
       2, 2, 2, 3, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 3, 2,
       2, 2, 3, 2, 2, 2])

Which we can translate into a set of outcomes

In [15]:
predicted = newData["x"].copy()
predicted["Outcome"] = prediction

# Convert the numeric class codes to readable labels:
# 0 = goalless draw, 1 = score draw, 2 = home win, 3 = away win
labels = {0: "0-0", 1: "Tie", 2: "Home Win", 3: "Away Win"}
# Assign the result back rather than calling replace(..., inplace=True) on the
# column, which may act on a temporary copy in recent pandas versions
predicted["Outcome"] = predicted["Outcome"].map(labels)

predicted

Unnamed: 0,Away_Team,Elo_away,Elo_home,Home_Team,Round,Season,Outcome
64,215,34,44,148,33.0,2022.0,Home Win
83,69,63,74,293,30.0,2022.0,Home Win
129,124,51,42,127,34.0,2022.0,Home Win
131,132,47,55,199,34.0,2022.0,Home Win
130,419,53,50,264,34.0,2022.0,Home Win
...,...,...,...,...,...,...,...
77,315,70,89,478,31.0,2022.0,Home Win
44,342,89,59,102,30.0,2022.0,Away Win
35,417,46,60,503,35.0,2022.0,Home Win
97,138,53,98,50,32.0,2022.0,Home Win


And show some interesting stats about the prediction

In [19]:
h = np.count_nonzero(prediction == 2)  # home wins
a = np.count_nonzero(prediction == 3)  # away wins
t = len(prediction) - h - a            # everything else: 0-0 and other ties

f"The model predicts there will be {t} Ties, {h} Home wins, and {a} Away wins."

'The model predicts there will be 0 Ties, 108 Home wins, and 30 Away wins.'
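The tally above folds both draw codes into a single `t`. As a possible alternative, `np.bincount` counts every class code in one pass and reports zero for classes that never occur. The sketch below assumes the 0 to 3 coding from the label comment earlier. `outcome_counts` is a hypothetical helper and the toy array is illustrative, not model output.

```python
import numpy as np

# Class codes 0..3 in order, following the label comment earlier in the notebook
LABELS = ["0-0", "Tie", "Home Win", "Away Win"]


def outcome_counts(pred):
    """Count predictions per class code, including classes that never appear."""
    counts = np.bincount(pred, minlength=len(LABELS))
    return dict(zip(LABELS, counts.tolist()))


toy_prediction = np.array([2, 2, 3, 2, 1, 0, 2])  # illustrative values only
print(outcome_counts(toy_prediction))
```

Applied to the real `prediction` array, this would make it explicit that neither draw class is ever predicted.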