Korfball Game-By-Game Predictions
=====

In this notebook we will create a Python class for a league table that lets us add games one at a time and update the standings. As new results come in, or as we predict results, we can add them to the table by calling a single method. In this version of the notebook we will build a simple model that predicts the result of each game from the recent form of the two teams. Scroll to the bottom of this notebook to see the final results. For a basic look at how to make overall predictions without looking at each game, [see this notebook](link.to.notebook).

In [1]:
import numpy as np
import pandas as pd

DATA_DIR = '../data/'

## Get Previous Results

Firstly we will read the National Korfball League data for the season so far.

In [2]:
data = pd.read_csv(DATA_DIR + 'all_data.csv', index_col=0)

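Then we will create a new dataframe containing only the results for the current season. In the CSV file we have one row per game, giving the season, the home and away teams, and their scores. Games that have not been played yet have no scores.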

In [3]:
data

Unnamed: 0,Season,Home Team,Home Score,-,Away Score,Away Team
0,2014/15,Trojans 1,22.0,-,16.0,Kingfisher 1
1,2014/15,Bec 1,26.0,-,16.0,Norwich Knights 1
2,2014/15,Birmingham City 1,5.0,-,21.0,Nomads 1
3,2014/15,Nomads 1,14.0,-,21.0,KV 1
4,2014/15,Trojans 1,27.0,-,10.0,Tornadoes 1
...,...,...,...,...,...,...
433,2019/20,Cambridge Tigers 1,,-,,Tornadoes 1
434,2019/20,Highbury 1,,-,,Kingfisher 1
435,2019/20,Bec 1,,-,,Tornadoes 1
436,2019/20,Norwich Knights 1,,-,,Highbury 1


In [4]:
results = data.loc[data.Season == '2019/20'].copy()

In [5]:
results.reset_index(drop=True, inplace=True)
results

Unnamed: 0,Season,Home Team,Home Score,-,Away Score,Away Team
0,2019/20,Norwich Knights 1,20.0,-,13.0,KV 1
1,2019/20,Bec 1,24.0,-,17.0,Trojans 1
2,2019/20,Kingfisher 1,23.0,-,18.0,Highbury 1
3,2019/20,Cambridge Tigers 1,19.0,-,13.0,Bristol Thunder 1
4,2019/20,Tornadoes 1,31.0,-,12.0,Nomads 1
...,...,...,...,...,...,...
85,2019/20,Cambridge Tigers 1,,-,,Tornadoes 1
86,2019/20,Highbury 1,,-,,Kingfisher 1
87,2019/20,Bec 1,,-,,Tornadoes 1
88,2019/20,Norwich Knights 1,,-,,Highbury 1


In [6]:
results.drop(['Season', '-'], axis=1, inplace=True)

In [7]:
results

Unnamed: 0,Home Team,Home Score,Away Score,Away Team
0,Norwich Knights 1,20.0,13.0,KV 1
1,Bec 1,24.0,17.0,Trojans 1
2,Kingfisher 1,23.0,18.0,Highbury 1
3,Cambridge Tigers 1,19.0,13.0,Bristol Thunder 1
4,Tornadoes 1,31.0,12.0,Nomads 1
...,...,...,...,...
85,Cambridge Tigers 1,,,Tornadoes 1
86,Highbury 1,,,Kingfisher 1
87,Bec 1,,,Tornadoes 1
88,Norwich Knights 1,,,Highbury 1


In [8]:
cols = ['teamA', 'scoredA', 'scoredB', 'teamB']
results.columns = cols

In [9]:
col_order = ['teamA', 'scoredA', 'teamB', 'scoredB']
results = results[col_order].copy()


In [10]:
# Add additional columns
results.loc[:, 'teamAPosition'] = 0
results.loc[:, 'teamBPosition'] = 0
results.loc[:, 'teamARecentScored'] = 0
results.loc[:, 'teamBRecentScored'] = 0
results.loc[:, 'teamARecentConceded'] = 0
results.loc[:, 'teamBRecentConceded'] = 0
results.loc[:, 'teamARecentPoints'] = 0
results.loc[:, 'teamBRecentPoints'] = 0
results

Unnamed: 0,teamA,scoredA,teamB,scoredB,teamAPosition,teamBPosition,teamARecentScored,teamBRecentScored,teamARecentConceded,teamBRecentConceded,teamARecentPoints,teamBRecentPoints
0,Norwich Knights 1,20.0,KV 1,13.0,0,0,0,0,0,0,0,0
1,Bec 1,24.0,Trojans 1,17.0,0,0,0,0,0,0,0,0
2,Kingfisher 1,23.0,Highbury 1,18.0,0,0,0,0,0,0,0,0
3,Cambridge Tigers 1,19.0,Bristol Thunder 1,13.0,0,0,0,0,0,0,0,0
4,Tornadoes 1,31.0,Nomads 1,12.0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...
85,Cambridge Tigers 1,,Tornadoes 1,,0,0,0,0,0,0,0,0
86,Highbury 1,,Kingfisher 1,,0,0,0,0,0,0,0,0
87,Bec 1,,Tornadoes 1,,0,0,0,0,0,0,0,0
88,Norwich Knights 1,,Highbury 1,,0,0,0,0,0,0,0,0


## League Table Class
The following class can be used to create a league table. The constructor takes only the list of teams in the league. Results can then be added to the table by calling the `add_result()` method, which takes the two teams and the goals each of them scored. A win is worth 2 points and a draw 1 point, and the table is ordered by points, then goal difference, then goals scored.

In [11]:
class LeagueTable:
    def __init__(self, teams):
        self.teams = list(teams)
        stat_cols = ['P', 'W', 'D', 'L', 'F', 'A', 'GD', 'Pts']
        # One row per team, with every statistic starting at integer zero
        self.table = pd.DataFrame({'Team': self.teams, **{col: 0 for col in stat_cols}})
        self.positions = np.array(range(1, len(self.teams) + 1))
        self.sort_table()
    
    def sort_table(self):
        # Order by points, then goal difference, then goals for; ties are broken alphabetically
        self.table.sort_values(by=['Pts', 'GD', 'F', 'Team'], ascending=[False, False, False, True], inplace=True)
        self.table.set_index(self.positions, inplace=True)
        
    def show_table(self):
        return self.table.head(len(self.teams))
    
    def add_result(self, team_a, scored_a, team_b, scored_b):
        # Team A
        self.table.loc[self.table['Team'] == team_a, 'P'] += 1
        self.table.loc[self.table['Team'] == team_a, 'W'] += 1 if int(scored_a) > int(scored_b) else 0
        self.table.loc[self.table['Team'] == team_a, 'D'] += 1 if int(scored_a) == int(scored_b) else 0
        self.table.loc[self.table['Team'] == team_a, 'L'] += 1 if int(scored_a) < int(scored_b) else 0
        self.table.loc[self.table['Team'] == team_a, 'F'] += int(scored_a)
        self.table.loc[self.table['Team'] == team_a, 'A'] += int(scored_b)
        self.table.loc[self.table['Team'] == team_a, 'GD'] += int(scored_a) - int(scored_b)
        self.table.loc[self.table['Team'] == team_a, 'Pts'] += 2 if int(scored_a) > int(scored_b) else 1 if int(scored_a) == int(scored_b) else 0
        # Team B
        self.table.loc[self.table['Team'] == team_b, 'P'] += 1
        self.table.loc[self.table['Team'] == team_b, 'W'] += 1 if int(scored_b) > int(scored_a) else 0
        self.table.loc[self.table['Team'] == team_b, 'D'] += 1 if int(scored_b) == int(scored_a) else 0
        self.table.loc[self.table['Team'] == team_b, 'L'] += 1 if int(scored_b) < int(scored_a) else 0
        self.table.loc[self.table['Team'] == team_b, 'F'] += int(scored_b)
        self.table.loc[self.table['Team'] == team_b, 'A'] += int(scored_a)
        self.table.loc[self.table['Team'] == team_b, 'GD'] += int(scored_b) - int(scored_a)
        self.table.loc[self.table['Team'] == team_b, 'Pts'] += 2 if int(scored_b) > int(scored_a) else 1 if int(scored_b) == int(scored_a) else 0
        # Reorder table
        self.sort_table()
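
As a cross-check on the incremental class above, the same standings can be built in one pass from a results dataframe. The following is a minimal sketch rather than part of the original notebook: the function name `standings_from_results`, the toy team names and the scores are all made up for illustration. It assumes the same column names (`teamA`, `scoredA`, `teamB`, `scoredB`) and the same 2/1/0 points scheme.

```python
import pandas as pd

def standings_from_results(games):
    """Build a league table in one pass (2 points for a win, 1 for a draw)."""
    # One row per team per game: home side first, then the away side
    home = games.rename(columns={'teamA': 'Team', 'scoredA': 'F', 'scoredB': 'A'})
    away = games.rename(columns={'teamB': 'Team', 'scoredB': 'F', 'scoredA': 'A'})
    long = pd.concat([home[['Team', 'F', 'A']], away[['Team', 'F', 'A']]], ignore_index=True)
    long['P'] = 1
    long['W'] = (long['F'] > long['A']).astype(int)
    long['D'] = (long['F'] == long['A']).astype(int)
    long['L'] = (long['F'] < long['A']).astype(int)
    table = long.groupby('Team')[['P', 'W', 'D', 'L', 'F', 'A']].sum()
    table['GD'] = table['F'] - table['A']
    table['Pts'] = 2 * table['W'] + table['D']
    # Same ordering as the class: points, goal difference, goals for, then name
    table = table.reset_index().sort_values(
        by=['Pts', 'GD', 'F', 'Team'], ascending=[False, False, False, True])
    table.index = range(1, len(table) + 1)
    return table

# Made-up scores, for illustration only
toy_games = pd.DataFrame({
    'teamA': ['Reds', 'Blues', 'Greens'],
    'scoredA': [20, 15, 12],
    'teamB': ['Blues', 'Greens', 'Reds'],
    'scoredB': [18, 15, 19],
})
toy_table = standings_from_results(toy_games)
print(toy_table)
```

This version cannot report a team's position before each game, which is why the notebook still needs the incremental class for its position feature.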

## Creating The Table
The table can be created by calling the constructor, as shown below.

In [12]:
table = LeagueTable(results['teamA'].unique())
table.show_table()

Unnamed: 0,Team,P,W,D,L,F,A,GD,Pts
1,Bec 1,0,0,0,0,0,0,0,0
2,Bristol Thunder 1,0,0,0,0,0,0,0,0
3,Cambridge Tigers 1,0,0,0,0,0,0,0,0
4,Highbury 1,0,0,0,0,0,0,0,0
5,KV 1,0,0,0,0,0,0,0,0
6,Kingfisher 1,0,0,0,0,0,0,0,0
7,Nomads 1,0,0,0,0,0,0,0,0
8,Norwich Knights 1,0,0,0,0,0,0,0,0
9,Tornadoes 1,0,0,0,0,0,0,0,0
10,Trojans 1,0,0,0,0,0,0,0,0


We can then add the results that we extracted before to the table. This is done by iterating through the results and adding each one individually. Because we process one game at a time, we can also calculate the features that we created earlier, using only the games that came before. Note that this method isn't perfect: games played at the same time are still processed one after another, so league positions may change between them. This could be improved.

In [13]:
# Remove games not played (rows with a missing score)
not_played = results.loc[results.isna().any(axis=1)].copy()

In [14]:
results.drop(not_played.index, inplace=True)

In [15]:
game_data_cols = ['teamId', 'scored', 'missed', 'wins', 'draws', 'losses']
game_data = pd.DataFrame(columns=game_data_cols)

In [16]:
rows = []
for index, row in results.iterrows():
    
    winA = int(row['scoredA'] > row['scoredB'])
    draw = int(row['scoredA'] == row['scoredB'])
    lossA = int(row['scoredA'] < row['scoredB'])
    
    # Two rows per game: one from each team's point of view
    rows.append([row['teamA'], row['scoredA'], row['scoredB'], winA, draw, lossA])
    rows.append([row['teamB'], row['scoredB'], row['scoredA'], lossA, draw, winA])

# Build the dataframe once (DataFrame.append was removed in pandas 2.0)
game_data = pd.DataFrame(rows, columns=game_data_cols)


In [17]:
game_data

Unnamed: 0,teamId,scored,missed,wins,draws,losses
0,Norwich Knights 1,20.0,13.0,1,0,0
1,KV 1,13.0,20.0,0,0,1
2,Bec 1,24.0,17.0,1,0,0
3,Trojans 1,17.0,24.0,0,0,1
4,Kingfisher 1,23.0,18.0,1,0,0
...,...,...,...,...,...,...
137,Norwich Knights 1,31.0,26.0,1,0,0
138,KV 1,20.0,24.0,0,0,1
139,Norwich Knights 1,24.0,20.0,1,0,0
140,Highbury 1,13.0,22.0,0,0,1


In [18]:
for index, row in results.iterrows():
    # Only games played before this one may contribute to recent form
    # (two rows of game_data per game; relies on results having a 0..n-1 index)
    previous_games = game_data[:index*2]
    standings = table.show_table()
    for side in ('A', 'B'):
        team = row['team' + side]
        last_five = previous_games[previous_games['teamId'] == team][-5:]
        # League position (the table index) before this result is added
        results.loc[index, f'team{side}Position'] = standings[standings['Team'] == team].index[0]
        # Goals scored, goals conceded and league points over the last five games
        results.loc[index, f'team{side}RecentScored'] = int(last_five['scored'].sum())
        results.loc[index, f'team{side}RecentConceded'] = int(last_five['missed'].sum())
        results.loc[index, f'team{side}RecentPoints'] = int(2 * last_five['wins'].sum() + last_five['draws'].sum())
    # Add result to table
    table.add_result(row['teamA'], row['scoredA'], row['teamB'], row['scoredB'])
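
The "last five games" sums can also be computed without an explicit loop. The sketch below is an alternative under stated assumptions, not the notebook's method: it uses a made-up long-format frame (one row per team per game, oldest first), and the column name `recentScored` is hypothetical.

```python
import pandas as pd

# Made-up long-format data: one row per team per game, oldest first
toy = pd.DataFrame({
    'teamId': ['Reds', 'Blues', 'Reds', 'Blues', 'Reds', 'Blues'],
    'scored': [20, 18, 15, 15, 19, 12],
})

# shift(1) drops the current game so that only earlier games count;
# rolling(5, min_periods=1) then sums up to five of those earlier games
toy['recentScored'] = (
    toy.groupby('teamId')['scored']
       .transform(lambda s: s.shift(1).rolling(5, min_periods=1).sum())
       .fillna(0)
       .astype(int)
)
print(toy)
```

Each team's first game gets a value of 0, which matches the zeros seen in the first rows of the training data.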

In [19]:
table.show_table()

Unnamed: 0,Team,P,W,D,L,F,A,GD,Pts
1,Bec 1,13,13,0,0,393,197,196,26
2,Tornadoes 1,14,12,1,1,442,238,204,25
3,Trojans 1,13,11,1,1,385,233,152,23
4,Norwich Knights 1,14,9,0,5,348,305,43,18
5,Cambridge Tigers 1,15,7,0,8,330,394,-64,14
6,KV 1,15,6,1,8,264,310,-46,13
7,Kingfisher 1,14,4,1,9,256,355,-99,9
8,Nomads 1,15,3,1,11,269,360,-91,7
9,Bristol Thunder 1,15,3,1,11,288,404,-116,7
10,Highbury 1,14,0,0,14,204,383,-179,0


The dataset does not contain results of games that have not happened. 

## Get Remaining Games
Now we will extract the remaining games. If the order of the games were not important, these could be found by listing every possible game (every home and away combination) and removing the games that have already been played. However, our model uses recent form, so we need to know the order in which the games are played. For this reason, we take the unplayed fixtures from the dataset, where they were entered manually in the correct order.
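
For reference, the order-independent approach mentioned above could look like the following sketch. The team names and the set of played games are made up, and it assumes a double round-robin in which every ordered (home, away) pair is played exactly once.

```python
from itertools import permutations

teams = ['Reds', 'Blues', 'Greens']                 # hypothetical team names
played = {('Reds', 'Blues'), ('Blues', 'Greens')}   # (home, away) pairs already played

# Every ordered (home, away) pair is one fixture in a double round-robin
all_fixtures = set(permutations(teams, 2))

# Whatever has not been played yet is still to come (order is NOT preserved)
remaining = sorted(all_fixtures - played)
print(remaining)
```

Because sets carry no fixture order, this is only suitable when the model does not depend on recent form.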

In [20]:
games_left = not_played[['teamA', 'teamB']].copy()

In [21]:
games_left

Unnamed: 0,teamA,teamB
71,Trojans 1,Bec 1
72,Bristol Thunder 1,Norwich Knights 1
73,KV 1,Tornadoes 1
74,Kingfisher 1,Bec 1
75,Nomads 1,Cambridge Tigers 1
76,Highbury 1,Trojans 1
77,Norwich Knights 1,Trojans 1
78,Tornadoes 1,Bec 1
79,Kingfisher 1,Nomads 1
80,KV 1,Bristol Thunder 1


## Predict Results and Final Table
Once we know what games still need to be played, we will make predictions for them. Our league table class needs home and away goals to add a result, so we need to be able to predict both of these with our model. To do this, we are going to use sklearn's MultiOutputRegressor along with a KNeighborsRegressor. We are using regression as we are predicting the number of goals scored by both teams. 
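
To make the two-output setup concrete, here is a small sketch on synthetic data (the array names and sizes are made up). It shows that `MultiOutputRegressor` fits one copy of the wrapped estimator per target column and returns one prediction per target.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data, for illustration only: 20 games, 3 features, 2 targets
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 3))
Y_toy = rng.integers(10, 30, size=(20, 2))   # columns: home goals, away goals

# One independent KNeighborsRegressor is fitted for each target column
toy_model = MultiOutputRegressor(KNeighborsRegressor(n_neighbors=3))
toy_model.fit(X_toy, Y_toy)

pred = toy_model.predict(X_toy[:1])
print(pred.shape)                  # one game, two predicted scores
print(len(toy_model.estimators_))  # one fitted estimator per target
```

`KNeighborsRegressor` can also handle several targets on its own; the wrapper mainly matters for estimators such as `GradientBoostingRegressor` that predict a single target.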

In [22]:
# Only the names that are used later in the notebook are imported
from sklearn.multioutput import MultiOutputRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, GridSearchCV, KFold
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline


We will now build our training dataset. We will train our model on the features that we extracted earlier: the position of each team in the league, and the goals scored, goals conceded and league points won by each team in its past 5 games. The outputs (our target variables) will be the number of goals scored by each team.

In [23]:
target_columns = ['scoredA', 'scoredB']
X_train = results.drop(target_columns, axis=1)
X_train.fillna(0, inplace=True)
Y_train = results[target_columns]

In [24]:
sorted(X_train.teamA.unique())

['Bec 1',
 'Bristol Thunder 1',
 'Cambridge Tigers 1',
 'Highbury 1',
 'KV 1',
 'Kingfisher 1',
 'Nomads 1',
 'Norwich Knights 1',
 'Tornadoes 1',
 'Trojans 1']

In [25]:
X_train

Unnamed: 0,teamA,teamB,teamAPosition,teamBPosition,teamARecentScored,teamBRecentScored,teamARecentConceded,teamBRecentConceded,teamARecentPoints,teamBRecentPoints
0,Norwich Knights 1,KV 1,8,5,0,0,0,0,0,0
1,Bec 1,Trojans 1,2,9,0,0,0,0,0,0
2,Kingfisher 1,Highbury 1,6,5,0,0,0,0,0,0
3,Cambridge Tigers 1,Bristol Thunder 1,5,4,0,0,0,0,0,0
4,Tornadoes 1,Nomads 1,6,5,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...
66,Highbury 1,Tornadoes 1,10,2,83,166,125,98,0,9
67,Bristol Thunder 1,Cambridge Tigers 1,8,6,88,119,150,153,1,4
68,Nomads 1,Norwich Knights 1,9,4,93,126,120,128,3,4
69,KV 1,Norwich Knights 1,6,4,95,138,96,126,6,6


We will also create some extra features based on historic team performance. The first indicates whether the team has topped the National League table in the last five seasons. The second indicates whether the team has finished in the top 4 (the Play-Offs) in the last five seasons. The third indicates whether the team has been promoted from a Regional League in the last five seasons, and the fourth indicates whether the team has won the Play-Offs in that period.

In [26]:
# Returns 1 if the team topped the National League Table in the previous five seasons
def is_previous_top_1(team):
    top_1 = ['Trojans 1', 
             'Bec 1']
    return 1 if team in top_1 else 0
    
# Returns 1 if the team made the Play-Offs in the previous five seasons    
def is_previous_top_4(team):
    top_4 = ['Trojans 1', 'Bec 1', 'Norwich Knights 1', 'Tornadoes 1', 'Kingfisher 1', 
             'Nomads 1', 'KV 1']
    return 1 if team in top_4 else 0

# Returns 1 if the team was promoted into the National League Table in the previous five seasons
def is_promo(team):
    promo = ['Birmingham City 1', 'Bristol Thunder 1', 'Highbury 1', 'Cambridge Tigers 1', 
             'Bearsted 1', 'Croydon 1']
    return 1 if team in promo else 0

# Returns 1 if the team won the Play-Offs in the previous five seasons
def is_champion(team):
    champions = ['Trojans 1']
    return 1 if team in champions else 0


# Apply each indicator function to the home (A) and away (B) team columns
X_train['teamAPrevChamp'] = X_train['teamA'].apply(is_champion)
X_train['teamBPrevChamp'] = X_train['teamB'].apply(is_champion)
X_train['teamAPrevTop1'] = X_train['teamA'].apply(is_previous_top_1)
X_train['teamBPrevTop1'] = X_train['teamB'].apply(is_previous_top_1)
X_train['teamAPrevTop4'] = X_train['teamA'].apply(is_previous_top_4)
X_train['teamBPrevTop4'] = X_train['teamB'].apply(is_previous_top_4)
X_train['teamAPrevPromo'] = X_train['teamA'].apply(is_promo)
X_train['teamBPrevPromo'] = X_train['teamB'].apply(is_promo)

X_train.drop(['teamA', 'teamB'], inplace=True, axis=1)
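
The same 0/1 flags can be produced without helper functions by using `Series.isin`. This is a small sketch of that alternative: the series `toy_teams` is made up, and the `champions` list is copied from `is_champion` above.

```python
import pandas as pd

toy_teams = pd.Series(['Trojans 1', 'Highbury 1', 'KV 1'])
champions = ['Trojans 1']   # same list as in is_champion

# isin gives a boolean Series; astype(int) turns it into a 0/1 flag
prev_champ = toy_teams.isin(champions).astype(int)
print(prev_champ.tolist())
```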

In [27]:
X_train

Unnamed: 0,teamAPosition,teamBPosition,teamARecentScored,teamBRecentScored,teamARecentConceded,teamBRecentConceded,teamARecentPoints,teamBRecentPoints,teamAPrevChamp,teamBPrevChamp,teamAPrevTop1,teamBPrevTop1,teamAPrevTop4,teamBPrevTop4,teamAPrevPromo,teamBPrevPromo
0,8,5,0,0,0,0,0,0,0,0,0,0,1,1,0,0
1,2,9,0,0,0,0,0,0,0,1,1,1,1,1,0,0
2,6,5,0,0,0,0,0,0,0,0,0,0,1,0,0,1
3,5,4,0,0,0,0,0,0,0,0,0,0,0,0,1,1
4,6,5,0,0,0,0,0,0,0,0,0,0,1,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
66,10,2,83,166,125,98,0,9,0,0,0,0,0,1,1,0
67,8,6,88,119,150,153,1,4,0,0,0,0,0,0,1,1
68,9,4,93,126,120,128,3,4,0,0,0,0,1,1,0,0
69,6,4,95,138,96,126,6,6,0,0,0,0,1,1,0,0


## Trying Out Different Models

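So that all of our features have a similar impact on the model we will use the StandardScaler to standardise them. We then multiply the two league-position columns by 2.5 and the six recent-form columns by 1.5, which gives them more weight than the historic flags in distance-based models such as k-nearest neighbours.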

In [28]:
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
# Weight the league-position columns (0-1) and recent-form columns (2-7) more heavily
X_train[:,0:2] *= 2.5
X_train[:,2:8] *= 1.5
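
The effect of multiplying standardised columns can be seen in a tiny example. The sketch below uses a made-up feature matrix and a hypothetical helper `dist`. It standardises by hand (zero mean, unit population standard deviation, which is what `StandardScaler` computes by default) and then compares Euclidean distances before and after weighting.

```python
import numpy as np

# Made-up feature matrix: column 0 = league position, column 1 = a 0/1 flag
X_toy = np.array([[1.0, 0.0],
                  [5.0, 1.0],
                  [9.0, 0.0],
                  [5.0, 1.0]])

# Standardise by hand: subtract the column mean, divide by the column std
X_std = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

# Stretch the first column so that it counts for more in distances
X_weighted = X_std.copy()
X_weighted[:, 0] *= 2.5

def dist(X, i, j):
    """Euclidean distance between rows i and j of X."""
    return float(np.linalg.norm(X[i] - X[j]))

# Rows 0 and 2 differ only in column 0, so their distance grows by exactly 2.5x
print(dist(X_std, 0, 2), dist(X_weighted, 0, 2))
```

A k-nearest-neighbours model picks neighbours by this distance, so a stretched column has more say in which past games count as "similar".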

In [31]:
pipe_knr = Pipeline([('reg', MultiOutputRegressor(KNeighborsRegressor()))])

grid_param_knr = {
    'reg__estimator__n_neighbors': list(range(1, 10)),
    'reg__estimator__weights': ['distance', 'uniform'],
    'reg__estimator__leaf_size': list(range(5, 25, 5))
}

gs_knr = (GridSearchCV(estimator=pipe_knr,
                       param_grid=grid_param_knr,
                       cv=KFold(5, shuffle=True),
                       scoring='r2',
                       n_jobs=-1))

gs_knr = gs_knr.fit(X_train, Y_train)
print(gs_knr.best_estimator_)
print(gs_knr.best_score_)

Pipeline(memory=None,
         steps=[('reg',
                 MultiOutputRegressor(estimator=KNeighborsRegressor(algorithm='auto',
                                                                    leaf_size=5,
                                                                    metric='minkowski',
                                                                    metric_params=None,
                                                                    n_jobs=None,
                                                                    n_neighbors=8,
                                                                    p=2,
                                                                    weights='distance'),
                                      n_jobs=None))],
         verbose=False)
0.4946454385036716




In [32]:
print(cross_val_score(gs_knr.best_estimator_, X_train, Y_train, cv=KFold(5, shuffle=True)))

[0.38159325 0.32186373 0.58279549 0.41169716 0.61029534]
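
The scores above are R² values. As a reminder of what that measures, here is a minimal sketch of the formula for a single target; the function name `r2` and the goal tallies are made up. For several targets, scikit-learn's `r2_score` averages the per-target values by default.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

goals = [20, 24, 23, 19]   # made-up home scores
perfect = r2(goals, goals)                      # predictions match exactly
baseline = r2(goals, [np.mean(goals)] * 4)      # always predict the mean
print(perfect, baseline)
```

A score of 1 means perfect predictions and a score of 0 means the model does no better than always predicting the mean score.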


In [33]:
# Gradient boosting with a Huber loss, which is less sensitive to outlying scores than squared error
pipe_gbr = Pipeline([('reg', MultiOutputRegressor(GradientBoostingRegressor(loss='huber')))])

grid_param_gbr = {
    'reg__estimator__n_estimators': [500, 1000],
    'reg__estimator__learning_rate': [0.01, 0.05],
    'reg__estimator__max_depth': [1, 2],
    'reg__estimator__min_samples_leaf': [5, 10],
    'reg__estimator__min_samples_split': [5, 10]
}

gs_gbr = (GridSearchCV(estimator=pipe_gbr,
                       param_grid=grid_param_gbr,
                       cv=3,
                       scoring='r2',
                       n_jobs=-1, return_train_score=False))

gs_gbr = gs_gbr.fit(X_train, Y_train)
print(gs_gbr.best_estimator_)
print(gs_gbr.best_score_)



Pipeline(memory=None,
         steps=[('reg',
                 MultiOutputRegressor(estimator=GradientBoostingRegressor(alpha=0.9,
                                                                          criterion='friedman_mse',
                                                                          init=None,
                                                                          learning_rate=0.01,
                                                                          loss='huber',
                                                                          max_depth=1,
                                                                          max_features=None,
                                                                          max_leaf_nodes=None,
                                                                          min_impurity_decrease=0.0,
                                                                          min_impurity_split=None,
                           

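We will then create the final model and fit it to our training data. Note that this model uses `n_neighbors=3` rather than the 8 neighbours selected by the grid search above.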

In [34]:
model = MultiOutputRegressor(KNeighborsRegressor(n_neighbors=3, weights='distance'))
model.fit(X_train, Y_train)

MultiOutputRegressor(estimator=KNeighborsRegressor(algorithm='auto',
                                                   leaf_size=30,
                                                   metric='minkowski',
                                                   metric_params=None,
                                                   n_jobs=None, n_neighbors=3,
                                                   p=2, weights='distance'),
                     n_jobs=None)

In [35]:
# cross_val_score(model, X_train, Y_train, cv=5, scoring='explained_variance')

Now we have our model. We can use it to predict the result of every remaining game, and then add these results to our table. Since our model makes use of the recent form of each team, we must update that form after each prediction.
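
The feedback idea (predict, then treat the prediction as the newest result) can be illustrated with a toy loop. Everything in this sketch is hypothetical: the team names, the goal tallies, and `stub_predict`, which stands in for the real model by predicting each side's recent average.

```python
from collections import deque

# Last five goal tallies per team (made-up numbers); deque(maxlen=5)
# discards the oldest game automatically when a new one is appended
recent_scored = {
    'Reds': deque([20, 15, 19, 22, 18], maxlen=5),
    'Blues': deque([12, 14, 16, 11, 13], maxlen=5),
}

def stub_predict(form_a, form_b):
    """Hypothetical stand-in for the model: predict each side's recent average."""
    return round(sum(form_a) / len(form_a)), round(sum(form_b) / len(form_b))

fixtures = [('Reds', 'Blues'), ('Blues', 'Reds')]
for home, away in fixtures:
    goals_home, goals_away = stub_predict(recent_scored[home], recent_scored[away])
    # Feed the prediction back in so that the next fixture sees updated form
    recent_scored[home].append(goals_home)
    recent_scored[away].append(goals_away)

print(recent_scored['Reds'])
print(recent_scored['Blues'])
```

One consequence of this design is that prediction errors carry forward: later fixtures are predicted from form that is itself partly predicted.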

In [36]:
for index, row in games_left.iterrows():
    # Get recent form features
    a_position = table.show_table()[table.show_table()['Team'] == row['teamA']].index[0].astype('uint16')
    b_position = table.show_table()[table.show_table()['Team'] == row['teamB']].index[0].astype('uint16')
    a_scored = game_data[game_data['teamId'] == row['teamA']][-5:]['scored'].astype('uint16').sum()
    b_scored = game_data[game_data['teamId'] == row['teamB']][-5:]['scored'].astype('uint16').sum()
    a_conceded = game_data[game_data['teamId'] == row['teamA']][-5:]['missed'].astype('uint16').sum()
    b_conceded = game_data[game_data['teamId'] == row['teamB']][-5:]['missed'].astype('uint16').sum()
    a_points = 2*game_data[game_data['teamId'] == row['teamA']][-5:]['wins'].astype('uint16').sum() + game_data[game_data['teamId'] == row['teamA']][-5:]['draws'].astype('uint16').sum()
    b_points = 2*game_data[game_data['teamId'] == row['teamB']][-5:]['wins'].astype('uint16').sum() + game_data[game_data['teamId'] == row['teamB']][-5:]['draws'].astype('uint16').sum()
    
    
    # Get extra features
    a_chmp = is_champion(row['teamA'])
    b_chmp = is_champion(row['teamB'])    
    a_top1 = is_previous_top_1(row['teamA'])
    b_top1 = is_previous_top_1(row['teamB'])
    a_top4 = is_previous_top_4(row['teamA'])
    b_top4 = is_previous_top_4(row['teamB'])
    a_prom = is_promo(row['teamA'])
    b_prom = is_promo(row['teamB'])
    
    # Make game prediction
    X_pred = np.array([a_position, b_position, a_scored, b_scored, a_conceded, b_conceded, a_points, b_points, a_chmp, b_chmp, a_top1, b_top1, a_top4, b_top4, a_prom, b_prom]).reshape(1, -1)
    X_pred = scaler.transform(X_pred)
    X_pred[:,0:2] *= 2.5
    X_pred[:,2:8] *= 1.5
    goals = model.predict(X_pred)
    
    # Add result to the table
    table.add_result(row['teamA'], round(goals[0][0]), row['teamB'], round(goals[0][1]))
    
    # Get win/draw/loss
    home_win = 1 if round(goals[0][0]) > round(goals[0][1]) else 0
    home_draw = 1 if round(goals[0][0]) == round(goals[0][1]) else 0
    home_loss = 1 if round(goals[0][0]) < round(goals[0][1]) else 0
    
    # Save the result (updating recent form)
    new_row_a = pd.DataFrame([[row['teamA'], round(goals[0][0]), round(goals[0][1]), home_win, home_draw, home_loss]], columns=game_data.columns)
    new_row_b = pd.DataFrame([[row['teamB'], round(goals[0][1]), round(goals[0][0]), home_loss, home_draw, home_win]], columns=game_data.columns)
    game_data = pd.concat([game_data, new_row_a, new_row_b], ignore_index=True)

# Show the final table
table.show_table()

Unnamed: 0,Team,P,W,D,L,F,A,GD,Pts
1,Bec 1,18,16,1,1,526,298,228,33
2,Trojans 1,18,16,1,1,539,315,224,33
3,Tornadoes 1,18,14,2,2,547,324,223,30
4,Norwich Knights 1,18,12,0,6,445,372,73,24
5,KV 1,18,7,1,10,323,391,-68,15
6,Cambridge Tigers 1,18,7,0,11,384,488,-104,14
7,Kingfisher 1,18,6,1,11,315,441,-126,13
8,Nomads 1,18,4,1,13,325,427,-102,9
9,Bristol Thunder 1,18,4,1,13,340,475,-135,9
10,Highbury 1,18,0,0,18,262,475,-213,0


In [37]:
def get_match_result(index):
    # Each game occupies two consecutive rows of game_data (team A, then team B)
    return game_data.iloc[index*2: index*2 + 2]

In [38]:
get_match_result(71)

Unnamed: 0,teamId,scored,missed,wins,draws,losses
142,Trojans 1,26.0,22.0,1,0,0
143,Bec 1,22.0,26.0,0,0,1


In [39]:
games_left

Unnamed: 0,teamA,teamB
71,Trojans 1,Bec 1
72,Bristol Thunder 1,Norwich Knights 1
73,KV 1,Tornadoes 1
74,Kingfisher 1,Bec 1
75,Nomads 1,Cambridge Tigers 1
76,Highbury 1,Trojans 1
77,Norwich Knights 1,Trojans 1
78,Tornadoes 1,Bec 1
79,Kingfisher 1,Nomads 1
80,KV 1,Bristol Thunder 1


## Discussion
