# AFL Model - Part 4 - Weekly Predictions
Now that we have explored different algorithms for modelling, we can implement our chosen model and predict this week's AFL games! All you need to do is run the afl_modelling script each Thursday or Friday to predict the upcoming weekend's games.

In [52]:
# Import Modules
from afl_feature_creation import prepare_afl_features
import afl_data_cleaning
import afl_feature_creation
import afl_modelling
import datetime
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
from sklearn import svm, tree, linear_model, neighbors, naive_bayes, ensemble, discriminant_analysis, gaussian_process
from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB

## Creating The Features For This Weekend's Games
To predict this weekend's games, we need to build the same features that we created in the previous tutorials, this time for the games that will be played this weekend. This includes all the rolling averages, efficiency features, Elo features and so on. Most of this tutorial therefore reuses previously defined functions to create features for the upcoming games.

### Create Next Week's DataFrame
Let's first get our cleaned afl_data dataset, as well as the odds for next weekend and the 2018 fixture.

In [53]:
# Grab the cleaned AFL dataset and the column order
afl_data = afl_data_cleaning.prepare_afl_data()
ordered_cols = afl_data.columns

# Define a function which grabs the odds for each game for the following weekend
def get_next_week_odds(path):
    # Get next week's odds
    next_week_odds = pd.read_csv(path)
    next_week_odds = next_week_odds.rename(columns={"team_1": "home_team", 
                                                "team_2": "away_team", 
                                                "team_1_odds": "odds", 
                                                "team_2_odds": "odds_away"
                                               })
    return next_week_odds

# Define a function which imports the fixture and cleans it up
def get_fixture(path):
    # Get the afl fixture
    fixture = pd.read_csv(path)

    # Replace team names and reformat
    fixture = fixture.replace({'Brisbane Lions': 'Brisbane', 'Footscray': 'Western Bulldogs'})
    fixture['Date'] = pd.to_datetime(fixture['Date']).dt.date.astype(str)
    fixture = fixture.rename(columns={"Home.Team": "home_team", "Away.Team": "away_team"})
    return fixture

next_week_odds = get_next_week_odds("data/weekly_odds.csv")
fixture = get_fixture("data/afl_fixture_2018.csv")

In [54]:
fixture.tail()

Unnamed: 0,Date,Season,Season.Game,Round,home_team,away_team,Venue
211,2018-08-25,2018,212,,Carlton,Adelaide,Etihad Stadium
212,2018-08-25,2018,213,,Sydney,Hawthorn,SCG
213,2018-08-26,2018,214,,Brisbane,West Coast,Gabba
214,2018-08-26,2018,215,,Melbourne,GWS,MCG
215,2018-08-26,2018,216,,St Kilda,North Melbourne,Etihad Stadium


Now that we have these DataFrames, we will define a function which combines the fixture and next week's odds to create a single DataFrame for the games over the next 7 days. To use this function we will need Game IDs for next week. So we will create another function which creates Game IDs by using the Game ID from the last game played and adding 1 to it.

In [55]:
# Define a function which creates game IDs for this week's footy games
def create_next_weeks_game_ids(afl_data):
    odds = get_next_week_odds("data/weekly_odds.csv")

    # Get last week's Game ID
    last_afl_data_game = afl_data['Game'].iloc[-1]

    # Create Game IDs for next week
    game_ids = [(i+1) + last_afl_data_game for i in range(odds.shape[0])]
    return game_ids

# Define a function which creates this week's footy game DataFrame
def get_next_week_df(afl_data):
    # Get the fixture and the odds for next week's footy games
    fixture = get_fixture("data/afl_fixture_2018.csv")
    next_week_odds = get_next_week_odds("data/weekly_odds.csv")
    next_week_odds['Game'] = create_next_weeks_game_ids(afl_data)

    # Get today's date and next week's date and create a DataFrame for next week's games
    todays_date = datetime.datetime.today().strftime('%Y-%m-%d')
    date_in_7_days = (datetime.datetime.today() + datetime.timedelta(days=7)).strftime('%Y-%m-%d')
    fixture = fixture[(fixture['Date'] >= todays_date) & (fixture['Date'] < date_in_7_days)].drop(columns=['Season', 'Season.Game', 'Round'])
    next_week_df = pd.merge(fixture, next_week_odds, on=['home_team', 'away_team'])

    # Split the DataFrame onto two rows for each game
    # (.copy() avoids pandas' SettingWithCopyWarning when adding columns to a slice)
    h_df = next_week_df[['Date', 'Game', 'home_team', 'away_team', 'odds']].copy()
    h_df['Team'] = h_df['home_team']
    h_df['Opposition'] = h_df['away_team']
    h_df['Home?'] = 1
    a_df = next_week_df[['Date', 'Game', 'home_team', 'away_team', 'odds_away']].rename(columns={'odds_away': 'odds'})
    a_df['Team'] = a_df['away_team']
    a_df['Opposition'] = a_df['home_team']
    a_df['Home?'] = 0
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    next_week = pd.concat([a_df, h_df]).sort_values(by='Game')
    return next_week
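
To make the "two rows per game" step concrete, here is a minimal sketch on a made-up two-game table. The team names and odds are illustrative only, and the toy variable names (`games`, `home`, `away`, `two_rows_per_game`) are not part of the tutorial's modules.

```python
import pandas as pd

# Toy fixture-plus-odds table: one row per game (values are illustrative)
games = pd.DataFrame({
    "Game": [1, 2],
    "home_team": ["Geelong", "Sydney"],
    "away_team": ["Gold Coast", "Hawthorn"],
    "odds": [1.03, 1.74],
    "odds_away": [38.0, 2.32],
})

# Home view: the home side becomes 'Team' and keeps the home odds
home = games[["Game", "home_team", "away_team", "odds"]].copy()
home["Team"] = home["home_team"]
home["Opposition"] = home["away_team"]
home["Home?"] = 1

# Away view: the away side becomes 'Team' and its odds column is renamed to 'odds'
away = games[["Game", "home_team", "away_team", "odds_away"]].rename(columns={"odds_away": "odds"})
away["Team"] = away["away_team"]
away["Opposition"] = away["home_team"]
away["Home?"] = 0

# Stack both views so every game appears once per team
two_rows_per_game = pd.concat([away, home]).sort_values(by="Game").reset_index(drop=True)
print(two_rows_per_game)
```

Each game now contributes one row from the home side's perspective and one from the away side's, which is the shape our feature functions expect.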

In [56]:
next_week_df = get_next_week_df(afl_data)
game_ids_next_round = create_next_weeks_game_ids(afl_data)
next_week_df

Unnamed: 0,Date,Game,home_team,away_team,odds,Team,Opposition,Home?
0,2018-08-24,15390,Port Adelaide,Essendon,2.42,Essendon,Port Adelaide,0
0,2018-08-24,15390,Port Adelaide,Essendon,1.71,Port Adelaide,Essendon,1
1,2018-08-25,15391,Geelong,Gold Coast,38.0,Gold Coast,Geelong,0
1,2018-08-25,15391,Geelong,Gold Coast,1.03,Geelong,Gold Coast,1
2,2018-08-25,15392,Richmond,Western Bulldogs,9.0,Western Bulldogs,Richmond,0
2,2018-08-25,15392,Richmond,Western Bulldogs,1.11,Richmond,Western Bulldogs,1
3,2018-08-25,15393,Fremantle,Collingwood,1.18,Collingwood,Fremantle,0
3,2018-08-25,15393,Fremantle,Collingwood,6.4,Fremantle,Collingwood,1
5,2018-08-25,15394,Sydney,Hawthorn,2.32,Hawthorn,Sydney,0
5,2018-08-25,15394,Sydney,Hawthorn,1.74,Sydney,Hawthorn,1
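
The seven-day window inside `get_next_week_df` compares dates as strings. That works because `YYYY-MM-DD` strings sort in the same order as the dates they represent. Below is a small sketch of the same filter; `games_in_next_7_days` and its `today` parameter are hypothetical names introduced here so the example does not depend on the day it is run.

```python
import datetime
import pandas as pd

def games_in_next_7_days(fixture, today):
    """Keep rows with today <= Date < today + 7 days (Date is a 'YYYY-MM-DD' string)."""
    start = today.strftime("%Y-%m-%d")
    end = (today + datetime.timedelta(days=7)).strftime("%Y-%m-%d")
    return fixture[(fixture["Date"] >= start) & (fixture["Date"] < end)]

# Toy fixture with placeholder team names
toy_fixture = pd.DataFrame({
    "Date": ["2018-08-19", "2018-08-24", "2018-08-26", "2018-08-31"],
    "home_team": ["A", "B", "C", "D"],
})

upcoming = games_in_next_7_days(toy_fixture, datetime.date(2018, 8, 23))
print(upcoming["Date"].tolist())  # ['2018-08-24', '2018-08-26']
```

Passing the date in explicitly also makes it easy to re-create the predictions for a past round.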


### Create Each Feature
Now let's append next week's DataFrame to our afl_data DataFrame and then create all the features we used in the [AFL Feature Creation Tutorial](0.2. afl_feature_creation_tutorial.ipynb).

In [57]:
# Append next week's games to our afl_data DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
afl_data = pd.concat([afl_data, next_week_df]).reset_index(drop=True)
afl_data = afl_data[ordered_cols]

# Create disposal efficiency column
afl_data['disposal_efficiency'] = afl_data['ED'] / afl_data['D']

# Create rolling averages for our statistics
cols_indx_start = afl_data.columns.get_loc("GA")
afl_avgs = afl_feature_creation.create_rolling_averages(afl_data, 6, afl_data.columns[cols_indx_start:])

# Create form between teams feature
afl_avgs = afl_feature_creation.form_between_teams(afl_avgs, 6)

# Apply elos
elos, probs, elo_dict = afl_feature_creation.elo_applier(afl_data, 24)
afl_avgs['home_elo'] = afl_avgs['Game'].map(elos).str[0]
afl_avgs['away_elo'] = afl_avgs['Game'].map(elos).str[1]

# Get elos for next week
afl_avgs.loc[afl_avgs['home_elo'].isna(), 'home_elo'] = afl_avgs[afl_avgs['home_elo'].isna()]['home_team'].map(elo_dict)
afl_avgs.loc[afl_avgs['away_elo'].isna(), 'away_elo'] = afl_avgs[afl_avgs['away_elo'].isna()]['away_team'].map(elo_dict)

# Create Adjusted Margin and then Average it over a 6 game window
afl_avgs = afl_feature_creation.map_elos(afl_avgs)
afl_avgs['Adj_elo_ave_margin'] = afl_avgs['Margin'] * afl_avgs['elo_Opp'] / afl_avgs['elo']
afl_avgs = afl_feature_creation.create_rolling_averages(afl_avgs, 6, ['Adj_elo_ave_margin'])

# Create average elo of opponents beaten/lost to feature
afl_avgs = afl_feature_creation.create_ave_elo_opponent(afl_avgs, 6, beaten_or_lost='beaten')
afl_avgs = afl_feature_creation.create_ave_elo_opponent(afl_avgs, 6, beaten_or_lost='lost')

# Create regular margin rolling average
afl_avgs = afl_feature_creation.create_rolling_averages(afl_avgs, 6, ['Margin'])
afl_avgs.tail()

Unnamed: 0,Team,odds,Date,home_team,away_team,Game,Round,Goals,Behinds,Points,Venue,Home?,Opposition,Opposition Goals,Opposition Behinds,Opposition Points,Season,Status,GA_ave_6,CP_ave_6,UP_ave_6,ED_ave_6,CM_ave_6,MI5_ave_6,One.Percenters_ave_6,BO_ave_6,K_ave_6,HB_ave_6,D_ave_6,M_ave_6,G_ave_6,B_ave_6,T_ave_6,HO_ave_6,I50_ave_6,CL_ave_6,CG_ave_6,R50_ave_6,FF_ave_6,FA_ave_6,AF_ave_6,SC_ave_6,CCL_ave_6,SCL_ave_6,SI_ave_6,MG_ave_6,TO_ave_6,ITC_ave_6,T5_ave_6,disposal_efficiency_ave_6,form_over_opposition_6,home_elo,away_elo,elo,elo_Opp,Adj_elo_ave_margin_ave_6,average_elo_opponents_beaten_6,average_elo_opponents_lost_6,Margin_ave_6
3033,Brisbane,2.58,2018-08-26,Brisbane,West Coast,15396,,,,,,1,West Coast,,,,,,10.0,136.0,248.0,290.333333,10.0,13.833333,52.666667,4.666667,219.5,166.166667,385.666667,106.833333,13.166667,9.166667,52.166667,41.333333,53.666667,37.166667,53.166667,39.833333,22.833333,19.333333,1614.333333,1711.0,11.666667,25.5,106.833333,5848.333333,63.333333,64.0,9.5,0.750879,0,1292.723413,1576.920238,1292.723413,1576.920238,2.927317,1408.781705,1569.76079,4.666667
3034,GWS,2.92,2018-08-26,Melbourne,GWS,15397,,,,,,0,Melbourne,,,,,,8.0,154.0,221.5,266.0,12.5,10.0,54.333333,8.666667,219.166667,153.666667,372.833333,87.666667,12.0,9.833333,62.333333,41.5,53.5,41.833333,55.0,44.0,20.333333,22.166667,1554.333333,1671.666667,12.5,29.333333,108.333333,6106.166667,72.0,72.5,12.166667,0.712588,4,1524.181803,1602.203161,1602.203161,1524.181803,9.743776,1452.398789,1559.438272,11.166667
3035,Melbourne,1.53,2018-08-26,Melbourne,GWS,15397,,,,,,1,GWS,,,,,,10.5,167.833333,219.5,264.333333,10.666667,12.0,53.5,2.333333,210.333333,175.833333,386.166667,78.333333,13.833333,11.0,76.333333,52.5,61.166667,43.833333,61.0,31.0,17.833333,21.666667,1622.333333,1681.0,14.166667,29.666667,110.166667,6236.333333,77.0,79.5,13.833333,0.683642,2,1524.181803,1602.203161,1524.181803,1602.203161,20.183879,1511.490856,1633.148523,20.333333
3036,North Melbourne,1.35,2018-08-26,St Kilda,North Melbourne,15398,,,,,,0,St Kilda,,,,,,8.333333,151.666667,222.833333,264.833333,12.5,11.0,48.833333,4.666667,195.166667,175.333333,370.5,81.833333,11.833333,8.0,61.666667,43.666667,52.5,35.166667,57.333333,35.166667,19.333333,18.0,1516.333333,1594.666667,13.666667,21.5,92.5,5476.833333,72.166667,73.666667,13.333333,0.713409,4,1410.547366,1465.710806,1465.710806,1410.547366,-1.676351,1570.161952,1563.776743,-1.833333
3037,St Kilda,3.8,2018-08-26,St Kilda,North Melbourne,15398,,,,,,1,North Melbourne,,,,,,6.833333,141.833333,246.166667,279.833333,8.5,10.333333,45.666667,7.166667,204.833333,179.666667,384.5,89.833333,10.166667,10.5,64.166667,34.666667,54.0,38.666667,49.666667,30.833333,21.833333,19.166667,1570.5,1652.666667,12.0,26.666667,94.166667,5371.333333,66.666667,64.833333,12.5,0.727069,2,1410.547366,1465.710806,1410.547366,1465.710806,-12.537844,1351.214014,1576.061092,-8.833333
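
Notice that the unplayed games above already have rolling averages even though their raw statistics are NaN. That suggests each average is built only from a team's previous games. The sketch below shows one way to get that behaviour. It is an assumption about how `create_rolling_averages` behaves, not its actual implementation, and `rolling_average_before_game` is a hypothetical helper.

```python
import pandas as pd

def rolling_average_before_game(df, window, cols):
    """Hypothetical helper: per-team mean of the previous `window` games, excluding the current one."""
    out = df.sort_values("Game").copy()
    for col in cols:
        # shift(1) drops the current game from its own average
        out[f"{col}_ave_{window}"] = (
            out.groupby("Team")[col]
               .transform(lambda s: s.shift(1).rolling(window).mean())
        )
    return out

toy = pd.DataFrame({
    "Game": [1, 2, 3, 4],
    "Team": ["Geelong"] * 4,
    "Margin": [10.0, 20.0, 30.0, None],  # the last row is an unplayed game
})

result = rolling_average_before_game(toy, window=3, cols=["Margin"])
print(result[["Game", "Margin", "Margin_ave_3"]])
```

Game 4 has no margin yet, but its `Margin_ave_3` is the mean of games 1 to 3, so it can still be used as a feature.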


Our DataFrame looks great! Although there are some NaNs, these are only in columns like 'Goals' which obviously don't have a value yet as the game hasn't been played.

Now we need to get our DataFrame on one line so that each row corresponds to one footy game. Let's use our previously defined function for this. Let's also drop the columns with NaNs which we don't need.

In [58]:
# Get each footy match on individual rows and drop irrelevant columns
one_line = afl_modelling.get_df_on_one_line(afl_avgs)

# Drop duplicate columns and unnecessary columns
cols_to_drop = ['Opposition Behinds', 'Goals', 'Behinds', 'Opposition Goals', 'Opposition Points', 'Points', 'Round', 'Venue', 'Season', 'Status',
               'CCL_ave_6', 'SCL_ave_6', 'SI_ave_6', 'MG_ave_6', 'TO_ave_6', 'ITC_ave_6', 'T5_ave_6', 'elo', 'elo_Opp', 'Behinds_away',
               'Goals_away', 'home_elo_away', 'away_elo_away', 'elo_away', 'elo_Opp_away']
one_line = one_line.drop(columns=cols_to_drop)
# Drop all rows where home_elo or away_elo == 1500 exactly, as this is the first game played
one_line = one_line[(one_line['home_elo'] != 1500) & (one_line['away_elo'] != 1500)]

# Drop Na rows from calculating moving averages
one_line = one_line.dropna(axis=0)
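
If you are wondering what "one line per game" looks like, here is a rough sketch on toy data. It assumes the home and away rows are merged on `Game` with an `_away` suffix on the away side's columns, which matches column names such as `Goals_away` above. The real `get_df_on_one_line` may differ in its details.

```python
import pandas as pd

# Two rows per game, one for each team (values are illustrative)
two_rows = pd.DataFrame({
    "Game": [1, 1, 2, 2],
    "Team": ["Geelong", "Gold Coast", "Sydney", "Hawthorn"],
    "Home?": [1, 0, 1, 0],
    "Margin_ave_6": [30.0, -25.0, 5.0, 8.0],
})

home_rows = two_rows[two_rows["Home?"] == 1]
away_rows = two_rows[two_rows["Home?"] == 0]

# Merge on Game; columns shared by both sides get an '_away' suffix on the away copy
one_line_toy = pd.merge(home_rows, away_rows, on="Game", suffixes=("", "_away"))
one_line_toy = one_line_toy.drop(columns=["Home?", "Home?_away"])
print(one_line_toy)
```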

Now let's create our differential_df and then filter our DataFrame to only include this weekend's games, based on the Game IDs we created earlier.

In [59]:
diff_df = afl_modelling.get_diff_df(one_line)
diff_df = diff_df.drop(columns=['odds', 'odds_away']).select_dtypes(include=[np.number])
prediction_feature_set = diff_df[diff_df['Game'].isin(game_ids_next_round)].dropna(axis=1)
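
The differential DataFrame subtracts the away features from the home features. As a minimal sketch of that idea on toy data (the column handling inside `get_diff_df` is assumed here rather than copied from it):

```python
import pandas as pd

# One row per game, with home columns and their '_away' counterparts (values are illustrative)
one_line_toy = pd.DataFrame({
    "Game": [1, 2],
    "Margin_ave_6": [30.0, 5.0],
    "Margin_ave_6_away": [-25.0, 8.0],
    "T_ave_6": [60.0, 70.0],
    "T_ave_6_away": [65.0, 62.0],
})

diff_toy = one_line_toy[["Game"]].copy()
home_cols = [c for c in one_line_toy.columns if c != "Game" and not c.endswith("_away")]
for col in home_cols:
    # Positive values mean the home side has been stronger on this statistic
    diff_toy[f"{col}_diff"] = one_line_toy[col] - one_line_toy[f"{col}_away"]
print(diff_toy)
```

Working with differences halves the number of feature columns and puts both teams on a single scale for each statistic.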

In [60]:
prediction_feature_set

Unnamed: 0,Game,home_elo,away_elo,CCL_ave_6_away,SCL_ave_6_away,SI_ave_6_away,MG_ave_6_away,TO_ave_6_away,ITC_ave_6_away,T5_ave_6_away,GA_ave_6_diff,CP_ave_6_diff,UP_ave_6_diff,ED_ave_6_diff,CM_ave_6_diff,MI5_ave_6_diff,One.Percenters_ave_6_diff,BO_ave_6_diff,K_ave_6_diff,HB_ave_6_diff,D_ave_6_diff,M_ave_6_diff,G_ave_6_diff,B_ave_6_diff,T_ave_6_diff,HO_ave_6_diff,I50_ave_6_diff,CL_ave_6_diff,CG_ave_6_diff,R50_ave_6_diff,FF_ave_6_diff,FA_ave_6_diff,AF_ave_6_diff,SC_ave_6_diff,disposal_efficiency_ave_6_diff,form_over_opposition_6_diff,Adj_elo_ave_margin_ave_6_diff,average_elo_opponents_beaten_6_diff,average_elo_opponents_lost_6_diff,Margin_ave_6_diff,implied_odds_prob,implied_odds_prob_away
1540,15390,1549.809163,1510.090428,13.0,24.166667,103.333333,5841.0,71.333333,74.0,11.0,-0.833333,0.333333,-10.833333,-22.0,-2.833333,-2.166667,6.0,-5.5,-1.0,-6.0,-7.0,-9.333333,-2.5,-2.666667,12.333333,-2.666667,-1.5,-0.5,-0.666667,1.0,-2.166667,-0.5,-14.666667,-53.5,-0.04675,-4,-19.960432,-141.810723,2.38626,-18.833333,0.584795,0.413223
1541,15391,1648.870232,1282.313681,12.5,27.166667,69.166667,5574.666667,74.833333,72.0,12.5,7.833333,-2.0,50.833333,64.0,-1.0,4.333333,-6.833333,-2.333333,14.333333,30.666667,45.0,15.833333,7.666667,-3.333333,-3.0,-11.166667,6.166667,0.0,-7.166667,-4.333333,3.166667,0.0,174.5,229.666667,0.089159,2,53.31655,22.524422,133.261504,56.0,0.970874,0.026316
1542,15392,1633.082758,1442.83549,10.166667,27.833333,75.0,5109.333333,74.5,67.5,10.833333,1.333333,12.333333,-23.666667,-17.666667,4.0,2.833333,5.166667,0.0,-0.166667,-15.0,-15.166667,-0.5,4.333333,1.0,-4.0,-0.833333,7.5,-6.333333,7.0,3.5,-3.333333,2.5,-32.666667,52.166667,-0.016173,-2,49.646388,101.174494,57.096022,48.5,0.900901,0.111111
1543,15393,1421.059791,1515.584772,11.833333,24.833333,101.5,5703.0,73.333333,76.5,8.333333,-3.666667,-13.333333,-36.166667,-47.333333,-4.666667,-3.5,-4.333333,-4.833333,-17.833333,-27.166667,-45.0,-12.833333,-5.333333,-2.333333,-1.833333,-6.333333,-12.5,-5.166667,-0.166667,6.833333,-2.833333,2.666667,-205.166667,-211.333333,-0.041962,0,-57.449487,21.502175,-58.167024,-53.666667,0.15625,0.847458
1544,15394,1642.145056,1606.451621,13.0,23.166667,108.333333,5619.5,68.0,68.666667,12.833333,-3.0,11.833333,-38.5,-36.666667,1.166667,-4.0,-11.333333,-0.833333,-15.666667,-10.5,-26.166667,-11.166667,-3.666667,-0.666667,3.833333,-11.333333,-5.666667,-0.333333,4.166667,15.666667,1.5,4.333333,-131.666667,-52.833333,-0.045689,-2,-32.114831,119.501299,59.136937,-36.833333,0.574713,0.431034
1545,15395,1248.165593,1627.304629,10.833333,24.833333,77.833333,5597.333333,66.5,72.666667,10.666667,-0.5,-10.166667,-1.833333,-4.666667,0.333333,1.166667,-0.166667,2.333333,-25.0,12.5,-12.5,-13.0,-1.5,-3.5,-2.666667,2.166667,-10.0,-3.333333,5.166667,3.666667,3.166667,2.833333,-115.333333,-75.833333,0.013891,-2,-18.734439,85.97415,-30.709888,-11.333333,0.121951,0.877193
1546,15396,1292.723413,1576.920238,10.166667,22.166667,81.833333,5606.5,70.333333,73.166667,9.5,1.5,2.333333,32.5,30.333333,-2.833333,3.166667,5.333333,0.333333,-3.166667,37.0,33.833333,6.833333,2.0,1.0,-17.833333,1.5,4.833333,4.833333,5.5,0.833333,0.166667,1.166667,24.833333,83.166667,0.013697,-6,3.911068,-120.387903,2.386461,5.666667,0.387597,0.613497
1547,15397,1524.181803,1602.203161,12.5,29.333333,108.333333,6106.166667,72.0,72.5,12.166667,2.5,13.833333,-2.0,-1.666667,-1.833333,2.0,-0.833333,-6.333333,-8.833333,22.166667,13.333333,-9.333333,1.833333,1.166667,14.0,11.0,7.666667,2.0,6.0,-13.0,-2.5,-0.5,68.0,9.333333,-0.028946,-2,10.440104,59.092067,73.710251,9.166667,0.653595,0.342466
1548,15398,1410.547366,1465.710806,13.666667,21.5,92.5,5476.833333,72.166667,73.666667,13.333333,-1.5,-9.833333,23.333333,15.0,-4.0,-0.666667,-3.166667,2.5,9.666667,4.333333,14.0,8.0,-1.666667,2.5,2.5,-9.0,1.5,3.5,-7.666667,-4.333333,2.5,1.166667,54.166667,58.0,0.01366,-2,-10.861493,-218.947939,12.284349,-7.0,0.263158,0.740741


Great! We now have our feature DataFrame for this weekend's games. Let's now model this to create our predictions.

## Creating Our Predictions
To create our predictions we will use similar code to what we used in our AFL Modelling Tutorial. We will need to train our model on all the data we have (up until last week's games), and then use our trained model to predict this week. Again, we will use Stacking with XGB, like we did in our AFL Modelling Tutorial.
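
As a refresher on what stacking does, here is a generic two-level sketch on synthetic data. It is not the code inside `afl_modelling.implement_xgb_stacking`. To keep the example dependent on scikit-learn only, it uses `GradientBoostingClassifier` as a stand-in for the XGB meta-learner, and `stack_and_predict` is a hypothetical helper.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB

def stack_and_predict(X_train, y_train, X_new, base_estimators, meta_estimator, cv=5):
    """Hypothetical helper: fit a two-level stack and predict labels for X_new."""
    # Level 1: out-of-fold predictions, so the meta-learner is not trained on
    # predictions made for rows the base models have already seen
    train_meta = np.column_stack([
        cross_val_predict(clone(est), X_train, y_train, cv=cv) for est in base_estimators
    ])
    # Refit each base estimator on all training data to score the new games
    new_meta = np.column_stack([
        clone(est).fit(X_train, y_train).predict(X_new) for est in base_estimators
    ])
    # Level 2: the meta-learner maps the base predictions to the final label
    meta = clone(meta_estimator).fit(train_meta, y_train)
    return meta.predict(new_meta)

# Synthetic data purely for illustration
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
X_new = rng.normal(size=(5, 4))

preds = stack_and_predict(
    X_train, y_train, X_new,
    base_estimators=[LogisticRegression(), GaussianNB()],
    meta_estimator=GradientBoostingClassifier(random_state=0),
)
print(preds)
```

Because `XGBClassifier` follows the same fit/predict interface, it could be passed as `meta_estimator` in this sketch.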

In [61]:
afl = prepare_afl_features(window=6, k_factor=24).dropna().sort_values(by='Game')
afl = afl_modelling.get_df_on_one_line(afl)

# Drop columns which leak data and which we don't need
dropped_cols = ['Behinds', 'Goals', 'Opposition Behinds', 'Opposition Goals', 'Opposition Points', 'Points',
               'elo', 'elo_Opp', 'CCL_ave_6', 'SCL_ave_6', 'SI_ave_6', 'MG_ave_6', 'TO_ave_6', 'ITC_ave_6', 'T5_ave_6', 'home_win_away',
               'home_elo_away', 'away_elo_away']
afl = afl.drop(columns=dropped_cols)
# Get a differential DataFrame - subtracting the away features from the home features
diff_df = afl_modelling.get_diff_df(afl)

In [62]:
# Hard code estimators from our last tutorial
all_estimators = [
    # Arguments that were just library defaults in the original repr are omitted; two of them
    # (normalize, min_impurity_split) have since been removed from scikit-learn
    LogisticRegression(C=0.75, solver='newton-cg'),

    RidgeClassifier(alpha=1.0, tol=0.001),

    RandomForestClassifier(max_depth=15, max_features='sqrt', min_samples_split=10,
                           n_estimators=750, n_jobs=-1, random_state=5),

    GaussianNB(),

    LinearDiscriminantAnalysis()
]

# Hard code best cols
best_cols = ['home_elo',
'away_elo',
'GA_ave_6_diff',
'CP_ave_6_diff',
'UP_ave_6_diff',
'ED_ave_6_diff',
'CM_ave_6_diff',
'MI5_ave_6_diff',
'One.Percenters_ave_6_diff',
'BO_ave_6_diff',
'HB_ave_6_diff',
'M_ave_6_diff',
'G_ave_6_diff',
'T_ave_6_diff',
'HO_ave_6_diff',
'I50_ave_6_diff',
'CL_ave_6_diff',
'CG_ave_6_diff',
'R50_ave_6_diff',
'FF_ave_6_diff',
'FA_ave_6_diff',
'AF_ave_6_diff',
'SC_ave_6_diff',
'disposal_efficiency_ave_6_diff',
'R50_efficiency_ave_6_diff',
'I50_efficiency_ave_6_diff',
'Adj_elo_ave_margin_ave_6_diff',
'average_elo_opponents_beaten_6_diff',
'average_elo_opponents_lost_6_diff',
'Margin_ave_6_diff',
'implied_odds_prob',
'implied_odds_prob_away']

In [63]:
# Create our train sets
X = diff_df.drop(columns=['home_win']).select_dtypes(include=[np.number])
y = diff_df['home_win']

features = prediction_feature_set.columns

# Predict Next Round
preds_next_round = afl_modelling.implement_xgb_stacking(X[features].drop(columns='Game'), y, 
                                                        prediction_feature_set.drop(columns='Game'), all_estimators)


Now that we have our predictions, the final step is to put them into a DataFrame so that they're easy on the eye and correspond to actual home and away teams rather than Game IDs.

In [64]:
preds_df = pd.DataFrame({
    "Game": prediction_feature_set['Game'],
    "Prediction (home_win)": preds_next_round
})

prediction_feature_set['home_odds'] = 1 / prediction_feature_set['implied_odds_prob']
prediction_feature_set['away_odds'] = 1 / prediction_feature_set['implied_odds_prob_away']

final_preds_df = pd.merge(preds_df, next_week_df[['home_team', 'away_team', 'Game']], on='Game').drop_duplicates()
final_preds_df = pd.merge(final_preds_df, prediction_feature_set[['home_odds', 'away_odds', 'home_elo', 'away_elo', 'Game']], on='Game')
final_preds_df['Predicted Winner'] = final_preds_df.apply(lambda x: x['home_team'] if x['Prediction (home_win)'] == 1 else x['away_team'], axis=1)
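
The odds columns above are recovered by inverting the implied probabilities. As a quick worked check of that relationship, using the Port Adelaide v Essendon odds from our table (the two function names are just for illustration):

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to the probability they imply."""
    return 1.0 / decimal_odds

def decimal_odds_from(implied_prob):
    """Convert an implied probability back to decimal odds."""
    return 1.0 / implied_prob

home_prob = implied_probability(1.71)
away_prob = implied_probability(2.42)
print(round(home_prob, 6), round(away_prob, 6))  # 0.584795 0.413223
print(round(decimal_odds_from(home_prob), 2))    # 1.71
```

These match the `implied_odds_prob` and `implied_odds_prob_away` values for Game 15390 in our prediction feature set.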

In [66]:
final_preds_df[['Game', 'home_team', 'away_team', 'Predicted Winner', 'home_odds', 'away_odds', 'home_elo', 'away_elo']]

Unnamed: 0,Game,home_team,away_team,Predicted Winner,home_odds,away_odds,home_elo,away_elo
0,15390,Port Adelaide,Essendon,Port Adelaide,1.71,2.42,1549.809163,1510.090428
1,15391,Geelong,Gold Coast,Geelong,1.03,38.0,1648.870232,1282.313681
2,15392,Richmond,Western Bulldogs,Richmond,1.11,9.0,1633.082758,1442.83549
3,15393,Fremantle,Collingwood,Collingwood,6.4,1.18,1421.059791,1515.584772
4,15394,Sydney,Hawthorn,Sydney,1.74,2.32,1642.145056,1606.451621
5,15395,Carlton,Adelaide,Adelaide,8.2,1.14,1248.165593,1627.304629
6,15396,Brisbane,West Coast,Brisbane,2.58,1.63,1292.723413,1576.920238
7,15397,Melbourne,GWS,Melbourne,1.53,2.92,1524.181803,1602.203161
8,15398,St Kilda,North Melbourne,St Kilda,3.8,1.35,1410.547366,1465.710806


## Conclusion
Congratulations! You have created AFL predictions for this week. If you are a beginner, don't be overwhelmed. The process gets easier each time you do it, and it is super rewarding. In future iterations we will update this tutorial to predict actual odds, and then integrate this with Betfair's API so that you can create an automated betting strategy using Machine Learning to create your predictions!