# Performance metrics

We are generally interested in the top few picks for any given week - say the top 6 picks.  

The general plan is to add a bunch of features, build a shallow-learning predictive pipeline, and test the effectiveness of the features.  The goal is to improve the win-probability prediction component of the simulation.    

Plan of basic features to add:  
* win/loss percent
* streak
* points for/ points against
* home/ away
* team name (one-hot-encode)

These features can be calculated with various parameter settings:
* number of games to look back
* whether to look back to the previous season
* whether to weight the data based on the opponent at the time
* operations on parameters, e.g., squared, log normalized, parameters multiplied or added together  
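
The lookback parameter can be sketched as a per-team rolling mean, shifted by one game so that a game never sees its own result. This is only a minimal sketch, assuming a long-format frame with one row per team per game, sorted in game order; the helper name `rolling_prior_mean`, its `fill_value` argument, and the toy data are illustrative and not part of the notebook.

```python
import pandas as pd

def rolling_prior_mean(df, column, lookback, fill_value):
    """Mean of `column` over each team's previous `lookback` games.

    Assumes `df` has a 'team' column and is sorted in game order.
    The current game is excluded via shift(1); a team's first game
    has no history, so it is imputed with `fill_value`.
    """
    prior = df.groupby('team')[column].transform(
        lambda s: s.rolling(window=lookback, min_periods=1).mean().shift(1)
    )
    return prior.fillna(fill_value)

# toy data (hypothetical): two teams, three games each
toy = pd.DataFrame({
    'team':    ['A', 'B', 'A', 'B', 'A', 'B'],
    'pts_for': [10.0, 7.0, 20.0, 14.0, 30.0, 21.0],
})
toy['pts_for_roll_2'] = rolling_prior_mean(toy, 'pts_for', lookback=2, fill_value=0.0)
```

Calling the helper once per lookback value and per column would give one new feature for each combination.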

There are a lot of potential combinations.  All features will end up being doubled (added for the opponent as well).
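
Doubling the features for the opponent amounts to a self-merge: each row looks up the opponent's row for the same season and week. The sketch below shows one way to wrap that in a reusable helper; `add_opponent_features`, `KEYS`, and the toy frame are hypothetical names, and it assumes one row per team per game with at most one game per team per week.

```python
import pandas as pd

KEYS = ['schedule_season', 'schedule_week']

def add_opponent_features(games, features):
    """Attach each opponent's features as 'opp_<feature>' columns.

    Assumes one row per team per game, with 'team' and 'opponent'
    columns, and that a team plays at most once per season/week.
    """
    renames = {'team': 'opponent'}
    renames.update({f: 'opp_' + f for f in features})
    # the opponent's rows, relabelled so they join onto the 'opponent' column
    opp = games[KEYS + ['team'] + features].rename(columns=renames)
    return games.merge(opp, on=KEYS + ['opponent'])

# toy data (hypothetical): one game seen from both sides
toy = pd.DataFrame({
    'schedule_season': [1979, 1979],
    'schedule_week': [1, 1],
    'team': ['TB', 'DET'],
    'opponent': ['DET', 'TB'],
    'won_roll_4': [0.75, 0.25],
})
toy = add_opponent_features(toy, ['won_roll_4'])
```

Each row now carries its own `won_roll_4` plus the other side's value as `opp_won_roll_4`.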

There are two ways to approach this: either build all the features in a large CSV file, and then test the algorithms, or build them one or a few at a time and test as we go.  The second sounds more fun, so I'll go with that.  

In [1]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.metrics import log_loss
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.svm import SVC



## Load the data and do some wrangling

In [2]:
path = r'..\processed_data'
data = pd.read_csv(path + '\\spreadspoke_scores_processed.csv')

In [3]:
data.columns

Index(['Unnamed: 0', 'schedule_date', 'schedule_season', 'schedule_week',
       'schedule_playoff', 'team_home', 'score_home', 'score_away',
       'team_away', 'team_favorite_id', 'spread_favorite', 'over_under_line',
       'stadium', 'stadium_neutral', 'weather_temperature', 'weather_wind_mph',
       'weather_humidity', 'weather_detail', 'team_home_id', 'team_away_id',
       'winner', 'favorite_won', 'team_underdog_id'],
      dtype='object')

In [4]:
# express the spread from the home team's perspective; a negative spread means the home team is favored
data.loc[(data.team_home_id == data.team_favorite_id),'spread_home'] = data['spread_favorite']
data.loc[(data.team_home_id != data.team_favorite_id),'spread_home'] = -data['spread_favorite']

In [5]:
data['home_won'] = (data.winner == data.team_home_id)

In [6]:
# replace the playoffs with numbers
playoff_list = data.schedule_week.drop_duplicates().sort_values()[-6::].tolist()
number_playoffs = {'Conference':21, 'Division':20, 'SuperBowl':22, 'Superbowl':22, 'WildCard':19, 'Wildcard':19}
data.loc[data.schedule_week.isin(playoff_list), 'schedule_week'] = data.loc[data.schedule_week.isin(playoff_list), 'schedule_week'].map(number_playoffs)
data.schedule_week = pd.to_numeric(data.schedule_week)
data.score_home = pd.to_numeric(data.score_home)
data.score_away = pd.to_numeric(data.score_away)
data.head()

Unnamed: 0.1,Unnamed: 0,schedule_date,schedule_season,schedule_week,schedule_playoff,team_home,score_home,score_away,team_away,team_favorite_id,...,weather_wind_mph,weather_humidity,weather_detail,team_home_id,team_away_id,winner,favorite_won,team_underdog_id,spread_home,home_won
0,0,09/01/1979,1979,1,False,Tampa Bay Buccaneers,31.0,16.0,Detroit Lions,TB,...,9.0,87.0,,TB,DET,TB,True,DET,-3.0,True
1,1,11/23/1980,1980,12,False,Tampa Bay Buccaneers,10.0,24.0,Detroit Lions,TB,...,9.0,77.0,,TB,DET,DET,False,DET,-3.0,False
2,2,10/04/1981,1981,5,False,Tampa Bay Buccaneers,28.0,10.0,Detroit Lions,TB,...,9.0,76.0,,TB,DET,TB,True,DET,-1.0,True
3,3,12/26/1982,1982,8,False,Tampa Bay Buccaneers,23.0,21.0,Detroit Lions,TB,...,11.0,72.0,,TB,DET,TB,True,DET,-3.5,True
4,4,09/04/1983,1983,1,False,Tampa Bay Buccaneers,0.0,11.0,Detroit Lions,TB,...,7.0,83.0,,TB,DET,DET,False,DET,-3.0,False


## First Benchmark = logistic regression on point spreads

In [7]:
X = data.spread_home
X = X.values.reshape(-1, 1) 
y = data.home_won
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

In [8]:
model = LogisticRegression().fit(X_train, y_train)

In [9]:
percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
percent_correct

0.6597370834607154

In [10]:
log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)

0.6148705115814553
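
To put that log loss in context, it helps to compute the metric by hand: binary log loss is the average of the negative log of the probability assigned to the outcome that actually happened, so a coin-flip prediction of 0.5 for every game scores ln(2), about 0.693. The sketch below uses made-up probabilities purely for illustration; `binary_log_loss` is a hypothetical helper, not a scikit-learn function.

```python
import math

def binary_log_loss(y_true, p_home_win):
    """Average of -log(probability assigned to the actual outcome)."""
    total = 0.0
    for won, p in zip(y_true, p_home_win):
        total += -math.log(p if won else 1.0 - p)
    return total / len(y_true)

# toy outcomes and predicted home-win probabilities (hypothetical values)
y_true = [True, False, True, True]
coin_flip = [0.5, 0.5, 0.5, 0.5]
informed = [0.7, 0.4, 0.6, 0.8]

baseline = binary_log_loss(y_true, coin_flip)   # equals ln(2)
better = binary_log_loss(y_true, informed)      # lower is better
```

Against that yardstick, the benchmark's value of about 0.61 is a modest improvement over guessing 50/50.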

## Second Benchmark = decision tree on point spreads

In [11]:
model = DecisionTreeClassifier().fit(X_train, y_train)

In [12]:
percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
percent_correct

0.6502598593702231

In [13]:
log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)

0.6588701109321393

## Third Benchmark = random forest on point spreads

In [14]:
model = RandomForestClassifier().fit(X_train, y_train)

In [15]:
percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
percent_correct

0.6484255579333538

In [16]:
log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)

0.6590065727064073

## Fourth Benchmark = extra trees on point spreads

In [17]:
model = ExtraTreesClassifier().fit(X_train, y_train)

In [18]:
percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
percent_correct

0.6502598593702231

In [19]:
log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)

0.6588701109321392

## Fifth Benchmark = support vector machine on point spreads

In [20]:
model = LinearSVC().fit(X_train, y_train)

In [21]:
percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
percent_correct

0.6597370834607154

In [22]:
# no log loss: LinearSVC does not implement predict_proba
#  log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)
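
One possible way to get a log loss for the linear SVM, not used in this notebook, is to wrap it in scikit-learn's `CalibratedClassifierCV`, which fits a mapping from the SVM margins to probabilities. The sketch below runs on synthetic data (the arrays `spread_home`, `home_won`, and `X_demo` are invented stand-ins, not the real games) and scores on the training data only to show the API.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import log_loss
from sklearn.svm import LinearSVC

# synthetic stand-in for the spread data (hypothetical, not the real games)
rng = np.random.default_rng(0)
spread_home = rng.normal(0.0, 6.0, size=600)
home_won = (spread_home + rng.normal(0.0, 6.0, size=600)) < 0
X_demo = spread_home.reshape(-1, 1)

# fit LinearSVC inside a calibrator that maps its margins to probabilities
calibrated = CalibratedClassifierCV(LinearSVC(max_iter=10000), cv=5)
calibrated.fit(X_demo, home_won)

proba = calibrated.predict_proba(X_demo)
ll = log_loss(y_true=home_won, y_pred=proba)
```

With the real data, the calibrated model would be fit on `X_train` and scored on `X_test`, as in the other benchmarks.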

## Some more wrangling

In [23]:
# reorganize the DF so each row is one team's view of a game (each game is listed twice)
games = data[['schedule_season', 'schedule_week', 'team_home_id', 'team_away_id', 'spread_home', 
              'score_home', 'score_away','home_won']].sort_values(by = ['schedule_season', 'schedule_week', 'team_home_id'])
games['home'] = True
games.rename(columns = {'team_home_id':'team', 'team_away_id': 'opponent', 'spread_home':'spread',
                        'score_home':'pts_for', 'score_away': 'pts_against', 'home_won' : 'won'}, inplace = True)
copy = games.copy()
copy.rename(columns = {'team':'opponent', 'opponent':'team', 'pts_for': 'pts_against', 'pts_against':'pts_for'}, inplace = True)
copy.home = False
copy.spread = -copy.spread
# flip the outcome for the mirrored (away) rows; ~ is the logical NOT for booleans
copy.won = ~copy.won
games = pd.concat([games, copy]).sort_values(by = ['schedule_season', 'schedule_week', 'team'])
games = games[[ 'schedule_season', 'schedule_week','team','opponent', 'home', 'spread','pts_for', 'pts_against', 'won']]


# add running tally of game counts for each team
teams = games.team.drop_duplicates().sort_values().tolist()
for team in teams:
    games.loc[games.team == team, 'team_game_count'] =  np.arange(sum(games.team == team)) 

games.head()



Unnamed: 0,schedule_season,schedule_week,team,opponent,home,spread,pts_for,pts_against,won,team_game_count
7346,1979,1,ARI,DAL,True,4.0,21.0,22.0,False,0.0
3888,1979,1,ATL,NO,False,5.0,40.0,34.0,True,0.0
4784,1979,1,BUF,MIA,True,5.0,7.0,9.0,False,0.0
839,1979,1,CHI,GB,True,-3.0,6.0,3.0,True,0.0
5452,1979,1,CIN,DEN,False,3.0,0.0,10.0,False,0.0


## Prediction functions for use

In [24]:
# this is the prediction function for logistic regression only
def log_reg(X, y):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    
    lr = LogisticRegression().fit(X_train, y_train)
    
    percent_correct = sum(lr.predict(X_test)==y_test)/len(y_test)
    ll = log_loss(y_pred=lr.predict_proba(X_test), y_true=y_test)
    print('percent_correct = ',percent_correct*100)
    print('log_loss = ', ll)

In [25]:
# this is the prediction function using a few different algorithms
def bench_marks(X, y):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    
    models = {'logistic regression': LogisticRegression().fit(X_train, y_train), 
              'decision tree': DecisionTreeClassifier().fit(X_train, y_train),
              'random forest': RandomForestClassifier().fit(X_train, y_train),
              'extra trees': ExtraTreesClassifier().fit(X_train, y_train), 
              'linear support vector': LinearSVC().fit(X_train, y_train), 
             'support vector': SVC().fit(X_train, y_train)}
    results_percents = []
    results_ll = []
    
    for model_name, model in models.items():
        print(model_name,'---------')
        percent_correct = sum(model.predict(X_test)==y_test)/len(y_test)
        percent_correct = round(percent_correct*100,1)
        print('     percent_correct = ',percent_correct)
        try: 
            ll = log_loss(y_pred=model.predict_proba(X_test), y_true=y_test)
            ll = round(ll,2)
            print('     log_loss = ', ll)
        except AttributeError:
            # model has no predict_proba (e.g., LinearSVC, or SVC without probability=True)
            ll = None
        results_percents.append(percent_correct)
        results_ll.append(ll)
    return models, results_percents, results_ll

In [26]:
def rand_for(X, y):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)

    clf = RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=10, max_features='sqrt', max_leaf_nodes=None,
            min_impurity_decrease=0.0,
            min_samples_leaf=4, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=1600, n_jobs=-1,
            oob_score=False, verbose=0,
            warm_start=False, random_state=42)
    clf.fit(X_train, y_train) 
    
    percent_correct = sum(clf.predict(X_test)==y_test)/len(y_test)
    ll = log_loss(y_pred = clf.predict_proba(X_test), y_true=y_test)
    print('    percent_correct = ',percent_correct*100)
    print('    log_loss = ', ll)



## Try and compare feature extraction functions

In [27]:
# add previous pts_for/ pts_against in previous n games

def for_against(df, lookbacks = [1]):
    '''function adds features based on points scored and 
    given up for a number of previous games'''
    # df is the starting dataframe
    # lookbacks is the # of games to lookback 
    # the length of lookbacks is the # of new features 
    # to add 
    
    results  = pd.DataFrame()
    
    # do not modify the original dataframe
    games = df.copy()
    
    average_pts_for = games.pts_for.mean() # for imputing the first value
    
    params = ['pts_for', 'pts_against']
    new_features = []
    
    # loop through the lookbacks
    for lookback in lookbacks:
        # lookbacks for points for and points against
        for param in params:
            new_feature = param+'_roll_'+str(lookback)
            new_features.append(new_feature)
            # get the rolling average for each team
            for team in teams:
                rolling  = games[games.team == team][param].rolling(window = lookback, min_periods = 1).mean().tolist()
                # rolling average is inclusive - shift back and impute the first value as the global average
                rolling.insert(0,average_pts_for)
                del rolling[-1]
                games.loc[games.team == team, new_feature] = rolling

    # add opponents' pts for and against
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )

    # keep only home games, separate features and target, and call the prediction function
    features = new_features+opp_features
    target = 'won'
    X = games[games.home][features]
    y = games[games.home][target]
    print('lookbacks = ',lookbacks, '--------------------------')
    models, pcts, lls = bench_marks(X, y)
    
# the dataframe stuff wasn't quite working, fix later
#     lookbacks = lookbacks*len(models)
#     results['lookback'] = lookbacks
#     results['model'] = models
#     results['pct'] = pcts
#     results['ll'] = lls
#     return results
    
    

In [28]:
for l in range(5,13):
    lookbacks = [l, 4, 14]
    for_against(games, lookbacks = lookbacks)

lookbacks =  [5, 4, 14] --------------------------
logistic regression ---------
     percent_correct =  65.2
     log_loss =  0.63
decision tree ---------
     percent_correct =  55.0
     log_loss =  15.52
random forest ---------
     percent_correct =  58.1
     log_loss =  0.84
extra trees ---------
     percent_correct =  59.3
     log_loss =  0.91
linear support vector ---------
     percent_correct =  65.1
support vector ---------
     percent_correct =  65.0
lookbacks =  [6, 4, 14] --------------------------
logistic regression ---------
     percent_correct =  65.2
     log_loss =  0.63
decision tree ---------
     percent_correct =  55.2
     log_loss =  15.46
random forest ---------
     percent_correct =  58.6
     log_loss =  0.96
extra trees ---------
     percent_correct =  58.3
     log_loss =  0.89
linear support vector ---------
     percent_correct =  65.2
support vector ---------
     percent_correct =  65.0
lookbacks =  [7, 4, 14] --------------------------
logisti

In [29]:
# try a whole bunch
lookbacks = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
for_against(games, lookbacks = lookbacks)

lookbacks =  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] --------------------------
logistic regression ---------
     percent_correct =  64.8
     log_loss =  0.63
decision tree ---------
     percent_correct =  56.2
     log_loss =  15.08
random forest ---------
     percent_correct =  59.0
     log_loss =  0.85
extra trees ---------
     percent_correct =  59.4
     log_loss =  0.88
linear support vector ---------
     percent_correct =  64.7
support vector ---------
     percent_correct =  64.5
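
The decision tree's log loss of roughly 15 is probably an artifact rather than a meaningful score. A fully grown tree tends to output probabilities of exactly 0 or 1, and log loss clips those before taking the log, so every wrong prediction costs the maximum penalty. The arithmetic below is a rough check under the assumption that the clip value is 1e-15 (the default in older scikit-learn releases; other versions may differ).

```python
import math

EPS = 1e-15           # assumed clipping value; depends on the scikit-learn version
error_rate = 0.45     # roughly 1 - 0.55 accuracy reported for the decision tree

cost_per_miss = -math.log(EPS)           # penalty for a confident wrong prediction
approx_log_loss = error_rate * cost_per_miss
print(round(approx_log_loss, 2))
```

The result lands close to the values printed for the decision tree, which suggests that its log loss mostly restates its error rate.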


After trying some combos, lookbacks of 4 and 14 seem to be the best.

However, the features do not add any accuracy above the point-spread logistic regression benchmark (about 66% correct).  

Next steps:  

1.  come up with a new performance metric - only the top 20% of games in each week.  
2.  keep adding features to see if they ever start to improve the accuracy of the predictions.  
3.  tune hyperparameters and iterate again
4.  try only on the more recent seasons
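
The first item could look something like the sketch below: score only the most confident fraction of games within each week. This is one possible definition, not a settled metric; `top_fraction_accuracy`, its arguments, and the toy probabilities are all hypothetical, and with real data the week key would need to combine season and week.

```python
import pandas as pd

def top_fraction_accuracy(week_keys, proba_home, home_won, frac=0.2):
    """Accuracy on the most confident `frac` of games within each week.

    Confidence is the distance of the predicted home-win probability
    from 0.5; at least one game is kept per week.
    """
    df = pd.DataFrame({
        'week': list(week_keys),
        'p': list(proba_home),
        'won': list(home_won),
    })
    df['confidence'] = (df['p'] - 0.5).abs()
    df['correct'] = (df['p'] > 0.5) == df['won']

    picks = []
    for _, week in df.groupby('week'):
        n_keep = max(1, int(round(frac * len(week))))
        picks.append(week.nlargest(n_keep, 'confidence'))
    return pd.concat(picks)['correct'].mean()

# toy example (hypothetical probabilities): two weeks of five games each
weeks = [1] * 5 + [2] * 5
proba = [0.9, 0.6, 0.55, 0.45, 0.3, 0.2, 0.52, 0.5, 0.65, 0.7]
won = [True, False, True, False, False, True, True, False, True, True]
score = top_fraction_accuracy(weeks, proba, won, frac=0.2)
```

Here one game is kept per week: the week 1 pick is correct and the week 2 pick is wrong, so the score is 0.5.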

Before changing the performance metric, let's try win/loss record features

In [30]:
# add win/loss in previous n games

def win_loss(df, lookbacks = [1]):
    '''function adds features based on win/loss
    for a number of previous games
    calls the prediction function and 
    prints the results'''
    # df is the starting dataframe
    # lookbacks is the # of games to lookback 
    # the length of lookbacks is the # of new features 
    # to add 
    
    results  = pd.DataFrame()
    
    # do not modify the original dataframe
    games = df.copy()
    
    
    average_val = 0.5   # for imputing the first value
    
    params = ['won']
    new_features = []
    
    # loop through the lookbacks
    for lookback in lookbacks:
        # lookbacks for win/loss
        for param in params:
            new_feature = param+'_roll_'+str(lookback)
            new_features.append(new_feature)
            # get the rolling average for each team
            for team in teams:
                rolling  = games[games.team == team][param
                                ].rolling(window = lookback, min_periods = 1).mean().tolist()
                # rolling average is inclusive - shift back and impute the first value as the global average
                rolling.insert(0,average_val)
                del rolling[-1]
                games.loc[games.team == team, new_feature] = rolling

    # add opponents' feature
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )

    # keep only home games, separate features and target, and call the prediction function
    features = new_features+opp_features
    target = 'won'
    X = games[games.home][features]
    y = games[games.home][target]
    print('lookbacks = ',lookbacks, '--------------------------')
    models, pcts, lls = bench_marks(X, y)
    return games
    
# the dataframe stuff wasn't quite working, fix later
#     lookbacks = lookbacks*len(models)
#     results['lookback'] = lookbacks
#     results['model'] = models
#     results['pct'] = pcts
#     results['ll'] = lls
#     return results
    
    

In [31]:
games.head()

Unnamed: 0,schedule_season,schedule_week,team,opponent,home,spread,pts_for,pts_against,won,team_game_count
7346,1979,1,ARI,DAL,True,4.0,21.0,22.0,False,0.0
3888,1979,1,ATL,NO,False,5.0,40.0,34.0,True,0.0
4784,1979,1,BUF,MIA,True,5.0,7.0,9.0,False,0.0
839,1979,1,CHI,GB,True,-3.0,6.0,3.0,True,0.0
5452,1979,1,CIN,DEN,False,3.0,0.0,10.0,False,0.0


In [32]:
lookbacks = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
win_loss(games, lookbacks = lookbacks)

lookbacks =  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] --------------------------
logistic regression ---------
     percent_correct =  63.6
     log_loss =  0.64
decision tree ---------
     percent_correct =  55.0
     log_loss =  15.48
random forest ---------
     percent_correct =  56.5
     log_loss =  1.03
extra trees ---------
     percent_correct =  58.1
     log_loss =  1.13
linear support vector ---------
     percent_correct =  63.4
support vector ---------
     percent_correct =  63.9


Unnamed: 0,schedule_season,schedule_week,team,opponent,home,spread,pts_for,pts_against,won,team_game_count,...,opp_won_roll_5,opp_won_roll_6,opp_won_roll_7,opp_won_roll_8,opp_won_roll_9,opp_won_roll_10,opp_won_roll_11,opp_won_roll_12,opp_won_roll_13,opp_won_roll_14
0,1979,1,ARI,DAL,True,4.0,21.0,22.0,False,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
1,1979,1,ATL,NO,False,5.0,40.0,34.0,True,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
2,1979,1,BUF,MIA,True,5.0,7.0,9.0,False,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
3,1979,1,CHI,GB,True,-3.0,6.0,3.0,True,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
4,1979,1,CIN,DEN,False,3.0,0.0,10.0,False,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
5,1979,1,CLE,NYJ,False,2.0,25.0,22.0,True,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
6,1979,1,DAL,ARI,False,-4.0,22.0,21.0,True,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
7,1979,1,DEN,CIN,True,-3.0,10.0,0.0,True,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
8,1979,1,DET,TB,False,3.0,16.0,31.0,False,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000
9,1979,1,GB,CHI,False,3.0,3.0,6.0,False,0.0,...,0.5,0.500000,0.500000,0.500,0.500000,0.5,0.500000,0.500000,0.500000,0.500000


A lookback of 13 is the sweet spot; adding more lookback values doesn't seem to help.

Next, see how the points for/against (pf/pa) and win/loss (w/l) features stack up together.

In [33]:
# add previous pts_for/ pts_against in previous n games

def for_against_win_loss(df, lookbacks = [1], prediction_function = bench_marks):
    '''function adds features based on points scored and 
    given up for a number of previous games'''
    # df is the starting dataframe
    # lookbacks is the # of games to lookback 
    # the length of lookbacks is the # of new features 
    # to add 
    
    results  = pd.DataFrame()
    
    # do not modify the original dataframe
    games = df.copy()
    
    average_pts_for = games.pts_for.mean() # for imputing the first value
    average_val = 0.5   # for imputing the first value
    
    params = ['won','pts_for', 'pts_against']
    new_features = []
    
    # loop through the lookbacks
    for lookback in lookbacks:
        # lookbacks for win/loss, points for, and points against
        for param in params:
            new_feature = param+'_roll_'+str(lookback)
            new_features.append(new_feature)
            # get the rolling average for each team
            for team in teams:
                rolling  = games[games.team == team][param
                            ].rolling(window = lookback, min_periods = 1).mean().tolist()
                # rolling average is inclusive - shift back and impute the first value as the global average
                if param == 'won':
                    rolling.insert(0,average_val)
                else:
                    rolling.insert(0,average_pts_for)
                del rolling[-1]
                games.loc[games.team == team, new_feature] = rolling

    # add opponents'
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )

    # keep only home games, separate features and target, and call the prediction function
    features = new_features+opp_features
    target = 'won'
    X = games[games.home][features]
    y = games[games.home][target]
    print('lookbacks = ',lookbacks, '--------------------------')
    prediction_function(X, y)
    
# the dataframe stuff wasn't quite working, fix later
#     lookbacks = lookbacks*len(models)
#     results['lookback'] = lookbacks
#     results['model'] = models
#     results['pct'] = pcts
#     results['ll'] = lls
#     return results
    
    

In [34]:
lookbacks = [4,14]
for_against_win_loss(games, lookbacks = lookbacks)

lookbacks =  [4, 14] --------------------------
logistic regression ---------
     percent_correct =  65.2
     log_loss =  0.63
decision tree ---------
     percent_correct =  55.5
     log_loss =  15.35
random forest ---------
     percent_correct =  58.0
     log_loss =  0.9
extra trees ---------
     percent_correct =  59.4
     log_loss =  0.94
linear support vector ---------
     percent_correct =  65.1
support vector ---------
     percent_correct =  64.4


No difference.  With points for/points against, try adjusting for the quality of the opponent.  

In [35]:
# add previous pts_for/ pts_against in previous n games

def for_against_weighted(df, lookbacks = [1], prediction_function = bench_marks):
    '''function adds features based on points scored and 
    given up for a number of previous games'''
    # df is the starting dataframe
    # lookbacks is the # of games to lookback 
    # the length of lookbacks is the # of new features 
    # to add 
    
    results  = pd.DataFrame()
    
    # do not modify the original dataframe
    games = df.copy()
    
    average_pts_for = games.pts_for.mean() # for imputing the first value
    
    params = ['pts_for', 'pts_against']
    new_features = []
    
    # the initial pass is for the baseline strength only;
    # it uses 14 as the lookback
    lookback = 14

    # lookbacks for points for and points against
    for param in params:
        new_feature = param+'_roll_'+str(lookback)
        new_features.append(new_feature)
        # get the rolling average for each team
        for team in teams:
            rolling  = games[games.team == team][param
                        ].rolling(window = lookback, min_periods = 1).mean().tolist()
            # rolling average is inclusive of the week's value - shift back and impute the first value as the global average
            rolling.insert(0,average_pts_for)
            del rolling[-1]
            games.loc[games.team == team, new_feature] = rolling

    # add opponents' pts for and against
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )

    # add the adjusted pts_for and against
    games['pts_for_adj'] =  games.pts_for - games.opp_pts_against_roll_14
    games['pts_against_adj'] = games.pts_against - games.opp_pts_for_roll_14
    
    # add opponents' pts for and against
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )
    
    
    average_pts_for = games.pts_for_adj.mean() # for imputing the first value
    
    params = ['pts_for_adj', 'pts_against_adj']
    new_features = []

    # loop through the lookbacks
    for lookback in lookbacks:
        # lookbacks for adjusted points for and points against
        for param in params:
            new_feature = param+'_roll_'+str(lookback)
            new_features.append(new_feature)
            # get the rolling average for each team
            for team in teams:
                rolling  = games[games.team == team][param].rolling(window = lookback, min_periods = 1).mean().tolist()
                # rolling average is inclusive - shift back and impute the first value as the global average
                rolling.insert(0,average_pts_for)
                del rolling[-1]
                games.loc[games.team == team, new_feature] = rolling

    
    
    # add opponents' pts for and against (repeated from above, not good - use a separate function)
    opp_features = []
    for feature in new_features:
        opp_features.append('opp_'+feature)
    col_names = {'team':'opponent'}
    col_names.update(dict(zip(new_features, opp_features)))
    games = games.merge(games[['schedule_season', 'schedule_week', 'team']+new_features
                     ].rename(columns  = col_names), 
                on = ['schedule_season', 'schedule_week', 'opponent'] )
    
    
    # keep only home games, separate features and target, and call the prediction function
    features = new_features+opp_features
    target = 'won'
    X = games[games.home][features]
    y = games[games.home][target]
    print('lookbacks = ',lookbacks, '--------------------------')
    prediction_function(X, y)
    
    return games
# the dataframe stuff wasn't quite working, fix later
#     lookbacks = lookbacks*len(models)
#     results['lookback'] = lookbacks
#     results['model'] = models
#     results['pct'] = pcts
#     results['ll'] = lls
#     return results
    
    

In [36]:
results = for_against_weighted(games, lookbacks = [4, 10, 14])


lookbacks =  [4, 10, 14] --------------------------
logistic regression ---------
     percent_correct =  65.1
     log_loss =  0.63
decision tree ---------
     percent_correct =  54.9
     log_loss =  15.56
random forest ---------
     percent_correct =  58.9
     log_loss =  1.01
extra trees ---------
     percent_correct =  59.8
     log_loss =  0.8
linear support vector ---------
     percent_correct =  65.0
support vector ---------
     percent_correct =  63.9


In [37]:
results = for_against_weighted(games, lookbacks = [4, 10, 14], 
                               prediction_function = rand_for)


lookbacks =  [4, 10, 14] --------------------------
    percent_correct =  64.3657281259461
    log_loss =  0.6339755441128201


Even the tuned random forest is no better than the logistic regression on the point spreads!  Next, try it with the combined points for/against and win/loss features.

In [38]:
for_against_win_loss(games, lookbacks = [4, 10, 14], 
                               prediction_function = rand_for)

lookbacks =  [4, 10, 14] --------------------------
    percent_correct =  64.27261613691931
    log_loss =  0.630165339492892


Wow - feature engineering and more complicated algorithms are getting me nowhere!  

Win/loss percentage, points for/ against, weighted points for/ points against, nothing seems to be better than logistic regression on the point spreads.   

Revisiting the action items above, many of them fine-tune the engineered features, but the features are not helping.  

Maybe:  
* find new metrics for the quality of the prediction function
* perhaps train only on the most likely 20% of games in any week (the games with the most confident predictions) to improve prediction at that end of the range
* work on the look-ahead path-finding function; since we can't use point spreads in the long run, maybe try to get a win percentage from inputs other than point spreads