## Machine Learning Final Project - NFL Betting with Multiple Linear Regression 

### Goal
Discover which features in the NFL data sets tend to have the most correlation with the over under line and the spread favorite

First, I imported the necessary libraries/packages

In [2047]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Here, I read both datasets into dataframes

In [2074]:
data1 = 'spreadspoke_scores_2004-05_to_2020-21.csv'
data2 = 'nfl_games_2-NEW.csv'
df1 = pd.read_csv(data1)
df2 = pd.read_csv(data2)
df1.columns = df1.columns.str.strip()
df2.columns = df2.columns.str.strip()
df1.columns, df2.columns
df1 = df1[0:256] #just 2004 season reg-season for simplicity of understanding operations
df2 = df2[0:256] #just 2004 season reg-season for simplicity of understanding operations
df1.columns, df2.columns

(Index(['schedule_date', 'schedule_season', 'schedule_week', 'schedule_playoff',
        'team_home', 'score_home', 'score_away', 'team_away',
        'team_favorite_id', 'spread_favorite', 'over_under_line', 'stadium',
        'stadium_neutral', 'weather_temperature', 'weather_wind_mph',
        'weather_humidity', 'weather_detail'],
       dtype='object'),
 Index(['Week', 'Day', 'Date', 'Time', 'Winner_tie', 'home_away', 'Loser_tie',
        'Unnamed: 7', 'WPts', 'LPts', 'YdsW', 'TOW', 'YdsL', 'TOL'],
       dtype='object'))

Before creating a key column, I checked if the columns that I was using to create the key had the same elements. (More on this action below.)

In [2075]:
df1.schedule_week.unique(), df2.Week.unique()

(array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
        '13', '14', '15', '16', '17'], dtype=object),
 array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
        '13', '14', '15', '16', '17'], dtype=object))

I corrected the corresponding values in the two data sets that would go into making the keys. For example, both week column's 'Wildcards' which are Wildcard and WildCard. This code is relevant and useful when the dataframe includes playoff weeks which were cut out to reduce complications sake.

In [2076]:
for i, w in enumerate(df1['schedule_week']):
    if w == 'WildCard':
        df1.loc[i,'schedule_week'] = 'Wildcard'
    elif w == 'SuperBowl':
        df1.loc[i,'schedule_week'] = 'Superbowl'

for i, w in enumerate(df2['Week']):
    if w == 'WildCard':
        df2.loc[i,'Week'] = 'Wildcard'
    elif w == 'ConfChamp':
        df2.loc[i,'Week'] = 'Conference'
    elif w == 'SuperBowl':
        df2.loc[i,'Week'] = 'Superbowl'

print(list(df2.Week.unique()) == list(df1.schedule_week.unique()))

two = list(df2.Winner_tie.unique())
one = list(df1.team_home.unique())
print(two.sort() == one.sort())
#we want both of the values below to be True
#when True, it means that the unique set of values of both data sets are the same

True
True


Next, I created new columns in the data frame. Below the '#' I added to the new features to fit with the theme of just home features, away features, and aggregate features, inside the data frame.

In [2077]:
df2['Home_N'] = 0
df2['Away_N'] = 0
df2['Yds_home'] = 0
df2['Yds_away'] = 0
df2['TO_home'] = 0
df2['TO_away'] = 0

above_stats = ['Yds_home', 'Yds_away', 'TO_home', 'TO_away']

#
for i, row in enumerate(df2.iterrows()):
    if df2.loc[i,'home_away'] == '@' or df2.loc[i,'home_away'] == 'N':
        df2.loc[i, 'Home_N'] = df2.loc[i, 'Loser_tie']
        df2.loc[i, 'Away_N'] = df2.loc[i, 'Winner_tie']
        df2.loc[i, 'Yds_home'] = df2.loc[i, 'YdsL']
        df2.loc[i, 'Yds_away'] = df2.loc[i, 'YdsW']
        df2.loc[i, 'TO_home'] = df2.loc[i, 'TOL']
        df2.loc[i, 'TO_away'] = df2.loc[i, 'TOW']
        
    elif df2.loc[i,'home_away'] != '@': #the text inside the string could be anything (not just @) because everything has been covered above. The key joining doesn't work if the expression says == '' though, so that's why it's expressed as != 'x' where x is a random letter, symbol, etc.
        df2.loc[i, 'Away_N'] = df2.loc[i, 'Loser_tie']
        df2.loc[i, 'Home_N'] = df2.loc[i, 'Winner_tie']
        df2.loc[i, 'Yds_away'] = df2.loc[i, 'YdsL']
        df2.loc[i, 'Yds_home'] = df2.loc[i, 'YdsW']
        df2.loc[i, 'TO_away'] = df2.loc[i, 'TOL']
        df2.loc[i, 'TO_home'] = df2.loc[i, 'TOW']

Then I made a separate year column for each data set because neither data set has a column that corresponds to just the year that the game was played. I made sure not to label both new columns the exact same thing; I created these column names with the lowercase and titlecase theme of each df in mind so I could know which data sets they were related to.

In [2078]:
df1.loc[df1['schedule_date'].notnull(), 'year'] = df1['schedule_date'].str[-4:]
df2.loc[df2['Date'].notnull(), 'Year'] = df2['Date'].str[0:4]

### Making a Key
To create a key, I first created a column named, 'key'. Then I let each element of the key be a string of values and dashes that made each row's key unique. the form of the key I made is "[year]-[week]-[home team]." As an example, the key could look like either of these examples: 

2022-15-Baltimore Ravens

2004-Wildcard-Indianapolis Colts

In [2079]:
df1['key'] = ''

for i, row in df1.iterrows():
    df1['key'][i] = str(df1.loc[i,'year'])+'-'+str(df1.loc[i,'schedule_week'])+'-'+str(df1.loc[i,'team_home'])

df2['key'] = ''
for i, row in df2.iterrows():
    df2['key'][i] = str(df2.loc[i,'Year'])+'-'+str(df2.loc[i,'Week'])+'-'+str(df2.loc[i,'Home_N'])

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df1['key'][i] = str(df1.loc[i,'year'])+'-'+str(df1.loc[i,'schedule_week'])+'-'+str(df1.loc[i,'team_home'])
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['key'][i] = str(df2.loc[i,'Year'])+'-'+str(df2.loc[i,'Week'])+'-'+str(df2.loc[i,'Home_N'])


I then reseted the index of the data set (back to 0,1,2,...) to use dfloc[i,'x'] ahead, where x is the name of a column in the dataframe AKA a feature.

In [2080]:
df = df1.set_index('key').join(df2.set_index('key'))
df.reset_index(inplace = True, drop = True)

###  Creating Additional Statistical Features
The docstring covers what's going on here, but the purpose of creating these new stat columns is to use them in the future to find their correlation to a certain variable and to use them in the linear regression model. I also attempted to not create too much noise (which basically means, too much non-useful data that will reduce the model's accuracy) by creating all these columns. My goal was to add columns of valueable statistics to the dataframe which would help in find greater correlations between the target variables, spread_favorite and over_under_line, and thereby, likely creating a more accurate model.

In [2081]:
def statistor_STD(dateframe):
    '''iterates through each row and creates a new stat column (points, yards, turnovers),
    both FOR and AGAINST both the home and away team, season to date'''
    ha_list = ['_home', '_away']
    stats = [('score', 'P'), ('Yds', 'Y'), ('TO', 'TO')]
    for_away = ['F', 'A']
    for i, row in enumerate(df.iterrows()):
        for ha in ha_list:
            for stat in stats:
                home_sum = df[(df.team_home==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_home'].sum()
                away_sum = df[(df.team_away==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_home'].sum()
                df.loc[i,'S'+stat[1]+'F'+ha] = home_sum+away_sum
                #points against home STD
                home_sum = df[(df.team_home==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_away'].sum()
                away_sum = df[(df.team_away==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_away'].sum()
                df.loc[i,'S'+stat[1]+'A'+ha] = home_sum+away_sum
                #point differential home STD
                df.loc[i,'S'+stat[1]+'D'+ha] = df.loc[i,'S'+stat[1]+'F'+ha]-df.loc[i,'S'+stat[1]+'A'+ha]

statistor_STD(df)

Below is another stat column making function that is slightly different from the first: it creates related but slightly different values than the first function to be specific.

In [2082]:
#difference explained in docstring
def statistor_5_recent(dateframe):
    '''iterates through each row and creates a new stat column (points, yards, turnovers), 
    both FOR and AGAINST both the home and away team, from the five most recent weeks'''
    ha_list = ['_home', '_away']
    stats = [('score', 'P'), ('Yds', 'Y'), ('TO', 'TO')]
    for_away = ['F', 'A']
    for i, row in enumerate(df.iterrows()):
        for ha in ha_list:
            for stat in stats:
                if int(df.loc[i,'Week'])-5 >= 1:
                    home_sum = df[(df.team_home==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.Week.astype('int') > int(df.loc[i,'Week'])-5)&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_home'].sum()
                    away_sum = df[(df.team_away==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.Week.astype('int') > int(df.loc[i,'Week'])-5)&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_home'].sum()
                    df.loc[i,'5G'+stat[1]+'F'+ha] = home_sum+away_sum
                    #points against home STD
                    home_sum = df[(df.team_home==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.Week.astype('int') > int(df.loc[i,'Week'])-5)&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_away'].sum()
                    away_sum = df[(df.team_away==df.loc[i, 'team'+ha])&(df.Week.astype('int') < int(df.loc[i,'Week']))&(df.Week.astype('int') > int(df.loc[i,'Week'])-5)&(df.schedule_season==df.loc[i, 'schedule_season'])][stat[0]+'_away'].sum()
                    df.loc[i,'5G'+stat[1]+'A'+ha] = home_sum+away_sum
                    #point differential home STD
                    df.loc[i,'5G'+stat[1]+'D'+ha] = df.loc[i,'5G'+stat[1]+'F'+ha]-df.loc[i,'5G'+stat[1]+'A'+ha]

statistor_5_recent(df)

Here I dropped all of the irrelavent columns that didn't follow the main framework of home, away, aggregate.

In [2083]:
cols = ['year', 'Week', 'Date', 'Winner_tie', 'home_away', 'Loser_tie', 'Unnamed: 7', 'WPts',
       'LPts', 'YdsW', 'TOW', 'YdsL', 'TOL']

for c in cols:
    df = df.drop(columns=c)

### One-hot matrixing categorical variables

The code below first counts the frequency of the values, selects all of them, then extracts the name of index. I call this the mask. Then get_dummies is used on the filtered categorical values using the mask that was just created. Finally, the new dummy columns are merged with the data frame and the old categorical columns are dropped.

In [2084]:
categorical_columns = ['schedule_date', 'schedule_week', 'schedule_playoff', 'team_home', 'team_away', 'team_favorite_id', 'stadium', 'stadium_neutral', 'weather_detail', 'Day', 'Time', 'Home_N', 'Away_N', 'Year']

for column in categorical_columns:
    vals = df[column]
    
    mask = pd.value_counts(vals).nlargest(len(categorical_columns)).index
    #IRRELEVANT INFO: mask = pd.value_counts(vals).nlargest(100).index #Counts the frequency of the values, selects the 100 largest by scope generosity, then extracts the name of index
    #print(column,mask)

    encoded_column = pd.get_dummies(pd.Categorical(vals, categories=mask), dtype=np.int64)

    df = pd.merge(left=df,right=encoded_column,left_index=True,right_index=True,)

    df = df.drop(columns=column)

Here I filled any NaN values in the data frame with zero.

In [2085]:
#this gives an error, but continue running other cells after running this one, but at least it works in changing
#the NaN values in the 5G (5 most recent games) features.
for col in df.columns:
    df[col]=df[col].fillna(df[col].mode()[0])

KeyError: 0

### Multiple Linear Regression #1

The code below displays the R^2, the adjusted R^2 value, and the first 20 coefficients of the linear regression equation of a linear regression model with target variables, spread_favorite and over_under_line, predictor variables containing every feature from the data frame, except the target variables and any variables that would have been unknown before the game of each row. The correlation between the target variables and the rest of the values in the data frame are shown below each linear regression cell.

In [2061]:
model = LinearRegression()
X = df.drop(['over_under_line', 'spread_favorite', 'score_home', 'score_away', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away'], axis = 1)
y = df['spread_favorite']
model.fit(X, y)
print(f"R^2: {model.score(X,y)}")
print(f"Adjusted R^2: {1 - (1-model.score(X, y))*(len(y)-1)/(len(y)-X.shape[1]-1)}")
print(model.coef_[0:20],'...')
#df = df.drop(['over_under_line', 'score_home', 'score_away', 'WPts', 'LPts', 'YdsW', 'YdsL', 'TOW', 'TOL', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away', ], axis = 1)

R^2: 0.5721744705585227
Adjusted R^2: -0.38095582288071794
[ 6.98544507e+10  4.62569825e-02  4.86838852e-02 -3.28479017e-02
  2.60448116e+10 -2.60448116e+10 -2.60448116e+10 -4.19627246e+10
  4.19627246e+10  4.19627246e+10  1.79630294e+10 -1.79630294e+10
 -1.79630294e+10  4.91994076e+10 -4.91994076e+10 -4.91994076e+10
 -4.28065430e+10  4.28065430e+10  4.28065430e+10  2.37440673e+08] ...


In [2062]:
df.corr()['spread_favorite'].sort_values(ascending=False)[0:25]#['STOA_away']

spread_favorite          1.000000
4:15PM                   0.145898
Oakland Coliseum         0.144405
5GPD_away                0.129908
weather_temperature      0.126258
11/14/2004               0.115749
5GPD_home                0.114634
Tennessee Titans_x       0.089580
Tennessee Titans_x       0.089580
8:35PM                   0.087561
1                        0.086557
10/31/2004               0.084715
weather_wind_mph         0.082987
2                        0.075297
Soldier Field            0.073916
EverBank Field           0.073916
2004                     0.073890
Yds_away                 0.072845
score_away               0.071894
8                        0.071266
Baltimore Ravens_y       0.070000
San Francisco 49ers_x    0.070000
Baltimore Ravens_y       0.070000
San Francisco 49ers_x    0.070000
4                        0.068269
Name: spread_favorite, dtype: float64

In [2063]:
y = df['over_under_line']
model.fit(X, y)
print(f"R^2: {model.score(X,y)}")
print(f"Adjusted R^2: {1 - (1-model.score(X, y))*(len(y)-1)/(len(y)-X.shape[1]-1)}")
print(model.coef_[0:20],'...')
#df = df.drop(['over_under_line', 'score_home', 'score_away', 'WPts', 'LPts', 'YdsW', 'YdsL', 'TOW', 'TOL', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away', ], axis = 1)

R^2: 0.7603120694676302
Adjusted R^2: 0.22632376853475566
[ 1.17853570e+11  5.96353844e-02  1.31538708e-02 -3.81317800e-02
  4.81131197e+09 -4.81131197e+09 -4.81131197e+09 -7.75186100e+09
  7.75186100e+09  7.75186100e+09  3.31834761e+09 -3.31834761e+09
 -3.31834761e+09  9.08870842e+09 -9.08870842e+09 -9.08870842e+09
 -7.90774132e+09  7.90774132e+09  7.90774132e+09  4.38629072e+07] ...


In [2064]:
df.corr()['over_under_line'].sort_values(ascending=False)[0:25]#['STOA_away']

over_under_line                 1.000000
DOME                            0.459762
Yds_away                        0.434906
IND                             0.395454
1:00PM                          0.371850
Yds_home                        0.355279
5GYA_home                       0.336483
5GYF_away                       0.331813
Indianapolis Colts_x            0.325816
Indianapolis Colts_x            0.325816
5GYA_away                       0.321733
5GPF_home                       0.319426
KC                              0.310738
5GPA_away                       0.308827
5GPA_home                       0.305350
score_home                      0.293524
score_away                      0.291394
5GYF_home                       0.262275
5GPF_away                       0.255978
MIN                             0.251795
SPA_home                        0.244134
SPF_home                        0.228067
Hubert H. Humphrey Metrodome    0.226629
Minnesota Vikings_x             0.226629
Minnesota Viking

### Results #1

The results from the four cells above tell me that clearly, the model is not very accurate in predicting the spread_favorite or the over_under_line for the 2004 season, but the over_underline has a better prediction. The purpose of integrating the extra stat values was to provide more correlation in the model.

I slightly achieved this by having values like 
5GYA_home
5GYF_away                       
5GYA_away                       
5GPF_home                       
5GPA_away
5GPA_home                      
5GYF_home                       
5GPF_away                                                    
SPA_home                        
SPF_home

which all correlated from 0.22-0.33 for the over_under_line. Still these values are not ideal for a linear regression model, which is usually only useful when there are multiple values of high correlation.

### Multiple Linear Regression #2

The code below displays the R^2, the adjusted R^2 value, and the first 20 coefficients of the linear regression equation of a linear regression model with target variables, spread_favorite and over_under_line and predictor variables containing every feature from the data frame, except the target variables and any variables that would have been unknown before the game of each row. There is a slight difference in these linear regression models and the models above: each of the features drawn used in the model do not contain values from the first five weeks of the season. The purpose of this was to find how the '5G' columns would correlate with the target variables and influence the model's R^2 and adjust R^2 values. The correlation between the target variables and the rest of the values in the data frame are shown below each linear regression cell.

In [2091]:
df = df[75:] #restricts the data frame to data past first five weeks.
#Run this instead of the four code cells above and then continue with running cells below for accurate results.

In [2087]:
model = LinearRegression()
X = df.drop(['over_under_line', 'spread_favorite', 'score_home', 'score_away', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away'], axis = 1)
y = df['spread_favorite']
model.fit(X, y)
print(f"R^2: {model.score(X,y)}")
print(f"Adjusted R^2: {1 - (1-model.score(X, y))*(len(y)-1)/(len(y)-X.shape[1]-1)}")
print(model.coef_[0:20],'...')
#df = df.drop(['over_under_line', 'score_home', 'score_away', 'WPts', 'LPts', 'YdsW', 'YdsL', 'TOW', 'TOL', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away', ], axis = 1)

R^2: 0.7366969383557912
Adjusted R^2: -10.848637773989395
[-3.15809516e-10  4.24788821e-02  5.30068485e-02 -1.04791659e-02
 -1.13951973e+11  1.13951973e+11  1.13951973e+11 -6.90095857e+10
  6.90095857e+10  6.90095857e+10  1.11760384e+10 -1.11760384e+10
 -1.11760384e+10  2.45597864e+10 -2.45597864e+10 -2.45597864e+10
 -3.45428246e+10  3.45428246e+10  3.45428246e+10  3.47171282e+08] ...


In [2088]:
df.corr()['spread_favorite'].sort_values(ascending=False)[0:25]#['STOA_away']

spread_favorite        1.000000
5GPD_home              0.215888
4:15PM                 0.179694
Oakland Coliseum       0.166177
11/14/2004             0.156600
5GPD_away              0.155893
KC                     0.122399
10/31/2004             0.119870
Soldier Field          0.107907
8                      0.104832
Tennessee Titans_x     0.102609
Tennessee Titans_x     0.102609
10/17/2004             0.101925
8:35PM                 0.094226
11/7/2004              0.094159
Yds_away               0.093842
6                      0.090486
Arizona Cardinals_x    0.086718
Arizona Cardinals_x    0.086718
weather_temperature    0.086619
SYD_home               0.086586
Sun                    0.083520
9                      0.079983
EverBank Field         0.076123
Dallas Cowboys_x       0.070826
Name: spread_favorite, dtype: float64

In [2089]:
y = df['over_under_line']
model.fit(X, y)
print(f"R^2: {model.score(X,y)}")
print(f"Adjusted R^2: {1 - (1-model.score(X, y))*(len(y)-1)/(len(y)-X.shape[1]-1)}")
print(model.coef_[0:20],'...')
#df = df.drop(['over_under_line', 'score_home', 'score_away', 'WPts', 'LPts', 'YdsW', 'YdsL', 'TOW', 'TOL', 'Yds_home', 'Yds_away', 'TO_home', 'TO_away', ], axis = 1)

R^2: 0.9445773216755396
Adjusted R^2: -1.4940205246007157
[-2.46879307e-09  7.28007522e-02 -3.79072609e-02 -2.26271170e-02
 -5.35949570e+10  5.35949570e+10  5.35949570e+10 -3.24572335e+10
  3.24572335e+10  3.24572335e+10  5.25641885e+09 -5.25641885e+09
 -5.25641885e+09  1.15511883e+10 -1.15511883e+10 -1.15511883e+10
 -1.62465042e+10  1.62465042e+10  1.62465042e+10  1.63284843e+08] ...


In [2090]:
df.corr()['over_under_line'].sort_values(ascending=False)[0:25]#['STOA_away']

over_under_line                 1.000000
DOME                            0.468888
Yds_away                        0.442880
IND                             0.416175
1:00PM                          0.378664
5GYA_away                       0.372467
5GYA_home                       0.365234
5GYF_away                       0.358366
5GPF_home                       0.354319
Yds_home                        0.354006
5GPA_away                       0.334817
KC                              0.329203
5GPA_home                       0.325814
Indianapolis Colts_x            0.318662
Indianapolis Colts_x            0.318662
SPA_home                        0.308636
5GPF_away                       0.307571
SPF_home                        0.283094
5GYF_home                       0.278855
SPA_away                        0.276424
score_away                      0.273835
score_home                      0.262626
MIN                             0.255169
Minnesota Vikings_x             0.255065
Hubert H. Humphr

Again, the results from the four cells above tell me that clearly, the model is not very accurate in predicting the spread_favorite or the over_under_line for the 2004 season, but again, the over_underline has a better prediction. The purpose of integrating the extra stat values was to provide more correlation in the model.

I slightly achieved this by having values like 
5GYA_home
5GYF_away                       
5GYA_away                       
5GPF_home                       
5GPA_away
5GPA_home                      
5GYF_home                       
5GPF_away                                                    
SPA_home                        
SPF_home
for the over_under_line which all correlated from 0.27-0.37. But this, again, is not ideal for a linear regression model.

Of course the model performed worse in these cases with less data (five weeks worth less),
but it makes sense that there was high correlation for the features starting with '5' because they did not
have any data for the first five weeks of the season

### Final Thoughts
This project did not give me the results I wanted, but I still learned plenty of valuable information about using and manipulating data frames. In specific, I gained a lot of applicable knowledge about keys, the relationship between predictor variables and target variables in a linear regression model (and models in general), and finally creating new numerical variables within a data frame. I'm feel very accomplished about the work I did on this project and the problems I solved throughout the process, and I hope to revisit the project sometime, maybe if I get into the world of sports betting.