# NFL Gambling

### Problem Statement:

I will build a model that predicts the outcome of NFL games in an attempt to beat the bookmaker's margin. The guiding metrics will be model accuracy above the level needed to overcome that margin, and positive betting results.
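
To make "accuracy above the margin" concrete, here is a minimal sketch of the break-even win rate for flat one-unit bets. The -110 price is an assumed example of a typical spread price, not a value taken from the data, and the function names are hypothetical.

```python
def break_even_win_rate(decimal_odds: float) -> float:
    """Win rate at which flat bets at these decimal odds return zero profit."""
    # A winning 1-unit bet earns (decimal_odds - 1); a losing bet costs 1 unit.
    # Solving p * (odds - 1) - (1 - p) = 0 for p gives p = 1 / odds.
    return 1.0 / decimal_odds


def flat_bet_profit(wins: int, losses: int, decimal_odds: float) -> float:
    """Total profit in units from 1-unit bets all placed at the same decimal odds."""
    return wins * (decimal_odds - 1.0) - losses


# Assumed example price: American -110, which is 1 + 100/110 in decimal odds.
odds = 1 + 100 / 110
print(f"break-even win rate: {break_even_win_rate(odds):.4f}")
print(f"profit at 55 wins / 45 losses: {flat_bet_profit(55, 45, odds):.2f} units")
```

Under that assumed price, a model has to pick winners against the spread noticeably more often than a coin flip before it shows a profit.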

### Methods/Models:

I will use historical betting lines, historical stats, and team rankings to build a classification model that predicts 1 or 0 for whether the home team covers the line. I will also try a regression model that predicts the line for each game. Using these predictions, I will build a model to pick which matchups to bet on.

### Risks & Assumptions:

Risk: NFL games may be random enough, and betting lines sophisticated enough, that I won't be able to build a model that beats the bookmaker's margin.

### Data

In [23]:
import pandas as pd
import numpy as np
import datetime as dt

# show up to 500 columns and rows when displaying DataFrames
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 500)

import warnings
warnings.filterwarnings('ignore')

In [24]:
df_all = pd.read_csv('./nfl (1).csv')
df_teams = pd.read_csv('./nfl_teams.csv')
df_extras = pd.read_csv('./spreadspoke_scores.csv')

In [25]:
df_all.columns = [x.lower().replace(' ', '_') for x in df_all.columns]

In [26]:
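# Keep only the needed columns; .copy() makes df an independent frame so later column assignments are safe
df = df_all[['date', 'home_team', 'away_team', 'home_score', 'away_score', 'home_odds_open',
            'away_odds_open', 'home_line_open', 'total_score_open']].copy()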

In [27]:
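# Convert decimal odds to American money lines: underdogs (odds > 2) are positive, favorites negative
df['home_money_line'] = [round((i - 1) * 100) if i > 2 else round(-100 / (i - 1)) for i in df['home_odds_open']]
df['away_money_line'] = [round((i - 1) * 100) if i > 2 else round(-100 / (i - 1)) for i in df['away_odds_open']]
# Overround: sum of both implied probabilities in percent (100 would mean no bookmaker margin)
df['bookmaker_margin'] = round(((1 / df['home_odds_open']) * 100) + ((1 / df['away_odds_open']) * 100), 2)
# Map full team names to team ids
team_ids = df_teams.set_index('team_name')['team_id'].to_dict()
df['home_team_id'] = df['home_team'].map(team_ids)
df['away_team_id'] = df['away_team'].map(team_ids)
# Note: a tied game is assigned to the away team here
df['winner'] = np.where(df['home_score'] > df['away_score'], df['home_team_id'], df['away_team_id'])
# Actual result in spread terms (negative means the home team won)
df['home_line_actual'] = df['away_score'] - df['home_score']
df['home_line_diff'] = df['home_line_actual'] - df['home_line_open']
# 1 if the home team covered the opening line; a push (result equal to the line) counts as 0
df['home_line_cover'] = np.where(df['home_line_actual'] < df['home_line_open'], 1, 0)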
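
As a quick check of the odds conversion, the sketch below applies the same formulas to a single pair of decimal odds (2.06 and 1.85, the opening prices of the 2017-11-19 Packers vs. Ravens row in the data). The function names are hypothetical helpers, not part of the notebook, and `no_vig_probabilities` is an extra illustration that the notebook itself does not compute.

```python
def decimal_to_american(decimal_odds: float) -> int:
    """Convert decimal odds to an American money line."""
    if decimal_odds > 2:
        # Underdog: profit on a 100-unit stake
        return round((decimal_odds - 1) * 100)
    # Favorite (or even money): stake needed to win 100 units, shown as a negative number
    return round(-100 / (decimal_odds - 1))


def overround_pct(home_odds: float, away_odds: float) -> float:
    """Sum of both implied probabilities in percent; anything above 100 is the bookmaker's margin."""
    return round(100 / home_odds + 100 / away_odds, 2)


def no_vig_probabilities(home_odds: float, away_odds: float) -> tuple:
    """Implied probabilities rescaled to sum to 1 (one simple way to strip the margin)."""
    raw_home, raw_away = 1 / home_odds, 1 / away_odds
    total = raw_home + raw_away
    return raw_home / total, raw_away / total


home_odds, away_odds = 2.06, 1.85
print(decimal_to_american(home_odds), decimal_to_american(away_odds))
print(overround_pct(home_odds, away_odds))
print(no_vig_probabilities(home_odds, away_odds))
```

The money lines and margin this prints can be compared against the `home_money_line`, `away_money_line`, and `bookmaker_margin` columns for that game.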

In [28]:
df['date'] = pd.to_datetime(df['date'])

In [29]:
df['month'] = df['date'].dt.month
df['year'] = df['date'].dt.year
df['day'] = df['date'].dt.day

In [30]:
df_extras = df_extras.rename(columns={'schedule_date': 'date', 'team_home': 'home_team', 'team_away': 'away_team'})

In [56]:
df_extras[(df_extras['home_team']=='Jacksonville Jaguars') & (df_extras['away_team']=='Baltimore Ravens')] 

Unnamed: 0,date,schedule_season,schedule_week,home_team,away_team,stadium,team_favorite_id,spread_favorite,over_under_line,weather_detail,weather_temperature,weather_wind_mph,weather_humidity,score_home,score_away,stadium_neutral,schedule_playoff
6480,11/10/1996,1996,11,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-3.5,44.5,,53.0,8.0,47.0,30.0,27.0,False,False
6778,11/30/1997,1997,14,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-8.0,43.5,,71.0,14.0,89.0,29.0,27.0,False,False
6877,09/20/1998,1998,3,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-7.0,42.0,,81.0,9.0,90.0,24.0,10.0,False,False
7228,11/14/1999,1999,11,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-13.0,37.0,,64.0,6.0,64.0,6.0,3.0,False,False
7432,10/08/2000,2000,6,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-2.5,37.5,,71.0,12.0,79.0,10.0,15.0,False,False
7762,11/25/2001,2001,11,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,BAL,-3.0,34.0,,72.0,8.0,80.0,21.0,24.0,False,False
8809,11/13/2005,2005,10,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,JAX,-8.0,34.0,,70.0,6.0,66.0,30.0,3.0,False,False
10377,10/24/2011,2011,7,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,BAL,-10.5,39.0,,69.0,0.0,77.0,12.0,7.0,False,False
11649,09/25/2016,2016,3,Jacksonville Jaguars,Baltimore Ravens,EverBank Field,BAL,-2.0,45.0,,87.0,4.0,,17.0,19.0,False,False
11915,09/24/2017,2017,3,Jacksonville Jaguars,Baltimore Ravens,Wembley Stadium,BAL,-4.0,38.0,,57.0,4.0,,44.0,7.0,True,False


In [55]:
# Move the Wembley game (row 11915) from 09/24/2017 to 09/23/2017, presumably to match the date used in the odds data.
# Use a single .loc[row, column] call; the chained form df_extras.loc[11915]['date'] = ... does not reliably update df_extras.
df_extras.loc[11915, 'date'] = '09/23/2017'
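
A chained form such as `df.loc[row]['col'] = value` generally writes to a temporary object rather than the original frame, and calling `.replace()` on a single cell only returns a new string. A minimal sketch of the single-call alternative, using a toy stand-in for `df_extras` (the frame name `toy` is hypothetical; its values mirror two rows shown above):

```python
import pandas as pd

# Toy stand-in for df_extras with the same index labels as the rows shown above
toy = pd.DataFrame(
    {
        "date": ["09/25/2016", "09/24/2017"],
        "stadium": ["EverBank Field", "Wembley Stadium"],
    },
    index=[11649, 11915],
)

# One indexing call with (row label, column label) writes into the frame itself
toy.loc[11915, "date"] = "09/23/2017"

# Equivalent mask-based form: select the rows by condition, then name the column
toy.loc[toy["stadium"] == "Wembley Stadium", "date"] = "09/23/2017"

print(toy)
```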

In [9]:
df_extras['date'] = pd.to_datetime(df_extras['date'])

In [10]:
df = df.merge(df_extras.drop_duplicates(subset=['date', 'home_team']), how='outer', on=['date', 'home_team', 'away_team'])
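
Because the outer merge keeps rows that fail to match on `date`, `home_team`, and `away_team`, it can be useful to see which rows those are. The sketch below uses the `indicator=True` option of `DataFrame.merge` on toy frames; the frame names and the second row's mismatched date and over/under value are invented for illustration.

```python
import pandas as pd

# Toy odds data (stand-in for df)
odds = pd.DataFrame({
    "date": pd.to_datetime(["2017-09-10", "2017-09-30"]),
    "home_team": ["Cincinnati Bengals", "Miami Dolphins"],
    "away_team": ["Baltimore Ravens", "New Orleans Saints"],
    "home_line_open": [-3.0, 2.5],
})

# Toy extras data (stand-in for df_extras); the second date is off by one day on purpose
extras = pd.DataFrame({
    "date": pd.to_datetime(["2017-09-10", "2017-10-01"]),
    "home_team": ["Cincinnati Bengals", "Miami Dolphins"],
    "away_team": ["Baltimore Ravens", "New Orleans Saints"],
    "over_under_line": [42.5, 50.0],
})

# indicator=True adds a _merge column: 'both', 'left_only', or 'right_only'
merged = odds.merge(extras, how="outer", on=["date", "home_team", "away_team"], indicator=True)

# Rows that exist in only one of the two sources
unmatched = merged[merged["_merge"] != "both"]
print(merged["_merge"].value_counts())
print(unmatched[["date", "home_team", "away_team", "_merge"]])
```

Rows flagged `left_only` or `right_only` are the ones that end up with missing values on one side, which makes date mismatches easy to spot before modeling.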

In [11]:
df.set_index('date', inplace=True)

In [22]:
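# Select the 2017 rows by partial date string on the DatetimeIndex; .loc is required in current pandas versions
df.loc['2017'].sort_values('home_score')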

date,home_team,away_team,home_score,away_score,home_odds_open,away_odds_open,home_line_open,total_score_open,home_money_line,away_money_line,bookmaker_margin,home_team_id,away_team_id,winner,home_line_actual,home_line_diff,home_line_cover,month,year,day,schedule_season,schedule_week,stadium,team_favorite_id,spread_favorite,over_under_line,weather_detail,weather_temperature,weather_wind_mph,weather_humidity,score_home,score_away,stadium_neutral,schedule_playoff
2017-11-19,Green Bay Packers,Baltimore Ravens,0.0,23.0,2.06,1.85,2.0,38.5,106.0,-118.0,102.6,GB,BAL,BAL,23.0,21.0,0.0,11.0,2017.0,19.0,2017.0,11,Lambeau Field,BAL,-2.0,38.0,,27.0,15.0,,0.0,23.0,False,False
2017-12-23,Green Bay Packers,Minnesota Vikings,0.0,16.0,4.3,1.26,9.0,40.5,330.0,-385.0,102.62,GB,MIN,MIN,16.0,7.0,0.0,12.0,2017.0,23.0,2017.0,16,Lambeau Field,MIN,-8.5,41.0,,10.0,6.0,,0.0,16.0,False,False
2017-09-10,Cincinnati Bengals,Baltimore Ravens,0.0,20.0,1.7,2.29,-3.0,42.5,-143.0,129.0,102.49,CIN,BAL,BAL,20.0,23.0,0.0,9.0,2017.0,10.0,2017.0,1,Paul Brown Stadium,CIN,-3.0,42.5,,71.0,8.0,,0.0,20.0,False,False
2017-09-30,Miami Dolphins,New Orleans Saints,0.0,20.0,2.25,1.72,2.5,50.0,125.0,-139.0,102.58,MIA,NO,NO,20.0,17.5,0.0,9.0,2017.0,30.0,,,,,,,,,,,,,,
2017-10-22,Indianapolis Colts,Jacksonville Jaguars,0.0,27.0,2.44,1.63,3.0,43.5,144.0,-159.0,102.33,IND,JAX,JAX,27.0,24.0,0.0,10.0,2017.0,22.0,2017.0,7,Lucas Oil Stadium,JAX,-3.0,41.5,DOME,72.0,0.0,,0.0,27.0,False,False
2017-12-31,Philadelphia Eagles,Dallas Cowboys,0.0,6.0,2.4,1.65,0.0,43.0,140.0,-154.0,102.27,PHI,DAL,DAL,6.0,6.0,0.0,12.0,2017.0,31.0,2017.0,17,Lincoln Financial Field,DAL,-3.5,41.0,,19.0,17.0,,0.0,6.0,False,False
2017-12-03,Buffalo Bills,New England Patriots,3.0,23.0,4.38,1.26,7.5,49.5,338.0,-385.0,102.2,BUF,NE,NE,20.0,12.5,0.0,12.0,2017.0,3.0,2017.0,13,New Era Field,NE,-9.0,48.0,,48.0,13.0,,3.0,23.0,False,False
2017-10-29,Tampa Bay Buccaneers,Carolina Panthers,3.0,17.0,1.75,2.2,-2.5,44.0,-133.0,120.0,102.6,TB,CAR,CAR,14.0,16.5,0.0,10.0,2017.0,29.0,2017.0,8,Raymond James Stadium,TB,-2.0,46.0,,68.0,18.0,,3.0,17.0,False,False
2017-09-10,San Francisco 49ers,Carolina Panthers,3.0,23.0,3.31,1.38,6.0,48.0,231.0,-263.0,102.68,SF,CAR,CAR,20.0,14.0,0.0,9.0,2017.0,10.0,2017.0,1,Levi's Stadium,CAR,-5.5,48.0,,89.0,4.0,,3.0,23.0,False,False
2017-12-25,Houston Texans,Pittsburgh Steelers,6.0,34.0,4.46,1.25,10.0,44.0,346.0,-400.0,102.42,HOU,PIT,PIT,28.0,18.0,0.0,12.0,2017.0,25.0,2017.0,16,NRG Stadium,PIT,-8.0,45.5,DOME,72.0,0.0,,6.0,34.0,False,False
