# Build Fct Tables

I started by setting up the boxscore and scoring data tables in 'game_stats.ipynb'.

This file covers the other fact tables: currently one dedicated to betting/gambling odds and another for the record of each team.

## Imports

In [1]:
import pandas as pd
import json
import os
import requests
from datetime import datetime

from dotenv import load_dotenv

## API Credentials

In [2]:
#Get API Key and host
load_dotenv()
api_token = os.getenv('nfl_api_key')
api_host = os.getenv('rapid_api_host')

#rapidapi headers
headers = {
	"X-RapidAPI-Key": "{key}".format(key=api_token),
	"X-RapidAPI-Host": "{host}".format(host=api_host)
}

## Betting

For this, the endpoint is 'Get NFL Betting Odds'. I will use it to save off the different spreads and odds for each game, by vendor.

It takes a game ID, so we will have to loop through each game to pull all the records. Here I'll test with one game.
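
The per-game loop could look roughly like the sketch below. The names `collect_odds` and `fake_fetch` are hypothetical, the second game ID is made up, and treating a game with no odds as a missing key in the body is an assumption about the API rather than documented behavior. The fetch step is passed in as a function so the sketch runs without a network call.

```python
from typing import Callable, Dict, Iterable


def collect_odds(game_ids: Iterable[str],
                 fetch_body: Callable[[str], dict]) -> Dict[str, dict]:
    """Return {game_id: odds payload} for every game that came back with odds."""
    odds_by_game = {}
    for game_id in game_ids:
        body = fetch_body(game_id)
        # Assumption: a game without odds is simply absent from the body
        if game_id in body:
            odds_by_game[game_id] = body[game_id]
    return odds_by_game


# Stand-in for the real request so the sketch runs offline
def fake_fetch(game_id: str) -> dict:
    sample = {"20231001_ARI@SF": {"betmgm": {"totalOver": "44"}}}
    return {game_id: sample[game_id]} if game_id in sample else {}


# "20231001_AAA@BBB" is a made-up ID used to show the skip path
odds = collect_odds(["20231001_ARI@SF", "20231001_AAA@BBB"], fake_fetch)
```

In this notebook, `fetch_body` would presumably wrap `requests.get(url, headers=headers, params={"gameID": game_id}).json()['body']`.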

Unfortunately, it looks like only 2023 data for this is available at this point.

### Extract

In [3]:
#re-create sportsbook mapping from dim_tables. This will be imported in the final product.
sportsbook_dict = {
    "betmgm": 1,
    "bet365": 2,
    "fanduel": 3,
    "wynnbet": 4,
    "unibet": 5,
    "pointsbet": 6,
    "betrivers": 7,
    "ceasars_sportsbook": 8,
    "draftkings": 9   
}

In [4]:
#API Params
#API Endpoint for NFL Betting Odds
url = "https://tank01-nfl-live-in-game-real-time-statistics-nfl.p.rapidapi.com/getNFLBettingOdds"

#pull odds for a single game by its game ID
querystring = {"gameID":"20231001_ARI@SF"}

In [5]:
#get betting data
response = requests.get(url, headers=headers, params=querystring)

In [6]:
response.json()['body']['20231001_ARI@SF']['betmgm']

{'totalUnder': '44',
 'awayTeamSpread': '+14.5',
 'awayTeamSpreadOdds': '-110',
 'homeTeamSpread': '-14.5',
 'homeTeamSpreadOdds': '-110',
 'totalOverOdds': '-110',
 'totalUnderOdds': '-110',
 'awayTeamMLOdds': '+700',
 'homeTeamMLOdds': '-1100',
 'totalOver': '44'}

### Transform

This will be fun. I think I need to iterate through the key-value pairs in the sportsbook dict to save the primary key and drill into each record.

I will also need to include the game id from the API call itself and the timestamp from the root of the body.

If more sportsbooks get added to the API in the future, I'll need to catch them and add them to the list, maybe by comparing the number of sportsbooks returned against our dim_sportsbook table.
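
One way to catch new sportsbooks is to compare the keys in a game's payload against the mapping. This is only a sketch: `find_unmapped_sportsbooks` is a hypothetical helper, `new_book` is an invented vendor, and it assumes every dict-valued entry under a game is a sportsbook while scalar entries such as `last_updated_e_time` are metadata.

```python
def find_unmapped_sportsbooks(game_odds: dict, known: dict) -> list:
    """List sportsbook keys in the payload that have no ID in the mapping."""
    # Assumption: dict-valued entries are sportsbooks, scalars are metadata
    return sorted(
        name for name, payload in game_odds.items()
        if isinstance(payload, dict) and name not in known
    )


sample_odds = {
    "last_updated_e_time": "1696202171.0530152",
    "betmgm": {"totalOver": "44"},
    "new_book": {"totalOver": "44"},  # hypothetical vendor, not from the API
}
known_books = {"betmgm": 1, "bet365": 2}

unmapped = find_unmapped_sportsbooks(sample_odds, known_books)
```

A non-empty result would be the signal to add rows to dim_sportsbook before loading.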

In [7]:
#setup body of json
game_odds = response.json()['body']['20231001_ARI@SF']

In [8]:
#view path for updated time
game_odds['last_updated_e_time']

'1696202171.0530152'

In [9]:
#view path for a sportsbook
game_odds['betmgm']

{'totalUnder': '44',
 'awayTeamSpread': '+14.5',
 'awayTeamSpreadOdds': '-110',
 'homeTeamSpread': '-14.5',
 'homeTeamSpreadOdds': '-110',
 'totalOverOdds': '-110',
 'totalUnderOdds': '-110',
 'awayTeamMLOdds': '+700',
 'homeTeamMLOdds': '-1100',
 'totalOver': '44'}

In [10]:
#setup empty list for saving the data in individual records
betting_data_list = []

In [11]:
#loop through sportsbook dict to create data for each game
for key, value in sportsbook_dict.items():
    
    #set updated time
    updated_time = game_odds['last_updated_e_time']

    try: #try to grab data for each sportsbook, but pass if that sportsbook doesn't have a record
        #drill into sportsbook
        sportsbook_data = game_odds[key]

        #create dictionary to append to list
        betting_odds = {
            "game_id": "20231001_ARI@SF",
            "last_update": updated_time,
            "sportsbook_id": value,
            "total_over_under": sportsbook_data['totalOver'],
            "over_odds": sportsbook_data['totalOverOdds'],
            "under_odds": sportsbook_data['totalUnderOdds'],
            "home_spread": sportsbook_data['homeTeamSpread'],
            "home_odds": sportsbook_data['homeTeamSpreadOdds'],
            "away_spread": sportsbook_data['awayTeamSpread'],
            "away_odds": sportsbook_data['awayTeamSpreadOdds'],
            "home_ml_odds": sportsbook_data['homeTeamMLOdds'],
            "away_ml_odds": sportsbook_data['awayTeamMLOdds']
        }

        betting_data_list.append(betting_odds)
    
    except KeyError:
        pass

In [13]:
dtype_mapping = {
    "game_id": 'object',
    "last_update": 'float',  # Update to datetime after mapping
    "sportsbook_id": 'int64', 
    "total_over_under": 'float',
    "over_odds": 'object',
    "under_odds": 'object',
    "home_spread": 'object',
    "home_odds": 'object',
    "away_spread": 'object',
    "away_odds": 'object',
    "home_ml_odds": 'object',
    "away_ml_odds": 'object'
} 

In [14]:
betting_df = pd.DataFrame(betting_data_list)
betting_df = betting_df.astype(dtype_mapping)

In [15]:
betting_df['last_update'] = pd.to_datetime(betting_df['last_update'], unit='s')
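
The mapping above keeps the spreads and odds as strings. If numeric columns turn out to be more useful downstream, a cast along these lines should work, since Python's float parsing accepts a leading `+` or `-`. The `sample` frame and `numeric_cols` list are illustrative and not part of the notebook.

```python
import pandas as pd

# Two illustrative rows shaped like the records built above
sample = pd.DataFrame({
    "home_spread": ["-14.5", "-15"],
    "away_spread": ["+14.5", "+15"],
    "home_ml_odds": ["-1100", "-1100"],
    "away_ml_odds": ["+700", "+700"],
})

numeric_cols = ["home_spread", "away_spread", "home_ml_odds", "away_ml_odds"]

# Cast each signed string column to float; the sign is preserved
sample = sample.astype({col: "float" for col in numeric_cols})
```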

In [16]:
betting_df.head()

| | game_id | last_update | sportsbook_id | total_over_under | over_odds | under_odds | home_spread | home_odds | away_spread | away_odds | home_ml_odds | away_ml_odds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20231001_ARI@SF | 2023-10-01 23:16:11.053015296 | 1 | 44.0 | -110 | -110 | -14.5 | -110 | 14.5 | -110 | -1100 | 700 |
| 1 | 20231001_ARI@SF | 2023-10-01 23:16:11.053015296 | 2 | 44.0 | -110 | -110 | -15.0 | -110 | 15.0 | -110 | -1100 | 700 |
| 2 | 20231001_ARI@SF | 2023-10-01 23:16:11.053015296 | 3 | 43.5 | -114 | -106 | -14.5 | -115 | 14.5 | -105 | -1200 | 750 |
| 3 | 20231001_ARI@SF | 2023-10-01 23:16:11.053015296 | 5 | 44.0 | -106 | -115 | -14.5 | -109 | 14.5 | -112 | -1000 | 700 |
| 4 | 20231001_ARI@SF | 2023-10-01 23:16:11.053015296 | 6 | 44.0 | -110 | -110 | -16.0 | -110 | 16.0 | -110 | -1099 | 700 |


In [12]:
betting_data_list[:2]

[{'game_id': '20231001_ARI@SF',
  'last_update': '1696202171.0530152',
  'sportsbook_id': 1,
  'total_over_under': '44',
  'over_odds': '-110',
  'under_odds': '-110',
  'home_spread': '-14.5',
  'home_odds': '-110',
  'away_spread': '+14.5',
  'away_odds': '-110',
  'home_ml_odds': '-1100',
  'away_ml_odds': '+700'},
 {'game_id': '20231001_ARI@SF',
  'last_update': '1696202171.0530152',
  'sportsbook_id': 2,
  'total_over_under': '44',
  'over_odds': '-110',
  'under_odds': '-110',
  'home_spread': '-15',
  'home_odds': '-110',
  'away_spread': '+15',
  'away_odds': '-110',
  'home_ml_odds': '-1100',
  'away_ml_odds': '+700'}]

That looks pretty good for now.

## Record

The same endpoint we pulled the teams from also has the record information. I could instead derive the records from all of my game data, but the endpoint returns every team's record in one API call, which seems like it should be less of a processing workload than calculating it in the background. I'll go with that.

The data doesn't say which weeks of the season it covers, and we can't infer the week from the record because bye weeks vary by team. My idea is to add the datetime the data was run; then, when aggregating all the data, I can check the run date and flag only the most recent run as active. If I want to join it to another table, I can filter to the record data flagged as the most recent.
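
A rough sketch of that flagging idea in pandas follows. The rows are invented and the `is_active` column name is an assumption; only `team_id`, `season`, `updated_datetime`, and `wins` come from the record structure used in this notebook.

```python
import pandas as pd

# Invented snapshots: two runs for each of two teams
records = pd.DataFrame({
    "team_id": [1, 1, 2, 2],
    "season": [2023, 2023, 2023, 2023],
    "updated_datetime": pd.to_datetime([
        "2023-09-23 20:00", "2023-09-30 20:15",
        "2023-09-23 20:00", "2023-09-30 20:15",
    ]),
    "wins": [0, 1, 1, 2],
})

# Latest run timestamp per team and season, broadcast back to every row
latest_run = records.groupby(["team_id", "season"])["updated_datetime"].transform("max")

# Only rows from the latest run are flagged as active
records["is_active"] = records["updated_datetime"] == latest_run

# Filter to the active rows before joining to another table
active_records = records[records["is_active"]]
```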

I do need to find a way to flag the season. I guess I could do this for the week too, but doing it for the season, which doesn't change very often, seems way easier!
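
One possible way to flag the season is to derive it from the run timestamp. The sketch below assumes a season is labeled by the year it kicks off in and that any run before March belongs to the previous year's season; `nfl_season` and the `rollover_month` cutoff are hypothetical and should be checked against the actual schedule.

```python
from datetime import datetime


def nfl_season(run_time: datetime, rollover_month: int = 3) -> int:
    """Label a run with the year its season started in."""
    # Assumption: January and February runs still belong to the prior season
    if run_time.month >= rollover_month:
        return run_time.year
    return run_time.year - 1


season_in_september = nfl_season(datetime(2023, 9, 30))
season_in_january = nfl_season(datetime(2024, 1, 14))
```

Both example dates resolve to the 2023 season under this rule.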

### Extract

In [4]:
#API Params
#API Endpoint for NFL Team Info
url = "https://tank01-nfl-live-in-game-real-time-statistics-nfl.p.rapidapi.com/getNFLTeams"

#pull in only main team info, none of the rosters, schedules, stats, etc.
querystring = {"rosters":"false","schedules":"false","topPerformers":"false","teamStats":"false"}

In [5]:
#get team data
response = requests.get(url, headers=headers, params=querystring)

In [7]:
#view details
response.json()['body'][0]

{'teamAbv': 'ARI',
 'teamCity': 'Arizona',
 'currentStreak': {'result': 'W', 'length': '1'},
 'loss': '2',
 'teamName': 'Cardinals',
 'nflComLogo1': 'https://static.www.nfl.com/image/private/f_auto/league/u9fltoslqdsyao8cpm0k',
 'teamID': '1',
 'tie': '0',
 'pa': '67',
 'pf': '72',
 'espnLogo1': 'https://a.espncdn.com/combiner/i?img=/i/teamlogos/nfl/500/ari.png',
 'wins': '1'}

### Transform

In [10]:
team_data = response.json()['body']

In [12]:
record_list = []

In [14]:
refresh_timestamp = datetime.now()

In [17]:
#loop through each team to save their record data
for team in team_data:
    #save dictionary
    record_dict = {
        "team_id": team['teamID'],
        "updated_datetime": refresh_timestamp,
        "season": 2023,
        "wins": team['wins'],
        "loses": team['loss'],
        "ties": team['tie'],
        "points_for": team['pf'],
        "points_against": team['pa']
    }

    record_list.append(record_dict)

In [18]:
record_list[:2]

[{'team_id': '1',
  'updated_datetime': datetime.datetime(2023, 9, 30, 20, 15, 38, 136250),
  'season': 2023,
  'wins': '1',
  'loses': '2',
  'ties': '0',
  'points_for': '72',
  'points_against': '67'},
 {'team_id': '2',
  'updated_datetime': datetime.datetime(2023, 9, 30, 20, 15, 38, 136250),
  'season': 2023,
  'wins': '2',
  'loses': '1',
  'ties': '0',
  'points_for': '55',
  'points_against': '54'}]
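
The API returns the counts as strings, so a cast like the one used for the betting table would probably be needed here too. This is a sketch on a single row copied from the output above (with the timestamp truncated to whole seconds); `sample_records`, `int_cols`, and `record_df` are illustrative names.

```python
from datetime import datetime

import pandas as pd

# One record in the same shape as record_list
sample_records = [
    {"team_id": "1",
     "updated_datetime": datetime(2023, 9, 30, 20, 15, 38),
     "season": 2023,
     "wins": "1", "loses": "2", "ties": "0",
     "points_for": "72", "points_against": "67"},
]

int_cols = ["team_id", "wins", "loses", "ties", "points_for", "points_against"]

record_df = pd.DataFrame(sample_records)
# Convert the string counts to integers so they can be summed and compared
record_df = record_df.astype({col: "int64" for col in int_cols})
```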