# Scrape ESPN Fantasy Football League Utility Notebook <a id="return"></a>

This notebook contains all the dictionaries and all the functions used in the 00-scrape_espn_ff_api_v3.ipynb notebook.
<br><br/>

**Notebook Sections:**
1. [Import Packages](#section1)
2. [Create Dictionaries for Owner Names, NFL Team Names, Scoring Codes, etc.](#section2)
3. [Functions for Data Ingestion](#section3)
4. [Functions for Rosters Dataframe](#section4)
5. [Functions for Matchups Dataframe](#section5)

## Import Packages <a id="section1"></a>

In [1]:
# increase cell width of this notebook
from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

In [2]:
# import needed packages
import numpy as np
import pandas as pd
import json
import os
import re, requests, bs4, csv, datetime
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
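
`HTTPAdapter` and `Retry` are imported above, but the scraping code later in this notebook calls `requests.get` directly. As a minimal sketch, assuming transient HTTP errors from ESPN are worth retrying, the two could be wired into a session as shown here. The helper name `make_retry_session` and the retry settings are illustrative assumptions, not part of the original notebook.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retry_session(total=3, backoff_factor=0.5):
    """Hypothetical helper: build a requests.Session that retries failed requests."""
    retry = Retry(
        total=total,                                   # maximum number of retries
        backoff_factor=backoff_factor,                 # growing sleep between attempts
        status_forcelist=[429, 500, 502, 503, 504],    # retry on these HTTP statuses
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session


session = make_retry_session()

# inspect the adapter that would handle an ESPN request (no network call is made)
adapter = session.get_adapter("https://fantasy.espn.com/")
print(adapter.max_retries.total)
```

A session built this way could stand in for the bare `requests.get` call, since `session.get` accepts the same `params` and `cookies` arguments.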

## Create Dictionaries for Owner Names, NFL Team Names, Scoring Codes, etc. <a id="section2"></a>

This section creates dictionaries for:
1. Fantasy Football Owner Team Codes
2. Lineup Slot Codes
3. Position Codes
4. NFL Team Codes
5. NFL Player Stat Codes
6. Fantasy Football Scoring Codes
7. Fantasy Football Scoring Values (likely won't need to use or modify)
<br><br/>

Still missing player stat codes for:
* FG 60+ yards
* Fumble recovered for a TD vs. fumble returned for a TD (are these separate codes?)
<br><br/>

Source: https://github.com/mkreiser/ESPN-Fantasy-Football-API/blob/master/src/player-stats/player-stats.js
<br><br/>

[Return to Top](#return)

In [None]:
# create dictionary of owner ids, owner team names, and owner names
owner_team_codes = {1:  ['Happy Rock Homewreckers', 'Blainer'],               
                    2:  ["Bench Don't Kill My Vibe", 'Padge'],
                    4:  ['Seattle rainier riot', 'Boob'],
                    6:  ['Sticky Icky', 'T-$'],
                    7:  ['Springfield Atoms', 'Duvi'],
                    8:  ['Beacon Hill Posterizers', 'Bup'],
                    9:  ['Brookside Shokunin', 'Cheese'], 
                    10: ['CoMo FightinCamlToes', 'Doisy'],
                    11: ['Pixel Whippers','Sembower'],
                    15: ['Bud Lathrop Drive', 'Farmer']
}

# create dictionary of lineup slot ids and lineup names
lineup_slot_codes = {0:  'QB',
                     2:  'RB',
                     3:  'Flex',
                     4:  'WR',
                     6:  'TE', 
                     16: 'Def', 
                     17: 'K',
                     20: 'Bench', 
                     21: 'IR',
                     23: 'Flex'
}

# create dictionary of position ids and position names
position_codes = {1:  'QB',
                  2:  'RB',
                  3:  'WR',
                  4:  'TE',
                  5:  'K',
                  16: 'DEF'
}

# create dictionary of team ids and team names
pro_team_codes = {0:  ['Free Agent', np.nan],
                  1:  ['Atlanta Falcons', 'ATL'],
                  2:  ['Buffalo Bills', 'BUF'],
                  3:  ['Chicago Bears', 'CHI'],
                  4:  ['Cincinnati Bengals', 'CIN'],
                  5:  ['Cleveland Browns', 'CLE'],
                  6:  ['Dallas Cowboys', 'DAL'],
                  7:  ['Denver Broncos', 'DEN'],
                  8:  ['Detroit Lions', 'DET'],
                  9:  ['Green Bay Packers', 'GB'],
                  10: ['Tennessee Titans', 'TEN'],
                  11: ['Indianapolis Colts', 'IND'],
                  12: ['Kansas City Chiefs', 'KC'],
                  13: ['Las Vegas Raiders', 'LV'],
                  14: ['Los Angeles Rams', 'LA'],
                  15: ['Miami Dolphins', 'MIA'],
                  16: ['Minnesota Vikings', 'MIN'],
                  17: ['New England Patriots', 'NE'],
                  18: ['New Orleans Saints', 'NO'],
                  19: ['New York Giants', 'NYG'],
                  20: ['New York Jets', 'NYJ'],
                  21: ['Philadelphia Eagles', 'PHI'],
                  22: ['Arizona Cardinals', 'ARI'],
                  23: ['Pittsburgh Steelers', 'PIT'],
                  24: ['Los Angeles Chargers', 'LAC'],
                  25: ['San Francisco 49ers', 'SF'],
                  26: ['Seattle Seahawks', 'SEA'],
                  27: ['Tampa Bay Buccaneers', 'TB'],
                  28: ['Washington Commanders', 'WAS'],
                  29: ['Carolina Panthers', 'CAR'],
                  30: ['Jacksonville Jaguars', 'JAX'],
                  33: ['Baltimore Ravens', 'BAL'],
                  34: ['Houston Texans', 'HOU']
}

# create dictionary of real game statistics codes
player_stat_codes = {0:   'pass_att',
                     1:   'pass_comp',
                     2:   'pass_incomp',
                     3:   'pass_yrd',
                     4:   'pass_td',
                     5:   'pass_5_yrd',
                     6:   'unk6',
                     7:   'unk7',
                     8:   'unk8',
                     9:   'unk9',
                     10:  'unk10',
                     11:  'unk11',
                     12:  'unk12',
                     13:  'unk13',
                     14:  'unk14',
                     15:  'unk15',
                     16:  'pass_50_yrd_td',
                     17:  'pass_yrd_300_399',
                     18:  'pass_yrd_400+',
                     19:  'pass_2pt_con',
                     20:  'pass_int',
                     21:  'unk21',
                     22:  'pass_yrd_dupe',
                     23:  'rush_att',
                     24:  'rush_yrd',
                     25:  'rush_td',
                     26:  'rush_2pt_con',
                     27:  'rush_5_yrd',
                     28:  'unk28',
                     29:  'unk29',
                     30:  'unk30',
                     31:  'unk31',
                     32:  'unk32',
                     33:  'unk33',
                     34:  'unk34',
                     35:  'unk35',
                     36:  'rush_50_yrd_td',
                     37:  'rush_yrd_100_199',
                     38:  'rush_yrd_200+',
                     39:  'unk39',
                     40:  'unk40',
                     41:  'receptions_dupe',
                     42:  'rec_yrd',
                     43:  'rec_td',
                     44:  'rec_2pt_con',
                     45:  'unk45',
                     46:  'rec_50_yrd_td',
                     47:  'rec_5_yrd',
                     48:  'unk48',
                     49:  'unk49',
                     50:  'unk50',
                     51:  'unk51',
                     52:  'unk52',
                     53:  'receptions',
                     54:  'unk54',
                     55:  'unk55',
                     56:  'rec_yrd_100_199',
                     57:  'rec_yrd_200+',
                     58:  'rec_tar',
                     59:  'yac',
                     60:  'yrd_per_rec',
                     61:  'rec_yrd_dupe',
                     62:  'unk62',
                     64:  'unk64',
                     65:  'unk65',
                     66:  'unk66',
                     67:  'unk67',
                     68:  'unk68',
                     69:  'unk69',
                     70:  'unk70',
                     71:  'unk71',
                     72:  'fum_lost',
                     73:  'unk73',
                     74:  'fg_made_50+',
                     75:  'unk75',
                     76:  'unk76',
                     77:  'fg_made_40_49',
                     78:  'unk78',
                     79:  'fg_miss_40_49',
                     80:  'fg_made_0_39',
                     81:  'unk81',
                     82:  'fg_miss_0_39',
                     83:  'fg_con',
                     84:  'fg_att',  
                     85:  'fg_miss_tot',
                     86:  'pat_con',
                     87:  'pat_att',
                     88:  'pat_miss_tot',
                     89:  'def_st_0_pts_alw',
                     90:  'def_st_1_6_pts_alw',
                     91:  'def_st_7_13_pts_alw',
                     92:  'def_st_14_17_pts_alw',
                     93:  'def_st_blk_td',
                     94:  'unk94',
                     95:  'def_st_int',
                     96:  'def_st_fum',
                     97:  'def_st_blk_kick',
                     98:  'def_st_safety',
                     99:  'def_st_sack',
                     100: 'unk100',
                     101: 'def_st_kick_ret_td',
                     102: 'def_st_punt_ret_td',
                     103: 'def_st_int_td',
                     104: 'def_st_fum_ret_td',
                     105: 'unk105',
                     106: 'unk106',
                     107: 'unk107',
                     108: 'unk108',
                     109: 'unk109',
                     110: 'unk110',
                     111: 'unk111',
                     112: 'unk112',
                     113: 'unk113',
                     114: 'unk114',
                     115: 'unk115',
                     116: 'unk116',
                     117: 'unk117',
                     118: 'unk118',
                     119: 'unk119',
                     120: 'def_pts_alw',
                     121: 'unk121',
                     122: 'def_st_22_27_pts_alw',
                     123: 'def_st_28_34_pts_alw',
                     124: 'def_st_35_45_pts_alw',
                     125: 'def_st_46+_pts_alw',
                     127: 'def_tot_yrd_alw',
                     128: 'def_st_0_99_yrd_alw',
                     129: 'def_st_100_199_yrd_alw',
                     130: 'def_st_200_299_yrd_alw',
                     131: 'unk131',
                     132: 'def_st_350_399_yrd_alw',
                     133: 'def_st_400_449_yrd_alw',
                     134: 'def_st_450_499_yrd_alw',
                     135: 'def_st_500_549_yrd_alw',
                     136: 'def_st_550+_yrd_alw',
                     155: 'unk155',
                     156: 'unk156',
                     158: 'unk158',
                     175: 'unk175',
                     176: 'unk176',
                     177: 'unk177',
                     178: 'unk178',
                     179: 'unk179',
                     180: 'unk180',
                     181: 'unk181',
                     182: 'unk182',
                     183: 'unk183',
                     184: 'unk184',
                     185: 'unk185',
                     186: 'unk186',
                     187: 'unk187',
                     188: 'unk188',
                     189: 'unk189',
                     190: 'unk190',
                     191: 'unk191',
                     192: 'unk192',
                     193: 'unk193',
                     194: 'unk194',
                     195: 'unk195',
                     196: 'unk196',
                     197: 'unk197',
                     198: 'fg_made_50_59',
                     199: 'unk199',
                     200: 'unk200',
                     202: 'unk202',
                     203: 'unk203',
                     210: 'unk210',
}

# create dictionary of fantasy football specific statistics codes
ff_scoring_codes = {1:   'pass_comp_ff',
                    2:   'pass_incomp_ff',
                    4:   'pass_td_ff',
                    5:   'pass_5_yrd_ff',
                    16:  'pass_50_yrd_td_ff',
                    17:  'pass_yrd_300_399_ff',
                    18:  'pass_yrd_400+_ff',
                    19:  'pass_2pt_con_ff',
                    20:  'pass_int_ff',
                    25:  'rush_td_ff',
                    26:  'rush_2pt_con_ff',
                    27:  'rush_5_yrd_ff',
                    36:  'rush_50_yrd_td_ff',
                    37:  'rush_yrd_100_199_ff',
                    38:  'rush_yrd_200+_ff',
                    43:  'rec_td_ff',
                    44:  'rec_2pt_con_ff',
                    46:  'rec_50_yrd_td_ff',
                    47:  'rec_5_yrd_ff',
                    53:  'receptions_ff',
                    56:  'rec_yrd_100_199_ff',
                    57:  'rec_yrd_200+_ff',
                    72:  'fum_lost_ff',
                    77:  'fg_made_40_49_ff',
                    79:  'fg_miss_40_49_ff',
                    80:  'fg_made_0_39_ff',
                    82:  'fg_miss_0_39_ff',
                    86:  'pat_made_ff',
                    88:  'pat_miss_ff',
                    89:  'def_st_0_pts_alw_ff',
                    90:  'def_st_1_6_pts_alw_ff',
                    91:  'def_st_7_13_pts_alw_ff',
                    92:  'def_st_14_17_pts_alw_ff',
                    93:  'def_st_blk_td_ff',
                    95:  'def_st_int_ff',
                    96:  'def_st_fum_ff',
                    97:  'def_st_blk_kick_ff',
                    98:  'def_st_safety_ff',
                    99:  'def_st_sack_ff',
                    101: 'def_st_kick_ret_td_ff',
                    102: 'def_st_punt_ret_td_ff',
                    103: 'def_st_int_td_ff',
                    104: 'def_st_fum_ret_td_ff',
                    122: 'def_st_22_27_pts_alw_ff',
                    123: 'def_st_28_34_pts_alw_ff',
                    124: 'def_st_35_45_pts_alw_ff',
                    125: 'def_st_46+_pts_alw_ff',
                    128: 'def_st_0_99_yrd_alw_ff',
                    129: 'def_st_100_199_yrd_alw_ff',
                    130: 'def_st_200_299_yrd_alw_ff',
                    132: 'def_st_350_399_yrd_alw_ff',
                    133: 'def_st_400_449_yrd_alw_ff',
                    134: 'def_st_450_499_yrd_alw_ff',
                    135: 'def_st_500_549_yrd_alw_ff',
                    136: 'def_st_550+_yrd_alw_ff',
                    198: 'fg_made_50_59_ff'
}

# create dictionary of fantasy football scoring values
scoring_dict = {'pass_5_yrd':            0.1,
                'pass_comp':             0.4,
                'pass_incomp':           -0.2,
                'pass_td':                6,
                'pass_50_yrd_td':         3,
                'pass_int':               -2,
                'pass_2pt_con':           2,
                'pass_yrd_300_399':       3,
                'P400':                   5,
                'rush_5_yrd':             0.6,
                'rush_td':                6,
                'RTD50':                  3,
                'rush_2pt_con':           2,
                'rush_yrd_100_199':       3,
                'RY200':                  5,
                'rec_5_yrd':              0.6,
                'receptions':             1,
                'rec_td':                 6,
                'rec_50_yrd_td':          3,
                'rec_2pt_con':            2,
                'rec_yrd_100_199':        3,
                'REY200':                 5,
                'pat_made':               1,
                'pat_miss':               -1,
                'fg_made_0_39':           3,
                'fg_made_40_49':          4,
                'fg_miss_0_39':           -2,
                'fg_miss_40_49':          -1,
                'fg_made_50':             5,
                'FG60':                   5,
                'def_st_kick_ret_td':     6,
                'def_st_punt_ret_td':     6,
                'def_st_int_td':          5,
                'def_st_fum_ret_td':      5,
                'def_st_blk_td':          6,
                'def_st_sack':            1,
                'def_st_blk_kick':        2,
                'def_st_int':             3,
                'def_st_fum':             3,
                'def_st_safety':          2,
                'def_st_0_pts_alw':       10,
                'def_st_1_6_pts_alw':     7,
                'def_st_7_13_pts_alw':    3,
                'def_st_14_17_pts_alw':   1,
                'def_st_22_27_pts_alw':   -1,
                'def_st_28_34_pts_alw':   -3,
                'def_st_35_45_pts_alw':   -5,
                'PA46':                   -7,
                'def_st_0_99_yrd_alw':    7,
                'def_st_100_199_yrd_alw': 3,
                'def_st_200_299_yrd_alw': 1,
                'def_st_400_449_yrd_alw': -1,
                'def_st_450_499_yrd_alw': -1.5,
                'def_st_500_549_yrd_alw': -2,
                'def_st_550+_yrd_alw':    -3,
                'misc_kick_ret_td':       6,
                'misc_punt_ret_td':       6,
                'misc_fum_rec_td':        6,
                'misc_fum_lost':          -2,
                'misc_fum_ret_td':        6
}
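
The dictionaries above map ESPN's numeric ids to readable names. In the API payload those ids arrive as string keys, so they need to be cast to `int` before lookup. The sketch below shows that decoding step on a made-up sample. `decode_stats`, the `_subset` dictionaries, and the sample values are illustrative assumptions; only the code-to-name pairs are taken from the dictionaries above.

```python
# small illustrative subsets of player_stat_codes and ff_scoring_codes
stat_codes_subset = {3: 'pass_yrd', 4: 'pass_td', 20: 'pass_int'}
scoring_codes_subset = {4: 'pass_td_ff', 20: 'pass_int_ff'}


def decode_stats(raw_stats, code_map):
    """Hypothetical helper: rename string stat ids to readable names, skipping unknown ids."""
    decoded = {}
    for key, value in raw_stats.items():
        code = int(key)              # JSON keys are strings, dictionary keys are ints
        if code in code_map:
            decoded[code_map[code]] = value
    return decoded


# made-up sample shaped like one entry of a player's 'stats' list
sample_stat = {
    'stats': {'3': 287.0, '4': 2.0, '20': 1.0, '999': 5.0},
    'appliedStats': {'4': 12.0, '20': -2.0},
}

player_stats = decode_stats(sample_stat['stats'], stat_codes_subset)
ff_points = decode_stats(sample_stat['appliedStats'], scoring_codes_subset)
print(player_stats)
print(ff_points)
```

Ids that are missing from a dictionary (such as the made-up `'999'` here) are skipped.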

## Functions for Data Ingestion <a id="section3"></a>

1. Function to scrape data from an ESPN Fantasy Football league and store it in JSON files
2. Function to load scraped data from json files
<br><br/>

[Return to Top](#return)

In [None]:
# create data ingestion class
class data_ingest(object):
    
    # create __init__ function
    def __init__(self, swid, espn_s2, league_id, season, week):
        self.swid = swid
        self.espn_s2 = espn_s2
        self.league_id = league_id
        self.season = season
        self.week = week
    
    # create function to scrape espn data and save to json
    def scrape_espn_data(self):
        
        # create the season's data directory if it doesn't already exist
        os.makedirs(f'../data/{self.season}', exist_ok=True)
        
        # create url.  there are multiple views we can look at but we're interested in matchup data
        url = 'https://fantasy.espn.com/apis/v3/games/ffl/seasons/'+str(self.season)+'/segments/0/leagues/'+str(self.league_id)+'?view=mMatchup&view=mMatchupScore'
        print(url)

        # create JSON file for each week's matchup data
        for i in range(1, self.week + 1):
            print(f'season: {self.season}, week: {i}')
            r = requests.get(url,
                             params = {'scoringPeriodId': i},
                             cookies = {"SWID": self.swid, "espn_s2": self.espn_s2})

            # stop early with a clear error if ESPN returns an HTTP error status
            r.raise_for_status()
            d = r.json()
            with open(f'../data/{self.season}/{self.season}_matchups_week_{i}.json', 'w', encoding = 'utf-8') as f:
                json.dump(d, f, ensure_ascii = False, indent = 4)
                
    # create function to load json data from disk
    def load_data_from_disk(self):
        
        # create empty list to store each week's matchups data
        matchups_list = []
        
        # load JSON file for each week's matchups data
        for i in range(1, self.week + 1):
            with open(f'../data/{self.season}/{self.season}_matchups_week_{i}.json', encoding = 'utf-8') as f:

                # returns JSON object as a dictionary
                data = json.load(f)

            # add each JSON object to list created above
            matchups_list.append(data)

        # data holds the JSON object for the most recent week loaded
        return matchups_list, data
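
The two methods above rely on one JSON file per week, named `<season>_matchups_week_<n>.json` inside a per-season folder. The sketch below round-trips that layout in a temporary directory, using only the standard library, so the naming scheme can be checked without calling ESPN. The helper names (`week_path`, `save_week`, `load_weeks`), the season `2023`, and the payloads are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def week_path(data_dir, season, week):
    """Hypothetical helper mirroring the '<season>_matchups_week_<n>.json' naming."""
    return Path(data_dir) / str(season) / f'{season}_matchups_week_{week}.json'


def save_week(data_dir, season, week, payload):
    """Write one week's payload, creating the season folder if needed."""
    path = week_path(data_dir, season, week)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(payload, f, ensure_ascii=False, indent=4)


def load_weeks(data_dir, season, last_week):
    """Read weeks 1..last_week back into a list, in week order."""
    weeks = []
    for week in range(1, last_week + 1):
        with open(week_path(data_dir, season, week), encoding='utf-8') as f:
            weeks.append(json.load(f))
    return weeks


with tempfile.TemporaryDirectory() as tmp:
    for wk in (1, 2):
        # made-up minimal payload; real ESPN responses carry many more fields
        save_week(tmp, 2023, wk, {'seasonId': 2023, 'scoringPeriodId': wk, 'teams': []})
    loaded = load_weeks(tmp, 2023, 2)

print([w['scoringPeriodId'] for w in loaded])
```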

## Functions for Rosters Dataframe <a id="section4"></a>

1. Function to create a dataframe with weekly rosters data and save it to csv
2. Function to add the fantasy football scoring values to the weekly rosters dataframe 
<br><br/>

[Return to Top](#return)

In [None]:
# create rosters creation class
class create_rosters(object):
    
    # create __init__ function
    def __init__(self, matchups_list, season):
        self.matchups_list = matchups_list
        self.season = season
        
    # create function to create a dataframe with weekly roster data and save it to csv
    def create_weekly_rosters(self):

        # initialize list needed to create rosters_df
        data_list = []

        # loop through each JSON object in matchups_list which represents one week's matchup data
        for wk in range(0, len(self.matchups_list)):

            # grab year
            year = self.matchups_list[wk]['seasonId']

            # loop through each team
            for tm in self.matchups_list[wk]['teams']:
                owner_team_id   = tm['id']
                owner_team_name = owner_team_codes[owner_team_id][0]
                owner_name      = owner_team_codes[owner_team_id][1]

                # loop through weekly roster
                for p in tm['roster']['entries']:

                    # grab week number
                    temp_week = self.matchups_list[wk]['scoringPeriodId']

                    # extract roster data
                    player_name   = p['playerPoolEntry']['player']['fullName']
                    slot_id       = p['lineupSlotId']
                    slot_name     = lineup_slot_codes[slot_id]
                    position_id   = p['playerPoolEntry']['player']['defaultPositionId']
                    position_name = position_codes[position_id]

                    # injury status (D/ST entries have no injuryStatus key, so default to NaN)
                    current_inj = p['playerPoolEntry']['player'].get('injuryStatus', np.nan)

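                    # projected/actual points
                    # note:  team data isn't always in the same location within the json object (data integrity issue),
                    # so it is looked up in several places below
                    proj_points, actual_points = None, None

                    # reset team info for each player so values from the previous player don't carry over
                    pro_team_id, pro_team_name, pro_team_name_abv = None, None, None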
                    for stat in p['playerPoolEntry']['player']['stats']:
                        if stat['scoringPeriodId'] != temp_week:
                            continue
                        if stat['statSourceId'] == 0:
                            actual_points = stat['appliedTotal']
                            if stat['proTeamId'] != 0:

                                # grab team info
                                pro_team_id = stat['proTeamId']
                                pro_team_name = pro_team_codes[pro_team_id][0]
                                pro_team_name_abv = pro_team_codes[pro_team_id][1]                        
                        elif stat['statSourceId'] == 1:
                            proj_points = stat['appliedTotal']
                            if stat['proTeamId'] != 0:

                                # grab team info
                                pro_team_id = stat['proTeamId']
                                pro_team_name = pro_team_codes[pro_team_id][0]
                                pro_team_name_abv = pro_team_codes[pro_team_id][1]
                            elif (proj_points < 1) and (not actual_points):

                                # grab team info
                                pro_team_id   = p['playerPoolEntry']['player']['proTeamId']
                                pro_team_name = pro_team_codes[pro_team_id][0]
                                pro_team_name_abv = pro_team_codes[pro_team_id][1] 
                            elif not pro_team_id:

                                # grab team info
                                pro_team_id   = p['playerPoolEntry']['player']['proTeamId']
                                pro_team_name = pro_team_codes[pro_team_id][0]
                                pro_team_name_abv = pro_team_codes[pro_team_id][1]                       

                    # add data to list created above
                    data_list.append([year, temp_week, owner_team_name, owner_name, player_name, pro_team_name, pro_team_name_abv, 
                                      current_inj, slot_name, position_name, proj_points, actual_points, slot_id])

        # create rosters_df using data_list
        rosters_df = pd.DataFrame(data_list, 
                                  columns=['year', 'week', 'owner_team', 'owner', 'player', 'pro_team', 'pro_team_abv',
                                           'current_inj_status', 'lineup_slot_name', 'position_name', 'proj_points', 
                                           'actual_points', 'slot_id'
                                          ]
                                 )

        # save to csv
        rosters_df.to_csv(f"../data/{self.season}/rosters_df_{self.season}.csv", index = False)

        # explore data frame
        print(f'Weekly Rosters Shape: {rosters_df.shape}\n')
        rosters_df.info()
        display(rosters_df.describe())
        display(rosters_df.head())
        
        return rosters_df
    
    # create function to add the player stats and the fantasy football scoring stats to the rosters dataframe 
    def create_weekly_rosters_w_scoring(self, rosters_df):
        
        # create pandas dataframe using the fantasy football scoring dictionary's values as column names
        ff_scoring_df = pd.DataFrame(np.zeros((len(rosters_df), len(ff_scoring_codes.values())))
                                    ,columns = list(ff_scoring_codes.values()))

        # create pandas dataframe using the player stat dictionary's values as column names
        player_stat_df = pd.DataFrame(np.zeros((len(rosters_df), len(player_stat_codes.values())))
                                     ,columns = list(player_stat_codes.values()))

        # combine dataframe created above to the rosters dataframe
        rosters_df_w_scoring = pd.concat([rosters_df, ff_scoring_df, player_stat_df], axis = 1)
        
        # loop through each JSON object in matchups_list which represents one week's matchup data
        for wk in range(0, len(self.matchups_list)):

            # grab year
            year = self.matchups_list[wk]['seasonId']

            # loop through each team
            for tm in self.matchups_list[wk]['teams']:

                # loop through weekly roster
                for p in tm['roster']['entries']:
                    #temp_week = wk + 1
                    temp_week = self.matchups_list[wk]['scoringPeriodId']

                    # grab player name
                    player_name = p['playerPoolEntry']['player']['fullName']

                    # loop through each set of stats
                    for stat in p['playerPoolEntry']['player']['stats']:
                        if stat['scoringPeriodId'] != temp_week:
                            continue
                        if stat['statSourceId'] == 0:

                            # loop through the fantasy scoring stats
                            for i in [int(s) for s in stat['appliedStats'].keys()]:

                                # if the scoring code exists in the dictionary above then add the stat to rosters_df_w_scoring
                                if i in ff_scoring_codes.keys():
                                    rosters_df_w_scoring.loc[(rosters_df_w_scoring['player'] == player_name) & (rosters_df_w_scoring['week'] == temp_week) &\
                                                             (rosters_df_w_scoring['year'] == self.season), ff_scoring_codes[i]] = stat['appliedStats'][str(i)]

                            # loop through the player stats
                            for j in [int(r) for r in stat['stats'].keys()]:

                                # if the scoring code exists in the dictionary above then add the stat to rosters_df_w_scoring
                                if j in player_stat_codes.keys():
                                    rosters_df_w_scoring.loc[(rosters_df_w_scoring['player'] == player_name) & (rosters_df_w_scoring['week'] == temp_week) &\
                                                             (rosters_df_w_scoring['year'] == self.season), player_stat_codes[j]] = stat['stats'][str(j)]

        # replace nulls with 0
        rosters_df_w_scoring.replace(np.nan, 0, inplace=True)
        
        # save to csv
        rosters_df_w_scoring.to_csv(f"../data/{self.season}/rosters_df_w_scoring_{self.season}.csv", index = False)

        # explore data frame
        print(f'Weekly Rosters Shape w/ Scoring: {rosters_df_w_scoring.shape}\n')
        rosters_df_w_scoring.info()
        display(rosters_df_w_scoring.describe())
        display(rosters_df_w_scoring.head())
        
        return rosters_df_w_scoring
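
Both roster functions above pick the right stat entry the same way: keep only entries for the current `scoringPeriodId`, then treat `statSourceId == 0` as actual points and `statSourceId == 1` as projected points. The sketch below isolates that selection logic on made-up data. `split_points` and the sample values are illustrative assumptions, not part of the notebook.

```python
def split_points(stats, week):
    """Hypothetical helper: return (projected, actual) points for one scoring period."""
    proj_points, actual_points = None, None
    for stat in stats:
        # ignore stat entries that belong to other weeks
        if stat['scoringPeriodId'] != week:
            continue
        if stat['statSourceId'] == 0:
            actual_points = stat['appliedTotal']
        elif stat['statSourceId'] == 1:
            proj_points = stat['appliedTotal']
    return proj_points, actual_points


# made-up sample shaped like a player's 'stats' list
sample_stats = [
    {'scoringPeriodId': 1, 'statSourceId': 0, 'appliedTotal': 18.4},
    {'scoringPeriodId': 1, 'statSourceId': 1, 'appliedTotal': 15.2},
    {'scoringPeriodId': 2, 'statSourceId': 1, 'appliedTotal': 14.0},
]

week1 = split_points(sample_stats, 1)
week2 = split_points(sample_stats, 2)
print(week1, week2)
```

A week that has only a projection, like week 2 in the sample, leaves the actual points as `None`.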

## Functions for Matchups Dataframe <a id="section5"></a>

1. Function to create a dataframe with weekly matchups data and save it to csv
2. Function to create a dataframe with total wins/losses through the most recent NFL week and save it to csv
<br><br/>

[Return to Top](#return)

In [None]:
# create matchups creation class
class create_matchups(object):
    
    # create __init__ function
    def __init__(self, season):
        self.season = season
    
    # create function to build each fantasy football team's weekly matchups and save them to a csv file
    def create_weekly_matchups(self, data):
        
        # initialize list needed to create matchups dataframe
        data_list = []

        # loop through each matchup from week 1 to current week
        for i in range(0, len(data['schedule'])):
            
            # check if there was actually a winner
            if data['schedule'][i]['winner'] == 'UNDECIDED':
                continue

            # zip the away and home scoring periods together since there may be multiple scoring periods within each matchup period (e.g., 2-week playoff matchups)
            zip_dict = zip(enumerate(data['schedule'][i]['away']['pointsByScoringPeriod'].items()),\
                           enumerate(data['schedule'][i]['home']['pointsByScoringPeriod'].items())
                          )

            # loop through each item in zip_dict
            for (index_away, (key_away, value_away)), (index_home, (key_home, value_home)) in zip_dict:

                # build row for away team
                away_week = int(key_away)
                away_owner_team_id = data['schedule'][i]['away']['teamId']
                away_owner_team_name = owner_team_codes[away_owner_team_id][0]
                away_owner_name = owner_team_codes[away_owner_team_id][1]
                away_score = float(value_away)
                away_opp_id = data['schedule'][i]['home']['teamId']
                away_opp_team_name = owner_team_codes[away_opp_id][0]
                away_opp_name = owner_team_codes[away_opp_id][1]
                away_opp_score = float(value_home)

                # check if there is more than one scoring period
                if len(data['schedule'][i]['away']['pointsByScoringPeriod']) > 1:

                    # determine if away team won and only assign the win when looping through the last scoring period in a matchup period
                    if index_away == 1 and data['schedule'][i]['winner'] == 'AWAY':
                        away_win = 1
                    else:
                        away_win = 0

                else:

                    # determine if away team won
                    if data['schedule'][i]['winner'] == 'AWAY':
                        away_win = 1
                    else:
                        away_win = 0

                # append away row to data_list
                data_list.append([away_week, away_owner_team_name, away_owner_name, away_score, away_win, away_opp_team_name, 
                                  away_opp_name, away_opp_score])

                # build row for home team
                home_week = int(key_home)
                home_owner_team_id = data['schedule'][i]['home']['teamId']
                home_owner_team_name = owner_team_codes[home_owner_team_id][0]
                home_owner_name = owner_team_codes[home_owner_team_id][1]
                home_score = float(value_home)
                home_opp_id = data['schedule'][i]['away']['teamId']
                home_opp_team_name = owner_team_codes[home_opp_id][0]
                home_opp_name = owner_team_codes[home_opp_id][1]
                home_opp_score = float(value_away)

                # check if there is more than one scoring period
                if len(data['schedule'][i]['home']['pointsByScoringPeriod']) > 1:

                    # determine if home team won and only assign the win when looping through the last scoring period in a matchup period  
                    if index_home == 1 and data['schedule'][i]['winner'] == 'HOME':
                        home_win = 1
                    else:
                        home_win = 0

                else:

                    # determine if home team won    
                    if data['schedule'][i]['winner'] == 'HOME':
                        home_win = 1
                    else:
                        home_win = 0

                # append home row to data_list
                data_list.append([home_week, home_owner_team_name, home_owner_name, home_score, home_win, home_opp_team_name, 
                                  home_opp_name, home_opp_score])

        # create matchups_df using data_list
        matchups_df = pd.DataFrame(data_list, 
                                   columns=['week', 'owner_team_name', 'owner', 'score', 'win', 'opp_owner_team_name', 'opp_owner', 
                                            'opp_score'])
        # save to csv
        matchups_df.to_csv(f"../data/{self.season}/matchups_df_{self.season}.csv", index = False)
        
        # explore data frame
        print(f'Weekly Matchups Shape: {matchups_df.shape}\n')
        matchups_df.info()
        display(matchups_df.describe())
        display(matchups_df.head())
        
        return matchups_df
    
    # create function to create dataframe of total wins/losses by fantasy football team
    def create_wins_losses(self, matchups_df: pd.DataFrame):
    
        # subset matchups_df by wins
        wins = matchups_df.loc[matchups_df['win'] == 1]

        # create total_wins dataframe of wins per team
        total_wins = pd.DataFrame(wins.groupby(['owner_team_name'])['win'].value_counts().reset_index(0).reset_index(drop=True))
        total_wins.columns = ['owner_team_name', 'wins']

        # subset matchups_df by losses
        losses = matchups_df.loc[matchups_df['win'] == 0]

        # create total_losses dataframe of losses per team
        total_losses = pd.DataFrame(losses.groupby(['owner_team_name'])['win'].value_counts().reset_index(0).reset_index(drop=True))
        total_losses.columns = ['owner_team_name', 'losses']

        # merge total_wins and total_losses (outer join so teams with 0 wins or 0 losses are kept)
        win_loss_df = total_wins.merge(total_losses, on = 'owner_team_name', how = 'outer')

        # replace any null values with 0 which means one or more teams have either 0 wins or 0 losses
        win_loss_df.fillna(0, inplace=True)

        # fillna function casts dtype to float so change dtype back to int
        win_loss_df['losses'] = win_loss_df['losses'].astype('int')
        win_loss_df['wins'] = win_loss_df['wins'].astype('int')

        # create total_points_df dataframe of points for and points against per team
        total_points_df = pd.DataFrame(matchups_df.groupby(['owner_team_name'])[['score', 'opp_score']].sum().reset_index(0).reset_index(drop=True))
        total_points_df.columns = ['owner_team_name', 'points_for', 'points_against']

        # merge win_loss_df with total_points_df
        win_loss_df = win_loss_df.merge(total_points_df, on = 'owner_team_name', how = 'left')

        # save to csv
        win_loss_df.to_csv(f"../data/{self.season}/win_loss_df_{self.season}.csv", index = False)
        
        # explore data frame
        print(f'Total Wins/Losses Shape: {win_loss_df.shape}\n')
        win_loss_df.info()
        display(win_loss_df.describe())
        display(win_loss_df.head())
        
        return win_loss_df
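
The matchup logic above emits one row per team per scoring period and credits a win only on the final scoring period of a matchup. The following pure-Python sketch restates that idea on a made-up two-week playoff game and then totals wins and points per team. `flatten_matchup` and the sample game are illustrative assumptions. It also assumes both sides list the same scoring periods, and it generalizes "last period" beyond the two-period case handled in the notebook.

```python
from collections import Counter


def flatten_matchup(game):
    """Hypothetical helper: one row per team per scoring period for one schedule entry."""
    away, home = game['away'], game['home']
    # assumption: both sides list the same scoring periods
    periods = sorted(away['pointsByScoringPeriod'], key=int)
    rows = []
    for idx, period in enumerate(periods):
        # credit the win only on the final scoring period of the matchup
        is_last = idx == len(periods) - 1
        for side, label in ((away, 'AWAY'), (home, 'HOME')):
            rows.append({
                'week': int(period),
                'team_id': side['teamId'],
                'score': float(side['pointsByScoringPeriod'][period]),
                'win': int(is_last and game['winner'] == label),
            })
    return rows


# made-up two-week playoff matchup
sample_game = {
    'winner': 'HOME',
    'away': {'teamId': 1, 'pointsByScoringPeriod': {'15': 100.5, '16': 80.0}},
    'home': {'teamId': 2, 'pointsByScoringPeriod': {'15': 90.0, '16': 120.0}},
}

rows = flatten_matchup(sample_game)

# total wins and points scored per team id
wins = Counter()
points_for = Counter()
for row in rows:
    wins[row['team_id']] += row['win']
    points_for[row['team_id']] += row['score']

print(dict(wins))
print(dict(points_for))
```

Under this row layout, a `win` of 0 on a non-final scoring period does not mean a loss, so counting losses from `win == 0` rows may overstate them for multi-week matchups.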