# Web Scrape Private ESPN Fantasy Football League <a id="return"></a>

This notebook will scrape a private ESPN Fantasy Football league and return league/roster data.
<br><br/>

**Notebook Sections:**
1. [Import Packages and Set User-Defined Fields](#section1)
2. [Web Scrape Private ESPN Fantasy Football League and Load Data](#section2)
3. [Create Rosters Dataframes](#section3)
4. [Create Matchups Dataframes](#section4)

**Outputs:**
1. Web-Scraped Weekly Matchup Raw Data: **json**
2. Weekly Roster Data: **csv**
3. Weekly Roster Data with Player/Fantasy Stats: **csv**
4. Weekly Matchup Data: **csv**
5. Total Wins/Losses by Fantasy Team: **csv**

## Import Packages <a id="section1"></a>

In [28]:
# increase cell width of this notebook
from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

In [29]:
# import needed packages
import pandas as pd

# set pandas display options
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 100)

# run the 00-scrape_espn_ff_api_v3_util.ipynb notebook
%run 00-scrape_espn_ff_api_v3_util.ipynb

## Set User-Defined Fields

To scrape data from a private ESPN Fantasy Football league, we need to pass the SWID and espn_s2 cookie values to the cookies parameter of the requests.get command.
<br><br/>
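
As a rough sketch of how the two cookie values reach requests.get, the helper below builds the request pieces separately from the network call. The endpoint and the two `view` values are taken from the URL printed in the scraping section of this notebook; the function names, the `scoringPeriodId` query parameter, and the exact cookie key spelling are assumptions for illustration and may differ from what the util notebook does.

```python
BASE_URL = (
    "https://fantasy.espn.com/apis/v3/games/ffl/seasons/"
    "{season}/segments/0/leagues/{league_id}"
)


def build_request(swid, espn_s2, league_id, season, week):
    """Return the url, query params, and cookies for one week's matchup data."""
    url = BASE_URL.format(season=season, league_id=league_id)
    # the two views appear in the URL this notebook prints;
    # scoringPeriodId is an assumed way of selecting the week
    params = [
        ("view", "mMatchup"),
        ("view", "mMatchupScore"),
        ("scoringPeriodId", week),
    ]
    # cookie keys follow the names shown in the browser (assumption)
    cookies = {"SWID": swid, "espn_s2": espn_s2}
    return url, params, cookies


def fetch_week(swid, espn_s2, league_id, season, week):
    """Hypothetical wrapper that performs the actual HTTP request."""
    import requests  # imported here so build_request works without it

    url, params, cookies = build_request(swid, espn_s2, league_id, season, week)
    response = requests.get(url, params=params, cookies=cookies)
    response.raise_for_status()
    return response.json()
```

Keeping `build_request` free of network access makes it easy to check the URL and cookies before sending anything to ESPN.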

In Chrome, the SWID and espn_s2 cookie values can be found at chrome://settings/cookies/detail?site=espn.com.
<br><br/>

To find the cookie values manually, follow this path: Settings > Privacy and security > Cookies and other site data > All cookies and site data > espn.com > SWID/espn_s2.

<img src="pictures/settings_cookies.PNG">
<br><br/>

Please note that the cookie values differ from machine to machine. My cookie values won't work on another computer.

<img src="pictures/SWID_cookie.PNG">

<img src="pictures/espn_s2_cookie.PNG">
<br><br/>

Additionally, we need to pass the league id into the URL.
<br><br/>

To find the league id, look in the URL while logged in to a private ESPN Fantasy Football league.

<img src="pictures/league_id.PNG">
<br><br/>

Finally, to review any matchup:
<br><br/>

Set teamId to the team id value and seasonId to 2021, 2020, 2019, or 2018 in the web URL.  ESPN's API v3 doesn't return data for years prior to 2018.
<br><br/>

If the matchup under review is a **non-playoff matchup**, set both matchupPeriodId and scoringPeriodId to the same week number.
<br><br/>

If the matchup under review is a **playoff matchup**, the matchupPeriodId and scoringPeriodId week values differ. For instance, if we're reviewing a matchup during NFL weeks 15 and 16, which correspond to the last 2 playoff games in our league, then matchupPeriodId = 14 and scoringPeriodId = 15 or 16 for NFL weeks 15 or 16.
<br><br/>

Example URL: https://fantasy.espn.com/football/boxscore?leagueId=169073&matchupPeriodId=1&scoringPeriodId=1&seasonId=2020&teamId=10&view=scoringperiod
<br><br/>
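
The boxscore URL rules above can be sketched as a small helper. The function name `boxscore_url` and its arguments are hypothetical; the query parameter names and their order come from the example URL in this section.

```python
from urllib.parse import urlencode


def boxscore_url(league_id, season, team_id, matchup_period, scoring_period=None):
    """Build a boxscore URL for one team's matchup.

    For non-playoff matchups leave scoring_period unset so it equals
    matchup_period; for playoff matchups pass the NFL week explicitly.
    """
    if scoring_period is None:
        scoring_period = matchup_period
    query = urlencode(
        {
            "leagueId": league_id,
            "matchupPeriodId": matchup_period,
            "scoringPeriodId": scoring_period,
            "seasonId": season,
            "teamId": team_id,
            "view": "scoringperiod",
        }
    )
    return f"https://fantasy.espn.com/football/boxscore?{query}"


# regular-season week 1
print(boxscore_url(169073, 2020, 10, matchup_period=1))
# playoff matchup 14 viewed during NFL week 15 (league-specific mapping)
print(boxscore_url(169073, 2020, 10, matchup_period=14, scoring_period=15))
```

The mapping from playoff matchup number to NFL weeks depends on the league's settings, so the helper takes both values rather than trying to derive one from the other.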

Source: https://stmorse.github.io/journal/espn-fantasy-v3.html
<br><br/>

[Return to Top](#return)

In [30]:
# set user-defined fields below: cookie values, league id, season, and current NFL week number
swid      = '{F3E3A4D2-1203-4B11-BDAD-00C92AAA3F48}'
espn_s2   = 'AECf2TISP%2BRluNYIj24%2FqkuJBHTzAa7Br0BVIoUnBhHQCygnkN7hvyvU8wIm6XAT58otHnOZW4HjyImLJ14Rk8L9%2BwObWXeIa8kCosNFNMSc79r24KSiJK9jtqUQ2V1lvIncFM1qDhV9xL7E5jnutCP9rZfiJv8h%2Bq6WNvFb4YmyHICgx3QW68obk3wqBrGrXmw0vbq2bf367%2BsuL%2BOJ2ioRnxi1yY7LgbtiJnAXSKZj7kdIGjBiCPjUHgeXzDLHPunkLchEGlOrr%2FhG1ORgOsUX'
league_id = 169073
season    = 2018
week      = 17

# set flag to decide whether or not to web scrape ESPN fantasy football league
scrape_espn = True

# set flag to decide whether or not to combine rosters from multiple seasons into a single dataframe and save it to csv
combine_years = True

## Web Scrape Private ESPN Fantasy Football League and Load Data <a id="section2"></a>

In this section, we'll scrape a private ESPN Fantasy Football league and store each week's matchup data in a json file as well as in a list variable.
<br><br/>

**NOTE:** ESPN only needs to be scraped once, after which the scrape_espn flag should be set to False.
<br><br/>
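
The scrape-once-then-load pattern in the note above might look roughly like this. `load_or_scrape` and `fake_scrape` are hypothetical names used only for illustration; the real work in this notebook is done by the data_ingest class from the util notebook, and unlike that class this sketch falls back to scraping when the file is missing.

```python
import json
import tempfile
from pathlib import Path


def load_or_scrape(path, scrape_fn, scrape=True):
    """Scrape and write JSON when asked (or when no file exists), then read it back."""
    path = Path(path)
    if scrape or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(scrape_fn()))
    # always return what is on disk so both code paths yield the same data
    return json.loads(path.read_text())


# demo with a stand-in scraper that records how often it is called
calls = []


def fake_scrape():
    calls.append(1)
    return {"season": 2018, "week": 1}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "2018" / "week_1.json"
    first = load_or_scrape(target, fake_scrape, scrape=True)    # hits the scraper
    second = load_or_scrape(target, fake_scrape, scrape=False)  # reads the file only

print(first == second, len(calls))
```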

[Return to Top](#return)

In [31]:
%%time

# instantiate data_ingest object (named differently from the class so the cell can be re-run)
ingest = data_ingest(swid, espn_s2, league_id, season, week)

# check scrape_espn flag
if scrape_espn:

    # scrape ESPN FF league and save to json
    ingest.scrape_espn_data()

# load json files with each week's matchup data
matchups_list, data = ingest.load_data_from_disk()

https://fantasy.espn.com/apis/v3/games/ffl/seasons/2018/segments/0/leagues/169073?view=mMatchup&view=mMatchupScore
season: 2018, week: 1
season: 2018, week: 2
season: 2018, week: 3
season: 2018, week: 4
season: 2018, week: 5
season: 2018, week: 6
season: 2018, week: 7
season: 2018, week: 8
season: 2018, week: 9
season: 2018, week: 10
season: 2018, week: 11
season: 2018, week: 12
season: 2018, week: 13
season: 2018, week: 14
season: 2018, week: 15
season: 2018, week: 16
season: 2018, week: 17
Wall time: 20.3 s


## Create Rosters Dataframes <a id="section3"></a>

In this section, we'll create dataframes consisting of...

1. each fantasy football team's weekly rosters, saved to a csv file
2. each fantasy football team's weekly rosters (including player stats and fantasy scoring stats), saved to a csv file.
<br><br/>

[Return to Top](#return)

In [32]:
%%time

# instantiate create_rosters object (named differently from the class so the cell can be re-run)
rosters_builder = create_rosters(matchups_list, season)

# create dataframe of fantasy football teams' weekly rosters and save it to a csv file
rosters_df = rosters_builder.create_weekly_rosters()

# create dataframe of fantasy football teams' weekly rosters (including player stats and fantasy scoring stats) and save it to a csv file
rosters_df_w_scoring = rosters_builder.create_weekly_rosters_w_scoring(rosters_df)

Weekly Rosters Shape: (3383, 13)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3383 entries, 0 to 3382
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   year                3383 non-null   int64  
 1   week                3383 non-null   int64  
 2   owner_team          3383 non-null   object 
 3   owner               3383 non-null   object 
 4   player              3383 non-null   object 
 5   pro_team            3383 non-null   object 
 6   pro_team_abv        3379 non-null   object 
 7   current_inj_status  3054 non-null   object 
 8   lineup_slot_name    3383 non-null   object 
 9   position_name       3383 non-null   object 
 10  proj_points         3364 non-null   float64
 11  actual_points       3186 non-null   float64
 12  slot_id             3383 non-null   int64  
dtypes: float64(2), int64(3), object(8)
memory usage: 343.7+ KB


None

Unnamed: 0,year,week,proj_points,actual_points,slot_id
count,3383.0,3383.0,3364.0,3186.0,3383.0
mean,2018.0,8.982264,10.619167,11.454488,13.669524
std,0.0,4.899672,7.04731,10.539986,7.997459
min,2018.0,1.0,0.0,-8.0,0.0
25%,2018.0,5.0,6.805369,3.4,4.0
50%,2018.0,9.0,9.979369,9.0,20.0
75%,2018.0,13.0,13.990306,17.0,20.0
max,2018.0,17.0,34.885069,63.2,20.0


Unnamed: 0,year,week,owner_team,owner,player,pro_team,pro_team_abv,current_inj_status,lineup_slot_name,position_name,proj_points,actual_points,slot_id
0,2018,1,Happy Rock Homewreckers,Blainer,David Johnson,Arizona Cardinals,ARI,ACTIVE,RB,RB,21.371488,15.8,2
1,2018,1,Happy Rock Homewreckers,Blainer,Melvin Gordon,Los Angeles Chargers,LAC,ACTIVE,RB,RB,16.62425,27.8,2
2,2018,1,Happy Rock Homewreckers,Blainer,Rob Gronkowski,New Engalnd Patriots,NE,ACTIVE,TE,TE,14.862558,24.2,6
3,2018,1,Happy Rock Homewreckers,Blainer,Allen Robinson,Chicago Bears,CHI,ACTIVE,WR,WR,13.913777,8.8,4
4,2018,1,Happy Rock Homewreckers,Blainer,Marvin Jones Jr.,Detroit Lions,DET,INJURY_RESERVE,WR,WR,11.560917,7.6,4


Weekly Rosters Shape w/ Scoring: (3383, 236)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3383 entries, 0 to 3382
Columns: 236 entries, year to unk210
dtypes: float64(225), int64(3), object(8)
memory usage: 6.1+ MB


None

Unnamed: 0,year,week,proj_points,actual_points,slot_id,pass_comp_ff,pass_incomp_ff,pass_td_ff,pass_5_yrd_ff,pass_50_yrd_td_ff,pass_yrd_300_399_ff,pass_yrd_400+_ff,pass_2pt_con_ff,pass_int_ff,rush_td_ff,rush_2pt_con_ff,rush_5_yrd_ff,rush_50_yrd_td_ff,rush_yrd_100_199_ff,rush_yrd_200+_ff,rec_td_ff,rec_2pt_con_ff_ff,rec_50_yrd_td_ff,rec_5_yrd_ff,receptions_ff,rec_yrd_100_199_ff,rec_yrd_200+_ff,fum_lost_ff,fg_made_40_49_ff,fg_miss_40_49_ff,fg_made_0_39_ff,fg_miss_0_39_ff,pat_made_ff,pat_miss_ff,def_st_0_pts_alw_ff,def_st_1_6_pts_alw_ff,def_st_7_13_pts_alw_ff,def_st_14_17_pts_alw_ff,def_st_blk_td_ff,def_st_int_ff,def_st_fum_ff,def_st_blk_kick_ff,def_st_safety_ff,def_st_sack_ff,def_st_kick_ret_td_ff,def_st_punt_ret_td_ff,def_st_int_td_ff,def_st_fum_ret_td_ff,def_st_22_27_pts_alw_ff,def_st_28_34_pts_alw_ff,...,unk118,unk119,def_pts_alw,unk121,def_st_22_27_pts_alw,def_st_28_34_pts_alw,def_st_35_45_pts_alw,def_st_46+_pts_alw,def_tot_yrd_alw,def_st_0_99_yrd_alw,def_st_100_199_yrd_alw,def_st_200_299_yrd_alw,unk131,def_st_350_399_yrd_alw,def_st_400_449_yrd_alw,def_st_450_499_yrd_alw,def_st_500_549_yrd_alw,def_st_550+_yrd_alw,unk155,unk156,unk158,unk175,unk176,unk177,unk178,unk179,unk180,unk181,unk182,unk183,unk184,unk185,unk186,unk187,unk188,unk189,unk190,unk191,unk192,unk193,unk194,unk195,unk196,unk197,fg_made_50_59,unk199,unk200,unk202,unk203,unk210
count,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,...,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0,3383.0
mean,2018.0,8.982264,10.559526,10.787467,13.669524,0.936447,-0.237836,1.122672,0.535767,0.039019,0.078924,0.02956,0.021874,-0.14898,0.627845,0.009459,1.555779,0.015075,0.083358,0.004434,0.964824,0.01478,0.043453,2.807035,0.85167,0.159622,0.005912,-0.096364,0.09932,-0.007981,0.180018,-0.002956,0.150163,-0.007685,0.017736,0.022761,0.041679,0.011824,0.001774,0.208395,0.141886,0.007094,0.00473,0.207804,0.003547,0.005321,0.035471,0.016258,-0.017145,-0.045226,...,0.132722,0.030446,1.84097,0.013597,0.017145,0.015075,0.00739,0.000887,29.019214,0.000296,0.005321,0.02217,0.016553,0.018327,0.012415,0.005616,0.003252,0.000887,0.444871,0.404375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
std,0.0,4.899672,7.072166,10.574398,7.997459,2.910747,0.766947,4.038991,1.674206,0.369942,0.48022,0.383363,0.224452,0.740186,2.229477,0.153509,3.344997,0.224353,0.49315,0.148851,2.545088,0.171318,0.358479,4.182808,1.080827,0.673439,0.171853,0.452533,0.783974,0.088993,0.977174,0.076843,0.707257,0.090664,0.420827,0.398566,0.351193,0.108109,0.103157,1.080454,0.783081,0.118922,0.097157,0.850247,0.145865,0.178621,0.419704,0.28469,0.129829,0.365613,...,0.680341,0.228035,6.71469,0.11583,0.129829,0.121871,0.085659,0.02977,98.914053,0.017193,0.07276,0.147257,0.127609,0.134151,0.110745,0.074742,0.056938,0.02977,0.497025,0.490843,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
min,2018.0,1.0,0.0,-8.0,0.0,0.0,-5.6,0.0,0.0,0.0,0.0,0.0,0.0,-8.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-4.0,0.0,-1.0,0.0,-2.0,0.0,-2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-3.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,2018.0,5.0,6.738974,2.2,4.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,2018.0,9.0,9.941403,8.0,20.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.6,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,2018.0,13.0,13.953441,16.0,20.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.2,0.0,0.0,0.0,0.0,0.0,0.0,4.2,1.6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,2018.0,17.0,34.885069,63.2,20.0,16.8,0.0,36.0,9.5,6.0,3.0,5.0,4.0,0.0,24.0,4.0,28.2,6.0,3.0,5.0,18.0,2.0,3.0,25.8,6.4,3.0,5.0,0.0,12.0,0.0,12.0,0.0,6.0,0.0,10.0,7.0,3.0,1.0,6.0,12.0,9.0,2.0,2.0,11.0,6.0,6.0,5.0,5.0,0.0,0.0,...,13.0,5.0,51.0,1.0,1.0,1.0,1.0,1.0,576.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Unnamed: 0,year,week,owner_team,owner,player,pro_team,pro_team_abv,current_inj_status,lineup_slot_name,position_name,proj_points,actual_points,slot_id,pass_comp_ff,pass_incomp_ff,pass_td_ff,pass_5_yrd_ff,pass_50_yrd_td_ff,pass_yrd_300_399_ff,pass_yrd_400+_ff,pass_2pt_con_ff,pass_int_ff,rush_td_ff,rush_2pt_con_ff,rush_5_yrd_ff,rush_50_yrd_td_ff,rush_yrd_100_199_ff,rush_yrd_200+_ff,rec_td_ff,rec_2pt_con_ff_ff,rec_50_yrd_td_ff,rec_5_yrd_ff,receptions_ff,rec_yrd_100_199_ff,rec_yrd_200+_ff,fum_lost_ff,fg_made_40_49_ff,fg_miss_40_49_ff,fg_made_0_39_ff,fg_miss_0_39_ff,pat_made_ff,pat_miss_ff,def_st_0_pts_alw_ff,def_st_1_6_pts_alw_ff,def_st_7_13_pts_alw_ff,def_st_14_17_pts_alw_ff,def_st_blk_td_ff,def_st_int_ff,def_st_fum_ff,def_st_blk_kick_ff,...,unk118,unk119,def_pts_alw,unk121,def_st_22_27_pts_alw,def_st_28_34_pts_alw,def_st_35_45_pts_alw,def_st_46+_pts_alw,def_tot_yrd_alw,def_st_0_99_yrd_alw,def_st_100_199_yrd_alw,def_st_200_299_yrd_alw,unk131,def_st_350_399_yrd_alw,def_st_400_449_yrd_alw,def_st_450_499_yrd_alw,def_st_500_549_yrd_alw,def_st_550+_yrd_alw,unk155,unk156,unk158,unk175,unk176,unk177,unk178,unk179,unk180,unk181,unk182,unk183,unk184,unk185,unk186,unk187,unk188,unk189,unk190,unk191,unk192,unk193,unk194,unk195,unk196,unk197,fg_made_50_59,unk199,unk200,unk202,unk203,unk210
0,2018,1,Happy Rock Homewreckers,Blainer,David Johnson,Arizona Cardinals,ARI,ACTIVE,RB,RB,21.371488,15.8,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6.0,0.0,4.2,0.0,0.0,0.0,0.0,0.0,0.0,3.6,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2018,1,Happy Rock Homewreckers,Blainer,Melvin Gordon,Los Angeles Chargers,LAC,ACTIVE,RB,RB,16.62425,27.8,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,7.2,0.0,0.0,0.0,0.0,0.0,0.0,12.0,3.6,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2018,1,Happy Rock Homewreckers,Blainer,Rob Gronkowski,New Engalnd Patriots,NE,ACTIVE,TE,TE,14.862558,24.2,6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6.0,0.0,0.0,14.4,2.8,3.0,0.0,-2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2018,1,Happy Rock Homewreckers,Blainer,Allen Robinson,Chicago Bears,CHI,ACTIVE,WR,WR,13.913777,8.8,4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.2,1.6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2018,1,Happy Rock Homewreckers,Blainer,Marvin Jones Jr.,Detroit Lions,DET,INJURY_RESERVE,WR,WR,11.560917,7.6,4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6.0,1.6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Wall time: 1min 38s


In [33]:
# combine rosters from multiple seasons into single dataframe
if combine_years:
    
    if season == 2021:

        rosters_df_2021 = rosters_df.copy()
        rosters_df_w_scoring_2021 = rosters_df_w_scoring.copy()
        
    if season == 2020:

        rosters_df_2020 = rosters_df.copy()
        rosters_df_w_scoring_2020 = rosters_df_w_scoring.copy()
        
    if season == 2019:

        rosters_df_2019 = rosters_df.copy()
        rosters_df_w_scoring_2019 = rosters_df_w_scoring.copy()
        
    if season == 2018:

        rosters_df_2018 = rosters_df.copy()
        rosters_df_w_scoring_2018 = rosters_df_w_scoring.copy()
        
    try:
        rosters_df_2018, rosters_df_2019, rosters_df_2020, rosters_df_2021
        
    except NameError:
        print("haven't run the web scraper for all years yet...")
    
    else:
        
        rosters_df_all = pd.concat([rosters_df_2021, rosters_df_2020, rosters_df_2019, rosters_df_2018]).reset_index(drop=True)        
        rosters_df_all.to_pickle("../data/rosters_df_all.pkl")
        rosters_df_all.info()
        
        rosters_df_w_scoring_all = pd.concat([rosters_df_w_scoring_2021, rosters_df_w_scoring_2020, rosters_df_w_scoring_2019, rosters_df_w_scoring_2018]).reset_index(drop=True)
        rosters_df_w_scoring_all.to_pickle("../data/rosters_df_w_scoring_all.pkl")
        rosters_df_w_scoring_all.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13576 entries, 0 to 13575
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   year                13576 non-null  int64  
 1   week                13576 non-null  int64  
 2   owner_team          13576 non-null  object 
 3   owner               13576 non-null  object 
 4   player              13576 non-null  object 
 5   pro_team            13576 non-null  object 
 6   pro_team_abv        13547 non-null  object 
 7   current_inj_status  12440 non-null  object 
 8   lineup_slot_name    13576 non-null  object 
 9   position_name       13576 non-null  object 
 10  proj_points         13516 non-null  float64
 11  actual_points       12817 non-null  float64
 12  slot_id             13576 non-null  int64  
dtypes: float64(2), int64(3), object(8)
memory usage: 1.3+ MB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13576 entries, 0 to 13575
Columns: 236 entri
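
The season-by-season variables in the cell above could also be held in a dictionary keyed by season, which avoids one `if` branch per year. This is only a sketch on toy data: `combine_seasons` and `frames_by_season` are hypothetical names, and the real rosters dataframes have many more columns.

```python
import pandas as pd


def combine_seasons(frames_by_season, expected_seasons):
    """Concatenate per-season dataframes, newest season first.

    Returns None (and prints a note) until every expected season is present.
    """
    missing = [s for s in expected_seasons if s not in frames_by_season]
    if missing:
        print(f"haven't run the web scraper for seasons {missing} yet...")
        return None
    ordered = [frames_by_season[s] for s in sorted(expected_seasons, reverse=True)]
    return pd.concat(ordered).reset_index(drop=True)


# toy frames standing in for rosters_df from two separate runs
frames_by_season = {
    2018: pd.DataFrame({"year": [2018], "week": [1]}),
    2019: pd.DataFrame({"year": [2019], "week": [1]}),
}

partial = combine_seasons(frames_by_season, [2018, 2019, 2020])  # None, 2020 missing
combined = combine_seasons(frames_by_season, [2018, 2019])
print(combined)
```

In the notebook, the dictionary would be updated with `frames_by_season[season] = rosters_df.copy()` after each run, as long as the kernel is kept alive between seasons.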

## Create Matchups Dataframes <a id="section4"></a>

In this section, we'll create dataframes consisting of...

1. each fantasy football team's weekly matchups, saved to a csv file
2. each fantasy football team's total wins/losses through the most recent NFL week, saved to a csv file.
<br><br/>
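
To illustrate the second dataframe, here is a minimal sketch of turning weekly matchup rows into per-team totals. The column names match the matchups output in this notebook, and the four sample rows are copied from it; the `wins_losses` function is hypothetical and counts every non-win as a loss, which may not be how create_wins_losses treats ties or unfinished matchups.

```python
import pandas as pd

# four week-1 rows taken from the matchups dataframe shown in this notebook
matchups = pd.DataFrame(
    {
        "week": [1, 1, 1, 1],
        "owner_team_name": [
            "Sticky Icky",
            "Happy Rock Homewreckers",
            "Bench Don't Kill My Vibe",
            "Bud Lathrop Drive",
        ],
        "score": [82.8, 126.7, 122.2, 101.1],
        "win": [0, 1, 1, 0],
        "opp_score": [126.7, 82.8, 101.1, 122.2],
    }
)


def wins_losses(matchups_df):
    """Aggregate weekly rows into wins, losses, points for, and points against."""
    grouped = matchups_df.groupby("owner_team_name")
    wins = grouped["win"].sum()
    games = grouped["win"].count()
    summary = pd.DataFrame(
        {
            "wins": wins,
            "losses": games - wins,  # assumption: anything that is not a win is a loss
            "points_for": grouped["score"].sum(),
            "points_against": grouped["opp_score"].sum(),
        }
    )
    return summary.reset_index()


summary = wins_losses(matchups)
print(summary)
```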

[Return to Top](#return)

In [34]:
%%time

# instantiate create_matchups object (named differently from the class so the cell can be re-run)
matchups_builder = create_matchups(season)

# create dataframe of weekly matchups
matchups_df = matchups_builder.create_weekly_matchups(data)

# create dataframe of total wins/losses
win_loss_df = matchups_builder.create_wins_losses(matchups_df)

Weekly Matchups Shape: (160, 8)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 160 entries, 0 to 159
Data columns (total 8 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   week                 160 non-null    int64  
 1   owner_team_name      160 non-null    object 
 2   owner                160 non-null    object 
 3   score                160 non-null    float64
 4   win                  160 non-null    int64  
 5   opp_owner_team_name  160 non-null    object 
 6   opp_owner            160 non-null    object 
 7   opp_score            160 non-null    float64
dtypes: float64(2), int64(2), object(4)
memory usage: 10.1+ KB


None

Unnamed: 0,week,score,win,opp_score
count,160.0,160.0,160.0,160.0
mean,8.5,133.669375,0.4375,133.669375
std,4.624246,31.481613,0.497636,31.481613
min,1.0,44.6,0.0,44.6
25%,4.75,110.6,0.0,110.6
50%,8.5,129.4,0.0,129.4
75%,12.25,155.625,1.0,155.625
max,16.0,239.8,1.0,239.8


Unnamed: 0,week,owner_team_name,owner,score,win,opp_owner_team_name,opp_owner,opp_score
0,1,Sticky Icky,T-$,82.8,0,Happy Rock Homewreckers,Blainer,126.7
1,1,Happy Rock Homewreckers,Blainer,126.7,1,Sticky Icky,T-$,82.8
2,1,Bench Don't Kill My Vibe,Padge,122.2,1,Bud Lathrop Drive,Farmer,101.1
3,1,Bud Lathrop Drive,Farmer,101.1,0,Bench Don't Kill My Vibe,Padge,122.2
4,1,Springfield Atoms,Duvi,119.8,0,Pixel Whippers,Sembower,202.9


Total Wins/Losses Shape: (10, 5)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   owner_team_name  10 non-null     object 
 1   wins             10 non-null     int32  
 2   losses           10 non-null     int32  
 3   points_for       10 non-null     float64
 4   points_against   10 non-null     float64
dtypes: float64(2), int32(2), object(1)
memory usage: 400.0+ bytes


None

Unnamed: 0,wins,losses,points_for,points_against
count,10.0,10.0,10.0,10.0
mean,7.0,9.0,2138.71,2138.71
std,2.108185,2.108185,240.610393,68.105889
min,4.0,6.0,1827.1,2030.9
25%,5.25,7.5,1955.825,2111.75
50%,7.0,9.0,2078.95,2118.95
75%,8.5,10.75,2359.625,2178.575
max,10.0,12.0,2468.7,2275.2


Unnamed: 0,owner_team_name,wins,losses,points_for,points_against
0,Beacon Hill Posterizers,10,6,2359.8,2113.1
1,Bench Don't Kill My Vibe,4,12,2013.8,2123.0
2,Brookside Shokunin,10,6,2408.2,2197.1
3,Bud Lathrop Drive,7,9,2052.4,2114.9
4,CoMo FightinCamlToes,5,11,1827.1,2030.9


Wall time: 108 ms


In [35]:
# combine matchups from multiple seasons into single dataframe
if combine_years:
    
    columns = ['year','week','owner_team_name','owner','score','win','opp_owner_team_name','opp_owner','opp_score']
    
    if season == 2021:

        matchups_df_2021 = matchups_df.copy()
        matchups_df_2021['year'] = 2021
        matchups_df_2021 = matchups_df_2021[columns].sort_values('week')
        matchups_df_2021.info()
        
    if season == 2020:

        matchups_df_2020 = matchups_df.copy()
        matchups_df_2020['year'] = 2020
        matchups_df_2020 = matchups_df_2020[columns].sort_values('week')
        matchups_df_2020.info()
        
    if season == 2019:

        matchups_df_2019 = matchups_df.copy()
        matchups_df_2019['year'] = 2019
        matchups_df_2019 = matchups_df_2019[columns].sort_values('week')
        matchups_df_2019.info()
        
    if season == 2018:

        matchups_df_2018 = matchups_df.copy()
        matchups_df_2018['year'] = 2018
        matchups_df_2018 = matchups_df_2018[columns].sort_values('week')
        matchups_df_2018.info()
        
    try:
        matchups_df_2018, matchups_df_2019, matchups_df_2020, matchups_df_2021
        
    except NameError:
        print("\nhaven't run the web scraper for all years yet...")
    
    else:
        
        matchups_df_all = pd.concat([matchups_df_2021, matchups_df_2020, matchups_df_2019, matchups_df_2018]).reset_index(drop=True)
        matchups_df_all.to_pickle("../data/matchups_df_all.pkl")
        matchups_df_all.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 160 entries, 0 to 145
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   year                 160 non-null    int64  
 1   week                 160 non-null    int64  
 2   owner_team_name      160 non-null    object 
 3   owner                160 non-null    object 
 4   score                160 non-null    float64
 5   win                  160 non-null    int64  
 6   opp_owner_team_name  160 non-null    object 
 7   opp_owner            160 non-null    object 
 8   opp_score            160 non-null    float64
dtypes: float64(2), int64(3), object(4)
memory usage: 12.5+ KB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 648 entries, 0 to 647
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   year                 648 non-null    int64  
 1   week                 648