# Day 13
Today I'm going to do something a little different. Instead of running an analysis, I'm going to write a script that scrapes the [Fantasy Football Tiers](http://www.borischen.co) published by Boris Chen. All credit to him for providing this data for free every week.

Eventually I plan to add code that connects to your Yahoo Fantasy Football league so you can see at a glance who you should start based on the Boris Chen tiers. I'll also try to build a web app to make this easier: instead of downloading this notebook, you could visit the app, connect to your fantasy league (or input your players manually), and see how they rank.

## Set Up

In [118]:
from bs4 import BeautifulSoup as Soup
import pandas as pd
import requests
import dataframe_image as dfi

In [15]:
# Scrape
url = "http://www.borischen.co/p/half-05-5-ppr-running-back-tier-rankings.html"
response = requests.get(url)

# Parse (reuse the response instead of requesting the page a second time)
soup = Soup(response.content, "html.parser")

In [25]:
# Look for where the table is stored
soup.find("object")

<object data="https://s3-us-west-1.amazonaws.com/fftiers/out/text_RB-HALF.txt" style="height: 100%; margin: 1%; width: 100%;" type="text/html"></object>

In [29]:
# Get the URL from the object's data attribute
url = soup.find("object")['data']
url

'https://s3-us-west-1.amazonaws.com/fftiers/out/text_RB-HALF.txt'
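
The file name at the end of that URL follows a simple pattern: the position, then a scoring suffix. Here is a minimal sketch of building it by hand. The helper name `build_tiers_url` and the validation sets are hypothetical, and the rule that QB and standard-scoring files carry no suffix is an assumption taken from the function in the next section, not something documented by the site.

```python
BASE_URL = "https://s3-us-west-1.amazonaws.com/fftiers/out"

# Assumed valid inputs, taken from the positions and scoring codes used in this notebook
POSITIONS = {"QB", "RB", "WR", "TE", "FLX"}
SCORING = {"STAN", "HALF", "PPR"}


def build_tiers_url(position, scoring):
    """Hypothetical helper: return the text-file URL for a position and scoring system."""
    if position not in POSITIONS:
        raise ValueError(f"unknown position: {position!r}")
    if scoring not in SCORING:
        raise ValueError(f"unknown scoring: {scoring!r}")

    # Assumption: QB and standard-scoring files have no scoring suffix
    if position == "QB" or scoring == "STAN":
        return f"{BASE_URL}/text_{position}.txt"
    return f"{BASE_URL}/text_{position}-{scoring}.txt"


rb_half_url = build_tiers_url("RB", "HALF")
qb_url = build_tiers_url("QB", "PPR")
```

Validating the inputs up front means a typo such as `'FLEX'` raises an error right away instead of requesting a file that does not exist.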

## Extracting the Data

In [113]:
def get_player_tiers(position, scoring):
    """
    Download one tier list and return it as a DataFrame.

    position: 'QB', 'RB', 'WR', 'TE', or 'FLX'
    scoring:
        "STAN": standard
        "HALF": half-ppr
        "PPR": ppr
    """

    # Build URL: QB and standard-scoring files have no scoring suffix
    base = "https://s3-us-west-1.amazonaws.com/fftiers/out"
    if position == "QB" or scoring == "STAN":
        url = f"{base}/text_{position}.txt"
    else:
        url = f"{base}/text_{position}-{scoring}.txt"

    # Get the text file with player info
    table = requests.get(url).text

    # Split into one entry per player; the first entry of each line
    # still carries its "Tier N:" prefix. Empty entries are dropped.
    entries = [x.strip() for x in table.replace("\n", ",").split(",")]
    entries = [x for x in entries if x]

    # Collect names and tiers for the DataFrame
    player_names = []
    tiers = []
    current_tier = 1

    for entry in entries:
        if entry.startswith("Tier"):
            # "Tier 3: Some Player" -> tier 3, name "Some Player"
            label, name = entry.split(":", 1)
            current_tier = int(label.split(" ")[1])
            player_names.append(name.strip())
        else:
            player_names.append(entry)
        tiers.append(current_tier)

    data = {
        'player_name': player_names,
        'position': [position] * len(player_names),
        'scoring': [scoring] * len(player_names),
        'tier': tiers,
    }

    return pd.DataFrame(data)

def get_my_players(ds, player_list, scoring_list):
    """Return the rows of ds for the given players and scoring systems."""

    f_players = ds['player_name'].isin(player_list)
    f_scoring = ds['scoring'].isin(scoring_list)

    mine = ds[f_players & f_scoring].sort_values(
        ['position', 'scoring', 'tier'], ascending=[False, True, True])

    # Shift the 0-based index by one so it reads as a rank within each position
    mine.index = mine.index + 1

    return mine

In [97]:
get_player_tiers('RB', 'HALF').head()

| | player_name | position | scoring | tier |
|---|---|---|---|---|
| 0 | Austin Ekeler | RB | HALF | 1 |
| 1 | Alvin Kamara | RB | HALF | 1 |
| 2 | Derrick Henry | RB | HALF | 1 |
| 3 | Josh Jacobs | RB | HALF | 1 |
| 4 | Aaron Jones | RB | HALF | 2 |
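
To see what the parsing step does without hitting the network, here is a minimal sketch that runs on a made-up sample. The `sample` text and the `parse_tiers` name are assumptions for illustration. The real file is assumed to hold one `Tier N: player, player, ...` line per tier, which is what `get_player_tiers` expects. This version works line by line rather than flattening everything on commas first, but it yields the same kind of name and tier pairs.

```python
def parse_tiers(text):
    """Hypothetical helper: turn 'Tier N: a, b' lines into (player, tier) pairs."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. one left by a trailing newline
        label, _, names = line.partition(":")
        tier = int(label.split()[1])  # "Tier 2" -> 2
        for name in names.split(","):
            if name.strip():
                rows.append((name.strip(), tier))
    return rows


# Made-up sample in the assumed format (not real rankings)
sample = "Tier 1: Player A, Player B\nTier 2: Player C\n"
parsed = parse_tiers(sample)
print(parsed)  # [('Player A', 1), ('Player B', 1), ('Player C', 2)]
```

Because the pairs come out in file order, a player's position in the list doubles as an overall rank within that position.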


## Get player rankings

Imagine you play in a half-PPR (0.5 PPR) league and want to know where your QBs, RBs, WRs, TEs, and FLEX players stand. You can do the following:

In [115]:
# Positions you have on your team
positions = ['QB', 'RB', 'WR', 'FLX', 'TE']

# League scoring
# Add more if you are in multiple leagues with different scoring systems
scoring_systems = ['HALF']

# Get data
datasets = []

for scoring in scoring_systems:
    for pos in positions:
        datasets.append(get_player_tiers(pos, scoring))
    

all_players = pd.concat(datasets)
all_players.head()

| | player_name | position | scoring | tier |
|---|---|---|---|---|
| 0 | Josh Allen | QB | HALF | 1 |
| 1 | Jalen Hurts | QB | HALF | 1 |
| 2 | Patrick Mahomes II | QB | HALF | 1 |
| 3 | Lamar Jackson | QB | HALF | 2 |
| 4 | Kyler Murray | QB | HALF | 2 |


### Only your players

In [119]:
my_players = [
        'Josh Allen', 
        'Chris Godwin', 
        'Josh Palmer', 
        'Josh Jacobs', 
        'Khalil Herbert', 
        'Taysom Hill', 
        'Michael Pittman Jr.', 
        'Justin Herbert', 
        'T.J. Hockenson']

# The index is each player's overall rank within their position and scoring system
my_player_rankings = get_my_players(all_players, my_players, ['HALF'])
my_player_rankings

| | player_name | position | scoring | tier |
|---|---|---|---|---|
| 12 | Chris Godwin | WR | HALF | 3 |
| 24 | Michael Pittman Jr. | WR | HALF | 4 |
| 6 | Taysom Hill | TE | HALF | 2 |
| 12 | T.J. Hockenson | TE | HALF | 3 |
| 4 | Josh Jacobs | RB | HALF | 1 |
| 24 | Khalil Herbert | RB | HALF | 6 |
| 1 | Josh Allen | QB | HALF | 1 |
| 6 | Justin Herbert | QB | HALF | 2 |
| 8 | Chris Godwin | FLX | HALF | 2 |
| 22 | Michael Pittman Jr. | FLX | HALF | 5 |
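
The index works as a rank because `pd.concat` keeps each DataFrame's original 0-based index by default, so the numbering restarts for every position. A minimal sketch with made-up players (the names and tiers are placeholders, not real rankings):

```python
import pandas as pd

# Two tiny per-position tables, each already in rank order (placeholder data)
qbs = pd.DataFrame({"player_name": ["QB One", "QB Two"], "position": "QB", "tier": [1, 2]})
rbs = pd.DataFrame({"player_name": ["RB One", "RB Two", "RB Three"], "position": "RB", "tier": [1, 1, 2]})

# concat keeps each frame's own index, so it restarts at 0 for every position
combined = pd.concat([qbs, rbs])

# Filter to a roster, then shift the index so it reads as a 1-based positional rank
roster = combined[combined["player_name"].isin(["QB Two", "RB Three"])].copy()
roster.index = roster.index + 1
ranks = dict(zip(roster["player_name"], roster.index.tolist()))
print(ranks)
```

Passing `ignore_index=True` to `pd.concat` would renumber the rows from 0 across all positions, which would lose this per-position rank.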


In [120]:
# Save table for Twitter post
my_player_rankings.dfi.export('./twitter/day13_table.png')

