My friends and I are avid OWL fans and participated in a fantasy league during season 1. Unfortunately, my team wasn't the best, as I mostly drafted from the heart and chose a bunch of SF Shock players. This season, I've decided to use data analysis to get an edge over my friends!

In [149]:
from bs4 import BeautifulSoup
import requests
from codecs import open
from re import sub
import sqlite3 as sql
import pandas as pd
import pandas.io.sql as psql

# cell indices of the following attributes
NAME = 1
ROLE = 2
TEAM = 3
KILLS = 4
DEATHS = 5
ULTS = 6
FK = 7
FD = 8
RES = 9
PTS = 10
PTS10 = 11


def clean(text):
    # strip any leftover HTML tags and surrounding whitespace to get the clean attribute
    # (uses `sub`, imported above via `from re import sub`; the bare `re` name was never imported)
    return sub(r'<.*?>', '', text).strip()

class Player:
    def __init__(self, info):
        self.info = info
        self.name = clean(self.info[NAME].text)
        self.role = clean(self.info[ROLE].text)
        self.team = clean(self.info[TEAM].text)
        self.kills = clean(self.info[KILLS].text)
        self.deaths = clean(self.info[DEATHS].text)
        self.ults = clean(self.info[ULTS].text)
        self.fk = clean(self.info[FK].text)
        self.fd = clean(self.info[FD].text)
        self.res = clean(self.info[RES].text)
        self.pts = clean(self.info[PTS].text)
        self.pts10 = clean(self.info[PTS10].text)

        # attributes in table-column order, ready to be used as a dataframe row
        self.attrs = [
            self.name, self.role, self.team, self.kills, self.deaths, self.ults,
            self.fk, self.fd, self.res, self.pts, self.pts10,
        ]

In [150]:
# this function takes a URL and makes a dataframe out of it
def create_week_dataframe(url):

    content = requests.get(url).content
    #page = open(url, 'r', 'utf-8')
    soup = BeautifulSoup(content, 'html.parser')

    # get the table headers
    table_header = soup.find_all('th')
    columns = ['Week']
    for cell in table_header[1:]: # skip 0 because that just has the player picture
        columns.append(clean(cell.text))

    # loop through the rows within the table for each player
    table = soup.find('tbody')
    player_table = table.find_all('tr')
    players = []
    for row in player_table: # no rows are skipped; every row in the tbody is treated as a player
        person = row.find_all('td')
        players.append(Player(person))

    week_num = url.split('week=')[1]

    # build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
    rows = [[week_num] + player.attrs for player in players]
    df = pd.DataFrame(rows, columns=columns)

    df = df.set_index('Player')
    
    return df

In [151]:
OWL_URL_BASE = "https://www.winstonslab.com/fantasy/result.php?cID=22&league=2171&week="
weekly_data = []
for week in range(1, 21): # weeks 1 through 20
    weekly_data.append(create_week_dataframe(OWL_URL_BASE + str(week)))

In [152]:
# begin conversion into SQL databases
import sqlite3
conn = sqlite3.connect('owl.db')
c = conn.cursor()

In [153]:
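for i, week in enumerate(weekly_data, start=1):
    # the stat columns (everything after Week, Role, Team) and Week are scraped as strings
    week[week.columns[3:]] = week[week.columns[3:]].apply(pd.to_numeric)
    week['Week'] = week['Week'].apply(pd.to_numeric)
    # one table per week: week1, week2, ...
    week.to_sql('week' + str(i), con=conn, if_exists='replace')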
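
Since each week lands in its own table (`week1`, `week2`, ...), any week-over-week question needs the tables stitched back together. Here is a minimal sketch of one way to do that, using an in-memory database as a stand-in for `owl.db`. The `Points` column name, the helper `all_weeks_query`, and the toy rows are assumptions for illustration; the real column names come from the scraped table headers.

```python
import sqlite3

# In-memory stand-in for owl.db with two toy weekly tables.
# Column names and rows are illustrative, not scraped data.
demo = sqlite3.connect(':memory:')
toy_weeks = {
    1: [('PlayerA', 30.0), ('PlayerB', 20.0)],
    2: [('PlayerA', 10.0), ('PlayerB', 25.0)],
}
for week, rows in toy_weeks.items():
    demo.execute(f'CREATE TABLE week{week} (Player TEXT, Week INTEGER, Points REAL)')
    demo.executemany(f'INSERT INTO week{week} VALUES (?, {week}, ?)', rows)

def all_weeks_query(num_weeks):
    # Hypothetical helper: stack week1..weekN into one result set with UNION ALL.
    return ' UNION ALL '.join(
        f'SELECT Player, Week, Points FROM week{n}' for n in range(1, num_weeks + 1))

combined = demo.execute(all_weeks_query(2) + ' ORDER BY Week, Player').fetchall()
print(combined)
```

Against the real database, the same query with `num_weeks=20` could be passed to `pd.read_sql_query` along with `conn` to get a single long dataframe.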

Now I need to think of the things I want to explore.

- Who are the best players?
    - Which players tend to be within the top of their roles? Of all roles?
- Which roles should I prioritize?
    - Which roles have only a few great players? I need to draft at least two people of each role, so this is important.
- Which players seemed to improve as time went on? Which players got worse? Which were consistent?
    - Some players might not have been top players in the beginning of the season but have improved enough to be worth early drafting. Others may have started off great but were subject to fatigue or suffered due to patch differences.
- How much do teams influence the number of points a player gets?
    - Should I focus more on getting good individual players or on which teams I feel will be successful?
    - If I find that teams are big indicators of individual performance, I may need to pull in team win/loss data from another source.
    
Because the winner in the fantasy league is determined by who gets the most points each week and **not** who gets the most points cumulatively, I think it makes the most sense to look at the data longitudinally. I will categorize "top" players as those who are within the top echelon for most weeks. The exact cutoffs will depend on my data exploration.
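
To make the weekly-versus-cumulative distinction concrete, here is a small sketch that measures how often each player lands in the top `n` for a week. The function `top_n_share`, the cutoff `n=2`, and the toy numbers are all placeholders, not real league data or a final definition of "top".

```python
from collections import Counter

# Toy weekly points: {week: {player: points}}. Names and numbers are made up.
weekly_points = {
    1: {'PlayerA': 30, 'PlayerB': 20, 'PlayerC': 10},
    2: {'PlayerA': 12, 'PlayerB': 25, 'PlayerC': 18},
    3: {'PlayerA': 28, 'PlayerB': 21, 'PlayerC': 9},
}

def top_n_share(weekly_points, n):
    """Fraction of weeks in which each player finished in the top n by points.

    Ties are broken by dictionary order, which is good enough for a sketch.
    """
    hits = Counter()
    for scores in weekly_points.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        hits.update(ranked[:n])
    total_weeks = len(weekly_points)
    players = {player for scores in weekly_points.values() for player in scores}
    return {player: hits[player] / total_weeks for player in players}

share = top_n_share(weekly_points, n=2)
for player in sorted(share):
    print(player, round(share[player], 2))
```

In this toy data PlayerA has the highest season total (70 versus 66 for PlayerB), yet PlayerB is the only one who makes the top two every single week, which is the kind of consistency a weekly-scored league rewards.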

Here are the steps I will take to determine my definition of "top" players:

1. Rank each player's performance per week
 * Per role
 * Overall
2. Plot each player's ranks against the number of points they got that week
3. Observe whether there is a natural falloff point (i.e., a rank where the point difference between that and the next rank is notable)
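
Steps 1 and 3 above can be sketched directly in SQL, since the data already lives in SQLite. This is only a rough outline under some assumptions: a single combined table (here called `scores`) with `Role` and `Points` columns, toy rows in place of the scraped data, and a SQLite build of 3.25 or newer for window functions. The plotting in step 2 is left out.

```python
import sqlite3

# Toy stand-in for one combined table; the table name, columns, and rows are illustrative only.
conn_demo = sqlite3.connect(':memory:')
conn_demo.execute('CREATE TABLE scores (Player TEXT, Role TEXT, Week INTEGER, Points REAL)')
conn_demo.executemany('INSERT INTO scores VALUES (?, ?, ?, ?)', [
    ('PlayerA', 'Offense', 1, 40.0),
    ('PlayerB', 'Offense', 1, 38.0),
    ('PlayerC', 'Support', 1, 25.0),
    ('PlayerD', 'Support', 1, 20.0),
])

# Step 1: rank within each week, both per role and overall (needs SQLite >= 3.25).
ranked = conn_demo.execute('''
    SELECT Player, Week, Points,
           RANK() OVER (PARTITION BY Week, Role ORDER BY Points DESC) AS role_rank,
           RANK() OVER (PARTITION BY Week ORDER BY Points DESC) AS overall_rank
    FROM scores
    ORDER BY Week, overall_rank
''').fetchall()

# Step 3: point gap between each overall rank and the next rank down,
# comparing only rows that belong to the same week.
gaps = [(upper[4], upper[2] - lower[2])
        for upper, lower in zip(ranked, ranked[1:])
        if upper[1] == lower[1]]

for row in ranked:
    print(row)
print(gaps)
```

Each entry in `gaps` pairs an overall rank with the points lost by dropping to the next rank, so an unusually large value would mark a candidate falloff point.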