This notebook checks whether the fantasy point calculations produced by `FbDbScraper.get_fantasy_df()` are accurate. The `calc_df` data set is created with the `get_fantasy_df()` method. The `yahoo_df` data set was scraped from my Yahoo league's player data using the `yahoo_scraper.py` module; it contains player names and their correct fantasy point totals to test against. The object that calls `get_fantasy_df()` uses the same fantasy settings as my Yahoo league.

Add `..` to the path and import the necessary modules:

In [1]:
import os
import sys

sys.path.append('..')

from football_db_scraper import FbDbScraper
from pro_football_ref.pro_football_ref_scraper2 import ProFbRefScraper
import pandas as pd
import numpy as np

Create an object:

In [2]:
fb_db = FbDbScraper()

Get the data frames:

In [3]:
calc_df = fb_db.get_fantasy_df(start_year=2017, end_year=2017)
yahoo_df = pd.read_csv('yahoo_data.csv')

Take a quick look at how each data frame is shaped (mainly concerned with the number of rows):

In [4]:
calc_df.shape

(613, 36)

In [5]:
yahoo_df.shape

(1050, 3)

Sort both data frames by fantasy points, highest first, in case they need to be inspected by hand:

In [6]:
calc_df.sort_values(by='fantasy_points', axis=0, ascending=False, inplace=True)
yahoo_df.sort_values(by='fantasy_points', axis=0, ascending=False, inplace=True)

Get a list of the player names that appear in both data sets.

***Note:*** Some names appear in only one of the two data frames. Here we are not concerned with which names are missing from which data frame. We are only interested in whether the fantasy point calculations are correct for the players found in both.

In [7]:
names = [name for name in calc_df['name'].values if name in yahoo_df['name'].values]

The number of players found in both data sets:

In [8]:
len(names)

526
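
The name matching above scans the Yahoo name array once for every row of `calc_df`. As a minimal sketch of an alternative, assuming both frames have a `name` column as they do here, `Series.isin` does the same membership test in one vectorized call. The toy frames and player names below are hypothetical stand-ins, not the real data:

```python
import pandas as pd

# Toy stand-ins for calc_df and yahoo_df (hypothetical data, not real results).
calc_toy = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "fantasy_points": [310.5, 250.0, 120.25],
})
yahoo_toy = pd.DataFrame({
    "name": ["Player B", "Player C", "Player D"],
    "fantasy_points": [250.0, 120.25, 99.0],
})

# isin() checks every name against the Yahoo names in one vectorized call
# instead of rescanning the Yahoo name array for each row.
shared_mask = calc_toy["name"].isin(yahoo_toy["name"])
shared_names = calc_toy.loc[shared_mask, "name"].tolist()

print(shared_names)
```

Like the list comprehension, this keeps the order of the calculated data frame and keeps a name twice if it appears twice there.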

Get a list of players whose calculated fantasy points differ from the actual Yahoo totals. An empty list means all calculations are correct:

In [9]:
incorrect = []
for name in names:
    if name == 'Chris Thompson':
        # There are two Chris Thompsons in the dataset, which leads to issues when grabbing from the name column.
        # Therefore, we will leave those players out.
        continue
    calculated = round(calc_df[calc_df['name'] == name]['fantasy_points'].values[0], 2)
    yahoo = yahoo_df[yahoo_df['name'] == name]['fantasy_points'].values[0]
    if calculated != yahoo:
        incorrect.append(name)

All calculations are correct! (So it seems, at least.)

In [10]:
len(incorrect)

0

In [11]:
incorrect

[]
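
The loop above skips one duplicated name by hand and compares floats with `!=`. The sketch below shows one possible alternative, not the method used in this notebook: drop every name that appears more than once, merge the two frames on `name`, and compare with a small tolerance via `numpy.isclose`. The toy frames, the `unique_names_only` helper, and the `0.005` tolerance are all assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Toy stand-ins for calc_df and yahoo_df (hypothetical data, not real results).
calc_toy = pd.DataFrame({
    "name": ["Player A", "Player B", "Sam Twin", "Sam Twin"],
    "fantasy_points": [310.499, 250.0, 80.0, 12.0],
})
yahoo_toy = pd.DataFrame({
    "name": ["Player A", "Player B", "Sam Twin", "Sam Twin"],
    "fantasy_points": [310.5, 251.0, 80.0, 12.0],
})


def unique_names_only(df):
    """Hypothetical helper: drop every row whose name appears more than once."""
    return df[~df["name"].duplicated(keep=False)]


# validate="one_to_one" makes pandas raise if a duplicate name slips through.
merged = unique_names_only(calc_toy).merge(
    unique_names_only(yahoo_toy),
    on="name",
    suffixes=("_calc", "_yahoo"),
    validate="one_to_one",
)

# Round the calculated value as the loop does, then compare with a tolerance
# (assumed here) instead of exact float equality.
matches = np.isclose(
    merged["fantasy_points_calc"].round(2),
    merged["fantasy_points_yahoo"],
    atol=0.005,
)
incorrect_toy = merged.loc[~matches, "name"].tolist()

print(incorrect_toy)
```

In this toy data only `Player B` is flagged, because its two totals differ by a full point, while both `Sam Twin` rows are excluded before the comparison. Any name duplicated in either frame is left out automatically, so ambiguous players never reach the comparison.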