### Is the performance of fantasy teams in our league better than random at a statistically significant level?

Hank and I have been managing a team for 4 years, so we want to assess whether we are skilled or lucky. We define "skilled" as the sum of our yearly ranks being low enough that it would have less than a 5% chance of occurring if each year's rank were drawn at random from a uniform distribution.

Formally, we define this as:

$T = \sum_{i=1}^n X_i$, where $X_i \sim \lceil \text{Unif}(0, 12) \rceil$ (each yearly rank is equally likely to be any integer from 1 to 12) and $n$ is the number of years the team has been in the league

$H_0: T \geq \tau$ vs. $H_1: T < \tau$, where $P(T < \tau) = 0.05$, or equivalently $\tau = F^{-1}(0.05)$, where $F$ is the CDF of $T$
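As a cross-check on the simulation, the null distribution of $T$ can also be computed exactly, because $T$ is a sum of independent ranks that are each uniform on 1 to 12. The sketch below is one possible way to do that with repeated convolution. The names `exact_pmf`, `lower_tail`, and `num_teams` are illustrative and not part of the original notebook.

```python
from fractions import Fraction

def exact_pmf(num_years, num_teams=12):
    """Exact PMF of T, the sum of `num_years` independent ranks,
    each uniform on 1..num_teams. Returns {total: probability}."""
    pmf = {0: Fraction(1)}
    step = Fraction(1, num_teams)
    for _ in range(num_years):
        nxt = {}
        # Convolve the running distribution with one more uniform rank.
        for total, prob in pmf.items():
            for rank in range(1, num_teams + 1):
                nxt[total + rank] = nxt.get(total + rank, 0) + prob * step
        pmf = nxt
    return pmf

def lower_tail(pmf, t):
    """P(T <= t) under the uniform null."""
    return sum(p for total, p in pmf.items() if total <= t)

alpha = Fraction(5, 100)
pmf4 = exact_pmf(4)
# Largest total whose exact lower-tail probability is still below alpha.
largest_rejected = max(t for t in pmf4 if lower_tail(pmf4, t) < alpha)
print(largest_rejected, float(lower_tail(pmf4, largest_rejected)))
```

For four seasons this gives 14, with $P(T \leq 14) = 1001/20736 \approx 0.048$, while $P(T \leq 15) = 1365/20736 \approx 0.066$. Using `Fraction` keeps the probabilities exact, so no simulation noise enters the comparison with $\alpha$.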

In [11]:
import numpy as np

from numpy import random as rand
rand.seed(69)

num_years = 4
N = int(1e6)
draws = np.ceil(rand.uniform(0, 12, size=(N,num_years)))
ranks = draws.sum(axis=1)

place = [2, 5, 4, 4]  # 2020 season hasn't finished yet, but we're in the final 4, so 4th is our worst possible place
pts = [4, 8, 4, 3]
mediocre = [6, 7, 6, 7]

T_chance = sum(mediocre)
T = sum(pts); T

19

In [12]:
from scipy.stats import percentileofscore
crit_val = 0.05
print('The rejection region is where T < {} using a critical value of {}'.format(np.percentile(ranks, crit_val*100), crit_val))
print('The p value of our pts T={} statistic is {}'.format(T, np.round(percentileofscore(ranks, T)/100, 4)))
print('The p value of our rank T={} statistic is {}'.format(sum(place), np.round(percentileofscore(ranks, sum(place))/100, 4)))
print('For calibration, the p value of our mediocre T={} statistic is {}'.format(T_chance, np.round(percentileofscore(ranks, T_chance)/100, 4)))

The rejection region is where T < 15.0 using a critical value of 0.05
The p value of our pts T=19 statistic is 0.1629
The p value of our rank T=15 statistic is 0.0572
For calibration, the p value of our mediocre T=26 statistic is 0.4998


As a result, we fail to reject the null hypothesis: **neither our points rank nor our place rank provides sufficient evidence that our performance differs from luck** at our decision threshold $\alpha = 0.05$. Note that if we finish better than 4th this season, we can reject on place.
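Because $T$ only takes integer values, many simulated totals tie with the observed one, so the p-value depends on how ties are counted. As far as I understand, `percentileofscore` with its default `kind='rank'` averages the strict ($<$) and weak ($\leq$) rankings, which is why the place result at $T = 15$ sits so close to the threshold. The sketch below enumerates all $12^4$ outcomes to show the three conventions exactly. `tail_probs` and `NUM_TEAMS` are illustrative names, not part of the original notebook.

```python
from collections import Counter
from itertools import product

NUM_TEAMS = 12  # league size, assumed from the Unif(0, 12) model

def tail_probs(t, num_years):
    """Exact P(T < t), P(T <= t), and their midpoint under the uniform null."""
    counts = Counter(
        sum(ranks)
        for ranks in product(range(1, NUM_TEAMS + 1), repeat=num_years)
    )
    total = NUM_TEAMS ** num_years
    strict = sum(c for s, c in counts.items() if s < t) / total
    weak = sum(c for s, c in counts.items() if s <= t) / total
    return strict, weak, (strict + weak) / 2

# 15 is our current place total; 14 is the total if we finish 3rd instead of 4th.
for t in (15, 14):
    strict, weak, mid = tail_probs(t, num_years=4)
    print(f"T={t}: P(T<t)={strict:.4f}  P(T<=t)={weak:.4f}  midpoint={mid:.4f}")
```

At $T = 15$ the strict probability falls below 0.05 while the weak and midpoint values do not, so the decision is sensitive to the tie convention. At $T = 14$ all three fall below 0.05, which is consistent with the note that a better finish would let us reject on place.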

Next, let's test the performance of the team that won the championship in 3 of the past 4 seasons and could win a 4th this year.

In [13]:
num_years = 5
N = int(1e6)
draws = np.ceil(rand.uniform(0, 12, size=(N,num_years)))
ranks = draws.sum(axis=1)

br_place = [1, 1, 1, 9, 4]  # 2020 season hasn't finished yet, but they're in the final 4, so 4th is their worst possible place
br_pts = [2, 2, 4, 8, 5]

T = sum(br_pts); T

crit_val = 0.05
print('The rejection region is where T < {} using a critical value of {}'.format(np.percentile(ranks, crit_val*100), crit_val))
print('The p value for pts with a T={} statistic is {}'.format(T, np.round(percentileofscore(ranks, T)/100, 4)))
print('The p value for place with a T={} statistic is {}'.format(sum(br_place), np.round(percentileofscore(ranks, sum(br_place))/100, 4)))

The rejection region is where T < 20.0 using a critical value of 0.05
The p value for pts with a T=21 statistic is 0.07
The p value for place with a T=16 statistic is 0.0146


We reject the null hypothesis on place: **we have sufficient evidence to conclude their performance is better than luck** based on final rankings at our decision threshold $\alpha = 0.05$ ($p = 0.0146$). However, with points, which is arguably a better indicator of performance, we fail to reject ($p = 0.07$).
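Since these p-values come from simulation, it is worth checking that none of the decisions could flip because of Monte Carlo noise. A rough sketch, assuming each simulated draw can be treated as an independent Bernoulli trial (only an approximation here, since the reported values average strict and weak counts), is shown below. `mc_standard_error` is a hypothetical helper, not part of the original notebook.

```python
import math

def mc_standard_error(p_hat, num_draws):
    """Approximate standard error of a simulated tail probability,
    treating each draw as an independent Bernoulli trial."""
    return math.sqrt(p_hat * (1.0 - p_hat) / num_draws)

N = int(1e6)   # number of simulated draws used in the cells above
ALPHA = 0.05   # decision threshold

# p-values copied from the printed output above.
reported = [("our place", 0.0572), ("their pts", 0.07), ("their place", 0.0146)]
for label, p_hat in reported:
    se = mc_standard_error(p_hat, N)
    gap = (p_hat - ALPHA) / se  # distance from alpha in standard errors
    print(f"{label}: p={p_hat}, se={se:.5f}, distance from alpha={gap:+.1f} se")
```

With a million draws, each estimate sits many standard errors away from 0.05, so under this approximation the reject or fail-to-reject calls are not artifacts of the random seed.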