# Standard Deviation vs. Interquartile Range

The standard deviation and the interquartile range are two measures of the spread of a distribution. Relying on standard deviation to assess player consistency in fantasy football can be misleading for two reasons.

Standard deviation is the square root of the variance, which is the average of the squared deviations from the mean. With a small sample, such as the games in an NFL season, a single outlier game can distort both the mean and the standard deviation. A related problem is that the reader may assume that 68% of the values fall within one standard deviation of the mean, which does not hold if the values are not normally distributed.
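
For reference, the sample standard deviation, in the form that R's `sd()` computes (with an `n - 1` denominator), can be written as follows, where $x_i$ is the score in game $i$, $\bar{x}$ is the mean score, and $n$ is the number of games:

$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
$$

Because each deviation is squared before it is summed, a single game far from the mean contributes disproportionately to $s$.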

Consider the following player, who scores a random value between 10 and 15 points in each of 15 games and then has a 40-point explosion in week 17. This player is extremely consistent, and the "inconsistent" game is not problematic from a fantasy perspective, as there is no downside to a player having an occasional big game.

In [28]:
# 15 typical games (10-15 points each) plus one 40-point outlier
gamelog1 <- c(sample(10:15, 15, replace = TRUE), 40)
gamelog1

In [29]:
library(psych)
describe(gamelog1)

   vars  n    mean       sd median  trimmed    mad min max range     skew kurtosis       se
X1    1 16 13.6875 7.217744     12 12.07143 2.2239  10  40    30 2.983278 8.060159 1.804436


Using standard deviation as a measure of consistency presents a misleading picture in this case because both the mean and the standard deviation are artificially inflated by one outlier. Read with the 68% rule in mind, it also suggests that about 11 of the player's 16 games (16 × 0.68) fall between 6.5 (13.7 - 7.2) and 20.9 (13.7 + 7.2) fantasy points. That overstates the amount of week-to-week variance: the minimum value is 10, and only 1 game falls outside the range of 10 to 15.
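
The gap between the 68% rule and the actual data can be checked directly. The following is a minimal sketch that uses a fixed, hypothetical game log (not the sampled values above) with the same shape: 15 games of 10 to 15 points and one 40-point game. The vector and the variable names are assumptions made for illustration.

```r
# Hypothetical game log (not the sampled data above):
# 15 games of 10-15 points plus one 40-point game
gamelog <- c(10, 11, 12, 13, 14, 15, 10, 11, 12, 13, 14, 15, 10, 11, 12, 40)

m <- mean(gamelog)
s <- sd(gamelog)  # sample standard deviation (n - 1 denominator)

# Count the games that fall within one standard deviation of the mean
within_one_sd <- sum(gamelog >= m - s & gamelog <= m + s)

# Count implied by the 68% rule of thumb for normally distributed data
expected_if_normal <- 0.68 * length(gamelog)

cat("Within one SD:", within_one_sd, "of", length(gamelog), "\n")
cat("Implied by the 68% rule:", round(expected_if_normal, 1), "\n")
```

With this vector, 15 of the 16 games fall inside the one-standard-deviation band, well above the roughly 11 that the 68% rule implies. The band is only that wide because the single outlier stretches it.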

The interquartile range (IQR) shows that the middle 50% of the values are between 10.75 and 13.25. That is a more accurate picture, and it probably says more about consistency than a deviation from a mean that is skewed by an outlier.

In [30]:
summary(gamelog1)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   10.75   12.00   13.69   13.25   40.00 

Consider another player who is more inconsistent on a weekly basis but never has a big game. Standard deviation depicts this player as much more consistent than the previous player, even though that is not really the case. The IQR is higher for player 2 (15.25 - 10.75 = 4.5) than for player 1 (13.25 - 10.75 = 2.5), despite a much lower standard deviation (3.1 versus 7.2).

In [35]:
# 16 games spread evenly across 7-17 points, with no outlier
gamelog2 <- sample(7:17, 16, replace = TRUE)
gamelog2

In [36]:
describe(gamelog2)

   vars  n    mean       sd median  trimmed    mad min max range      skew  kurtosis        se
X1    1 16 12.5625 3.119161   11.5 12.64286 2.9652   7  17    10 0.1109729 -1.302346 0.7797903


In [37]:
summary(gamelog2)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   7.00   10.75   11.50   12.56   15.25   17.00 
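
Because `sample()` draws new values on every run, the numbers above will differ unless a seed is set. As a reproducible sketch of the same comparison, the following uses two fixed, hypothetical game logs (assumptions for illustration, not the sampled data) and puts both measures side by side with base R's `sd()` and `IQR()`.

```r
# Hypothetical game logs, fixed so the comparison is reproducible
player1 <- c(10, 11, 12, 13, 14, 15, 10, 11, 12, 13, 14, 15, 10, 11, 12, 40)
player2 <- c(7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 7, 9, 12, 15, 17)

# Standard deviation and interquartile range for each player
spread <- data.frame(
  player = c("player1", "player2"),
  sd     = c(sd(player1), sd(player2)),
  iqr    = c(IQR(player1), IQR(player2))
)
print(spread)
```

For these vectors the two measures rank the players in opposite order: player 1 has the larger standard deviation because of the single 40-point game, while player 2 has the larger IQR (6 versus 3) because the typical weekly scores are more spread out.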