<h2> Exercise 8 - Measurement Error </h2>

We now demonstrate that measurement error in the dependent variable is typically not a concern for the slope estimate, whereas measurement error in the explanatory variable that is uncorrelated with its true value biases the estimate and increases the chance of a Type II error.  For this exercise, I'll draw samples from an exponential distribution, which is a special case of the $\Gamma$ distribution and gives the time between arrivals in a Poisson process.

In [4]:
using Distributions;
using PyPlot;

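d = Exponential();           # exponential distribution with the default scale of 1
N = 10000;                   # sample size
beta = 2.0;                  # true slope in the DGP

x = rand(d,N);
eps = rand(Normal(),N);      # structural error in the DGP (this name shadows Base.eps)
y = beta * x + eps;
betaHat = (x'*y)/(x'*x);     # OLS slope for a regression without an intercept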

println("Estimate using true values: ", betaHat)

Estimate using true values: 2.003328474256487


We begin by introducing measurement error $\epsilon_{\text{meas}}$ into $y$, resulting in an observed variable $y_{\text{obs}}\neq y$.  First we'll consider the case where the measurement error is correlated with the true value of $y$ by setting it equal to the error term in the DGP for $y$.

In [5]:
epsMeas = eps;
yObs = y + epsMeas;
println("Covariance between measurement error and true value of y: ", cov(y,epsMeas))

Covariance between measurement error and true value of y: 0.9863863186621228


Now we regress $y_{\text{obs}}$ on $x$ to see how this affects our estimate of $\beta$.

In [6]:
betaHat1 = (x'*yObs)/(x'*x)
println("Estimate using yObs: ", betaHat1)

Estimate using yObs: 2.006656948512974


The estimated coefficient is only minimally affected.  The same holds when we repeat the exercise with measurement error that is uncorrelated with the true value of $y$:

In [7]:
epsMeas = rand(Normal(),N);
yObs = y + epsMeas;
println("Covariance between measurement error and true value of y: ", cov(y,epsMeas))
betaHat2 = (x'*yObs)/(x'*x)
println("Estimate using yObs: ", betaHat2)

Covariance between measurement error and true value of y: 0.0021257177236058944
Estimate using yObs: 2.0035028125287533
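A short derivation may help explain why neither case moves the estimate much. It is only a sketch: it writes $\epsilon$ for the error term in the DGP (the variable `eps` in the code) and assumes both $\epsilon$ and $\epsilon_{\text{meas}}$ are uncorrelated with $x$. Substituting $y_{\text{obs}} = \beta x + \epsilon + \epsilon_{\text{meas}}$ into the no-intercept estimator gives

$$
\hat{\beta} = \frac{x'y_{\text{obs}}}{x'x} = \beta + \frac{x'(\epsilon + \epsilon_{\text{meas}})}{x'x}.
$$

The second term shrinks toward zero as $N$ grows, so the estimate remains consistent; the measurement error only adds noise. In the first case $\epsilon_{\text{meas}} = \epsilon$, so the composite error is $2\epsilon$, and the sampling error of the estimate is exactly twice that of the estimate based on the true values ($0.00666$ versus $0.00333$ in the output above).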


However, the picture changes when we introduce measurement error in the explanatory variable $x$ that is uncorrelated with the true value:

In [8]:
epsMeas = rand(Normal(),N);
xObs = x + epsMeas;
println("Covariance between measurement error and true value of x: ", cov(x,epsMeas))
betaHat3 = (xObs'*y)/(xObs'*xObs)
println("Estimate using xObs: ", betaHat3)

Covariance between measurement error and true value of x: -0.0033819304899209007
Estimate using xObs: 1.3280087904244329
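To check that this drop is systematic rather than an artifact of a single draw, the experiment can be repeated many times. The block below is a minimal sketch, not part of the original notebook: `slope` and `simulate_slopes` are hypothetical helper names, the seed and the number of replications are arbitrary, and it uses `randexp` and `randn` from Julia's `Random` standard library in place of `Distributions` so that it runs on its own.

```julia
using Random, Statistics

# No-intercept OLS slope, the same estimator as in the cells above
slope(x, y) = sum(x .* y) / sum(abs2, x)

# Hypothetical helper: rerun the experiment `reps` times and collect the slopes
function simulate_slopes(rng; reps=500, N=10_000, beta=2.0)
    estimates = zeros(reps)
    for r in 1:reps
        x = randexp(rng, N)              # true regressor, exponential with scale 1
        y = beta .* x .+ randn(rng, N)   # outcome generated from the true regressor
        xObs = x .+ randn(rng, N)        # regressor observed with independent N(0,1) error
        estimates[r] = slope(xObs, y)    # regress y on the mismeasured regressor
    end
    return estimates
end

rng = MersenneTwister(1234)              # arbitrary seed, for reproducibility only
estimates = simulate_slopes(rng)
println("Mean estimate across replications: ", mean(estimates))
println("Std. dev. across replications: ", std(estimates))
```

If the bias is systematic, the mean across replications should sit well below the true value of 2 and close to the single estimate reported above, with a spread that is small relative to the gap.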


As theory suggests, the estimate is now biased downwards (attenuated toward zero), which increases the chance of a Type II error.
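The size of the bias can be worked out for this particular setup. The following is a sketch under the assumptions used in the code: $x_{\text{obs}} = x + \epsilon_{\text{meas}}$, where $\epsilon_{\text{meas}}$ has mean zero and variance $\sigma^2_{\text{meas}}$ and is independent of both $x$ and the error term in the DGP. Because the regression has no intercept, the limit involves second moments rather than variances:

$$
\hat{\beta} = \frac{x_{\text{obs}}'y}{x_{\text{obs}}'x_{\text{obs}}} \;\xrightarrow{p}\; \beta\,\frac{E[x^2]}{E[x^2] + \sigma^2_{\text{meas}}}.
$$

For a unit-scale exponential, $E[x^2] = \operatorname{Var}(x) + E[x]^2 = 1 + 1 = 2$, and here $\sigma^2_{\text{meas}} = 1$, so the limit is $2 \cdot \tfrac{2}{3} = \tfrac{4}{3} \approx 1.333$, in line with the estimate of $1.328$ above. If the regression included an intercept, the factor would instead be $\operatorname{Var}(x)/(\operatorname{Var}(x) + \sigma^2_{\text{meas}}) = \tfrac{1}{2}$.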