<h2> Exercise 6 - Estimation of the Least Squares Predictor </h2>

We will now estimate the optimal least squares predictor of a variable $y$ given a variable $x$ when $y$ is a linear function of $x$ plus an error term:
\begin{align}
y = \beta x + \epsilon
\end{align}
We will consider two possible specifications of the error term:
\begin{align}
\epsilon &\sim \text{N}(0,1)\notag\\
\epsilon &= e^x + u,\ \ u\sim\text{N}(0,1)\notag
\end{align}

In [12]:
using Distributions
using PyPlot

For fun, I'll draw the values of $x$ from a $\chi_3^2$ distribution.  $\chi_q^2$ distributions frequently arise as the asymptotic distributions of test statistics.  They have mean $q$ and variance $2q$.

In [13]:
d = Chisq(3);
println("Mean: ", mean(d))
println("Variance: ", var(d))

Mean: 3.0
Variance: 6.0


Next, we draw $N = 10000$ values of $x$ and of each error term $\epsilon$, and compute the corresponding values of $y$.

In [14]:
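N = 10000;                           # sample size
beta = 2;                            # true linear coefficient
x = rand(d,N);                       # draws of x from the chi-squared(3) distribution
eps1 = rand(Normal(),N);             # errors independent of x
eps2 = exp.(x) + rand(Normal(),N);   # errors containing a nonlinear function of x
y1 = beta*x+eps1;
y2 = beta*x+eps2;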

Finally, we compute the least squares coefficient using the sample analogue of the solution to the first-order condition.
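
For reference, the first-order condition can be sketched as follows, assuming a predictor of the form $bx$ with no intercept (as in the code) and assuming $E[x^2]$ is finite and positive.  The symbols $b$, $\beta^*$ and $\hat{\beta}$ are labels introduced only for this sketch:
\begin{align}
\beta^* &= \arg\min_b E\left[(y - bx)^2\right]\notag\\
0 &= -2E\left[x(y - \beta^* x)\right]\ \ \Rightarrow\ \ \beta^* = \frac{E[xy]}{E[x^2]}\notag\\
\hat{\beta} &= \frac{\frac{1}{N}\sum_{i=1}^N x_i y_i}{\frac{1}{N}\sum_{i=1}^N x_i^2} = \frac{x'y}{x'x}\notag
\end{align}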

In [15]:
betaHat1 = (x'*y1)/(x'*x);
betaHat2 = (x'*y2)/(x'*x);
println("Standard normal errors: Estimated beta is ", betaHat1)
println("Nonlinear errors: Estimated beta is ", betaHat2)

Standard normal errors: Estimated beta is 2.0023314822300837
Nonlinear errors: Estimated beta is 107534.30805260035


When the errors are independent of $x$ and have mean zero, we recover the linear coefficient $\beta$.  However, when they contain a nonlinear function of $x$ (here $e^x$, which dominates $y$ for large values of $x$), the best linear predictor of $y$ is no longer $\beta x$.  Instead, we get a very large coefficient, because the linear term attempts to absorb the large impact of the $e^x$ term.
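
One way to see where the large coefficient comes from is to substitute $y = \beta x + \epsilon$ into the estimator, which gives $\hat{\beta} = \beta + (x'\epsilon)/(x'x)$.  The sketch below checks this decomposition numerically for the nonlinear errors and reports how much of $\sum_i x_i e^{x_i}$ comes from the single largest draw.  The seed and the names `correction` and `contrib` are assumptions made for this illustration, and the numbers will differ from the run above.  Note also that the moment generating function of a $\chi_3^2$ variable is finite only for arguments below $1/2$, so $E[x e^x]$ is not finite and the estimate should be expected to depend heavily on the largest draws.

```julia
using Distributions, Random

Random.seed!(1234)                      # arbitrary seed, only for reproducibility
N = 10_000
beta = 2
x = rand(Chisq(3), N)
eps2 = exp.(x) .+ rand(Normal(), N)     # errors containing exp(x)
y2 = beta .* x .+ eps2

betaHat2 = (x' * y2) / (x' * x)
correction = (x' * eps2) / (x' * x)     # sample analogue of E[x*eps]/E[x^2]

println("betaHat2:          ", betaHat2)
println("beta + correction: ", beta + correction)

# Share of sum(x_i * exp(x_i)) contributed by the single largest term
contrib = x .* exp.(x)
println("Share from largest term: ", maximum(contrib) / sum(contrib))
```

If the share printed on the last line is close to one, the estimate is essentially determined by one observation, which would be consistent with the coefficient changing substantially from sample to sample.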

We now repeat the exercise, drawing either $x$ or $\epsilon$ from a Cauchy distribution:

In [25]:
CauchyDraws = rand(Cauchy(),N);
yCx = beta*CauchyDraws + eps1;
yCeps = beta*x+CauchyDraws;
betaHatCx = (CauchyDraws'*yCx)/(CauchyDraws'*CauchyDraws);
betaHatCeps = (x'*yCeps)/(x'*x);
println("Cauchy distribution for x: Estimated beta is ", betaHatCx)
println("Cauchy errors: Estimated beta is ", betaHatCeps)

Cauchy distribution for x: Estimated beta is 1.9999671121058735
Cauchy errors: Estimated beta is 2.0004583616675777


The Cauchy distribution has neither a finite mean nor a finite variance, so the moment conditions behind the usual law of large numbers fail in the population.  Nevertheless, in this sample the least squares estimate is still close to $\beta$ in both cases.
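
Because the result above comes from a single sample, one way to probe it is to repeat the Cauchy-error experiment many times and look at the spread of the estimates.  The sketch below does this for two sample sizes.  The helper `beta_hat_cauchy_errors`, the seed, and the number of replications `R` are assumptions made for this illustration.  Conditional on $x$, the term $x'\epsilon$ is a sum of independent scaled Cauchy draws and is therefore itself Cauchy with scale $\sum_i |x_i|$, which suggests the estimation error $(x'\epsilon)/(x'x)$ need not shrink as $N$ grows.

```julia
using Distributions, Random, Statistics

# Hypothetical helper: one no-intercept least squares estimate
# from a fresh sample of size N with standard Cauchy errors.
function beta_hat_cauchy_errors(N; beta = 2)
    x = rand(Chisq(3), N)
    y = beta .* x .+ rand(Cauchy(), N)
    return (x' * y) / (x' * x)
end

Random.seed!(1234)   # arbitrary seed, only for reproducibility
R = 1_000            # number of replications (an assumption)
for N in (100, 10_000)
    est = [beta_hat_cauchy_errors(N) for _ in 1:R]
    q = quantile(est, [0.05, 0.5, 0.95])
    println("N = ", N, ": 5%, 50%, 95% quantiles of betaHat = ", q)
end
```

Comparing the quantile ranges across the two sample sizes gives a rough indication of whether the estimates concentrate around $\beta$ as $N$ increases.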