###### <h2> Exercise 6 - Endogeneity</h2>

We now consider a linear specification $y=\alpha + \beta_1x_1 + \beta_2x_2 +\epsilon$ in which the error term depends multiplicatively on $x_1$: $\epsilon = x_1u$, where $u\sim N(1,1)$ is a noise factor drawn independently of $x_1$ and $x_2$.  We first attempt to estimate the parameters by OLS:

In [1]:
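using Distributions
using PyPlot
using LinearAlgebra
using Printf
using Random

d = Normal(0,1);   # distribution of the regressors
N = 50;            # sample size

# True parameter values
alpha = 0.78;
beta1 = 2.34;
beta2 = 5.61;

x1 = rand(d,N);
x2 = rand(d,N);
eps = x1.*rand(Normal(1.0,1.0),N);   # error term: x1 times an N(1,1) factor

y = alpha .+ beta1*x1 + beta2*x2 + eps;

X = [ones(N) x1 x2];   # design matrix with an intercept column

betaHat = (X'*X)\(X'*y);   # OLS estimate from the normal equations

println("Parameter estimate: ", betaHat)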

Parameter estimate: [0.7567675318130073, 3.7542089968680523, 5.567687525973848]
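
A single draw with $N=50$ cannot by itself separate sampling noise from systematic bias. One way to check is to repeat the experiment many times and average the estimates. The following is a minimal sketch under stated assumptions: the helper `simulate_ols`, the seed, and the replication count `R` are illustrative and not part of the exercise, and `randn` from the standard library stands in for `rand(Normal(0,1), N)`.

```julia
using Random
using Statistics

# Hypothetical helper: draw one sample of size N from the specification above
# and return the OLS estimate [alphaHat, beta1Hat, beta2Hat].
function simulate_ols(rng, N; alpha=0.78, beta1=2.34, beta2=5.61)
    x1 = randn(rng, N)                   # x1 ~ N(0,1)
    x2 = randn(rng, N)                   # x2 ~ N(0,1)
    err = x1 .* (1.0 .+ randn(rng, N))   # error = x1 * u, with u ~ N(1,1)
    y = alpha .+ beta1 .* x1 .+ beta2 .* x2 .+ err
    X = [ones(N) x1 x2]                  # design matrix with intercept
    return (X' * X) \ (X' * y)           # OLS via the normal equations
end

rng = MersenneTwister(2024)   # arbitrary seed, for reproducibility only
R = 2000                      # number of replications (an assumption)
estimates = [simulate_ols(rng, 50) for _ in 1:R]
meanEstimate = mean(estimates)   # element-wise average over replications

println("Average estimate over ", R, " replications: ", meanEstimate)
```

If the averaged estimate of $\beta_1$ stays well away from 2.34 while those of $\alpha$ and $\beta_2$ settle near their specified values, the discrepancy is systematic rather than an accident of one small sample.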


While the estimates for $\alpha$ and $\beta_2$ are close to their specified values, the estimate of $\beta_1$ (about 3.75) is far from the specified 2.34.  Recall our main assumptions underlying OLS:
\begin{align}
\mathbb{E}[\epsilon]&=0\notag\\
\mathbb{E}[\epsilon|\boldsymbol{x}]&=0\notag\\
&\implies \mathbb{E}[\boldsymbol{x}\epsilon] = 0\notag
\end{align}
where the third line follows from the second by the law of iterated expectations.  In the specification given, we have
\begin{align}
\mathbb{E}[\epsilon] &= \mathbb{E}[x_1u],\ \ u\sim N(1,1)\notag\\
&= \mathbb{E}[\mathbb{E}[x_1u|x_1]]\ \ \ \text{(Law of Iterated Expectations)}\notag\\
&= \mathbb{E}[x_1\mathbb{E}[u|x_1]]\notag\\
&= \mathbb{E}[x_1\mathbb{E}[u]]\ \ \ \ \ \ \text{(Independence of $x_1$ and $u$)}\notag\\
&= \mathbb{E}[x_1]\ \ \ \ \ \ \text{(since $\mathbb{E}[u]=1$)}\notag\\
&= 0
\end{align}
Hence the first main assumption holds.  However, we also have
\begin{align}
\mathbb{E}[x_1\epsilon] &= \mathbb{E}[x_1^2u]\notag\\
&= \mathbb{E}[x_1^2\mathbb{E}[u|x_1]]\notag\\
&= \mathbb{E}[x_1^2]\notag\\
&= 1
\end{align}
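so that $\mathbb{E}[x_1\epsilon]\neq 0$: the variable $x_1$ is endogenous, the exogeneity requirement $\mathbb{E}[\boldsymbol{x}\epsilon]=0$ fails, and OLS does not consistently estimate $\beta_1$.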
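
The two expectations derived above can be checked numerically by replacing them with sample averages over a large draw. This is only a sketch: the seed and the sample size `M` are arbitrary choices, the name `err` is used for the error term, and `randn` from the standard library replaces the `Distributions` calls used earlier.

```julia
using Random
using Statistics

rng = MersenneTwister(1)      # arbitrary seed, for reproducibility only
M = 1_000_000                 # large sample so averages approximate expectations

x1 = randn(rng, M)            # x1 ~ N(0,1)
u = 1.0 .+ randn(rng, M)      # u ~ N(1,1), drawn independently of x1
err = x1 .* u                 # the error term of the specification

meanErr = mean(err)           # sample analogue of E[eps], expected near 0
meanX1Err = mean(x1 .* err)   # sample analogue of E[x1*eps], expected near 1

println("mean(err)       = ", meanErr)
println("mean(x1 .* err) = ", meanX1Err)
```

The same calculation explains the direction of the error in the estimate. Because $\mathbb{E}[\epsilon|x_1] = x_1\mathbb{E}[u] = x_1$, the model can be rewritten as $y=\alpha + (\beta_1+1)x_1 + \beta_2x_2 + x_1(u-1)$, whose error term $x_1(u-1)$ has conditional mean zero. OLS therefore targets $\beta_1+1=3.34$ rather than $\beta_1=2.34$, and the value of about 3.75 obtained above is consistent with that target plus sampling noise at $N=50$.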