<h2> Exercise 5 - Hypothesis Testing 2</h2>

We will now use the specifications in the previous example and consider the hypothesis $H_0:\ \beta_1\neq 0$.  This time I'll use a $\Gamma$ distribution for $x$, which is a distribution commonly used to model waiting times.

In [109]:
using Distributions
using PyPlot
using LinearAlgebra
using Printf
using Random


d = Normal(0,1);#Gamma(0.5,0.05);
N = 20;

alpha = 10.24;
beta1 = 0.0;
beta2 = 0.0;

x1 = rand(d,N);
x2 = rand(d,N);
eps = rand(Normal(0.0,0.00005),N);

y = alpha .+ beta1*x1 + beta2*x2 + eps;

X = [ones(N) x1 x2];

betaHat = (X'*X)\(X'*y);

println("Parameter estimate: ", betaHat)

Parameter estimate: [10.240011819311386, -4.751711835709744e-6, -5.8033989880139674e-5]


We now need to calculate the test statistic
\begin{align}
				\widehat{F} &= \frac{\left[\sum_{j=1}^N(y_j - \overline{y})^2-\sum_{j=1}^N(y_j - \widehat{y}_j)^2\right]/2}{\sum_{j=1}^N(y_j - \widehat{y}_j)^2/(N-3)}\notag
\end{align} 
where
\begin{align}
\widehat{y}_j &= \widehat{\alpha} + \widehat{\beta}_1x_{1,j} + \widehat{\beta}_2 x_{2,j}\notag\\ 
\overline{y} &= \frac{1}{N}\sum_{j=1}^N y_j\notag
\end{align}

In [110]:
yBar = mean(y);
yHat = betaHat[1] .+ betaHat[2]*x1 + betaHat[3]*x2;
Fhat = (sum((y.-yBar).^2) - sum((y-yHat).^2)/2)/(sum((y-yHat).^2)/(N-3))

9.170433448353762

We can now compute the $p$ values as the likelihood of obtaining a value at least as large as $\widehat{F}$ from a $F(2,N-3)$ draw.  Since such a variable can only take values in $[0,\infty)$, if $G$ is the corresponding CDF we can compute the $p$ value as $1 - G(\widehat{F})$.

In [111]:
p = 1-cdf(FDist(2,N-3),Fhat)

println("p value: ", p)

p value: 0.01107816942682227


Even with extreme values for error variance and sample size, the test gives a Type I error.