## PART 2 - HETEROSKEDASTICITY

---
**20. Explain the problem of heteroskedasticity with an example from the course.**

Under homoskedasticity, the conditional variance of the error term $u$, and hence of the dependent variable $y$, is assumed to be constant, *i.e.*,

$$
Var(u|x) = Var(y|x) = \sigma^{2}
$$

Under heteroskedasticity, by contrast, the variance of the error, $Var(u|x)$, depends on the independent variable $x$. Consequently, the variance of the dependent variable, $Var(y|x)$, also depends on $x$. This violation of the homoskedasticity assumption does not bias the Ordinary Least Squares (OLS) coefficient estimates, but it makes the usual standard errors biased (so the usual $t$ and $F$ tests are invalid) and makes OLS inefficient, i.e., no longer the minimum-variance linear unbiased estimator.

In the example studied in the course, the education-wage relationship illustrates the problem. Unbiased estimation of the effect of education on wages requires the assumption $E(u|educ) = 0$, to which homoskedasticity adds $Var(u|educ) = \sigma^{2}$. This implies a constant wage variance, $Var(wage|educ) = \sigma^{2}$, across education levels: the mean wage may vary with education, but its variance may not. This is unlikely to be realistic: higher education levels may bring greater wage variability because of more diverse job opportunities, while wage variability is lower at lower education levels.

Expanding on this, suppose we build a model that assumes homoskedasticity, that is, a constant error variance across education levels. In the real world, individuals with higher education may have access to more diverse work opportunities, leading to greater wage variability, whereas those with lower education may face fewer opportunities and hence lower wage variability. Insisting on homoskedasticity therefore does not bias the estimated effect of education as long as $E(u|educ) = 0$ holds, but it invalidates the usual standard errors and tests, because they ignore how wage volatility varies with education.
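
The education-wage argument above can be illustrated with simulated data. The following is a minimal sketch, not course material: the data-generating process, the coefficients, and the way the error standard deviation grows with education are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

n = 5000
educ = rng.integers(8, 21, size=n)   # hypothetical years of education, 8 to 20

# Assumed form of heteroskedasticity: the error std. dev. grows with education
sigma = 0.5 * (educ - 6)
u = rng.normal(0.0, sigma)           # E(u|educ) = 0, Var(u|educ) depends on educ
wage = 2.0 + 1.5 * educ + u          # hypothetical intercept and slope


def conditional_variance(wage, educ, level):
    """Sample variance of wage among observations with the given education level."""
    return wage[educ == level].var(ddof=1)


var_low = conditional_variance(wage, educ, 8)
var_high = conditional_variance(wage, educ, 20)
print(f"Var(wage | educ = 8)  ~ {var_low:.2f}")
print(f"Var(wage | educ = 20) ~ {var_high:.2f}")
```

Under these assumptions the sample variance of wages at 20 years of education is much larger than at 8 years, which is exactly the pattern that a constant $\sigma^{2}$ rules out.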

---
**21. Suppose that $E(uu') = \sigma^2 \Omega$. Show that the GLS estimator is the best linear unbiased estimator.**

First, considering the model $y = X \beta + u$, where $E(u|X) = 0$, we derive the expression of the OLS estimator of $\beta$, denoted $b$. Given a set of $N$ i.i.d. observations, the collected data are represented by the following equation, in which $u$ now stands for the vector of residuals:

$$
y = X b + u \quad \quad \quad \quad (12)
$$

Writing equation (12) out in full matrix form, we obtain the following equivalent equation:

$$
\begin{bmatrix}
y_{1}\\
y_{2}\\
\vdots\\
y_{N}
\end{bmatrix} = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1K}\\ 
x_{21} & x_{22} & \cdots & x_{2K}\\ 
\vdots & \vdots & \ddots & \vdots\\ 
x_{N1} & x_{N2} & \cdots & x_{NK} 
\end{bmatrix}
\begin{bmatrix}
b_{1}\\
b_{2}\\
\vdots\\
b_{K}
\end{bmatrix}
+
\begin{bmatrix}
u_{1}\\
u_{2}\\
\vdots\\
u_{N}
\end{bmatrix}
$$

In order to derive an expression for $b$, we minimize the sum of squared residuals, i.e., we minimize

$$
u' u = \begin{bmatrix}
u_{1}\ u_{2}\ \cdots\ u_{N}
\end{bmatrix} \begin{bmatrix}
u_{1}\\
u_{2}\\
\vdots\\
u_{N}
\end{bmatrix}= \sum_{i=1}^{N} u_{i}^{2}
$$

To do so, we first isolate $u$ and compute its transpose, so that the product $u' u$ can be minimized. From equation (12), $u$ is given by:

$$
u = y - X b \Rightarrow u' = (y - X b)' = y' - b' X'
$$

Thus, the product to be minimized is:

$$
u' u = (y' - b' X')(y - X b) = y'y - y'Xb - b'X'y + b'X'Xb 
$$

Noting that $b'X'y$ is a scalar, so that $b'X'y = (b'X'y)' = y'Xb$, we arrive at the following expression:

$$
\min_{b} \; u'u = y'y - 2b'X'y + b'X'Xb \quad \quad (13)
$$

Finally, we take the partial derivative of equation $(13)$ with respect to $b$ and set it equal to zero. To do so, we use the following two results:

$$
\left\{\begin{matrix}
\frac{\partial}{\partial b} (b'X'y) = X'y \\ 
\frac{\partial}{\partial b} (b'X'Xb) = 2X'Xb 
\end{matrix}\right.
$$

With the above, and assuming that $X'X$ is invertible (i.e., $X$ has full column rank), the equation to be solved for $b$ is:

$$
\frac{\partial}{\partial b} (u'u) = 0 \iff -2X'y + 2X'Xb = 0 \iff X'Xb = X'y \iff b = (X'X)^{-1}X'y \quad (14)
$$
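
The normal equations $X'Xb = X'y$ can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the true coefficients, and the helper name `ols` are assumptions made for illustration.

```python
import numpy as np


def ols(X, y):
    """Solve the normal equations X'X b = X'y for b (requires full column rank)."""
    return np.linalg.solve(X.T @ X, X.T @ y)


rng = np.random.default_rng(1)
N, K = 200, 3
# Hypothetical design matrix: a constant plus two random regressors
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 2.0, -0.5])      # hypothetical true coefficients
y = X @ beta + rng.normal(size=N)

b = ols(X, y)
residuals = y - X @ b

# Cross-check against NumPy's least-squares solver
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print("b from normal equations:", b)
print("b from lstsq:           ", b_lstsq)
print("X'(y - Xb):             ", X.T @ residuals)
```

Solving the linear system with `np.linalg.solve` avoids forming $(X'X)^{-1}$ explicitly, which is usually preferable numerically. The last line prints the first-order condition, which should be zero up to rounding error.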

With the OLS estimator computed in equation $(14)$, we examine the properties required of a best linear unbiased estimator (BLUE). First, it is linear: $b = (X'X)^{-1}X'y$ is a linear function of the dependent variable $y$. The second step is to show that it is unbiased. To do so, we compute its expected value.

$$
E(b) = E\left((X'X)^{-1}X'y\right) = (X'X)^{-1}X'E(y) \quad \quad (15)
$$

Since $E(u|X) = 0$, we have $E(y) = X \beta$ (treating $X$ as given), and equation $(15)$ becomes the following, which proves that the estimator $b$ is unbiased.

$$
E(b) = (X'X)^{-1}(X'X) \beta = \beta
$$
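
Unbiasedness relies only on $E(u|X) = 0$, not on homoskedasticity. The following Monte Carlo sketch illustrates this under an assumed heteroskedastic design; the error variance pattern, the coefficients, and the number of replications are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N, reps = 100, 5000
beta = np.array([1.0, 2.0])             # hypothetical true coefficients

x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])    # design held fixed across replications
A = np.linalg.solve(X.T @ X, X.T)       # (X'X)^{-1} X', shape (K, N)

sigma = 0.5 * x                         # assumed: error std. dev. proportional to x
U = rng.normal(size=(reps, N)) * sigma  # each row is one draw of the error vector
Y = X @ beta + U                        # each row is one simulated sample of y
B = Y @ A.T                             # each row is the OLS estimate for one sample

b_mean = B.mean(axis=0)
print("true beta:        ", beta)
print("average OLS draw: ", b_mean)
```

The average of the estimates across replications should be close to `beta`, even though the errors are heteroskedastic.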

Now, for the next steps, it is useful to rewrite the expression of $b$ as follows:

$$
\left.\begin{matrix}
y = X \beta + u \\ 
b = (X'X)^{-1}X'y
\end{matrix}\right\}\Rightarrow 
b = \beta + (X'X)^{-1}X'u \quad \quad (16)
$$

With equation $(16)$, we compute the variance of the estimator under the heteroskedasticity assumption $Var(u) = E(uu') = \sigma^{2}\Omega$.

$$
\begin{align*}
Var(b) &= Var\left(\beta + (X'X)^{-1}X'u\right) \\
       &= Var\left((X'X)^{-1}X'u\right) \\
       &= (X'X)^{-1}X' Var(u) X(X'X)^{-1} \\
       &= (X'X)^{-1}X' \sigma^{2} \Omega X(X'X)^{-1}\\
\end{align*}
$$

Therefore, equation $(17)$ gives the variance of the OLS estimator under heteroskedasticity.

$$
Var(b) = \sigma^{2} (X'X)^{-1}X' \Omega X(X'X)^{-1} \quad \quad (17)
$$
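
The derivation above stops at the OLS variance. For comparison, the standard textbook form of the GLS estimator, which is not derived in these notes, is $b_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$, with variance $\sigma^{2}(X'\Omega^{-1}X)^{-1}$. Being "best" means that $Var(b) - Var(b_{GLS})$ is positive semidefinite. The following sketch checks this numerically for one assumed diagonal $\Omega$; the specific form $Var(u_i) = \sigma^{2} x_i^{2}$ is a hypothetical choice, and the check is an illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])

sigma2 = 1.0
Omega = np.diag(x ** 2)            # assumed: Var(u_i) = sigma2 * x_i^2
Omega_inv = np.diag(1.0 / x ** 2)

# OLS variance under heteroskedasticity, equation (17)
XtX_inv = np.linalg.inv(X.T @ X)
var_ols = sigma2 * XtX_inv @ X.T @ Omega @ X @ XtX_inv

# GLS variance (standard formula, stated here as an assumption of the sketch)
var_gls = sigma2 * np.linalg.inv(X.T @ Omega_inv @ X)

# "Best" means var_ols - var_gls has no negative eigenvalues
diff = var_ols - var_gls
eigvals = np.linalg.eigvalsh(diff)
print("diag Var(b_OLS):", np.diag(var_ols))
print("diag Var(b_GLS):", np.diag(var_gls))
print("eigenvalues of the difference:", eigvals)
```

In this example each diagonal entry of the GLS variance is no larger than the corresponding OLS entry, consistent with the efficiency claim in the question.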





## PART 3 - TIME SERIES DATA

---
**27. Define strict and weak stationarity.**

* **Strict stationarity:** A time series process is strictly stationary if the joint probability distribution of its observations is invariant under shifts in time. In other words, for any set of time points, the joint distribution of the observations at those points is the same as the joint distribution at the same points shifted by any number of periods. This implies that all statistical properties of the series that exist, including the mean, variance, and higher-order moments, are unchanged over time, making strict stationarity a more stringent condition than covariance (weak) stationarity whenever the first two moments are finite.


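* **Weak stationarity:** A time series process is weakly (covariance) stationary if it has a constant mean, a constant finite variance, and an autocovariance between two observations that depends only on the lag between them, not on time. It places no restriction on the rest of the probability distribution.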
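
The difference between a stationary and a non-stationary process can be seen by simulating many paths and looking at the variance across paths at each date. The following is a minimal sketch; the AR(1) coefficient, the Gaussian shocks, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
paths, T = 5000, 200
eps = rng.normal(size=(paths, T))   # i.i.d. standard normal shocks

# AR(1) with |phi| < 1, started from its stationary distribution
phi = 0.5
ar1 = np.empty((paths, T))
ar1[:, 0] = eps[:, 0] / np.sqrt(1 - phi ** 2)
for t in range(1, T):
    ar1[:, t] = phi * ar1[:, t - 1] + eps[:, t]

# Random walk: cumulative sum of the same shocks (not stationary)
random_walk = np.cumsum(eps, axis=1)

# Variance across paths at each date
var_ar1 = ar1.var(axis=0)
var_rw = random_walk.var(axis=0)

print("AR(1) variance at t = 10 and t = 200:      ", var_ar1[9], var_ar1[-1])
print("Random walk variance at t = 10 and t = 200:", var_rw[9], var_rw[-1])
```

The AR(1) variance stays roughly constant over time, near $1/(1-\phi^{2})$, while the random walk variance grows with $t$, so the random walk violates the constant-variance requirement of weak stationarity.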

---
**28. Explain ergodicity and state the ergodic theorem. Illustrate with an example.**

---
**29. Why do we need both stationarity and ergodicity?**

---
**30. Explain “spurious regression”.**

---
**31. Define a moving average and explain the trade-off involved in the choice of the size of the window and of whether to center or not the moving average.**