### Chow test (forecast) ###

If we can split our data into an **in-sample** part and an **out-of-sample** part, we can test whether the effects estimated on the in-sample give a reliable forecast in the out-of-sample. To do this, assume our data is generated by the following data generating process (DGP):

$$\begin{bmatrix}
y_{1, n \times 1} \\
y_{2, g \times 1}
\end{bmatrix} =
\begin{bmatrix}
X_{1, n \times k} & 0_{n \times g} \\
X_{2, g \times k} & I_g 
\end{bmatrix}
\begin{bmatrix}
\beta_{k \times 1} \\
\gamma_{g \times 1}
\end{bmatrix}+
\begin{bmatrix}
\epsilon_{1, n \times 1} \\
\epsilon_{2, g \times 1}
\end{bmatrix}
$$

The hypotheses to be tested are $H_0: \gamma = 0$ against $H_a: \gamma \neq 0$. Why? $\gamma$ can be understood as the **forecast errors**: the part of $y_2$ that $X_2\beta$ fails to predict. If $\gamma \neq 0$, we need the **full** model (including $\gamma$) to reliably estimate $\beta$; pooling both samples while leaving $\gamma$ out would give omitted variable bias.
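To make the block structure of the DGP concrete, here is a minimal NumPy sketch that simulates it. The sample sizes, the values of $\beta$ and $\sigma$, and the seed are illustrative assumptions, not values from the text; $\gamma$ is set to zero so that $H_0$ holds.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only

n, g, k = 50, 5, 3                  # illustrative sizes (assumed)
X1 = rng.normal(size=(n, k))        # in-sample regressors
X2 = rng.normal(size=(g, k))        # out-of-sample regressors
beta = np.array([1.0, -2.0, 0.5])   # assumed true effects
gamma = np.zeros(g)                 # H0 holds: no systematic forecast errors
sigma = 1.0                         # assumed error standard deviation

# Stacked design matrix [[X1, 0], [X2, I_g]] from the DGP
Z = np.block([[X1, np.zeros((n, g))],
              [X2, np.eye(g)]])

eps = rng.normal(scale=sigma, size=n + g)
y = Z @ np.concatenate([beta, gamma]) + eps
y1, y2 = y[:n], y[n:]               # in-sample and out-of-sample responses
```

Setting some entries of `gamma` away from zero would simulate data under $H_a$.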

***

Our estimated model will be:

$$\begin{bmatrix}
y_{1, n \times 1} \\
y_{2, g \times 1}
\end{bmatrix} =
\begin{bmatrix}
X_{1, n \times k} & 0_{n \times g} \\
X_{2, g \times k} & I_g 
\end{bmatrix}
\begin{bmatrix}
b_{k \times 1} \\
c_{g \times 1}
\end{bmatrix}+
\begin{bmatrix}
e_{1, n \times 1} \\
e_{2, g \times 1}
\end{bmatrix}
$$

Which implies:

\begin{align}
    y_{1, n \times 1} &= X_{1, n \times k}b_{k \times 1} + e_{1, n \times 1}  \\
    y_{2, g \times 1} &= X_{2, g \times k}b_{k \times 1} + c_{g \times 1} + e_{2, g \times 1}
\end{align}

The criterion function will be the sum of squared residuals:

$$S(b,c)= SSR = e_{1}^Te_{1} + e_{2}^Te_{2} = (y_{1} - X_{1}b)^T(y_{1} - X_{1}b) + (y_{2} - X_{2}b - c)^T(y_{2} - X_{2}b - c) $$

Under $H_0: \gamma = 0$ (**restricted model**), our $SSR$ becomes

$$SSR_R = (y_{1} - X_{1}b_R)^T(y_{1} - X_{1}b_R) + (y_{2} - X_{2}b_R)^T(y_{2} - X_{2}b_R) $$

$SSR_R$ is calculated with the residuals of an OLS model that uses the full sample data ($X_1$ and $X_2$).

Under $H_a : \gamma \neq 0$ (**unrestricted model**), our $SSR$ becomes

$$ SSR = (y_{1} - X_{1}b)^T(y_{1} - X_{1}b) + (y_{2} - X_{2}b - c)^T(y_{2} - X_{2}b - c) $$

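Take the derivative with respect to $c$ and set it to zero:

$$ \frac{\partial SSR}{\partial c} = -2(y_{2} - X_{2}b - c) = 0 \quad \Rightarrow \quad c = y_{2} - X_{2}b $$

Substituting this $c$ back in, the second term vanishes and the $SSR$ becomes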

$$ SSR = (y_{1} - X_{1}b)^T(y_{1} - X_{1}b) $$

This $SSR$ is calculated with the residuals of an OLS model that uses only the in-sample data ($X_1$ and $y_1$). Because $c$ is free, it "absorbs" the out-of-sample residuals $y_2 - X_2b$, so they contribute nothing to the $SSR$. It may seem counterintuitive, but to keep $H_0$ we need this $SSR$ to be high, that is, close to $SSR_R$: only then can we conclude that there are no forecast errors reducing it.
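The claim that the unrestricted $SSR$ needs only the in-sample data can be checked numerically. The sketch below (simulated data with assumed sizes and coefficients) fits OLS on the in-sample alone and on the augmented design with the $I_g$ block, then compares $b$, $c$ and the $SSR$.

```python
import numpy as np

rng = np.random.default_rng(1)      # arbitrary seed
n, g, k = 40, 4, 2                  # illustrative sizes (assumed)
beta = np.array([1.0, 2.0])         # assumed true effects
X1 = rng.normal(size=(n, k))
X2 = rng.normal(size=(g, k))
y1 = X1 @ beta + rng.normal(size=n)
y2 = X2 @ beta + rng.normal(size=g)

# (a) OLS on the in-sample only
b_in, *_ = np.linalg.lstsq(X1, y1, rcond=None)
ssr_in = np.sum((y1 - X1 @ b_in) ** 2)

# (b) OLS on the augmented design [[X1, 0], [X2, I_g]]
Z = np.block([[X1, np.zeros((n, g))],
              [X2, np.eye(g)]])
y = np.concatenate([y1, y2])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
b_aug, c = coef[:k], coef[k:]
ssr_aug = np.sum((y - Z @ coef) ** 2)

same_b = np.allclose(b_in, b_aug)            # b is identical in both fits
c_is_error = np.allclose(c, y2 - X2 @ b_in)  # c equals the forecast errors
same_ssr = np.isclose(ssr_in, ssr_aug)       # out-of-sample residuals are zero
print(same_b, c_is_error, same_ssr)
```

All three comparisons should hold up to floating-point tolerance, which is why fitting (a) is enough in practice.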

In both cases, we only estimate the vector $b_{k \times 1}$, so we need to take the inverse of a matrix with dimensions $k \times k$. This is more convenient than trying to evaluate the forecast errors, which would need the inverse of a $g \times g$ matrix.

***

For the $F$ statistic, the numerator has $g$ degrees of freedom, because $\gamma$ is a $g \times 1$ vector (the number of restrictions). The denominator has $n-k$ degrees of freedom: the $n+g$ rows minus the $k+g$ estimated coefficients of the unrestricted model.

$$F = \frac{(SSR_R - SSR)/g}{SSR/(n-k)}   $$
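As a sketch of how this statistic could be computed, the hypothetical helpers `ssr_ols` and `chow_forecast_f` below (names chosen here for illustration) run the two OLS fits described above: the restricted one on the full sample and the unrestricted one on the in-sample only. The data are simulated under assumed sizes and coefficients.

```python
import numpy as np

def ssr_ols(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

def chow_forecast_f(X1, y1, X2, y2):
    """F statistic of the Chow forecast test (H0: gamma = 0)."""
    n, k = X1.shape
    g = X2.shape[0]
    # Restricted model: one OLS fit on the stacked full sample
    ssr_r = ssr_ols(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    # Unrestricted model: equivalent to OLS on the in-sample only
    ssr_u = ssr_ols(X1, y1)
    return ((ssr_r - ssr_u) / g) / (ssr_u / (n - k))

# Illustrative data generated under H0 (all values assumed)
rng = np.random.default_rng(2)
n, g, k = 60, 6, 3
beta = np.array([0.5, 1.0, -1.5])
X1, X2 = rng.normal(size=(n, k)), rng.normal(size=(g, k))
y1 = X1 @ beta + rng.normal(size=n)
y2 = X2 @ beta + rng.normal(size=g)

F = chow_forecast_f(X1, y1, X2, y2)
print(F)
```

The value would then be compared with an $F(g, n-k)$ critical value; for example, `scipy.stats.f.sf(F, g, n - k)` gives the p-value if SciPy is available.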

Note that this $F$ test works only because we make the assumption that

$$ \begin{bmatrix}
\epsilon_{1, n \times 1} \\
\epsilon_{2, g \times 1}
\end{bmatrix} \sim N_{n+g}(0, \sigma^2I_{n+g})
$$

And therefore we can use our previously derived $F$ statistic that eliminates the unknown $\sigma$ parameter.
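The role of the normality assumption can be illustrated with a small Monte Carlo sketch. Under $H_0$ with normal, homoskedastic errors, the statistic should follow an $F(g, n-k)$ distribution, whose mean is $(n-k)/(n-k-2)$. The sizes, coefficients, seed and number of replications below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)     # arbitrary seed
n, g, k = 30, 3, 2                  # illustrative sizes (assumed)
beta = np.array([1.0, -1.0])        # assumed true effects
reps = 2000                         # number of simulated samples

def ssr_ols(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r)

f_stats = np.empty(reps)
for i in range(reps):
    X = rng.normal(size=(n + g, k))
    # gamma = 0 and errors are N(0, I): H0 and the assumption both hold
    y = X @ beta + rng.normal(size=n + g)
    ssr_r = ssr_ols(X, y)               # restricted: full sample
    ssr_u = ssr_ols(X[:n], y[:n])       # unrestricted: in-sample only
    f_stats[i] = ((ssr_r - ssr_u) / g) / (ssr_u / (n - k))

theoretical_mean = (n - k) / (n - k - 2)   # mean of F(g, n-k) for n-k > 2
print(f_stats.mean(), theoretical_mean)
```

The simulated mean should land close to the theoretical one; replacing the normal errors with a heavy-tailed or heteroskedastic alternative is a simple way to see how the approximation can deteriorate.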


***
**Conclusion:**

To test whether we can use only the in-sample to estimate the effects and use those estimates to obtain a reliable forecast in the out-of-sample, check how much the $SSR$ increases when we impose the "restriction" that we can forecast the data ($\gamma = 0$).