We are calculating variances and covariances as:

$$
var(\hat{y}^i_{h,t}) = \big[\sigma^2_{\alpha} + 2 \sigma_{\alpha \beta} h + \sigma^2_{\beta}h^2 \big] + var(z_{h,t}^i) +  \phi_t^2 \sigma^2_{\epsilon}
$$


$$
covar(\hat{y}^i_{h,t}, \hat{y}^i_{h+n,t+n}) = \big[\sigma^2_{\alpha} + \sigma_{\alpha \beta} (2h+n) + \sigma^2_{\beta}h(h+n) \big] + \rho^n var(z_{h,t}^i) 
$$

Where $n = 1,...,\min(H-h, T-t)$, so the largest lead $n$ is the smaller of an individual's distance to the maximum experience level ($H-h$) and to the last time period ($T-t$). For example, an individual who is age 20 in 1995 could potentially accumulate 45 years of experience, but as the sample is truncated in 1996, $n$ can be at most 1 ($T-t$). An individual who is 60 years old in 1970 could stay in the sample for 26 more years, but as age is capped at 64, $n$ can be at most 4 ($H-h$).
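The bound on $n$ can be checked with a few lines of Python. This is only a sketch: the function name `max_lead` is ours, and we measure experience in terms of age (with the cap of 64 and the last sample year 1996 taken from the text) so that the two examples above can be reproduced directly.

```python
MAX_AGE = 64      # age cap mentioned in the text (plays the role of H)
LAST_YEAR = 1996  # last sample year mentioned in the text (plays the role of T)

def max_lead(age, year):
    """Largest admissible n: the smaller of the years left until the age cap
    and the years left until the end of the sample."""
    return min(MAX_AGE - age, LAST_YEAR - year)

# The two examples from the text
for age, year in [(20, 1995), (60, 1970)]:
    print(f"age {age} in {year}: n can be at most {max_lead(age, year)}")
```

The first individual is constrained by the end of the sample, the second by the age cap.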

The variance of $z_{h,t}^i$ will be obtained recursively by:

\begin{align}
var(z_{1,t}^i) &= \pi^2_t \sigma^2_{\eta}, \ h=1, t>1 \\
var(z_{h,1}^i) &= \pi^2_1 \sigma^2_{\eta} \sum_{j=0}^{h-1}\rho^{2j}, \ t=1, h>1 \\
var(z_{h,t}^i) &= \rho^2 var(z_{h-1,t-1}^i) + \pi_t^2 \sigma^2_{\eta}, \ t,h>1
\end{align}

Where the first line implicitly assumes that the initial value of the persistent shock is zero for all individuals (i.e. at $h=1$, the first period of working life, there is no $\rho z_0$ entering the individual's income process).
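The recursion translates almost one-to-one into code. The following is a minimal sketch rather than the implementation used in the estimation: the function name `var_z`, the argument names and the convention of passing the loadings $\pi_t$ as a dict keyed by $t$ are all our own choices. The case $h=t=1$ is handled by the first branch, which coincides with the second equation evaluated at $h=1$.

```python
def var_z(h, t, rho, pi, s2_eta):
    """Variance of the persistent component z at experience h and period t.

    pi is a dict mapping the period t (starting at 1) to the loading pi_t;
    s2_eta is the variance of the persistent shock.
    """
    if h == 1:
        # First period of working life: no rho * z_0 term enters
        return pi[t] ** 2 * s2_eta
    if t == 1:
        # First sample period: shocks accumulated over h years of work
        return pi[1] ** 2 * s2_eta * sum(rho ** (2 * j) for j in range(h))
    # General case: step back one year of experience and one period
    return rho ** 2 * var_z(h - 1, t - 1, rho, pi, s2_eta) + pi[t] ** 2 * s2_eta

# Illustrative (made-up) parameter values
pi = {1: 1.0, 2: 2.0, 3: 3.0}
print(var_z(1, 3, 0.5, pi, 1.0))
print(var_z(3, 1, 0.5, pi, 1.0))
print(var_z(2, 3, 0.5, pi, 1.0))
```

Each call steps back along the diagonal of the $(h,t)$ grid until it reaches either $h=1$ or $t=1$, so the recursion depth is at most $\min(h,t)-1$.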

What the above equations tell us is that we will have two slightly different ways of calculating an individual's variance for any $t,h>1$, depending on whether $h \leq t$ or $h > t$. Intuitively, when $h > t$ the individual has already accumulated a history of shocks by $t=1$, the first period of observation. In this case, the recursion for $var(z_{h,t}^i)$ ends with the $var(z_{h,1}^i)$ term, so individuals differ based on how many working years they accrued before the start of the sample (these are individuals who started work prior to 1968). If $h \leq t$, the recursion ends in $var(z_{1,t}^i)$, so individuals differ only in the first shock they experienced upon entering the sample at the start of their working life (these are individuals who started work in 1968 or later).

To illustrate, let's compare $var(z_{4,5})$ and $var(z_{5,4})$. In the first case, $h<t$, so we end up at the $var(z_{1,t}^i)$ term (here $var(z_{1,2}^i)$). Writing out the recursion gives us $\rho^6 \pi^2_2 \sigma^2_{\eta} + \rho^4 \pi^2_3 \sigma^2_{\eta} + \rho^2 \pi^2_4 \sigma^2_{\eta} + \pi^2_5 \sigma^2_{\eta}$, or, in matrix notation:

$$
\sigma^2_{\eta} \left( \begin{bmatrix} \rho^6 & \rho^4 & \rho^2 & 1 \end{bmatrix} 
\begin{bmatrix} \pi^2_2 \\ \pi^2_3 \\ \pi^2_4 \\ \pi^2_5 \end{bmatrix} \right)
$$

while in the second case, $h>t$, so we end up at the $var(z_{h,1}^i)$ term (here $var(z_{2,1}^i)$, as the individual already has two years of experience at $t=1$), which adds information on the shocks accumulated during the working life prior to entering the sample. Writing out the recursion gives $\rho^6 \pi^2_1 \sigma^2_{\eta} \sum_{j=0}^{1} \rho^{2j} + \rho^4 \pi^2_2 \sigma^2_{\eta} + \rho^2 \pi^2_3 \sigma^2_{\eta} + \pi^2_4 \sigma^2_{\eta}$, where the upper limit of the sum is $h-t=1$, or, in matrix notation:

$$
\sigma^2_{\eta} \left( \begin{bmatrix} \rho^6 & \rho^4 & \rho^2 & 1 \end{bmatrix} 
\begin{bmatrix} \pi^2_1 \sum_{j=0}^{1} \rho^{2j}  \\ \pi^2_2 \\ \pi^2_3 \\ \pi^2_4 \end{bmatrix} \right)
$$
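As a sanity check, we can compare the two written-out expressions against the recursion numerically. The sketch below uses made-up values for $\rho$, $\sigma^2_{\eta}$ and the $\pi_t$, and the helper `var_z` is our own naming. Note that the sum in the first element of the second vector runs over the experience the individual already had at $t=1$, which is two years in this example.

```python
import math

# Made-up parameter values, purely for illustration
rho, s2_eta = 0.9, 0.04
pi = {1: 1.0, 2: 1.1, 3: 0.9, 4: 1.2, 5: 0.8}

def var_z(h, t):
    """Recursive variance of z, following the three equations above."""
    if h == 1:
        return pi[t] ** 2 * s2_eta
    if t == 1:
        return pi[1] ** 2 * s2_eta * sum(rho ** (2 * j) for j in range(h))
    return rho ** 2 * var_z(h - 1, t - 1) + pi[t] ** 2 * s2_eta

weights = [rho ** 6, rho ** 4, rho ** 2, 1.0]

# var(z_{4,5}): h < t, the recursion ends in var(z_{1,2})
v45 = s2_eta * sum(w * pi[t] ** 2 for w, t in zip(weights, [2, 3, 4, 5]))

# var(z_{5,4}): h > t, the recursion ends in var(z_{2,1})
exp_at_start = 5 - 4 + 1  # experience at t = 1
first = pi[1] ** 2 * sum(rho ** (2 * j) for j in range(exp_at_start))
loadings = [first, pi[2] ** 2, pi[3] ** 2, pi[4] ** 2]
v54 = s2_eta * sum(w * x for w, x in zip(weights, loadings))

print(math.isclose(v45, var_z(4, 5)), math.isclose(v54, var_z(5, 4)))
```

Both comparisons should agree up to floating point error, whatever parameter values are plugged in.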




In the [construction of the empirical covariance matrix](https://github.com/nilshg/psidJulia/blob/master/prepCovMat.ipynb) we are allowing for a maximum lag length of 29. The maximum experience that a 4-year-cohort could reach (without considering the length of the sample) is 41: upon entering, the age-midpoint is 22 (ages 20-24), while the cohort exits the sample with an age-midpoint of 62 (ages 60-64). For this reason, every cohort will have a $(41 \times 41)$ matrix holding all possible variances and covariances for each year of working life, although at most 29 values will be found in each column (the maximum lag length). 

For the cohort with an age-midpoint of 22 in 1968, this matrix would look as follows:

\begin{pmatrix}
v^1_{68} & 0           & \cdots &        & & 0 \\
c^2_{68,69} & v^2_{69} & 0      & \cdots & & \vdots \\
\vdots      &   \ddots    & \ddots &        & &  \\
            &             &        &        & &  \\
NA          & c^{29}_{69,96}& \cdots &     & v^{29}_{96} & \\
0            &    \cdots  &        &        & &  0 \\
\vdots      &             &        &        & &  \vdots \\
0           &   \cdots    &       &         & \cdots & 0
 \end{pmatrix}

where in a slight abuse of notation we use superscripts to indicate experience ($h$) and subscripts to indicate years ($t$). As we see, the bottom and right-hand parts of the matrix are left empty, as the cohort cannot reach the maximum experience of 41 due to sample limitations. 

Cohorts entering the sample (i.e. having an age-midpoint of 22) after 1968 will have a similarly structured covariance matrix, with observations on and below the diagonal from $(1,1)$ to $(T-t+1, T-t+1)$, where $T$ is the last year of observations (1996) and $t$ the year that the cohort is entering. In the extreme case, a cohort entering in 1996 would just have one entry in its variance-covariance matrix, but since we are restricting the sample to cohorts with at least 20 years of observations, practically the last cohort to enter does so in 1977 (leaving 20 years to 1996). Their covariance matrix will then be:

\begin{pmatrix}
v^1_{77} & 0           & \cdots &        & & 0 \\
c^2_{77,78} & v^2_{78} & 0      & \cdots & & \vdots \\
\vdots      &   \ddots    & \ddots &        & &  \\
            &             &        &        & &  \\
c^{20}_{77,96}   & c^{20}_{78,96}& \cdots &     & v^{20}_{96} & \\
0            &    \cdots  &        &        & &  0 \\
\vdots      &             &        &        & &  \vdots \\
0           &   \cdots    &       &         & \cdots & 0
 \end{pmatrix}
 
so that only the first 20 rows and columns contain non-zero elements (210 in total, including the 20 variances on the diagonal).
 
 On the other hand, we will also have cohorts that were older than 22 in 1968, i.e. that entered the labor market prior to the start of our sample period. These cohorts will then have missing observations in the first rows and columns of the matrix, e.g. the cohort that was 22 in 1967:
 
\begin{pmatrix}
0        & 0       & \cdots &        & & 0 \\
0        & v^2_{68}& 0      & \cdots & & \vdots \\
\vdots   &     & \ddots &        & &  \\
         &             &        &        & &  \\
         & c^{30}_{68,96}& \cdots &     & v^{30}_{96} & \\
0        &    \cdots  &        &        & &  0 \\
\vdots   &             &        &        & &  \vdots \\
0        &   \cdots    &       &         & \cdots & 0
 \end{pmatrix} 

Again, in the extreme we could have a cohort that was 62 in 1968, and would hence only have one observed variance (the $(41,41)$ element of the covariance matrix). And again, as we are restricting the sample to cohorts with at least 20 years of valid observations, the oldest possible cohort to remain in the sample is the one which is 43 in 1968. The covariance matrix will then be:

\begin{pmatrix}
0        & 0   &          & \cdots      &            &        &             & 0      \\
0        & 0   &          &             &            & \cdots &             & \vdots \\
\vdots   &     & \ddots   &             &            &        &             &        \\
         &     &          & v^{22}_{68} &            &        &             &        \\
         &     &          &             & v^{23}_{69}&        &             &        \\
         &     &          & \vdots      &            & \ddots &             &  \vdots \\
\vdots   &     &          &             &            &        & v^{40}_{86} &  0     \\
0        &     &          & c^{41}_{68,87} &            &        & \cdots      & v^{41}_{87}
 \end{pmatrix} 


This implies that there are only some cohorts for which we can actually observe all covariances up to the 29 lags we're allowing in the estimation: the cohorts entering from 1956 to 1968. Cohorts entering after 1968 will have fewer observations as the sample ends before their retirement, while cohorts entering before 1956 retire before the sample ends. 
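The pattern of filled and empty cells can be reproduced with a small helper that, for a given entry year, marks which elements of the $(41 \times 41)$ matrix are observable. This is a sketch under our own assumptions: the names `observed_mask` and `n_obs` are hypothetical, a cohort's entry year is taken to be the year in which its age-midpoint is 22 (so $h=1$), and we do not impose the lag limit separately, since with a sample running from 1968 to 1996 no column can hold more than 29 values anyway.

```python
FIRST_YEAR, LAST_YEAR = 1968, 1996  # sample window from the text
MAX_EXP = 41                        # maximum experience from the text

def observed_mask(entry_year):
    """41 x 41 lower-triangular mask, True where a (co)variance is observable.

    Row i and column j (both starting at 1) index experience; the year in
    which the cohort has experience h is entry_year + h - 1.
    """
    def in_sample(h):
        return FIRST_YEAR <= entry_year + h - 1 <= LAST_YEAR

    return [[j <= i and in_sample(i) and in_sample(j)
             for j in range(1, MAX_EXP + 1)]
            for i in range(1, MAX_EXP + 1)]

def n_obs(entry_year):
    """Number of observable variances and covariances for a cohort."""
    return sum(cell for row in observed_mask(entry_year) for cell in row)

# Cohort aged 43 in 1968, first and last cohort with all lags, last cohort kept
for entry in (1947, 1956, 1968, 1977):
    print(entry, n_obs(entry))
```

Cohorts entering between 1956 and 1968 all get the same count, while the count falls off symmetrically for earlier and later entrants.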

To hammer the point home, the following table lists cohorts by entry year:

| Entry Year | Age 1968 | Age 1996 | Years in sample | Obs in Var-Cov Matrix |
|------------|----------|----------|-----------------|-----------------------|
|  1977      |  13      |   41     |     20          |        210            |
|  1976      |  14      |   42     |     21          |        231            |
|  1975      |  15      |   43     |     22          |        253            |
