# Average of a quantity with correlations between measurements

## In the 1D case

In this example, we have a set of parallaxes:
\begin{equation}
P = 
     \begin{bmatrix}
        \varpi_1 \\
        \varpi_2 \\
        \vdots \\
        \varpi_N \\
     \end{bmatrix}
\end{equation}

and $S$ is the diagonal matrix made with their standard errors:
\begin{equation}
S = 
     \begin{bmatrix}
        \sigma_{\varpi_1} & 0 & \cdots & 0 \\
        0 & \sigma_{\varpi_2} & \cdots & 0 \\
        \vdots & \vdots & \ddots & \vdots \\
        0 & 0 & \cdots & \sigma_{\varpi_N} \\
     \end{bmatrix}
\end{equation}

If the measurement errors are correlated, with correlation coefficients given by the matrix $\Sigma$, the complete covariance matrix is $C = S \Sigma S$. In a *Gaia*-like catalogue, the correlations depend on the angular separation between stars. In the notation of Holl (2011):
\begin{equation}
    \mathrm{Cov}[x_i,x_j] = E[e_i e_j] = \rho_{ij} \sigma_i \sigma_j
\end{equation}
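where $e_i=x_i - x^{\mathrm{true}}_i$ is the error on measurement $i$, $\sigma_i$ and $\sigma_j$ are the standard uncertainties, and $\rho_{ij}$ is the correlation coefficient between the errors $e_i$ and $e_j$.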
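As a minimal sketch of how $C = S \Sigma S$ can be assembled numerically, the snippet below builds the covariance matrix from a vector of standard errors and a correlation matrix. The function name and the numerical values are hypothetical, chosen only for illustration; they are not taken from any catalogue.

```python
import numpy as np


def covariance_from_correlation(sigma, corr):
    """Return C = S @ corr @ S, where S = diag(sigma)."""
    sigma = np.asarray(sigma, dtype=float)
    corr = np.asarray(corr, dtype=float)
    S = np.diag(sigma)
    return S @ corr @ S


# Hypothetical example: three standard errors and a made-up correlation matrix.
sigma = np.array([0.1, 0.2, 0.4])
corr = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.25],
    [0.0, 0.25, 1.0],
])

C = covariance_from_correlation(sigma, corr)

# Each element is rho_ij * sigma_i * sigma_j; the diagonal holds the variances.
print(C)
```

The same matrix can be obtained element-wise as `np.outer(sigma, sigma) * corr`, which avoids building the diagonal matrix explicitly.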

How do we get the mean value of $P$, and its associated uncertainty?

Any quantity $y$ derived from the vector of estimated parameters $x = (x_1,\ldots,x_N)$ can be written as a function $y=f(x)$. If the errors are small, $f$ can be linearised around the estimated values, and the variance of $y$ is, to first order:
\begin{equation}
\begin{aligned}
    \sigma^{2}_y &= \left( \frac{\partial y}{\partial x} \right)^{T} \mathrm{Cov}(x) \left( \frac{\partial y}{\partial x} \right) \\
             &=  \displaystyle\sum_{i} \left( \frac{\partial y}{\partial x_i} \right)^2 \sigma^{2}_i   +  \displaystyle\sum_{i} \displaystyle\sum_{j \neq i} \frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j}  \rho_{ij} \sigma_i \sigma_j \\
\end{aligned}
\end{equation}
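The two lines of this equation are the same quantity written in matrix form and as explicit sums. A short numerical sketch can make that concrete. The gradient, uncertainties and correlations below are hypothetical values, and both function names are introduced here for illustration only.

```python
import numpy as np


def propagated_variance(grad, C):
    """First-order variance of y = f(x), as the quadratic form grad^T C grad."""
    grad = np.asarray(grad, dtype=float)
    return float(grad @ C @ grad)


def propagated_variance_sums(grad, sigma, corr):
    """Same quantity as a diagonal sum plus a double sum over pairs i != j."""
    n = len(sigma)
    diagonal = sum(grad[i] ** 2 * sigma[i] ** 2 for i in range(n))
    off_diagonal = sum(
        grad[i] * grad[j] * corr[i, j] * sigma[i] * sigma[j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return float(diagonal + off_diagonal)


# Hypothetical inputs: partial derivatives dy/dx_i, uncertainties, correlations.
grad = np.array([0.5, 0.3, 0.2])
sigma = np.array([0.1, 0.2, 0.4])
corr = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.25],
    [0.0, 0.25, 1.0],
])
C = np.outer(sigma, sigma) * corr  # Cov(x), element-wise rho_ij * sigma_i * sigma_j

var_matrix = propagated_variance(grad, C)
var_sums = propagated_variance_sums(grad, sigma, corr)
print(var_matrix, var_sums)  # the two forms agree to rounding error
```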

In our case, where $f(x)$ is the mean of $x$, we have:

\begin{equation}
    y = f(x) = \frac{1}{N} \displaystyle\sum_{i} x_i
\end{equation}
and

\begin{equation}
    \frac{\partial y}{\partial x_i} = \frac{1}{N}
\end{equation}

In the special case where all uncertainties $\sigma_i$ are equal, and correlation $\rho_{ij}$ between all pairs is equal too, the variance on the mean parallax is:
	\begin{equation}
		\sigma^{2}_y = \sigma^2 \left( \frac{1}{N} + \frac{N-1}{N} \rho \right)
	\end{equation}
This resembles the classical result, where the uncertainty on the mean value is $\frac{\sigma}{\sqrt{N}}$, but with an additional correlation term that does not vanish for large values of $N$: the variance tends to $\rho \sigma^2$ rather than to zero.
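As a sketch of where this expression comes from, one can substitute $\partial y / \partial x_i = 1/N$, $\sigma_i = \sigma$ and $\rho_{ij} = \rho$ into the general propagation formula. The first sum then has $N$ identical terms and the double sum has $N(N-1)$ identical terms:
\begin{equation}
\begin{aligned}
    \sigma^{2}_y &= \frac{1}{N^2} \, N \sigma^2 + \frac{1}{N^2} \, N(N-1) \, \rho \sigma^2 \\
                 &= \sigma^2 \left( \frac{1}{N} + \frac{N-1}{N} \rho \right)
\end{aligned}
\end{equation}
For instance, with the illustrative values $\rho = 0.1$ and $N = 100$, the bracket is $0.01 + 0.099 = 0.109$, roughly ten times the uncorrelated value $1/N = 0.01$.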

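The final variance in the general case is $\sigma^2_{\varpi} = (D^T C^{-1} D)^{-1}$, where $C$ is the full covariance matrix and $D$ is the design matrix, here a column vector of $N$ ones. This is the variance of the covariance-weighted (generalised least-squares) mean, which coincides with the arithmetic mean when all uncertainties and all correlations are equal.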

In the case where there are no correlations and the uncertainty is $e$ for all points, $C = e^2 I$, so $(D^T C^{-1} D)^{-1} = \frac{e^2}{N}$ and $\sigma=e/\sqrt{N}$.

**Once you have calculated this**, you can find the mean value as $\bar{\varpi} = \sigma^2_{\varpi} \, (D^T C^{-1} P)$, where $C$ is the full covariance matrix and $P$ the vector of parallaxes.
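The two steps (variance first, then mean) could be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions, not a reference one: the function name `weighted_mean_1d` and the parallax values are hypothetical, and the covariance matrix is assumed to be symmetric and invertible.

```python
import numpy as np


def weighted_mean_1d(x, C):
    """Covariance-weighted mean of x, given the full covariance matrix C.

    Returns (mean, variance), with variance = (D^T C^-1 D)^-1 and
    mean = variance * (D^T C^-1 x), where D is a column of ones.
    """
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    D = np.ones(x.size)
    Cinv_D = np.linalg.solve(C, D)    # C^-1 D, without forming the inverse
    variance = 1.0 / (D @ Cinv_D)
    mean = variance * (Cinv_D @ x)    # C symmetric: D^T C^-1 x = (C^-1 D)^T x
    return float(mean), float(variance)


# Hypothetical data: three parallaxes sharing the same uncertainty e.
x = np.array([1.0, 1.2, 0.8])
e = 0.3

# No correlations: the variance should be e^2 / N.
mean_u, var_u = weighted_mean_1d(x, e ** 2 * np.eye(3))

# Equal correlation rho between all pairs: sigma^2 (1/N + (N-1)/N * rho).
rho = 0.5
corr = np.full((3, 3), rho) + (1.0 - rho) * np.eye(3)
mean_c, var_c = weighted_mean_1d(x, e ** 2 * corr)

print(mean_u, var_u)
print(mean_c, var_c)
```

In this symmetric example the mean is unchanged by the correlations, while the variance increases from $e^2/N$ to $e^2 \left( \frac{1}{N} + \frac{N-1}{N} \rho \right)$.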



## In the 2D case

The equation is still $\sigma^2 = (D^T C^{-1} D)^{-1}$ for the variance of the mean, where $C$ is the full covariance matrix of the measurements. The result is now a $2\times 2$ matrix (it is diagonal only if the two components of the final mean are uncorrelated).

The mean is still $\sigma^2 (D^T C^{-1} X)$, where $C$ is the full covariance matrix and $X$ the vector of observations. In 2D there can be correlations between points, but also between the two measurements at each point. Typically for *Gaia*, the two proper motion components of a given star are correlated, and the proper motions of different stars are correlated as well.

If the observations are listed as $X=(x_1,y_1,x_2,y_2,\ldots,x_N,y_N)$ then the design matrix $D$ should have shape $2N\times 2$:
\begin{equation}
D = 
     \begin{bmatrix}
        1 & 0 \\
        0 & 1 \\
        1 & 0 \\
        0 & 1 \\
        \vdots & \vdots \\
        1 & 0 \\
        0 & 1 \\
     \end{bmatrix}
\end{equation}
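A design matrix of this form is simply the $2\times 2$ identity stacked $N$ times, which suggests the following sketch. The helper name `design_matrix_2d` and the data values are hypothetical.

```python
import numpy as np


def design_matrix_2d(n_points):
    """Stack n_points copies of the 2x2 identity: shape (2 * n_points, 2)."""
    return np.tile(np.eye(2), (n_points, 1))


D = design_matrix_2d(3)

# Hypothetical interleaved observations (x1, y1, x2, y2, x3, y3).
X = np.array([1.0, 10.0, 2.0, 20.0, 3.0, 30.0])

# D^T X sums the x terms and the y terms separately.
sums = D.T @ X
print(D.shape)
print(sums)
```

The first column of `D` picks out the $x_i$ entries and the second column the $y_i$ entries, so the product `D.T @ X` returns the two sums without combining them.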

The shape of $D$ is built so that the $x_i$ and $y_i$ terms do not mix. If the covariance matrix is diagonal, then there are no correlations (all the measurements are independent) and the contribution of each observation depends only on its own uncertainty (those with small uncertainties, and therefore large inverse variance, contribute more). However, if there are off-diagonal terms, then the contributions of different measurements do mix.

If there are correlations between the two quantities at a given point (but not between points), then the covariance matrix of the measurements is block diagonal, and so is its inverse. We end up with off-diagonal terms in the $2\times 2$ covariance matrix $\sigma^2$, which give the correlation between the mean $x$ and the mean $y$ we computed.
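The block-diagonal case can be illustrated with a small sketch. It assumes, purely for simplicity, that every point has the same $2\times 2$ covariance block; the function name `weighted_mean_2d` and all numerical values are hypothetical.

```python
import numpy as np


def weighted_mean_2d(X, C):
    """Covariance-weighted mean of interleaved (x1, y1, ..., xN, yN) data.

    C is the full 2N x 2N covariance matrix (assumed symmetric, invertible).
    Returns (mean, cov): a length-2 vector and its 2x2 covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    n_points = X.size // 2
    D = np.tile(np.eye(2), (n_points, 1))   # design matrix, shape (2N, 2)
    Cinv_D = np.linalg.solve(C, D)          # C^-1 D
    cov = np.linalg.inv(D.T @ Cinv_D)       # (D^T C^-1 D)^-1
    mean = cov @ (Cinv_D.T @ X)             # C symmetric: D^T C^-1 X
    return mean, cov


# Hypothetical 2x2 block: x and y of one point are correlated with each other.
block = np.array([
    [0.04, 0.01],
    [0.01, 0.09],
])
n_points = 3

# Block-diagonal covariance: no correlation between different points.
C = np.kron(np.eye(n_points), block)

X = np.array([1.0, 2.0, 1.2, 2.2, 0.8, 1.8])
mean, cov = weighted_mean_2d(X, C)

print(mean)  # mean x and mean y
print(cov)   # off-diagonal term: covariance between mean x and mean y
```

With identical blocks, the covariance of the mean reduces to the single-point block divided by $N$, so the correlation between the mean $x$ and the mean $y$ is the same as the correlation within each point.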