In [1]:
X = c(3.764, 5.059, 5.684, 5.807, 6.102, 6.216, 6.394, 7.366, 9.201, 10.02)
Y = c(8.695, 8.725, 9.181, 9.381, 9.837, 10.6, 11.86, 14.2, 14.25, 14.61)
Z = c(16.49, 19.64, 22.04, 22.88, 23.86, 24.05, 24.84, 25.72, 26.05, 26.58)

# Variance-Covariance matrix

 * **Variance**: a measure of the variability or spread in a set of data. It can be computed with the following formula:
 
 $$ \textrm{Var}(X) = \dfrac{\sum \left(X_i - \bar{X}\right)^2}{N} $$
 
 * **Covariance**: a measure of the extent to which corresponding elements from two sets of ordered data move in the same direction. We use the following formula to compute the covariance:
 
 $$ \textrm{Cov}(X,Y) = \dfrac{\sum \left( X_i - \bar{X} \right)\left(Y_i - \bar{Y} \right)}{N} $$
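
The two formulas above can be sketched directly in R. The helper names `pop_var` and `pop_cov` are hypothetical (they are not part of base R); both divide by $N$, exactly as the formulas do.

```r
# Population variance: mean squared deviation from the mean (divides by N).
pop_var <- function(x) {
  sum((x - mean(x))^2) / length(x)
}

# Population covariance: mean product of paired deviations (divides by N).
pop_cov <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum((x - mean(x)) * (y - mean(y))) / length(x)
}

X <- c(3.764, 5.059, 5.684, 5.807, 6.102, 6.216, 6.394, 7.366, 9.201, 10.02)
Y <- c(8.695, 8.725, 9.181, 9.381, 9.837, 10.6, 11.86, 14.2, 14.25, 14.61)

pop_var(X)
pop_cov(X, Y)
pop_cov(X, X)  # the covariance of a variable with itself is its variance
```

Because $\textrm{Cov}(X, X) = \textrm{Var}(X)$, the variance is just a special case of the covariance, which is why both fit in a single matrix.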

The variances and covariances are often displayed together in a variance-covariance matrix (also called the _covariance matrix_). For $m$ variables $X_1, \ldots, X_m$, it is the symmetric $m \times m$ matrix defined as:

$$ \mathbf{V} = \begin{bmatrix} \textrm{Var}(X_1) & \textrm{Cov}(X_1, X_2) & \ldots & \textrm{Cov}(X_1, X_m) \\ \textrm{Cov}(X_2, X_1) & \textrm{Var}(X_2) & \ldots & \textrm{Cov}(X_2, X_m) \\ \vdots & \vdots & \ddots & \vdots \\ \textrm{Cov}(X_m, X_1) & \textrm{Cov}(X_m, X_2) & \ldots & \textrm{Var}(X_m) \end{bmatrix} $$

## Example

In [2]:
n = length(X); n

In [3]:
A = cbind(X,Y,Z)

In [4]:
A

X,Y,Z
3.764,8.695,16.49
5.059,8.725,19.64
5.684,9.181,22.04
5.807,9.381,22.88
6.102,9.837,23.86
6.216,10.6,24.05
6.394,11.86,24.84
7.366,14.2,25.72
9.201,14.25,26.05
10.02,14.61,26.58


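First we transform the _raw_ scores in matrix $\mathbf{A}$ into _deviation scores_ in matrix $\mathbf{a}$ by subtracting each column's mean, using the formula:

$$ \mathbf{a} = \mathbf{A} - \mathbf{1}\mathbf{1}^T\mathbf{A} \cdot \frac{1}{n} $$

where $\mathbf{1}$ is an $n \times 1$ column vector of ones and $n$ is the number of rows (observations) in $\mathbf{A}$.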
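
To see why the subtracted term holds the column means, it may help to unpack it in two steps. With $\mathbf{1}$ read as the $n \times 1$ column of ones, and the columns of $\mathbf{A}$ being $X$, $Y$ and $Z$, the product $\mathbf{1}^T\mathbf{A}$ sums each column, and the left multiplication by $\mathbf{1}$ copies the resulting row $n$ times:

$$ \frac{1}{n}\mathbf{1}^T\mathbf{A} = \begin{bmatrix} \bar{X} & \bar{Y} & \bar{Z} \end{bmatrix}, \qquad \mathbf{1}\left(\frac{1}{n}\mathbf{1}^T\mathbf{A}\right) = \begin{bmatrix} \bar{X} & \bar{Y} & \bar{Z} \\ \vdots & \vdots & \vdots \\ \bar{X} & \bar{Y} & \bar{Z} \end{bmatrix} $$

Subtracting this $n \times 3$ matrix from $\mathbf{A}$ therefore replaces every entry with its deviation from its own column mean.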

In [5]:
# Outer product of a ones vector with itself: a 5 x 5 matrix of ones (the 1 1^T term for n = 5)
rep(1, 5) %*% t(rep(1, 5))

0,1,2,3,4
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1


In [6]:
a = A - rep(1, n) %*% t(rep(1, n)) %*% A / n

In [7]:
mean(X)

In [8]:
a

X,Y,Z
-2.7973,-2.4389,-6.725
-1.5023,-2.4089,-3.575
-0.8773,-1.9529,-1.175
-0.7543,-1.7529,-0.335
-0.4593,-1.2969,0.645
-0.3453,-0.5339,0.835
-0.1673,0.7261,1.625
0.8047,3.0661,2.505
2.6397,3.1161,2.835
3.4587,3.4761,3.365
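
As a cross-check, the same deviation scores can be obtained without building the $n \times n$ matrix of ones. The sketch below assumes only base R (`sweep`, `colMeans`, `scale`); the variable names `a_matrix`, `a_sweep` and `a_scale` are illustrative.

```r
X <- c(3.764, 5.059, 5.684, 5.807, 6.102, 6.216, 6.394, 7.366, 9.201, 10.02)
Y <- c(8.695, 8.725, 9.181, 9.381, 9.837, 10.6, 11.86, 14.2, 14.25, 14.61)
Z <- c(16.49, 19.64, 22.04, 22.88, 23.86, 24.05, 24.84, 25.72, 26.05, 26.58)
A <- cbind(X, Y, Z)
n <- nrow(A)

# Centering with the ones-matrix formula used in the text
ones <- rep(1, n)
a_matrix <- A - ones %*% t(ones) %*% A / n

# Centering by subtracting each column's mean directly
a_sweep <- sweep(A, 2, colMeans(A))

# Centering with scale(); scale = FALSE keeps the original units
a_scale <- scale(A, center = TRUE, scale = FALSE)

all.equal(a_matrix, a_sweep, check.attributes = FALSE)
all.equal(a_matrix, a_scale, check.attributes = FALSE)
```

The matrix formula makes the algebra explicit, while `sweep` avoids allocating an $n \times n$ matrix, which matters once $n$ is large.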


Then, to find the _deviation score_ sums of squares matrix, we compute $\mathbf{a}^T\mathbf{a}$:

In [9]:
t(a) %*% a

,X,Y,Z
X,31.35676,36.85091,45.74749
Y,36.85091,52.32605,56.0047
Z,45.74749,56.0047,88.88845


Finally, to create the _variance-covariance_ matrix, we divide each element of the deviation sums of squares matrix by $n$, i.e. $\mathbf{V} = \mathbf{a}^T\mathbf{a} \cdot \frac{1}{n}$:

In [27]:
V = t(a) %*% a * (1/n)  # n must equal nrow(A) = 10 here; re-run In [2] if n has been redefined since
V

,X,Y,Z
X,3.135676,3.685091,4.574749
Y,3.685091,5.232605,5.60047
Z,4.574749,5.60047,8.888845
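
R's built-in `cov()` divides by $n - 1$ rather than $n$, so rescaling it by $(n - 1)/n$ should reproduce $\mathbf{V}$. The following is a minimal sketch of that check; the name `V_from_cov` is illustrative, and `n` is taken from `nrow(A)` so the result does not depend on any earlier value of `n`.

```r
X <- c(3.764, 5.059, 5.684, 5.807, 6.102, 6.216, 6.394, 7.366, 9.201, 10.02)
Y <- c(8.695, 8.725, 9.181, 9.381, 9.837, 10.6, 11.86, 14.2, 14.25, 14.61)
Z <- c(16.49, 19.64, 22.04, 22.88, 23.86, 24.05, 24.84, 25.72, 26.05, 26.58)
A <- cbind(X, Y, Z)
n <- nrow(A)

# Deviation scores and the population covariance matrix (divide by n)
a <- sweep(A, 2, colMeans(A))
V <- crossprod(a) / n          # crossprod(a) is t(a) %*% a

# Sample covariance matrix from base R (divides by n - 1), rescaled to divide by n
V_from_cov <- cov(A) * (n - 1) / n

all.equal(V, V_from_cov)
```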


In [22]:
var(X)  # sample variance: R's var() divides by n - 1, not by n as in the formula above

In [12]:
sum(  (X - mean(X))^2 )

In [13]:
sum(  (X - mean(X))^2 ) / (10)

In [14]:
sqrt(8.888)  # standard deviation of Z: square root of Var(Z), about 8.888 when dividing by n = 10

## Example 2

In [15]:
B = cbind(
    c(90,90,60,60,30),
    c(60,90,60,60,30),
    c(90,30,60,90,30)
)

In [16]:
n = nrow(B)  # number of observations (rows); note this overwrites the n = 10 from the first example

In [17]:
b = B - rep(1,n) %*% t(rep(1,n)) %*% B / n

In [18]:
b

0,1,2
24,0,30
24,30,-30
-6,0,0
-6,0,30
-36,-30,-30


In [19]:
t(b) %*% b / n

0,1,2
504,360,180
360,360,0
180,0,720


In [20]:
var(c(90,90,60,60,30))  # sample variance (divides by n - 1 = 4): 630, not the 504 obtained by dividing by n = 5
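
Both examples follow the same three steps, so they could be wrapped into one function. The sketch below uses a hypothetical helper name, `pop_cov_matrix`; it keeps `n` local to the function, so the result does not depend on the order in which notebook cells were run.

```r
# Population variance-covariance matrix of the columns of M (divides by n).
pop_cov_matrix <- function(M) {
  M <- as.matrix(M)
  n <- nrow(M)                               # number of observations
  ones <- rep(1, n)
  dev <- M - ones %*% t(ones) %*% M / n      # deviation scores
  t(dev) %*% dev / n                         # sums of squares divided by n
}

B <- cbind(
  c(90, 90, 60, 60, 30),
  c(60, 90, 60, 60, 30),
  c(90, 30, 60, 90, 30)
)

pop_cov_matrix(B)

# var() divides by n - 1; rescaling recovers the first diagonal entry
var(B[, 1])
var(B[, 1]) * (nrow(B) - 1) / nrow(B)
```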