__correlation__

$corr(X, Y) = \frac{cov(X,Y)}{\sqrt{VarX \cdot Var Y}}$

If $Y = kX+b$ with $k \neq 0$, then:

$corr(X, Y) = corr(X, kX+b) = \frac{cov(X,kX+b)}{\sqrt{VarX \cdot Var(kX+b)}} = \frac{k \cdot Var X}{\sqrt{Var X \cdot Var(kX)}}$  
$= \frac{k \cdot Var X}{\sqrt{k^2(Var X)^2}} = \frac{k \cdot Var X}{|k| \cdot Var X} = \frac{k}{|k|}$  
$=\begin{cases}
1\; \mbox{if} \; k>0 \\
-1\; \mbox{if} \; k<0
\end{cases}$
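
The result above can be checked numerically. The following is a minimal sketch in plain Python; the helper `pearson` and the sample values are illustrative assumptions, not part of the original notes. It treats a finite list of values as a uniform distribution over those values.

```python
import math

def pearson(xs, ys):
    """Correlation of two equal-length lists, treated as equally likely outcomes."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 4.0, 7.0, 11.0]  # arbitrary illustrative values

# One positive and one negative slope; the intercept b should not matter.
for k, b in [(3.0, 5.0), (-0.5, 2.0)]:
    ys = [k * x + b for x in xs]
    print(k, b, pearson(xs, ys))
```

Up to floating-point rounding, the printed correlation should be 1 for the positive slope and -1 for the negative one.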

__Properties:__

1. $corr(k\cdot X, Y) = corr(X, Y)$ if $k>0$

2. $corr(X, Y) \in [-1, 1]$   
    $|corr(X,Y)| = 1 \iff Y = kX+b$ for some $k \neq 0$ (with probability 1)  
    $corr(X,Y)=0 \iff cov(X,Y) = 0$; such $X$ and $Y$ are called uncorrelated
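
As a rough numerical illustration of property 1 and of the bound in property 2, here is a small sketch. The helper `pearson` and the data are assumptions made up for this example. It also shows the standard companion fact that a negative factor $k$ flips the sign of the correlation.

```python
import math

def pearson(xs, ys):
    """Correlation of two equal-length lists, treated as equally likely outcomes."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

xs = [2.0, 3.0, 5.0, 8.0, 13.0]  # arbitrary illustrative values
ys = [1.0, 4.0, 2.0, 6.0, 5.0]

base = pearson(xs, ys)

# Property 1: scaling X by any k > 0 leaves the correlation unchanged.
for k in (0.1, 2.0, 100.0):
    scaled = pearson([k * x for x in xs], ys)
    print(k, math.isclose(scaled, base))

# A negative factor flips the sign.
flipped = pearson([-3.0 * x for x in xs], ys)
print(math.isclose(flipped, -base))

# Property 2: the value lies in [-1, 1].
print(-1.0 <= base <= 1.0)
```

Every line should print `True`; `math.isclose` is used instead of `==` because the results agree only up to floating-point rounding.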

__example__

- if X and Y are independent then $corr(X, Y) = 0$
- the closer the absolute value of the correlation is to 1, the stronger the linear dependence between the random variables
- $corr(X,-X) = \frac{cov(X,-X)}{\sqrt{VarX \cdot Var(-X)}} = \frac{-VarX}{VarX} = -1$
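
The first bullet can be verified exactly for a small discrete case. The sketch below assumes two made-up marginal distributions `px` and `py` (not from the notes), builds the joint distribution of independent variables as their product, and computes the covariance with exact fractions so that no rounding is involved.

```python
from fractions import Fraction as F

# Hypothetical marginal distributions, chosen only for illustration.
px = {0: F(1, 4), 1: F(3, 4)}
py = {-1: F(1, 3), 0: F(1, 6), 2: F(1, 2)}

# Independence means P(X=x, Y=y) = P(X=x) * P(Y=y).
joint = {(x, y): px[x] * py[y] for x in px for y in py}

ex = sum(x * p for x, p in px.items())
ey = sum(y * p for y, p in py.items())

# cov(X, Y) = E[(X - EX)(Y - EY)]
cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())

print(cov)  # 0
```

Since the covariance is exactly zero and both variances are positive, the correlation is zero as well.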

__example__

A questionnaire contains two yes/no questions. Let X and Y be the indicator random variables of a positive answer to the first and the second question, respectively. The joint probability distribution of X and Y is given by the table:

|     | X=0 | X=1 |
|-----|-----|-----|
| Y=0 | 0.8 | 0.1 |
| Y=1 | 0.05| 0.05|


$\mathbb{E}X = (0.8 + 0.05) \times 0 + (0.1+0.05) \times 1 = 0.15$

$\mathbb{E}Y = (0.8 + 0.1) \times 0 + (0.05+0.05) \times 1 = 0.1$

In [10]:
import math

# Expectations computed above
ex = 0.15
ey = 0.1
# Marginal distributions of X and Y (column and row sums of the table)
x = {0: 0.85, 1: 0.15}
y = {0: 0.9, 1: 0.1}
# Joint distribution, keyed by (x, y)
p = {(0,0):0.8, (1,0):0.1, (0,1):0.05, (1,1):0.05}

varx = sum((xv - ex)**2 * xp for xv, xp in x.items())
vary = sum((yv - ey)**2 * yp for yv, yp in y.items())

print(round(varx, 4), round(vary, 4))

cov = sum((xv - ex) * (yv - ey) * prob for (xv, yv), prob in p.items())

print(round(cov, 4))

corr = cov / math.sqrt(varx * vary)

print(round(corr, 4))

0.1275 0.09
0.035
0.3267
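
Hand-typed marginals can drift out of sync with the table, so it may be safer to derive every quantity from the joint distribution alone. The function below is a sketch under that assumption; the name `correlation_from_joint` is hypothetical, and it uses the shortcut formulas $cov(X,Y) = \mathbb{E}XY - \mathbb{E}X \cdot \mathbb{E}Y$ and $VarX = \mathbb{E}X^2 - (\mathbb{E}X)^2$.

```python
import math

def correlation_from_joint(p):
    """Correlation of (X, Y) from a joint pmf given as {(x, y): probability}."""
    ex = sum(x * pr for (x, y), pr in p.items())
    ey = sum(y * pr for (x, y), pr in p.items())
    exy = sum(x * y * pr for (x, y), pr in p.items())
    ex2 = sum(x * x * pr for (x, y), pr in p.items())
    ey2 = sum(y * y * pr for (x, y), pr in p.items())
    cov = exy - ex * ey
    var_x = ex2 - ex ** 2
    var_y = ey2 - ey ** 2
    return cov / math.sqrt(var_x * var_y)

# Joint distribution from the table, keyed by (x, y).
p = {(0, 0): 0.8, (1, 0): 0.1, (0, 1): 0.05, (1, 1): 0.05}

print(correlation_from_joint(p))
```

Because the marginals are never entered separately, they cannot disagree with the joint table.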
