# Expectation and Variance

<a href="#Expectation">Expectation</a>

<a href="#Properties-of-Expectation">Properties of Expectation</a>

<a href="#Variance">Variance</a>

<a href="#Properties-of-Variance">Properties of Variance</a>

<a href="#Mean-and-Variance-of-Sum-of-Random-Variables">Mean and Variance of Sum of Random Variables</a>

<a href="#Mean-and-Variance-of-Weighted-Sum-of-Random-Variables">Mean and Variance of Weighted Sum of Random Variables</a>

<a href="#Mean-and-Variance-of-Weighted-Sum-of-Random-Variables---Matrix-Form">Mean and Variance of Weighted Sum of Random Variables - Matrix Form</a>

<a href="#Standard-Deviation">Standard Deviation</a>

<a href="#Standardization-and-Reverse-Standardization">Standardization and Reverse Standardization</a>

<a href="#scipy.stats">scipy.stats</a>

<a href="#Skewness-and-Kurtosis">Skewness and Kurtosis</a>

# Expectation

### Definition

$$\begin{array}{lllllllll}
\mbox{Discrete}&EX&=&\displaystyle\sum_x x * P(X=x)\\
\mbox{Continuous}&EX&=&\displaystyle\int_{-\infty}^{\infty} x f(x)dx\\
\end{array}$$

### Interpretation 1
\begin{eqnarray}
\underbrace{\mathbb{E}(X)}_{\mbox{Expected payoff}}
&=&\quad\quad\sum_{x_i}\quad \underbrace{x_i}_{\mbox{Payoff}}\quad\times \underbrace{p_{x_i}}_{\mbox{ Probability}}\nonumber
\end{eqnarray}

### Interpretation 2
\begin{eqnarray}
\underbrace{\mathbb{E}(X)}_{\mbox{Area under curve}}
&=&\quad\quad\sum_{x_i}\quad \underbrace{x_i}_{\mbox{Height}}\quad\times\quad \underbrace{P(X=x_i)}_{\mbox{Width}}\nonumber
\end{eqnarray}

<div align="center"><img src="img/interpretation 2.png" width="50%"></div>

[<a href="#Expectation-and-Variance">Back to top</a>]

# Properties of Expectation

### Expectation as a linear operator
\begin{eqnarray}
(1)&&\quad\displaystyle \mathbb{E}(X+Y)=\mathbb{E}(X)+\mathbb{E}(Y)\nonumber\\
(2)&&\quad\displaystyle \mathbb{E}(aX)=a\mathbb{E}(X)\nonumber\\
(3)&&\quad\displaystyle \mathbb{E}(a)=a\nonumber
\end{eqnarray}

### Change of variable 
\begin{eqnarray}
(4)&&\quad\displaystyle \mathbb{E}[g(X)]=\sum_{x_i}g(x_i)p_{x_i}\nonumber\\
(5)&&\quad\displaystyle \mathbb{E}[g(X,Y)]=\sum_{x_i}\sum_{y_j}g(x_i,y_j)p_{x_i,y_j}\nonumber
\end{eqnarray}

### Product of independent random variables
\begin{array}{llllllll}
(6)&&\quad\displaystyle \mathbb{E}\left[XY\right]=\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]&&\mbox{if $X$ and $Y$ are independent}\nonumber\\
\nonumber\\
(7)&&\quad\displaystyle \mathbb{E}\left[g(X)h(Y)\right]=\mathbb{E}\left[g(X)\right]\mathbb{E}\left[h(Y)\right]&&\mbox{if $X$ and $Y$ are independent}\nonumber
\end{array}

### No free lunch
\begin{eqnarray}
(8)\ &&\quad\displaystyle X\ge 0\  \quad\Rightarrow\quad \mathbb{E}(X)\ge 0\nonumber\\
\nonumber\\
(9)\ &&\quad\displaystyle X\ge Y\quad\Rightarrow\quad \mathbb{E}(X)\ge \mathbb{E}(Y)\nonumber\\
\nonumber\\
(10)&&\quad\displaystyle |\mathbb{E}(X)|\le \mathbb{E}(|X|)\nonumber
\end{eqnarray}

### Cauchy-Schwarz inequality
\begin{eqnarray}
(11)&&\quad\displaystyle \mathbb{E}|XY|\le (\mathbb{E}X^2)^{1/2}(\mathbb{E}Y^2)^{1/2}\nonumber
\end{eqnarray}

\begin{eqnarray}
\mathbb{E}(X+Y)
&=&\sum_{x_i}\sum_{y_j}(x_i+y_j)p_{x_i,y_j}\nonumber\\
&=&\sum_{x_i}\sum_{y_j}x_ip_{x_i,y_j}+\sum_{x_i}\sum_{y_j}y_jp_{x_i,y_j}\nonumber\\
&=&\sum_{x_i}x_i\left(\sum_{y_j}p_{x_i,y_j}\right)+\sum_{y_j}y_j\left(\sum_{x_i}p_{x_i,y_j}\right)\nonumber\\
&=&\sum_{x_i}x_ip_{x_i}+\sum_{y_j}y_jp_{y_j}\nonumber\\
&=&\mathbb{E}(X)+\mathbb{E}(Y)\nonumber
\end{eqnarray}

<div align="center"><img src="img/Change of variable.png" width="50%"></div>

\begin{eqnarray}
\mathbb{E}[g(X)]
&=&\sum_{g_k}g_kP(g(X)=g_k)\nonumber\\
&=&\sum_{g_k}g_k\left(\sum_{x_i,\ g(x_i)=g_k}p_{x_i}\right)\nonumber\\
%&=&\sum_{g_k}\left(\sum_{x_i,\ g(x_i)=g_k}{\color{red}g_k}p_{x_i}\right)\nonumber\\
&=&
\sum_{g_k}\left(\sum_{x_i,\ g(x_i)=g_k}g(x_i)p_{x_i}\right)\nonumber\\
%&=&\sum_{g_k}\sum_{x_i,\ g(x_i)=g_k}g(x_i)p_{x_i}\nonumber\\
&=&\sum_{x_i}\ g(x_i)p_{x_i}\nonumber
\end{eqnarray}
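
As a small check of the change-of-variable formula (4), the sketch below computes $\mathbb{E}[g(X)]$ in two ways. The example is hypothetical and not taken from the text above: $X$ is assumed uniform on $\{-2,-1,0,1,2\}$ and $g(x)=x^2$. The first way builds the distribution of $g(X)$ by grouping the $x_i$ that share the same value $g_k$; the second sums $g(x_i)p_{x_i}$ directly.

```python
from collections import defaultdict
from fractions import Fraction

# Illustrative assumption: X is uniform on {-2, -1, 0, 1, 2} and g(x) = x^2.
xs = [-2, -1, 0, 1, 2]
p = {x: Fraction(1, 5) for x in xs}

def g(x):
    return x * x

# Way 1: build the pmf of g(X) by grouping the x_i with g(x_i) = g_k.
pmf_g = defaultdict(Fraction)
for x in xs:
    pmf_g[g(x)] += p[x]
via_pmf_of_g = sum(gk * prob for gk, prob in pmf_g.items())

# Way 2: sum g(x_i) * p_{x_i} directly, without finding the pmf of g(X).
via_direct_sum = sum(g(x) * p[x] for x in xs)

print(via_pmf_of_g, via_direct_sum)   # 2 2
```

Exact fractions are used so that the two results can be compared without rounding error.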

\begin{eqnarray}
\mathbb{E}[XY]
&=&
\sum_{x_i}\sum_{y_j}x_iy_jp_{x_i,y_j}
=
\sum_{x_i}\sum_{y_j}x_iy_jp_{x_i}p_{y_j}\nonumber\\
&=&
\left(\sum_{x_i}x_ip_{x_i}\right)
\left(\sum_{y_j}y_jp_{y_j}\right)
=\mathbb{E}[X]\mathbb{E}[Y]\nonumber
\end{eqnarray}
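
Property (6) can be checked exactly on a small example. The sketch below assumes two independent fair dice (an illustrative choice, not part of the derivation) and then contrasts it with the fully dependent case $Y=X$, where the factorization fails.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)          # pmf of one fair die (illustrative assumption)

# Independent dice: the joint pmf factorizes, p_{x,y} = p_x * p_y.
e_x = sum(x * p for x in faces)
e_xy_indep = sum(x * y * p * p for x, y in product(faces, faces))

# Fully dependent case Y = X: E[XY] = E[X^2], which differs from (E X)^2.
e_xy_dep = sum(x * x * p for x in faces)

print(e_xy_indep, e_x * e_x)   # 49/4 49/4
print(e_xy_dep)                # 91/6
```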

$$
X\ge 0
\ \ \Rightarrow\ \ 
x_i\ge 0
\ \ \Rightarrow\ \ 
\mathbb{E}(X)=\sum_{x_i}x_ip_{x_i}\ge 0
$$

\begin{eqnarray}
X\ge Y
\quad\Rightarrow\quad
X-Y\ge 0
&\Rightarrow&
\mathbb{E}(X-Y)\ge 0\nonumber\\
&\Rightarrow&
\mathbb{E}(X)-\mathbb{E}(Y)\ge 0
\quad\Rightarrow\quad
\mathbb{E}(X)\ge \mathbb{E}(Y)\nonumber
\end{eqnarray}

### Proof of Cauchy-Schwarz inequality

If $\mathbb{E}X^2=0$, then $X=0$ with probability 1 and hence $\mathbb{E}|XY|=0$.
By the same token, 
if $\mathbb{E}Y^2=0$, then $\mathbb{E}|XY|=0$.
In these two extreme cases (11) holds trivially,
so we may assume that $\mathbb{E}X^2>0$ and $\mathbb{E}Y^2>0$.

Choose $t=-\mathbb{E}|XY|/\mathbb{E}Y^2$, for which
$\mathbb{E}X^2+2t\mathbb{E}|XY|+t^2\mathbb{E}Y^2=\mathbb{E}X^2-(\mathbb{E}|XY|)^2/\mathbb{E}Y^2$. Then
\begin{eqnarray}
\displaystyle 
(|X|+t|Y|)^2
=X^2+2t|XY|+t^2Y^2\ge 0
&\quad\Rightarrow\quad&\displaystyle 
\mathbb{E}X^2+2t\mathbb{E}|XY|+t^2\mathbb{E}Y^2\ge 0\nonumber\\
&\quad\Rightarrow\quad&\displaystyle 
(\mathbb{E}X^2)(\mathbb{E}Y^2)\ge (\mathbb{E}|XY|)^2\nonumber
\end{eqnarray}
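
The inequality (11) can also be checked numerically. The following is a minimal sketch on a randomly generated joint pmf; the grid size, the values of $X$ and $Y$, and the seed are arbitrary assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily

# Hypothetical joint pmf on a 4 x 5 grid of (x_i, y_j) values.
x = rng.normal(size=4)
y = rng.normal(size=5)
joint = rng.random((4, 5))
joint /= joint.sum()             # normalize so the probabilities sum to 1

lhs = np.sum(np.abs(np.outer(x, y)) * joint)   # E|XY|
e_x2 = np.sum(x**2 * joint.sum(axis=1))        # E X^2 from the marginal of X
e_y2 = np.sum(y**2 * joint.sum(axis=0))        # E Y^2 from the marginal of Y
rhs = np.sqrt(e_x2) * np.sqrt(e_y2)

print(lhs, rhs, lhs <= rhs)
```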

### Example - Expectation of coin related random variables

\begin{array}{ccc}\hline
\mbox{Distribution}&\mbox{Expectation}&\mbox{Variance}\\\hline
B(p)&p&pq\\
B(n,p)&np&npq\\
Geo(p)&\frac{1}{p}&\frac{q}{p^2}\\
NB(r,p)&\frac{r}{p}&\frac{rq}{p^2}\\\hline
\end{array}

$$
X\sim B(p)\ \ \Rightarrow\ \ \mathbb{E}(X)=1\times p+0\times q=p
$$


\begin{eqnarray}
X\sim B(n,p)
&\Rightarrow&X=\sum_{i=1}^nX_i,\ X_i\ \mbox{iid}\ B(p)\nonumber\\
&\Rightarrow&\mathbb{E}(X)=\sum_{i=1}^n\mathbb{E}(X_i)=np\nonumber
\end{eqnarray}



\begin{eqnarray}
1+x+x^2+\cdots=\frac{1}{1-x}
&\stackrel{\mbox{Diff wrt $x$}}{\Rightarrow}&1+2x+3x^2+\cdots=\frac{1}{(1-x)^2}\nonumber\\
&\stackrel{\mbox{Let $x=q$}}{\Rightarrow}&1+2q+3q^2+\cdots=\frac{1}{p^2}\nonumber\\
&\Rightarrow&X\sim Geo(p)\ \ \Rightarrow\ \  \mathbb{E}(X)=\left(\sum_{k=1}^{\infty} kq^{k-1}\right)p=\frac{1}{p}\nonumber
\end{eqnarray}


\begin{eqnarray}
X\sim NB(r,p)
&\Rightarrow&X=\sum_{i=1}^rX_i,\ X_i\ \mbox{iid}\ Geo(p)\nonumber\\
&\Rightarrow&\mathbb{E}(X)=\sum_{i=1}^r\mathbb{E}(X_i)=\frac{r}{p}\nonumber
\end{eqnarray}

### Example - Maximization of expected profit

A newsboy purchases papers at 10 cents and sells them at 15 cents. However, he is not allowed to return unsold papers. 
If his daily demand $X$ is $B(n,p)$ 
with $n=10$, $p=0.4$, 
approximately how many papers should he purchase 
so as to maximize his expected profit?

If he purchases $t$ papers,
his profit $Y(t)$ (in cents) and expected profit $f(t)=EY(t)$ are
\begin{eqnarray}
Y(t)
&=&
\left\{\begin{array}{ll}
5t&\mbox{if}\ X\ge t\\ 
5X-10(t-X)&\mbox{if}\ X< t
\end{array}\right.\nonumber\\
f(t)
&=&
5tP(X\ge t)+\sum_{i=0}^{t-1}(15i-10t)P(X=i)\nonumber
\end{eqnarray}


To find an optimal $t_0$ that maximizes $f(t)$,
take discrete differences of $f(t)$:
find $t_0$ with
\begin{eqnarray}
f(t_0)-f(t_0-1)&=&15P(X\ge t_0)-10\ge 0\nonumber\\
f(t_0+1)-f(t_0)&=&15P(X\ge t_0+1)-10\le 0\nonumber
\end{eqnarray}
or
$$
P(X\le t_0-1)\le \frac{1}{3}\ \ \ \ \ \mbox{and}\ \ \ \ \ P(X\le t_0)\ge \frac{1}{3}
$$
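
The condition above can be confirmed by brute force. The sketch below evaluates $f(t)$ for every $t=0,\dots,n$ with $n=10$, $p=0.4$ and picks the maximizer; the helper names `pmf`, `cdf` and `expected_profit` are introduced here for illustration only.

```python
from math import comb

n, p = 10, 0.4               # demand X ~ B(n, p), as in the example

def pmf(k):
    """P(X = k) for the binomial demand."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cdf(k):
    """P(X <= k)."""
    return sum(pmf(i) for i in range(k + 1))

def expected_profit(t):
    """f(t) = 5 t P(X >= t) + sum_{i < t} (15 i - 10 t) P(X = i), in cents."""
    sold_out = 5 * t * sum(pmf(k) for k in range(t, n + 1))
    leftovers = sum((15 * i - 10 * t) * pmf(i) for i in range(t))
    return sold_out + leftovers

f = [expected_profit(t) for t in range(n + 1)]
t0 = max(range(n + 1), key=lambda t: f[t])

print("optimal t0:", t0)                                   # optimal t0: 3
print("P(X <= t0-1), P(X <= t0):", cdf(t0 - 1), cdf(t0))   # about 0.167 and 0.382
```

Here $P(X\le 2)\approx 0.167\le 1/3$ and $P(X\le 3)\approx 0.382\ge 1/3$, so the brute-force maximizer agrees with the two inequalities.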

[<a href="#Expectation-and-Variance">Back to top</a>]

# Variance

### Variance
$$Var(X)=\mathbb{E}(X-\mathbb{E}X)^2=\mathbb{E}X^2-\left(\mathbb{E}X\right)^2$$

### Covariance
$$Cov(X,Y)=\mathbb{E}[(X-\mathbb{E}X)(Y-\mathbb{E}Y)]
=\mathbb{E}(XY)-(\mathbb{E}X)(\mathbb{E}Y)$$

### Correlation coefficient
$$-1\le \rho=\frac{Cov(X,Y)}{\sqrt{Var(X)}\sqrt{Var(Y)}}\le 1$$

\begin{eqnarray}
Var(X)
&=&\mathbb{E}(X-\mathbb{E}X)^2\nonumber\\
&=&\mathbb{E}(X^2-2(\mathbb{E}X)X+(\mathbb{E}X)^2)\nonumber\\
&=&\mathbb{E}X^2-2(\mathbb{E}X)\mathbb{E}X+(\mathbb{E}X)^2\nonumber\\
&=&\mathbb{E}X^2-(\mathbb{E}X)^2\nonumber
\end{eqnarray}


\begin{eqnarray}
Cov(X,Y)
&=&\mathbb{E}[(X-\mathbb{E}X)(Y-\mathbb{E}Y)]\nonumber\\
&=&\mathbb{E}[XY-(\mathbb{E}Y)X-(\mathbb{E}X)Y+(\mathbb{E}X)(\mathbb{E}Y)]\nonumber\\
&=&\mathbb{E}(XY)-(\mathbb{E}Y)(\mathbb{E}X)-(\mathbb{E}X)(\mathbb{E}Y)+(\mathbb{E}X)(\mathbb{E}Y)\nonumber\\
&=&\mathbb{E}(XY)-(\mathbb{E}Y)(\mathbb{E}X)\nonumber
\end{eqnarray}


\begin{eqnarray}
|Cov(X,Y)|
&\le& \mathbb{E}\left|(X-\mathbb{E}X)(Y-\mathbb{E}Y)\right|\nonumber\\
&\le& (\mathbb{E}(X-\mathbb{E}X)^2)^{1/2}(\mathbb{E}(Y-\mathbb{E}Y)^2)^{1/2}=\sqrt{Var(X)}\sqrt{Var(Y)}\nonumber
\end{eqnarray}

$$\begin{array}{lllllllll}
\mbox{Discrete}&EX^2&=&\displaystyle\sum_x x^2 * P(X=x)\\
\mbox{Continuous}&EX^2&=&\displaystyle\int_{-\infty}^{\infty} x^2 f(x)dx\\
\end{array}$$

$$\begin{array}{lllllllll}
\mbox{Discrete}&Var(X)&=&E(X-EX)^2&=&\displaystyle\sum_x (x-EX)^2 * P(X=x)\\
\mbox{Continuous}&Var(X)&=&E(X-EX)^2&=&\displaystyle\int_{-\infty}^{\infty} (x-EX)^2 f(x)dx\\
\end{array}$$

### Covariance and correlation coefficient

### np.cov

- Sample variance

### np.var

- Population variance

### np.corrcoef

In [7]:
import numpy as np

n = 100
x = np.random.uniform(0, 1, (1, n))
y = x + np.random.uniform(0, 1, (1, n))
z = x + y + np.random.uniform(0, 1, (1, n))
X = np.vstack([x,y,z])

# covariance
print(np.cov(X)) # sample covariance
print()

# correlation coefficient
print(np.corrcoef(X))
print()

# variance
print(np.cov(x)) # sample variance
print(np.var(x)) # population variance
print()

print(np.sum((x-(np.sum(x)/n))**2)/(n-1)) # sample variance
print(np.sum((x-(np.sum(x)/n))**2)/n) # population variance

[[0.06850187 0.07579437 0.14465894]
 [0.07579437 0.15955842 0.2263271 ]
 [0.14465894 0.2263271  0.44345162]]

[[1.         0.72497975 0.82998629]
 [0.72497975 1.         0.85085113]
 [0.82998629 0.85085113 1.        ]]

0.0685018746996935
0.06781685595269657

0.0685018746996935
0.06781685595269657


[<a href="#Expectation-and-Variance">Back to top</a>]

# Properties of Variance

\begin{eqnarray}
(1)&&\quad\displaystyle Var(X)=Cov(X,X)\nonumber\\
(2)&&\quad\displaystyle Cov(aX+bY,Z)=a Cov(X,Z)+b Cov(Y,Z)\nonumber\\
(2')&&\quad\displaystyle Cov(Z,aX+bY)=a Cov(Z,X)+b Cov(Z,Y)\nonumber\\
(3)&&\quad\displaystyle Cov(X,Y)=Cov(Y,X)\nonumber\\
(4)&&\quad\displaystyle Cov(X,a)=Cov(a,X)=0\nonumber\\
(5)&&\quad\displaystyle Cov(X,Y)=0\ \ \ \ \ \mbox{if $X$ and $Y$ are independent}\nonumber
\end{eqnarray}

$$
Var(X)
=\mathbb{E}(X-\mathbb{E}X)^2
=\mathbb{E}[(X-\mathbb{E}X)(X-\mathbb{E}X)]
=Cov(X,X)
$$


\begin{eqnarray}
Cov(aX+bY,Z)
&=&\mathbb{E}(aX+bY-a\mathbb{E}X-b\mathbb{E}Y)(Z-\mathbb{E}Z)\nonumber\\
&=&\mathbb{E}[a(X-\mathbb{E}X)+b(Y-\mathbb{E}Y)](Z-\mathbb{E}Z)\nonumber\\
&=&a\mathbb{E}[(X-\mathbb{E}X)(Z-\mathbb{E}Z)]+b\mathbb{E}[(Y-\mathbb{E}Y)(Z-\mathbb{E}Z)]\nonumber\\
&=&aCov(X,Z)+bCov(Y,Z)\nonumber
\end{eqnarray}


$$
Cov(X,Y)
=\mathbb{E}[(X-\mathbb{E}X)(Y-\mathbb{E}Y)]
=\mathbb{E}[(Y-\mathbb{E}Y)(X-\mathbb{E}X)]
=Cov(Y,X)
$$


$$
Cov(X,a)
=\mathbb{E}[(X-\mathbb{E}X)(a-a)]
=0
$$


\begin{eqnarray}
X,\ Y\ \mbox{independent}
&\Rightarrow&\mathbb{E}(XY)=\mathbb{E}(X)\mathbb{E}(Y)\nonumber\\
&\Rightarrow&Cov(X,Y)=\mathbb{E}(XY)-\mathbb{E}(X)\mathbb{E}(Y)=0\nonumber
\end{eqnarray}

### Jensen's inequality

##### Definition

$\varphi:\mathbb{R}\rightarrow\mathbb{R}$ is 
convex if for any $x$, $y$ and $0<\lambda<1$
$$
\varphi(\lambda x+(1-\lambda)y)
\le
\lambda \varphi(x)+(1-\lambda)\varphi(y)
$$
$\varphi:\mathbb{R}\rightarrow\mathbb{R}$ is 
strictly convex if for any $x$, $y$ and $0<\lambda<1$
$$
\varphi(\lambda x+(1-\lambda)y)
<
\lambda \varphi(x)+(1-\lambda)\varphi(y)
$$

##### Jensen's inequality

\begin{array}{llllll}
\mbox{If $\varphi$ convex,}&&\mathbb{E}\varphi(X)\ge\varphi(\mathbb{E}X)\nonumber\\
\nonumber\\
\mbox{If $\varphi$ strictly convex,}&&\mathbb{E}\varphi(X)=\varphi(\mathbb{E}X)\ \ \Leftrightarrow\ \ X=\mathbb{E}X\nonumber
\end{array}

##### Example

\begin{array}{lllll}
\varphi(x)=x^2\quad\quad\ &\quad\Rightarrow\quad&\mathbb{E}X^2\ge (\mathbb{E}X)^2\nonumber\\
\nonumber\\
\varphi(x)=|x|\quad\quad\ &\quad\Rightarrow\quad&\mathbb{E}|X|\ge |\mathbb{E}X|\nonumber\\
\nonumber\\
\varphi(x)=e^x\quad\quad\ &\quad\Rightarrow\quad&\mathbb{E}e^X\ge e^{\mathbb{E}X}\nonumber\\
\nonumber\\
\varphi(x)=-\log x\ &\quad\Rightarrow\quad&\mathbb{E}\log X\le \log \mathbb{E}X\ \ \mbox{for $X>0$}\nonumber
\end{array}

<div align="center"><img src="img/Jensen ineq.png" width="40%"></div>

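Since $\varphi$ is convex, its graph has a supporting line at $\mu=\mathbb{E}X$: there is a slope $\alpha$ with $\varphi(x)\ge \alpha(x-\mu)+\varphi(\mu)$ for all $x$. Hence

$$
\varphi(X)\ge \alpha(X-\mu)+\varphi(\mu)
\ \ \stackrel{\mbox{Take expectation}}{\Rightarrow}\ \ 
\mathbb{E}\varphi(X)\ge \alpha\mathbb{E}(X-\mu)+\varphi(\mu)=\varphi(\mu)=\varphi(\mathbb{E}X)
$$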
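
Three of the examples of Jensen's inequality listed above can be verified numerically. The sketch assumes a hypothetical positive random variable, uniform on $\{1,2,3,4\}$; the helper `expect` is introduced here only for illustration.

```python
import math

# Hypothetical positive random variable: X uniform on {1, 2, 3, 4}.
xs = [1.0, 2.0, 3.0, 4.0]
p = [0.25] * 4

def expect(h):
    """E[h(X)] for the discrete X above."""
    return sum(h(x) * px for x, px in zip(xs, p))

mean = expect(lambda x: x)

print(expect(lambda x: x**2) >= mean**2)     # phi(x) = x^2
print(expect(math.exp) >= math.exp(mean))    # phi(x) = e^x
print(expect(math.log) <= math.log(mean))    # phi(x) = -log x
```

All three comparisons print `True` for this distribution.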

[<a href="#Expectation-and-Variance">Back to top</a>]

# Mean and Variance of Sum of Random Variables

### Example - How to compute the variance

Let $Var(X)=2$, $Var(Y)=2$, $Var(Z)=3$, $Cov(X,Y)=0.25$.
If $Z$ is independent of both $X$ and $Y$, compute the variance of $V$, where $V$ is given by
$$
V=X+2Y-3Z-2
$$

$$Var(V)\stackrel{(1)}{=}Cov(V,V)=Cov(X+2Y-3Z-2,X+2Y-3Z-2)$$

$$\begin{array}{lllllllllllllllllll}
Cov(X+2Y-3Z-2,X+2Y-3Z-2)
&\stackrel{(2)}{=}&Cov(X,X)&+2Cov(X,Y)&-3Cov(X,Z)&-2Cov(X,1)\nonumber\\
&&+2Cov(Y,X)&+4Cov(Y,Y)&-6Cov(Y,Z)&-4Cov(Y,1)\nonumber\\
&&-3Cov(Z,X)&-6Cov(Z,Y)&+9Cov(Z,Z)&+6Cov(Z,1)\nonumber\\
&&-2Cov(1,X)&-4Cov(1,Y)&+6Cov(1,Z)&+4Cov(1,1)\nonumber\\
&\stackrel{(3)}{=}&Cov(X,X)&+2Cov(X,Y)&-3Cov(X,Z)&-2Cov(X,1)\nonumber\\
&&+2Cov(X,Y)&+4Cov(Y,Y)&-6Cov(Y,Z)&-4Cov(Y,1)\nonumber\\
&&-3Cov(X,Z)&-6Cov(Y,Z)&+9Cov(Z,Z)&+6Cov(Z,1)\nonumber\\
&&-2Cov(X,1)&-4Cov(Y,1)&+6Cov(Z,1)&+4Cov(1,1)\nonumber\\
&=&Cov(X,X)&+4Cov(X,Y)&-6Cov(X,Z)&-4Cov(X,1)\nonumber\\
&&&+4Cov(Y,Y)&-12Cov(Y,Z)&-8Cov(Y,1)\nonumber\\
&&&&+9Cov(Z,Z)&+12Cov(Z,1)\nonumber\\
&&&&&+4Cov(1,1)\nonumber\\
&\stackrel{(4)}{=}&Cov(X,X)&+4Cov(X,Y)&-6Cov(X,Z)\nonumber\\
&&&+4Cov(Y,Y)&-12Cov(Y,Z)\nonumber\\
&&&&+9Cov(Z,Z)\nonumber\\
&\stackrel{(5)}{=}&Cov(X,X)&+4Cov(X,Y)\nonumber\\
&&&+4Cov(Y,Y)\nonumber\\
&&&&+9Cov(Z,Z)\nonumber\\
&\stackrel{(1)}{=}&Var(X)&+4Cov(X,Y)\nonumber\\
&&&+4Var(Y)\nonumber\\
&&&&+9Var(Z)\nonumber\\
&=&2&+4\cdot 0.25&\nonumber\\
&&&+4\cdot 2\nonumber\\
&&&&+9\cdot 3\nonumber\\
&=& 38\nonumber
\end{array}$$

### In general

$$
\begin{array}{lll}
\displaystyle 
\mathbb{E}\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^n\mathbb{E}(X_i)\nonumber\\
\displaystyle 
Var\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^nVar(X_i)+\sum_{i\neq j}Cov(X_i,X_j)\nonumber\\
&=&\displaystyle \sum_{i=1}^nVar(X_i)+2\sum_{1\le i< j\le n}Cov(X_i,X_j)\nonumber
\end{array}
$$
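
The variance formula also holds exactly for sample variances and covariances, provided the same normalization is used throughout. The sketch below checks this with `np.cov` on three correlated samples built in the same spirit as the `np.cov` cell above; the seed and sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed

# Three correlated samples.
n = 1000
x1 = rng.uniform(0, 1, n)
x2 = x1 + rng.uniform(0, 1, n)
x3 = x1 + x2 + rng.uniform(0, 1, n)
X = np.vstack([x1, x2, x3])

C = np.cov(X)                                     # 3 x 3 sample covariance matrix (ddof=1)
lhs = np.var(X.sum(axis=0), ddof=1)               # sample variance of X1 + X2 + X3
rhs = np.trace(C) + 2 * np.sum(np.triu(C, k=1))   # sum of Var + 2 * sum_{i<j} Cov

print(lhs, rhs)   # equal up to floating-point rounding
```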

### If $X_i$ are independent

$$
\begin{array}{lll}
\displaystyle \mathbb{E}\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^n\mathbb{E}(X_i)\nonumber\\
\displaystyle Var\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^nVar(X_i)\nonumber
\end{array}
$$

### If $X_i$ are iid

\begin{array}{lllllll}
\displaystyle \mathbb{E}\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^n\mathbb{E}(X_i)&=&\displaystyle n\mathbb{E}(X_1)\nonumber\\
\displaystyle Var\left(\sum_{i=1}^nX_i\right)&=&\displaystyle \sum_{i=1}^nVar(X_i)&=&\displaystyle nVar(X_1)\nonumber
\end{array}

[<a href="#Expectation-and-Variance">Back to top</a>]

# Mean and Variance of Weighted Sum of Random Variables

### In general

\begin{array}{lll}
\displaystyle \mathbb{E}\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i\mathbb{E}(X_i)\nonumber\\
\displaystyle Var\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i^2Var(X_i)+\sum_{i\neq j}a_ia_jCov(X_i,X_j)\nonumber\\
&=&\displaystyle \sum_{i=1}^na_i^2Var(X_i)+2\sum_{1\le i< j\le n}a_ia_jCov(X_i,X_j)\nonumber
\end{array}

### If $X_i$ are independent

\begin{array}{llllll}
\displaystyle \mathbb{E}\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i\mathbb{E}(X_i)\nonumber\\
\displaystyle Var\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i^2Var(X_i)\nonumber
\end{array}

### If $X_i$ are iid

\begin{array}{lllll}
\displaystyle \mathbb{E}\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i\mathbb{E}(X_i)&=&\displaystyle \left(\sum_{i=1}^na_i\right)\mathbb{E}(X_1)\nonumber\\
\displaystyle Var\left(\sum_{i=1}^na_iX_i\right)&=&\displaystyle \sum_{i=1}^na_i^2Var(X_i)&=&\displaystyle \left(\sum_{i=1}^na_i^2\right)Var(X_1)\nonumber
\end{array}

# Mean and Variance of Weighted Sum of Random Variables - Matrix Form

$$
\displaystyle 
S=\sum_{i=1}^na_iX_i
$$

where

\begin{eqnarray}
\mu_i&&\quad \mbox{mean of $X_i$}\nonumber\\
\sigma_i^2&&\quad \mbox{variance of $X_i$}\nonumber\\
\sigma_{ij}&&\quad \mbox{covariance between $X_i$ and $X_j$}\nonumber\\ 
\rho_{ij}&&\quad \mbox{correlation between $X_i$ and $X_j$}\nonumber
\end{eqnarray}

### Mean

$$
\displaystyle 
\mathbb{E}S=\sum_{i=1}^na_i\mathbb{E}X_i=\sum_{i=1}^na_i\mu_i
$$

### Variance

\begin{eqnarray}
\displaystyle 
Var(S)&=&\displaystyle Cov\left(\sum_{i=1}^na_iX_i,\sum_{j=1}^na_jX_j\right)\nonumber\\
&=&\displaystyle \sum_{i=j}a_ia_jCov\left(X_i,X_j\right)+\sum_{i\neq j}a_ia_jCov\left(X_i,X_j\right)\nonumber\\
&=&\displaystyle \sum_{i=1}^na_i^2Var\left(X_i\right)+2\sum_{1\le i< j\le n}a_ia_jCov\left(X_i,X_j\right)\nonumber\\
&=&\displaystyle \sum_{i=1}^na_i^2\sigma_i^2+2\sum_{1\le i<j\le n}a_ia_j\sigma_{ij}\nonumber\\
&=&\displaystyle \sum_{i=1}^na_i^2\sigma_i^2+2\sum_{1\le i<j\le n}a_ia_j\rho_{ij}\sigma_i\sigma_j\nonumber
\end{eqnarray}

Or in matrix form 

$$
\displaystyle 
Var(S)
=
\left[\begin{array}{cccc}a_1&a_2&\cdots&a_n\end{array}\right]
\underbrace{\left[\begin{array}{cccc}\sigma_1^2&\sigma_{12}&\cdots&\sigma_{1n}\\
\sigma_{21}&\sigma_2^2&\cdots&\sigma_{2n}\\
\vdots&\vdots&\ddots&\vdots\\
\sigma_{n1}&\sigma_{n2}&\cdots&\sigma_n^2\end{array}\right]}_{\Sigma\quad \mbox{Covariance matrix}}
\left[\begin{array}{c}a_1\\a_2\\\vdots\\a_n\end{array}\right]
$$
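
The matrix form reproduces the earlier example $V=X+2Y-3Z-2$ with $Var(X)=2$, $Var(Y)=2$, $Var(Z)=3$, $Cov(X,Y)=0.25$ and $Z$ independent of $X$ and $Y$. A minimal sketch, where the array names `a` and `Sigma` are chosen here for illustration:

```python
import numpy as np

# Weights of X, Y, Z in V = X + 2Y - 3Z - 2.
# The additive constant -2 does not affect the variance.
a = np.array([1.0, 2.0, -3.0])

# Covariance matrix of (X, Y, Z); the zeros encode that Z is
# independent of (hence uncorrelated with) X and Y.
Sigma = np.array([[2.0, 0.25, 0.0],
                  [0.25, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])

var_V = a @ Sigma @ a
print(var_V)   # 38.0
```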

### Example - Variance of coin related random variables
\begin{array}{ccc}\hline
\mbox{Distribution}&\mbox{Expectation}&\mbox{Variance}\\\hline
B(p)&p&pq\\
B(n,p)&np&npq\\
Geo(p)&\frac{1}{p}&\frac{q}{p^2}\\
NB(r,p)&\frac{r}{p}&\frac{rq}{p^2}\\\hline
\end{array}

\begin{eqnarray}
X\sim B(p)
&\Rightarrow&\mathbb{E}(X^2)=\mathbb{E}(X)=p\nonumber\\
&\Rightarrow&Var(X)=\mathbb{E}(X^2)-(\mathbb{E}X)^2=p-p^2=p(1-p)=pq\nonumber
\end{eqnarray}


\begin{eqnarray}
X\sim B(n,p)
&\Rightarrow&X=\sum_{i=1}^nX_i,\ X_i\ \mbox{iid}\ B(p)\nonumber\\
&\Rightarrow&Var(X)=\sum_{i=1}^nVar(X_i)=npq\nonumber
\end{eqnarray}



\begin{eqnarray}
1+x+x^2+\cdots=\frac{1}{1-x}
&\stackrel{\mbox{Diff wrt $x$}}{\Rightarrow}&1+2x+3x^2+\cdots=\frac{1}{(1-x)^2}\nonumber\\
&\stackrel{\mbox{Diff wrt $x$}}{\Rightarrow}&2\cdot 1+3\cdot 2 x+4\cdot 3 x^2\cdots=\frac{2}{(1-x)^3}\nonumber\\
&\stackrel{\mbox{Let $x=q$}}{\Rightarrow}&2\cdot 1+3\cdot 2 q+4\cdot 3 q^2\cdots=\frac{2}{p^3}\nonumber
\end{eqnarray}
\begin{eqnarray}
X\sim Geo(p)
&\Rightarrow&\mathbb{E}[X(X-1)]=\left(\sum_{k=1}^{\infty} k(k-1)q^{k-2}\right)qp=\frac{2q}{p^2}\nonumber\\
&\Rightarrow&\mathbb{E}X^2=\mathbb{E}[X(X-1)+X]=\mathbb{E}[X(X-1)]+\mathbb{E}X=\frac{2q}{p^2}+\frac{1}{p}\nonumber\\
&\Rightarrow&Var(X)=\mathbb{E}X^2-(\mathbb{E}X)^2=\frac{2q}{p^2}+\frac{1}{p}-\frac{1}{p^2}=\frac{q}{p^2}\nonumber
\end{eqnarray}


\begin{eqnarray}
X\sim NB(r,p)
&\Rightarrow&X=\sum_{i=1}^rX_i,\ X_i\ \mbox{iid}\ Geo(p)\nonumber\\
&\Rightarrow&Var(X)=\sum_{i=1}^rVar(X_i)=\frac{rq}{p^2}\nonumber
\end{eqnarray}
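
The table can be cross-checked against `scipy.stats`. The parameter values below are arbitrary. One assumption to note: `scipy.stats.nbinom` counts the number of failures before the $r$-th success, so `loc=r` is used to shift it to the number of trials, matching $NB(r,p)$ as a sum of $r$ iid $Geo(p)$ variables.

```python
import scipy.stats as st

p, n, r = 0.3, 10, 4    # arbitrary illustrative parameters
q = 1 - p

# Each entry: (frozen scipy distribution, mean from the table, variance from the table).
checks = {
    "B(p)":    (st.bernoulli(p),        p,     p * q),
    "B(n,p)":  (st.binom(n, p),         n * p, n * p * q),
    "Geo(p)":  (st.geom(p),             1 / p, q / p**2),
    "NB(r,p)": (st.nbinom(r, p, loc=r), r / p, r * q / p**2),   # loc=r: trials, not failures
}

for name, (dist, mean, var) in checks.items():
    print(name, dist.mean(), mean, dist.var(), var)
```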

[<a href="#Expectation-and-Variance">Back to top</a>]

# Standard Deviation

### How to measure the typical size of error or deviation from mean

##### First try
$$
\underbrace{\mathbb{E}}_{\mbox{Average}}
\underbrace{(X-\mathbb{E}X)}_{\mbox{Error}}
$$
However, this try is futile:
$
\mathbb{E}(X-\mathbb{E}X)=\mathbb{E}X-\mathbb{E}X=0
$.

##### Second try
$$
\underbrace{\mathbb{E}}_{\mbox{Average}} 
\Big|
\underbrace{X-\mathbb{E}X}_{\mbox{Error}}
\Big|
$$
Because the absolute value is awkward to handle in calculations,
this measure of typical error size is not popular.

##### Standard way to measure typical error size
$$
\underbrace{SD(X)}_{\mbox{Standard deviation}}
\quad=\quad
\sqrt{
\underbrace{\mathbb{E}}_{\mbox{Average}} 
\Big(
\underbrace{X-\mathbb{E}X}_{\mbox{Error}}
\Big)^2
}
\quad=\quad
\sqrt{
Var(X)
}
$$

In [8]:
import numpy as np

x = np.array([1,2,3,4,5,6])
pmf = np.array([1/6,1/6,1/6,1/6,1/6,1/6])

mean = np.sum(x*pmf)
second_moment = np.sum((x**2)*pmf) 
variance = second_moment - mean**2
std = np.sqrt(variance)

print("Mean:               ", mean)
print("Variance:           ", variance)
print("Standard deviation: ", std)

Mean:                3.5
Variance:            2.916666666666666
Standard deviation:  1.707825127659933


[<a href="#Expectation-and-Variance">Back to top</a>]

# Standardization and Reverse Standardization

### Mean and variance lemma for standardization and reverse standardization
\begin{eqnarray}
\mathbb{E}(aX+b)\quad&=&a\mathbb{E}(X)+b\nonumber\\
Var(aX+b)&=&Var(aX)=a^2Var(X)\nonumber
\end{eqnarray}

### Standardization
If $X$ has mean $\mu$ and standard deviation $\sigma$,
then
$$
\frac{X-\mu}{\sigma}\quad\mbox{has mean 0 and standard deviation 1}
$$

### Reverse standardization

If $X$ has mean 0 and standard deviation 1,
then
$$
\mu+\sigma*X\quad\mbox{has mean $\mu$ and standard deviation $\sigma$}
$$

### Example - Standardization

We flip a fair coin many times.
\begin{eqnarray}
X_i\quad\quad\quad\quad\ \  &&\mbox{$i^{th}$ flip record,
where $H$ and $T$ are recorded as 1 and 0}\nonumber\\
Y_i:=2X_i-1&&\mbox{$i^{th}$ flip record,
where $H$ and $T$ are recorded as 1 and $-1$}\nonumber
\end{eqnarray}

Calculate the mean and variance of the following related random variables,
i.e., fill in the blanks of the table below.

\begin{array}{ccc}\hline
\mbox{Random variable}&\mbox{Mean}&\mbox{Variance}\\\hline
Y_i&&\\
\sum_{i=1}^nY_i&&\\
\frac{\sum_{i=1}^nY_i}{\sqrt{n}}&&\\\hline
\end{array}

\begin{eqnarray}
X_i\ \mbox{iid}\ B(0.5)
&\Rightarrow&\mathbb{E}X_i=0.5,\ Var(X_i)=0.5*(1-0.5)=0.25\nonumber\\
&\Rightarrow&Y_i\ \mbox{iid with}\ \mathbb{E}Y_i=0,\ Var(Y_i)=1\nonumber\\
&\Rightarrow&\mathbb{E}\left(\sum_{i=1}^nY_i\right)=0,\ Var\left(\sum_{i=1}^nY_i\right)=n\nonumber\\
&\stackrel{\mbox{Standardization}}{\Rightarrow}&\frac{\sum_{i=1}^nY_i}{\sqrt{n}}\quad\mbox{has mean 0, variance 1}\nonumber
\end{eqnarray}

\begin{array}{ccc}\hline
\mbox{Random variable}&\mbox{Mean}&\mbox{Variance}\\\hline
Y_i&0&1\\
\sum_{i=1}^nY_i&0&n\\
\frac{\sum_{i=1}^nY_i}{\sqrt{n}}&0&1\\\hline
\end{array}
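
The last two rows of the table can be checked exactly from the binomial pmf, since $\sum_{i=1}^nY_i=2K-n$ where $K\sim B(n,0.5)$ counts the heads. In the sketch below, $n=20$ is an arbitrary choice and `mean_var` is a helper introduced only for illustration.

```python
from math import comb, sqrt

n = 20    # number of fair-coin flips (arbitrary choice)

# sum_i Y_i = 2K - n, where K ~ B(n, 0.5) counts the heads.
values = [2 * k - n for k in range(n + 1)]
probs = [comb(n, k) / 2**n for k in range(n + 1)]

def mean_var(vals):
    """Mean and variance of a discrete variable taking vals with probabilities probs."""
    m = sum(val * pr for val, pr in zip(vals, probs))
    var = sum((val - m) ** 2 * pr for val, pr in zip(vals, probs))
    return m, var

print(mean_var(values))                           # mean 0, variance n (up to rounding)
print(mean_var([v / sqrt(n) for v in values]))    # mean 0, variance 1 (up to rounding)
```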

[<a href="#Expectation-and-Variance">Back to top</a>]

# scipy.stats

<div align="center"><img src="img/Order statistics.png" width="80%"></div>

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html#histograms

<div align="center"><img src="img/Averages and variances.png" width="80%"></div>

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html#histograms

<div align="center"><img src="img/Correlating.png" width="80%"></div>

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html#histograms

<div align="center"><img src="img/Histograms.png" width="80%"></div>

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html#histograms

[<a href="#Expectation-and-Variance">Back to top</a>]

# Skewness and Kurtosis

$$\begin{array}{lll}
\mbox{Skewness}(X)&=&\displaystyle E\left(\frac{X-\mu}{\sigma}\right)^3\nonumber\\
\mbox{Kurtosis}(X)&=&\displaystyle E\left(\frac{X-\mu}{\sigma}\right)^4\nonumber\\
\mbox{Excess_Kurtosis}(X)&=&\displaystyle \mbox{Kurtosis}(X)-3\nonumber\\
\end{array}$$
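
These definitions can be applied directly to a pmf. The sketch below uses the fair six-sided die from the standard deviation cell above and computes the standardized third and fourth moments; the variable names are chosen here for illustration.

```python
import numpy as np

# Fair six-sided die.
x = np.array([1, 2, 3, 4, 5, 6])
pmf = np.full(6, 1 / 6)

mu = np.sum(x * pmf)
sigma = np.sqrt(np.sum((x - mu) ** 2 * pmf))
z = (x - mu) / sigma                 # standardized values

skewness = np.sum(z**3 * pmf)        # E[((X - mu) / sigma)^3]
kurtosis = np.sum(z**4 * pmf)        # E[((X - mu) / sigma)^4]
excess_kurtosis = kurtosis - 3

print(skewness, kurtosis, excess_kurtosis)
```

The skewness is zero up to rounding because the pmf is symmetric about its mean, and the excess kurtosis is $-222/175\approx -1.27$, close to the value observed for the uniform sample further down.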

In [3]:
import numpy as np
import scipy.stats as st

x = np.random.normal(0, 1, (1000,)) 

# kurtosis is normalized so that it is zero for the normal distribution
# Excess_Kurtosis(X) = Kurtosis(X) − 3
print("Minimum:            ", np.min(x))
print("Maximum:            ", np.max(x))
print("Mean:               ", np.mean(x))
print("Median:             ", np.median(x))
print("Variance:           ", np.var(x)) # population variance
print("Standard deviation: ", np.std(x))
print("Skewness:           ", st.skew(x))
print("Kurtosis:           ", st.kurtosis(x)) # Excess_Kurtosis(X)

Minimum:             -3.0354733793984288
Maximum:             3.0725357028737514
Mean:                -0.0354427882624812
Median:              -0.0813885874966274
Variance:            0.9766286497285154
Standard deviation:  0.9882452376452494
Skewness:            0.08288308067303092
Kurtosis:            -0.10431286470175172


In [4]:
import numpy as np
import scipy.stats as st

x = np.random.uniform(0, 1, (1000,))

# kurtosis is normalized so that it is zero for the normal distribution
# Excess_Kurtosis(X) = Kurtosis(X) − 3 
n, min_max, mean, var, skew, kurt = st.describe(x)
print("Number of samples:  ", n)
print("Minimum of samples: ", min_max[0])
print("Maximum of samples: ", min_max[1])
print("Mean:               ", mean)
print("Variance:           ", var) # sample variance (st.describe uses ddof=1 by default)
print("Skewness:           ", skew)
print("Kurtosis:           ", kurt) # Excess_Kurtosis(X)

Number of samples:   1000
Minimum of samples:  0.0004502645090431745
Maximum of samples:  0.9995639077404074
Mean:                0.4927559166188729
Variance:            0.08582685802197292
Skewness:            0.02134107861041948
Kurtosis:            -1.2357290459110941


[<a href="#Expectation-and-Variance">Back to top</a>]