## Expected Value and Mean
In probability, the average value of a random variable X is called the expected value or the expectation. The expected value is written as E with square brackets around the name of the variable; for example:

E[X]
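
For a discrete random variable, the expectation is the probability-weighted sum of its values. The sketch below is an illustration rather than part of the original examples: it assumes a fair six-sided die, and the names `values` and `probs` are made up for the demonstration. When every outcome is equally likely, the expectation should agree with the arithmetic mean up to floating-point rounding.

```python
# expected value of a fair six-sided die (illustrative assumption)
from numpy import array
from numpy import mean

values = array([1, 2, 3, 4, 5, 6])
probs = array([1 / 6] * 6)  # assumed: each face is equally likely

# E[X] = sum over x of x * P(x)
expected = (values * probs).sum()
print(expected)
# the plain mean gives the same value when outcomes are equally likely
print(mean(values))
```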

In [2]:
# vector mean
from numpy import array
from numpy import mean
v = array([1, 2, 3, 4, 5, 6])

# calculate mean
result = mean(v)
print(result) # 3.5

3.5


In [3]:
# matrix means
from numpy import array
from numpy import mean

M = array([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 6]
])

# column means
col_mean = mean(M, axis = 0)
print(col_mean)
# row means
row_mean = mean(M, axis = 1)
print(row_mean)

[1. 2. 3. 4. 5. 6.]
[3.5 3.5]


## Variance and Standard Deviation
In probability, the variance of a random variable X measures how much the values in the distribution vary, on average, from the mean. The variance of X is written Var[X] and is defined as the expected squared deviation from the mean:

Var[X] = E[(X - E[X])^2] 
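
As a rough check of the definition, the sketch below computes the variance by hand and compares it with NumPy's `var()`. It assumes the same six-element vector used in the examples; the names `sq_dev`, `pop_var`, and `sample_var` are illustrative. Dividing by n gives the population variance (NumPy's default, `ddof=0`), while dividing by n - 1 gives the sample variance (`ddof=1`).

```python
# variance from its definition, compared with numpy's var()
from numpy import array
from numpy import mean
from numpy import var

v = array([1, 2, 3, 4, 5, 6])

# squared deviations from the mean
sq_dev = (v - mean(v)) ** 2

# population variance: divide by n (matches ddof=0, numpy's default)
pop_var = sq_dev.sum() / len(v)
# sample variance: divide by n - 1 (matches ddof=1)
sample_var = sq_dev.sum() / (len(v) - 1)

print(pop_var, var(v))
print(sample_var, var(v, ddof=1))
```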

In [4]:
# vector variance
from numpy import array
from numpy import var

v = array([1, 2, 3, 4, 5, 6])
# calculate sample variance (ddof=1 divides by n - 1)
result = var(v, ddof=1)
print(result)

3.5


In [5]:
# matrix variances
from numpy import array
from numpy import var

M = array([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 6]
])

# column variance
col_var = var(M, ddof=1, axis=0)
print(col_var)
# row variance
row_var = var(M, ddof=1, axis=1)
print(row_var)

[0. 0. 0. 0. 0. 0.]
[3.5 3.5]


The standard deviation is calculated as the square root of the variance and is denoted as lowercase s.
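
The relationship between the two quantities can be checked directly. This is a minimal sketch that assumes the same vector as the variance example and uses `ddof=1` in both calls so that both refer to the sample statistic; `s_from_var` and `s_direct` are illustrative names.

```python
# standard deviation as the square root of the variance
from numpy import array
from numpy import sqrt
from numpy import std
from numpy import var

v = array([1, 2, 3, 4, 5, 6])

# square root of the sample variance
s_from_var = sqrt(var(v, ddof=1))
# sample standard deviation computed directly
s_direct = std(v, ddof=1)

print(s_from_var)
print(s_direct)
```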

In [6]:
# matrix standard deviation
from numpy import array 
from numpy import std

M = array([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 6]
])

# column standard deviations
col_std = std(M, ddof=1, axis=0)
print(col_std)
# row standard deviations
row_std = std(M, ddof=1, axis=1)
print(row_std)

[0. 0. 0. 0. 0. 0.]
[1.87082869 1.87082869]


## Covariance and Correlation
In probability, covariance measures how two random variables change together: it is positive when they tend to move in the same direction and negative when they tend to move in opposite directions. It is denoted as the function *cov(X, Y)*, where X and Y are the two random variables being considered.
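
To make the idea concrete, the sample covariance can be computed by hand as the sum of the products of the paired deviations from each mean, divided by n - 1. The sketch below assumes two perfectly opposed vectors and compares the hand computation with NumPy's `cov()`; `manual_cov` is an illustrative name.

```python
# covariance from its definition, compared with numpy's cov()
from numpy import array
from numpy import cov
from numpy import mean

x = array([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = array([9, 8, 7, 6, 5, 4, 3, 2, 1])

# sum of products of paired deviations, divided by n - 1
manual_cov = ((x - mean(x)) * (y - mean(y))).sum() / (len(x) - 1)

print(manual_cov)
# cov() returns a 2x2 matrix; the off-diagonal entry is cov(x, y)
print(cov(x, y)[0, 1])
```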

In [1]:
# vector covariance
from numpy import array
from numpy import cov

x = array([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = array([9, 8, 7, 6, 5, 4, 3, 2, 1])

# calculate covariance (cov returns a 2x2 matrix; entry [0,1] is cov(x, y))
Sigma = cov(x, y)[0,1]
print(Sigma)

-7.5


The covariance can be normalized to a score between -1 and 1 by dividing it by the product of the standard deviations of X and Y, which makes its magnitude interpretable. The result is called the correlation of the variables, also called the Pearson correlation coefficient, named for the developer of the method.

r = cov(X, Y) / (sX * sY)
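
The normalization can be sketched by hand: divide the covariance by the product of the two sample standard deviations. The example below assumes the same two vectors as the covariance example and uses `ddof=1` for the standard deviations so they match the n - 1 divisor that `cov()` uses by default. The result should agree with `corrcoef()` up to floating-point rounding.

```python
# correlation computed from covariance and standard deviations
from numpy import array
from numpy import corrcoef
from numpy import cov
from numpy import std

x = array([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = array([9, 8, 7, 6, 5, 4, 3, 2, 1])

# r = cov(X, Y) / (sX * sY), using sample statistics throughout
r = cov(x, y)[0, 1] / (std(x, ddof=1) * std(y, ddof=1))

print(r)
print(corrcoef(x, y)[0, 1])
```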

In [2]:
# vector correlation
from numpy import array
from numpy import corrcoef

x = array([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = array([9, 8, 7, 6, 5, 4, 3, 2, 1])

# calculate correlation
corr = corrcoef(x,y)[0,1]
print(corr)

-1.0


## Covariance Matrix
The covariance matrix is a square, symmetric matrix that describes the covariance between two or more random variables. The diagonal of the covariance matrix holds the variance of each random variable; as such, it is often called the variance-covariance matrix.
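
Both properties can be checked numerically. The sketch below assumes a small data matrix with observations in rows and variables in columns; it confirms that the matrix equals its transpose and that its diagonal matches the per-column sample variances. Passing `rowvar=False` to `cov()` is an alternative to transposing the input.

```python
# checking symmetry and the diagonal of a covariance matrix
from numpy import allclose
from numpy import array
from numpy import cov
from numpy import diag
from numpy import var

# rows are observations, columns are variables
X = array([
    [1, 5, 8],
    [3, 5, 11],
    [2, 4, 9],
    [3, 6, 10],
    [1, 5, 10]
])

# rowvar=False tells cov() that each column is a variable
Sigma = cov(X, rowvar=False)

# the matrix equals its transpose
is_symmetric = allclose(Sigma, Sigma.T)
# the diagonal holds the sample variance of each column
diag_is_var = allclose(diag(Sigma), var(X, ddof=1, axis=0))

print(is_symmetric)
print(diag_is_var)
```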

In [3]:
# covariance matrix
from numpy import array
from numpy import cov

X = array([
    [1, 5, 8],
    [3, 5, 11],
    [2, 4, 9],
    [3, 6, 10],
    [1, 5, 10]
])

# calculate covariance matrix (cov expects variables in rows, so transpose)
Sigma = cov(X.T)
print(Sigma)

[[1.   0.25 0.75]
 [0.25 0.5  0.25]
 [0.75 0.25 1.3 ]]


## Summary Statistics with scipy.stats

In [4]:
# statistic info
from numpy import array
from scipy import stats
v = array([1, 2, 3, 4, 5, 6])

# get statistic info
print(stats.describe(v))

DescribeResult(nobs=6, minmax=(1, 6), mean=3.5, variance=3.5, skewness=0.0, kurtosis=-1.2685714285714282)
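
The individual statistics can also be read from the result by name. This is a small sketch assuming the same vector; `summary` is an illustrative name. Note that `describe()` reports the sample variance (`ddof=1`) by default, which is why it matches the earlier `var(v, ddof=1)` result.

```python
# reading individual fields from the describe() result
from numpy import array
from scipy import stats

v = array([1, 2, 3, 4, 5, 6])
summary = stats.describe(v)

print(summary.nobs)      # number of observations
print(summary.minmax)    # (minimum, maximum)
print(summary.mean)      # arithmetic mean
print(summary.variance)  # sample variance (ddof=1 by default)
```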
