E[X], the expectation, is the mean of a random variable X: the average of its possible values, weighted by their probabilities.
$\newline$
It provides a measure of the central tendency of the variable's distribution.
$\newline$
$$E[X] = \sum \limits _{i} x_i P(X = x_i) = \sum \limits _{i} x_i p(x_i)$$

In [1]:
x = [1, 2, 3, 4, 5, 6]  # possible outcomes
# Signed weights rather than true probabilities (a probability cannot be negative):
# each outcome has probability 1/6, and the sign comes from the rules of the game
# (see slides 4 & 7 of mm3_updated.pdf)
p = [1/6, -1/6, 1/6, -1/6, 1/6, -1/6]

# Calculate E(X)
expectation = sum(xi * pi for xi, pi in zip(x, p))

# Print the result
print("E(X) =", expectation)

E(X) = -0.5
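
As a cross-check on the formula, here is a minimal sketch that keeps the probabilities and the payoffs separate. The fair die and the helper name `expectation` are illustrative assumptions, and the payoff list (win the face value on odd faces, lose it on even faces) is only a guess at the game's rules based on the signs used above.

```python
from fractions import Fraction

def expectation(values, probs):
    """Return sum(v * p) for a discrete distribution given as parallel lists."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in zip(values, probs))

faces = [1, 2, 3, 4, 5, 6]
fair = [Fraction(1, 6)] * 6           # exact fractions avoid rounding error

die_mean = expectation(faces, fair)
print("E[X] =", die_mean)             # prints: E[X] = 7/2

# Assumed payoff rule (hypothetical): +face on odd faces, -face on even faces
payoffs = [1, -2, 3, -4, 5, -6]
game_mean = expectation(payoffs, fair)
print("E[payoff] =", game_mean)       # prints: E[payoff] = -1/2
```

Written this way, the weights are genuine probabilities that sum to 1, and the sign lives in the payoff, which gives the same -0.5 as the cell above.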


If X is a continuous random variable with probability density function f(x):

$$E[X] = \int_{-\infty}^{\infty} x f(x) \, dx$$

In [24]:
from scipy.integrate import quad  # pip install scipy

def f(x):  # probability density function; replace with the one given in the problem
    if 0 <= x <= 1:
        return x
    elif -1 <= x < 0:
        return -x
    else:
        return 0

def integrand(x):
    return x * f(x)

# Define the integration limits (replace with the limits given in the problem)
lower_limit = -1.0
upper_limit = 1.0

# Perform the numerical integration
expectation, error = quad(integrand, lower_limit, upper_limit)

# Print the result
print("E[X] =", expectation)


E[X] = 0.0
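
The numerical result can be confirmed by hand. The example density is f(x) = |x| on [-1, 1], which integrates to 1, so it is a valid density. Splitting the integral at 0, where the definition of f changes, gives:

$$E[X] = \int_{-1}^{0} x(-x) \, dx + \int_{0}^{1} x \cdot x \, dx = -\frac{1}{3} + \frac{1}{3} = 0$$

The two halves cancel because the density is symmetric around 0.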


To find the expected value of a function g(X), treat it as a new random variable Y = g(X):

$$E[Y] = E[g(X)] = \sum \limits _{i} g(x_i) p(x_i)$$

In [34]:
x = [-2, -1, 1, 2] #possible events
p = [1/len(x), 1/len(x), 1/len(x), 1/len(x)] #corresponding weighted probabilities

def g(x): #function of the possible events
    return x**2

# Calculate E[g(X)]
expectation = sum(g(xi) * pi for xi, pi in zip(x, p))

# Print the result
print("E[g(X)] =", expectation)

E[g(X)] = 2.5
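
A common mistake is to compute g(E[X]) instead of E[g(X)]. The short sketch below reuses the same values and g(x) = x² to show that the two generally differ; the variable names `mean_x` and `mean_gx` are illustrative.

```python
x = [-2, -1, 1, 2]                   # possible values
p = [0.25, 0.25, 0.25, 0.25]         # uniform probabilities

def g(value):
    return value ** 2

mean_x = sum(xi * pi for xi, pi in zip(x, p))        # E[X]
mean_gx = sum(g(xi) * pi for xi, pi in zip(x, p))    # E[g(X)]

print("g(E[X]) =", g(mean_x))    # prints: g(E[X]) = 0.0
print("E[g(X)] =", mean_gx)      # prints: E[g(X)] = 2.5
```

Here E[X] = 0, so g(E[X]) = 0, while E[g(X)] = 2.5: g has to be applied to each value before averaging.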


Rules of calculation
$\newline$
Always holds:
$$E[X+Y] = E[X] + E[Y]$$
$\newline$
Holds if X and Y are independent:
$$E[XY] = E[X] E[Y]$$
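
Both rules can be checked by enumerating a small joint distribution. The sketch below assumes two fair six-sided dice as an illustrative example (not taken from the slides) and uses exact fractions. It then repeats the check for the fully dependent case Y = X, where the sum rule still holds but the product rule fails.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_single = Fraction(1, 6)
p_joint = p_single * p_single        # independence: P(X=x, Y=y) = P(X=x) P(Y=y)

e_x = sum(x * p_single for x in faces)       # E[X] = 7/2
e_y = e_x                                    # Y has the same distribution

# Independent dice: sum over all 36 equally likely pairs
e_sum = sum((x + y) * p_joint for x, y in product(faces, repeat=2))
e_prod = sum(x * y * p_joint for x, y in product(faces, repeat=2))
print(e_sum == e_x + e_y)            # True
print(e_prod == e_x * e_y)           # True

# Dependent case Y = X: only the 6 pairs (x, x) can occur
e_sum_dep = sum((x + x) * p_single for x in faces)
e_prod_dep = sum(x * x * p_single for x in faces)
print(e_sum_dep == e_x + e_x)        # True  (linearity needs no independence)
print(e_prod_dep == e_x * e_x)       # False (91/6 is not 49/4)
```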





Variance
$\newline$
Variance is a statistical measure that quantifies the spread or variability of a set of data points around their mean. 
$\newline$
It provides a measure of how much the individual data points deviate from the average.
$$Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$$

In [35]:
def calculate_variance(data):
    # Calculate the number of data points
    n = len(data)
    
    # Calculate the mean of the data points
    mean = sum(data) / n
    
    # Population variance: the mean of the squared deviations from the mean
    # (divides by n, not by n - 1 as the sample variance does)
    variance = sum((x - mean) ** 2 for x in data) / n
    
    # Return the calculated variance
    return variance

# Example usage
values = [1, 2, 3, 4, 5]
result = calculate_variance(values)
print("Var(X) =", result)


Var(X) = 2.0
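
The second form of the formula, E[X²] - (E[X])², gives the same number and is often quicker by hand. A minimal sketch on the same five values, treating each value as equally likely (variable names are illustrative):

```python
import math

values = [1, 2, 3, 4, 5]
n = len(values)

mean = sum(values) / n                               # E[X]   = 3.0
mean_sq = sum(v ** 2 for v in values) / n            # E[X^2] = 11.0

var_def = sum((v - mean) ** 2 for v in values) / n   # E[(X - E[X])^2]
var_short = mean_sq - mean ** 2                      # E[X^2] - (E[X])^2

print("definition:", var_def)     # prints: definition: 2.0
print("shortcut:  ", var_short)   # prints: shortcut:   2.0
```

With floating-point data the shortcut can lose precision when the mean is large compared with the spread, so the definition is the safer choice in code.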


Covariance
$\newline$
Covariance measures how two variables vary together: it is positive when they tend to move in the same direction and negative when they tend to move in opposite directions.
$$Cov(X, Y) = E[(X - E[X])(Y - E[Y])]$$

In [None]:
def calculate_covariance(data_x, data_y):
    # Calculate the number of data points
    n = len(data_x)
    
    # Calculate the means of X and Y
    mean_x = sum(data_x) / n
    mean_y = sum(data_y) / n
    
    # Calculate the covariance using the formula (X - E(X))(Y - E(Y))
    covariance = sum((data_x[i] - mean_x) * (data_y[i] - mean_y) for i in range(n)) / n
    
    # Return the calculated covariance
    return covariance

# Example usage
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]
result = calculate_covariance(x_values, y_values)
print("Cov(X, Y) =", result)
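
The cell above has no recorded output, so here is a self-contained sketch of the same calculation. It also checks two standard facts that follow from expanding the definition: Cov(X, Y) = E[XY] - E[X]E[Y], and Cov(X, X) = Var(X). The function name `covariance` is illustrative.

```python
import math

def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]          # y = 2x, so the two move together

cov_xy = covariance(x_values, y_values)
print("Cov(X, Y) =", cov_xy)         # prints: Cov(X, Y) = 4.0

# Shortcut: E[XY] - E[X]E[Y]
n = len(x_values)
mean_xy = sum(x * y for x, y in zip(x_values, y_values)) / n
cov_short = mean_xy - (sum(x_values) / n) * (sum(y_values) / n)
print("shortcut  =", cov_short)      # prints: shortcut  = 4.0

# The covariance of a variable with itself is its variance
print("Cov(X, X) =", covariance(x_values, x_values))   # prints: Cov(X, X) = 2.0
```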


Some useful identities concerning variances:

Calculate the variance of the linear transformation
$$Var(aX + b) = a^2Var(X)$$

In [39]:
def variance_linear_transform(a, b, variance_X):
    # The shift b does not affect the variance; only the scale factor a does
    transformed_variance = (a ** 2) * variance_X
    
    # Return the transformed variance
    return transformed_variance

# Example usage
a = 2
b = 3
variance_X = 5

# Calculate the variance of the linear transformation
result = variance_linear_transform(a, b, variance_X)
print("Var(aX + b) =", result)

Var(aX + b) = 20


Calculate the variance of the constant
$$Var(b) = 0$$

In [37]:
def variance_constant(b):
    # Variance of a constant is always zero
    constant_variance = 0
    
    # Return the variance of the constant
    return constant_variance

# Example usage
b = 5

# Calculate the variance of the constant
result = variance_constant(b)
print("Var(b) =", result)

Var(b) = 0


Calculate the variance of the shifted random variable
$$Var(X + b) = Var(X)$$

In [38]:
def variance_shift(X_variance):
    # Variance remains the same when shifting a random variable by a constant
    shifted_variance = X_variance
    
    # Return the shifted variance
    return shifted_variance

# Example usage
X_variance = 10

# Calculate the variance of the shifted random variable
result = variance_shift(X_variance)
print("Var(X + b) =", result)

Var(X + b) = 10
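
The functions above apply the identities directly. To see that the identities hold, one can transform a data set and recompute the variance from scratch. The sketch below does this with `statistics.pvariance` (population variance) from the standard library; the data and the constants `a` and `b` are arbitrary illustrative choices.

```python
from statistics import pvariance

data = [1, 2, 3, 4, 5]
a, b = 2, 3

var_x = pvariance(data)                              # Var(X)
var_linear = pvariance([a * x + b for x in data])    # Var(aX + b)
var_shift = pvariance([x + b for x in data])         # Var(X + b)
var_const = pvariance([b] * len(data))               # Var(b)

print(var_linear == a ** 2 * var_x)   # True: Var(aX + b) = a^2 Var(X)
print(var_shift == var_x)             # True: Var(X + b) = Var(X)
print(var_const == 0)                 # True: Var(b) = 0
```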


Instead of covariance, we can work with a dimensionless quantity:
$\newline$
• Correlation coefficient of X and Y
$\newline$
The correlation coefficient measures the strength and direction of the linear relationship between two variables; it always lies between -1 and 1.
$$Corr(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X)}\sqrt{Var(Y)}}$$

In [None]:
import numpy as np

def calculate_correlation_coefficient(x_values, y_values):
    # Calculate the correlation coefficient using numpy's corrcoef function
    correlation_matrix = np.corrcoef(x_values, y_values)
    
    # The correlation coefficient is the element at index (0, 1) or (1, 0) in the correlation matrix
    correlation_coefficient = correlation_matrix[0, 1]
    
    # Return the correlation coefficient
    return correlation_coefficient

# Example usage
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]
result = calculate_correlation_coefficient(x_values, y_values)
print("Correlation coefficient =", result)
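
For comparison with `np.corrcoef`, the coefficient can also be computed straight from the definition using only the standard library. This is a minimal sketch with illustrative helper names (`population_cov`, `correlation`). It also shows why the quantity is called dimensionless: rescaling X, for example changing its units, leaves the result unchanged.

```python
import math

def population_cov(xs, ys):
    """Population covariance; population_cov(xs, xs) is the variance of xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Corr(X, Y) = Cov(X, Y) / sqrt(Var(X) * Var(Y))."""
    var_x = population_cov(xs, xs)
    var_y = population_cov(ys, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variance is zero")
    return population_cov(xs, ys) / math.sqrt(var_x * var_y)

x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]

r = correlation(x_values, y_values)
r_scaled = correlation([1000 * x for x in x_values], y_values)   # X in other units
r_neg = correlation(x_values, [-y for y in y_values])            # reversed direction

print("Corr(X, Y)     =", r)
print("Corr(1000X, Y) =", r_scaled)
print("Corr(X, -Y)    =", r_neg)
```

Because y is an exact linear function of x with positive slope, the first two results are 1 up to floating-point rounding, and negating y flips the sign to -1.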