In [1]:
import numpy as np
import pandas as pd
import seaborn as sns


# Question 4


## Theoretical Derivation of $\mathbb{E}(AX)$ and $\mathrm{Cov}(AX)$

Let $X$ be a random vector in $\mathbb{R}^n$ with mean $\mu_X = \mathbb{E}[X]$ and covariance matrix $\Sigma_X = \mathrm{Cov}(X)$, and let $A$ be a constant (non-random) $k \times n$ matrix, so that $AX$ is a random vector in $\mathbb{R}^k$.

**1. Expectation of a Linear Transformation:**

$$
\mathbb{E}[AX] = A\,\mathbb{E}[X] = A\mu_X
$$

This follows from the linearity of expectation.

**2. Covariance of a Linear Transformation:**

$$
\mathrm{Cov}(AX) = \mathbb{E}\big[(AX - \mathbb{E}[AX])(AX - \mathbb{E}[AX])^T\big]
$$

By the first result, $AX - \mathbb{E}[AX] = A(X - \mu_X)$, and $\big(A(X - \mu_X)\big)^T = (X - \mu_X)^T A^T$, so:

$$
= \mathbb{E}\big[A(X - \mu_X)(X - \mu_X)^T A^T\big]
$$

Since $A$ is constant, linearity of expectation lets us move $A$ and $A^T$ outside the expectation:

$$
= A\,\mathbb{E}\big[(X - \mu_X)(X - \mu_X)^T\big]A^T
$$

But $\mathbb{E}\big[(X - \mu_X)(X - \mu_X)^T\big] = \Sigma_X$, so:

$$
\mathrm{Cov}(AX) = A\Sigma_X A^T
$$

Thus, the mean and covariance of the transformed vector $AX$ are given by:

- $\mathbb{E}[AX] = A\mu_X$
- $\mathrm{Cov}(AX) = A\Sigma_X A^T$
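
As a quick sanity check on the derivation, the sketch below verifies that both identities also hold for sample moments, up to floating-point rounding. The matrix `A_demo`, the data `X_demo`, the sizes, and the seed are illustrative assumptions and are not part of the exercise; the draws are deliberately non-Gaussian, since the derivation makes no distributional assumption.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary fixed seed for reproducibility

n, k, n_samples = 4, 3, 500                     # illustrative sizes only
A_demo = rng.normal(size=(k, n))                # an arbitrary constant k x n matrix
X_demo = rng.exponential(size=(n, n_samples))   # each column is one draw of X

Y_demo = A_demo @ X_demo                        # each column is the transformed draw A x

# Sample mean and sample covariance of X (np.cov treats rows as variables)
mean_X = X_demo.mean(axis=1)
cov_X = np.cov(X_demo)

# The same identities hold for the sample moments of Y = AX
mean_ok = np.allclose(Y_demo.mean(axis=1), A_demo @ mean_X)
cov_ok = np.allclose(np.cov(Y_demo), A_demo @ cov_X @ A_demo.T)
print(mean_ok, cov_ok)
```

The check passes for any choice of `A_demo` because the sample mean and sample covariance are built from the same linear operations used in the derivation.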

In [2]:
# Compute E[AX] and Cov(AX) using the formulas from the derivation

# Transformation matrix A (3x4)
A = np.array([[1, -1,  0, 0],
              [1,  1, -2, 0],
              [1,  1,  1, 3]])

# Given mean vector mu_X (length 4)
mu_X = np.array([3, 2, -2, 0])

# Given covariance matrix Sigma_X = 3 * I (4x4)
Sigma_X = np.array([[3, 0, 0, 0],
                    [0, 3, 0, 0],
                    [0, 0, 3, 0],
                    [0, 0, 0, 3]])

# Compute E[AX]
E_AX = A @ mu_X
print("E[AX]:", E_AX)

# Compute Cov(AX)
Cov_AX = A @ Sigma_X @ A.T
print("Cov(AX):\n", Cov_AX)

E[AX]: [1 9 3]
Cov(AX):
 [[ 6  0  0]
 [ 0 18  0]
 [ 0  0 36]]


The off-diagonal entries of $\mathrm{Cov}(AX)$ are all 0, so the three components of $AX$ are pairwise uncorrelated.
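
One way to see where the zeros come from: because $\Sigma_X = 3I$, the formula reduces to $\mathrm{Cov}(AX) = 3AA^T$, so the off-diagonal entries vanish exactly when the rows of $A$ are mutually orthogonal. The sketch below checks this and also rescales the covariance into a correlation matrix; the names `gram`, `std`, and `Corr_AX` are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, -1,  0, 0],
              [1,  1, -2, 0],
              [1,  1,  1, 3]])
Sigma_X = 3 * np.eye(4, dtype=int)   # same Sigma_X as above, written as 3 * I

# Gram matrix of the rows of A: entry (i, j) is the dot product of rows i and j
gram = A @ A.T
print(gram)

# Cov(AX) = A Sigma_X A^T = 3 * A A^T
Cov_AX = A @ Sigma_X @ A.T

# Convert covariance to correlation by dividing by the standard deviations
std = np.sqrt(np.diag(Cov_AX))
Corr_AX = Cov_AX / np.outer(std, std)
print(Corr_AX)
```

Zero covariance means the components are uncorrelated; it does not by itself imply independence unless additional assumptions hold, such as $X$ being jointly normal.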


## Insight questions


## Linear Combinations in Factor Analysis

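In factor analysis, each observed variable is modelled as a linear combination of a small number of unobserved (latent) factors plus a variable-specific error term, and the commonly used estimates of the factor scores are in turn linear combinations of the original variables. The factors are chosen to account for the variation the variables share, that is, their covariances and correlations, using fewer variables than the original set. This reduces the dimensionality of the data, making it easier to analyze and interpret.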

Representing the data with a smaller number of factors lowers the cost of downstream computation and simplifies complex datasets. The main goal is to explain the observed correlations among variables with as few factors as possible, which reduces redundancy and focuses attention on the underlying structure of the data.

Additionally, factor analysis helps to isolate the most important information in the dataset by identifying the key underlying factors that drive the observed patterns. This allows analysts to focus on the most significant sources of variation and ignore less relevant details.
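
The link to the derivation above can be illustrated with a small simulation. Under the standard factor model $X = \Lambda F + \varepsilon$, with $\mathrm{Cov}(F) = I$, diagonal $\mathrm{Cov}(\varepsilon) = \Psi$, and $F$ uncorrelated with $\varepsilon$, the rule $\mathrm{Cov}(AX) = A\Sigma_X A^T$ gives $\mathrm{Cov}(X) = \Lambda\Lambda^T + \Psi$. The sketch below is a minimal check of that identity; the loadings `Lambda`, the unique variances `psi`, the sizes, and the Gaussian draws are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary fixed seed

p, m, n_samples = 6, 2, 200_000        # 6 observed variables, 2 latent factors (hypothetical)
Lambda = rng.normal(size=(p, m))       # hypothetical loading matrix
psi = rng.uniform(0.5, 1.5, size=p)    # hypothetical unique (error) variances

F = rng.normal(size=(m, n_samples))                              # factors with Cov(F) = I
eps = rng.normal(size=(p, n_samples)) * np.sqrt(psi)[:, None]    # errors with Cov = diag(psi)
X = Lambda @ F + eps                                             # observed variables, one draw per column

# Covariance implied by the model versus the covariance estimated from the draws
implied_cov = Lambda @ Lambda.T + np.diag(psi)
sample_cov = np.cov(X)

max_abs_err = np.max(np.abs(sample_cov - implied_cov))
print(max_abs_err)  # small, and shrinks as n_samples grows
```

Here two factors reproduce all 21 distinct entries of the 6 x 6 covariance matrix through only the loadings and unique variances, which is the sense in which a few factors explain the observed correlations.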