# Computing Conditional Probability $P(a \mid b)$

We consider the task of computing $P(a \mid b)$, where $a$ and $b$ may each be a vector of features.

By the definition of conditional probability, this is the joint distribution divided by the marginal of $b$ (assuming $P(b) > 0$):

$$
P(a \mid b) = \frac{P(a, b)}{P(b)}
$$
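As a quick illustration of this definition, the sketch below applies it to a small discrete joint distribution. The table values are made up for the example and are not taken from any dataset; the names `joint`, `p_b`, and `p_a_given_b` are illustrative.

```python
import numpy as np

# Hypothetical joint distribution P(a, b): rows index values of a, columns index values of b.
joint = np.array([
    [0.10, 0.20],
    [0.30, 0.40],
])

# Marginal P(b): sum the joint over a (the rows).
p_b = joint.sum(axis=0)

# Conditional P(a | b): divide each column of the joint by the matching marginal.
p_a_given_b = joint / p_b

print(p_a_given_b)
print(p_a_given_b.sum(axis=0))  # each column is a distribution over a
```

Each column of `p_a_given_b` is the distribution of $a$ for one fixed value of $b$, so the columns sum to one.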

When $a$ and $b$ are jointly Gaussian (an assumption made just for this example), the conditional distribution $P(a \mid b)$ is also Gaussian, so it is fully described by a conditional mean and a conditional covariance. Both can be computed in Python from the joint mean vector and covariance matrix.

Partition the joint mean vector and covariance matrix so that:
-  $\mu_1$ and $\Sigma_{11}$ are the mean and covariance of $a$,
-  $\mu_2$ and $\Sigma_{22}$ are the mean and covariance of $b$,
-  $\Sigma_{12}$ is the cross-covariance between $a$ and $b$ (so $\Sigma_{21} = \Sigma_{12}^T$).

Then the conditional distribution of $a$ given an observed value of $b$ is Gaussian, with the following parameters:

### Conditional Mean:

$$
\mu_{a \mid b} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (b - \mu_2)
$$

### Conditional Covariance:

$$
\Sigma_{a \mid b} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{12}^T
$$
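To make the two formulas concrete, here is a minimal sketch for the simplest case, where $a$ and $b$ are both scalars. In that case the formulas reduce to the familiar bivariate results $\mu_1 + \rho \frac{\sigma_1}{\sigma_2}(b - \mu_2)$ and $\sigma_1^2 (1 - \rho^2)$. The numbers (`sigma1`, `sigma2`, `rho`, `b_obs`) are arbitrary values chosen for illustration.

```python
import numpy as np

# Hypothetical bivariate Gaussian parameters (chosen only for illustration).
sigma1, sigma2, rho = 2.0, 1.0, 0.5
mu1, mu2 = 1.0, 0.0
b_obs = 2.0  # observed value of b

# 1x1 blocks of the joint covariance matrix.
Sigma_11 = np.array([[sigma1**2]])
Sigma_22 = np.array([[sigma2**2]])
Sigma_12 = np.array([[rho * sigma1 * sigma2]])

# General matrix formulas for the conditional mean and covariance.
mu_cond = mu1 + Sigma_12 @ np.linalg.solve(Sigma_22, np.array([b_obs - mu2]))
Sigma_cond = Sigma_11 - Sigma_12 @ np.linalg.solve(Sigma_22, Sigma_12.T)

# Scalar closed forms for the bivariate case.
mu_closed = mu1 + rho * sigma1 / sigma2 * (b_obs - mu2)
var_closed = sigma1**2 * (1.0 - rho**2)

print(mu_cond, mu_closed)
print(Sigma_cond, var_closed)
```

Both routes agree, which is a cheap way to check the sign and transpose conventions before moving to the multi-feature case.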

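These formulas give the conditional distribution $P(a \mid b)$ for any number of features in $a$ and $b$, using only the joint mean and covariance. Note that the conditional covariance does not depend on the observed value of $b$; only the conditional mean does.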

Here’s a Python implementation of how this can be computed using NumPy:

import numpy as np

n_features = 10  # Total number of features in the dataset (for both a and b)
n_samples = 1000  # Total number of samples

# Define which indexes belong to a and b
a_indexes = np.array([0, 1, 2])  # Features for a (3 features)
b_indexes = np.array([3, 4, 5, 6])  # Features for b (4 features)

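# Generate random data (n_features x n_samples).
# Note: np.random.rand draws from a uniform distribution on [0, 1), not a Gaussian,
# so the Gaussian conditioning formulas are applied here purely for illustration.
data = np.random.rand(n_features, n_samples)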

# Compute mean and covariance
mean = np.mean(data, axis=1)  # Mean across samples (axis=1 to get mean per feature)

cov = np.cov(data)  # Covariance matrix (n_features x n_features)

# Partition mean and covariance into a and b
mu_a = mean[a_indexes]  # Mean of a (3 features)
mu_b = mean[b_indexes]  # Mean of b (4 features)

Sigma_aa = cov[np.ix_(a_indexes, a_indexes)]  # Covariance of a (3x3 matrix)
Sigma_bb = cov[np.ix_(b_indexes, b_indexes)]  # Covariance of b (4x4 matrix)
Sigma_ab = cov[np.ix_(a_indexes, b_indexes)]  # Cross-covariance of a and b (3x4 matrix)

# Values of b (these should be the observed values of b)
b_values = np.array([1.0, 1.0, 1.0, 1.0])  # Example values for b (4 features)

# Compute conditional mean of a given b
# (np.linalg.solve avoids forming the inverse of Sigma_bb explicitly)
mu_a_given_b = mu_a + Sigma_ab @ np.linalg.solve(Sigma_bb, b_values - mu_b)

# Compute conditional covariance of a given b
Sigma_a_given_b = Sigma_aa - Sigma_ab @ np.linalg.solve(Sigma_bb, Sigma_ab.T)

print(f"Conditional mean of a given b shape: {mu_a_given_b.shape}")
print(f"Conditional mean of a given b: {mu_a_given_b}")

print(f"Conditional covariance of a given b shape: {Sigma_a_given_b.shape}")
print(f"Conditional covariance of a given b: {Sigma_a_given_b}")


Conditional mean of a given b shape: (3,)
Conditional mean of a given b: [0.42317393 0.52931148 0.47197586]
Conditional covariance of a given b shape: (3, 3)
Conditional covariance of a given b: [[ 0.08372542 -0.00357772 -0.00916852]
 [-0.00357772  0.08034498  0.00857042]
 [-0.00916852  0.00857042  0.080072  ]]
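One way to sanity-check the conditioning formulas is to compare them with the equivalent expressions in terms of the precision matrix $\Lambda = \Sigma^{-1}$: when $a$ and $b$ together cover all variables, $\Sigma_{a \mid b} = \Lambda_{aa}^{-1}$ and $\mu_{a \mid b} = \mu_1 - \Lambda_{aa}^{-1} \Lambda_{ab} (b - \mu_2)$. The sketch below does this on a small synthetic covariance matrix. The helper `condition_gaussian` and all test values are hypothetical and introduced only for this check.

```python
import numpy as np

def condition_gaussian(mean, cov, a_idx, b_idx, b_obs):
    """Hypothetical helper: mean and covariance of P(a | b = b_obs) for a joint Gaussian."""
    mu_a, mu_b = mean[a_idx], mean[b_idx]
    S_aa = cov[np.ix_(a_idx, a_idx)]
    S_ab = cov[np.ix_(a_idx, b_idx)]
    S_bb = cov[np.ix_(b_idx, b_idx)]
    # Solve linear systems rather than inverting S_bb explicitly.
    mu_cond = mu_a + S_ab @ np.linalg.solve(S_bb, b_obs - mu_b)
    S_cond = S_aa - S_ab @ np.linalg.solve(S_bb, S_ab.T)
    return mu_cond, S_cond

rng = np.random.default_rng(0)

# Synthetic symmetric positive-definite covariance over 5 variables.
A = rng.standard_normal((5, 5))
cov = A @ A.T + 5.0 * np.eye(5)
mean = rng.standard_normal(5)

a_idx = np.array([0, 1])        # a: 2 features
b_idx = np.array([2, 3, 4])     # b: the remaining 3 features
b_obs = np.array([0.5, -1.0, 2.0])

mu_cond, S_cond = condition_gaussian(mean, cov, a_idx, b_idx, b_obs)

# Cross-check against the precision-matrix form of the same quantities.
precision = np.linalg.inv(cov)
L_aa = precision[np.ix_(a_idx, a_idx)]
L_ab = precision[np.ix_(a_idx, b_idx)]
mu_check = mean[a_idx] - np.linalg.solve(L_aa, L_ab @ (b_obs - mean[b_idx]))

print(np.allclose(S_cond, np.linalg.inv(L_aa)))
print(np.allclose(mu_cond, mu_check))
```

The precision-matrix identity only holds in this form when `a_idx` and `b_idx` together index every variable. If some features are left out, as in the 10-feature example above, the covariance-based formulas still apply to the joint marginal of $(a, b)$, but the precision matrix would have to be computed from that sub-block rather than from the full covariance.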
