#### About
> Conditional Probability Distribution

A conditional probability distribution is the probability distribution of a random variable given that one or more other random variables take known values. It describes how the distribution of one variable changes once the value of another variable is known.
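
For reference, the standard definition can be written as follows. The symbols X and Y are introduced here for illustration and do not appear elsewhere in this notebook; X is the variable being conditioned on and Y is the variable of interest.

```latex
% Discrete case, defined whenever P(X = x) > 0
P(Y = y \mid X = x) = \frac{P(X = x,\, Y = y)}{P(X = x)}

% Continuous case, defined whenever f_X(x) > 0
f_{Y \mid X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}
```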

Use cases - 

1. Machine learning - A conditional probability distribution models the distribution of a target variable given input features. For instance, in a spam classification problem, we can model the probability that an email is spam given features such as the email content and sender information.

2. Finance - It is used to model the distribution of investment returns given observed conditions, such as the current state of the market.
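
The spam use case can be sketched with a contingency table. This is a minimal illustration, not a real classifier: the eight emails, the `contains_offer` feature, and all values are invented for the example.

```python
import pandas as pd

# Hypothetical toy data: one row per email (values invented for illustration).
emails = pd.DataFrame({
    "contains_offer": [1, 1, 1, 0, 0, 0, 1, 0],
    "is_spam":        [1, 1, 0, 0, 0, 1, 1, 0],
})

# normalize="index" divides each row of the table by its row total, so the row
# for contains_offer = x holds the distribution P(is_spam | contains_offer = x).
cond = pd.crosstab(emails["contains_offer"], emails["is_spam"], normalize="index")
print(cond)

# Probability that an email is spam given that it contains the word.
p_spam_given_offer = cond.loc[1, 1]
print(p_spam_given_offer)
```

Each row of `cond` sums to 1, which is what makes it a conditional distribution rather than a joint one.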


Example -

Suppose we have a dataset of the heights and weights of a group of people. We want to know the probability distribution of the weight of a person given that their height is greater than 6 feet.

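One way to reason about this is Bayes' theorem, which relates a conditional probability to its reverse:

P(A|B) = P(B|A) * P(A) / P(B)

Here A is an event about a person's weight and B is the event that their height is greater than 6 feet.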
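
As a quick numerical check of this identity, the sketch below takes A to be "weight above 180" and B to be "height above 6 feet". The threshold of 180 is an assumption chosen only for illustration. It uses the ten height/weight pairs from this example's dataset, repeated here so the block runs on its own.

```python
import pandas as pd

data = pd.DataFrame({
    "Height": [5.9, 6.2, 6.5, 6.1, 6.3, 6.8, 6.0, 6.7, 6.4, 6.2],
    "Weight": [155, 176, 201, 164, 180, 210, 142, 200, 185, 168],
})

A = data["Weight"] > 180  # illustrative event; the threshold is an assumption
B = data["Height"] > 6    # conditioning event from the example

# Empirical probabilities: the mean of a boolean Series is the fraction of True rows.
p_a = A.mean()
p_b = B.mean()
p_b_given_a = (A & B).sum() / A.sum()

# P(A|B) computed directly by counting, and again through Bayes' theorem.
direct = (A & B).sum() / B.sum()
via_bayes = p_b_given_a * p_a / p_b
print(direct, via_bayes)
```

Both routes should give the same value, because Bayes' theorem is an identity on these empirical frequencies.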


In [2]:
import pandas as pd
import numpy as np
from scipy.stats import norm

# Small illustrative dataset (heights in feet)
data = pd.DataFrame({'Height': [5.9, 6.2, 6.5, 6.1, 6.3, 6.8, 6.0, 6.7, 6.4, 6.2],
                     'Weight': [155, 176, 201, 164, 180, 210, 142, 200, 185, 168]})


In [3]:
# Mean and sample standard deviation (pandas default, ddof=1) of weight over all rows,
# including people who are not taller than 6 feet
mean_weight = data['Weight'].mean()
std_weight = data['Weight'].std()


In [4]:
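# Approximate the distribution of weight given height > 6 feet.
# Step 1: evaluate a normal density, fitted to all ten weights, at each observed weight.
prob_weight_given_height = norm.pdf(data['Weight'], loc=mean_weight, scale=std_weight)
# Step 2: indicator that is 1 where height > 6 and 0 otherwise (a height of exactly 6.0 is excluded).
prob_height = (data['Height'] > 6).astype(int)
# Step 3: zero out rows that fail the condition, then rescale so the remaining values sum to 1.
# The result is a set of normalized weights over the rows, not a density fitted only to the taller group.
prob_weight_given_height_cond = prob_weight_given_height * prob_height
prob_weight_given_height_cond /= prob_weight_given_height_cond.sum()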


In [5]:
# Print the normalized values, one per row of data
# (the Series is named 'Height' because it inherits the name from the indicator)
print(prob_weight_given_height_cond)

0    0.000000
1    0.161695
2    0.092809
3    0.131389
4    0.161834
5    0.054816
6    0.000000
7    0.097356
8    0.154407
9    0.145695
Name: Height, dtype: float64
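
An alternative that conditions more directly is to filter the rows first and then describe only the weights that remain. The sketch below assumes a normal model for the conditional weights, which is a modeling choice and not something the data guarantees. The name `query_weight` and its value are hypothetical.

```python
import pandas as pd
from scipy.stats import norm

data = pd.DataFrame({
    "Height": [5.9, 6.2, 6.5, 6.1, 6.3, 6.8, 6.0, 6.7, 6.4, 6.2],
    "Weight": [155, 176, 201, 164, 180, 210, 142, 200, 185, 168],
})

# Condition first: keep only the weights of people taller than 6 feet.
tall_weights = data.loc[data["Height"] > 6, "Weight"]

# Fit a normal distribution to the conditional sample (assumed model).
cond_mean = tall_weights.mean()
cond_std = tall_weights.std()  # sample standard deviation (ddof=1)

# Conditional density evaluated at a hypothetical query weight.
query_weight = 190
density = norm.pdf(query_weight, loc=cond_mean, scale=cond_std)
print(cond_mean, cond_std, density)
```

Here the mean and standard deviation come only from the rows that satisfy the condition, so the fitted curve reflects the taller group and not the whole sample.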
