## Conditional Probability

**Definition:**  
Conditional probability is the probability of one event occurring **given that** another event has already occurred.  
It tells us how the probability of an event changes when we have additional information.

**Formula in words:**  
The probability of event A given that event B has occurred  
is equal to the probability of both A and B happening  
divided by the probability of event B.

**That is:**

P(A given B) = P(A and B) / P(B), provided P(B) > 0
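
As a quick check of the formula, here is a minimal sketch using a fair six-sided die. The die example and the names `outcomes`, `A`, `B`, and `prob` are illustrative assumptions and are not part of the datasets used in the cells that follow.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (illustrative example).
outcomes = {1, 2, 3, 4, 5, 6}

A = {n for n in outcomes if n % 2 == 0}   # event A: the roll is even
B = {n for n in outcomes if n > 3}        # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


p_b = prob(B)                   # P(B) = 3/6
p_a_and_b = prob(A & B)         # P(A and B) = 2/6, from the outcomes {4, 6}
p_a_given_b = p_a_and_b / p_b   # P(A given B)

print(p_a_given_b)  # 2/3
```

Without any extra information, P(A) is 1/2. Knowing that the roll is greater than 3 raises the probability of an even roll to 2/3, which is the "additional information" effect described in the definition.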

In [5]:
import pandas as pd

data = {
    'Person': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'Gender': ['M', 'F', 'F', 'M', 'F', 'M', 'M', 'F'],
    'Play_Sport': ['Yes', 'No', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes']
}

df = pd.DataFrame(data)
df


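|   | Person | Gender | Play_Sport |
|---|--------|--------|------------|
| 0 | A | M | Yes |
| 1 | B | F | No |
| 2 | C | F | Yes |
| 3 | D | M | Yes |
| 4 | E | F | No |
| 5 | F | M | Yes |
| 6 | G | M | No |
| 7 | H | F | Yes |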


In [9]:
# Total females
females = df[df['Gender'] == 'F']

# Females who play sport
females_play = df[(df['Gender'] == 'F') & (df['Play_Sport'] == 'Yes')]

P_B = len(females) / len(df)             # P(B): probability of being female
P_A_and_B = len(females_play) / len(df)  # P(A and B): female and plays sport

# P(A given B): probability of playing sport given that the person is female
P_A_given_B = P_A_and_B / P_B
print(f"Probability of playing sport given female: {P_A_given_B*100}%")


Probability of playing sport given female: 50.0%


In [11]:
females = df[df['Gender'] == 'F']

females_not_play = df[(df['Gender'] == 'F') & (df['Play_Sport'] == 'No')]

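P_B = len(females) / len(df)                 # P(B): probability of being female
P_A_and_B = len(females_not_play) / len(df)  # P(A and B): female and does not play sport

# P(A given B): probability of not playing sport given that the person is female
P_A_given_B = P_A_and_B / P_B
print(f"Probability of not playing sport given female: {P_A_given_B*100}%")


Probability of not playing sport given female: 50.0%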
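
The two cells above each compute one conditional probability by hand. As a possible shortcut, the sketch below builds every P(Play_Sport given Gender) value at once with `pd.crosstab` and `normalize='index'`. The names `df_sport` and `cond_table` are assumptions introduced here so the example runs on its own.

```python
import pandas as pd

# Same data as the table above, rebuilt so this cell is self-contained.
df_sport = pd.DataFrame({
    'Person': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'Gender': ['M', 'F', 'F', 'M', 'F', 'M', 'M', 'F'],
    'Play_Sport': ['Yes', 'No', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes'],
})

# normalize='index' divides each row by its row total,
# so every cell holds P(Play_Sport value given Gender value).
cond_table = pd.crosstab(df_sport['Gender'], df_sport['Play_Sport'], normalize='index')
print(cond_table)

p_play_given_female = cond_table.loc['F', 'Yes']
p_play_given_male = cond_table.loc['M', 'Yes']
```

Each row of `cond_table` sums to 1 because, within a given gender, a person either plays sport or does not. For this data the `F` row agrees with the 50% results above, and the `M` row gives 0.75 for `Yes`.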


In [13]:

data = {
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'Raining': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'No'],
    'Cold': ['Yes', 'No', 'Yes', 'No', 'No', 'Yes', 'No'],
    'Play_Cricket': ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes']
}

df = pd.DataFrame(data)
df


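|   | Day | Raining | Cold | Play_Cricket |
|---|-----|---------|------|--------------|
| 0 | Mon | Yes | Yes | No |
| 1 | Tue | No | No | Yes |
| 2 | Wed | Yes | Yes | No |
| 3 | Thu | No | No | Yes |
| 4 | Fri | Yes | No | Yes |
| 5 | Sat | No | Yes | Yes |
| 6 | Sun | No | No | Yes |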


In [21]:
rain_days = df[df['Raining'] == 'Yes']
rain_play = df[(df['Raining'] == 'Yes') & (df['Play_Cricket'] == 'Yes')]

P_B = len(rain_days) / len(df)           # Probability of raining
P_A_and_B = len(rain_play) / len(df)     # Probability of raining and playing cricket

P_A_given_B = P_A_and_B / P_B
print(f"Probability of playing cricket given rain: {P_A_given_B*100:.2f}%")


Probability of playing cricket given rain: 33.33%
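
To see how the extra information changes the probability, the sketch below compares the unconditional probability of playing cricket with the probability on rainy and on dry days. It uses boolean masks and `.mean()` instead of the explicit formula. The names `df_cricket`, `played`, and `raining` are assumptions introduced for this example.

```python
import pandas as pd

# Same data as the table above, rebuilt so this cell is self-contained.
df_cricket = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'Raining': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'No'],
    'Cold': ['Yes', 'No', 'Yes', 'No', 'No', 'Yes', 'No'],
    'Play_Cricket': ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes'],
})

played = df_cricket['Play_Cricket'] == 'Yes'
raining = df_cricket['Raining'] == 'Yes'

# The mean of a boolean Series is the fraction of True values.
p_play = played.mean()                       # P(play): 5 of 7 days
p_play_given_rain = played[raining].mean()   # P(play given rain): 1 of 3 rainy days
p_play_given_dry = played[~raining].mean()   # P(play given no rain): 4 of 4 dry days

print(f"P(play)               = {p_play:.2f}")             # 0.71
print(f"P(play given rain)    = {p_play_given_rain:.2f}")  # 0.33
print(f"P(play given no rain) = {p_play_given_dry:.2f}")   # 1.00
```

Restricting the rows to rainy days and taking the mean is equivalent to dividing P(A and B) by P(B), because the `len(df)` factor cancels in the ratio.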


In [25]:
cold_days = df[df['Cold'] == 'Yes']
cold_play = df[(df['Cold'] == 'Yes') & (df['Play_Cricket'] == 'Yes')]

P_B = len(cold_days) / len(df)           # Probability of cold
P_A_and_B = len(cold_play) / len(df)     # Probability of cold and playing cricket

P_A_given_B = P_A_and_B / P_B
print(f"Probability of playing cricket given cold: {P_A_given_B*100:.2f}%")


Probability of playing cricket given cold: 33.33%
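
The cells above repeat the same three steps for each condition. One way to avoid the repetition is a small helper, sketched below. `conditional_probability` is a hypothetical function written for this example, not a pandas API, and `df_cricket`, `played`, `raining`, and `cold` are assumed names.

```python
import pandas as pd


def conditional_probability(event, given):
    """Return P(event given `given`) for two aligned boolean Series.

    Hypothetical helper for illustration; it applies
    P(A given B) = P(A and B) / P(B) and rejects P(B) = 0.
    """
    p_given = given.mean()
    if p_given == 0:
        raise ValueError("P(B) is zero, so P(A given B) is undefined")
    return (event & given).mean() / p_given


# Same data as the cricket table above, rebuilt so this cell is self-contained.
df_cricket = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'Raining': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'No'],
    'Cold': ['Yes', 'No', 'Yes', 'No', 'No', 'Yes', 'No'],
    'Play_Cricket': ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes'],
})

played = df_cricket['Play_Cricket'] == 'Yes'
raining = df_cricket['Raining'] == 'Yes'
cold = df_cricket['Cold'] == 'Yes'

p_play_given_cold = conditional_probability(played, cold)
# Conditioning on two facts at once: the day is both rainy and cold.
p_play_given_rain_and_cold = conditional_probability(played, raining & cold)

print(f"P(play given cold)          = {p_play_given_cold:.2f}")           # 0.33
print(f"P(play given rain and cold) = {p_play_given_rain_and_cold:.2f}")  # 0.00
```

The guard on `p_given` reflects the requirement that the conditioning event must have non-zero probability. With only seven rows, these estimates are rough; the two rainy and cold days (Mon, Wed) are the entire basis for the 0.00 figure.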
