## Conditional Probability

Conditional probability is the probability of an event $A$ happening given that another event $B$ has already happened.

Example:
You have a deck of cards. If I tell you "the card that is drawn is a heart", the probability that it's the Ace of hearts is no longer 1/52; it's 1/13, because you now only consider the 13 hearts. That's conditional probability.

The conditional probability of event $A$ given $B$ is:

$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$

$A \cap B$ = both events happen

$P(B)$ = probability of the “given” event

$P(A|B)$ = probability of $A$ in the context where $B$ already happened
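
As a worked instance of the formula, take $A$ = "the card is the Ace of hearts" and $B$ = "the card is a heart" in a standard 52-card deck. Since the Ace of hearts is itself a heart, $A \cap B = A$, so:

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/52}{13/52} = \frac{1}{13}$$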

In [1]:
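# Example: estimating a conditional probability by simulation
import numpy as np

# Build a full 52-card deck as (rank, suit) pairs
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['heart', 'diamond', 'club', 'spade']
deck = [(rank, suit) for suit in suits for rank in ranks]

# Event A = Ace of hearts
# Event B = Heart
num_trials = 100000
count_A_and_B = 0
count_B = 0

rng = np.random.default_rng()
for _ in range(num_trials):
    # Draw one card uniformly at random (with replacement across trials)
    rank, suit = deck[rng.integers(len(deck))]
    if suit == 'heart':  # B happened
        count_B += 1
        if rank == 'A':  # A also happened (Ace of hearts)
            count_A_and_B += 1

# Ratio of counts estimates P(A and B) / P(B); it should land close to 1/13 (about 0.077)
P_A_given_B = count_A_and_B / count_B
print(P_A_given_B)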


## Law of Total Probability

The Law of Total Probability states that if $B_1, B_2, \ldots, B_n$ are mutually exclusive and exhaustive events, each with $P(B_i) > 0$, then for any event $A$:

$$P(A) = \sum_{i=1}^{n} P(A|B_i) \cdot P(B_i)$$

The total probability of $A$ is computed by summing over all the "paths" $B_i$ through which $A$ can happen, weighting each path by its own probability.
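
A small sketch of this sum, reusing the card deck from earlier. The choice of events is an assumption made for illustration: the four suits play the role of the partition $B_1, \ldots, B_4$, and $A$ is "the card is an Ace". Exact fractions are used so the result can be compared with a direct count (4 Aces out of 52 cards).

```python
from fractions import Fraction

# Partition of the deck: B_i = "the card belongs to suit i"
suits = ["heart", "diamond", "club", "spade"]
p_suit = {s: Fraction(13, 52) for s in suits}            # P(B_i)
p_ace_given_suit = {s: Fraction(1, 13) for s in suits}   # P(A | B_i): one Ace per suit

# Law of Total Probability: add up P(A | B_i) * P(B_i) over every suit
p_ace = sum(p_ace_given_suit[s] * p_suit[s] for s in suits)

print(p_ace)                      # 1/13
print(p_ace == Fraction(4, 52))   # True: matches counting the Aces directly
```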

## Bayes' Theorem

Bayes' theorem is conditional probability in reverse:
$$P(B|A) = \frac{P(A|B) \cdot P(B)}{P(A)}, \quad P(A) > 0$$

$P(B)$ = prior probability (what you believe about $B$ before seeing $A$)

$P(A|B)$ = likelihood (probability of seeing $A$ if $B$ is true)

$P(B|A)$ = posterior probability (updated belief about $B$ after seeing $A$)

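$P(A)$ = total probability of $A$ (the evidence). By the Law of Total Probability, with $B^c$ denoting the complement of $B$:

$$P(A) = P(A|B) \cdot P(B) + P(A|B^c) \cdot P(B^c)$$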
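
To see prior, likelihood, evidence, and posterior working together, here is a minimal sketch using the card deck again. The events are chosen only for illustration: $B$ = "the card is a heart" and $A$ = "the card is red". The variable names are hypothetical.

```python
from fractions import Fraction

# B = "the card is a heart", A = "the card is red"
p_B = Fraction(13, 52)              # prior P(B)
p_A_given_B = Fraction(1)           # likelihood P(A|B): every heart is red
p_A_given_not_B = Fraction(13, 39)  # of the 39 non-hearts, only the 13 diamonds are red

# Evidence P(A), expanded over "B" and "not B"
p_A = p_A_given_B * p_B + p_A_given_not_B * (1 - p_B)

# Bayes' theorem: posterior P(B|A)
p_B_given_A = p_A_given_B * p_B / p_A

print(p_A, p_B_given_A)  # 1/2 1/2
```

Learning that the card is red raises the probability of a heart from the prior of 1/4 to a posterior of 1/2, which agrees with counting: 13 of the 26 red cards are hearts.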

### Implementing Gaussian Naive Bayes on the Iris Dataset


First, let's understand the problem.

We have data points (flowers) with features (sepal and petal length and width) and labels (species).

The question is:

"If I see a flower with these measurements, which species is it most likely to be?"

That is a probability question.
Given the measurements, what is the probability of each class?


Bayes' theorem lets us flip conditional probabilities.

It is easier to model features given a class than to model classes given features directly.

So instead of trying to learn the probability of the species given the measurements, we learn the probability of the measurements given the species, which is easier.

Then we reverse it using Bayes' theorem.

The steps now:

Step 1 - Priors

This is the baseline belief before seeing any features.

Example: If most flowers in the dataset are Setosa, we expect a new flower to be Setosa with high probability.

Ask this: "If I know nothing about this flower except class frequencies, what would I guess?"

Step 2 - Mean & Variance per feature

We assume each feature follows a Gaussian distribution within each class.
Mean = where that feature's values are centered for the class.
Variance = how spread out that feature's values are for the class.

We ask here, "How likely is this measurement for this class?"

Step 3 - Likelihood of new point

For each feature, calculate how likely the observed value is under that class's Gaussian distribution.

Multiply the per-feature likelihoods together (this independence assumption is what makes it “Naive”).

Mental model: “If this were really a flower of species c, how surprised would I be to see these measurements?”

Very likely → probability high

Very unlikely → probability low
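
Under the Gaussian assumption from Step 2, the quantities in Step 3 can be written out explicitly. The notation is introduced here for illustration: $x_j$ is the $j$-th measurement, $\mu_{c,j}$ and $\sigma_{c,j}^2$ are the mean and variance of feature $j$ within class $c$, and $d$ is the number of features. Strictly speaking these are density values rather than probabilities, but they play the same role when classes are compared.

$$P(x_j | c) = \frac{1}{\sqrt{2\pi\sigma_{c,j}^2}} \exp\left(-\frac{(x_j - \mu_{c,j})^2}{2\sigma_{c,j}^2}\right)$$

$$P(x_1, \ldots, x_d | c) = \prod_{j=1}^{d} P(x_j | c)$$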

Step 4 - Posterior & Prediction

Combine the prior belief with the evidence from the features: the posterior is proportional to prior × likelihood.

Pick the class with the highest posterior probability.

Mental model: “Given my baseline expectation and how likely these measurements are, which species makes the most sense?”
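
The four steps above can be sketched from scratch in a few lines of NumPy. This is an illustrative sketch, not how scikit-learn implements it: the function names `fit_gaussian_nb` and `predict_gaussian_nb`, the `eps` constant, and the toy measurements are all hypothetical. It also works with log-probabilities, so the product in Step 3 becomes a sum; that is a common choice for numerical stability and does not change which class wins.

```python
import numpy as np

def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate priors (Step 1) and per-class feature means/variances (Step 2)."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # eps is an assumed small constant that keeps every variance strictly positive
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + eps
    return classes, priors, means, variances

def predict_gaussian_nb(model, X):
    """Score every class for every row of X and return the best class (Steps 3 and 4)."""
    classes, priors, means, variances = model
    # Step 3: log Gaussian density per feature, shape (n_samples, n_classes, n_features)
    diff = X[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    # Naive independence: summing logs is the same as multiplying likelihoods
    log_likelihood = log_pdf.sum(axis=2)
    # Step 4: add the log prior, then pick the class with the highest score
    log_posterior = log_likelihood + np.log(priors)[None, :]
    return classes[np.argmax(log_posterior, axis=1)]

# Toy, made-up measurements: two features, two well-separated classes
X_toy = np.array([[1.0, 0.2], [1.2, 0.3], [0.9, 0.1],
                  [4.0, 1.5], [4.2, 1.4], [3.9, 1.6]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = fit_gaussian_nb(X_toy, y_toy)
preds = predict_gaussian_nb(model, np.array([[1.1, 0.2], [4.1, 1.5]]))
print(preds)  # [0 1]
```

The scores are unnormalized log-posteriors: dividing by the evidence $P(A)$ would rescale every class by the same factor, so it can be skipped when only the most probable class is needed.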

In [5]:
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()

X = iris.data
y = iris.target


feature_names = iris.feature_names
target_names = iris.target_names

print("Feature names:", feature_names)
print("Target names:", target_names)
print("First 5 samples:\n", X[:5])


Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Target names: ['setosa' 'versicolor' 'virginica']
First 5 samples:
 [[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]


In [13]:
iris_df = pd.DataFrame(X, columns=iris.feature_names)
iris_df["target"] = y

In [14]:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

In [15]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=69)

In [17]:
gnb_model = GaussianNB()
gnb_model.fit(X_train, y_train)

GaussianNB()


In [18]:
y_pred = gnb_model.predict(X_test)

In [19]:
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

Accuracy: 0.9666666666666667


In [21]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       0.89      1.00      0.94         8
           2       1.00      0.92      0.96        12

    accuracy                           0.97        30
   macro avg       0.96      0.97      0.97        30
weighted avg       0.97      0.97      0.97        30
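
To connect the prediction back to Step 4, the fitted model can also report the posterior probability of each class rather than only the winning label. The sketch below is self-contained, so it repeats the loading, splitting, and fitting from the cells above with the same `random_state`; looking at only the first three test flowers is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=69
)

gnb_model = GaussianNB().fit(X_train, y_train)

# Posterior probability of each class for the first three test flowers;
# each row sums to 1 and has one column per species
posteriors = gnb_model.predict_proba(X_test[:3])
print(np.round(posteriors, 3))

# The predicted class is the column with the largest posterior
predicted = gnb_model.classes_[np.argmax(posteriors, axis=1)]
print(predicted, gnb_model.predict(X_test[:3]))
```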

