# Naive Bayes Algorithm (Classification)

### 1. Probability

Independent events: rolling a die with outcomes {1, 2, 3, 4, 5, 6}.

P(1) = 1/6, P(2) = 1/6, P(3) = 1/6, and likewise for each remaining face.

The outcome of one roll does not affect the outcome of any other roll.
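A minimal sketch of what independence means numerically, assuming two rolls of a fair die (the second roll and the helper `prob` are illustrative additions, not part of the notes above). It enumerates every pair of rolls and checks that the joint probability equals the product of the individual probabilities.

```python
from fractions import Fraction
from itertools import product

# Sample space for two rolls of a fair six-sided die: 36 equally likely pairs.
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (first, second) rolls."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_first_is_1 = prob(lambda o: o[0] == 1)
p_second_is_2 = prob(lambda o: o[1] == 2)
p_both = prob(lambda o: o[0] == 1 and o[1] == 2)

print(p_first_is_1, p_second_is_2, p_both)

# Independence: P(A and B) = P(A) * P(B)
independent = p_both == p_first_is_1 * p_second_is_2
print("independent:", independent)
```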


Dependent Events

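I have a bag of marbles containing three orange and two yellow marbles.

What is the probability of drawing an orange marble and then a yellow marble, without replacement?

P(orange) = 3/5 (first event) -> P(yellow after orange) = 2/4 (second event)

The first event affects the second: removing an orange marble leaves four marbles, two of them yellow.

P(O and Y) = P(O) * P(Y|O) = 3/5 * 2/4 = 3/10, where P(Y|O) is the conditional probability of Y given O.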

P(Y|O) = P(O and Y) / P(O)
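The marble example can be checked by brute force. The sketch below, assuming draws without replacement and equally likely marbles, lists every ordered pair of draws and recovers P(O), P(O and Y), and P(Y|O) by counting; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Three orange (O) and two yellow (Y) marbles.
bag = ["O", "O", "O", "Y", "Y"]

# Every ordered draw of two distinct marbles: 5 * 4 = 20 equally likely pairs.
draws = list(permutations(range(len(bag)), 2))

n_orange_first = sum(1 for i, j in draws if bag[i] == "O")
n_orange_then_yellow = sum(
    1 for i, j in draws if bag[i] == "O" and bag[j] == "Y"
)

p_o = Fraction(n_orange_first, len(draws))              # P(O)
p_o_and_y = Fraction(n_orange_then_yellow, len(draws))  # P(O and Y)
p_y_given_o = p_o_and_y / p_o                           # P(Y|O)

print("P(O)       =", p_o)
print("P(O and Y) =", p_o_and_y)
print("P(Y|O)     =", p_y_given_o)
```

Counting gives P(O) = 3/5, P(O and Y) = 3/10, and P(Y|O) = 1/2, which is the 2/4 from the text in lowest terms.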




## Bayes' Theorem
# Derivation of Bayes' Theorem

Bayes' Theorem is a fundamental result in probability theory that describes the probability of an event based on prior knowledge of conditions that might be related to the event. It is stated as:

$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$

Where:
- $P(A|B)$ is the **posterior probability** of event $A$ given $B$.
- $P(B|A)$ is the **likelihood** of event $B$ given $A$.
- $P(A)$ is the **prior probability** of event $A$.
- $P(B)$ is the **marginal probability** of event $B$.

---

## Step 1: Definition of Conditional Probability

The conditional probability of $A$ given $B$ is defined (for $P(B) > 0$) as:

$$
P(A|B) = \frac{P(A \cap B)}{P(B)}
$$

Similarly, the conditional probability of $B$ given $A$ (for $P(A) > 0$) is:

$$
P(B|A) = \frac{P(A \cap B)}{P(A)}
$$

---

## Step 2: Express $P(A \cap B)$ in Two Ways

From the definition of conditional probability, we can express $P(A \cap B)$ in two ways:

1. From $P(A|B)$:

$$
P(A \cap B) = P(A|B) \cdot P(B)
$$

2. From $P(B|A)$:

$$
P(A \cap B) = P(B|A) \cdot P(A)
$$

---

## Step 3: Equate the Two Expressions for $P(A \cap B)$

Since both expressions equal $P(A \cap B)$, we can set them equal to each other:

$$
P(A|B) \cdot P(B) = P(B|A) \cdot P(A)
$$

---

## Step 4: Solve for $P(A|B)$

Divide both sides of the equation by $P(B)$ to solve for $P(A|B)$:

$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$

---

## Final Result: Bayes' Theorem

Thus, Bayes' Theorem is derived as:

$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$

---

## Interpretation

- **Posterior Probability ($P(A|B)$)**: The probability of $A$ after observing $B$.
- **Prior Probability ($P(A)$)**: The initial probability of $A$ before observing $B$.
- **Likelihood ($P(B|A)$)**: The probability of observing $B$ given $A$.
- **Marginal Probability ($P(B)$)**: The total probability of observing $B$, calculated as:

$$
P(B) = P(B|A) \cdot P(A) + P(B|\neg A) \cdot P(\neg A)
$$

Where $\neg A$ represents the complement of $A$.
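To make the formula concrete, here is a small sketch that computes the marginal and the posterior for made-up numbers. The prior of 1/100 and the likelihoods of 19/20 and 1/20 are purely hypothetical values chosen for illustration, and `bayes_posterior` is an illustrative helper, not a library function.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood, likelihood_given_not_a):
    """Return (P(B), P(A|B)) from P(A), P(B|A) and P(B|not A)."""
    # Marginal: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    marginal = likelihood * prior + likelihood_given_not_a * (1 - prior)
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    posterior = likelihood * prior / marginal
    return marginal, posterior

# Hypothetical inputs (assumptions for this example only).
p_a = Fraction(1, 100)             # prior P(A)
p_b_given_a = Fraction(19, 20)     # likelihood P(B|A)
p_b_given_not_a = Fraction(1, 20)  # P(B|not A)

p_b, p_a_given_b = bayes_posterior(p_a, p_b_given_a, p_b_given_not_a)
print("P(B)   =", p_b)
print("P(A|B) =", p_a_given_b, "~", round(float(p_a_given_b), 4))
```

With these inputs the posterior is 19/118, about 0.16: even a strong likelihood yields a modest posterior when the prior is small.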

# Variants of Naive Bayes

## 1. Bernoulli Naive Bayes

When the independent features follow a Bernoulli distribution, that is, each feature is binary (0 or 1), use Bernoulli Naive Bayes.

It can also be used in NLP problems, for example when each feature records whether a word is present in a document.

## 2. Multinomial Naive Bayes

When the independent features are discrete counts, such as word counts extracted from text, Multinomial Naive Bayes is the usual choice.

## 3. Gaussian Naive Bayes

When the independent features are continuous and approximately follow a Gaussian (normal) distribution within each class, use Gaussian Naive Bayes.
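As a rough illustration of matching the variant to the feature type, the sketch below fits scikit-learn's `BernoulliNB`, `MultinomialNB`, and `GaussianNB` on tiny toy arrays. The arrays are invented for this example (binary flags, counts, and real values respectively) and carry no meaning beyond showing which kind of input each estimator expects.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Toy labels shared by all three examples (assumed data, for illustration only).
y = np.array([0, 0, 1, 1])

# Binary features (e.g. word present / absent) -> BernoulliNB
X_binary = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]])

# Non-negative counts (e.g. word counts) -> MultinomialNB
X_counts = np.array([[3, 0, 0], [2, 1, 0], [0, 0, 4], [0, 1, 2]])

# Continuous measurements -> GaussianNB
X_real = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.2], [4.1, 3.9]])

models = {
    "BernoulliNB": (BernoulliNB(), X_binary),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "GaussianNB": (GaussianNB(), X_real),
}

predictions = {}
for name, (model, X) in models.items():
    model.fit(X, y)
    # Predicting on the training rows only demonstrates the API, not accuracy.
    predictions[name] = model.predict(X)
    print(name, predictions[name])
```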



# Naive Bayes Practical Implementation

In [None]:
#loading required libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

In [None]:
#separating input and output features
X, y = load_iris(return_X_y = True)

In [None]:
#splitting data into train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 42)

In [None]:
#import gaussian naive bayes
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()

In [None]:
#fitting gaussian naive bayes on training data
gnb.fit(X_train, y_train)

In [None]:
#making predictions on test data
y_pred = gnb.predict(X_test)

In [None]:
# checking performance
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))


[[19  0  0]
 [ 0 12  1]
 [ 0  0 13]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      0.92      0.96        13
           2       0.93      1.00      0.96        13

    accuracy                           0.98        45
   macro avg       0.98      0.97      0.97        45
weighted avg       0.98      0.98      0.98        45
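The figures in the report can be reproduced by hand from the confusion matrix. The sketch below copies the matrix printed above (rows are true classes, columns are predicted classes) and recomputes accuracy, precision, and recall with plain Python; the variable names are illustrative.

```python
# Confusion matrix copied from the output above.
# Rows: true class, columns: predicted class.
confusion = [
    [19, 0, 0],
    [0, 12, 1],
    [0, 0, 13],
]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)
correct = sum(confusion[k][k] for k in range(n_classes))
accuracy = correct / total
print(f"accuracy = {correct}/{total} = {accuracy:.2f}")

precision = []
recall = []
for k in range(n_classes):
    true_positive = confusion[k][k]
    predicted_as_k = sum(row[k] for row in confusion)  # column sum
    actually_k = sum(confusion[k])                     # row sum (support)
    precision.append(true_positive / predicted_as_k)
    recall.append(true_positive / actually_k)
    print(f"class {k}: precision = {precision[k]:.2f}, recall = {recall[k]:.2f}")
```

The single misclassification (one class 1 sample predicted as class 2) is what lowers the recall of class 1 to 12/13 and the precision of class 2 to 13/14.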



In [None]:
import seaborn as sns
sns.load_dataset("tips")