## **Naive Bayes Classifier**

* The Naïve Bayes classifier is a supervised machine learning algorithm that is used for classification tasks.

* It is based on Bayes' Theorem.

* It assumes that features are independent of each other given the class, meaning the presence or absence of one feature doesn't affect the probability of another. This simplifying assumption is why it is called "naive".

* The Naïve Bayes classifier works well when the input features are all of the same type, which is why it performs well on text tasks such as sentiment analysis.
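
Written as a formula, the points above amount to the following decision rule. This is the standard textbook form; the notation ($y$ for the class, $x_1, \dots, x_n$ for the features, $\hat{y}$ for the predicted class) is introduced here only for illustration.

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y), \qquad \hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)$$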

### **Bayes' Theorem**
* Bayes' Theorem is a fundamental theorem in probability theory that describes the probability of an event based on prior knowledge of conditions that might be related to the event.

* Formula: $$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$

where,  

- P(A|B) is the probability of event A occurring given that event B has occurred. It is called the **posterior probability**.  
- P(B|A) is the probability of event B occurring given that event A has occurred. It is called the **likelihood**.  
- P(A) is the probability of event A. It is called the **prior probability**.  
- P(B) is the probability of event B. It is called the **evidence** (or marginal probability).
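
The formula can be tried out in a few lines of Python. The sketch below is only an illustration: the function name `posterior` and all three input probabilities are hypothetical, and P(B) is expanded with the law of total probability rather than supplied directly.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) via Bayes' Theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    """
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers, chosen only for illustration
p_a = 0.3              # prior P(A)
p_b_given_a = 0.8      # likelihood P(B|A)
p_b_given_not_a = 0.1  # P(B|not A)

p_a_given_b = posterior(p_a, p_b_given_a, p_b_given_not_a)
print(f"P(A|B) = {p_a_given_b:.4f}")
```

Observing B raises the probability of A from the prior of 0.3 because B is much more likely when A holds than when it does not.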

**Bayes' Theorem Example**
<center>
    <img src="../assets/bayes-theorem-problem.png" />
</center>

* What is the probability of playing tennis when outlook is rainy, temperature is cool, humidity is normal and windy is true?

    * P(PlayTennis=Yes)= Number of 'Yes' / Total instances = 9 / 14 = 0.64

    * P(PlayTennis=No) = Number of 'No' / Total instances = 5 / 14 = 0.36

    * P(Outlook = Rainy|Yes) = Number of Rainy and Yes / Total Yes = 3 / 9

    * P(Temperature = Cool|Yes) = Number of Cool and Yes / Total Yes = 3 / 9

    * P(Humidity = Normal|Yes) = Number of Normal and Yes / Total Yes = 6 / 9

    * P(Windy = True|Yes) = Number of Windy and Yes / Total Yes = 3 / 9

    * P(X) = P(Outlook = Rainy) * P(Temperature = Cool) * P(Humidity = Normal) * P(Windy = True), approximating P(X) by treating the features as independent

        = 5/14 * 4/14 * 7/14 * 6/14

        ≈ 0.0219

    * P(X|Yes) * P(Yes) = (P(Outlook = Rainy|Yes) * P(Temperature = Cool|Yes) * P(Humidity = Normal|Yes) * P(Windy = True|Yes)) * P(Yes)

        = (3/9 * 3/9 * 6/9 * 3/9) * 0.64

        = (3 * 3 * 6 * 3) / (9 * 9 * 9 * 9) * 0.64

        ≈ 0.0158

    * P(PlayTennis=Yes|X) = (P(X|Yes) * P(Yes)) / P(X)

        = 0.0158 / 0.0219

        ≈ 0.72
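
The counting in this example can be checked by brute force. The sketch below assumes the table in the figure contains the same 14 rows that are loaded into the DataFrame later in this notebook; the helper `class_score` is a name made up for illustration. It computes P(X|class) * P(class) for both classes and then normalises over the two classes, which is a different denominator from the product of marginal feature probabilities used by hand, so the final posterior need not match the hand-computed one.

```python
# Play-tennis table: (Outlook, Temperature, Humidity, Windy, PlayTennis)
rows = [
    ("Sunny", "Hot", "High", False, "No"),
    ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Rainy", "Mild", "High", False, "Yes"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Rainy", "Mild", "High", True, "No"),
]

# The question asked above: rainy, cool, normal humidity, windy
query = ("Rainy", "Cool", "Normal", True)


def class_score(label):
    """Return P(X|label) * P(label) under the naive independence assumption."""
    in_class = [r for r in rows if r[-1] == label]
    score = len(in_class) / len(rows)  # prior P(label)
    for i, value in enumerate(query):
        matches = sum(1 for r in in_class if r[i] == value)
        score *= matches / len(in_class)  # P(feature_i = value | label)
    return score


score_yes = class_score("Yes")
score_no = class_score("No")

# Normalise over both classes so the two posteriors sum to 1
posterior_yes = score_yes / (score_yes + score_no)

print(f"P(X|Yes) * P(Yes) = {score_yes:.4f}")
print(f"P(X|No)  * P(No)  = {score_no:.4f}")
print(f"P(Yes|X), normalised over classes = {posterior_yes:.4f}")
```

Dividing by the sum of the class scores guarantees that P(Yes|X) and P(No|X) add up to 1, which dividing by a product of marginal feature probabilities does not.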




In [1]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [2]:
import warnings
warnings.filterwarnings('ignore')

In [3]:
# Create a DataFrame from the given table
data = {
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast', 'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild', 'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild'],
    'Humidity': ['High', 'High', 'High', 'High', 'Normal', 'Normal', 'Normal', 'High', 'Normal', 'Normal', 'Normal', 'High', 'Normal', 'High'],
    'Windy': [False, True, False, False, False, True, True, False, False, False, True, True, False, True],
    'PlayTennis': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No']
}

df = pd.DataFrame(data)
df.head()

Unnamed: 0,Outlook,Temperature,Humidity,Windy,PlayTennis
0,Sunny,Hot,High,False,No
1,Sunny,Hot,High,True,No
2,Overcast,Hot,High,False,Yes
3,Rainy,Mild,High,False,Yes
4,Rainy,Cool,Normal,False,Yes


In [4]:
# Encode categorical variables
# Label-encode Outlook, Temperature, Humidity and PlayTennis; cast the boolean Windy column to 0/1
le_outlook = LabelEncoder()
le_temperature = LabelEncoder()
le_humidity = LabelEncoder()
le_playtennis = LabelEncoder()

df['Outlook'] = le_outlook.fit_transform(df['Outlook'])
df['Temperature'] = le_temperature.fit_transform(df['Temperature'])
df['Humidity'] = le_humidity.fit_transform(df['Humidity'])
df['Windy'] = df['Windy'].astype(int)
df['PlayTennis'] = le_playtennis.fit_transform(df['PlayTennis'])   

In [5]:
df.head()

Unnamed: 0,Outlook,Temperature,Humidity,Windy,PlayTennis
0,2,1,0,0,0
1,2,1,0,1,0
2,0,1,0,0,1
3,1,2,0,0,1
4,1,0,1,0,1
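
To read the encoded table, it helps to know which integer stands for which category. `LabelEncoder` numbers the unique labels in sorted order, so the mapping can be reproduced with plain Python. The sketch below does this for the Outlook column; the helper name `label_codes` is made up for illustration.

```python
def label_codes(values):
    """Map each unique label to an integer code in sorted order,
    mirroring how LabelEncoder numbers its classes."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


outlook = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

outlook_codes = label_codes(outlook)
print(outlook_codes)  # {'Overcast': 0, 'Rainy': 1, 'Sunny': 2}
```

In the notebook itself, the same information is available from the fitted encoder through its `classes_` attribute, for example `le_outlook.classes_`.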


In [6]:
# Separate features and target variable
X = df.drop('PlayTennis', axis=1)
y = df['PlayTennis']

In [7]:
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

In [8]:
# Train Naive Bayes classifier
model = CategoricalNB()
model.fit(X_train, y_train)

In [9]:
# Make predictions on test data and evaluate accuracy
y_pred_test = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred_test)
print(f'Test Accuracy: {accuracy:.2f}')

# Let's compute the confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred_test)
print("Confusion Matrix for Test Data:\n", cm)

# Let's compute training accuracy
y_pred_train = model.predict(X_train)
accuracy = accuracy_score(y_train, y_pred_train)
print(f'Train Accuracy: {accuracy:.2f}')

Test Accuracy: 0.83
Confusion Matrix for Test Data:
 [[1 1]
 [0 4]]
Train Accuracy: 1.00
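
The test accuracy can be read directly off the confusion matrix. The sketch below copies the matrix printed above and relies on scikit-learn's convention that rows are true classes and columns are predicted classes, with class 0 = 'No' and class 1 = 'Yes' after label encoding. The precision and recall figures are an extra illustration, not something the notebook computes.

```python
# Confusion matrix printed above: rows = true class, columns = predicted class
cm = [[1, 1],
      [0, 4]]

tn, fp = cm[0]  # true 'No':  1 predicted 'No', 1 predicted 'Yes'
fn, tp = cm[1]  # true 'Yes': 0 predicted 'No', 4 predicted 'Yes'

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision_yes = tp / (tp + fp)  # of the predicted 'Yes', how many were right
recall_yes = tp / (tp + fn)     # of the true 'Yes', how many were found

print(f"Accuracy:  {accuracy:.2f}")
print(f"Precision: {precision_yes:.2f}")
print(f"Recall:    {recall_yes:.2f}")
```

With only six test rows, a single misclassification moves accuracy by about 17 percentage points, so these figures are a rough indication at best.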


In [10]:
# Inference on new data
# What is the probability of playing tennis if Outlook is 'Sunny', Temperature is 'Mild', Humidity is 'Normal', and Windy is 0 (False)?
new_data = [[le_outlook.transform(['Sunny'])[0], le_temperature.transform(['Mild'])[0], le_humidity.transform(['Normal'])[0], 0]]
prediction = model.predict(new_data)
probability = model.predict_proba(new_data)
print("Probability of playing tennis: ", probability)
print(f'Prediction: {le_playtennis.inverse_transform(prediction)[0]} with probability of {probability[0][1]:.2f}')

Probability of playing tennis:  [[0.2816092 0.7183908]]
Prediction: Yes with probability of 0.72


### **Types of Naive Bayes Classifiers**

* **Gaussian Naïve Bayes (GaussianNB)**
    * This variant models each feature with a Gaussian distribution.
    * The model is fitted by estimating the mean and standard deviation of each feature within each class.
    * Suitable for continuous data that follows a normal (Gaussian) distribution.
    * Assumption: Features are normally distributed within each class.

* **Multinomial Naïve Bayes (MultinomialNB)**
    * This variant assumes that the features are drawn from a multinomial distribution.
    * Ideal for discrete count data, such as word counts in text.
    * Assumption: Features represent counts or frequencies.

* **Bernoulli Naïve Bayes (BernoulliNB)**
    * This variant is used with Boolean (binary) features.
    * Ideal for discrete data where only presence or absence matters.
    * Assumption: Each feature is binary (0 or 1), indicating the absence or presence of a feature.
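
The three variants differ only in how they model P(feature | class). The sketch below writes out one simplified likelihood function per variant. The function names and parameter values are hypothetical, smoothing is ignored, and the code is meant to show the shape of each model rather than to reproduce scikit-learn's implementation.

```python
import math


def gaussian_likelihood(x, mean, std):
    """Gaussian variant: density of a continuous feature value x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)


def multinomial_log_likelihood(counts, feature_probs):
    """Multinomial variant: log-likelihood of a vector of counts.

    The multinomial coefficient is omitted because it is the same for
    every class and does not affect which class scores highest.
    """
    return sum(c * math.log(p) for c, p in zip(counts, feature_probs))


def bernoulli_likelihood(x, p_present):
    """Bernoulli variant: likelihood of a binary feature (1 = present, 0 = absent)."""
    return p_present if x == 1 else 1 - p_present


# Hypothetical parameters, chosen only for illustration
g = gaussian_likelihood(1.0, mean=1.0, std=1.0)      # density at the mean
m = multinomial_log_likelihood([2, 1], [0.5, 0.5])   # two features, three counts in total
b = bernoulli_likelihood(0, p_present=0.8)           # feature absent

print(g, m, b)
```

In each case the per-feature likelihoods are multiplied together (or their logs are summed) and combined with the class prior to score a class.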

### **Applications**
<center>
    <img src="../assets/naive-bayes-application.png" width=800/>
</center>

In [11]:
weight_m = np.array([20,20,40,80])
mean_wt_m= weight_m.mean()
std_wt_m = weight_m.std()

In [12]:
weight_l = np.array([50,60,70,90,100])
mean_wt_l= weight_l.mean()
std_wt_l = weight_l.std()

In [13]:
height_m = np.array([50,60,70,90,100])
mean_ht_m= height_m.mean()
std_ht_m = height_m.std()

In [14]:
height_l = np.array([50,60,70,90,100])
mean_ht_l = height_l.mean()
std_ht_l = height_l.std()

In [15]:
import math

In [16]:
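def gaussian(mean_ht_l, std_ht_l, height_l):
    # Gaussian (normal) probability density of height_l; np.exp accepts scalars and arrays
    prob = (1 / ((2 * math.pi) ** 0.5 * std_ht_l)) * np.exp(-(height_l - mean_ht_l) ** 2 / (2 * std_ht_l ** 2))
    return prob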
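
One way the per-feature means and standard deviations computed above might be used is to evaluate the Gaussian density of a new sample under each class and compare the results. The sketch below rests on several assumptions that do not come from the notebook: the suffixes `_m` and `_l` are taken to denote two classes, the class priors are set equal, and the query point (weight 60, height 70) is made up. The helper names `gaussian_pdf` and `class_score` are hypothetical as well.

```python
import math
import statistics

# Training values copied from the cells above
weight = {"m": [20, 20, 40, 80], "l": [50, 60, 70, 90, 100]}
height = {"m": [50, 60, 70, 90, 100], "l": [50, 60, 70, 90, 100]}


def gaussian_pdf(x, mean, std):
    """Gaussian probability density of x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)


def class_score(label, w, h, prior=0.5):
    """Return prior * P(weight | label) * P(height | label)."""
    score = prior
    for x, values in ((w, weight[label]), (h, height[label])):
        # pstdev is the population standard deviation, matching NumPy's default .std()
        score *= gaussian_pdf(x, statistics.mean(values), statistics.pstdev(values))
    return score


# Made-up query point: weight 60, height 70
scores = {label: class_score(label, w=60, h=70) for label in ("m", "l")}
predicted = max(scores, key=scores.get)

print(scores)
print("Predicted class:", predicted)
```

Because the two height arrays are identical, the height feature contributes the same factor to both classes and the decision here is driven by weight alone.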
