# Naive Bayes Algorithm :
The Naive Bayes algorithm is a family of simple yet powerful probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence
assumptions between the features. It's particularly popular for text classification tasks such as spam detection, sentiment analysis, and document
categorization.

There are several types of Naive Bayes classifiers, depending on the nature of the feature data:

Gaussian Naive Bayes: Assumes that the features follow a normal (Gaussian) distribution. It's used for continuous data. Multinomial Naive Bayes: Used for
discrete data, particularly in text classification where features represent word frequencies. Bernoulli Naive Bayes: Assumes binary features (Os and 1s), used
for binary/boolean features, such as in text classification tasks where the presence or absence of a word is considered. Steps of Naive Bayes Algorithm
Training Phase:

Calculate the prior probability for each class. Calculate the likelihood for each feature given each class. If using Gaussian Naive Bayes, calculate the mean
and variance of the features for each class.

In [1]:
import numpy as np
import pandas as pd

In [12]:
df = pd.read_csv("C:\\Users\\Saurabh\\Downloads\\Social_Network_Ads.csv")

In [13]:
df.head(3)

Unnamed: 0,User ID,Gender,Age,EstimatedSalary,Purchased
0,15624510,Male,19,19000,0
1,15810944,Male,35,20000,0
2,15668575,Female,26,43000,0


In [14]:
df=df.drop(columns=(['User ID','Gender']))

In [15]:
df.sample(3)

Unnamed: 0,Age,EstimatedSalary,Purchased
240,42,149000,1
131,33,31000,0
296,42,73000,1


In [16]:
x = df.drop(columns = ['Purchased']) # Independent column
y = df['Purchased']  # Dependent column

In [17]:
from sklearn.model_selection import train_test_split

In [18]:
x_train, x_test, y_train, y_test = train_test_split(x, y, 
                                                   test_size = 0.2,
                                                   random_state = 23)

In [19]:
from sklearn.preprocessing import StandardScaler

In [20]:
sc = StandardScaler()

In [21]:
x_train_new = sc.fit_transform(x_train)

In [34]:
x_test_new = sc.fit_transform(x_test)

In [35]:
from sklearn.naive_bayes import GaussianNB, MultinomialNB,BernoulliNB

In [36]:
classifier=GaussianNB()

In [37]:
classifier.fit(x_train_new, y_train)

In [38]:
y_pred = classifier.predict(x_test_new)

In [39]:
y_pred

array([0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0,
       1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0])

In [40]:
from sklearn.metrics import confusion_matrix

In [41]:
cn = confusion_matrix(y_test, y_pred)

In [42]:
cn

array([[48,  2],
       [ 5, 25]])

In [45]:
from sklearn.metrics import accuracy_score

In [46]:
accuracy_score(y_test, y_pred)

0.9125