# Q1.
### A company conducted a survey of its employees and found that 70% of the employees use the company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the probability that an employee is a smoker given that he/she uses the health insurance plan?

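S = the event that an employee is a smoker.

H = the event that an employee uses the health insurance plan.

P(S|H) = probability that an employee is a smoker given that they use the health insurance plan.

P(H) = 0.7

The statement "40% of the employees who use the plan are smokers" is already the conditional probability, so P(H) is not needed for the answer:

P(S|H) = 0.4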
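
As a quick numeric check of the quantities above, the sketch below derives the joint probability P(S ∩ H) from the two given values and then recovers P(S|H) from the definition of conditional probability. The variable names are illustrative and not part of the original answer.

```python
# Given values from the question.
p_h = 0.7          # P(H): employee uses the health insurance plan
p_s_given_h = 0.4  # P(S|H): employee smokes, given they use the plan

# Joint probability via the multiplication rule: P(S and H) = P(S|H) * P(H).
p_s_and_h = p_s_given_h * p_h

# Definition of conditional probability: P(S|H) = P(S and H) / P(H).
recovered = p_s_and_h / p_h

print(round(p_s_and_h, 2))
print(round(recovered, 2))
```

Dividing the joint probability by P(H) returns the same 0.4 that the question states directly.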

# Q2.
### What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

- Bernoulli Naive Bayes: Assumes that features are binary (e.g., presence or absence of a feature).
- Multinomial Naive Bayes: Suitable for features that represent counts or frequencies (e.g., word counts in text classification).
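
The distinction above can be illustrated with a small sketch, assuming scikit-learn's `BernoulliNB` and `MultinomialNB` with their default settings. The toy count matrix is made up for illustration. `BernoulliNB` binarizes its input (threshold 0 by default), so it learns the same parameters from raw counts and from presence/absence indicators, whereas `MultinomialNB` uses the magnitude of the counts.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical toy data: rows are documents, columns are word counts.
X_counts = np.array([[3, 0], [0, 2], [1, 0], [0, 5]])
y = np.array([1, 0, 1, 0])

# Presence/absence version of the same data.
X_binary = (X_counts > 0).astype(int)

# BernoulliNB only looks at whether a feature is present.
bnb_counts = BernoulliNB().fit(X_counts, y)
bnb_binary = BernoulliNB().fit(X_binary, y)
same_bernoulli = np.allclose(
    bnb_counts.feature_log_prob_, bnb_binary.feature_log_prob_
)

# MultinomialNB is sensitive to how often a feature occurs.
mnb_counts = MultinomialNB().fit(X_counts, y)
mnb_binary = MultinomialNB().fit(X_binary, y)
same_multinomial = np.allclose(
    mnb_counts.feature_log_prob_, mnb_binary.feature_log_prob_
)

print("Bernoulli parameters identical:", same_bernoulli)
print("Multinomial parameters identical:", same_multinomial)
```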

# Q3.
### How does Bernoulli Naive Bayes handle missing values?

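- Bernoulli Naive Bayes has no built-in handling for missing values (the scikit-learn implementation rejects inputs that contain NaN), so they must be dealt with before training, for example by replacing them with a default value or by imputation. Because the features are binary, most-frequent (mode) imputation usually fits better than mean imputation, which produces non-binary values.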
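
One way to apply imputation before the classifier is sketched below, assuming scikit-learn's `SimpleImputer` with the `most_frequent` strategy chained to `BernoulliNB` in a pipeline. The tiny binary dataset is hypothetical and only serves to show the mechanics.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical binary features with some missing entries (np.nan).
X = np.array([
    [1, 0, 1],
    [1, np.nan, 0],
    [0, 1, np.nan],
    [0, 1, 0],
    [np.nan, 0, 1],
    [0, 1, 0],
], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0])

# Fill each missing value with the most frequent value of its column,
# then fit Bernoulli Naive Bayes on the completed data.
model = make_pipeline(SimpleImputer(strategy="most_frequent"), BernoulliNB())
model.fit(X, y)

# The first pipeline step is the fitted imputer.
X_imputed = model[0].transform(X)
predictions = model.predict(X)

print(X_imputed)
print(predictions)
```

Keeping the imputer inside the pipeline means the fill values are learned from the training data only, which matters once cross-validation or a train/test split is involved.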

# Q4.
### Can Gaussian Naive Bayes be used for multi-class classification?

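- Yes, Gaussian Naive Bayes can be used for multi-class classification. It assumes that, within each class, every feature follows a Gaussian (normal) distribution, estimates a mean and variance per class and feature, and calculates the conditional probability of each class given the feature values. The predicted class is the one with the highest probability, regardless of how many classes there are.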
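
A minimal sketch of the multi-class case, assuming scikit-learn's bundled Iris dataset (three classes) as a stand-in. The dataset choice and split settings are illustrative and are not part of the original answer.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Iris has three target classes (0, 1, 2) and four continuous features.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10, stratify=y
)

gnb = GaussianNB().fit(X_train, y_train)

# One probability column per class; each row sums to 1.
proba = gnb.predict_proba(X_test)
y_pred = gnb.predict(X_test)

print(gnb.classes_)
print(proba[:3].round(3))
```

The same `fit` and `predict` calls are used for two classes or many; no one-vs-rest wrapper is needed.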

# Q5.

In [34]:
import warnings
warnings.filterwarnings('ignore')

In [3]:
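import pandas as pd
# Note: spambase.data has no header row. Without header=None, pandas uses the
# first record as column names, which is why the columns below look like data values.
data = pd.read_csv('spambase.data')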

In [4]:
data.head()

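| | 0 | 0.64 | 0.64.1 | 0.1 | 0.32 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | ... | 0.41 | 0.42 | 0.43 | 0.778 | 0.44 | 0.45 | 3.756 | 61 | 278 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.21 | 0.28 | 0.5 | 0.0 | 0.14 | 0.28 | 0.21 | 0.07 | 0.0 | 0.94 | ... | 0.0 | 0.132 | 0.0 | 0.372 | 0.18 | 0.048 | 5.114 | 101 | 1028 | 1 |
| 1 | 0.06 | 0.0 | 0.71 | 0.0 | 1.23 | 0.19 | 0.19 | 0.12 | 0.64 | 0.25 | ... | 0.01 | 0.143 | 0.0 | 0.276 | 0.184 | 0.01 | 9.821 | 485 | 2259 | 1 |
| 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.63 | 0.0 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.137 | 0.0 | 0.137 | 0.0 | 0.0 | 3.537 | 40 | 191 | 1 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.63 | 0.0 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.135 | 0.0 | 0.135 | 0.0 | 0.0 | 3.537 | 40 | 191 | 1 |
| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 1.85 | 0.0 | 0.0 | 1.85 | 0.0 | 0.0 | ... | 0.0 | 0.223 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 15 | 54 | 1 |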


In [5]:
# All columns except the last are features; the last column is the label (1 = spam, 0 = not spam).
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

In [6]:
from sklearn.model_selection import train_test_split
X_train , X_test , y_train , y_test = train_test_split(X,y,test_size = 0.25 , random_state = 10)

In [10]:
from sklearn.naive_bayes import BernoulliNB , GaussianNB , MultinomialNB

In [11]:
bnb = BernoulliNB()

In [12]:
bnb.fit(X_train,y_train)

In [13]:
y_pred = bnb.predict(X_test)
y_pred

array([1, 0, 0, ..., 1, 0, 1])

In [24]:
from sklearn.model_selection import GridSearchCV

In [29]:
# alpha is the additive (Laplace/Lidstone) smoothing strength; 0.0 means no smoothing.
params = {'alpha': [0.0, 0.5, 1.0]}

In [30]:
grid_b = GridSearchCV(bnb,params , cv = 10)

In [35]:
grid_b.fit(X_train,y_train)

In [36]:
grid_b.best_params_

{'alpha': 0.0}

In [37]:
y_pred = grid_b.predict(X_test)
y_pred

array([1, 0, 0, ..., 1, 0, 1])

In [38]:
from sklearn.metrics import accuracy_score , precision_score , recall_score , f1_score

In [39]:
print(accuracy_score(y_test,y_pred))

0.8956521739130435


In [40]:
print(precision_score(y_test,y_pred))

0.9066339066339066


In [41]:
print(recall_score(y_test,y_pred))

0.8181818181818182


In [42]:
print(f1_score(y_test,y_pred))

0.8601398601398603


In [46]:
mnb = MultinomialNB()

In [47]:
mnb.fit(X_train,y_train)

In [48]:
params_m = {'alpha' : [0.0,0.5,1.0]}

In [50]:
grid_m = GridSearchCV(mnb,params_m,cv=10)

In [51]:
grid_m.fit(X_train,y_train)

In [52]:
y_pred = grid_m.predict(X_test)
y_pred

array([1, 1, 0, ..., 1, 0, 1])

In [53]:
print(accuracy_score(y_test,y_pred))
print(precision_score(y_test,y_pred))
print(recall_score(y_test,y_pred))
print(f1_score(y_test,y_pred))

0.7904347826086957
0.7333333333333333
0.7317073170731707
0.732519422863485


In [54]:
gnb = GaussianNB()

In [55]:
gnb.fit(X_train,y_train)

In [56]:
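# No hyperparameters are tuned for GaussianNB; the empty grid only runs
# 10-fold cross-validation with the default settings.
params_g = {}
grid_g = GridSearchCV(gnb, params_g, cv=10)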

In [58]:
grid_g.fit(X_train,y_train)

In [59]:
y_pred = grid_g.predict(X_test)

In [60]:
print(accuracy_score(y_test,y_pred))
print(precision_score(y_test,y_pred))
print(recall_score(y_test,y_pred))
print(f1_score(y_test,y_pred))

0.8191304347826087
0.6944
0.9623059866962306
0.8066914498141264


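Bernoulli Naive Bayes performed best overall among the three variants tested, with the highest accuracy (0.896), precision (0.907), and F1 score (0.860). Gaussian Naive Bayes reached the highest recall (0.962) but the lowest precision (0.694), and Multinomial Naive Bayes had the lowest accuracy (0.790) and F1 score (0.733).
A likely reason lies in the nature of the dataset and the assumptions underlying each model: most Spambase features are word and character frequencies, so whether a term appears at all (which is what BernoulliNB uses after binarizing each feature at 0) may carry most of the signal, while the skewed, zero-heavy feature values fit the Gaussian assumption poorly.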
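
The side-by-side comparison above can also be produced in one pass. The sketch below is a hypothetical helper (`compare_nb_variants` is not part of the original notebook) that fits the three variants with default settings and collects the same four metrics. It runs on synthetic Poisson count data purely as a placeholder, so its numbers will not match the Spambase results reported above.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB


def compare_nb_variants(X_train, X_test, y_train, y_test):
    """Fit each Naive Bayes variant and return its test-set metrics."""
    models = {
        "BernoulliNB": BernoulliNB(),
        "MultinomialNB": MultinomialNB(),
        "GaussianNB": GaussianNB(),
    }
    results = {}
    for name, model in models.items():
        y_pred = model.fit(X_train, y_train).predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return results


# Synthetic placeholder data (assumption): non-negative counts whose
# rates depend on the class, so MultinomialNB can be fitted as well.
rng = np.random.default_rng(10)
y = rng.integers(0, 2, size=400)
rates = np.where(y[:, None] == 1, [3.0, 0.5, 1.0], [0.5, 3.0, 1.0])
X = rng.poisson(rates)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10
)
results = compare_nb_variants(X_train, X_test, y_train, y_test)
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

To reuse it with the notebook's data, pass the `X_train`, `X_test`, `y_train`, and `y_test` produced by the earlier split in place of the synthetic arrays.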