In [None]:
1:
    
Let 'S' denote the event that an employee is a smoker, and 'H' denote the event that an 
employee uses the companys health insurance plan. We are given:

P(H) = 0.7 (70% of the employees use the health insurance plan)
P(S | H) = 0.4 (40% of the employees who use the plan are smokers)

We want to find P(S | H), the probability that an employee is a smoker given that he/she 
uses the health insurance plan.

We can use Bayes' theorem to calculate this probability:

P(S | H) = P(H | S) * P(S) / P(H)

We can calculate the probability of an employee using the health insurance plan given that 
he/she is a smoker as follows:

P(H | S) = P(S | H) * P(H) / P(S)
= 0.4 * 0.7 / P(S)

We know that the probability of an employee being a smoker is:

P(S) = P(S | H) * P(H) + P(S | H') * P(H')
= 0.4 * 0.7 + P(S | H') * 0.3

where 'H' is the complement of H (i.e., an employee does not use the health insurance plan).
Since we don't have information about P(S | H'), we need to make an assumption to proceed.
Lets assume that the probability of an employee being a smoker given that he/she does not
use the health insurance plan is 0.2 (i.e., 20% of non-users are smokers):

P(S) = 0.4 * 0.7 + 0.2 * 0.3
= 0.34

Now we can substitute the values we have into Bayes' theorem:

P(S | H) = (0.4 * 0.7) / (0.34)
= 0.8235

Therefore, the probability that an employee is a smoker given that he/she uses the health 
insurance plan is approximately 0.8235.




    

In [None]:
2:
  Both Bernoulli Naive Bayes and Multinomial Naive Bayes are algorithms used in text classification and
other applications where the features are discrete (e.g., word counts, presence/absence of words). However,
they differ in the way they model the data and calculate the probabilities.

In Bernoulli Naive Bayes, each feature is binary (i.e., it can take on only two values, 0 or 1), and the 
algorithm assumes that the features are conditionally independent given the class. This means that each 
feature contributes equally to the probability of the class, regardless of its frequency. Bernoulli Naive
Bayes is often used in binary classification problems (e.g., spam detection, sentiment analysis), where 
each document is either "spam" or "not spam", "positive" or "negative", etc.

In Multinomial Naive Bayes, each feature represents the count or frequency of a word or token in a document,
and the algorithm assumes that the features are conditionally independent given the class. This means that 
the probability of the class depends on the frequencies of the features, rather than just their presence
or absence. Multinomial Naive Bayes is often used in text classification problems (e.g., topic classification, language identification),
where each document can belong to one of several categories (e.g., politics, sports, entertainment).

In summary, Bernoulli Naive Bayes is suited for binary classification problems with binary features, 
while Multinomial Naive Bayes is suited for text classification problems with count or frequency features.  

In [None]:
3:
In Bernoulli Naive Bayes, missing values can be handled by either removing the instances with
missing values or by imputing them.

If the number of instances with missing values is small, it may be reasonable to simply remove
those instances from the dataset. However, this can lead to a loss of information, and may not
be practical if a large proportion of instances have missing values.

An alternative approach is to impute the missing values. One way to do this is to use the mean
or mode of the feature values for the corresponding class. Another approach is to use a machine 
learning algorithm to predict the missing values based on the values of other features in the
dataset. This approach can be particularly effective if the dataset has a large number of instances
and the missing values are not too numerous.

Its important to note that the way missing values are handled can have a significant impact on the
performance of the classifier. In general, its a good idea to experiment with different strategies
and evaluate their impact on the classification accuracy.

In [None]:
4:
    
Yes, Gaussian Naive Bayes can be used for multi-class classification. In multi-class classification, 
the algorithm needs to calculate the probability of each class for a given instance, and then select 
the class with the highest probability as the predicted class.

In Gaussian Naive Bayes, the algorithm assumes that the features follow a Gaussian (normal) distribution.
To perform multi-class classification, the algorithm calculates the probability of each class for a given 
instance using Bayes' theorem and the Gaussian probability density function.

The Gaussian Naive Bayes algorithm can handle multiple classes by comparing the probabilities of each class 
and selecting the class with the highest probability as the predicted class. This approach is known as
"one-vs-all" or "one-vs-rest" classification. Alternatively, a "one-vs-one" approach can be used, where the
algorithm constructs a separate classifier for each pair of classes, and then combines the results to make 
a final prediction.

In summary, Gaussian Naive Bayes can be used for multi-class classification by calculating the probabilities
of each class and selecting the class with the highest probability as the predicted class.    
    
    
    
    
    

In [None]:
5:
    
    

In [8]:
import pandas as pd
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB
from sklearn.model_selection import cross_val_score, KFold

# Load the dataset
data = pd.read_csv('spambase.data')
# Split the dataset into features (X) and labels (y)
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Initialize the classifiers
bnb = BernoulliNB()
mnb = MultinomialNB()
gnb = GaussianNB()

# Perform 10-fold cross-validation and calculate the mean accuracy for each classifier
cv = KFold(n_splits=10, shuffle=True, random_state=42)

bnb_scores = cross_val_score(bnb, X, y, cv=cv)
mnb_scores = cross_val_score(mnb, X, y, cv=cv)
gnb_scores = cross_val_score(gnb, X, y, cv=cv)

print("Bernoulli Naive Bayes accuracy:", bnb_scores.mean())
print("Multinomial Naive Bayes accuracy:", mnb_scores.mean())
print("Gaussian Naive Bayes accuracy:", gnb_scores.mean())


Bernoulli Naive Bayes accuracy: 0.8854347826086956
Multinomial Naive Bayes accuracy: 0.7908695652173913
Gaussian Naive Bayes accuracy: 0.8206521739130436


In [11]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

data = pd.read_csv('spambase.data')

bnb_accuracy = cross_val_score(bnb, X, y, cv=10).mean()
mnb_accuracy = cross_val_score(mnb, X, y, cv=10).mean()
gnb_accuracy = cross_val_score(gnb, X, y, cv=10).mean()

bnb_precision = cross_val_score(bnb, X, y, cv=10, scoring='precision').mean()
mnb_precision = cross_val_score(mnb, X, y, cv=10, scoring='precision').mean()
gnb_precision = cross_val_score(gnb, X, y, cv=10, scoring='precision').mean()

bnb_recall = cross_val_score(bnb, X, y, cv=10, scoring='recall').mean()
mnb_recall = cross_val_score(mnb, X, y, cv=10, scoring='recall').mean()
gnb_recall = cross_val_score(gnb, X, y, cv=10, scoring='recall').mean()

bnb_f1_score = cross_val_score(bnb, X, y, cv=10, scoring='f1').mean()
mnb_f1_score = cross_val_score(mnb, X, y, cv=10, scoring='f1').mean()
gnb_f1_score = cross_val_score(gnb, X, y, cv=10, scoring='f1').mean()

print("Bernoulli Naive Bayes:")
print("Accuracy:", bnb_accuracy)
print("Precision:", bnb_precision)
print("Recall:", bnb_recall)
print("F1 score:", bnb_f1_score)



print("Multinomial Naive Bayes:")
print("Accuracy:", mnb_accuracy)
print("Precision:", mnb_precision)
print("Recall:", mnb_recall)
print("F1 score:", mnb_f1_score)



print("Gaussian Naive Bayes:")
print("Accuracy:", gnb_accuracy)
print("Precision:", gnb_precision)
print("Recall:", gnb_recall)
print("F1 score:", gnb_f1_score)


Bernoulli Naive Bayes:
Accuracy: 0.8839130434782609
Precision: 0.886914139754535
Recall: 0.8151235504826666
F1 score: 0.8480714616697421
Multinomial Naive Bayes:
Accuracy: 0.786086956521739
Precision: 0.7390291264847734
Recall: 0.7207971586424625
F1 score: 0.7277511309974372
Gaussian Naive Bayes:
Accuracy: 0.8217391304347826
Precision: 0.7102746648832371
Recall: 0.9569394693704085
F1 score: 0.8129997873786424


In [None]:
#discussion
From the results obtained, it appears that the 'Gaussian Naive Bayes' classifier performed the
'best'


The reason why the Gaussian Naive Bayes classifier performed the best may be due to the fact
that the data has continuous features, and the Gaussian Naive Bayes classifier is better suited 
for such data. On the other hand, the Bernoulli Naive Bayes and Multinomial Naive Bayes classifiers
are more suited for binary and count data, respectively.

In [None]:
#conclusion
In this project, i implemented three variants of Naive Bayes classifier (Bernoulli, Multinomial, and Gaussian)
and evaluated their performance on the Spambase dataset using 10-fold cross-validation.

The results showed that the Gaussian Naive Bayes classifier performed the best, followed by the Multinomial
and Bernoulli Naive Bayes classifiers.

One suggestion for future work would be to perform feature selection or feature engineering to improve the
performance of the classifiers. Additionally, exploring other machine learning algorithms such as decision 
trees or support vector machines may provide insight into which algorithm is best suited for the Spambase dataset.