ASSIGNMENT: NAIVE BAYES-2

1. A company conducted a survey of its employees and found that 70% of the employees use the 
company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the 
probability that an employee is a smoker given that he/she uses the health insurance plan?

The quantity asked for is a conditional probability, and the problem statement already supplies it. "40% of the employees who use the plan are smokers" means:

P(smoker|uses insurance) = 40% = 0.4

The other figure, P(uses insurance) = 70% = 0.7, is needed only for the joint probability:

P(smoker and uses insurance) = P(smoker|uses insurance) * P(uses insurance)
= 0.4 * 0.7
= 0.28

Applying the definition of conditional probability confirms the answer:

P(smoker|uses insurance) = P(smoker and uses insurance) / P(uses insurance)
= 0.28 / 0.7
= 0.4

Bayes' theorem, P(smoker|uses insurance) = P(uses insurance|smoker) * P(smoker) / P(uses insurance), cannot be evaluated here because neither P(smoker) nor P(uses insurance|smoker) is given.

Therefore, the probability that an employee is a smoker given that he/she uses the health insurance plan is 0.4, or 40%.
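
As a quick numeric check, here is a minimal sketch that reads the 40% figure in the problem as P(smoker|uses insurance), that is, the share of smokers among plan users. The variable names are illustrative only.

```python
from fractions import Fraction

# Figures from the problem statement.
p_uses = Fraction(70, 100)               # P(uses insurance)
p_smoker_given_uses = Fraction(40, 100)  # P(smoker | uses insurance): "40% of plan users are smokers"

# Joint probability via the product rule.
p_smoker_and_uses = p_smoker_given_uses * p_uses

# The definition of conditional probability recovers the given figure.
p_check = p_smoker_and_uses / p_uses

print(float(p_smoker_and_uses))  # 0.28
print(float(p_check))            # 0.4
```

Under this reading, the 70% figure cancels out of the conditional probability and matters only for the joint probability of 0.28.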

2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

Bernoulli Naive Bayes and Multinomial Naive Bayes are two popular variants of the Naive Bayes algorithm, which is a classification algorithm based on Bayes' theorem.

The main difference between the two variants lies in the type of data they are best suited for.

Bernoulli Naive Bayes is used for binary or boolean features, where each feature takes one of two values, usually represented as 0 or 1. It is commonly used for text classification tasks in which the presence or absence of certain words serves as the features. Each feature is modeled with a class-specific Bernoulli distribution, so the absence of a word also contributes evidence: a word that is usually present in a class but missing from a document lowers that class's score.

Multinomial Naive Bayes, on the other hand, is used for discrete count data, where each feature represents how often an event occurs, such as the number of times a word appears in a document. It is also commonly used for text classification tasks. The vector of feature counts is modeled with a class-specific multinomial distribution over all features, so repeated occurrences of a word add evidence while words that do not occur contribute nothing.

In summary, Bernoulli Naive Bayes is best suited for binary or boolean features, while Multinomial Naive Bayes is best suited for discrete count data. Both variants make the assumption of feature independence, which is the reason they are called "naive".
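
The difference can be seen by scoring one document under each model. The sketch below uses made-up parameters for a single class and a three-word vocabulary (all names and numbers are hypothetical) and computes the class-conditional log-likelihood both ways.

```python
import math

# Hypothetical parameters for one class and a three-word vocabulary.
vocab = ["free", "meeting", "offer"]
p_present = [0.8, 0.1, 0.6]  # Bernoulli: P(word appears at least once | class)
p_word = [0.5, 0.1, 0.4]     # Multinomial: P(a token is this word | class), sums to 1

counts = [3, 0, 1]           # word counts in one document


def bernoulli_log_likelihood(counts, p_present):
    """Uses only presence/absence; absent words contribute log(1 - p)."""
    total = 0.0
    for c, p in zip(counts, p_present):
        total += math.log(p) if c > 0 else math.log(1.0 - p)
    return total


def multinomial_log_likelihood(counts, p_word):
    """Each occurrence contributes log(p); absent words contribute nothing.

    The multinomial coefficient is omitted because it is identical for every class.
    """
    return sum(c * math.log(p) for c, p in zip(counts, p_word))


print(bernoulli_log_likelihood(counts, p_present))
print(multinomial_log_likelihood(counts, p_word))

# Doubling every count leaves the Bernoulli score unchanged
# but doubles the multinomial score.
doubled = [2 * c for c in counts]
print(bernoulli_log_likelihood(doubled, p_present))
print(multinomial_log_likelihood(doubled, p_word))
```

The Bernoulli score depends only on which words occur, while the multinomial score scales with how often they occur.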

3. How does Bernoulli Naive Bayes handle missing values?

The Bernoulli Naive Bayes model has no built-in rule for missing values, so they are handled by the implementation or in preprocessing. Some implementations, such as scikit-learn's BernoulliNB, reject inputs that contain NaN, which makes preprocessing mandatory. Here are some common approaches:

Ignore missing values: Because Naive Bayes multiplies independent per-feature likelihoods, a missing feature can simply be left out of the product, so that only the available features contribute. When the implementation requires a complete feature matrix, a common shortcut is to encode the missing value as 0 and treat the feature as absent. This is simple, but it conflates "unknown" with "absent" and biases the estimated probabilities toward absence.

Impute missing values: Another approach is to impute the missing values with reasonable estimates based on the available data. For a binary feature, the mode of the non-missing values in the same feature is the natural choice, since the mean is generally not a valid 0/1 value.

Treat missing values as a separate category: In some cases, it may be appropriate to treat "missing" as information in its own right. Because Bernoulli features are binary, this is usually done by adding a separate 0/1 indicator feature that marks whether the original value was missing, alongside the imputed feature.

It is important to note that the choice of handling missing values can affect the accuracy and robustness of the classifier, and it may depend on the specific problem and the available data. In general, it is recommended to experiment with different methods and evaluate their performance on a validation set before choosing the final approach.
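
The imputation and separate-category approaches described above can be sketched in a few lines of plain Python. The toy column and the helper names `impute_with_mode` and `missing_indicator` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Toy binary feature column; None marks a missing value (hypothetical data).
column = [1, 0, None, 1, None, 1, 0]


def impute_with_mode(values):
    """Replace None with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def missing_indicator(values):
    """Extra binary feature: 1 where the original value was missing."""
    return [1 if v is None else 0 for v in values]


imputed = impute_with_mode(column)
indicator = missing_indicator(column)
print(imputed)    # [1, 0, 1, 1, 1, 1, 0]
print(indicator)  # [0, 0, 1, 0, 1, 0, 0]
```

Feeding both the imputed column and the indicator column to the classifier lets it learn whether missingness itself is associated with a class.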

4. Can Gaussian Naive Bayes be used for multi-class classification?

Yes, Gaussian Naive Bayes can be used for multi-class classification problems, and it handles them natively. No one-vs-all or one-vs-one decomposition is required, although Gaussian Naive Bayes can serve as the base estimator in either scheme.

During training, the algorithm estimates, for each class, a prior probability and the mean and variance of every feature among the training samples of that class. This works the same way for two classes or for many.

During prediction, it computes for each class the prior multiplied by the Gaussian likelihood of every feature value, and selects the class with the highest posterior probability. Because the normalizing constant P(x) is the same for all classes, it can be ignored when comparing them.

The approach rests on two assumptions: the features are conditionally independent given the class, and each feature follows a Gaussian distribution within each class. The class probabilities are then calculated using Bayes' theorem.
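
A minimal from-scratch sketch of multi-class prediction with a single Gaussian Naive Bayes model follows. It is an illustration under simplifying assumptions (toy data, a small hypothetical `var_floor` added to each variance for numerical stability), not a reproduction of any library's implementation.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (up to a shared constant)."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Three well-separated toy classes with two features each (hypothetical data).
X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
     [5.0, 5.2], [5.1, 4.9], [4.8, 5.0],
     [9.0, 1.0], [9.2, 1.1], [8.9, 0.9]]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

model = fit_gaussian_nb(X, y)
print(predict(model, [1.0, 1.0]))  # a
print(predict(model, [5.0, 5.0]))  # b
print(predict(model, [9.0, 1.0]))  # c
```

One model holds the parameters for all three classes, and prediction is a single argmax over their posteriors.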

### PRACTICAL IMPLEMENTATION

In [4]:
import pandas as pd
import numpy as np


In [6]:
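# The file appears to have no header row, so pandas uses the first record as column names; header=None would keep it as data
df = pd.read_csv('spambase.csv')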

In [7]:
df.head()

Unnamed: 0,0,0.64,0.64.1,0.1,0.32,0.2,0.3,0.4,0.5,0.6,...,0.41,0.42,0.43,0.778,0.44,0.45,3.756,61,278,1
0,0.21,0.28,0.5,0.0,0.14,0.28,0.21,0.07,0.0,0.94,...,0.0,0.132,0.0,0.372,0.18,0.048,5.114,101,1028,1
1,0.06,0.0,0.71,0.0,1.23,0.19,0.19,0.12,0.64,0.25,...,0.01,0.143,0.0,0.276,0.184,0.01,9.821,485,2259,1
2,0.0,0.0,0.0,0.0,0.63,0.0,0.31,0.63,0.31,0.63,...,0.0,0.137,0.0,0.137,0.0,0.0,3.537,40,191,1
3,0.0,0.0,0.0,0.0,0.63,0.0,0.31,0.63,0.31,0.63,...,0.0,0.135,0.0,0.135,0.0,0.0,3.537,40,191,1
4,0.0,0.0,0.0,0.0,1.85,0.0,0.0,1.85,0.0,0.0,...,0.0,0.223,0.0,0.0,0.0,0.0,3.0,15,54,1


In [8]:
X= df.iloc[:,:-1]

In [9]:
X.head()

Unnamed: 0,0,0.64,0.64.1,0.1,0.32,0.2,0.3,0.4,0.5,0.6,...,0.40,0.41,0.42,0.43,0.778,0.44,0.45,3.756,61,278
0,0.21,0.28,0.5,0.0,0.14,0.28,0.21,0.07,0.0,0.94,...,0.0,0.0,0.132,0.0,0.372,0.18,0.048,5.114,101,1028
1,0.06,0.0,0.71,0.0,1.23,0.19,0.19,0.12,0.64,0.25,...,0.0,0.01,0.143,0.0,0.276,0.184,0.01,9.821,485,2259
2,0.0,0.0,0.0,0.0,0.63,0.0,0.31,0.63,0.31,0.63,...,0.0,0.0,0.137,0.0,0.137,0.0,0.0,3.537,40,191
3,0.0,0.0,0.0,0.0,0.63,0.0,0.31,0.63,0.31,0.63,...,0.0,0.0,0.135,0.0,0.135,0.0,0.0,3.537,40,191
4,0.0,0.0,0.0,0.0,1.85,0.0,0.0,1.85,0.0,0.0,...,0.0,0.0,0.223,0.0,0.0,0.0,0.0,3.0,15,54


In [10]:
y= df.iloc[:,-1]

In [11]:
y.head()

0    1
1    1
2    1
3    1
4    1
Name: 1, dtype: int64

In [12]:
df.shape

(4600, 58)

In [13]:
from sklearn.model_selection import train_test_split

In [15]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.33,random_state= 43)

In [23]:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Split the data into training and testing sets (80/20); this overrides the 67/33 split made in the earlier cell
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create Bernoulli Naive Bayes classifier and fit on the training set (by default features are binarized at 0, so only presence/absence is used)
bnb = BernoulliNB()
bnb.fit(X_train, y_train)

# Make predictions on the testing set and compute performance metrics for Bernoulli Naive Bayes
bnb_y_pred = bnb.predict(X_test)
bnb_accuracy = accuracy_score(y_test, bnb_y_pred)
bnb_precision = precision_score(y_test, bnb_y_pred)
bnb_recall = recall_score(y_test, bnb_y_pred)
bnb_f1_score = f1_score(y_test, bnb_y_pred)

# Create Multinomial Naive Bayes classifier and fit on the training set
mnb = MultinomialNB()
mnb.fit(X_train, y_train)

# Make predictions on the testing set and compute performance metrics for Multinomial Naive Bayes
mnb_y_pred = mnb.predict(X_test)
mnb_accuracy = accuracy_score(y_test, mnb_y_pred)
mnb_precision = precision_score(y_test, mnb_y_pred)
mnb_recall = recall_score(y_test, mnb_y_pred)
mnb_f1_score = f1_score(y_test, mnb_y_pred)

# Create Gaussian Naive Bayes classifier and fit on the training set
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Make predictions on the testing set and compute performance metrics for Gaussian Naive Bayes
gnb_y_pred = gnb.predict(X_test)
gnb_accuracy = accuracy_score(y_test, gnb_y_pred)
gnb_precision = precision_score(y_test, gnb_y_pred)
gnb_recall = recall_score(y_test, gnb_y_pred)
gnb_f1_score = f1_score(y_test, gnb_y_pred)

# Print the performance metrics for each classifier on the testing set
print("Bernoulli Naive Bayes:")
print("Accuracy:", bnb_accuracy)
print("Precision:", bnb_precision)
print("Recall:", bnb_recall)
print("F1 score:", bnb_f1_score)
print()

print("Multinomial Naive Bayes:")
print("Accuracy:", mnb_accuracy)
print("Precision:", mnb_precision)
print("Recall:", mnb_recall)
print("F1 score:", mnb_f1_score)
print()

print("Gaussian Naive Bayes:")
print("Accuracy:", gnb_accuracy)
print("Precision:", gnb_precision)
print("Recall:", gnb_recall)
print("F1 score:", gnb_f1_score)



Bernoulli Naive Bayes:
Accuracy: 0.8728260869565218
Precision: 0.8933717579250721
Recall: 0.7948717948717948
F1 score: 0.841248303934871

Multinomial Naive Bayes:
Accuracy: 0.7663043478260869
Precision: 0.7450980392156863
Recall: 0.6820512820512821
F1 score: 0.7121820615796519

Gaussian Naive Bayes:
Accuracy: 0.8152173913043478
Precision: 0.7115384615384616
Recall: 0.9487179487179487
F1 score: 0.8131868131868132
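
To help read these numbers, recall that the F1 score is the harmonic mean of precision and recall. The sketch below recomputes F1 from the precision and recall values printed above; the dictionary and the helper name `f1_from` are illustrative.

```python
import math


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the output above.
reported = {
    "Bernoulli": (0.8933717579250721, 0.7948717948717948, 0.841248303934871),
    "Multinomial": (0.7450980392156863, 0.6820512820512821, 0.7121820615796519),
    "Gaussian": (0.7115384615384616, 0.9487179487179487, 0.8131868131868132),
}

for name, (precision, recall, f1) in reported.items():
    recomputed = f1_from(precision, recall)
    matches = math.isclose(recomputed, f1, rel_tol=1e-9)
    print(f"{name}: recomputed F1 = {recomputed:.4f}, matches reported = {matches}")
```

In this run, Bernoulli Naive Bayes has the highest accuracy and F1 score. Gaussian Naive Bayes has the highest recall (about 0.95) but the lowest precision (about 0.71), so it catches most spam at the cost of more false alarms. Multinomial Naive Bayes scores lowest on accuracy, recall, and F1.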
