In [None]:
# Q1. A company conducted a survey of its employees and found that 70% of the employees use the
# company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the
# probability that an employee is a smoker given that he/she uses the health insurance plan?
# Answer :-
# To find the probability that an employee is a smoker given that he/she uses the health insurance plan, you can use conditional probability. The conditional probability of event A (being a smoker) given event B (using the health insurance plan) is denoted by P(A|B) and is calculated using the formula:

# P(A|B) = P(A∩B) / P(B)

# In this case:

# Event A is being a smoker.
# Event B is using the health insurance plan.

# You're given:

# P(B), the probability of using the health insurance plan, which is 70% (0.7).
# P(A∩B), the probability of being a smoker and using the health insurance plan, which is 40% of the employees who use the plan, i.e., 40% of 70% (0.4 × 0.7 = 0.28).

# Now, substitute these values into the formula:


# P(A|B) = (0.4 × 0.7) / 0.7

# Simplify the expression:

# P(A∣B)=0.4

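# So, the probability that an employee is a smoker given that he/she uses the health insurance plan is 0.4 or 40%. This agrees with the problem statement: "40% of the employees who use the plan are smokers" is itself the conditional probability P(A|B), so the 70% figure cancels out.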
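
# The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the variable
# names are illustrative and not part of the original question.

```python
# Probability that an employee uses the health insurance plan: P(B)
p_plan = 0.7

# Share of plan users who smoke, as stated in the question
share_of_plan_users_who_smoke = 0.4

# Joint probability of smoking and using the plan: P(A ∩ B)
p_smoker_and_plan = share_of_plan_users_who_smoke * p_plan

# Conditional probability: P(A | B) = P(A ∩ B) / P(B)
p_smoker_given_plan = p_smoker_and_plan / p_plan

print(f"P(A ∩ B) = {p_smoker_and_plan:.2f}")
print(f"P(A | B) = {p_smoker_given_plan:.2f}")
```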

In [None]:
# Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?
# Answer :-
# Bernoulli Naive Bayes and Multinomial Naive Bayes are two variants of the Naive Bayes algorithm, which is a probabilistic classification algorithm based on Bayes' theorem. The main difference between them lies in the type of data they are designed to handle.

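# Bernoulli Naive Bayes:

# Data Type: It is suitable for binary data, where each feature represents a binary variable (0 or 1).
# Likelihood: Each feature is modelled as a Bernoulli variable, so the absence of a feature (a 0) also contributes evidence to the prediction.
# Use Case: Commonly used in text classification tasks, where only the presence or absence of a word in a document is considered.

# Multinomial Naive Bayes:

# Data Type: It is designed for discrete data, typically for cases where features represent the frequency of occurrences (counts) of events.
# Likelihood: The feature vector is modelled as a draw from a multinomial distribution, so how often a feature occurs matters, not just whether it occurs.
# Use Case: Widely used in text classification, particularly when the features are the word counts (or term frequencies) within documents.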
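
# The difference in data type can be illustrated on a tiny, made-up word-count matrix. This is only a
# sketch: the matrix and labels are hypothetical. It relies on scikit-learn's BernoulliNB binarizing its
# input at the default threshold (binarize=0.0), so counts and presence/absence flags give the same model.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical word-count matrix: rows are documents, columns are vocabulary terms.
word_counts = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 4, 1],
    [0, 1, 2],
])
doc_labels = np.array([1, 1, 0, 0])

# MultinomialNB models how often each term occurs.
mnb = MultinomialNB().fit(word_counts, doc_labels)

# BernoulliNB thresholds its input, so it only sees whether each term is present.
bnb_from_counts = BernoulliNB().fit(word_counts, doc_labels)
bnb_from_presence = BernoulliNB().fit((word_counts > 0).astype(int), doc_labels)

# Both Bernoulli models should end up with the same per-feature log probabilities.
same_bernoulli_model = np.allclose(
    bnb_from_counts.feature_log_prob_, bnb_from_presence.feature_log_prob_
)
print("BernoulliNB ignores count magnitudes:", same_bernoulli_model)
```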

In [None]:
# Q3. How does Bernoulli Naive Bayes handle missing values?
# Answer :-
# Bernoulli Naive Bayes has no built-in mechanism for missing values. The model estimates, for each class, the probability that each binary feature equals 1, and it needs an observed 0 or 1 for every feature to do so.

# In scikit-learn, BernoulliNB rejects inputs that contain NaN and raises a ValueError, so missing values must be handled before training and prediction. Common options are:

# Imputation: Replace missing entries with the most frequent value of the feature, or with 0 if "missing" can reasonably be read as "absent".
# Indicator features: Add a binary column that flags whether the original value was missing, so the model can learn from the missingness itself.
# Dropping: Remove rows or features with many missing values when little information is lost.

# In a custom implementation, the conditional independence assumption does make it possible to skip a missing feature: the likelihood is a product of per-feature terms, so the term for the missing feature can be left out. This is a property of the factorized likelihood, not something standard library implementations do automatically.

# Note that treating a missing value as 0 is not neutral. Bernoulli Naive Bayes explicitly models the absence of a feature, so a 0 contributes evidence to the prediction.
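
# In scikit-learn, BernoulliNB does not accept NaN inputs, so one option is to impute before fitting.
# The sketch below assumes most-frequent-value imputation is acceptable for the data at hand. The toy
# matrix and the variable names are hypothetical.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical binary feature matrix with missing entries marked as np.nan.
X_missing = np.array([
    [1.0, 0.0, np.nan],
    [1.0, np.nan, 1.0],
    [0.0, 1.0, 0.0],
    [np.nan, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
y_missing = np.array([1, 1, 0, 0, 1, 0])

# Fitting directly on NaN values is rejected by scikit-learn's input validation.
try:
    BernoulliNB().fit(X_missing, y_missing)
    raised_on_nan = False
except ValueError:
    raised_on_nan = True

# Impute each column with its most frequent value, then fit the classifier.
imputed_nb = make_pipeline(SimpleImputer(strategy="most_frequent"), BernoulliNB())
imputed_nb.fit(X_missing, y_missing)
predictions_imputed = imputed_nb.predict(X_missing)

print("Direct fit raised ValueError:", raised_on_nan)
print("Predictions after imputation:", predictions_imputed)
```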

In [None]:
# Q4. Can Gaussian Naive Bayes be used for multi-class classification?
# Answer :-
# Yes, Gaussian Naive Bayes can be used for multi-class classification. Gaussian Naive Bayes is the variant of the Naive Bayes algorithm designed to handle continuous data. It assumes that each feature follows a Gaussian (normal) distribution within each class.

# In the context of multi-class classification, where there are more than two classes, Gaussian Naive Bayes can still be applied. The algorithm calculates the likelihood of the observed data given each class using the Gaussian distribution parameters (mean and variance) for each feature within each class.

# The decision rule for classification involves selecting the class with the highest posterior probability, which is proportional to the product of the prior probability of the class and the likelihood of the observed data given that class. The prior probability represents the probability of each class without considering the features.

# To summarize, Gaussian Naive Bayes handles multiple classes without any modification: it estimates a prior and a set of Gaussian parameters (mean and variance per feature) for every class, then applies the standard Naive Bayes decision rule. The class with the highest posterior probability is predicted for a given set of feature values.
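
# The decision rule described above can be sketched from scratch for three classes. This is an
# illustrative implementation, not scikit-learn's GaussianNB; the helper names, the toy data, and the small
# variance floor (1e-9) are assumptions made for this example.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a prior and per-feature mean/variance for every class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Variance floor avoids division by zero (an assumption of this sketch).
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the largest log prior plus summed log likelihoods."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))

# Hypothetical two-feature data with three well-separated classes.
toy_samples = [
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
    [5.0, 5.2], [5.1, 4.9], [4.9, 5.0],
    [9.0, 1.0], [9.2, 1.1], [8.9, 0.9],
]
toy_labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

toy_model = fit_gaussian_nb(toy_samples, toy_labels)
toy_predictions = [
    predict_gaussian_nb(toy_model, point)
    for point in [[1.0, 1.0], [5.0, 5.0], [9.0, 1.0]]
]
print(toy_predictions)
```

# Working in log space keeps the product of many small likelihoods from underflowing. Nothing in the rule
# depends on the number of classes, which is why the same code covers binary and multi-class problems.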

In [None]:
# Q5. Assignment:
# Data preparation:
# Download the "Spambase Data Set" from the UCI Machine Learning Repository
# (https://archive.ics.uci.edu/ml/datasets/Spambase). This dataset contains email messages, where the goal is
# to predict whether a message is spam or not based on several input features.
# Implementation:
# Implement Bernoulli Naive Bayes, Multinomial Naive Bayes, and Gaussian Naive Bayes classifiers using the
# scikit-learn library in Python. Use 10-fold cross-validation to evaluate the performance of each classifier on the
# dataset. You should use the default hyperparameters for each classifier.
# Results:
# Report the following performance metrics for each classifier:
# Accuracy
# Precision
# Recall
# F1 score
# Discussion:
# Discuss the results you obtained. Which variant of Naive Bayes performed the best? Why do you think that is
# the case? Are there any limitations of Naive Bayes that you observed?
# Conclusion:
# Summarise your findings and provide some suggestions for future work.

# Note: This dataset contains a binary classification problem with multiple features. The dataset is
# relatively small, but it can be used to demonstrate the performance of the different variants of Naive
# Bayes on a real-world problem.
# Answer :-

import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/spambase/spambase.data"

# Load the dataset into a Pandas DataFrame. The file has no header row, so the
# columns are left unnamed and addressed by position.
data = pd.read_csv(url, header=None)

# The last column is the class label (1 = spam, 0 = not spam); the remaining 57 columns are the features.
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB

# Create Naive Bayes classifiers with default hyperparameters
bernoulli_nb = BernoulliNB()
multinomial_nb = MultinomialNB()
gaussian_nb = GaussianNB()

# Perform 10-fold cross-validation once and collect all four metrics from the same folds
def evaluate_classifier(classifier, X, y):
    scoring = ['accuracy', 'precision', 'recall', 'f1']
    scores = cross_validate(classifier, X, y, cv=10, scoring=scoring)
    accuracy = scores['test_accuracy'].mean()
    precision = scores['test_precision'].mean()
    recall = scores['test_recall'].mean()
    f1 = scores['test_f1'].mean()
    return accuracy, precision, recall, f1

# Evaluate each classifier
accuracy_b, precision_b, recall_b, f1_b = evaluate_classifier(bernoulli_nb, X, y)
accuracy_m, precision_m, recall_m, f1_m = evaluate_classifier(multinomial_nb, X, y)
accuracy_g, precision_g, recall_g, f1_g = evaluate_classifier(gaussian_nb, X, y)

# Print the results
print("Bernoulli Naive Bayes:")
print(f"Accuracy: {accuracy_b}")
print(f"Precision: {precision_b}")
print(f"Recall: {recall_b}")
print(f"F1 Score: {f1_b}")

print("\nMultinomial Naive Bayes:")
print(f"Accuracy: {accuracy_m}")
print(f"Precision: {precision_m}")
print(f"Recall: {recall_m}")
print(f"F1 Score: {f1_m}")

print("\nGaussian Naive Bayes:")
print(f"Accuracy: {accuracy_g}")
print(f"Precision: {precision_g}")
print(f"Recall: {recall_g}")
print(f"F1 Score: {f1_g}")

# Discussion and Conclusion
# Discuss the results, compare the performance of the three Naive Bayes variants, and highlight any observed limitations. Conclude with a summary of your findings and suggestions for future work.

# Note: If your copy of the dataset is structured differently, adjust the column selection for X and y to match its actual features and target column.