#### Q1. A company conducted a survey of its employees and found that 70% of the employees use the
company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the
probability that an employee is a smoker given that he/she uses the health insurance plan?

#### Ans:

The question asks for the probability that an employee is a smoker given that they use the health insurance plan, and this conditional probability is stated directly in the problem. Let's define the events:

A: Employee uses the health insurance plan.
B: Employee is a smoker.

We are given the following information:

P(A) = 70% = 0.70 (probability that an employee uses the health insurance plan)
P(B|A) = 40% = 0.40 (probability that an employee is a smoker given that they use the health insurance plan)

Bayes' theorem states:

P(B|A) = (P(A|B) * P(B)) / P(A)

Here, however, Bayes' theorem is not needed. Applying it would require P(A|B) and P(B), neither of which is given, and the quantity we want, P(B|A), is already supplied by the statement "40% of the employees who use the plan are smokers". Therefore:

P(B|A) = 0.40

That is, there is a 40% probability that an employee is a smoker given that he/she uses the health insurance plan. As a side result, the probability that an employee both uses the plan and is a smoker is P(A and B) = P(B|A) * P(A) = 0.40 * 0.70 = 0.28.
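One way to sanity-check the quantities in this question is to apply the multiplication rule and the definition of conditional probability to the two figures given. The sketch below uses exact fractions to avoid floating-point noise; the variable names are illustrative only.

```python
from fractions import Fraction

# Figures taken from the question.
p_uses_plan = Fraction(70, 100)          # P(A)
p_smoker_given_plan = Fraction(40, 100)  # P(B|A)

# Multiplication rule: P(A and B) = P(B|A) * P(A)
p_plan_and_smoker = p_smoker_given_plan * p_uses_plan

# Definition of conditional probability: P(B|A) = P(A and B) / P(A)
p_recovered = p_plan_and_smoker / p_uses_plan

print(float(p_plan_and_smoker))  # 0.28
print(float(p_recovered))        # 0.4
```

Dividing the joint probability by P(A) returns the 40% stated in the question, which is the probability that a plan user is a smoker.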

#### Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

#### Ans:

Bernoulli Naive Bayes and Multinomial Naive Bayes are two variants of the Naive Bayes algorithm commonly used in machine learning for classification tasks. The main difference between them lies in the type of data they can handle and their underlying assumptions. Here's a comparison:

1. Bernoulli Naive Bayes:
   - Suitable for binary feature data, where each feature is a binary variable (e.g., presence or absence of a particular word in a document).
   - Like every Naive Bayes variant, assumes that the features are conditionally independent of one another given the class label.
   - Models each feature with a Bernoulli distribution (a discrete distribution with two possible outcomes), so the absence of a feature also contributes to the likelihood, through the factor 1 - p.
   - Often used in text classification tasks, such as sentiment analysis or spam detection.

2. Multinomial Naive Bayes:
   - Suitable for discrete count data, where each feature represents the count or frequency of a specific event (e.g., word counts in a document).
   - Assumes that, given the class, the feature vector is drawn from a multinomial distribution over the possible events (e.g., the words in the vocabulary).
   - Allows multiple occurrences of a feature in a single instance and takes the frequencies into account. A feature with a count of zero contributes nothing to the likelihood.
   - Commonly used in text classification tasks where the frequency of words or terms in documents is important, such as document classification or topic categorization.

In summary, Bernoulli Naive Bayes is used for binary features (presence/absence), while Multinomial Naive Bayes is suited to count features, where how often something occurs matters. The choice between them depends on the nature of the data and the specific requirements of the classification problem.
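The difference between presence/absence and counts can be illustrated with scikit-learn. The following is a minimal sketch on a made-up word-count matrix (the data and variable names are hypothetical). It relies on `BernoulliNB` binarizing its input with the default `binarize=0.0`, whereas `MultinomialNB` uses the raw counts.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical word-count matrix: rows are documents, columns are three made-up words.
X_train = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 2, 1],
    [0, 3, 0],
])
y_train = np.array([1, 1, 0, 0])

bernoulli = BernoulliNB().fit(X_train, y_train)      # default binarize=0.0: counts > 0 become 1
multinomial = MultinomialNB().fit(X_train, y_train)  # uses the raw counts

# Two documents containing the same words, but with different counts for the first word.
X_test = np.array([
    [1, 0, 1],
    [5, 0, 1],
])

bernoulli_proba = bernoulli.predict_proba(X_test)
multinomial_proba = multinomial.predict_proba(X_test)

print("BernoulliNB:  ", bernoulli_proba)
print("MultinomialNB:", multinomial_proba)
```

Both test documents look identical to the Bernoulli model after binarization, so it assigns them the same class probabilities, while the Multinomial model becomes more confident as the count of the first word grows.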

#### Q3. How does Bernoulli Naive Bayes handle missing values?



#### Ans:

Bernoulli Naive Bayes is a variant of the Naive Bayes algorithm that assumes binary features, where each feature is either present or absent. The model itself has no built-in notion of a missing value, so how missing values are handled depends on the implementation. The BernoulliNB class in scikit-learn, for example, does not accept NaN values and raises an error, so missing values have to be dealt with before fitting.

Because Naive Bayes treats the features as conditionally independent given the class, one natural strategy in a custom implementation is to ignore missing features. During the training phase, the probability of each binary feature (presence or absence) for each class is estimated only from the instances in which that feature was observed. If a feature is missing for a particular instance, that instance is simply not counted in the estimate for that feature.

During the classification phase, a missing feature is left out of the product of conditional probabilities, and the instance is assigned to the most likely class based on the features that are available.

Note that ignoring a missing feature is not the same as treating the feature as absent. In Bernoulli Naive Bayes an absent feature (value 0) still contributes the factor 1 - p to the likelihood, so encoding missing values as 0 changes the result. Ignoring missing features is also only safe when the values are missing completely at random. If they are not, or if they have a significant impact on the classification task, more sophisticated techniques, such as imputation or handling missing values as a separate category, may be necessary.
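In scikit-learn specifically, `BernoulliNB` rejects inputs that contain NaN, so the imputation route mentioned above has to happen before the classifier sees the data. The following is a minimal sketch, assuming missing entries are encoded as NaN and that most-frequent-value imputation is acceptable for the task. The data is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical binary features with missing entries encoded as NaN.
X = np.array([
    [1.0, 0.0, np.nan],
    [1.0, np.nan, 1.0],
    [0.0, 1.0, 0.0],
    [np.nan, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
])
y = np.array([1, 1, 0, 0, 1, 0])

# Fitting on the raw matrix is rejected because of the NaN entries.
try:
    BernoulliNB().fit(X, y)
    raw_fit_failed = False
except ValueError:
    raw_fit_failed = True

# Replace each missing entry with the most frequent value in its column, then fit.
model = make_pipeline(SimpleImputer(strategy="most_frequent"), BernoulliNB())
model.fit(X, y)

# The same imputation is applied automatically at prediction time.
predictions = model.predict(np.array([[1.0, np.nan, 1.0]]))
print(raw_fit_failed, predictions)
```

Putting the imputer inside the pipeline keeps the imputation statistics tied to the training data, which matters when the model is evaluated with cross-validation.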

#### Q4. Can Gaussian Naive Bayes be used for multi-class classification?

#### Ans:

Yes, Gaussian Naive Bayes can be used for multi-class classification. Gaussian Naive Bayes is a variant of Naive Bayes for continuous-valued features, which it assumes follow a Gaussian (normal) distribution within each class.

Naive Bayes is inherently a multi-class method, so no extension such as one-vs-rest is needed when the task involves more than two classes. The model computes a posterior probability for every class and predicts the class with the highest posterior, which is the maximum a posteriori (MAP) decision rule. The binary case is simply the special case with two classes.

For each class, Gaussian Naive Bayes estimates a prior from the class frequencies and a mean and variance for each feature. Under the assumption that the features are conditionally independent given the class, the likelihood of an instance is the product of the per-feature Gaussian densities, and Bayes' theorem gives the posterior probability of each class given the observed feature values.

While Gaussian Naive Bayes can be effective for certain types of datasets, it may not perform well when the features are not well modeled by a Gaussian distribution or when there are strong dependencies among the features. In such cases, transforming the features, choosing a Naive Bayes variant that matches the feature type (for example, Multinomial Naive Bayes for count data), or using another machine learning algorithm may be more suitable.
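To make the multi-class procedure concrete, here is a small from-scratch sketch in plain Python (not scikit-learn's implementation). It estimates a prior, per-feature means and per-feature variances for each class, then picks the class with the highest log posterior. The data, the function names and the `var_floor` parameter are all hypothetical choices made for this illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate per-class priors, feature means and feature variances."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor  # floor avoids division by zero
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def log_posterior(model, x):
    """Unnormalised log posterior, log P(c) + sum_i log N(x_i | mean, var), for every class c."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores

def predict(model, x):
    """MAP decision: return the class with the highest log posterior."""
    scores = log_posterior(model, x)
    return max(scores, key=scores.get)

# Hypothetical data: two continuous features, three classes.
X = [
    [1.0, 1.2], [0.8, 1.0], [1.2, 0.9],  # class "a"
    [5.0, 5.1], [4.8, 5.3], [5.2, 4.9],  # class "b"
    [9.0, 1.0], [9.2, 0.8], [8.8, 1.1],  # class "c"
]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 1.0]), predict(model, [5.1, 5.0]), predict(model, [9.1, 0.9]))
```

Nothing in the code is specific to two classes: the dictionary simply holds one set of parameters per label, and the prediction is an argmax over however many labels were seen during fitting.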

#### Q5. Assignment:

Data preparation:
Download the "Spambase Data Set" from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Spambase). This dataset contains email messages, where the goal is to predict whether a message is spam or not based on several input features.

Implementation:
Implement Bernoulli Naive Bayes, Multinomial Naive Bayes, and Gaussian Naive Bayes classifiers using the scikit-learn library in Python. Use 10-fold cross-validation to evaluate the performance of each classifier on the dataset. You should use the default hyperparameters for each classifier.

Results:
Report the following performance metrics for each classifier:

- Accuracy
- Precision
- Recall
- F1 score

Discussion:
Discuss the results you obtained. Which variant of Naive Bayes performed the best? Why do you think that is the case? Are there any limitations of Naive Bayes that you observed?

Conclusion:
Summarise your findings and provide some suggestions for future work.
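One possible way to set up the evaluation is sketched below. It is a starting point under stated assumptions, not a reference solution: the helper `evaluate_naive_bayes` is a hypothetical name, the commented-out loading step assumes a comma-separated file called `spambase.data` with the label in the last column, and the data actually used here is a synthetic stand-in so that the block runs without the download. The numbers it prints are therefore not Spambase results.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

SCORING = ["accuracy", "precision", "recall", "f1"]

def evaluate_naive_bayes(X, y, n_splits=10, seed=0):
    """Return mean cross-validated scores for each Naive Bayes variant (default hyperparameters)."""
    classifiers = {
        "BernoulliNB": BernoulliNB(),
        "MultinomialNB": MultinomialNB(),
        "GaussianNB": GaussianNB(),
    }
    # Shuffled, stratified folds: a choice made here in case the rows are ordered by class.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, clf in classifiers.items():
        scores = cross_validate(clf, X, y, cv=folds, scoring=SCORING)
        results[name] = {m: float(scores[f"test_{m}"].mean()) for m in SCORING}
    return results

# Assumed loading step for the real data (file name and column layout are assumptions):
# data = np.loadtxt("spambase.data", delimiter=",")
# X, y = data[:, :-1], data[:, -1].astype(int)

# Synthetic non-negative stand-in data so the sketch runs offline; NOT the Spambase data.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.poisson(lam=np.where(y[:, None] == 1, 3.0, 1.0), size=(200, 5)).astype(float)

results = evaluate_naive_bayes(X, y)
for name, metrics in results.items():
    print(name, metrics)
```

`MultinomialNB` requires non-negative feature values, which is why the stand-in data is drawn from a Poisson distribution. Any preprocessing applied to the real data would need to preserve that property.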