### Q1. A company conducted a survey of its employees and found that 70% of the employees use the company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the probability that an employee is a smoker given that he/she uses the health insurance plan?

Ans:
* Event A: Employee uses the company's health insurance plan.
* Event B: Employee is a smoker.

We are given:

* P(A), the probability that an employee uses the health insurance plan, which is 70% or 0.70.

* P(B∣A), the conditional probability that an employee is a smoker given that they use the health insurance plan, which is 40% or 0.40.

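We want to find P(B∣A), the probability that an employee is a smoker given that they use the health insurance plan. This is exactly the conditional probability stated in the problem, so no additional information (such as P(B) or P(A∣B)) is required.

For reference, Bayes' theorem states:

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$

The numerator P(A∣B)·P(B) is the joint probability P(A∩B), which can equally be written as P(B∣A)·P(A) = 0.40 × 0.70 = 0.28. Substituting this and P(A) = 0.70 gives a consistency check on the stated value:

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0.40 \times 0.70}{0.70} = 0.40$$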


Therefore, the probability that an employee is a smoker given that he/she uses the health insurance plan is 40%.
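
The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation; the variable names are illustrative and not part of the original question.

```python
# Values given in the problem statement
p_a = 0.70          # P(A): employee uses the health insurance plan
p_b_given_a = 0.40  # P(B|A): employee is a smoker, given plan membership

# Joint probability P(A and B) via the product rule
p_a_and_b = p_b_given_a * p_a

# Recover P(B|A) from the definition of conditional probability
p_b_given_a_check = p_a_and_b / p_a

print(f"P(A and B) = {p_a_and_b:.2f}")
print(f"P(B|A)     = {p_b_given_a_check:.2f}")
```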


### Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

Ans: Multinomial Naive Bayes models each feature as a count, i.e. the number of times a term appears in a document (its frequency). Bernoulli Naive Bayes, by contrast, models each feature as binary: the term is either present or absent.


* Bernoulli Naive Bayes: Assumes binary features (presence or absence) and is commonly used for document classification tasks such as spam filtering.

* Multinomial Naive Bayes: Handles discrete features representing counts or frequencies (e.g., word counts in text classification). It's suitable for tasks where features represent the occurrences of events.
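
The difference can be illustrated with scikit-learn on a toy count matrix. The data and labels below are made up for illustration. The sketch relies on `BernoulliNB`'s default `binarize=0.0`, which converts any positive count to 1, so the Bernoulli model should be unaffected by replacing counts with presence flags, while the multinomial model should change.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Toy word-count matrix (rows = documents, columns = terms); values are made up
X_counts = np.array([
    [3, 0, 1],
    [5, 1, 0],
    [0, 2, 4],
    [0, 1, 6],
])
y = np.array([1, 1, 0, 0])  # 1 = spam, 0 = not spam (illustrative labels)

# Keep only presence/absence information
X_binary = (X_counts > 0).astype(int)

# BernoulliNB binarizes its input, so both inputs lead to the same fitted model
bnb_counts = BernoulliNB().fit(X_counts, y)
bnb_binary = BernoulliNB().fit(X_binary, y)
same_bernoulli = np.allclose(bnb_counts.feature_log_prob_,
                             bnb_binary.feature_log_prob_)

# MultinomialNB uses the count magnitudes, so the fitted models differ
mnb_counts = MultinomialNB().fit(X_counts, y)
mnb_binary = MultinomialNB().fit(X_binary, y)
same_multinomial = np.allclose(mnb_counts.feature_log_prob_,
                               mnb_binary.feature_log_prob_)

print("BernoulliNB unchanged by binarizing:  ", same_bernoulli)
print("MultinomialNB unchanged by binarizing:", same_multinomial)
```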

### Q3. How does Bernoulli Naive Bayes handle missing values?

Ans. The Bernoulli Naive Bayes model has no built-in mechanism for missing values; how they are handled depends on the implementation and on the strategy chosen by the practitioner. Because Naive Bayes assumes that features are conditionally independent given the class label, a missing feature can in principle be left out of the likelihood product without affecting the terms for the other features.

Here are some common approaches to handling missing values in Bernoulli Naive Bayes:

### 1. Ignoring Missing Values:

- **Default Behavior:**
  - A custom implementation of Naive Bayes can handle missing values by simply skipping them during the computation of probabilities. scikit-learn's `BernoulliNB` does not do this: it rejects input that contains `NaN`, so missing values must be removed or encoded before fitting.

- **Impact on Probability Calculation:**
  - When a missing feature is skipped, it contributes no factor to the likelihood for that instance. This is different from treating the feature as absent (0). The probability of the feature given the class label is estimated only from the instances where the feature was observed.
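
The "ignore" strategy can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how any particular library works: the functions `fit_bernoulli_nb` and `predict_bernoulli_nb` are hypothetical, `None` marks a missing value, the data is made up, and Laplace smoothing with `alpha=1.0` is an arbitrary choice.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and P(feature=1 | class), skipping None entries."""
    n_features = len(X[0])
    priors, probs = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / len(X)
        probs[c] = []
        for j in range(n_features):
            observed = [r[j] for r in rows if r[j] is not None]
            # Laplace smoothing over the observed values only
            probs[c].append((sum(observed) + alpha) / (len(observed) + 2 * alpha))
    return priors, probs

def predict_bernoulli_nb(x, priors, probs):
    """Return the most probable class, leaving missing features out of the sum."""
    scores = {}
    for c in priors:
        score = math.log(priors[c])
        for j, value in enumerate(x):
            if value is None:
                continue  # a missing feature contributes no likelihood term
            p = probs[c][j]
            score += math.log(p if value == 1 else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)

# Made-up binary data; None marks a missing value
X = [[1, 0, 1], [1, None, 1], [0, 1, 0], [0, 1, None]]
y = ["spam", "spam", "ham", "ham"]

priors, probs = fit_bernoulli_nb(X, y)
print(predict_bernoulli_nb([1, None, None], priors, probs))
print(predict_bernoulli_nb([None, 1, 0], priors, probs))
```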

### 2. Imputation:

- **Imputing Missing Values:**
  - Some practitioners choose to impute missing values before applying the Bernoulli Naive Bayes algorithm.

- **Imputation Strategies:**
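  - Common imputation strategies include replacing missing values with the mean, median, or mode of the observed values for that feature. For binary features, the mode (most frequent value) is the natural choice, since a mean or median may not be a valid 0/1 value.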

- **Impact on Model:**
  - Imputing missing values can affect the distribution of feature values, potentially influencing the model's performance.
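
One way to apply imputation with scikit-learn is to chain `SimpleImputer` and `BernoulliNB` in a pipeline. The sketch below assumes missing entries are encoded as `NaN` and uses the most-frequent-value strategy; the data is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Made-up binary features with missing entries encoded as NaN
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, np.nan, 1.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, np.nan],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
y = np.array([1, 1, 0, 0, 1, 0])

# Fill each column's NaNs with its most frequent observed value, then fit
model = make_pipeline(SimpleImputer(strategy="most_frequent"), BernoulliNB())
model.fit(X, y)

# The same imputation is applied automatically at prediction time
X_new = np.array([[1.0, np.nan, 1.0]])
print(model.predict(X_new))
```

Keeping the imputer inside the pipeline means the fill values are learned from the training data only, which matters when the model is evaluated with cross-validation.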

### 3. Special Handling:

- **Designating a Category for Missing Values:**
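  - Another approach is to designate a special category or value for missing values. For example, if features are binary (0 or 1), a missing value might be assigned a special code (e.g., -1) to mark it as missing. Because a Bernoulli model supports only two values per feature, the special category is usually encoded in practice as an additional binary "is missing" indicator feature.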

- **Probability Calculation:**
  - The model can then be trained to consider this special category separately when calculating probabilities.
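
A possible encoding of the "special category" idea is to append one binary indicator column per feature. The helper `add_missing_indicators` below is hypothetical, missing values are assumed to be encoded as `NaN`, and the data is made up.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

def add_missing_indicators(X):
    """Return [X with NaN replaced by 0 | one 0/1 'is missing' column per feature]."""
    missing = np.isnan(X)
    filled = np.where(missing, 0.0, X)
    return np.hstack([filled, missing.astype(float)])

# Made-up binary features with missing entries encoded as NaN
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, np.nan, 1.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, np.nan],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
y = np.array([1, 1, 0, 0, 1, 0])

# The augmented matrix has twice as many columns and contains no NaN
X_aug = add_missing_indicators(X)
model = BernoulliNB().fit(X_aug, y)

X_new = add_missing_indicators(np.array([[1.0, np.nan, 1.0]]))
print(X_aug.shape)
print(model.predict(X_new))
```

With this encoding the model learns a separate per-class probability for "value is missing", so missingness itself can carry information about the class.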

### 4. Conditional Independence:

- **Assumption of Conditional Independence:**
  - The Naive Bayes algorithm assumes conditional independence between features given the class label. In the presence of missing values, this independence assumption may not hold if the missingness is related to other features.

- **Impact on Results:**
  - If the missingness is systematic and related to other features, it may introduce bias into the model.



### Q4. Can Gaussian Naive Bayes be used for multi-class classification?

Ans. Yes, Gaussian Naive Bayes can be used for multi-class classification. Gaussian Naive Bayes is a variant of the Naive Bayes algorithm that assumes each feature follows a Gaussian (normal) distribution within each class. It is particularly suitable for continuous data, and it can be applied to problems with any number of classes.

No special extension is needed for the multi-class case: the algorithm estimates a prior and a set of per-feature Gaussian parameters for every class from the training data, then compares the resulting posterior probabilities across all classes.

Here's a brief overview of how Gaussian Naive Bayes can be used for multi-class classification:

### Training:

1. **Estimate Class Priors:**
   - Calculate the prior probability of each class based on the frequency of each class in the training dataset.

2. **Estimate Class Means and Variances:**
   - For each feature and each class, calculate the mean and variance of the observed values in the training data.

### Prediction:

1. **Calculate Class Probabilities:**
   - For each class, use the Gaussian probability density function to calculate the likelihood of the observed feature values given the estimated mean and variance.

2. **Multiply with Priors:**
   - Multiply the likelihood by the prior probability of each class to obtain the unnormalized posterior probabilities.

3. **Normalize Probabilities:**
   - Normalize the unnormalized probabilities to obtain the final posterior probabilities for each class.

4. **Predict Class:**
   - Assign the class with the highest posterior probability as the predicted class for the given instance.
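
The training and prediction steps above can be sketched from scratch with NumPy on a three-class problem. This is an illustrative sketch, not scikit-learn's `GaussianNB` implementation: the function names, the small `var_eps` term added to the variances for numerical stability, and the synthetic data are all assumptions. Log-probabilities are used instead of raw products to avoid underflow.

```python
import numpy as np

def fit_gaussian_nb(X, y, var_eps=1e-9):
    """Training: class priors plus per-class, per-feature means and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + var_eps
    return classes, priors, means, variances

def predict_proba_gaussian_nb(X, priors, means, variances):
    """Prediction: normalized posterior probabilities, shape (n_samples, n_classes)."""
    diff = X[:, None, :] - means[None, :, :]
    # Sum of per-feature Gaussian log-densities for every (sample, class) pair
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None], axis=2
    )
    # Add the log prior to get the unnormalized log posterior
    log_post = log_lik + np.log(priors)[None, :]
    # Subtract the row maximum before exponentiating, then normalize
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# Synthetic data: three well-separated 2-D clusters, 20 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in centers])
y = np.repeat([0, 1, 2], 20)

classes, priors, means, variances = fit_gaussian_nb(X, y)
proba = predict_proba_gaussian_nb(centers, priors, means, variances)
predicted = classes[np.argmax(proba, axis=1)]
print(predicted)
```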



## Q5. Assignment:

### Data preparation:

Download the "Spambase Data Set" from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Spambase). This dataset contains email messages, where the goal is to predict whether a message is spam or not based on several input features.


### Implementation:

Implement Bernoulli Naive Bayes, Multinomial Naive Bayes, and Gaussian Naive Bayes classifiers using the scikit-learn library in Python. Use 10-fold cross-validation to evaluate the performance of each classifier on the dataset. You should use the default hyperparameters for each classifier.

### Results:

Report the following performance metrics for each classifier:

* Accuracy
* Precision
* Recall
* F1 score
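
One possible way to collect these metrics is scikit-learn's `cross_validate` with several scorers at once. The sketch below is an assumption about how the evaluation could be organized, not a required solution: the helper `evaluate_naive_bayes` is hypothetical, and synthetic Poisson counts stand in for the Spambase features so that the block runs without a download.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

def evaluate_naive_bayes(X, y, cv=10):
    """Return the mean cross-validated metrics for each Naive Bayes variant."""
    models = {
        "BernoulliNB": BernoulliNB(),
        "MultinomialNB": MultinomialNB(),
        "GaussianNB": GaussianNB(),
    }
    scoring = ["accuracy", "precision", "recall", "f1"]
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: float(scores[f"test_{m}"].mean()) for m in scoring}
    return results

# Synthetic stand-in data: non-negative counts (MultinomialNB requires X >= 0)
rng = np.random.default_rng(0)
y_demo = rng.integers(0, 2, size=200)
rates_class0 = np.array([1.0, 1.0, 3.0, 0.5])
rates_class1 = np.array([3.0, 0.5, 1.0, 1.0])
X_demo = rng.poisson(np.where(y_demo[:, None] == 1, rates_class1, rates_class0))

results = evaluate_naive_bayes(X_demo, y_demo)
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

For the Spambase data, `X` and `y` would come from the loaded DataFrame, for example `df.iloc[:, :-1]` and `df.iloc[:, -1]`, assuming the last column holds the spam label.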

### Discussion:

Discuss the results you obtained. Which variant of Naive Bayes performed the best? Why do you think that is
the case? Are there any limitations of Naive Bayes that you observed?

### Conclusion:

Summarise your findings and provide some suggestions for future work.

link through your dashboard. Make sure the repository is public.
Note: This dataset contains a binary classification problem with multiple features. The dataset is relatively small, but it can be used to demonstrate the performance of the different variants of Naive Bayes on a real-world problem.

In [1]:
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score


In [5]:
import pandas as pd

# Direct URL to the raw, comma-separated data file
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/spambase/spambase.data"

# Read the data into a DataFrame; the file has no header row,
# so pandas numbers the columns 0-57 (column 57 is the spam label)
df = pd.read_csv(url, header=None)


In [7]:
df.head(5)

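|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 |
|---|---|---|---|---|---|---|---|---|---|---|-----|----|----|----|----|----|----|----|----|----|----|
| 0 | 0.0 | 0.64 | 0.64 | 0.0 | 0.32 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.778 | 0.0 | 0.0 | 3.756 | 61 | 278 | 1 |
| 1 | 0.21 | 0.28 | 0.5 | 0.0 | 0.14 | 0.28 | 0.21 | 0.07 | 0.0 | 0.94 | ... | 0.0 | 0.132 | 0.0 | 0.372 | 0.18 | 0.048 | 5.114 | 101 | 1028 | 1 |
| 2 | 0.06 | 0.0 | 0.71 | 0.0 | 1.23 | 0.19 | 0.19 | 0.12 | 0.64 | 0.25 | ... | 0.01 | 0.143 | 0.0 | 0.276 | 0.184 | 0.01 | 9.821 | 485 | 2259 | 1 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.63 | 0.0 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.137 | 0.0 | 0.137 | 0.0 | 0.0 | 3.537 | 40 | 191 | 1 |
| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.63 | 0.0 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.135 | 0.0 | 0.135 | 0.0 | 0.0 | 3.537 | 40 | 191 | 1 |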
