### Q1. A company conducted a survey of its employees and found that 70% of the employees use the company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the probability that an employee is a smoker given that he/she uses the health insurance plan?

We need to find \( P(\text{Smoker} | \text{Health Insurance}) \).

Given:
- \( P(\text{Health Insurance}) = 0.70 \)
- \( P(\text{Smoker} | \text{Health Insurance}) = 0.40 \)

The question states this conditional probability directly: 40% of the employees who use the plan are smokers. The 70% coverage figure is not needed to answer it, so:

\[ P(\text{Smoker} | \text{Health Insurance}) = 0.40 \]
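As a consistency check, the same value can be recovered from the definition of conditional probability. The joint probability below is derived from the two given figures and is not stated in the question itself:

\[ P(\text{Smoker} \cap \text{Health Insurance}) = P(\text{Smoker} | \text{Health Insurance}) \, P(\text{Health Insurance}) = 0.40 \times 0.70 = 0.28 \]

\[ P(\text{Smoker} | \text{Health Insurance}) = \frac{P(\text{Smoker} \cap \text{Health Insurance})}{P(\text{Health Insurance})} = \frac{0.28}{0.70} = 0.40 \]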

### Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

**Bernoulli Naive Bayes:**
- Suitable for binary/Boolean features (0 or 1, indicating absence or presence of a feature).
- Models only whether each feature occurs in the document, and explicitly counts the absence of a feature as evidence.
- Often used for text classification tasks where the presence or absence of a word is considered.

**Multinomial Naive Bayes:**
- Suitable for discrete features (e.g., word counts in a document).
- Considers the frequency of features in the document.
- Often used for text classification tasks where the frequency of words is important.
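The difference can be made concrete by scoring one document under both likelihood models. The following is a minimal sketch using only the standard library; the word probabilities `p` and `theta` and the count vector are made-up illustrative values, not estimates from any dataset.

```python
import math

def bernoulli_log_likelihood(presence, p):
    """presence: 0/1 indicators; p[i] = P(word i present | class).
    Absent words contribute log(1 - p[i]), so absence is evidence too."""
    return sum(
        math.log(pi) if xi else math.log(1.0 - pi)
        for xi, pi in zip(presence, p)
    )

def multinomial_log_likelihood(counts, theta):
    """counts: word counts; theta[i] = P(word i | class), summing to 1.
    The multinomial coefficient is dropped because it is identical
    for every class and does not affect the comparison."""
    return sum(c * math.log(t) for c, t in zip(counts, theta))

# Hypothetical three-word vocabulary and class parameters.
counts = [3, 0, 1]
presence = [1 if c > 0 else 0 for c in counts]
p = [0.8, 0.5, 0.4]
theta = [0.5, 0.3, 0.2]

bern = bernoulli_log_likelihood(presence, p)
multi = multinomial_log_likelihood(counts, theta)

# Doubling every count leaves the presence vector unchanged,
# so only the multinomial score moves.
doubled = [2 * c for c in counts]
bern_doubled = bernoulli_log_likelihood([1 if c > 0 else 0 for c in doubled], p)
multi_doubled = multinomial_log_likelihood(doubled, theta)

print(bern, multi)
print(bern_doubled, multi_doubled)
```

The Bernoulli score depends on which words appear, including the missing second word, while the multinomial score depends on how often each word appears and ignores words with a zero count.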

### Q3. How does Bernoulli Naive Bayes handle missing values?

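Bernoulli Naive Bayes has no built-in mechanism for missing values. Many implementations, including scikit-learn's `BernoulliNB`, reject input that contains NaN. A common workaround is to encode a missing feature as zero, but this conflates "unknown" with "absent", and the model counts absence as evidence for or against a class. It is therefore usually better to impute missing values (for example, with the most frequent value of each feature) before fitting the classifier.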
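One simple imputation strategy for binary features is to fill each gap with the most frequent observed value in that column. The sketch below assumes missing entries are marked with `None`; the helper name `impute_most_frequent` and the sample rows are illustrative only. scikit-learn's `SimpleImputer(strategy="most_frequent")` offers a comparable ready-made option.

```python
from collections import Counter

def impute_most_frequent(rows, missing=None):
    """Replace missing entries with the most common observed value per column.
    Assumes every column has at least one observed value."""
    n_cols = len(rows[0])
    fill = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not missing]
        fill.append(Counter(observed).most_common(1)[0][0])
    return [
        [fill[j] if r[j] is missing else r[j] for j in range(n_cols)]
        for r in rows
    ]

# Hypothetical binary feature matrix with gaps marked as None.
rows = [
    [1, 0, None],
    [1, None, 1],
    [0, 0, 1],
    [None, 1, 1],
]

imputed = impute_most_frequent(rows)
for row in imputed:
    print(row)
```

The imputed matrix contains only 0 and 1, so it can be passed to a Bernoulli Naive Bayes classifier without further handling.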

### Q4. Can Gaussian Naive Bayes be used for multi-class classification?

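Yes. Gaussian Naive Bayes handles multi-class classification directly, with no need for one-vs-rest schemes. For each class it estimates a prior and a per-feature mean and variance, computes the posterior probability of every class given the feature values, and predicts the class with the highest posterior.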
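The prediction rule can be sketched in a few lines for three classes. This is a hand-rolled illustration, not the scikit-learn implementation; the class labels, priors, means, and variances in `params` are invented values.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def predict(x, params):
    """params maps class label -> (prior, per-feature means, per-feature variances).
    Returns the label with the highest log posterior (up to a shared constant)."""
    scores = {}
    for label, (prior, means, variances) in params.items():
        score = math.log(prior)
        for xi, m, v in zip(x, means, variances):
            score += gaussian_log_pdf(xi, m, v)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical parameters for three classes and two features.
params = {
    "A": (1 / 3, [0.0, 0.0], [1.0, 1.0]),
    "B": (1 / 3, [5.0, 5.0], [1.0, 1.0]),
    "C": (1 / 3, [10.0, 0.0], [1.0, 1.0]),
}

print(predict([0.2, -0.1], params))
print(predict([4.8, 5.3], params))
print(predict([9.5, 0.4], params))
```

Adding a fourth class only means adding another entry to `params`; the argmax over class scores is unchanged.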

### Q5. Assignment

#### Data preparation:
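Download the "Spambase Data Set" from the UCI Machine Learning Repository and place `spambase.data` in the working directory. The file has no header row, and the last column is the class label (1 = spam, 0 = not spam).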

#### Implementation:

```python
import pandas as pd
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Load the dataset
data = pd.read_csv("spambase.data", header=None)
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

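# Standardize features for GaussianNB only; BernoulliNB and MultinomialNB
# are fit on the raw values (MultinomialNB requires non-negative input)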
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Define the classifiers
classifiers = {
    'BernoulliNB': BernoulliNB(),  # default binarize=0.0: any value > 0 counts as "present"
    'MultinomialNB': MultinomialNB(),
    'GaussianNB': GaussianNB()
}

# Train and evaluate the classifiers
results = {}

for name, clf in classifiers.items():
    if name == 'GaussianNB':
        clf.fit(X_train_scaled, y_train)
        y_pred = clf.predict(X_test_scaled)
    else:
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
    
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    
    results[name] = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1
    }

# Output the results
results_df = pd.DataFrame(results).T
print(results_df)
```

#### Results:
- Report accuracy, precision, recall, and F1 score for each classifier on the held-out test set, as collected in the `results_df` table.
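For reference, the four metrics can be computed by hand from the confusion-matrix counts. The sketch below uses a small invented label vector, not Spambase results, and the helper name `binary_metrics` is illustrative.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1_score": f1}

# Made-up labels: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For spam filtering, precision reflects how many flagged messages are truly spam, while recall reflects how much spam is caught; the two can diverge even when accuracy looks high.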

#### Discussion:
- Discuss which variant of Naive Bayes performed the best.
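- In general, Bernoulli Naive Bayes suits binary/Boolean features, Multinomial Naive Bayes suits word counts, and Gaussian Naive Bayes suits continuous features. Spambase features are continuous word and character frequencies plus capital-letter run lengths, so none of the three assumptions matches the data exactly; the comparison should be interpreted with that in mind.
- Limitations of Naive Bayes include the assumption that features are conditionally independent given the class, which rarely holds for text, and sensitivity to zero probabilities: a feature value never observed with a class during training drives that class's entire likelihood to zero unless smoothing is applied.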
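The zero-probability problem and its usual remedy, additive (Laplace) smoothing, can be illustrated as follows. The word counts are invented for the example, and `smoothed_word_probs` is a hypothetical helper; scikit-learn exposes the same idea through the `alpha` parameter of `MultinomialNB` and `BernoulliNB`.

```python
def smoothed_word_probs(counts, alpha=1.0):
    """Estimate P(word | class) with additive smoothing.
    alpha=0 gives the raw maximum-likelihood estimate."""
    total = sum(counts.values())
    vocab_size = len(counts)
    return {
        word: (c + alpha) / (total + alpha * vocab_size)
        for word, c in counts.items()
    }

# Hypothetical word counts for the "spam" class; "meeting" never appears.
counts = {"free": 3, "meeting": 0, "offer": 1}

unsmoothed = smoothed_word_probs(counts, alpha=0.0)
smoothed = smoothed_word_probs(counts, alpha=1.0)

print(unsmoothed)
print(smoothed)
```

Without smoothing, any message containing "meeting" would receive a spam likelihood of exactly zero regardless of its other words; with `alpha=1.0` the unseen word keeps a small positive probability and the estimates still sum to one.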

#### Conclusion:
- Summarize the findings and suggest future work, such as applying feature selection or feature engineering, or comparing against other classifiers like logistic regression or support vector machines.

