Q1. A company conducted a survey of its employees and found that 70% of the employees use the
company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the
probability that an employee is a smoker given that he/she uses the health insurance plan?

The question asks for a conditional probability, so start by writing the given information in that notation:

- \( P(\text{Uses Plan}) = 0.70 \) (70% of employees use the health insurance plan)
- \( P(\text{Smoker} \mid \text{Uses Plan}) = 0.40 \) (40% of those who use the plan are smokers)

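We want to find \( P(\text{Smoker} \mid \text{Uses Plan}) \). The statement "40% of the employees who use the plan are smokers" is exactly this conditional probability, so it is given directly as \( 0.40 \) and no further calculation is needed. The 70% figure would only matter for the joint probability: \( P(\text{Smoker} \cap \text{Uses Plan}) = 0.40 \times 0.70 = 0.28 \).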

So, the probability that an employee is a smoker given that they use the health insurance plan is \( \boxed{0.40} \) or 40%.
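As a quick check, the relationship between the conditional and joint probabilities can be sketched in a few lines of Python. The variable names are illustrative, and `Fraction` is used only to keep the arithmetic exact:

```python
from fractions import Fraction

p_plan = Fraction(70, 100)               # P(Uses Plan)
p_smoker_given_plan = Fraction(40, 100)  # P(Smoker | Uses Plan), given in the question

# Multiplication rule: P(Smoker and Uses Plan) = P(Smoker | Uses Plan) * P(Uses Plan)
p_smoker_and_plan = p_smoker_given_plan * p_plan

# Definition of conditional probability: P(A | B) = P(A and B) / P(B)
recovered = p_smoker_and_plan / p_plan

print(float(p_smoker_and_plan))  # 0.28
print(float(recovered))          # 0.4
```

Dividing the joint probability by \( P(\text{Uses Plan}) \) returns the same 0.40 that the question states.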

Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

**Bernoulli Naive Bayes** and **Multinomial Naive Bayes** are both types of Naive Bayes classifiers used for different types of data. Here’s a concise comparison:

### Bernoulli Naive Bayes

- **Feature Type:** Binary/Boolean features.
- **Assumption:** Each feature is binary (0 or 1), indicating the presence or absence of a feature.
- **Application:** Suitable for text classification where features represent the presence or absence of words.
- **Example:** Classifying emails as spam or not spam based on whether certain keywords are present or absent.

**Formula:**
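\[
P(\text{Class} \mid \text{Features}) = \frac{P(\text{Class}) \cdot \prod_{i=1}^{n} P(\text{Feature}_i = 1 \mid \text{Class})^{\text{Feature}_i} \, \bigl(1 - P(\text{Feature}_i = 1 \mid \text{Class})\bigr)^{1 - \text{Feature}_i}}{P(\text{Features})}
\]
where \(\text{Feature}_i\) is either 0 (absent) or 1 (present). An absent feature contributes the factor \(1 - P(\text{Feature}_i = 1 \mid \text{Class})\), so absence counts as evidence.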

### Multinomial Naive Bayes

- **Feature Type:** Count-based or frequency-based features.
- **Assumption:** Features represent counts or frequencies of events (e.g., word counts).
- **Application:** Suitable for text classification where features are term frequencies or counts.
- **Example:** Classifying documents based on the frequency of words.

**Formula:**
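\[
P(\text{Class} \mid \text{Features}) = \frac{P(\text{Class}) \cdot \left(\sum_{i=1}^{n} \text{Feature}_i\right)! \, \prod_{i=1}^{n} \frac{P(\text{Feature}_i \mid \text{Class})^{\text{Feature}_i}}{\text{Feature}_i!}}{P(\text{Features})}
\]
where \(\text{Feature}_i\) is the count of the \(i\)-th feature and \(P(\text{Feature}_i \mid \text{Class})\) is the probability that a single draw (e.g., one word token) is the \(i\)-th feature, so these probabilities sum to 1 over \(i\). The factorial terms do not depend on the class, so they cancel when classes are compared.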

### Key Differences

1. **Feature Representation:**
   - **Bernoulli Naive Bayes:** Features are binary (presence/absence).
   - **Multinomial Naive Bayes:** Features are counts or frequencies.

2. **Model Assumptions:**
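   - **Bernoulli Naive Bayes:** Models each feature as an independent Bernoulli variable, so the absence of a feature explicitly contributes to the likelihood.
   - **Multinomial Naive Bayes:** Models the feature counts as draws from a multinomial distribution, so only features that actually occur contribute to the class-dependent part of the likelihood.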

3. **Use Cases:**
   - **Bernoulli Naive Bayes:** Suitable for datasets where features are binary.
   - **Multinomial Naive Bayes:** Suitable for datasets where features are counts or frequencies, often used in text classification.

Choosing between these models depends on the nature of your features and the specific characteristics of your data.
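To make the contrast concrete, here is a minimal sketch that evaluates both class-conditional likelihoods for the same document. The three-word vocabulary and all probabilities are made-up illustrative values, not estimates from any dataset, and the function names are hypothetical:

```python
import math

def bernoulli_log_likelihood(x, p):
    """log P(x | class) for binary features x, where p[i] = P(feature i present | class)."""
    return sum(
        xi * math.log(pi) + (1 - xi) * math.log(1 - pi)
        for xi, pi in zip(x, p)
    )

def multinomial_log_likelihood(counts, theta):
    """log P(counts | class) for count features, where theta[i] = P(token is word i | class)."""
    n = sum(counts)
    # log of the multinomial coefficient n! / (c_1! * ... * c_k!)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coeff + sum(c * math.log(t) for c, t in zip(counts, theta))

# Hypothetical vocabulary: ["free", "meeting", "prize"]
counts = [3, 0, 1]                             # word counts in one document
binary = [1 if c > 0 else 0 for c in counts]   # presence/absence view of the same document

p_spam = [0.8, 0.1, 0.6]      # illustrative P(word present | spam)
theta_spam = [0.5, 0.1, 0.4]  # illustrative P(token is word | spam), sums to 1

b = bernoulli_log_likelihood(binary, p_spam)
m = multinomial_log_likelihood(counts, theta_spam)

print(round(math.exp(b), 3))  # 0.432 = 0.8 * (1 - 0.1) * 0.6
print(round(math.exp(m), 3))  # 0.2   = 4 * 0.5**3 * 0.4
```

In the Bernoulli case the absent word "meeting" still contributes a factor of 0.9, while in the multinomial case it contributes nothing and the repeated word "free" is counted three times.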

Q3. How does Bernoulli Naive Bayes handle missing values?

Bernoulli Naive Bayes has no built-in mechanism for missing values, so they must be addressed before the model is fitted. Common strategies include:

1. **Imputation:**
   - **Method:** Replace missing values with a default value, such as the most frequent value (mode) or a specific constant. For binary features, avoid the mean, which generally produces a value that is neither 0 nor 1.
   - **Example:** For binary features, you might replace missing values with the most common value (0 or 1).

2. **Removing Data:**
   - **Method:** Remove rows or columns with missing values. This approach is straightforward but can lead to loss of data.
   - **Example:** If a feature has many missing values, you might discard that feature or drop rows with missing values.

3. **Model-Based Imputation:**
   - **Method:** Use other models (e.g., regression) to predict and fill in missing values based on other features.
   - **Example:** Predict missing feature values using a regression model trained on non-missing data.

4. **Indicator Variable:**
   - **Method:** Create an additional binary feature indicating whether the original feature had missing values.
   - **Example:** Add a new feature that flags whether the original feature was missing and use this indicator in your model.

**In Summary:**
Before applying Bernoulli Naive Bayes, handle missing values by imputation (simple or model-based), removal, or indicator variables. The choice of method depends on how much data is missing and on whether the missingness itself is likely to be informative.
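As an illustration, mode imputation and a missing-value indicator can be combined for a single binary column. This is a minimal sketch that assumes missing entries are encoded as `None`; the helper name `impute_mode_with_indicator` is hypothetical:

```python
from collections import Counter

def impute_mode_with_indicator(column):
    """Fill None entries with the most frequent observed value and flag where they were."""
    observed = [v for v in column if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    filled = [mode if v is None else v for v in column]
    missing_flag = [1 if v is None else 0 for v in column]
    return filled, missing_flag

column = [1, None, 0, 1, None, 1]  # toy binary feature with two missing entries
filled, missing_flag = impute_mode_with_indicator(column)

print(filled)        # [1, 1, 0, 1, 1, 1]
print(missing_flag)  # [0, 1, 0, 0, 1, 0]
```

Both output columns are binary, so they can be passed to a Bernoulli model as two separate features.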

Q4. Can Gaussian Naive Bayes be used for multi-class classification?

Yes, Gaussian Naive Bayes can be used for multi-class classification.

**How It Works for Multi-Class Classification:**

- **Assumptions:** Features are conditionally independent given the class (the "naive" assumption), and each feature follows a Gaussian (normal) distribution within each class.
- **Probability Calculation:** For multi-class classification, the algorithm calculates the posterior probability for each class using Bayes' theorem and selects the class with the highest probability as the prediction.

**Steps:**

1. **Estimate Parameters:**
   - For each class, estimate the prior probability of the class and the mean and variance of each feature from the training examples in that class.

2. **Calculate Likelihood:**
   - Compute the likelihood of the feature values for each class using the Gaussian probability density function.

3. **Apply Bayes' Theorem:**
   - Use Bayes' theorem to calculate the posterior probability for each class given the feature values.

4. **Class Prediction:**
   - Predict the class with the highest posterior probability.

**Example:**
If you are classifying iris flowers into multiple species based on their features (sepal length, sepal width, petal length, petal width), Gaussian Naive Bayes will compute the probability of each species given the feature values and choose the species with the highest probability.

Gaussian Naive Bayes is straightforward and effective for multi-class problems where the feature distributions can be reasonably approximated by Gaussian distributions.
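The four steps above can be sketched from scratch in plain Python. This is a simplified illustration, not a library implementation: the three-class toy data is invented (it is not the iris dataset), the function names are hypothetical, and the small constant added to each variance is an assumption made here to avoid division by zero:

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Step 1: estimate log prior, per-feature mean, and per-feature variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # 1e-9 keeps every variance strictly positive (assumption for this sketch)
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9
            for c, m in zip(cols, means)
        ]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def log_posteriors(model, x):
    """Steps 2-3: Gaussian log likelihood plus log prior (unnormalized log posterior)."""
    scores = {}
    for label, (log_prior, means, variances) in model.items():
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances)
        )
        scores[label] = log_prior + log_lik
    return scores

def predict(model, x):
    """Step 4: choose the class with the highest posterior score."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)

# Invented, well-separated toy data with three classes and two features
X = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
     [5.0, 5.2], [5.1, 4.9], [4.8, 5.0],
     [9.0, 1.0], [9.2, 1.1], [8.9, 0.8]]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

model = fit_gaussian_nb(X, y)
predictions = [predict(model, x) for x in ([1.1, 1.0], [5.0, 5.0], [9.0, 1.0])]
print(predictions)  # ['a', 'b', 'c']
```

The scores are compared in log space and left unnormalized, because the evidence term \( P(\text{Features}) \) is the same for every class and does not change which class is largest.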