In [1]:
# Q1. A company conducted a survey of its employees and found that 70% of the employees use the
# company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the
# probability that an employee is a smoker given that he/she uses the health insurance plan?
# Answer:
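# The 40% figure is already conditional on plan usage, so P(smoker | uses plan) = 0.4, or 40%.
# The 70% figure, P(uses plan), is not needed here; it would only matter for the joint probability
# P(smoker and uses plan) = 0.7 * 0.4 = 0.28.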
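# A small check of the arithmetic, assuming only the two figures given in the question. The variable
# names are illustrative. Exact fractions are used so that the result is not affected by float rounding.

```python
from fractions import Fraction

# Figures from the question, kept as exact fractions.
p_plan = Fraction(70, 100)               # P(uses plan)
p_smoker_given_plan = Fraction(40, 100)  # P(smoker | uses plan), stated directly

# Product rule: P(smoker and uses plan) = P(uses plan) * P(smoker | uses plan).
p_smoker_and_plan = p_plan * p_smoker_given_plan

# Dividing the joint probability by P(uses plan) recovers the conditional probability.
p_check = p_smoker_and_plan / p_plan

print(float(p_smoker_and_plan))  # 0.28
print(float(p_check))            # 0.4
```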

In [2]:
# Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?
# Bernoulli Naive Bayes and Multinomial Naive Bayes are both variations of the Naive Bayes classifier, but they are suited to different types of data and have different assumptions about the data distribution. Here are the key differences:

# ### Bernoulli Naive Bayes
# 1. **Data Type**: Bernoulli Naive Bayes is used for binary/boolean features (0/1, True/False, Yes/No).
# 2. **Model Assumption**: It assumes that features are binary variables and models the presence or absence of a feature.
# 3. **Feature Representation**: Each feature is treated as a binary value indicating whether a particular feature is present or absent.
# 4. **Application**: It is commonly used in text classification problems where the presence or absence of a word matters, such as spam detection.
# 5. **Probability Calculation**: It calculates the probability using a Bernoulli distribution. For each feature, it estimates the probability that the feature is present given the class, and the absence of a feature contributes explicitly to the likelihood through one minus that probability.

# ### Multinomial Naive Bayes
# 1. **Data Type**: Multinomial Naive Bayes is used for discrete count data. It works well with features that represent counts or frequencies, such as the frequency of words in a document.
# 2. **Model Assumption**: It assumes that the features are generated from a multinomial distribution.
# 3. **Feature Representation**: Each feature represents the frequency or count of a term in a document.
# 4. **Application**: It is commonly used in text classification problems where word frequency matters, such as document classification or sentiment analysis.
# 5. **Probability Calculation**: It calculates the likelihood using a multinomial distribution. For each feature, it estimates the probability of that feature (for example, a word) occurring given the class, and the observed count of the feature weights its contribution to the likelihood.

# ### Summary of Differences
# - **Feature Type**: Bernoulli Naive Bayes is for binary features, while Multinomial Naive Bayes is for count/frequency features.
# - **Use Case**: Bernoulli is typically used for problems involving binary presence/absence of features, whereas Multinomial is used for problems involving frequency/count of features.
# - **Probability Distribution**: Bernoulli Naive Bayes uses the Bernoulli distribution for binary outcomes, while Multinomial Naive Bayes uses the multinomial distribution for count data.

# In conclusion, the choice between Bernoulli Naive Bayes and Multinomial Naive Bayes depends on the nature of the feature data: use Bernoulli for binary data and Multinomial for count data.
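
# The difference in the likelihood term can be sketched in plain Python. This is a minimal illustration, not
# a full classifier: the function names, the three-word vocabulary, and the parameter values are all
# hypothetical, and class priors and smoothing are left out.

```python
import math

def bernoulli_log_likelihood(x, p):
    """Log P(x | class) for binary features x, where p[i] = P(feature i present | class)."""
    return sum(
        math.log(p_i) if x_i else math.log(1.0 - p_i)  # absent features contribute too
        for x_i, p_i in zip(x, p)
    )

def multinomial_log_likelihood(counts, theta):
    """Log P(counts | class), up to a term that is the same for every class.

    theta[i] = P(a token is feature i | class), with sum(theta) == 1.
    """
    return sum(c_i * math.log(t_i) for c_i, t_i in zip(counts, theta))

# Hypothetical per-class parameters for a three-word vocabulary (illustrative values only).
p_spam = [0.8, 0.1, 0.5]      # Bernoulli: per-word presence probabilities
theta_spam = [0.6, 0.1, 0.3]  # Multinomial: per-word token probabilities

counts = [3, 0, 1]                              # word counts in one document
presence = [1 if c > 0 else 0 for c in counts]  # binarized view of the same document

print(bernoulli_log_likelihood(presence, p_spam))
print(multinomial_log_likelihood(counts, theta_spam))
```

# Note that the word with count 0 still changes the Bernoulli score (through 1 - p), while it adds nothing
# to the multinomial score.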

In [None]:
# Q3. How does Bernoulli Naive Bayes handle missing values?
Bernoulli Naive Bayes, like other variations of the Naive Bayes classifier, typically assumes that the data is complete and does not inherently provide a mechanism for handling missing values. However, there are some common strategies to handle missing values before applying the Bernoulli Naive Bayes algorithm:

### Common Strategies for Handling Missing Values

1. **Imputation**:
   - **Mode Imputation**: For binary data, replace missing values with the mode (most frequent value) of the feature. The mean and median are less suitable here because they need not be 0 or 1.
   - **Domain-specific Imputation**: Use domain knowledge to replace missing values. For example, if a binary feature indicates the presence of a specific condition and that condition is rare, you might impute missing values with 0 (absence).

2. **Indicator Variable**:
   - Create a new binary feature indicating whether the original feature value was missing. This way, the model can learn if the presence of a missing value itself carries predictive information.

3. **Model-based Imputation**:
   - Use a predictive model to estimate the missing values. For example, you can train a simple logistic regression model on the non-missing data to predict the missing values.

4. **Ignore Missing Values**:
   - If only a small fraction of instances have missing values, you might drop those instances during training. At prediction time, dropping an instance means no prediction is made for it, so imputation is usually the better option there.
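
Strategies 2 and 4 can be sketched in a few lines of plain Python. This is a minimal sketch, assuming that `None` marks a missing entry; the feature values and variable names are hypothetical.

```python
# None marks a missing entry in this hypothetical binary feature.
x1 = [1, 0, 1, 1, None, 0, 1, None]

# Strategy 2: a companion indicator feature that is 1 where x1 was missing.
x1_missing = [1 if v is None else 0 for v in x1]

# Strategy 4: keep only the indices of rows where x1 is observed.
observed_rows = [i for i, v in enumerate(x1) if v is not None]

print(x1_missing)     # [0, 0, 0, 0, 1, 0, 0, 1]
print(observed_rows)  # [0, 1, 2, 3, 5, 6]
```

The indicator feature is normally used alongside an imputed copy of the original feature, since the classifier still needs a 0/1 value in every position.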

### Example: Imputation for Bernoulli Naive Bayes
Let's say you have a dataset with a binary feature `X1` and some missing values:
```
X1 = [1, 0, 1, 1, ?, 0, 1, ?]
```
You might impute the missing values as follows:
- Calculate the mode of `X1` (most frequent value). Here, the mode is 1.
- Replace the missing values with 1:
```
X1_imputed = [1, 0, 1, 1, 1, 0, 1, 1]
```
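
The same imputation can be written as a short Python sketch, assuming the `?` entries are represented as `None`. The variable names are illustrative.

```python
from collections import Counter

x1 = [1, 0, 1, 1, None, 0, 1, None]  # None stands in for the "?" entries

# Mode over the observed values only (on a tie, the value seen first wins).
observed = [v for v in x1 if v is not None]
mode = Counter(observed).most_common(1)[0][0]

# Replace every missing entry with the mode.
x1_imputed = [mode if v is None else v for v in x1]

print(mode)        # 1
print(x1_imputed)  # [1, 0, 1, 1, 1, 0, 1, 1]
```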

### Handling Missing Values During Prediction
When making predictions with Bernoulli Naive Bayes, if you encounter missing values in the input data, you can apply similar imputation techniques or use a previously trained imputation model to fill in the missing values before making the prediction.
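
One way to do this is to learn the fill values from the training data once and reuse them for every new sample. The sketch below assumes mode imputation and `None` for missing entries; `fit_modes` and `fill_missing` are hypothetical helper names, and every column is assumed to have at least one observed value.

```python
from collections import Counter

def fit_modes(rows):
    """Per-column mode of the observed (non-None) values in the training rows."""
    modes = []
    for column in zip(*rows):
        observed = [v for v in column if v is not None]
        modes.append(Counter(observed).most_common(1)[0][0])
    return modes

def fill_missing(row, modes):
    """Replace each None in a row with the mode learned from the training data."""
    return [m if v is None else v for v, m in zip(row, modes)]

# Hypothetical training rows with two binary features.
train_rows = [[1, 0], [1, None], [0, 0], [None, 1], [1, 0]]
modes = fit_modes(train_rows)

# A new sample with a missing first feature is filled before prediction.
new_sample = [None, 1]
print(fill_missing(new_sample, modes))  # [1, 1]
```

Learning the modes from the training data only keeps the preprocessing at prediction time consistent with what the classifier saw during training.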

### Summary
While Bernoulli Naive Bayes itself does not handle missing values directly, pre-processing steps such as imputation, creating indicator variables, or using model-based approaches can be employed to handle missing values before feeding the data into the Bernoulli Naive Bayes classifier.

In [None]:
# Q4. Can Gaussian Naive Bayes be used for multi-class classification?
# Yes, Gaussian Naive Bayes can be used for multi-class classification without any modification.
# It assumes that, within each class, every feature follows a Gaussian (normal) distribution, and applies Bayes'
# theorem to compute the posterior probability of each class given the feature values. The predicted class is
# the one with the highest posterior, which works for any number of classes.
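
# A minimal pure-Python sketch of this idea with three classes. It is an illustration under simple
# assumptions, not a library implementation: the function names, the tiny dataset, and the variance
# floor (var_floor) are all hypothetical choices.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor  # floor avoids division by zero
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior (up to a shared constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)

# Hypothetical one-feature data with three well-separated classes.
X = [[1.0], [1.2], [0.8], [5.0], [5.1], [4.9], [9.0], [9.2], [8.8]]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

model = fit_gaussian_nb(X, y)
print(predict(model, [1.1]), predict(model, [5.05]), predict(model, [9.1]))  # a b c
```

# Nothing in the prediction step depends on there being exactly two classes: it simply takes the maximum
# over however many classes were seen during fitting.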

