# Q1. A company conducted a survey of its employees and found that 70% of the employees use the company's health insurance plan, while 40% of the employees who use the plan are smokers. What is the probability that an employee is a smoker given that he/she uses the health insurance plan?

>Bayes' theorem relates conditional probabilities, but it is not needed here because the quantity asked for is stated directly in the problem. Let A be the event that an employee is a smoker, and let B be the event that an employee uses the health insurance plan.

>From the survey, P(B) = 0.7, because 70% of the employees use the plan, and P(A|B) = 0.4, because 40% of the employees who use the plan are smokers. Note that the 40% figure is conditioned on plan use, so it is P(A|B), not P(B|A).

>As a check, the definition of conditional probability gives the same result:

>P(A and B) = P(A|B) * P(B) = 0.4 * 0.7 = 0.28

>P(A|B) = P(A and B) / P(B) = 0.28 / 0.7 = 0.4

>Bayes' theorem, P(A|B) = (P(B|A) * P(A)) / P(B), would only be needed if the problem gave P(B|A) and P(A) instead. Neither is given here, and P(A) does not cancel out of the formula.

>So, the probability that an employee is a smoker given that he/she uses the health insurance plan is 0.4, or 40%.
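
>As a quick numeric check, the sketch below recomputes the conditional probability with exact fractions, reading the survey's 40% figure as P(smoker | uses plan). The variable names are illustrative and not part of the original question.

```python
from fractions import Fraction

# Figures from the survey, stored as exact fractions to avoid rounding noise.
p_uses_plan = Fraction(70, 100)               # P(B): employee uses the plan
p_smoker_given_plan = Fraction(40, 100)       # P(A|B): smoker among plan users

# Joint probability that an employee both smokes and uses the plan.
p_smoker_and_plan = p_smoker_given_plan * p_uses_plan

# Definition of conditional probability: P(A|B) = P(A and B) / P(B).
p_check = p_smoker_and_plan / p_uses_plan

print(f"P(A and B) = {float(p_smoker_and_plan):.2f}")
print(f"P(A|B)     = {float(p_check):.2f}")
```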

# Q2. What is the difference between Bernoulli Naive Bayes and Multinomial Naive Bayes?

>Bernoulli Naive Bayes and Multinomial Naive Bayes are two types of Naive Bayes classifiers that are commonly used for text classification and sentiment analysis. The main difference between the two is the type of data they are best suited to work with.

>Bernoulli Naive Bayes is used when the features are binary (i.e., they take only two values, usually 0 and 1) and represent whether a particular word is present in a document. Unlike the multinomial variant, it explicitly treats the absence of a word as evidence, so words that do not appear also affect the prediction. It is commonly used for tasks such as spam filtering, where the goal is to classify an email as spam or not spam based on the presence or absence of certain keywords.

>Multinomial Naive Bayes, on the other hand, is used when the features are counts, such as how many times each word occurs in a document. It is commonly used for tasks such as sentiment analysis, where the goal is to classify a document or a sentence as positive or negative based on the frequency of certain words or phrases.

>In summary, Bernoulli Naive Bayes is used for binary data while Multinomial Naive Bayes is used for count data.
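
>The difference can be seen on a small example. The sketch below fits both classifiers on the same word-count matrix. The toy corpus and labels are invented for illustration. Note that scikit-learn's `BernoulliNB` binarizes its input by default (`binarize=0.0`), so it only sees whether a word occurs, while `MultinomialNB` uses the counts themselves.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Toy corpus invented for illustration (1 = spam, 0 = not spam).
docs = [
    "free money free money now",
    "win free prize now",
    "meeting schedule for monday",
    "project update and meeting notes",
]
labels = [1, 1, 0, 0]

# Word counts per document: the natural input for MultinomialNB.
counts = CountVectorizer().fit_transform(docs)

# MultinomialNB models how often each word occurs.
mnb = MultinomialNB().fit(counts, labels)

# BernoulliNB thresholds the counts at 0, keeping only presence/absence.
bnb = BernoulliNB().fit(counts, labels)

mnb_pred = mnb.predict(counts).tolist()
bnb_pred = bnb.predict(counts).tolist()
print("Multinomial:", mnb_pred)
print("Bernoulli:  ", bnb_pred)
```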

# Q3. How does Bernoulli Naive Bayes handle missing values?

>Bernoulli Naive Bayes has no built-in mechanism for missing values. The model assumes every feature is binary (0 or 1), and scikit-learn's `BernoulliNB` rejects input that contains NaN, so missing values must be handled before training.

>Common options are to drop the rows with missing values, to impute them (for example, with the most frequent value of the feature), or to add a binary indicator feature that records whether the value was missing. For example, suppose a dataset contains a feature for whether a person owns a car, and some of the values are missing. Imputing 0 treats those people as not owning a car, while adding an "ownership unknown" indicator keeps the missingness visible to the model.

>Whichever option is used, the results can be biased if the values are not missing at random, i.e., if there is some underlying pattern to their missingness. Silently coding a missing value as 0 is particularly risky, because Bernoulli Naive Bayes treats a 0 as evidence that the feature is absent.
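
>One way to handle missing values before fitting is sketched below: a scikit-learn pipeline imputes each feature with its most frequent value and appends missingness indicator columns. The data, the column meanings, and the choice of imputation strategy are assumptions made for illustration, not a recommendation for every dataset.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical binary data: columns are [owns_car, owns_home];
# np.nan marks a missing answer.
X = np.array([
    [1, 1],
    [1, np.nan],
    [np.nan, 1],
    [0, 0],
    [0, np.nan],
    [np.nan, 0],
])
y = np.array([1, 1, 1, 0, 0, 0])

# Impute with the most frequent value and add one indicator column per
# feature that had missing entries, then fit Bernoulli Naive Bayes.
model = make_pipeline(
    SimpleImputer(strategy="most_frequent", add_indicator=True),
    BernoulliNB(),
)
model.fit(X, y)

predictions = model.predict(X)
print(predictions)
```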

# Q4. Can Gaussian Naive Bayes be used for multi-class classification?

>Yes, Gaussian Naive Bayes can be used for multi-class classification, and it does so natively: no "one-vs-all" or "one-vs-one" decomposition is needed. For each class, the model estimates a prior probability and a per-feature mean and variance. To classify a sample, it computes the posterior probability of every class from the class prior and the Gaussian likelihoods of the features, and predicts the class with the highest posterior. The same procedure applies whether there are two classes or many.
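
>A minimal sketch of multi-class use, assuming scikit-learn's bundled Iris dataset (three classes) as a stand-in example. A single `GaussianNB` model is fitted and returns one probability per class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Iris has three classes and four numeric features.
X, y = load_iris(return_X_y=True)

# One model handles all three classes directly.
clf = GaussianNB().fit(X, y)

# One column of posterior probabilities per class.
proba = clf.predict_proba(X[:5])

print("classes:", clf.classes_)
print("per-class feature means shape:", clf.theta_.shape)
print("probabilities shape:", proba.shape)
```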

# Q5. Assignment:

## Data preparation:
>Download the "Spambase Data Set" from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Spambase). This dataset contains email messages, where the goal is to predict whether a message is spam or not based on several input features.

## Implementation:
>Implement Bernoulli Naive Bayes, Multinomial Naive Bayes, and Gaussian Naive Bayes classifiers using the scikit-learn library in Python. Use 10-fold cross-validation to evaluate the performance of each classifier on the dataset. You should use the default hyperparameters for each classifier.

## Results:
>Report the following performance metrics for each classifier:
- Accuracy
- Precision
- Recall
- F1 score
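
>One possible way to collect all four metrics in a single cross-validation pass is scikit-learn's `cross_validate` with a list of scorers. The sketch below uses synthetic count data as a placeholder for the Spambase features, so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Synthetic stand-in data: non-negative counts whose rate depends on the label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.poisson(lam=np.where(y[:, None] == 1, 3.0, 1.0), size=(200, 5))

scoring = ["accuracy", "precision", "recall", "f1"]
classifiers = [
    ("Bernoulli", BernoulliNB()),
    ("Multinomial", MultinomialNB()),
    ("Gaussian", GaussianNB()),
]

results = {}
for name, clf in classifiers:
    # 10-fold cross-validation with default hyperparameters.
    cv = cross_validate(clf, X, y, cv=10, scoring=scoring)
    # Average each metric over the 10 folds.
    results[name] = {m: cv[f"test_{m}"].mean() for m in scoring}
    print(name, results[name])
```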

## Discussion:
>Discuss the results you obtained. Which variant of Naive Bayes performed the best? Why do you think that is the case? Are there any limitations of Naive Bayes that you observed?

## Conclusion:
>Summarise your findings and provide some suggestions for future work.

```python
import pandas as pd
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB
from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn.metrics import classification_report

# Load the Spambase data; the last column is the label (1 = spam, 0 = not spam).
# Note: the raw UCI file (spambase.data) has no header row. If this CSV has none
# either, pass header=None so the first record is not consumed as column names.
spam_data = pd.read_csv(r'C:\Users\milan\Documents\Data Science\skills\Notes\Pandas_\New Assq\spambase.csv')
X = spam_data.iloc[:, :-1]  # input features
y = spam_data.iloc[:, -1]   # target variable

# Bernoulli Naive Bayes: per-fold accuracy and out-of-fold predictions (10-fold CV)
bnb = BernoulliNB()
bnb_scores = cross_val_score(bnb, X, y, cv=10)
bnb_pred = cross_val_predict(bnb, X, y, cv=10)

# Multinomial Naive Bayes: per-fold accuracy and out-of-fold predictions (10-fold CV)
mnb = MultinomialNB()
mnb_scores = cross_val_score(mnb, X, y, cv=10)
mnb_pred = cross_val_predict(mnb, X, y, cv=10)

# Gaussian Naive Bayes: per-fold accuracy and out-of-fold predictions (10-fold CV)
gnb = GaussianNB()
gnb_scores = cross_val_score(gnb, X, y, cv=10)
gnb_pred = cross_val_predict(gnb, X, y, cv=10)

# Mean accuracy across folds, plus precision, recall, and F1 from the
# out-of-fold predictions
bnb_acc = bnb_scores.mean()
bnb_report = classification_report(y, bnb_pred, digits=4)

mnb_acc = mnb_scores.mean()
mnb_report = classification_report(y, mnb_pred, digits=4)

gnb_acc = gnb_scores.mean()
gnb_report = classification_report(y, gnb_pred, digits=4)

# Print the results
print("Bernoulli Naive Bayes Accuracy: {:.4f}".format(bnb_acc))
print(bnb_report)
print("Multinomial Naive Bayes Accuracy: {:.4f}".format(mnb_acc))
print(mnb_report)
print("Gaussian Naive Bayes Accuracy: {:.4f}".format(gnb_acc))
print(gnb_report)
```


Bernoulli Naive Bayes Accuracy: 0.8839
              precision    recall  f1-score   support

           0     0.8854    0.9286    0.9065      2788
           1     0.8813    0.8151    0.8469      1812

    accuracy                         0.8839      4600
   macro avg     0.8833    0.8719    0.8767      4600
weighted avg     0.8838    0.8839    0.8830      4600

Multinomial Naive Bayes Accuracy: 0.7861
              precision    recall  f1-score   support

           0     0.8203    0.8286    0.8244      2788
           1     0.7321    0.7208    0.7264      1812

    accuracy                         0.7861      4600
   macro avg     0.7762    0.7747    0.7754      4600
weighted avg     0.7855    0.7861    0.7858      4600

Gaussian Naive Bayes Accuracy: 0.8217
              precision    recall  f1-score   support

           0     0.9633    0.7339    0.8331      2788
           1     0.7003    0.9570    0.8088      1812

    accuracy                         0.8217      4600
   macro a