# Naive Bayes

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

## Pros and Cons

Pros

- Despite their simplifying assumption, naive Bayes classifiers work well in practice, especially on **document classification** and **spam filtering**.
- Require only a small training set to estimate the necessary parameters.
- Fast compared to more sophisticated methods.
- Each feature's distribution is estimated independently as a one-dimensional distribution, which helps alleviate the curse of dimensionality.
- Can be used to tackle large-scale classification problems for which the full training set might not fit in memory, by training incrementally with `partial_fit`.
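The last point can be illustrated with a minimal sketch of incremental training. The data here is synthetic and held in memory only for the sake of the example; in a real out-of-core setting each chunk would be read from disk or a stream. The chunk size of 100 and the choice of `MultinomialNB` are assumptions made for illustration.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)

# Synthetic count features standing in for a dataset too large to load at once.
X = rng.randint(0, 10, size=(1000, 20))
y = rng.randint(0, 3, size=1000)

clf = MultinomialNB()
classes = np.unique(y)  # every class label must be known on the first call

# Feed the data to the model in chunks of 100 rows.
chunk_size = 100
for start in range(0, X.shape[0], chunk_size):
    stop = start + chunk_size
    clf.partial_fit(X[start:stop], y[start:stop], classes=classes)

# The model has accumulated counts from all chunks.
print("samples seen:", int(clf.class_count_.sum()))
```

Because this model only accumulates per-class counts, training in chunks should give the same parameters as a single call to `fit` on the full data.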

Cons

- A **good classifier**, but a **bad estimator**: the predicted class is often right, but the probability outputs (for example from `predict_proba`) should not be taken too seriously.
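One way to see why the probabilities are unreliable is to violate the independence assumption on purpose. The sketch below copies every iris feature three times, which adds no information, and compares the top-class probabilities before and after. The equal priors are an assumption made so that only the likelihood term changes between the two models.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
uniform = [1 / 3, 1 / 3, 1 / 3]  # fixed equal priors (assumption for this demo)

# Model on the original four features.
proba = GaussianNB(priors=uniform).fit(X, y).predict_proba(X)

# Same data with each feature repeated three times: perfectly dependent features.
X_dup = np.tile(X, (1, 3))
proba_dup = GaussianNB(priors=uniform).fit(X_dup, y).predict_proba(X_dup)

# Probability assigned to the predicted class for each sample.
conf = proba.max(axis=1)
conf_dup = proba_dup.max(axis=1)

print(f"mean top-class probability, original features:   {conf.mean():.4f}")
print(f"mean top-class probability, duplicated features: {conf_dup.mean():.4f}")
```

The duplicated features are counted as independent evidence, so the model becomes more confident without having learned anything new.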

## Different Naive Bayes algorithms

1. Gaussian Naive Bayes
2. Multinomial Naive Bayes
3. Complement Naive Bayes
4. Bernoulli Naive Bayes
5. Categorical Naive Bayes
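Each of these variants has a corresponding estimator in `sklearn.naive_bayes`. The following sketch fits all five on the same small random integer matrix, chosen only because non-negative integers are accepted by every variant. The labels are random, so the printed accuracies carry no meaning; the aim is only to show that the classes share the same `fit`/`predict` interface.

```python
import numpy as np
from sklearn.naive_bayes import (
    BernoulliNB,
    CategoricalNB,
    ComplementNB,
    GaussianNB,
    MultinomialNB,
)

rng = np.random.RandomState(0)
X = rng.randint(0, 5, size=(60, 8))  # non-negative integers work for all variants
y = rng.randint(0, 3, size=60)       # random labels, for illustration only

variants = {
    "GaussianNB": GaussianNB(),        # continuous features
    "MultinomialNB": MultinomialNB(),  # count features, e.g. word counts
    "ComplementNB": ComplementNB(),    # count features, aimed at imbalanced classes
    "BernoulliNB": BernoulliNB(),      # binary features (binarizes at 0.0 by default)
    "CategoricalNB": CategoricalNB(),  # integer-encoded categorical features
}

for name, clf in variants.items():
    y_hat = clf.fit(X, y).predict(X)
    print(f"{name:<14} training accuracy: {(y_hat == y).mean():.2f}")
```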

## How it works

![image.png](attachment:image.png)

![image.png](attachment:image.png)

We can then use Maximum A Posteriori (MAP) estimation to estimate P(y) and P(xi|y); the former is simply the relative frequency of class y in the training set.
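For reference, the reasoning above can be written out in symbols. This is the standard textbook formulation, assumed here to match the derivation in the figures; the notation $x_1, \dots, x_n$ for the features and $\hat{y}$ for the predicted class is introduced for this sketch.

```latex
\begin{align*}
% Bayes' theorem for class y given features x_1, ..., x_n
P(y \mid x_1, \dots, x_n) &= \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} \\
% Naive assumption: each feature is conditionally independent of the others given y
P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) &= P(x_i \mid y) \\
% The denominator is constant for a given input, so it can be dropped
P(y \mid x_1, \dots, x_n) &\propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \\
% Classification rule
\hat{y} &= \arg\max_{y}\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
\end{align*}
```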

The different naive Bayes classifiers differ mainly by the assumptions they make regarding the distribution of P(xi|y).
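As a concrete case, Gaussian naive Bayes assumes each P(xi|y) is a normal distribution with a per-class mean and variance. The sketch below re-implements that rule with NumPy and checks how often it agrees with `GaussianNB`. The helper `joint_log_likelihood` and the smoothing constant `eps` are assumptions of this sketch; `eps` is chosen to mirror the default `var_smoothing=1e-9` and is not taken from the library's source.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# P(y): relative frequency of each class in the training data.
priors = np.array([(y == c).mean() for c in classes])

# P(x_i | y): one Gaussian per (class, feature) pair.
means = np.array([X[y == c].mean(axis=0) for c in classes])
eps = 1e-9 * X.var(axis=0).max()  # small variance floor for numerical stability
variances = np.array([X[y == c].var(axis=0) for c in classes]) + eps


def joint_log_likelihood(X):
    """Return log P(y) + sum_i log P(x_i | y), shape (n_samples, n_classes)."""
    diff = X[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    return np.log(priors)[None, :] + log_pdf.sum(axis=2)


# Pick the class with the highest posterior (up to a constant).
y_manual = classes[joint_log_likelihood(X).argmax(axis=1)]
y_sklearn = GaussianNB().fit(X, y).predict(X)

print("agreement with GaussianNB:", (y_manual == y_sklearn).mean())
```

Working in log space avoids underflow when many small probabilities are multiplied together.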

https://scikit-learn.org/stable/modules/naive_bayes.html#categorical-naive-bayes

---

## Example usage

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

gnb = GaussianNB()
model = gnb.fit(X_train, y_train)

y_pred = model.predict(X_test)

_total = X_test.shape[0]
_mislabeled = (y_test != y_pred).sum()
print(f"Number of mislabeled points out of a total {_total} points: {_mislabeled}")

Number of mislabeled points out of a total 75 points: 4
