#### About

> Naive Bayes

Naive Bayes is a probabilistic classifier based on Bayes' theorem. It assumes that the features used for classification are conditionally independent given the class label.

Despite this simplifying assumption, Naive Bayes classifiers often perform well in real-world classification tasks, especially when the features are approximately independent within each class.

> Mathematics

Bayes' theorem expresses the conditional probability of a class given the features in terms of the conditional probability of the features given the class:

P(y|x) = P(x|y) * P(y) / P(x)

where:

- P(y|x) is the posterior: the conditional probability of class y given the features x.
- P(x|y) is the likelihood: the conditional probability of features x given class y.
- P(y) is the prior probability of class y.
- P(x) is the marginal probability of features x, also called the evidence.
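
As a quick numeric illustration of the formula above, here is a minimal sketch for a two-class case. The function name `bayes_posterior` and all the probabilities are hypothetical, chosen only to make the arithmetic easy to follow. P(x) is computed with the law of total probability over the class and its complement.

```python
def bayes_posterior(prior, likelihood, likelihood_given_other):
    """Return P(y|x) for a binary class using Bayes' theorem."""
    # P(x) = P(x|y) * P(y) + P(x|not y) * P(not y)
    evidence = likelihood * prior + likelihood_given_other * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: P(y) = 0.2, P(x|y) = 0.9, P(x|not y) = 0.05
p = bayes_posterior(prior=0.2, likelihood=0.9, likelihood_given_other=0.05)
print(f"P(y|x) = {p:.3f}")
```

With these numbers the posterior is 0.18 / 0.22, so observing x raises the probability of class y from 0.2 to roughly 0.82.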

The Naive Bayes classifier assumes that the features x = (x1, x2, ..., xn) are conditionally independent given the class y, so the likelihood factorizes as P(x|y) = P(x1|y) * P(x2|y) * ... * P(xn|y). Because P(x) is the same for every class, it can be dropped when comparing classes, which gives:

P(y|x) ∝ P(y) * P(x1|y) * P(x2|y) * ... * P(xn|y)

where x1, x2, ..., xn are the features used for classification, and P(x1|y), P(x2|y), ..., P(xn|y) are the conditional probabilities of each feature given class y. The predicted class is the y that maximizes this product. This factorization is the "naive" assumption in Naive Bayes: the features are treated as independent of each other given the class label.
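
The factorized rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a library API: the helper `naive_bayes_posterior`, the class names, and every probability are hypothetical. The sum of the per-class scores plays the role of P(x), so dividing by it turns the scores into posteriors.

```python
import math


def naive_bayes_posterior(priors, feature_likelihoods):
    """Compute P(y|x) from P(y) and the per-feature likelihoods P(xi|y).

    priors: dict mapping class -> P(y)
    feature_likelihoods: dict mapping class -> list of P(xi|y)
        for the observed feature values
    """
    # Unnormalized score per class: P(y) * P(x1|y) * ... * P(xn|y)
    scores = {y: priors[y] * math.prod(feature_likelihoods[y]) for y in priors}
    total = sum(scores.values())  # plays the role of P(x)
    return {y: score / total for y, score in scores.items()}


# Hypothetical two-class, two-feature example
priors = {"a": 0.6, "b": 0.4}
feature_likelihoods = {"a": [0.5, 0.2], "b": [0.1, 0.9]}

post = naive_bayes_posterior(priors, feature_likelihoods)
predicted = max(post, key=post.get)
print(post, predicted)
```

Here the scores are 0.6 * 0.5 * 0.2 = 0.06 for class "a" and 0.4 * 0.1 * 0.9 = 0.036 for class "b", so "a" is predicted with a posterior of 0.625.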

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

In [2]:
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

In [3]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [4]:
# Create a Gaussian Naive Bayes classifier
gnb = GaussianNB()


In [5]:
# Train the classifier
gnb.fit(X_train, y_train)


In [6]:
# Make predictions on the test set
y_pred = gnb.predict(X_test)


In [7]:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

Accuracy: 1.00
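
To connect the scikit-learn cells back to the math, here is a rough from-scratch sketch of the idea behind Gaussian Naive Bayes: each P(xi|y) is modeled as a normal distribution whose mean and variance are estimated per class. The function names `fit_gaussian_nb` and `predict_gaussian_nb` and the toy data are hypothetical. The fixed `1e-9` added to each variance is a simplification for numerical safety and is not how `GaussianNB` applies its smoothing. Scores are summed in log space to avoid underflow when many small probabilities are multiplied.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and the per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Maximum-likelihood variance plus a small constant to avoid division by zero
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class maximizing log P(y) + sum_i log P(xi|y)."""
    def log_score(params):
        prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood

    return max(model, key=lambda label: log_score(model[label]))


# Tiny hypothetical dataset: two well-separated classes, two features
X_toy = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y_toy = [0, 0, 0, 1, 1, 1]

toy_model = fit_gaussian_nb(X_toy, y_toy)
print(predict_gaussian_nb(toy_model, [1.1, 2.1]))
print(predict_gaussian_nb(toy_model, [5.1, 5.9]))
```

The toy variables are named `X_toy` and `y_toy` so they do not overwrite the Iris arrays defined in the cells above.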
