<h1 style = "color : dodgerblue"> Naive Bayes Classification </h1>

<h2 style = "color : DeepSkyBlue"> An Overview of Naive Bayes Classification </h2>

* Naive Bayes is a probabilistic machine learning algorithm used for classification tasks.

* It's based on Bayes' Theorem and is called "naive" because it makes a simplifying assumption: the features in a dataset are independent of each other, given the class.

* This assumption is often unrealistic in real-world situations, but it simplifies the computation and can still produce good results in many cases.

<h2 style = "color : DeepSkyBlue"> What Are Naive Bayes Classifiers? </h2>

* Naive Bayes classifiers are a collection of classification algorithms based on Bayes' Theorem. They are not a single algorithm but a family of algorithms that share a common principle: every pair of features is assumed to be independent of each other, given the class.

* As one of the simplest and most effective classification algorithms, the Naive Bayes classifier supports rapid development of machine learning models that make fast predictions.

* Naive Bayes is used for classification problems and is especially common in text classification, where the data is high-dimensional because each word represents one feature. Typical uses include spam filtering, sentiment detection, and rating classification. Its main advantage is speed: training and prediction remain fast even with high-dimensional data.

* The model predicts the probability that an instance belongs to a class given a set of feature values, which makes it a probabilistic classifier. It is "naive" because it assumes that the presence of one feature is independent of the presence of any other, given the class. In other words, each feature contributes to the prediction independently of the others. In real-world data this condition is rarely satisfied. The algorithm uses Bayes' Theorem for both training and prediction.

<h2 style = "color : DeepSkyBlue"> Why Is It Called Naive Bayes? </h2>

It's called "Naive Bayes" because of its simplifying assumption about the data it handles.

* <b style = "color : orangered">Naive:</b> The term "naive" refers to the assumption that the features used in the model are independent of each other, given the class. In reality, this is often not true, because features can be correlated. However, the assumption simplifies the computation and, despite being naive, the classifier often performs surprisingly well in practice.

* <b style = "color : orangered"> Bayes: </b> The "Bayes" part of the name comes from Bayes' Theorem, which underpins the algorithm. Bayes' Theorem allows us to update our prior beliefs about the probability of a hypothesis based on new evidence. Naive Bayes uses this theorem to calculate the probability of each class given the data.

So, putting it all together, Naive Bayes is a probabilistic model that makes a "naive" assumption of independence among features and leverages Bayes' Theorem to perform classification.

<h2 style = "color : DeepSkyBlue"> Bayes' Theorem </h2>

Bayes' Theorem is the foundation of Naive Bayes classification. It describes the probability of an event, based on prior knowledge of conditions that might be related to the event. The theorem is stated as:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Where:

* $P(A \mid B)$ is the posterior probability of class $A$ given the evidence $B$.

* $P(B \mid A)$ is the likelihood of the evidence $B$ given class $A$.

* $P(A)$ is the prior probability of class $A$.

* $P(B)$ is the marginal probability of the evidence $B$, which acts as a normalizing constant.
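
As a quick numeric illustration of the formula, the sketch below computes a posterior from made-up numbers. The function name `posterior_probability` and all probabilities are assumptions chosen for the example, not values from any dataset. The evidence term is expanded with the law of total probability.

```python
def posterior_probability(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) using Bayes' Theorem.

    P(B) is expanded as P(B|A) * P(A) + P(B|not A) * P(not A).
    """
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers: 20% of emails are spam (A); the word "free" (B)
# appears in 60% of spam emails and in 5% of non-spam emails.
p_spam_given_free = posterior_probability(
    prior_a=0.2,
    likelihood_b_given_a=0.6,
    likelihood_b_given_not_a=0.05,
)
print(round(p_spam_given_free, 3))  # 0.75
```

Even though only 20% of emails are spam under these assumed numbers, seeing "free" raises the probability of spam to 75%, because the word is far more common in spam than elsewhere.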

<h2 style = "color : DeepSkyBlue"> Naive Bayes Classifier </h2>

Naive Bayes classifiers apply Bayes' Theorem with the "naive" assumption of conditional independence between every pair of features given the value of the class variable.
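
Written out for a feature vector $(x_1, \dots, x_n)$ and a class $C_k$ (notation introduced here for illustration), the independence assumption lets the likelihood factor into one term per feature:

$$P(C_k \mid x_1, \dots, x_n) = \frac{P(C_k) \prod_{i=1}^{n} P(x_i \mid C_k)}{P(x_1, \dots, x_n)}$$

The denominator is the same for every class, so it can be dropped when only the most probable class is needed:

$$\hat{y} = \arg\max_{k} \; P(C_k) \prod_{i=1}^{n} P(x_i \mid C_k)$$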

<b style = "color : coral">Steps to Implement Naive Bayes Classification</b>

<b style = "color : orangered">1. Calculate the Prior Probability:</b> This is the initial probability of each class without any evidence. It can be estimated from the training data as the proportion of each class.

<b style = "color : orangered">2. Calculate the Likelihood:</b> For each feature in the data, calculate the likelihood of the feature value given each class. This involves estimating the probability distribution of the features.

<b style = "color : orangered">3. Calculate the Posterior Probability:</b> Use Bayes' Theorem to calculate the posterior probability of each class given the feature values of a new instance.

<b style = "color : orangered">4. Predict the Class:</b> The class with the highest posterior probability is chosen as the predicted class for the new instance.
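
The four steps above can be sketched from scratch for binary features. This is a minimal illustration rather than a reference implementation: the function names `fit_naive_bayes` and `predict_posterior`, the toy data, and the use of additive smoothing (`alpha`) are all assumptions made for the example.

```python
import numpy as np


def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate class priors and per-feature likelihoods from binary data."""
    classes = np.unique(y)
    # Step 1: prior P(class) is the share of training rows with that label.
    priors = np.array([np.mean(y == c) for c in classes])
    # Step 2: likelihood P(feature = 1 | class). Additive smoothing (alpha)
    # keeps every estimate strictly between 0 and 1.
    likelihoods = np.array([
        (X[y == c].sum(axis=0) + alpha) / ((y == c).sum() + 2 * alpha)
        for c in classes
    ])
    return classes, priors, likelihoods


def predict_posterior(x, priors, likelihoods):
    """Step 3: posterior P(class | x) for one binary feature vector x."""
    # Sum logs instead of multiplying probabilities to avoid underflow.
    log_joint = np.log(priors) + (
        x * np.log(likelihoods) + (1 - x) * np.log(1 - likelihoods)
    ).sum(axis=1)
    joint = np.exp(log_joint - log_joint.max())
    return joint / joint.sum()


# Toy data invented for illustration: 3 binary features, 2 classes.
X = np.array([[1, 0, 1], [1, 1, 1], [0, 1, 0],
              [0, 0, 0], [1, 1, 0], [0, 0, 1]])
y = np.array([1, 1, 0, 0, 0, 1])

classes, priors, likelihoods = fit_naive_bayes(X, y)
probs = predict_posterior(np.array([1, 0, 1]), priors, likelihoods)
# Step 4: choose the class with the highest posterior.
predicted = classes[np.argmax(probs)]
print(probs, predicted)
```

Working in log space is a common design choice here: with many features, the product of small probabilities can underflow to zero, while the sum of their logarithms stays well-behaved.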

<h2 style = "color : DeepSkyBlue"> Types of Naive Bayes Classifiers </h2>

<b style = "color : orangered">1. Gaussian Naive Bayes:</b> Assumes that continuous features follow a normal (Gaussian) distribution within each class.

<b style = "color : orangered">2. Multinomial Naive Bayes:</b> Used for discrete features like word counts in text classification problems.

<b style = "color : orangered">3. Bernoulli Naive Bayes:</b> Used for binary/boolean features.
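
To make the distinction concrete, the sketch below pairs each variant with the kind of data it expects, using scikit-learn's `GaussianNB`, `MultinomialNB`, and `BernoulliNB`. The arrays are invented purely for illustration, and each model is checked on its first training row only to show the API, not to measure performance.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Toy labels and features, invented for illustration.
y = np.array([0, 0, 0, 1, 1, 1])

# Continuous measurements suit GaussianNB.
X_continuous = np.array([[1.0, 2.1], [0.9, 1.9], [1.1, 2.0],
                         [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]])
# Non-negative counts (such as word counts) suit MultinomialNB.
X_counts = np.array([[3, 0, 1], [2, 0, 0], [4, 1, 0],
                     [0, 3, 2], [0, 2, 4], [1, 3, 3]])
# Binary presence/absence flags suit BernoulliNB.
X_binary = (X_counts > 0).astype(int)

models = {
    "GaussianNB": (GaussianNB(), X_continuous),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "BernoulliNB": (BernoulliNB(), X_binary),
}

predictions = {}
for name, (model, X) in models.items():
    model.fit(X, y)
    # Predict the class of the first training row.
    predictions[name] = int(model.predict(X[:1])[0])
    print(name, predictions[name], model.predict_proba(X[:1]).round(3))
```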

<h2 style = "color : DeepSkyBlue"> Example </h2>

<b style = "color : orangered"> Imagine we want to classify whether an email is spam or not spam based on the presence of certain keywords. Let's say our features are "offer," "win," and "free": </b>

1. Calculate the prior probability of spam and not spam from the training data.

2. For each keyword, calculate the likelihood of the keyword appearing in spam and not spam emails.

3. Given a new email, use Bayes' Theorem to compute the posterior probability of the email being spam or not spam based on the presence of these keywords.

4. Classify the email as spam or not spam based on the higher posterior probability.
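
The arithmetic behind these steps can be worked through by hand. In the sketch below every probability is hypothetical, invented only to show the calculation; in practice they would be estimated from the training emails.

```python
# Hypothetical probabilities, invented for illustration only.
prior = {"spam": 0.4, "not_spam": 0.6}
# P(keyword present | class)
likelihood = {
    "spam": {"offer": 0.7, "win": 0.6, "free": 0.8},
    "not_spam": {"offer": 0.2, "win": 0.1, "free": 0.3},
}

# New email: contains "offer" and "free", but not "win".
email = {"offer": True, "win": False, "free": True}

# Unnormalized score per class: prior times one term per keyword.
scores = {}
for label in prior:
    score = prior[label]
    for word, present in email.items():
        p = likelihood[label][word]
        score *= p if present else (1.0 - p)
    scores[label] = score

# Normalize so the posteriors sum to 1, then pick the larger one.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}
prediction = max(posteriors, key=posteriors.get)
print(posteriors, prediction)
```

With these assumed numbers the spam score is 0.4 × 0.7 × 0.4 × 0.8 = 0.0896 and the not-spam score is 0.6 × 0.2 × 0.9 × 0.3 = 0.0324, so the email is classified as spam with a posterior of about 0.73.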

<h2 style = "color : DeepSkyBlue"> Advantages of Naive Bayes </h2>

<b style = "color : orangered">1. Simple and Fast:</b> Easy to implement and computationally efficient.

<b style = "color : orangered">2. Scalable:</b> Works well with large datasets.

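<b style = "color : orangered">3. Handles Missing Data:</b> Can, in principle, handle missing data by leaving missing feature values out of the probability calculation during classification, although whether this is supported depends on the implementation.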

<h2 style = "color : DeepSkyBlue"> Limitations of Naive Bayes </h2>

<b style = "color : orangered"> 1. Independence Assumption: </b> The assumption that features are independent is rarely true in real-world data.

<b style = "color : orangered"> 2. Zero Probability: </b> If a feature value never appears with a given class in the training data, its estimated likelihood is zero, which drives the entire posterior for that class to zero. This can be handled with techniques like Laplace smoothing.
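
The effect of Laplace smoothing is easy to see on a small count vector. The sketch below uses a hypothetical helper, `word_likelihoods`, and invented word counts; it follows the usual additive-smoothing estimate, (count + alpha) / (total + alpha × vocabulary size).

```python
import numpy as np


def word_likelihoods(word_counts, alpha):
    """Estimate P(word | class) from the word counts of one class.

    alpha = 0 gives the raw relative frequency; alpha = 1 is Laplace
    (add-one) smoothing.
    """
    word_counts = np.asarray(word_counts, dtype=float)
    vocabulary_size = word_counts.shape[0]
    return (word_counts + alpha) / (word_counts.sum() + alpha * vocabulary_size)


# Hypothetical counts of ("offer", "win", "free") in spam training emails.
# "win" never appeared in spam, so its unsmoothed likelihood is exactly zero.
spam_counts = [6, 0, 4]

unsmoothed = word_likelihoods(spam_counts, alpha=0.0)
smoothed = word_likelihoods(spam_counts, alpha=1.0)
print(unsmoothed)  # [0.6 0.  0.4]
print(smoothed)    # 7/13, 1/13, 5/13
```

In scikit-learn, `MultinomialNB` and `BernoulliNB` expose this through their `alpha` parameter, which defaults to 1.0.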

<h2 style = "color : DeepSkyBlue"> Applications </h2>

<b style = "color : orangered">1. Spam Email Filtering:</b> Classifies emails as spam or non-spam based on features.

<b style = "color : orangered">2. Text Classification:</b> Used in sentiment analysis, document categorization, and topic classification.

<b style = "color : orangered">3. Medical Diagnosis:</b> Helps in predicting the likelihood of a disease based on symptoms.

<b style = "color : orangered">4. Credit Scoring:</b> Evaluates creditworthiness of individuals for loan approval.

<b style = "color : orangered">5. Weather Prediction:</b> Classifies weather conditions based on various factors.

<h2 style = "color : DeepSkyBlue"> Example: Spam Classification with scikit-learn </h2>

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

# Sample dataset: one row per email, one binary column per keyword.
# Column order, as implied by the row comments: "Offer", "Free", "Win".
emails = np.array([
    [1, 0, 1],  # "Offer", "Win"
    [0, 1, 0],  # "Free"
    [1, 1, 1],  # "Offer", "Free", "Win"
    [0, 0, 0],  # None
    [1, 1, 0]   # "Offer", "Free"
])

labels = np.array([1, 0, 1, 0, 0])  # 1 = Spam, 0 = Not Spam

# Split the data: 20% of 5 rows leaves a single email for testing
X_train, X_test, y_train, y_test = train_test_split(emails, labels, test_size=0.2, random_state=42)

# Train the Naive Bayes classifier
model = MultinomialNB()
model.fit(X_train, y_train)

# Make predictions on the held-out email
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print('Classification Report:')
print(report)
```

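```
Accuracy: 1.0
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         1

    accuracy                           1.00         1
   macro avg       1.00      1.00      1.00         1
weighted avg       1.00      1.00      1.00         1
```

The test set here contains a single email (support = 1), so the perfect score shows only that the pipeline runs end to end; it is not a meaningful estimate of the model's accuracy.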

