<a href="https://colab.research.google.com/github/M-Abbi/Financial-Modeling/blob/main/Naive_Bayes_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Naive Bayes Model: A Simple Explanation
Naive Bayes is a probabilistic machine learning algorithm used primarily for classification tasks. It's based on Bayes' theorem with a crucial simplifying assumption: the features are conditionally independent given the class label. This "naive" assumption is what makes the algorithm computationally efficient and surprisingly effective in many real-world applications, especially text classification.

**Bayes' Theorem:**

Bayes' theorem provides a way to calculate the posterior probability P(C∣X) of a class C given the observed features X:

$P(C∣X) = \frac{P(X∣C) \cdot P(C)}{P(X)}$


Where:

- P(C∣X) is the posterior probability of class C given features X. This is what we want to calculate (e.g., the probability that an email is spam given its words).
- P(X∣C) is the likelihood of observing features X given that the class is C (e.g., the probability of seeing these words in a spam email).
- P(C) is the prior probability of class C (e.g., the overall probability of an email being spam).
- P(X) is the marginal likelihood or evidence (the probability of observing the features X). Since this is the same for all classes we are comparing, it's often ignored for classification purposes (as we are interested in which class has the highest posterior probability).
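To make the terms concrete, here is a minimal sketch that applies Bayes' theorem to the spam example with a single word as the feature. All of the probabilities are hypothetical numbers chosen for illustration; they do not come from any real dataset.

```python
# Hypothetical numbers, chosen only to illustrate the formula.
p_spam = 0.2               # prior P(C = spam)
p_ham = 1 - p_spam         # prior P(C = not spam)
p_word_given_spam = 0.6    # likelihood P(X | spam)
p_word_given_ham = 0.05    # likelihood P(X | not spam)

# Evidence P(X), obtained by summing over both classes
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Posterior P(spam | X) from Bayes' theorem
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(spam | word) = {p_spam_given_word:.3f}")
```

Because P(X) is the same for both classes, comparing the numerators `p_word_given_spam * p_spam` and `p_word_given_ham * p_ham` would already be enough to pick the more probable class.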

**The "Naive" Assumption:**

The "naive" part of Naive Bayes comes from the assumption that all features in X are conditionally independent of each other given the class C. If $X=(x_1, x_2, ..., x_n)$, then:

$P(X∣C) = P(x_1∣C) \cdot P(x_2∣C) \cdot ... \cdot P(x_n∣C)$

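This assumption rarely holds exactly in real data (words in an email, for instance, are correlated). Naive Bayes still performs well in many practical scenarios, because classification only requires the correct class to receive the highest posterior, not that the estimated probabilities themselves be accurate.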
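The factorization can be illustrated with a short sketch. The per-word likelihoods below are hypothetical, and equal priors are assumed so that comparing likelihoods is enough to compare classes. The sketch also computes the same quantity as a sum of logs, which is a common way to avoid numerical underflow when many small probabilities are multiplied.

```python
import math

# Hypothetical per-word likelihoods P(x_i | C) for a three-word message.
likelihoods = {
    "spam": [0.6, 0.3, 0.5],
    "ham": [0.05, 0.4, 0.2],
}

# P(X | C) as a product of the per-feature likelihoods
joint_likelihood = {label: math.prod(probs) for label, probs in likelihoods.items()}

# The same quantity in log space: a sum instead of a product
log_likelihood = {
    label: sum(math.log(p) for p in probs) for label, probs in likelihoods.items()
}

# With equal priors (an assumption here), the larger likelihood wins
best = max(joint_likelihood, key=joint_likelihood.get)
print(joint_likelihood, best)
```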

**How it Works (Simplified):**

1. Calculate Prior Probabilities: For each class, the algorithm calculates the prior probability P(C) based on the frequency of that class in the training data.

2. Calculate Likelihoods: For each feature $x_i$ and each class C, the algorithm estimates the likelihood $P(x_i∣C)$. The way this is done depends on the type of the feature (e.g., Gaussian for continuous features, multinomial for discrete counts like word frequencies, Bernoulli for binary features).

3. Calculate Posterior Probabilities: For a new data point with features X, the algorithm uses Bayes' theorem and the naive independence assumption to calculate the posterior probability $P(C∣X)$ for each class C.

4. Make Prediction: The algorithm predicts the class with the highest posterior probability.
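The four steps above can be sketched from scratch with NumPy for the Gaussian case. This is a minimal illustration, not how any particular library implements it: the data is made up, the helper name `log_posterior` is hypothetical, and the sketch assumes every feature has non-zero variance within each class.

```python
import numpy as np

# Hypothetical training data: two continuous features, two classes.
X = np.array([[1.0, 2.1], [1.2, 1.9], [0.8, 2.0],
              [3.0, 4.2], [3.2, 3.8], [2.9, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y)

# Step 1: prior P(C) from class frequencies
priors = np.array([np.mean(y == c) for c in classes])

# Step 2: Gaussian likelihood parameters (mean and variance per class and feature)
means = np.array([X[y == c].mean(axis=0) for c in classes])
variances = np.array([X[y == c].var(axis=0) for c in classes])

def log_posterior(x):
    """Unnormalized log posterior for each class; the evidence P(X) is omitted."""
    log_lik = -0.5 * (np.log(2 * np.pi * variances) + (x - means) ** 2 / variances)
    # Step 3: log P(C) plus the sum of per-feature log likelihoods
    return np.log(priors) + log_lik.sum(axis=1)

# Step 4: predict the class with the highest posterior
x_new = np.array([1.1, 2.0])
predicted = classes[np.argmax(log_posterior(x_new))]
print("Predicted class:", predicted)
```

Working with log probabilities turns the product from the naive assumption into a sum, which is numerically safer and does not change which class scores highest.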

**Types of Naive Bayes Classifiers:**

- Gaussian Naive Bayes: Assumes that the continuous features in each class follow a Gaussian (normal) distribution.
- Multinomial Naive Bayes: Typically used for discrete data, such as word counts in text classification.
- Bernoulli Naive Bayes: Suitable for binary or boolean features (e.g., whether a word is present or not in a document).
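As a rough illustration of how the discrete variants differ, the sketch below fits scikit-learn's `MultinomialNB` on word counts and `BernoulliNB` on the same data reduced to presence/absence. The tiny count matrix and its three-word vocabulary are hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical word counts for the vocabulary ["free", "meeting", "win"].
counts = np.array([[3, 0, 2],
                   [2, 0, 1],
                   [0, 2, 0],
                   [0, 3, 1]])
labels = np.array([1, 1, 0, 0])  # 1 = spam, 0 = not spam

# Multinomial: uses how often each word occurs
multinomial = MultinomialNB().fit(counts, labels)

# Bernoulli: uses only whether each word occurs at all
presence = (counts > 0).astype(int)
bernoulli = BernoulliNB().fit(presence, labels)

new_counts = np.array([[2, 0, 1]])
print("Multinomial:", multinomial.predict(new_counts))
print("Bernoulli:  ", bernoulli.predict((new_counts > 0).astype(int)))
```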

**Advantages of Naive Bayes:**

- Simple and easy to implement.
- Computationally efficient, especially for large datasets.
- Performs well even with limited training data.
- Often works surprisingly well in practice, particularly for text classification.

**Disadvantages of Naive Bayes:**

- The naive independence assumption is often unrealistic.
- Can have issues if a feature value in the test data was not present in the training data for a particular class (zero probability problem). This can be addressed using smoothing techniques (e.g., Laplace smoothing).
- Cannot capture interactions between features, because each feature contributes to the posterior independently of the others.
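The zero probability problem and its Laplace (add-one) fix can be shown with a small sketch. The tokens and vocabulary are hypothetical, and `alpha = 1` is the conventional add-one choice rather than a required value.

```python
from collections import Counter

# Hypothetical tokens observed in spam training emails.
spam_tokens = ["free", "win", "free", "prize"]
vocabulary = ["free", "win", "prize", "meeting"]  # "meeting" never appears in spam

counts = Counter(spam_tokens)
total = len(spam_tokens)
alpha = 1  # smoothing strength; 1 gives Laplace (add-one) smoothing

# Without smoothing, an unseen word gets probability 0 and zeroes out the whole product
unsmoothed = {w: counts[w] / total for w in vocabulary}

# With smoothing, every word gets a small non-zero probability
smoothed = {
    w: (counts[w] + alpha) / (total + alpha * len(vocabulary)) for w in vocabulary
}

print("P(meeting | spam) unsmoothed:", unsmoothed["meeting"])
print("P(meeting | spam) smoothed:  ", smoothed["meeting"])
```

The smoothed estimates still sum to 1 over the vocabulary, so they remain a valid probability distribution.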

# Toy Example in Python (Gaussian Naive Bayes)
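Let's create a simple dataset with two numeric features (a continuous weight and a binary-coded color) and two classes, and then train a Gaussian Naive Bayes classifier. Note that `GaussianNB` models both features as Gaussian, including the binary color.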

```python
from sklearn.naive_bayes import GaussianNB

# Very simple dataset:
# Features: [weight, color], where color is 1 for 'light' and 0 for 'dark'
# Class: 0 (small fruit), 1 (large fruit)
X = [[100, 1],   # small, light
     [120, 1],
     [150, 0],   # large, dark
     [180, 0]]
y = [0, 0, 1, 1]

# Create a Gaussian Naive Bayes classifier
model = GaussianNB()

# Train the model
model.fit(X, y)

# Example of a new, unseen fruit
new_fruit = [[130, 1]]  # weight 130, light color

# Make a prediction
prediction = model.predict(new_fruit)

print(f"Prediction for a fruit with weight 130 and color 1: Class {prediction[0]}")

# Another example
another_fruit = [[90, 0]]  # weight 90, dark color
another_prediction = model.predict(another_fruit)
print(f"Prediction for a fruit with weight 90 and color 0: Class {another_prediction[0]}")
```

```
Prediction for a fruit with weight 130 and color 1: Class 0
Prediction for a fruit with weight 90 and color 0: Class 1
```
