# Benchmark

### Explanation of the Code

1. **Initialization**: The `NaiveBayes` class initializes dictionaries to hold the means, variances, and prior probabilities for each class.

2. **Fit Method**:
   - For each class, the `fit` method computes the mean and variance of every feature (the population variance, since `var` uses NumPy's default `ddof=0`) and sets the prior to the fraction of training samples that belong to that class.

3. **Predict Method**:
   - The `predict` method scores each class by its log-posterior, up to a constant shared by all classes: the log prior plus the sum of the per-feature Gaussian log-densities. It returns, for each sample, the class with the highest score.

4. **Example Usage**:
   - A small dataset with two features and a binary label is created. The model is trained on it and then predicts labels for the same six training points.
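
The score that `predict` maximizes can be written out as a formula. The notation here is an assumption introduced for illustration and does not appear in the code: $\mu_{c,j}$ and $\sigma_{c,j}^{2}$ stand for the mean and variance of feature $j$ in class $c$ (the entries of `means[c]` and `variances[c]`), and $P(c)$ stands for `priors[c]`.

$$
\hat{y} = \arg\max_{c} \left[ \log P(c) - \frac{1}{2} \sum_{j} \log\left(2\pi\sigma_{c,j}^{2}\right) - \frac{1}{2} \sum_{j} \frac{\left(x_{j} - \mu_{c,j}\right)^{2}}{\sigma_{c,j}^{2}} \right]
$$

The evidence term $\log P(x)$ is the same for every class, so dropping it does not change which class wins.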

In [5]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [10]:
class NaiveBayes:
    def __init__(self):
        self.means = {}
        self.variances = {}
        self.priors = {}
        self.classes = None

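    def fit(self, X, y):
        # Estimate per-class feature means, variances, and priors from the training data
        self.classes = np.unique(y)
        for c in self.classes:
            X_c = X[y == c]  # rows that belong to class c
            self.means[c] = X_c.mean(axis=0)
            # Population variance (ddof=0). The small constant is an assumed value
            # that guards against division by zero when a feature is constant
            # within a class.
            self.variances[c] = X_c.var(axis=0) + 1e-9
            self.priors[c] = X_c.shape[0] / X.shape[0]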

    def predict(self, X):
        # Score each class by its log-posterior, up to a constant shared by all classes
        posteriors = []
        for c in self.classes:
            prior = np.log(self.priors[c])
            # Sum of per-feature Gaussian log-densities (features treated as independent)
            likelihood = -0.5*np.sum(np.log(2 * np.pi * self.variances[c])) \
                         -0.5*np.sum(((X-self.means[c])**2)/self.variances[c], axis=1)
            posterior = prior + likelihood
            posteriors.append(posterior)

        # posteriors has one row per class and one column per sample
        return self.classes[np.argmax(posteriors, axis=0)]


In [11]:
# Create a simple dataset
data = {
    'feature1': [1.0, 2.0, 1.5, 2.5, 3.0, 3.5],
    'feature2': [1.0, 1.5, 1.2, 1.8, 2.0, 2.5],
    'label': [0, 0, 0, 1, 1, 1]
}

df = pd.DataFrame(data)
X = df[['feature1', 'feature2']].values
y = df['label'].values

# Initialize and train the Naive Bayes classifier
nb = NaiveBayes()
nb.fit(X, y)

# Make predictions
predictions = nb.predict(X)
print("Predictions:", predictions)


Predictions: [0 0 0 1 1 1]
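
The class above returns only the winning label. If normalized class probabilities are also wanted, one possible approach is to exponentiate the log scores after subtracting the row maximum, which is the usual log-sum-exp trick for numerical stability. The sketch below is an illustration rather than part of the original class: the function name `predict_proba` and its parameters are hypothetical, and it re-estimates the same statistics inline so that it runs on its own.

```python
import numpy as np

def predict_proba(X, classes, means, variances, priors):
    """Hypothetical helper: normalized class probabilities, one row per sample."""
    log_joint = []
    for c in classes:
        log_prior = np.log(priors[c])
        # Sum of per-feature Gaussian log-densities for class c
        log_lik = (-0.5 * np.sum(np.log(2 * np.pi * variances[c]))
                   - 0.5 * np.sum((X - means[c]) ** 2 / variances[c], axis=1))
        log_joint.append(log_prior + log_lik)
    log_joint = np.stack(log_joint, axis=1)  # shape (n_samples, n_classes)

    # Subtract the row maximum before exponentiating to avoid underflow
    shifted = log_joint - log_joint.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    return probs / probs.sum(axis=1, keepdims=True)

# Same toy data as in the example above
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 1.2],
              [2.5, 1.8], [3.0, 2.0], [3.5, 2.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# Per-class statistics, estimated the same way as in fit
classes = np.unique(y)
means = {c: X[y == c].mean(axis=0) for c in classes}
variances = {c: X[y == c].var(axis=0) for c in classes}
priors = {c: np.mean(y == c) for c in classes}

probs = predict_proba(X, classes, means, variances, priors)
labels = classes[np.argmax(probs, axis=1)]
print("Labels:", labels)
```

Each row of `probs` sums to 1, and taking the arg max of a row gives the same label as comparing the raw log scores, because the normalization does not change their order.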


# Custom