# Naive Bayes from Scratch

Naive Bayes is a family of simple yet powerful probabilistic classifiers based on Bayes’ theorem and the "naive" assumption that features are conditionally independent given the class. It is widely used for text classification and as a baseline in many ML tasks.

In this notebook, you'll scaffold the steps to implement Naive Bayes from scratch, including the core math, training, prediction, and evaluation.

## 📚 Bayes’ Theorem Refresher

Bayes’ theorem relates the conditional probability $P(A|B)$ to the reverse conditional $P(B|A)$ and the marginals $P(A)$ and $P(B)$, assuming $P(B) > 0$:

$$ P(A|B) = \frac{P(B|A)P(A)}{P(B)} $$

### Task:
- Scaffold a function to compute Bayes’ theorem for given probabilities.
- Add a docstring explaining its use in Naive Bayes.

In [None]:
def bayes_theorem(P_B_given_A, P_A, P_B):
    """
    Compute P(A|B) using Bayes’ theorem.
    Args:
        P_B_given_A (float): P(B|A)
        P_A (float): P(A)
        P_B (float): P(B)
    Returns:
        float: P(A|B)
    """
    # TODO: Implement Bayes’ theorem
    pass
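
As a point of reference, one possible way to fill in the stub is sketched below. The spam numbers are made up for illustration, and the zero check on `P_B` is an added assumption rather than something the task requires.

```python
def bayes_theorem(P_B_given_A, P_A, P_B):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if P_B <= 0:
        raise ValueError("P(B) must be positive")
    return P_B_given_A * P_A / P_B


# Hypothetical numbers: 20% of emails are spam; the word "free" appears in
# 60% of spam emails and in 5% of non-spam emails.
p_spam = 0.2
p_free_given_spam = 0.6
p_free_given_ham = 0.05

# The law of total probability gives the evidence term P("free").
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

p_spam_given_free = bayes_theorem(p_free_given_spam, p_spam, p_free)
print(round(p_spam_given_free, 2))
```

In Naive Bayes, `A` plays the role of the class and `B` the observed features, so the same computation turns a class prior and a feature likelihood into a class posterior.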

## 🔢 Training: Estimate Priors and Likelihoods

Naive Bayes learns class priors and feature likelihoods from the training data.

### Task:
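- Scaffold functions to estimate class priors and feature likelihoods for categorical features (the setting of Categorical NB; Bernoulli NB is its binary special case, and Multinomial NB applies the same counting idea to count features).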
- Add docstrings explaining their use.

In [None]:
import numpy as np


def estimate_class_priors(y):
    """
    Estimate class prior probabilities from labels.
    Args:
        y (np.ndarray): Array of class labels (n_samples,)
    Returns:
        dict: Mapping from class to prior probability
    """
    # TODO: Estimate class priors
    pass

def estimate_likelihoods(X, y):
    """
    Estimate feature likelihoods P(x|y) for each class (for categorical features).
    Args:
        X (np.ndarray): Feature matrix (n_samples x n_features)
        y (np.ndarray): Class labels (n_samples,)
    Returns:
        dict: Nested dict of likelihoods, e.g. {class: {feature: {value: likelihood}}}
    """
    # TODO: Estimate feature likelihoods
    pass
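
The two estimators above could be sketched as follows, using plain relative frequencies with no smoothing. The nested layout `{class: {feature_index: {value: prob}}}` and the toy weather data are assumptions made for this example, not a layout the notebook prescribes.

```python
from collections import Counter

import numpy as np


def estimate_class_priors(y):
    """Relative frequency of each class label."""
    counts = Counter(y.tolist())
    n = len(y)
    return {c: k / n for c, k in counts.items()}


def estimate_likelihoods(X, y):
    """Unsmoothed P(feature j = v | class c) as {c: {j: {v: prob}}}."""
    likelihoods = {}
    for c in np.unique(y).tolist():
        X_c = X[y == c]  # rows belonging to class c
        likelihoods[c] = {}
        for j in range(X.shape[1]):
            value_counts = Counter(X_c[:, j].tolist())
            likelihoods[c][j] = {v: k / len(X_c) for v, k in value_counts.items()}
    return likelihoods


# Toy data (made up): columns are [outlook, windy]; the label is "play?".
X = np.array([["sunny", "no"], ["sunny", "yes"], ["rainy", "yes"], ["rainy", "no"]])
y = np.array(["yes", "no", "no", "yes"])

priors = estimate_class_priors(y)
likelihoods = estimate_likelihoods(X, y)
print(priors)
print(likelihoods["yes"][1])
```

Note that a value never observed with a class is simply absent from the inner dict here, which is the zero-probability problem that smoothing addresses.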

## 🧮 Laplace Smoothing

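Laplace (additive) smoothing adds a small pseudo-count `alpha` to every count, so a feature value that was never observed with a class during training gets a small non-zero probability instead of forcing the whole posterior for that class to zero.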

### Task:
- Scaffold a function to apply Laplace smoothing to likelihood estimates.
- Add a docstring explaining its importance.

In [None]:
def laplace_smoothing(count, total, num_classes, alpha=1):
    """
    Apply Laplace smoothing to probability estimates.
    Args:
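        count (int): Number of times the feature value occurs within the class
        total (int): Total number of observations for the class
        num_classes (int): Number of distinct values the feature can take
            (despite the name, not the number of target classes)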
        alpha (float): Smoothing parameter (default=1)
    Returns:
        float: Smoothed probability
    """
    # TODO: Apply Laplace smoothing
    pass
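
A minimal sketch of the smoothing step, assuming the usual additive form `(count + alpha) / (total + alpha * K)`, where `K` is the number of distinct values the feature can take (the argument called `num_classes` in the signature above). The counts are made up.

```python
def laplace_smoothing(count, total, num_classes, alpha=1):
    """Additive smoothing: (count + alpha) / (total + alpha * num_classes).

    num_classes follows the scaffold's naming: it is the number of distinct
    values the feature can take, not the number of target classes.
    """
    return (count + alpha) / (total + alpha * num_classes)


# A value never seen with this class (count=0), 10 class samples,
# and a feature with 3 possible values:
unseen = laplace_smoothing(0, 10, 3)
print(unseen)  # 1/13: small but non-zero

# The smoothed estimates over all 3 values still sum to (approximately) 1.
total_mass = sum(laplace_smoothing(c, 10, 3) for c in (0, 4, 6))
print(total_mass)
```

Setting `alpha=0` recovers the unsmoothed relative frequency `count / total`.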

## 🔗 Prediction: Posterior Probability & Log Probabilities

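Naive Bayes predicts the class with the highest posterior probability. Because the posterior is a product of many probabilities below 1, it can underflow to zero in floating point, so implementations usually sum log probabilities instead of multiplying raw ones.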
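
Written out for a class $y$ and features $x_1, \dots, x_n$ (symbols introduced here for illustration), the independence assumption lets the posterior factor as:

$$ P(y|x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i|y) $$

Since the logarithm is monotonic, taking logs does not change which class wins, and the prediction rule becomes a sum:

$$ \hat{y} = \arg\max_{y} \left[ \log P(y) + \sum_{i=1}^{n} \log P(x_i|y) \right] $$

The evidence term $P(x_1, \dots, x_n)$ is the same for every class, so it can be dropped when only the arg max is needed.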

### Task:
- Scaffold a function to compute log posterior probabilities for each class given a sample.
- Scaffold a function to predict the class label.
- Add docstrings explaining their use.

In [None]:
def compute_log_posterior(x, class_priors, likelihoods):
    """
    Compute log posterior probability for each class given a sample.
    Args:
        x (np.ndarray): Feature vector (n_features,)
        class_priors (dict): Class prior probabilities
        likelihoods (dict): Feature likelihoods
    Returns:
        dict: Mapping from class to log posterior
    """
    # TODO: Compute log posterior for each class
    pass

def predict_naive_bayes(x, class_priors, likelihoods):
    """
    Predict class label for a sample using Naive Bayes.
    Args:
        x (np.ndarray): Feature vector (n_features,)
        class_priors (dict): Class prior probabilities
        likelihoods (dict): Feature likelihoods
    Returns:
        int or str: Predicted class label
    """
    # TODO: Predict class label
    pass
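
One way the two prediction stubs could look is sketched below. It assumes likelihoods are stored as `{class: {feature_index: {value: prob}}}`, and it adds a hypothetical `floor` parameter (not part of the scaffold) as a stand-in for smoothing when a value was never seen with a class. The parameters are hand-written toy numbers.

```python
import math

import numpy as np


def compute_log_posterior(x, class_priors, likelihoods, floor=1e-9):
    """Unnormalized log posterior: log P(c) + sum_j log P(x_j | c).

    `floor` is a hypothetical extra parameter used when a value has no
    stored likelihood for a class.
    """
    log_post = {}
    for c, prior in class_priors.items():
        total = math.log(prior)
        for j, value in enumerate(x.tolist()):
            total += math.log(likelihoods[c][j].get(value, floor))
        log_post[c] = total
    return log_post


def predict_naive_bayes(x, class_priors, likelihoods):
    """Return the class with the largest log posterior."""
    log_post = compute_log_posterior(x, class_priors, likelihoods)
    return max(log_post, key=log_post.get)


# Hand-written toy parameters (made up).
class_priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {0: {"free": 0.8, "hello": 0.2}, 1: {"now": 0.7, "later": 0.3}},
    "ham": {0: {"free": 0.1, "hello": 0.9}, 1: {"now": 0.4, "later": 0.6}},
}

x = np.array(["free", "now"])
print(predict_naive_bayes(x, class_priors, likelihoods))
```

The returned scores are unnormalized: they omit the evidence term, which is enough for picking the most likely class but not for reporting calibrated probabilities.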

## 🏋️ Training and Evaluation Loop

Train the Naive Bayes model on a dataset and evaluate its accuracy.

### Task:
- Scaffold a function to train the model (estimate priors and likelihoods).
- Scaffold a function to compute accuracy on a test set.
- Add docstrings explaining their use.

In [None]:
def train_naive_bayes(X, y):
    """
    Train Naive Bayes model (estimate priors and likelihoods).
    Args:
        X (np.ndarray): Feature matrix
        y (np.ndarray): Class labels
    Returns:
        tuple: (class_priors, likelihoods)
    """
    # TODO: Train model
    pass

def compute_accuracy_naive_bayes(X, y, class_priors, likelihoods):
    """
    Compute accuracy of Naive Bayes model on a dataset.
    Args:
        X (np.ndarray): Feature matrix
        y (np.ndarray): True labels
        class_priors (dict): Class prior probabilities
        likelihoods (dict): Feature likelihoods
    Returns:
        float: Accuracy (0 to 1)
    """
    # TODO: Compute accuracy
    pass
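
Putting the pieces together, a compact end-to-end sketch might look like this. To stay self-contained it inlines Laplace smoothing and prediction rather than calling the earlier helpers. The `alpha` parameter and the `{class: {feature_index: {value: prob}}}` layout are assumptions, and the toy data is made up.

```python
import math
from collections import Counter

import numpy as np


def train_naive_bayes(X, y, alpha=1.0):
    """Return (class_priors, likelihoods) with Laplace-smoothed likelihoods."""
    classes = np.unique(y).tolist()
    class_priors = {c: float(np.mean(y == c)) for c in classes}
    likelihoods = {}
    for c in classes:
        X_c = X[y == c]  # rows belonging to class c
        likelihoods[c] = {}
        for j in range(X.shape[1]):
            values = np.unique(X[:, j]).tolist()  # values seen in any class
            counts = Counter(X_c[:, j].tolist())
            denom = len(X_c) + alpha * len(values)
            likelihoods[c][j] = {v: (counts[v] + alpha) / denom for v in values}
    return class_priors, likelihoods


def predict_naive_bayes(x, class_priors, likelihoods):
    """Class with the largest log prior plus summed log likelihoods.

    Raises KeyError for a value that never appeared in the training data.
    """
    scores = {
        c: math.log(prior)
        + sum(math.log(likelihoods[c][j][v]) for j, v in enumerate(x.tolist()))
        for c, prior in class_priors.items()
    }
    return max(scores, key=scores.get)


def compute_accuracy_naive_bayes(X, y, class_priors, likelihoods):
    """Fraction of rows whose predicted label equals the true label."""
    preds = [predict_naive_bayes(x, class_priors, likelihoods) for x in X]
    return float(np.mean(np.array(preds) == y))


# Toy data (made up): the label is 1 exactly when feature 0 is 1.
X = np.array([[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]])
y = np.array([1, 1, 0, 0, 1, 0])

class_priors, likelihoods = train_naive_bayes(X, y)
accuracy = compute_accuracy_naive_bayes(X, y, class_priors, likelihoods)
print(accuracy)
```

The example scores the model on its own training data only to keep the sketch short; a meaningful evaluation would hold out a separate test set.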

## 🧠 Final Summary: Naive Bayes in ML

- Naive Bayes is a simple, fast, and effective classifier for many tasks, especially text classification.
- Understanding its math and assumptions is essential for ML interviews and practical applications.
- The probabilistic reasoning behind Naive Bayes (priors, likelihoods, posteriors, and working in log space) carries over to more advanced probabilistic models, including language models.