# Week 6 — Naive Bayes (Generative Text Classifier)

Objectives
- Review joint probability, conditional probability, and Bayes’ theorem
- Understand Naive Bayes as a generative classifier with conditional independence assumptions
- Compare generative vs. discriminative models
- Implement Bernoulli and Multinomial Naive Bayes for text classification
- Train and evaluate on a small spam dataset


In [2]:
import math
import os
import random
import sys
from pprint import pprint

from utils import (
    show_result, tokenize, build_vocab, vectorize_bow, train_test_split,
    NaiveBayesText, accuracy, confusion_matrix, tiny_spam_dataset,
    test_exercise_1_probability, test_exercise_2_nb_fit_predict, test_exercise_3_smoothing
)


## 1. Probability Warm‑up

Definitions
- Joint: \(p(a,b)\)
- Conditional: \(p(a\mid b) = \frac{p(a,b)}{p(b)}\), with \(p(b) > 0\)
- Bayes’ theorem: \(p(a \mid b) = \frac{p(b \mid a)p(a)}{p(b)}\)
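
As a quick numeric check of these definitions, here is a minimal sketch with made-up numbers. The probabilities below are illustrative assumptions, not estimates from the course dataset. It computes a joint, a marginal, and a posterior for the event "the word *free* appears".

```python
# Illustrative numbers only (assumptions, not estimated from any dataset).
p_spam = 0.4                 # prior p(spam)
p_ham = 1.0 - p_spam         # prior p(ham)
p_free_given_spam = 0.5      # p("free" appears | spam)
p_free_given_ham = 0.05      # p("free" appears | ham)

# Joint probabilities via the product rule: p(a, b) = p(a | b) p(b).
p_free_and_spam = p_free_given_spam * p_spam
p_free_and_ham = p_free_given_ham * p_ham

# Marginal p("free"): sum the joint over both classes.
p_free = p_free_and_spam + p_free_and_ham

# Bayes' theorem: p(spam | "free") = p("free" | spam) p(spam) / p("free").
p_spam_given_free = p_free_and_spam / p_free

print(f"p(free)        = {p_free:.3f}")
print(f"p(spam | free) = {p_spam_given_free:.3f}")
```

With these numbers the prior of 0.4 rises to a posterior of roughly 0.87, because *free* is ten times more likely under spam than under ham.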

Implement the functions below.


In [3]:
# Implement the following functions.
def joint(p_a, p_b):
    """Assume independence: p(a,b) = p(a)*p(b)."""
    raise NotImplementedError

def conditional(p_ab, p_b):
    """p(a|b) = p(a,b) / p(b), assuming p(b) > 0."""
    raise NotImplementedError

def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: p(a|b) = p(b|a) p(a) / p(b)."""
    raise NotImplementedError


In [4]:
res = test_exercise_1_probability({"joint": joint, "conditional": conditional, "bayes": bayes})
show_result("Exercise 1 – Probability", res)


[FAIL] Exercise 1 – Probability | joint runtime error: 


## 2. Naive Bayes as a Generative Model

- Model \(p(x \mid y)\) and \(p(y)\), and compute \(p(y \mid x)\) by Bayes’ rule
- Naive assumption: features are conditionally independent given \(y\), so \(p(x \mid y) = \prod_j p(x_j \mid y)\)
- Bernoulli variant uses binary word presence; Multinomial uses word counts
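
To make the difference between the two variants concrete, the sketch below scores one message under a single class with both likelihoods. The three-word vocabulary, the message, and the parameter values `theta_bern` and `phi_mult` are all hypothetical and chosen only for illustration; they do not come from `utils` or the course dataset.

```python
import math
from collections import Counter

# Hypothetical three-word vocabulary and one example message.
vocab = ["free", "lunch", "money"]
message = "free money free".split()

counts = Counter(message)
# Bernoulli features: 1 if the word occurs at all, else 0.
x_bin = [1 if counts[w] > 0 else 0 for w in vocab]
# Multinomial features: how many times each word occurs.
x_cnt = [counts[w] for w in vocab]

# Made-up class-conditional parameters for one class y (assumptions).
# Bernoulli: theta[j] = p(word j present | y); entries need not sum to 1.
theta_bern = [0.6, 0.1, 0.5]
# Multinomial: phi[j] = p(a token is word j | y); entries sum to 1.
phi_mult = [0.5, 0.1, 0.4]

def bernoulli_log_likelihood(x, theta):
    """log p(x | y) = sum_j [x_j log theta_j + (1 - x_j) log(1 - theta_j)]."""
    return sum(
        math.log(t) if xj == 1 else math.log(1.0 - t)
        for xj, t in zip(x, theta)
    )

def multinomial_log_likelihood(x, phi):
    """log p(x | y) without the multinomial coefficient, which is identical for every class."""
    return sum(xj * math.log(p) for xj, p in zip(x, phi))

bern_ll = bernoulli_log_likelihood(x_bin, theta_bern)
mult_ll = multinomial_log_likelihood(x_cnt, phi_mult)

print("binary vector:", x_bin)
print("count vector: ", x_cnt)
print(f"Bernoulli   log p(x|y) = {bern_ll:.4f}")
print(f"Multinomial log p(x|y) = {mult_ll:.4f}")
```

Note that the Bernoulli likelihood also charges for the absent word *lunch* through the factor \(1 - \theta_j\), while the Multinomial likelihood ignores absent words and counts *free* twice.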


In [5]:
texts, labels = tiny_spam_dataset()
print(f"Dataset size: {len(texts)}  |  ham={sum(1 for y in labels if y==0)}  spam={sum(1 for y in labels if y==1)}")
for t, y in list(zip(texts, labels))[:3]:
    print(f"[{y}] {t}")


Dataset size: 14  |  ham=7  spam=7
[0] hey are we still on for lunch tomorrow
[0] please review the meeting notes from today
[0] can you send the updated report


In [6]:
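# Build the vocabulary once, then vectorize the same texts two ways.
vocab = build_vocab(texts, min_freq=1, max_size=2000)
X_bin = vectorize_bow(texts, vocab, binary=True)   # word presence (Bernoulli)
X_cnt = vectorize_bow(texts, vocab, binary=False)  # word counts (Multinomial)

# Both calls use the same seed, so the binary and count splits are expected
# to line up row for row; the labels from the second call are discarded.
Xtr_bin, Xte_bin, ytr, yte = train_test_split(X_bin, labels, test_size=0.3, seed=7)
Xtr_cnt, Xte_cnt, _, _ = train_test_split(X_cnt, labels, test_size=0.3, seed=7)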


## 3. Fit a Naive Bayes Classifier

Complete `student_fit_func(...)`:
1) Build vocabulary
2) Vectorize (binary for Bernoulli, counts for Multinomial)
3) Split into train/test
4) Train `NaiveBayesText(mode, alpha)` and return test accuracy
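
The four steps above can be sketched end to end with the standard library only. This is not the `utils` implementation: the corpus, `to_counts`, `fit`, and `predict` below are hypothetical stand-ins written for illustration, and the sketch covers only the Multinomial variant. In the exercise itself, the course helpers (`build_vocab`, `vectorize_bow`, `train_test_split`, `NaiveBayesText`) would take their place.

```python
import math
import random
from collections import Counter

# Tiny made-up corpus (not the course dataset); label 1 = spam, 0 = ham.
docs = [
    ("win free money now", 1), ("free prize claim now", 1),
    ("cheap money offer", 1), ("claim your free offer", 1),
    ("lunch meeting tomorrow", 0), ("send the report today", 0),
    ("meeting notes from today", 0), ("are we on for lunch", 0),
]

# 1) Build the vocabulary from all tokens.
vocab = sorted({w for text, _ in docs for w in text.split()})

# 2) Vectorize as word counts (Multinomial variant).
def to_counts(text):
    c = Counter(text.split())
    return [c[w] for w in vocab]

data = [(to_counts(text), y) for text, y in docs]

# 3) Shuffle with a fixed seed and hold out 25% for testing.
rng = random.Random(0)
rng.shuffle(data)
n_test = len(data) // 4
test, train = data[:n_test], data[n_test:]

# 4) Train: class priors and additively smoothed word probabilities.
def fit(train, alpha=1.0):
    model = {}
    for c in sorted({y for _, y in train}):
        rows = [x for x, y in train if y == c]
        word_totals = [sum(col) for col in zip(*rows)]
        denom = sum(word_totals) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(len(rows) / len(train)),
            "log_phi": [math.log((n + alpha) / denom) for n in word_totals],
        }
    return model

def predict(model, x):
    # Pick the class with the highest log p(y) + sum_j x_j log p(w_j | y).
    def score(c):
        return model[c]["log_prior"] + sum(
            xj * lp for xj, lp in zip(x, model[c]["log_phi"])
        )
    return max(model, key=score)

model = fit(train, alpha=1.0)
test_accuracy = sum(predict(model, x) == y for x, y in test) / len(test)
print(f"vocab size={len(vocab)}  train={len(train)}  test={len(test)}")
print(f"test accuracy={test_accuracy:.3f}")
```

Scores are summed in log space to avoid underflow when many small probabilities are multiplied. With a test set this small, the accuracy moves in large steps, so it should be read as a sanity check rather than a reliable estimate.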


In [7]:
def student_fit_func(texts, labels, mode='bernoulli', alpha=1.0):
    """
    Returns test accuracy on the tiny dataset.
    """
    raise NotImplementedError


In [8]:
res = test_exercise_2_nb_fit_predict(student_fit_func)
show_result("Exercise 2 – Fit & Predict", res)


[FAIL] Exercise 2 – Fit & Predict | runtime error: 


## 4. Smoothing

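Implement `student_train_eval(alpha, mode)` so that it trains a single Naive Bayes model in the chosen mode with smoothing parameter \(\alpha\) and returns `(train_acc, test_acc)`. Then try several values of \(\alpha\) and compare train and test accuracy.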
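
To see what \(\alpha\) does before running the experiment, the sketch below applies additive smoothing to hypothetical counts. It assumes the Multinomial-style estimate \((\text{count} + \alpha) / (\text{total} + \alpha V)\); whether `NaiveBayesText` uses exactly this form is not shown in this notebook, and the helper `smoothed_prob` and all counts are made up for illustration.

```python
def smoothed_prob(count, total, vocab_size, alpha):
    """Additive smoothing: (count + alpha) / (total + alpha * V)."""
    return (count + alpha) / (total + alpha * vocab_size)

# Hypothetical setting: 50 spam tokens in total, 20-word vocabulary.
TOTAL, VOCAB_SIZE = 50, 20

for alpha in [0.1, 0.5, 1.0, 2.0, 5.0]:
    p_unseen = smoothed_prob(0, TOTAL, VOCAB_SIZE, alpha)   # word never seen in spam
    p_seen = smoothed_prob(10, TOTAL, VOCAB_SIZE, alpha)    # word seen 10 times in spam
    print(f"alpha={alpha:.1f} -> p(unseen)={p_unseen:.4f} | p(seen)={p_seen:.4f}")
```

Without smoothing, an unseen word would get probability zero and wipe out the whole product for that class. As \(\alpha\) grows, every estimate is pulled toward the uniform value \(1/V\) (0.05 here), so the unseen word gains probability while the frequent word loses it.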


In [9]:
def student_train_eval(alpha=1.0, mode='bernoulli'):
    """
    Train Naive Bayes with the given alpha; return (train_acc, test_acc).
    """
    raise NotImplementedError

res = test_exercise_3_smoothing(student_train_eval)
show_result("Exercise 3 – Smoothing", res)

for a in [0.1, 0.5, 1.0, 2.0, 5.0]:
    tr, te = student_train_eval(a, mode='bernoulli')
    print(f"alpha={a:.1f} -> train={tr:.3f} | test={te:.3f}")


[FAIL] Exercise 3 – Smoothing | runtime error: 


NotImplementedError: 

## 5. Generative vs. Discriminative (Short Answer)

1) How does a generative classifier differ from a discriminative classifier?  
2) Why can Naive Bayes be viewed as a simple text generator?  
3) Briefly relate Naive Bayes to modern generative models (e.g., GPT).
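
As a starting point for question 2, the sketch below runs the Multinomial model "forwards": it samples a class from \(p(y)\) and then draws words independently from \(p(w \mid y)\). The priors and word distributions are made up for illustration, and `generate` is a hypothetical helper, not part of `utils`.

```python
import random

# Made-up class prior and class-conditional unigram distributions (assumptions).
class_prior = {"spam": 0.5, "ham": 0.5}
word_probs = {
    "spam": {"free": 0.4, "money": 0.3, "win": 0.2, "now": 0.1},
    "ham": {"lunch": 0.3, "meeting": 0.3, "report": 0.2, "today": 0.2},
}

def generate(rng, length=5):
    """Sample y ~ p(y), then draw each word independently from p(word | y)."""
    y = rng.choices(list(class_prior), weights=list(class_prior.values()))[0]
    words = list(word_probs[y])
    weights = [word_probs[y][w] for w in words]
    return y, rng.choices(words, weights=weights, k=length)

rng = random.Random(0)
for _ in range(3):
    label, tokens = generate(rng)
    print(f"[{label}] {' '.join(tokens)}")
```

Because each word is drawn without regard to its neighbours, the samples have no word order or grammar, which is a useful contrast to keep in mind for question 3.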


_Answer here._
