## **PROGRAM 4 - Naive Bayes Text Classification**

Demonstrates the use of a **Naive Bayes classifier** with:

- Training data labeled as _comedy_ or _action_
- **Add-one (Laplace) smoothing**
- Calculation of (unnormalized) posterior probabilities
- Prediction of the most likely class for a new document

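This program shows how a probabilistic model classifies text: it scores each class by its prior multiplied by the smoothed likelihood of every word in the document, then picks the class with the higher score.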
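As a reading aid, the quantities the program computes can be written out as below. The symbols $N_c$ and $\operatorname{score}$ are notation introduced here for illustration; in the code they correspond to `tw[c]` and the return value of `class_probability`, and $|V|$ is `V` (7 distinct words in this corpus).

$$
\hat{P}(w \mid c) = \frac{\operatorname{count}(w, c) + 1}{N_c + |V|},
\qquad
\operatorname{score}(c) = P(c) \prod_{w \in D} \hat{P}(w \mid c)
$$

For the test document `fast couple shoot fly`, with $N_{\text{comedy}} = 9$ and $N_{\text{action}} = 11$:

$$
\operatorname{score}(\text{comedy}) = \frac{2}{5} \cdot \frac{2}{16} \cdot \frac{3}{16} \cdot \frac{1}{16} \cdot \frac{2}{16} \approx 7.32 \times 10^{-5}
$$

$$
\operatorname{score}(\text{action}) = \frac{3}{5} \cdot \frac{3}{18} \cdot \frac{1}{18} \cdot \frac{5}{18} \cdot \frac{2}{18} \approx 1.71 \times 10^{-4}
$$

These scores are proportional to the posteriors but do not sum to 1, because the evidence term $P(D)$ is left out. That is enough for choosing the larger one.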

from collections import Counter

# Training documents
docs = [
    ("fun couple love love".split(), "comedy"),
    ("fast furious shoot".split(), "action"),
    ("couple fly fast fun fun".split(), "comedy"),
    ("furious shoot shoot fun".split(), "action"),
    ("fly fast shoot love".split(), "action")
]

# Document to classify
D = "fast couple shoot fly".split()

# Classes
classes = {"comedy", "action"}

# ----- Priors -----
priors = {}
for c in classes:
    count = sum(1 for _, cls in docs if cls == c)
    priors[c] = count / len(docs)

# ----- Vocabulary -----
# Every distinct word seen in any training document
vocab = {w for words, _ in docs for w in words}

V = len(vocab)

# ----- Word counts & total words per class -----
wc = {c: Counter() for c in classes}
tw = {c: 0 for c in classes}

for words, c in docs:
    wc[c].update(words)
    tw[c] += len(words)

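# ----- Unnormalized posterior: P(c) * product of P(w | c) -----
def class_probability(c):
    # Start from the class prior P(c)
    prob = priors[c]
    for w in D:
        # Add-one (Laplace) smoothed likelihood P(w | c)
        word_prob = (wc[c][w] + 1) / (tw[c] + V)
        prob *= word_prob
    # Proportional to P(c | D); not divided by the evidence P(D)
    return prob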

p_comedy = class_probability("comedy")
p_action = class_probability("action")

print("P(Comedy | D) ∝", p_comedy)
print("P(Action | D) ∝", p_action)

# Prediction: the class with the larger score
prediction = "action" if p_action > p_comedy else "comedy"
print("Predicted:", prediction)


P(Comedy | D) ∝ 7.324218750000001e-05
P(Action | D) ∝ 0.00017146776406035664
Predicted: action
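
The two printed values are class scores (prior times smoothed likelihoods), so they do not sum to 1. The sketch below is one possible way to turn them into proper posteriors, and it works in log space so that longer documents do not underflow. It is not part of the original program: the names `log_score`, `n_docs`, `log_evidence`, and `posteriors` are hypothetical helpers introduced for illustration, and the corpus is copied from the program above.

```python
import math
from collections import Counter

# Same toy corpus and test document as the program above
docs = [
    ("fun couple love love".split(), "comedy"),
    ("fast furious shoot".split(), "action"),
    ("couple fly fast fun fun".split(), "comedy"),
    ("furious shoot shoot fun".split(), "action"),
    ("fly fast shoot love".split(), "action"),
]
D = "fast couple shoot fly".split()

classes = sorted({label for _, label in docs})
V = len({w for words, _ in docs for w in words})  # vocabulary size

# Per-class word counts, total word counts, and document counts
wc = {c: Counter() for c in classes}
for words, c in docs:
    wc[c].update(words)
tw = {c: sum(wc[c].values()) for c in classes}
n_docs = {c: sum(1 for _, label in docs if label == c) for c in classes}


def log_score(c, doc):
    """Return log P(c) + sum of log P(w | c), using add-one smoothing."""
    total = math.log(n_docs[c] / len(docs))
    for w in doc:
        total += math.log((wc[c][w] + 1) / (tw[c] + V))
    return total


log_scores = {c: log_score(c, D) for c in classes}

# Normalize with the log-sum-exp trick: subtract the max before exponentiating
m = max(log_scores.values())
log_evidence = m + math.log(sum(math.exp(s - m) for s in log_scores.values()))
posteriors = {c: math.exp(s - log_evidence) for c, s in log_scores.items()}

for c in classes:
    print(f"P({c} | D) = {posteriors[c]:.4f}")
print("Predicted:", max(posteriors, key=posteriors.get))
```

On this corpus the normalized posteriors come out to roughly 0.70 for action and 0.30 for comedy, so the prediction is unchanged. Normalizing only rescales the scores, while the log-space sum avoids multiplying many small numbers together.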
