# Naive Bayes

## Definition
Naive Bayes is a family of probabilistic classifiers based on Bayes' theorem. It is called "naive" because it assumes the features are conditionally independent of one another given the class label.
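
To make the independence assumption concrete, the decision rule can be written as below. The notation is introduced here for illustration: $y$ is a class label, $x_1, \dots, x_n$ are the feature values, and $\hat{y}$ is the predicted class.

$$
P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
$$

The product over features is where the independence assumption enters: each $P(x_i \mid y)$ is estimated separately instead of modelling the joint distribution of all features.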

## Real-life Examples
- Spam email detection
- Sentiment analysis (positive/negative reviews)
- Document classification (news, blogs, categories)
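
As a rough illustration of the spam-detection use case, the sketch below trains on four made-up messages. The texts, the labels, and the `toy_` variable names are all hypothetical and exist only to show the workflow; a real filter would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy messages, invented for illustration only
toy_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting moved to monday",
    "lunch with the team on monday",
]
toy_labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word counts, then fit the classifier on them
toy_vectorizer = CountVectorizer()
toy_model = MultinomialNB()
toy_model.fit(toy_vectorizer.fit_transform(toy_texts), toy_labels)

# Classify an unseen message using the same vocabulary
toy_pred = toy_model.predict(toy_vectorizer.transform(["claim your free prize"]))
print(toy_pred[0])
```

Every word in the new message appears only in the spam examples, so the model assigns it to the `spam` class.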


In [None]:
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# Load the train and test splits for two of the 20 Newsgroups categories
categories = ['sci.space', 'rec.sport.baseball']
data = fetch_20newsgroups(subset='train', categories=categories, shuffle=True, random_state=42)
X_train, y_train = data.data, data.target
test_data = fetch_20newsgroups(subset='test', categories=categories, shuffle=True, random_state=42)
X_test, y_test = test_data.data, test_data.target

# Vectorize text: learn the vocabulary on the training set only, then reuse it for the test set
vectorizer = CountVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Train a multinomial Naive Bayes model on the word counts, then predict the test labels
model = MultinomialNB()
model.fit(X_train_vec, y_train)
y_pred = model.predict(X_test_vec)

# Evaluate: fraction of test documents classified correctly
print("Accuracy:", accuracy_score(y_test, y_pred))
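
To show roughly what `MultinomialNB` does with word counts, the sketch below recomputes its predictions by hand with NumPy and compares them with scikit-learn. The count matrix, the labels, and the helper `predict_manual` are hypothetical and chosen only for illustration. The sketch assumes the default settings (`alpha=1.0` Laplace smoothing, class priors estimated from the data).

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count matrix: 4 documents x 3 vocabulary terms
counts = np.array([
    [3, 0, 1],
    [2, 1, 0],
    [0, 4, 1],
    [0, 2, 3],
])
labels = np.array([0, 0, 1, 1])
alpha = 1.0  # Laplace smoothing; matches MultinomialNB's default

classes = np.unique(labels)
# log P(y): fraction of training documents in each class
log_prior = np.log(np.array([(labels == c).mean() for c in classes]))
# log P(x_i | y): smoothed share of each term within a class
term_totals = np.array([counts[labels == c].sum(axis=0) for c in classes])
smoothed = term_totals + alpha
log_likelihood = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))

def predict_manual(doc_counts):
    # Score each class by log P(y) + sum_i count_i * log P(x_i | y)
    scores = doc_counts @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]

new_docs = np.array([[2, 0, 0], [0, 1, 2]])
manual_pred = predict_manual(new_docs)
sklearn_pred = MultinomialNB(alpha=alpha).fit(counts, labels).predict(new_docs)
print(manual_pred, sklearn_pred)
```

Working in log space turns the product of per-feature probabilities into a sum, which avoids numerical underflow on long documents. The smoothing term keeps a word that never appeared in a class from forcing that class's probability to zero.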