This project uses a Naive Bayes classifier to detect spam messages in SMS data.
- Source: SMSSpamCollection
- 5572 messages with 2 columns: `label` and `message`
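
A minimal sketch of loading the data, assuming the file follows the usual SMSSpamCollection layout (tab-separated, no header row, label first and message second). The helper name `load_sms` and the two-row in-memory sample are illustrative and not taken from the project notebook.

```python
import io

import pandas as pd


def load_sms(source):
    """Read a tab-separated file without a header row into label/message columns."""
    return pd.read_csv(source, sep="\t", header=None, names=["label", "message"])


# Tiny in-memory stand-in for the real file, so the sketch runs anywhere.
sample = io.StringIO("ham\tSee you at lunch\nspam\tWin a free prize now\n")
df = load_sms(sample)

print(df.shape)
print(df["label"].value_counts())
```

With the real dataset, the call would look like `load_sms("SMSSpamCollection")`, where the path is an assumption about where the file is stored.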
- Load and clean the data (remove missing values and duplicates)
- Map labels (`ham` → 0, `spam` → 1)
- Vectorize text using CountVectorizer
- Train a Multinomial Naive Bayes classifier
- Evaluate with accuracy and a confusion matrix
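
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's exact code: the function name `train_spam_classifier`, the 80/20 stratified split, the `random_state` value, and the toy DataFrame are all hypothetical.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


def train_spam_classifier(df, test_size=0.2, random_state=42):
    """Clean, encode, vectorize, train, and evaluate on a held-out split."""
    # Remove rows with missing values, then exact duplicate rows.
    df = df.dropna(subset=["label", "message"]).drop_duplicates()

    # Map text labels to integers: ham -> 0, spam -> 1.
    y = df["label"].map({"ham": 0, "spam": 1})

    # Split size and stratification are assumptions, not taken from the project.
    X_train, X_test, y_train, y_test = train_test_split(
        df["message"], y, test_size=test_size, random_state=random_state, stratify=y
    )

    # Fit the vocabulary on training text only, then reuse it for the test text.
    vectorizer = CountVectorizer()
    X_train_counts = vectorizer.fit_transform(X_train)
    X_test_counts = vectorizer.transform(X_test)

    model = MultinomialNB()
    model.fit(X_train_counts, y_train)

    y_pred = model.predict(X_test_counts)
    acc = accuracy_score(y_test, y_pred)
    # Rows are true classes, columns are predicted classes, ordered [ham, spam].
    cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
    return vectorizer, model, acc, cm


# Toy data for illustration only; the real input is the SMSSpamCollection frame.
toy = pd.DataFrame(
    {
        "label": ["ham", "spam"] * 5,
        "message": [
            "See you at lunch", "Win a free prize now",
            "Can you call me later", "Free gift card, claim now",
            "I am running late today", "Congratulations you won cash",
            "Are we still on for dinner", "Urgent! Claim your free reward",
            "Thanks for the help yesterday", "You have won a free holiday",
        ],
    }
)
vectorizer, model, acc, cm = train_spam_classifier(toy)
print(acc)
print(cm)
```

Fitting `CountVectorizer` on the training split only keeps test-set vocabulary out of the model, which makes the reported accuracy a fairer estimate.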
- Accuracy: ≈ 97.6%
- Confusion Matrix:
- Test Prediction Example: "Congratulations! You won a free gift card." → Spam (1)
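
A test prediction like the one above could be produced along these lines. This sketch bundles the vectorizer and classifier with scikit-learn's `make_pipeline` so raw text can be passed straight to `predict`; that bundling, the helper name `predict_message`, and the six toy training messages are assumptions made for illustration, not the project's actual code or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data for illustration only (0 = ham, 1 = spam).
messages = [
    "Win a free prize now",
    "Free gift card, claim now",
    "Congratulations you won cash",
    "See you at lunch",
    "Can you call me later",
    "I am running late today",
]
labels = [1, 1, 1, 0, 0, 0]

# The pipeline applies CountVectorizer first, then MultinomialNB.
spam_model = make_pipeline(CountVectorizer(), MultinomialNB())
spam_model.fit(messages, labels)


def predict_message(model, text):
    """Return 1 if the model labels the text as spam, otherwise 0."""
    return int(model.predict([text])[0])


result = predict_message(spam_model, "Congratulations! You won a free gift card.")
print("Spam (1)" if result == 1 else "Ham (0)")
```

Passing a one-element list to `predict` matters here: the vectorizer expects an iterable of documents, and a bare string would be treated as a sequence of characters.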
- Python, Pandas, Scikit-learn, Matplotlib, Seaborn, Jupyter Notebook