charliezcr/Python-Bayesian-Spam-Filter


Python Bayesian Spam Filter

Project Overview

We receive a lot of mail, but our mailbox automatically filters out spam and keeps only ham (the mail you actually want, the opposite of spam) in the inbox. How exactly does the mailbox calculate whether a message is spam? This project is a spam filter implemented in Python to showcase how a Naive Bayes classifier and a bag-of-words model are used in a mailbox.
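To make the idea concrete, here is a minimal sketch of a Naive Bayes classifier over bag-of-words counts. It is not the code from this repository: the function names (`train`, `log_score`, `classify`), the toy training messages, and the use of add-one (Laplace) smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")

def train(messages):
    """Build bag-of-words counts from (label, tokens) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for label, tokens in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def log_score(tokens, label, word_counts, doc_counts, vocab):
    """log P(label) + sum of log P(token | label), with add-one smoothing."""
    # Assumes every label appears at least once in the training data.
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    total_words = sum(word_counts[label].values())
    for tok in tokens:
        # Add-one smoothing keeps unseen words from zeroing out the product.
        score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
    return score

def classify(tokens, model):
    """Return the label with the higher log-probability score."""
    return max(LABELS, key=lambda label: log_score(tokens, label, *model))

# Toy training data, invented for this example only.
TRAINING = [
    ("spam", ["win", "free", "prize"]),
    ("spam", ["free", "cash", "now"]),
    ("ham", ["meeting", "at", "noon"]),
    ("ham", ["lunch", "at", "noon", "tomorrow"]),
]

model = train(TRAINING)
print(classify(["free", "prize"], model))
print(classify(["lunch", "at", "noon"], model))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, which is why the sketch sums logarithms rather than multiplying raw probabilities.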

Contents

For a detailed walk-through of the code and an explanation of the theory, please see the Python notebook or the website.
If you are more interested in the code itself, please read the Python file.
The remaining txt files are training and testing data.

Modules

pip install nltk

  • nltk: natural language processing. Please also download the punkt tokenizer models and the stopwords corpus in nltk:
import nltk
nltk.download('punkt')
nltk.download('stopwords')
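Once those resources are downloaded, a message can be turned into a list of meaningful tokens. The sketch below is an assumption about how such preprocessing might look, not the repository's own code: `preprocess` and `FALLBACK_STOPWORDS` are hypothetical names, and the regex fallback exists only so the example still runs when NLTK or its data is missing. Newer NLTK releases may also ask for the `punkt_tab` resource.

```python
import re

# Tiny illustrative stopword set, used only if NLTK or its data is unavailable.
FALLBACK_STOPWORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "you", "for"}

def preprocess(text):
    """Lowercase, tokenize, and drop stopwords and non-alphabetic tokens."""
    try:
        from nltk.corpus import stopwords
        from nltk.tokenize import word_tokenize
        tokens = word_tokenize(text.lower())
        stop = set(stopwords.words("english"))
    except (ImportError, LookupError):
        # NLTK not installed, or punkt/stopwords not downloaded: use a plain regex.
        tokens = re.findall(r"[a-z]+", text.lower())
        stop = FALLBACK_STOPWORDS
    return [t for t in tokens if t.isalpha() and t not in stop]

tokens = preprocess("Claim a FREE prize!")
print(tokens)
```

The resulting token list is the bag-of-words input that a Naive Bayes classifier counts over.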

About

A spam filter implemented with Bayes' theorem and Python's NLTK package to perform basic text analysis.
