Spam Detector

This is a spam/ham email detector that uses a Naive Bayes classifier implemented from scratch in Python 3.

This is a text classification problem. Naive Bayes makes two simplifying assumptions:

Bag of words assumption: word positions do not matter, only which words occur and how often.
Conditional independence assumption: given the class (e.g. spam or ham), the feature probabilities are independent of one another.
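
Taken together, the two assumptions lead to the usual Naive Bayes decision rule, written here as a sketch. The symbols are assumptions made for illustration: w_1, ..., w_n are the words of an email, V is the vocabulary, and count(w, c) is the number of times word w occurs in training emails of class c. Add-one (Laplace) smoothing is assumed in the likelihood; the notebook may use a different smoothing scheme.

```latex
\hat{c} = \operatorname*{arg\,max}_{c \in \{\text{spam}, \text{ham}\}}
  \left[ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c) \right],
\qquad
P(w \mid c) = \frac{\operatorname{count}(w, c) + 1}
                   {\sum_{w' \in V} \operatorname{count}(w', c) + |V|}
```

Working with log probabilities avoids numerical underflow when many small probabilities are multiplied.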

The following image (naive_bayes_algorithm.png) shows the Naive Bayes algorithm for training and testing a text classifier:
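
As a textual companion to the figure, the training and testing steps could be sketched roughly as below. This is a minimal illustration, not the notebook's code: the function names `train_naive_bayes` and `predict` are hypothetical, documents are assumed to be already tokenized into lists of words, and add-one (Laplace) smoothing is assumed.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Estimate log priors and smoothed log likelihoods.

    docs   -- list of token lists (one per email); assumed pre-tokenized
    labels -- parallel list of class names, e.g. "spam" or "ham"
    """
    vocab = {word for doc in docs for word in doc}
    docs_per_class = Counter(labels)

    # Count how often each word appears in each class.
    word_counts = {c: Counter() for c in docs_per_class}
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc)

    log_prior = {c: math.log(n / len(docs)) for c, n in docs_per_class.items()}

    log_likelihood = {}
    for c, counts in word_counts.items():
        # Add-one smoothing: every vocabulary word gets one extra count.
        denominator = sum(counts.values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((counts[w] + 1) / denominator) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(doc, log_prior, log_likelihood, vocab):
    """Return the class with the highest log posterior score for `doc`."""
    scores = {}
    for c in log_prior:
        # Words never seen in training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in doc if w in vocab
        )
    return max(scores, key=scores.get)
```

A model returned by `train_naive_bayes` is a tuple, so it can be passed on with `predict(tokens, *model)`.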

At the end, a classification performance report is generated showing the confusion matrix, accuracy, precision, recall, and F1-score. The classifier is currently trained on the Enron dataset. However, it can be trained on any other email dataset by changing the respective paths.
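
For reference, the reported metrics can be derived from the confusion matrix as in the sketch below. The function name `evaluate` and the choice of spam as the positive class are assumptions made for illustration; the notebook's own report may be organized differently.

```python
def evaluate(y_true, y_pred, positive="spam"):
    """Compute confusion-matrix counts and derived metrics for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "confusion_matrix": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Precision answers "of the emails flagged as spam, how many really were spam?", while recall answers "of the real spam emails, how many were caught?".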

Usage 🔧

The program requires paths to the train and test folders, each of which contains spam and ham folders holding the respective email files used to build the datasets.

In Spam Ham Email Classification.ipynb, cell 5 contains the following code:

makeDatasets('train/spam', 'train/ham', 'test/spam', 'test/ham')

These are the paths to the dataset folders. Change them to train and test on any other dataset.
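
To illustrate the expected folder layout, the sketch below reads every file in one such folder and turns it into a list of lowercase tokens. It is not the notebook's `makeDatasets`: the helper name `read_folder`, the regex-based tokenizer, the `latin-1` encoding, and the optional stopword filtering are all assumptions made for this example.

```python
import os
import re


def read_folder(folder, label, stopwords=frozenset()):
    """Return a list of (tokens, label) pairs, one per file in `folder`."""
    examples = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip nested directories
        # latin-1 can decode any byte sequence, so odd characters in raw
        # emails do not raise a decoding error (an assumption, not a
        # requirement of the dataset).
        with open(path, encoding="latin-1") as f:
            text = f.read().lower()
        tokens = [t for t in re.findall(r"[a-z]+", text) if t not in stopwords]
        examples.append((tokens, label))
    return examples
```

Under these assumptions, a training set could be assembled with `read_folder('train/spam', 'spam') + read_folder('train/ham', 'ham')`.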

Author 👋

You can get in touch with me on my LinkedIn Profile:

Ahmad Shafique

You can also follow my GitHub Profile to stay updated about my latest projects:

If you liked the repo then please support it by giving it a star ⭐!

Contributions Welcome ✨

If you find a bug in the code or have an improvement in mind, feel free to open a pull request.

