This is a machine learning project that distinguishes spam messages from regular messages. The project includes data cleaning and preprocessing, building a bag-of-words model, and training a Naive Bayes classifier. The final accuracy of the model is 98.39%.

ShubhamSharma476/Spam-Classifier-NLP

Spam-Classifier-NLP

This repository contains the code for a machine learning model that classifies SMS messages as spam or ham (not spam) using a Naive Bayes classifier. The model was trained and tested on a custom dataset of SMS messages.

Dataset:

The dataset contains SMS messages, each labeled as spam or ham. It was preprocessed by removing stop words and stemming the remaining words, then split into training and testing sets.

Data Preprocessing:

The messages were cleaned with the NLTK library: stop words were removed and the remaining words were reduced to their stems. The preprocessed text was then converted into a bag-of-words representation using CountVectorizer from the scikit-learn library.
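The preprocessing steps described above might look roughly like the following sketch. It is not the repository's actual code: the `preprocess` helper, the sample messages, and the tiny `STOP_WORDS` set are hypothetical. The small inline set stands in for NLTK's English stop-word list (`nltk.corpus.stopwords`), which needs a one-time download, and the Porter stemmer is an assumed choice, since the stemmer used in the project is not stated.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer

# Assumption: a tiny illustrative stop-word set. The project presumably used
# NLTK's full English list, which requires nltk.download("stopwords").
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "you", "of", "at", "now"}

# Assumption: Porter stemming; the source only says words were stemmed.
stemmer = PorterStemmer()


def preprocess(message):
    """Lowercase, keep letters only, drop stop words, stem what remains."""
    words = re.sub("[^a-zA-Z]", " ", message).lower().split()
    return " ".join(stemmer.stem(w) for w in words if w not in STOP_WORDS)


# Hypothetical sample messages, not taken from the project's dataset.
messages = [
    "Congratulations! You are the winner of a free prize, call now",
    "Are you coming to the meeting at noon?",
]
corpus = [preprocess(m) for m in messages]

# Bag of words: one row per message, one column per distinct stemmed word.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

print(corpus)
print(X.shape)
```

Each row of `X` holds the word counts for one message, which is the input format the classifier expects.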

Model:

The model used for this project was a Naive Bayes classifier. It was trained on the preprocessed training set and evaluated on the test set, where it achieved an accuracy of 98.39%.
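As a rough illustration of the training and evaluation step, here is a minimal sketch using scikit-learn. The toy messages and labels are hypothetical, and `MultinomialNB` is an assumption: the project only states that a Naive Bayes classifier was used, not which variant. The split ratio and random seed are likewise illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the preprocessed SMS corpus.
texts = [
    "win free prize call",
    "free cash claim reward",
    "urgent offer win cash",
    "claim free ticket call",
    "see lunch tomorrow",
    "call me home later",
    "meet noon office",
    "thank help yesterday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = ham

# Split first so the vocabulary is learned from the training set only.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)  # unseen words are ignored

# Assumption: the multinomial variant, a common choice for word counts.
model = MultinomialNB()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2%}")
```

On such a tiny toy set the printed accuracy is not meaningful; the sketch only shows the shape of the workflow.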

Results:

The final model achieved an accuracy of 98.39% on the test set, meaning it labeled the large majority of held-out SMS messages correctly as spam or ham. The model could be improved further by using more advanced algorithms, such as neural networks, or by incorporating additional features, such as message length or time sent.
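One possible way to add a feature such as message length is to append it as an extra column to the bag-of-words matrix. The sketch below is only an illustration of that idea, not part of the project; the sample messages are hypothetical, and using the raw character count as the feature is an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample messages.
messages = ["win free prize now", "see you at lunch"]

# Word-count features, as in the bag-of-words model.
counts = CountVectorizer().fit_transform(messages)

# Assumption: message length measured in characters, one value per message.
lengths = csr_matrix(np.array([[len(m)] for m in messages]))

# Append the length column to the right of the word-count columns.
X = hstack([counts, lengths]).tocsr()
print(X.shape)
```

Because the length is non-negative, the combined matrix can still be passed to a multinomial Naive Bayes model, although a large raw length may need scaling so that it does not dominate the word counts.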
