fl-wxiao/Spam-Classifier

 
 


Spam-Classifier



📌 Introduction:-

This project applies natural language processing to SMS data to predict whether a message is spam or ham. It compares the accuracy of several ML algorithms (Multinomial Naive Bayes, logistic regression, SVM, and decision trees) under different data cleaning and processing techniques: PorterStemmer, CountVectorizer, TFIDF Vectorizer, and WordNetLemmatizer. It is also implemented using an LSTM with word embeddings, which reaches an accuracy of 97.84%.
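As a rough illustration of one of the combinations named above (PorterStemmer + CountVectorizer + Multinomial Naive Bayes), here is a minimal sketch. It assumes NLTK and scikit-learn are installed; the `clean_text` helper, the four-message toy corpus, and the lack of stop-word removal are assumptions made for the example, not the notebook's actual code or data.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

stemmer = PorterStemmer()


def clean_text(message: str) -> str:
    """Hypothetical helper: keep letters only, lowercase, and stem each token."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", message).lower()
    return " ".join(stemmer.stem(token) for token in letters_only.split())


# Tiny made-up corpus for illustration only (not the real dataset).
messages = [
    "WINNER!! You have won a free prize, claim now",
    "Free entry to win cash prizes, text WIN now",
    "Are we still meeting for lunch today?",
    "I will be home late, see you at dinner",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words counts over the cleaned messages.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform([clean_text(m) for m in messages])

# Fit Multinomial Naive Bayes on the count features.
model = MultinomialNB().fit(features, labels)

# Classify an unseen message with the same cleaning and vocabulary.
new_features = vectorizer.transform([clean_text("Claim your free prize now")])
prediction = model.predict(new_features)[0]
print(prediction)
```

Swapping `CountVectorizer` for `TfidfVectorizer`, or the stemmer for NLTK's `WordNetLemmatizer` (which needs the WordNet corpus downloaded first), gives the other preprocessing variants.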

✔❌Accuracy ❌✔:-

| Text Preprocessing Type | Logistic Regression | Multinomial NB | Support Vector Machine | Decision Tree |
|---|---|---|---|---|
| TFIDF Vectorizer + PorterStemmer | 96.68% | 97.30% | 98.47% | 96.68% |
| CountVectorizer + PorterStemmer | 98.65% | 98.56% | 98.74% | 97.84% |
| CountVectorizer + WordnetLemmatizer | 98.56% | 98.29% | 98.38% | 97.75% |
| TFIDF Vectorizer + WordnetLemmatizer | 96.41% | 97.48% | 98.47% | 96.86% |
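A comparison grid like the one above could be produced along the following lines. This is a sketch under assumptions, not the notebook's code: the `compare_models` function, the 80/20 stratified split, the linear SVM kernel, and all hyperparameters are illustrative choices, and the toy messages exist only so the block runs on its own. Stemming or lemmatization would be applied to the texts before calling it.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(texts, labels, test_size=0.2, seed=0):
    """Hypothetical helper: test accuracy for every vectorizer/classifier pair."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    vectorizers = {
        "CountVectorizer": CountVectorizer,
        "TFIDF Vectorizer": TfidfVectorizer,
    }
    classifiers = {
        "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
        "Multinomial NB": lambda: MultinomialNB(),
        "Support Vector Machine": lambda: SVC(kernel="linear"),  # kernel is an assumption
        "Decision Tree": lambda: DecisionTreeClassifier(random_state=seed),
    }
    results = {}
    for vec_name, make_vectorizer in vectorizers.items():
        for clf_name, make_classifier in classifiers.items():
            # The pipeline fits the vectorizer on training text only.
            pipeline = make_pipeline(make_vectorizer(), make_classifier())
            pipeline.fit(x_train, y_train)
            results[(vec_name, clf_name)] = pipeline.score(x_test, y_test)
    return results


# Made-up messages so the sketch is self-contained; accuracies on this toy
# data say nothing about the figures reported in the table.
spam_examples = [
    "win a free prize now",
    "free cash offer claim today",
    "urgent call now to claim your reward",
    "you have won free tickets",
    "cheap loans text yes now",
]
ham_examples = [
    "are we meeting for lunch",
    "see you at home tonight",
    "can you call me later",
    "happy birthday have a great day",
    "running late be there soon",
]
texts = (spam_examples + ham_examples) * 2
labels = (["spam"] * 5 + ["ham"] * 5) * 2

results = compare_models(texts, labels)
for (vec_name, clf_name), accuracy in results.items():
    print(f"{vec_name} + {clf_name}: {accuracy:.2%}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary from being learned on the held-out messages, which would otherwise inflate the measured accuracy.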

WorkFlow:-

[Figure: Workflow of the SMS spam classifier]

🏁 Datasets Used:-

  • The dataset used is the SMS Spam Dataset created by UCI Machine Learning. It was downloaded from Kaggle. You can download it here.
  • The reference for this dataset can be found here.
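Loading the dataset might look like the sketch below. It assumes the Kaggle download is a CSV whose label and text columns are named `v1` and `v2` and which is encoded as latin-1; those details, the `load_sms` helper, and the two in-memory sample rows are assumptions for illustration, so check them against the file you actually download.

```python
import io

import pandas as pd


def load_sms(source) -> pd.DataFrame:
    """Hypothetical loader: read label/text columns and add a 0/1 target."""
    # Column names v1/v2 and the latin-1 encoding are assumptions about the file.
    df = pd.read_csv(source, encoding="latin-1", usecols=["v1", "v2"])
    df = df.rename(columns={"v1": "label", "v2": "message"})
    df["is_spam"] = (df["label"] == "spam").astype(int)
    return df


# Two made-up rows in the assumed layout, so the sketch runs without the file.
sample = io.BytesIO(
    "v1,v2,Unnamed: 2\n"
    "ham,See you at lunch,\n"
    "spam,Free prize call now,\n".encode("latin-1")
)
df = load_sms(sample)
print(df["label"].value_counts())
```

With the real file, pass its path instead of the in-memory buffer, for example `load_sms("spam.csv")` if that is the name of the downloaded file.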

📧Contact:-

For any suggestions or help with the model code, please mail me at ksdkamesh99@gmail.com.

📜 LICENSE

MIT

Languages

  • Jupyter Notebook 100.0%