One of the primary methods for spam mail detection is email filtering. It involves categorizing incoming emails as spam or non-spam. Machine learning algorithms can be trained to filter out spam mail based on its content and metadata.

kanagalingamcse/email-spam-detection

EMAIL SPAM DETECTION

DESCRIPTION

• The project is written entirely in Python.

• The dataset is taken from Kaggle: https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset/code

• Required packages: pandas, nltk, sklearn (scikit-learn), seaborn, matplotlib, and tqdm. The re and time modules are also used; they ship with the Python standard library and need no installation.

• The operations performed are data preprocessing, NLP, classification, and generation of a classification report.

• Logistic Regression is used as the classification model, with the aim of high accuracy on the text features produced by the NLP operations.

• The confusion matrix is visualised as a heatmap to give a clear view of the classification model's performance.

• Finally, the classification report is generated.
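
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the tiny `MESSAGES` sample, the `clean_text` and `train_and_evaluate` helpers, and the choice of TF-IDF features are all assumptions made for the example. A real run would load the Kaggle dataset with pandas instead.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Made-up sample standing in for the Kaggle dataset (illustrative only).
MESSAGES = [
    ("ham", "Are we still meeting for lunch today?"),
    ("ham", "Please send me the report when you can."),
    ("ham", "I will call you after the class ends."),
    ("ham", "Thanks for the birthday wishes, see you soon."),
    ("ham", "Can you pick up some milk on the way home?"),
    ("ham", "The meeting has moved to three in the afternoon."),
    ("spam", "WINNER! Claim your free prize now, call today."),
    ("spam", "Free entry to win cash, text WIN to claim."),
    ("spam", "Urgent! Your account won a free holiday prize."),
    ("spam", "Click here to claim your free cash reward now."),
    ("spam", "You have won a prize, call now to claim it."),
    ("spam", "Cheap loans approved, click now for free cash."),
]


def clean_text(text: str) -> str:
    """Preprocessing: lowercase and replace every run of non-letters with a space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()


def train_and_evaluate(rows, test_size=0.25, seed=0):
    """Train a TF-IDF + Logistic Regression model and evaluate it on a held-out split."""
    labels = [label for label, _ in rows]
    texts = [clean_text(body) for _, body in rows]
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # TF-IDF is an assumed feature representation for this sketch.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    cm = confusion_matrix(y_test, predictions, labels=["ham", "spam"])
    report = classification_report(
        y_test, predictions, labels=["ham", "spam"], zero_division=0
    )
    return model, cm, report


model, cm, report = train_and_evaluate(MESSAGES)
print(cm)
print(report)
```

The returned matrix `cm` could then be drawn with `seaborn.heatmap(cm, annot=True, fmt="d")` to obtain the heatmap mentioned above. With a sample this small the scores carry no meaning; the sketch only shows how the pieces connect.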

Other key steps in spam mail detection:

• Email Filtering: One of the primary methods for spam mail detection is email filtering. It involves categorizing incoming emails as spam or non-spam. Machine learning algorithms can be trained to filter out spam mail based on its content and metadata.

• Natural Language Processing: Natural Language Processing (NLP) is a set of techniques that enables machines to understand and process human language. It plays a crucial role in spam detection, as it helps extract meaningful features from parts of an email such as the subject, body, and attachments.

• Text Classification: Text classification is a supervised learning technique used for spam detection. It involves labelling emails as spam or non-spam based on their features, such as the presence of certain keywords, tone, or grammar.

• Feature Engineering: Feature engineering is the process of selecting relevant features from the email to classify it as spam or non-spam. It involves extracting features such as the sender's email address, the presence of certain words or phrases, and the length of the email.

• Supervised Learning: Supervised learning is a technique that involves training the model on labelled data to predict the labels of new, unlabeled data. It is widely used in spam detection for text classification tasks.

• Unsupervised Learning: Unsupervised learning is a technique used to find hidden patterns in the data without the need for labelled data. It can be used for anomaly detection, clustering, and association rule mining.

• Deep Learning: Deep learning is a subfield of machine learning that involves training deep neural networks with multiple hidden layers to learn complex features from the data. It has shown great promise in spam detection tasks.

• Neural Networks: Neural networks are machine learning models loosely inspired by the human brain; deep learning uses neural networks with many layers. They can be trained to extract meaningful features from emails and classify them as spam or non-spam.
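
As an illustration of the feature engineering step described above, the following sketch derives a few hand-crafted features from an email: the sender's domain, the message length, and the presence of certain words. The `SUSPICIOUS_WORDS` list, the `EmailFeatures` class, and the `extract_features` function are hypothetical names chosen for this example and are not part of the project.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword list; a real project would derive this from its data.
SUSPICIOUS_WORDS = ("free", "winner", "prize", "urgent", "click")


@dataclass
class EmailFeatures:
    sender_domain: str          # part of the sender address after "@"
    length: int                 # number of characters in the body
    suspicious_word_count: int  # occurrences of words from SUSPICIOUS_WORDS
    has_url: bool               # True if the body contains an http(s) link


def extract_features(sender: str, body: str) -> EmailFeatures:
    """Turn a raw sender address and body into simple numeric/boolean features."""
    domain = sender.rsplit("@", 1)[-1].lower() if "@" in sender else ""
    words = re.findall(r"[a-z']+", body.lower())
    count = sum(words.count(word) for word in SUSPICIOUS_WORDS)
    has_url = re.search(r"https?://", body, flags=re.IGNORECASE) is not None
    return EmailFeatures(domain, len(body), count, has_url)


features = extract_features(
    "promo@Example.com", "Click now: FREE prize at http://example.com"
)
print(features)
```

Features like these could be combined with text features and passed to a supervised classifier; which of them actually help is something to check against labelled data.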
