Email has become an essential form of communication on the web. Companies, schools, and regular businesses use it to disseminate information to a wide audience. Identifying the main parts of an email provides a useful source of information for extracting not only the user's profile but also the associated writing trends and patterns. This project proposes an automatic approach to detect the general structure of emails by extracting the greeting, body, and signature zones. Specifically, a recurrent neural network enhanced with a set of customized rule-based constraints is employed to detect the different email parts. The proposed method is applied to well-known email corpora (Enron, Apache mailing list, etc.), outperforming baseline results from traditional algorithms and hand-crafted rules. The results show that analyzing word embedding sequences and applying specific word-position rules help to accurately predict the email zone of each text line.
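As a rough illustration of how word-position rules might constrain a line-level classifier, the sketch below takes per-line zone scores and masks out zones that a position rule forbids. It is only a minimal sketch under stated assumptions: the function name `apply_position_rules`, the window sizes, and the score format are hypothetical and not taken from this repository, and the scores would in practice come from the recurrent network rather than being written by hand.

```python
from typing import Dict, List

# Zone labels named in the project description.
ZONES = ("greeting", "body", "signature")


def apply_position_rules(
    line_scores: List[Dict[str, float]],
    greeting_window: int = 2,   # hypothetical: greetings only in the first lines
    signature_window: int = 4,  # hypothetical: signatures only in the last lines
) -> List[str]:
    """Choose one zone per line, ignoring zones that a position rule forbids."""
    n = len(line_scores)
    labels = []
    for i, scores in enumerate(line_scores):
        allowed = {"body"}  # a body line may appear anywhere
        if i < greeting_window:
            allowed.add("greeting")
        if i >= n - signature_window:
            allowed.add("signature")
        # Keep a fixed candidate order so ties resolve deterministically.
        candidates = [zone for zone in ZONES if zone in allowed]
        labels.append(max(candidates, key=lambda zone: scores.get(zone, 0.0)))
    return labels


# Hand-written scores standing in for model output on a five-line email.
example_scores = [
    {"greeting": 0.8, "body": 0.1, "signature": 0.1},
    {"greeting": 0.1, "body": 0.7, "signature": 0.2},
    {"greeting": 0.6, "body": 0.3, "signature": 0.1},  # mid-email "greeting"
    {"greeting": 0.1, "body": 0.2, "signature": 0.7},
    {"greeting": 0.0, "body": 0.1, "signature": 0.9},
]
example_labels = apply_position_rules(
    example_scores, greeting_window=1, signature_window=2
)
```

In this toy input the third line scores highest as a greeting, but the position rule relabels it as body because greetings are only permitted in the opening window.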
kartiikthakur/Email-Zoning--ML-Project
About
Email Zoning component that detects the different parts of an email and extracts its greeting, body, and signature, using word embedding features and a recurrent neural network model.