webianks/scout

Scout

Scout makes email search easier by extracting the search parameters from a plain-English query.

Problem

Email search on a mail server requires explicit input parameters such as subject, date, from and to. The task is to translate a plain-English search text into that set of input parameters.

Example

  • mails from ravi in the last 3 days -> From:- Ravi, ToDate:- Today, FromDate:- Today-3days
  • ppts/presentations from ravi to me and rohan -> From:- Ravi, To:- (Me and Rohan), AttachmentType:- ppt
  • all attachments larger than 3MB -> AttachmentSize > 3MB
  • Citrix XenMobile document -> AttachmentType:- ppt/doc/xls/txt/pdf, AttachmentName:- Citrix/XenMobile/Citrix XenMobile
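
The relative values in these examples (Today-3days, 3MB) eventually have to become concrete dates and sizes. The following is a minimal Python sketch of that step, not the app's actual code. The function names `resolve_last_n_days` and `parse_size_bytes` are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import re
from datetime import date, timedelta
from typing import Tuple


def resolve_last_n_days(phrase: str, today: date) -> Tuple[date, date]:
    """Turn a phrase such as 'last 3 days' into (FromDate, ToDate)."""
    match = re.search(r"(\d+)\s*days?", phrase, re.IGNORECASE)
    if match is None:
        raise ValueError(f"no day count found in {phrase!r}")
    days = int(match.group(1))
    # FromDate is Today - N days, ToDate is Today.
    return today - timedelta(days=days), today


def parse_size_bytes(text: str) -> int:
    """Turn a size such as '3MB' into bytes (assumes 1 MB = 1024 * 1024 bytes)."""
    units = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(KB|MB|GB)\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * units[match.group(2).upper()])
```

Passing `today` in explicitly, instead of calling `date.today()` inside the function, keeps the conversion easy to test.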

Search sentences break-down

Every search sentence should be broken down into the following parameters.

  1. From
  2. To
  3. ToDate
  4. FromDate
  5. HasAttachments
  6. AttachmentType
  7. AttachmentSize
  8. AttachmentName
  9. Subject
  10. CC
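
One way to hold these ten parameters is a small record type. The sketch below is an illustration in Python, not the structure used in the Android app. The class name `SearchParameters` and its field names are hypothetical, and `sender` stands in for From because `from` is a reserved word in Python.

```python
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import Any, Dict, List, Optional


@dataclass
class SearchParameters:
    """The ten search parameters listed above; unset ones stay empty."""
    sender: List[str] = field(default_factory=list)           # From
    to: List[str] = field(default_factory=list)               # To
    to_date: Optional[date] = None                             # ToDate
    from_date: Optional[date] = None                           # FromDate
    has_attachments: Optional[bool] = None                     # HasAttachments
    attachment_type: List[str] = field(default_factory=list)  # AttachmentType
    attachment_size: Optional[str] = None                      # AttachmentSize, e.g. "> 3MB"
    attachment_name: List[str] = field(default_factory=list)  # AttachmentName
    subject: Optional[str] = None                              # Subject
    cc: List[str] = field(default_factory=list)               # CC

    def populated(self) -> Dict[str, Any]:
        """Return only the parameters that were actually extracted."""
        return {k: v for k, v in asdict(self).items() if v not in (None, [])}


# "ppts/presentations from ravi to me and rohan"
params = SearchParameters(sender=["Ravi"], to=["Me", "Rohan"], attachment_type=["ppt"])
print(params.populated())
```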

Solution

After studying the problem statement, we concluded that it can be solved with Natural Language Processing. We reviewed the available NLP options and chose IBM Watson Knowledge Studio together with the IBM Alchemy Language API.

The core task is to find the entities in a query. We train a model for this in Watson Knowledge Studio, using annotations that we provide, so that it recognizes entities such as Username, Time, Attachment, From and Subject. We give the model a generalized training set and then deploy the trained model to the Alchemy API service.

The Android app takes a file as input and calls the Alchemy service once for each line. The service returns a JSON response containing the entities in the order in which they were detected. The app then looks for patterns in those entities, deduces the required parameters from them, and writes the output to the file.
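
The per-line flow described above can be sketched as a small loop. This is a Python illustration under stated assumptions, not the app's code: `process_queries`, `fake_extract` and `simple_deduce` are hypothetical names, and the call to the trained service is replaced by a stub that returns a canned response, so no real API signature is implied.

```python
from typing import Callable, Dict, Iterable, List

Entity = Dict[str, str]


def process_queries(
    lines: Iterable[str],
    extract_entities: Callable[[str], List[Entity]],
    deduce: Callable[[List[Entity]], Dict[str, str]],
) -> List[str]:
    """Run every non-empty query line through extraction and deduction."""
    output = []
    for line in lines:
        query = line.strip()
        if not query:
            continue  # skip blank lines in the input file
        entities = extract_entities(query)  # one service call per line
        params = deduce(entities)
        formatted = ", ".join(f"{key}:- {value}" for key, value in params.items())
        output.append(f"{query} -> {formatted}")
    return output


def fake_extract(query: str) -> List[Entity]:
    # Stand-in for the network call to the trained model (canned response).
    return [{"type": "FROM", "text": "from"}, {"type": "USERNAME", "text": "Ravi"}]


def simple_deduce(entities: List[Entity]) -> Dict[str, str]:
    # A FROM entity directly followed by a USERNAME yields the From parameter.
    params = {}
    for previous, current in zip(entities, entities[1:]):
        if previous["type"] == "FROM" and current["type"] == "USERNAME":
            params["From"] = current["text"]
    return params


results = process_queries(["mails from ravi\n", "\n"], fake_extract, simple_deduce)
print(results)
```

Because `lines` is any iterable of strings, an open file object can be passed in directly, and the returned lines can be written to the output file.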

An example of entity recognition and processing

Query: PDFs from Ravi in the last 3 days

The entities are as follows (JSON response):

"entities": [
  {
    "count": "1",
    "text": "PDF",
    "type": "ATTACHMENT_TYPE"
  },
  {
    "count": "1",
    "text": "from",
    "type": "FROM"
  },
  {
    "count": "1",
    "text": "Ravi",
    "type": "USERNAME"
  },
  {
    "count": "1",
    "text": "last",
    "type": "SPAN"
  },
  {
    "count": "1",
    "text": "3 days",
    "type": "DATE"
  }
]
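
To illustrate the pattern step, here is a hedged Python sketch that walks the entities above in order and deduces parameters from adjacent pairs. The function name `deduce_parameters` and the three rules are assumptions chosen to fit this one example, not the app's actual rule set. The response is wrapped in braces so that it parses as a complete JSON object.

```python
import json
import re
from datetime import date, timedelta
from typing import Dict, List

RESPONSE = """
{"entities": [
  {"count": "1", "text": "PDF", "type": "ATTACHMENT_TYPE"},
  {"count": "1", "text": "from", "type": "FROM"},
  {"count": "1", "text": "Ravi", "type": "USERNAME"},
  {"count": "1", "text": "last", "type": "SPAN"},
  {"count": "1", "text": "3 days", "type": "DATE"}
]}
"""


def deduce_parameters(entities: List[Dict[str, str]], today: date) -> Dict[str, str]:
    """Map an ordered entity list to search parameters (illustrative rules only)."""
    params: Dict[str, str] = {}
    previous = None
    for entity in entities:
        kind, text = entity["type"], entity["text"]
        if kind == "ATTACHMENT_TYPE":
            params["AttachmentType"] = text.lower()
        elif kind == "USERNAME" and previous and previous["type"] == "FROM":
            # "from" followed by a user name gives the sender.
            params["From"] = text
        elif kind == "DATE" and previous and previous["type"] == "SPAN" \
                and previous["text"].lower() == "last":
            # "last" followed by "N days" gives a window that ends today.
            match = re.search(r"(\d+)\s*days?", text)
            if match:
                start = today - timedelta(days=int(match.group(1)))
                params["FromDate"] = start.isoformat()
                params["ToDate"] = today.isoformat()
        previous = entity
    return params


entities = json.loads(RESPONSE)["entities"]
print(deduce_parameters(entities, today=date(2017, 3, 10)))
```

The rules rely on the order of the entities, which is why the order in which the service reports them matters.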

License

MIT License

Copyright (c) 2017 Ramankit Singh : Copyright (c) 2017 Sajal Gupta

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
