Emogram (Text Analysis for unstructured text)


A small set of tools that normalize and extract values from unstructured text messages using NLP techniques. Other applications can use these modules to extract information and public opinion from surveys, social networking sites, and similar sources.

Getting Started


What you need to run the program:

  • Python interpreter (3.7 recommended)
  • A clone of this repository :P
  • The necessary packages from PyPI, installed with the following commands:
pip install textblob
pip install spellchecker


Acronym Resolution

Expands acronyms that are present in the text as the first step of text normalization.
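
As a rough illustration of this step, here is a minimal sketch of dictionary-based acronym expansion. The `ACRONYMS` table and the `expand_acronyms` function are hypothetical names made up for this example; the module in this repository may use a different lookup source and interface.

```python
import re

# Hypothetical lookup table for illustration only; the project's real
# acronym list is not shown in this README.
ACRONYMS = {
    "brb": "be right back",
    "imo": "in my opinion",
    "idk": "i do not know",
}


def expand_acronyms(text, table=ACRONYMS):
    """Replace every whole word found in `table` with its expansion."""

    def replace(match):
        word = match.group(0)
        # Case-insensitive lookup; words not in the table are left untouched.
        return table.get(word.lower(), word)

    return re.sub(r"\b\w+\b", replace, text)
```

For example, `expand_acronyms("IMO the app is fine, brb")` gives `"in my opinion the app is fine, be right back"`. Matching on whole words keeps the table from firing inside longer words.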

Key Phrases Extraction

Uses the Rapid Automatic Keyword Extraction (RAKE) algorithm to determine key phrases in a body of text by analyzing how frequently each word appears and how often it co-occurs with other words in the text.
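
To make the scoring idea concrete, here is a small standard-library sketch of RAKE. It assumes a tiny, hypothetical `STOPWORDS` set and the commonly used degree/frequency word score; the function names are invented for this example and are not this repository's API.

```python
import re
from collections import defaultdict

# Tiny stopword set assumed for illustration; real RAKE setups use a
# much longer list.
STOPWORDS = {"a", "an", "and", "are", "but", "for", "in", "is", "it",
             "of", "the", "to", "was", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text):
    """Return (phrase, score) pairs sorted from highest to lowest score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word appears
    degree = defaultdict(int)  # co-occurrences, counting the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    # A phrase's score is the sum of degree/frequency over its words.
    scores = {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With the input `"The battery life is great, but the screen is dim."`, the top-ranked phrase is `battery life`, because its two words co-occur and so each has a higher degree than the single-word candidates.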

Polarity Detection

Uses TextBlob to detect the polarity of the normalized text, which ranges from -1 (strongly negative) to 1 (strongly positive).
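
The polarity score can be read from TextBlob roughly as sketched below. `detect_polarity` and `polarity_label` are hypothetical helper names, and the `neutral_band` threshold of 0.1 is an assumption chosen for illustration, not a value taken from this project.

```python
def detect_polarity(text):
    """Return TextBlob's polarity score for `text`, a float in [-1, 1]."""
    # Imported here so the labelling helper below works without TextBlob.
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def polarity_label(score, neutral_band=0.1):
    """Map a polarity score to a coarse label (threshold is an assumption)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity must lie between -1 and 1")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"
```

A typical call would be `polarity_label(detect_polarity("The service was great"))`. Bucketing the score is optional; downstream code can also work with the raw float.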

Auto Correct

Autocorrects misspelled words and typos in the text as part of text normalization.
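
The sketch below shows the general idea with a stand-in technique: closest-match lookup from Python's standard `difflib` module against a small, hypothetical `VOCABULARY`. It is not the spell-checking library this project installs, and the `autocorrect` name and the `cutoff` value are assumptions made for this example.

```python
import difflib
import re

# Hypothetical vocabulary for illustration; a real corrector would use a
# full dictionary or word-frequency list.
VOCABULARY = ["the", "service", "was", "really", "good", "delivery", "late"]


def autocorrect(text, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace unknown words with their closest vocabulary entry, if any."""
    known = set(vocabulary)

    def fix(match):
        word = match.group(0)
        lower = word.lower()
        if lower in known:
            return word
        # Take the single best candidate whose similarity is at least `cutoff`.
        matches = difflib.get_close_matches(lower, vocabulary, n=1, cutoff=cutoff)
        return matches[0] if matches else word

    return re.sub(r"[A-Za-z]+", fix, text)
```

For example, `autocorrect("the servcie was realy good")` returns `"the service was really good"`. Words with no sufficiently close match are left unchanged rather than guessed at.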


