Text summarization is an advanced project that falls under the umbrella of Natural Language Processing. There are multiple methods for summarizing text.
They can be broadly grouped into 2 approaches:
- Abstractive: Understand the true context of text before summarization (like a human).
- Extractive: Rank the text within the file and identify the impactful terms.
While both approaches are under active research, extractive summarization is presently used across multiple platforms. The extractive approach itself covers several summarization methods.
In this script we will use the TextRank approach for text summarization.
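To make the idea concrete, here is a minimal sketch of TextRank-style sentence ranking. It is not the code in `text_summary.py`: the function names (`tokenize`, `cosine_similarity`, `textrank_scores`, `summarize`), the tiny stopword set, and the damping value of 0.85 are all assumptions chosen for illustration. It builds a sentence similarity matrix and runs a PageRank-style power iteration with numpy, with no graph library involved.

```python
import re
import numpy as np

# Tiny illustrative stopword set (an assumption, not NLTK's full English list).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "it", "on", "for", "by"}

def tokenize(sentence):
    """Lowercase a sentence and keep only alphabetic, non-stopword tokens."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def cosine_similarity(words_a, words_b):
    """Cosine similarity between the bag-of-words count vectors of two token lists."""
    vocab = sorted(set(words_a) | set(words_b))
    if not vocab:
        return 0.0
    vec_a = np.array([words_a.count(w) for w in vocab], dtype=float)
    vec_b = np.array([words_b.count(w) for w in vocab], dtype=float)
    denom = np.linalg.norm(vec_a) * np.linalg.norm(vec_b)
    return float(vec_a @ vec_b / denom) if denom else 0.0

def textrank_scores(sentences, damping=0.85, iterations=100, tol=1e-6):
    """Score sentences by PageRank-style power iteration over a similarity matrix."""
    n = len(sentences)
    if n == 0:
        return np.array([])
    tokens = [tokenize(s) for s in sentences]
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i, j] = cosine_similarity(tokens[i], tokens[j])
    # Normalise each column to sum to 1; a sentence similar to nothing
    # spreads its weight uniformly over all sentences.
    col_sums = sim.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    transition = np.where(col_sums > 0, sim / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        new_scores = (1 - damping) / n + damping * (transition @ scores)
        converged = np.abs(new_scores - scores).sum() < tol
        scores = new_scores
        if converged:
            break
    return scores

def summarize(sentences, top_n=2):
    """Return the top_n highest-scoring sentences in their original order."""
    scores = textrank_scores(sentences)
    top = sorted(np.argsort(scores)[::-1][:top_n])
    return [sentences[i] for i in top]

sentences = [
    "Text summarization shortens a long document.",
    "Extractive summarization ranks sentences in the document.",
    "TextRank ranks sentences with a graph of sentence similarity.",
    "My cat sleeps all day.",
]
print(summarize(sentences, top_n=2))
```

Sentences that share vocabulary with many others collect a higher score, so the off-topic sentence about the cat ranks last.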
- nltk
- numpy
- networkx
stopwords
- Stopwords are English words that do not add much meaning to a sentence.
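As a small illustration of stopword removal, the sketch below filters a tokenized sentence against a stopword set. The helper name `remove_stopwords` and the hand-picked set are assumptions made so the example runs without any downloads. With NLTK and its stopwords corpus installed, the full English list would normally come from `nltk.corpus.stopwords`.

```python
def remove_stopwords(words, stop_words):
    """Drop every word whose lowercase form appears in stop_words."""
    return [w for w in words if w.lower() not in stop_words]

# With NLTK installed and the corpus downloaded, the full list could be loaded with:
#   from nltk.corpus import stopwords
#   stop_words = set(stopwords.words("english"))
# A tiny hand-picked subset (an assumption) keeps this example self-contained.
stop_words = {"the", "is", "a", "of", "on"}

print(remove_stopwords("The cat is on the mat".split(), stop_words))
# prints ['cat', 'mat']
```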
- Set up a `python 3.x` virtual environment.
- Activate the environment.
- Install the dependencies using `pip3 install -r requiremnts.txt`.
- Set up the models by running the following command:
$ python -m nltk.downloader stopwords
- Run the `text_summary.py` file.
- Enter the source path.
The code also generates tokens (the same as weights) for a set of words, which show the relative importance of those words according to the summarizer. To see them, uncomment line 112 (l-112).
Results can be found here.
Made by Vybhav Chaturvedi