A text summariser application

A text summariser web app using latent semantic analysis, hosted on AWS EC2 with optional API on AWS Lambda.
Web app link

API folder

Code to deploy the SLA summariser as a microservice API on AWS Lambda.

Frontend folder

A web front end implemented in Flask.

Background - two types of summarisers:

Extractive: A summary is formed by picking the most important sentences in the given text based on a scoring system. Common approaches:

Using topic words.
Frequency-driven: how often each word in the sentence appear within and without the sentence.
Latent semantic analysis: An unsupervised method to discover the underlying semantics (topics) of the text.

Abstractive: Using advanced analytics techniques such as deep learning (CNN, RNN with LSTM) to generate a new text that conveys the key information in the original [1].

A summary of the algorithm:

The sentences are cleaned and lemmatised (convert words to their root form, i.e. grew to grow).
A term-sentence matrix is built with TF-IDF (term frequency inverse document frequency).
Apply SVD (singular value decomposition) on the above matrix.
Each sentence is given a weight based on how much information they contain for important topics. This approach is based on the sentence length approach proposed by [2].
Sentences that are at the top of the text is given a slightly higher weight.
Sentences are added to the summary one by one until a set ratio of information is reached.

Reference:

[1] Allahyari, M, Pouriyeh, S. et al (2017). Text Summarization Techniques: A Brief Survey.
[2] Steinberger, J, Poesio, M. et al (2007). Two uses of anaphora resolution in summarization.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
api		api
frontend		frontend
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A text summariser application

API folder

Frontend folder

Background - two types of summarisers:

A summary of the algorithm:

Reference:

About

Releases

Packages

Languages

tri47/text-summariser

Folders and files

Latest commit

History

Repository files navigation

A text summariser application

API folder

Frontend folder

Background - two types of summarisers:

A summary of the algorithm:

Reference:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages