Text summarization

We explored two types of summarization, abstractive and extractive, to compare the pros and cons of both approaches.

Preparation:

We parsed articles from the Russian Foreign Economic Bulletin as source material for the non-ML extractive approaches. We also used the sentence tokenizer created by the MIPT DeepPavlov Lab.

Extractive approach:

We started our research with non-ML approaches and used TextRank and LexRank for summarization.
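As a rough illustration of how graph-based rankers of this family work, here is a minimal sketch, not the code from this repository. It assumes plain bag-of-words cosine similarity between sentences and a fixed number of PageRank iterations; the function names (`tokenize`, `cosine`, `rank_sentences`, `summarize`) and the default parameters are hypothetical. TextRank and LexRank proper use their own similarity measures.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank over a sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    # Edge weights: similarity between every pair of distinct sentences.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to edge weight.
            incoming = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Return the k top-ranked sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k=2)` on a list of already tokenized sentences returns the two sentences that are most central in the similarity graph, which is the extractive summary in this simplified setting.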

We relied on the experience described in the following papers:

For ML-based extractive summarization, we used the BERTSUM approach:

Fine-tune BERT for Extractive Summarization
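The central idea of BERTSUM is its input layout: a `[CLS]` token is inserted before every sentence and a `[SEP]` token after it, and sentences alternate between two segment ids so the model can tell neighbours apart. The sketch below only illustrates that layout. It assumes whitespace splitting in place of BERT's WordPiece tokenizer, and the function name `build_bertsum_input` is hypothetical rather than taken from this repository or the BERTSUM code.

```python
def build_bertsum_input(sentences):
    """Lay out tokens, interval segment ids, and [CLS] positions for a document."""
    tokens, segment_ids, cls_positions = [], [], []
    for index, sentence in enumerate(sentences):
        # Whitespace splitting stands in for BERT's WordPiece tokenizer here.
        piece = ["[CLS]"] + sentence.split() + ["[SEP]"]
        # Remember where this sentence's [CLS] token lands in the sequence.
        cls_positions.append(len(tokens))
        tokens.extend(piece)
        # Interval segments: sentences alternate between segment 0 and 1.
        segment_ids.extend([index % 2] * len(piece))
    return tokens, segment_ids, cls_positions
```

The encoder output at each position in `cls_positions` serves as that sentence's representation, and a classification layer on top of it scores whether the sentence belongs in the summary.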

Abstractive approach:

For the first R&D iteration, we used a pointer-generator network (PGN) architecture built on the AllenNLP framework. We relied on the experience described in the following paper:

Point-less: More Abstractive Summarization with Pointer-Generator Networks
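A pointer-generator network mixes two distributions at each decoding step: with probability `p_gen` it generates a word from the fixed vocabulary, and otherwise it copies a source word according to the attention weights. The sketch below shows that mixture on plain Python dictionaries. It is an illustration under simplifying assumptions, not the AllenNLP-based model in this repository, and the name `final_distribution` and its arguments are hypothetical.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities into one word distribution."""
    # Generation part: scale the decoder's vocabulary distribution by p_gen.
    mixed = {word: p_gen * prob for word, prob in vocab_dist.items()}
    # Copy part: add attention mass to the source word at each position.
    for token, weight in zip(source_tokens, attention):
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed
```

Because copied words are added by their surface form, a source word missing from the vocabulary still receives probability mass, which is what lets the model reproduce rare names and numbers.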

Study perspectives:

We are currently working on improving the study in both directions:

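Extractive approach: LM fine-tuning for a specific domain, domain adaptation
Abstractive approach: the ROUGE metric is discrete and therefore not differentiable, so it cannot be optimised directly by gradient-based training; we are studying a combined RL+ML approach to modelling the objective function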
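To make the point about ROUGE concrete, the sketch below computes a ROUGE-N F1 score from clipped n-gram counts. The score depends only on integer overlap counts, so it offers no gradient with respect to model parameters. The `mixed_loss` helper is an assumption about what a combined RL+ML objective could look like (a weighted sum of a reinforcement-learning loss and a maximum-likelihood loss); it is hypothetical and not taken from this repository, as are all function names here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap counts."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mixed_loss(ml_loss, rl_loss, gamma):
    """Hypothetical weighted sum of an RL loss and a maximum-likelihood loss."""
    return gamma * rl_loss + (1 - gamma) * ml_loss
```

In an RL setting, a score such as `rouge_n(sampled_summary, reference)` would act as the reward for a sampled summary, while the maximum-likelihood term keeps the output fluent; `gamma` controls the balance between the two.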

