mashchenskaia/Text-summarization

Text summarization

We explored two types of summarization, abstractive and extractive, to weigh the pros and cons of both approaches.

Preparation:

We parsed articles from the Russian Foreign Economic Bulletin as source material for the non-ML extractive approaches. We also used the sentence tokenizer created by the MIPT DeepPavlov Lab.

Extractive approach:

We started our research with non-ML approaches, using TextRank and LexRank for summarization.
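To make the graph-based idea concrete, here is a minimal TextRank-style sketch using only the standard library. It is not the project's code: the function names, the regex tokenizer, the word-overlap similarity, and the damping value of 0.85 are all illustrative assumptions, and LexRank would typically replace the overlap measure with a TF-IDF cosine similarity.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    """Word overlap between two token lists, normalized by log lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:
        return 0.0  # both sentences have a single token
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score to sentence i.
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many other sentences accumulate score, while an isolated sentence keeps only the baseline `(1 - damping) / n`.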

We relied on the experience described in the following papers:

For ML-based extractive summarization, we used the BERTSUM approach.
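A BERTSUM-style model assigns each sentence a score, and the summary is then assembled from the top-scoring sentences, commonly with trigram blocking to reduce redundancy. The sketch below covers only that selection step and assumes the scores already exist. The function names, the whitespace tokenization, and the choice to return sentences in document order are illustrative assumptions, not the project's implementation.

```python
def trigrams(sentence):
    """Return the set of word trigrams in a sentence."""
    words = sentence.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def select_sentences(sentences, scores, k=3):
    """Pick up to k sentences by score, skipping any that repeat a trigram."""
    order = sorted(range(len(sentences)),
                   key=lambda i: scores[i], reverse=True)
    chosen, seen = [], set()
    for i in order:
        grams = trigrams(sentences[i])
        if grams & seen:
            continue  # overlaps with an already selected sentence
        chosen.append(i)
        seen |= grams
        if len(chosen) == k:
            break
    return [sentences[i] for i in sorted(chosen)]
```

For example, with hypothetical scores `[0.9, 0.8, 0.4]`, a second sentence that repeats a trigram from the first is skipped in favor of the lower-scoring but non-redundant third one.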

Abstractive approach:

In the first R&D iteration, we used a pointer-generator network (PGN) architecture built on the AllenNLP framework. We relied on the experience described in the following paper:
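As an illustration of the core PGN idea, the sketch below mixes a generation distribution over the vocabulary with a copy distribution given by attention over the source tokens, weighted by a generation probability `p_gen`. It follows the standard pointer-generator formulation rather than this project's AllenNLP code. The function name, the dictionary representation, and the example values are assumptions.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Combine generation and copy probabilities for one decoding step.

    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on copies of w)
    """
    final = {w: p_gen * p for w, p in vocab_dist.items()}
    for token, weight in zip(source_tokens, attention):
        # Source tokens outside the vocabulary still receive copy probability.
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Hypothetical step: "Sberbank" is out of vocabulary but appears in the source.
dist = final_distribution(
    p_gen=0.5,
    vocab_dist={"the": 0.6, "bank": 0.4},
    attention=[0.25, 0.75],
    source_tokens=["bank", "Sberbank"],
)
```

The copy term is what lets the decoder emit rare names and terms that a fixed vocabulary would otherwise map to an unknown token.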

Study perspectives:

We are currently working on improving the study in both areas:

  • Extractive approach: language-model fine-tuning for a specific domain (domain adaptation)
  • Abstractive approach: because the ROUGE metric is discrete, it cannot be optimized directly as a differentiable training objective, so we are studying a combined RL+ML approach to modeling the objective function
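The sketch below shows why ROUGE resists gradient-based optimization and how a mixed objective can work around it. ROUGE-N is computed from n-gram counts, so it changes in jumps rather than smoothly with the model parameters. A common workaround is to treat it as a reward in a self-critical policy-gradient loss and blend that loss with the usual likelihood loss. Reading "ML" as maximum likelihood is an assumption here, as are the function names and the weight `gamma`. This is not the project's training code.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the word n-grams in a whitespace-tokenized text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap counts."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def self_critical_loss(sample_log_prob, sample_reward, baseline_reward):
    """Policy-gradient loss: a sample that beats the baseline gets reinforced."""
    return (baseline_reward - sample_reward) * sample_log_prob


def mixed_loss(rl_loss, ml_loss, gamma):
    """Blend the RL loss with the likelihood loss; gamma is a tunable weight."""
    return gamma * rl_loss + (1.0 - gamma) * ml_loss
```

In this setup the reward is only a scalar multiplier on the log-probability of a sampled summary, so gradients flow through the log-probability and never through ROUGE itself.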

About

R&D project on text summarization
