Summrz

This is a web app built with Django that provides a user interface to the state-of-the-art abstractive summarization work, Text Summarization with Pretrained Encoders, published by Yang Liu and Mirella Lapata in 2019. Instead of the common seq2seq approach, this work applies Bidirectional Encoder Representations from Transformers (BERT) to the summarization task.

Most breakthrough research takes a long time to be appreciated and deployed in real-world applications. I felt this especially with the summarization task: I couldn't find a single website or service offering abstractive summarization. This work is still far from perfect but quite powerful: the model can not only extract sentences from the original text but also use its own words where necessary to produce a summary. This is my attempt to deploy a working example so that people can easily play with the models and see how far this field has come.

Downloading Trained Models

This project has four pre-trained models that you can download and use directly. The first is BERT extractive only, the second is both extractive and abstractive (BERT), and the third is abstractive only. These three use CNN and Daily Mail data. The fourth is also both extractive and abstractive, but it uses XSum data. The four models are:

CNN/DM BertExt

CNN/DM BertExtAbs

CNN/DM TransformerAbs

XSum BertExtAbs

## Installation

This project needs Python 3.6.

Clone this repo:

git clone https://github.com/elkd/summrz.git

Install the dependencies:

pip install -r requirements.txt

Copy the downloaded models to the folder ./models/. You should also download the pre-processed BERT CNN and Daily Mail data, unzip the zipfile, and put all .pt files into the folder ./bert_data_new/cnndm/.
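Before starting the server, it can help to confirm that the files ended up where the app expects them. The following is a minimal sketch, not part of this repo: `check_layout` is a hypothetical helper, and it assumes only the two paths named above (any file in ./models/, and at least one .pt file in ./bert_data_new/cnndm/).

```python
from pathlib import Path


def check_layout(root="."):
    """Return a list of problems found in the expected folder layout.

    Hypothetical helper: it only checks the two folders named in this README.
    """
    root = Path(root)
    problems = []

    # Assumption: any file inside ./models/ counts as a downloaded model.
    models_dir = root / "models"
    if not models_dir.is_dir() or not any(p.is_file() for p in models_dir.iterdir()):
        problems.append("No model files found in ./models/")

    # The pre-processed data should be unzipped into .pt files here.
    data_dir = root / "bert_data_new" / "cnndm"
    if not data_dir.is_dir() or not any(data_dir.glob("*.pt")):
        problems.append("No .pt files found in ./bert_data_new/cnndm/")

    return problems


if __name__ == "__main__":
    issues = check_layout()
    if issues:
        for issue in issues:
            print(issue)
    else:
        print("Models and data look to be in place.")
```

Run it from the root of the cloned repo. An empty result only means the folders are populated, not that the files are the right ones.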

Then run:

python manage.py runserver

Then open the browser and access http://127.0.0.1:8000

Fill in a sample text and summarize. The models don't do well on very long texts such as a whole article. In that case it's better to split the text into batches and summarize each batch separately. You can check my implementation in the default view. You can also change the default options of the models.
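The batching idea can be sketched as follows. This is an illustration only, not the code used in the default view: `split_into_batches` and `summarize_long_text` are hypothetical names, the sentence splitting is a simple regular expression, `max_words=400` is an arbitrary budget, and `summarize` stands for whatever callable wraps the model.

```python
import re


def split_into_batches(text, max_words=400):
    """Group whole sentences into batches of at most max_words words.

    A single sentence longer than max_words is kept intact as its own batch.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    batches, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if n_words == 0:
            continue
        # Start a new batch when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            batches.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words

    if current:
        batches.append(" ".join(current))
    return batches


def summarize_long_text(text, summarize, max_words=400):
    """Summarize each batch with the given callable and join the results."""
    return " ".join(summarize(batch) for batch in split_into_batches(text, max_words))
```

Splitting on sentence boundaries keeps each batch readable on its own, at the cost of batches that are usually somewhat shorter than the budget.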
