CNN LSTM Seq2Seq Model for Abstractive Text Summarization

murak038/CNN_LSTM_Seq2Seq

CNN_LSTM_Seq2Seq

Abstractive Text Summarization Using Sequence to Sequence Model

Project Overview

Unlike extractive summarization, which copies passages from the input verbatim, abstractive text summarization generates summaries by compressing the information in the input text in a lossy manner such that the main ideas are preserved. Its advantage is that it can use words that do not appear in the input and rephrase the information to make the summaries more readable. This project uses a CNN-LSTM encoder and an LSTM decoder to generate headlines for articles from the Gigaword dataset. To improve the quality of the generated summaries, the model adds a Bahdanau attention mechanism, a pointer-generator network, and a beam-search decoder at inference time.
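To make the pointer-generator idea concrete, here is a minimal sketch in plain Python of how a generation distribution and a copy distribution are typically mixed at one decoding step. It follows the common formulation in which a gate `p_gen` weights the vocabulary distribution and `1 - p_gen` weights the attention over source tokens. The function name, argument names, and toy numbers are assumptions for illustration and are not taken from this repository's code.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities for one decoding step.

    p_gen         -- gate in [0, 1]: probability of generating from the vocabulary
    vocab_dist    -- {word: probability} produced by the decoder softmax
    attention     -- attention weights, one per source token (sums to 1)
    source_tokens -- the source tokens, aligned with `attention`
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("need exactly one attention weight per source token")

    mixed = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Copy path: add attention mass onto the source words themselves,
    # which lets out-of-vocabulary source words receive probability.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


# Toy example (hypothetical values): "kyoto" is not in the vocabulary.
vocab_dist = {"talks": 0.6, "resume": 0.3, "<unk>": 0.1}
attention = [0.7, 0.2, 0.1]
source_tokens = ["kyoto", "talks", "resume"]
dist = final_distribution(0.5, vocab_dist, attention, source_tokens)
```

In this toy case the word "kyoto" ends up with non-zero probability even though the vocabulary distribution never mentions it, which is the property that makes copying useful for names and rare words in headlines.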

Install

This project requires Python 3.6 and the following Python libraries installed:

You will also need software installed to run a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included. Make sure that you select the Python 3.6 installer.

Architecture

[Figure: model architecture diagram]

Hyperparameters

| Parameter | Value |
| --- | --- |
| Kernel Size | [1, 3, 5] |
| Filter Size | 100 |
| Encoder Hidden Units | 256 |
| Encoder Layers | 1 |
| Decoder Hidden Units | 512 |
| Decoder Layers | 1 |
| Beam Width | 10 |
| Embedding | 300d GloVe |
| Dropout | 0.5 |
| Loss Function | torch.nn.CrossEntropyLoss |
| Optimizer | Adam |
| Learning Rate | 0.001 |
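The beam width above controls how many partial headlines the decoder keeps alive at each step. The sketch below shows the general technique in plain Python, under stated assumptions: `step_fn` is a hypothetical callback standing in for the trained decoder, the toy probability table is invented, and no length normalization is applied. It is not the repository's inference code.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=10, max_len=20):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) is a hypothetical callback that returns
    {token: probability} for the next token given the prefix so far.
    """
    beams = [(0.0, [start_token])]  # (cumulative log-probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for token, prob in step_fn(tokens).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), tokens + [token]))
        if not candidates:
            break
        # Keep only the `beam_width` best expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1]


# Invented next-token table, keyed on the last token only.
TOY_MODEL = {
    "<s>": {"stocks": 0.6, "markets": 0.4},
    "stocks": {"rise": 0.5, "fall": 0.5},
    "markets": {"rally": 0.9, "</s>": 0.1},
    "rise": {"</s>": 1.0},
    "fall": {"</s>": 1.0},
    "rally": {"</s>": 1.0},
}


def toy_step(prefix):
    return TOY_MODEL[prefix[-1]]


best = beam_search(toy_step, "<s>", "</s>", beam_width=2)
```

With a beam width of 1 the search is greedy and commits to "stocks" at the first step. A wider beam keeps "markets" alive long enough to find the more probable full sequence, which is the motivation for using beam search over greedy decoding.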

Dataset

The model is trained on the Gigaword corpus found at https://github.com/harvardnlp/sent-summary. The dataset contains the first sentence of articles as the input text and the headlines as the ground-truth summaries.
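As a rough illustration of how such sentence/headline pairs might be read, the sketch below assumes the corpus is stored as two line-aligned text files, one example per line, with tokens separated by whitespace. That layout, the function name, and any file names passed to it are assumptions; check the downloaded archive before relying on them.

```python
from pathlib import Path


def read_pairs(article_path, title_path, limit=None):
    """Yield (article_tokens, title_tokens) from two line-aligned files.

    Assumes line i of the article file corresponds to line i of the
    title file, and that tokens are whitespace-separated.
    """
    with Path(article_path).open(encoding="utf-8") as articles, \
         Path(title_path).open(encoding="utf-8") as titles:
        for index, (article, title) in enumerate(zip(articles, titles)):
            if limit is not None and index >= limit:
                break
            yield article.split(), title.split()
```

A call such as `read_pairs(article_file, title_file, limit=5)` would then give a quick look at the first few examples before building a vocabulary.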

Results

The generated summaries achieved a ROUGE-1 score of 29.79, computed with files2rouge.
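For intuition about what ROUGE-1 measures, the following simplified sketch computes unigram overlap between a generated headline and a reference. It is not a reimplementation of files2rouge, which applies its own preprocessing and options, so its numbers should not be expected to match the score reported here. The function name and example sentences are made up for illustration.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap precision, recall and F1 on whitespace tokens."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1(
    "police arrest suspect in london",
    "police arrest two suspects in london",
)
```

Here four of the five candidate words appear in the six-word reference, so precision is 4/5 and recall is 4/6.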
