A simple TensorFlow implementation of text summarization using the seq2seq library.
Encoder-decoder model with an attention mechanism.
Uses GloVe pre-trained vectors to initialize the word embeddings.
Uses LSTM cells with stack_bidirectional_dynamic_rnn.
Uses BahdanauAttention with weight normalization.
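For readers unfamiliar with the attention variant named above, here is a minimal pure-Python sketch of additive (Bahdanau-style) attention in which the scoring vector v is weight-normalized, i.e. replaced by g * v / ||v||. It is an illustration under stated assumptions and not the code from this repository: the function names are hypothetical, and the query and keys are assumed to be already projected to the same size.

```python
import math


def normalized_additive_scores(query, keys, v, g, b):
    """Score each key against the query.

    score_i = (g * v / ||v||) . tanh(query + keys[i] + b)
    Assumption: `query` and every `keys[i]` already have the same length.
    """
    v_norm = math.sqrt(sum(x * x for x in v))
    normed_v = [g * x / v_norm for x in v]
    scores = []
    for key in keys:
        hidden = [math.tanh(q + k + bias) for q, k, bias in zip(query, key, b)]
        scores.append(sum(nv * h for nv, h in zip(normed_v, hidden)))
    return scores


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, v, g, b):
    """Return the attention weights and the weighted sum of `values`."""
    weights = softmax(normalized_additive_scores(query, keys, v, g, b))
    dim = len(values[0])
    context = [sum(w * val[d] for w, val in zip(weights, values)) for d in range(dim)]
    return weights, context


# Toy example: one decoder query attending over three encoder states.
encoder_states = [[0.1, 0.3], [0.9, -0.4], [0.0, 0.0]]
weights, context = attend(
    query=[0.5, -0.2],
    keys=encoder_states,
    values=encoder_states,
    v=[1.0, 2.0],
    g=0.7,
    b=[0.0, 0.0],
)
print(weights, context)
```

Because v is divided by its own norm, rescaling v leaves the scores unchanged; only its direction and the scalar g matter.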
- Python 3
- TensorFlow (>=1.8.0)
- pip install -r requirements.txt
The dataset is available at harvardnlp/sent-summary. Place the summary.tar.gz file in the project root directory, then run
$ python prep_data.py
To use GloVe pre-trained embeddings, download them via
$ python prep_data.py --glove
The training data includes sumdata/train/train.title.txt. To train the model, use
$ python train.py
To use GloVe pre-trained vectors as the initial embedding, use
$ python train.py --glove
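As a rough illustration of what initializing an embedding from pre-trained vectors involves, the sketch below parses lines in the GloVe text format (a token followed by its space-separated float components) and builds one row per vocabulary word. This is a sketch under assumptions, not the code in train.py: the helper names are hypothetical, and filling missing words with small uniform random values is an assumed choice.

```python
import random


def parse_glove_lines(lines):
    """Parse GloVe text lines of the form 'token f1 f2 ... fn' into a dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_initial_embeddings(vocab, vectors, embedding_size, seed=0):
    """Return (matrix, missing): one row per vocab word, in vocab order.

    Words without a pre-trained vector of the right size get a small random
    row (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    matrix, missing = [], []
    for word in vocab:
        vec = vectors.get(word)
        if vec is None or len(vec) != embedding_size:
            vec = [rng.uniform(-0.1, 0.1) for _ in range(embedding_size)]
            missing.append(word)
        matrix.append(vec)
    return matrix, missing


# Tiny in-memory stand-in for a GloVe file.
glove_lines = ["sales 0.1 0.2 0.3", "percent -0.5 0.0 0.4"]
vocab = ["sales", "<unk>", "percent"]
matrix, missing = build_initial_embeddings(vocab, parse_glove_lines(glove_lines), 3)
print(len(matrix), missing)
```

The resulting matrix would then serve as the initial value of the embedding variable, which training is free to update.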
$ python train.py -h
usage: train.py [-h] [--num_hidden NUM_HIDDEN] [--num_layers NUM_LAYERS]
                [--beam_width BEAM_WIDTH] [--glove]
                [--embedding_size EMBEDDING_SIZE]
                [--learning_rate LEARNING_RATE] [--batch_size BATCH_SIZE]
                [--num_epochs NUM_EPOCHS] [--keep_prob KEEP_PROB] [--toy]

optional arguments:
  -h, --help            show this help message and exit
  --num_hidden NUM_HIDDEN
                        Network size.
  --num_layers NUM_LAYERS
                        Network depth.
  --beam_width BEAM_WIDTH
                        Beam width for beam search decoder.
  --glove               Use glove as initial word embedding.
  --embedding_size EMBEDDING_SIZE
                        Word embedding size.
  --learning_rate LEARNING_RATE
                        Learning rate.
  --batch_size BATCH_SIZE
                        Batch size.
  --num_epochs NUM_EPOCHS
                        Number of epochs.
  --keep_prob KEEP_PROB
                        Dropout keep prob.
  --toy                 Use only 5K samples of data
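The help text above is consistent with an argparse parser along the following lines. This is a sketch for orientation only: the flag names and help strings come from the help output, while the argument types and every default value are assumptions (the help output does not show them), so the defaults here are labeled placeholders.

```python
import argparse

# Placeholder defaults: NOT taken from the repository, chosen only so the
# sketch runs. Check train.py for the real values.
PLACEHOLDER_DEFAULTS = {
    "num_hidden": 128,
    "num_layers": 2,
    "beam_width": 5,
    "embedding_size": 100,
    "learning_rate": 0.001,
    "batch_size": 32,
    "num_epochs": 1,
    "keep_prob": 1.0,
}


def build_parser():
    """Build a parser exposing the flags listed in the help output."""
    d = PLACEHOLDER_DEFAULTS
    p = argparse.ArgumentParser(prog="train.py")
    # Types (int/float) are inferred from the flag meanings, not confirmed.
    p.add_argument("--num_hidden", type=int, default=d["num_hidden"], help="Network size.")
    p.add_argument("--num_layers", type=int, default=d["num_layers"], help="Network depth.")
    p.add_argument("--beam_width", type=int, default=d["beam_width"],
                   help="Beam width for beam search decoder.")
    p.add_argument("--glove", action="store_true", help="Use glove as initial word embedding.")
    p.add_argument("--embedding_size", type=int, default=d["embedding_size"],
                   help="Word embedding size.")
    p.add_argument("--learning_rate", type=float, default=d["learning_rate"], help="Learning rate.")
    p.add_argument("--batch_size", type=int, default=d["batch_size"], help="Batch size.")
    p.add_argument("--num_epochs", type=int, default=d["num_epochs"], help="Number of epochs.")
    p.add_argument("--keep_prob", type=float, default=d["keep_prob"], help="Dropout keep prob.")
    p.add_argument("--toy", action="store_true", help="Use only 5K samples of data")
    return p


# Equivalent to: python train.py --glove --toy --num_layers 3
args = build_parser().parse_args(["--glove", "--toy", "--num_layers", "3"])
print(args.glove, args.toy, args.num_layers)
```

Combining --toy with a small --num_epochs is a quick way to check that the pipeline runs end to end before starting a full training run.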
Generate a summary for each article in the validation set by running
$ python test.py
This writes the generated summaries to result.txt. Check the ROUGE metrics between result.txt and sumdata/train/valid.title.filter.txt using pltrdy/files2rouge.
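To give a feel for what a ROUGE score measures, here is a simplified unigram-overlap (ROUGE-1) sketch in plain Python. It is not files2rouge and its numbers should not be expected to match that tool exactly: it assumes whitespace tokenization and no stemming, and the function names are hypothetical. The example pair is taken from the sample output in this README.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram precision, recall and F1 between two whitespace-tokenized strings."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def average_f1(candidates, references):
    """Mean ROUGE-1 F1 over paired lines, e.g. result.txt vs. the reference titles."""
    scores = [rouge_1(c, r)["f1"] for c, r in zip(candidates, references)]
    return sum(scores) / len(scores) if scores else 0.0


model_output = "gm us sales down # percent in december"
actual_title = "gm december sales fall # percent"
print(rouge_1(model_output, actual_title))
```

For reported results, use files2rouge as described above; this sketch is only meant to make the metric concrete.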
Sample Summary Output
"general motors corp. said wednesday its us sales fell ##.# percent in december and four percent in #### with the biggest losses coming from passenger car sales ."
> Model output: gm us sales down # percent in december
> Actual title: gm december sales fall # percent

"japanese share prices rose #.## percent thursday to <unk> highest closing high for more than five years as fresh gains on wall street fanned upbeat investor sentiment , dealers said ."
> Model output: tokyo shares close # percent higher
> Actual title: tokyo shares close up # percent

"hong kong share prices opened #.## percent higher thursday on follow-through interest in properties after wednesday 's sharp gains on abating interest rate worries , dealers said ."
> Model output: hong kong shares open higher
> Actual title: hong kong shares open higher as rate worries ease

"the dollar regained some lost ground in asian trade thursday in what was seen as a largely technical rebound after weakness prompted by expectations of a shift in us interest rate policy , dealers said ."
> Model output: dollar stable in asian trade
> Actual title: dollar regains ground in asian trade

"the final results of iraq 's december general elections are due within the next four days , a member of the iraqi electoral commission said on thursday ."
> Model output: iraqi election results due in next four days
> Actual title: iraqi election final results out within four days

"microsoft chairman bill gates late wednesday unveiled his vision of the digital lifestyle , outlining the latest version of his windows operating system to be launched later this year ."
> Model output: bill gates unveils new technology vision
> Actual title: gates unveils microsoft 's vision of digital lifestyle
To test with the pre-trained model, download pre_trained.zip and place it in the project root directory. Then,
$ unzip pre_trained.zip
$ python test.py