Self-Attention Model for Sequence Generation

In this repository, we implement our proposed VAE-based self-attention model for sequence generation on Penn TreeBank and Yelp 2013. The datasets are downloaded from Tomas Mikolov's webpage and Jiacheng Xu's repository.

Traditionally, a VAE for sequence generation consists of two RNNs, one for the encoder and one for the decoder. However, applying an attention mechanism to this encoder-decoder architecture is challenging, because we want to generate new sequences from the latent space alone, where the encoder is discarded. To address this problem, our model uses a stochastic RNN as the VAE decoder: its additional latent variables allow us to reconstruct the context vector, compensating for the attention information that is missing in the generative process.
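The sketch below illustrates this idea in PyTorch. It is not the code in this repository: every class and parameter name (SelfAttnVAE, latent_dim, ctx_dim, and so on) is a hypothetical illustration of how a per-step latent variable in the decoder can reconstruct a context vector, so that generation does not depend on attending over encoder states.

```python
# Minimal sketch, assuming a standard Gaussian VAE over sequences.
# Written for a recent PyTorch rather than the 0.4.0 used here.
import torch
import torch.nn as nn


class SelfAttnVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256,
                 latent_dim=32, ctx_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Encoder RNN: produces the global latent variable z.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.enc_mu = nn.Linear(hid_dim, latent_dim)
        self.enc_logvar = nn.Linear(hid_dim, latent_dim)
        # Stochastic decoder RNN: a per-step latent z_t, drawn from a
        # prior conditioned on the previous hidden state, reconstructs
        # a context vector c_t that stands in for attention.
        self.prior = nn.Linear(hid_dim, 2 * latent_dim)
        self.to_ctx = nn.Linear(latent_dim, ctx_dim)
        self.cell = nn.GRUCell(emb_dim + ctx_dim, hid_dim)
        self.init_h = nn.Linear(latent_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    @staticmethod
    def reparameterize(mu, logvar):
        # z = mu + sigma * eps, the usual VAE reparameterization trick.
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, tokens):
        emb = self.embed(tokens)                     # (B, T, E)
        _, h_enc = self.encoder(emb)                 # (1, B, H)
        mu = self.enc_mu(h_enc[-1])
        logvar = self.enc_logvar(h_enc[-1])
        z = self.reparameterize(mu, logvar)          # global latent
        h = torch.tanh(self.init_h(z))               # decoder init state
        logits = []
        for t in range(tokens.size(1)):
            # Per-step latent z_t -> reconstructed context vector c_t,
            # so no attention over encoder states is needed to sample.
            # (The per-step KL between posterior and prior, and shifted
            # teacher-forcing inputs, are omitted for brevity.)
            p_mu, p_logvar = self.prior(h).chunk(2, dim=-1)
            z_t = self.reparameterize(p_mu, p_logvar)
            c_t = torch.tanh(self.to_ctx(z_t))
            h = self.cell(torch.cat([emb[:, t], c_t], dim=-1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1), mu, logvar


model = SelfAttnVAE(vocab_size=10000)
tokens = torch.randint(0, 10000, (4, 20))            # dummy batch
logits, mu, logvar = model(tokens)
recon = nn.functional.cross_entropy(
    logits.reshape(-1, 10000), tokens.reshape(-1))
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl                                    # the usual ELBO objective
```

At generation time, one would sample z from the prior and unroll the decoder alone; the reconstructed c_t plays the role the attention context would otherwise play.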

Setting

  • Framework:
    • PyTorch 0.4.0
  • Hardware:
    • CPU: Intel Core i7-5820K @ 3.30 GHz
    • RAM: 64 GB DDR4-2400
    • GPU: NVIDIA GeForce GTX 1080 Ti

Learning curves

[Figures: learning curves on Penn TreeBank and Yelp 2013]

Attention visualization

[Figures: attention visualizations on Penn TreeBank and Yelp 2013]
