A Hybrid Convolutional Variational Autoencoder for Text Generation #15

flrngel commented Apr 1, 2018

https://arxiv.org/abs/1702.02390

Abstract

  • the model combines an RNN with a feed-forward convolutional/deconvolutional architecture
  • the authors claim the model
    • converges faster during training
    • handles long sequences well
    • avoids the major difficulties VAEs face on textual data

1. Introduction

  • the authors claim this is
    • the first work to apply deconvolutions in a latent-variable generative model of natural text
  • the paper discusses
    • the optimization difficulties of VAEs for text
      • and proposes effective ways to address them

2. Related Work

  • techniques to improve VAE training
    • KL-term annealing and input dropout (both sketched after this list)
    • imposing structured sparsity on latent variables
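
Both tricks are straightforward to implement; a minimal PyTorch sketch (my own illustration, not the paper's code; the names `kl_weight`, `word_dropout` and the default values are assumptions):

```python
import torch

def kl_weight(step: int, anneal_steps: int = 10000) -> float:
    """Linearly anneal the KL-term weight from 0 to 1 over `anneal_steps` steps."""
    return min(1.0, step / anneal_steps)

def word_dropout(tokens: torch.Tensor, unk_id: int, p: float = 0.3) -> torch.Tensor:
    """Randomly replace decoder-input tokens with <unk> so the decoder cannot
    rely solely on its autoregressive history (the input-dropout trick)."""
    mask = torch.rand(tokens.shape, device=tokens.device) < p
    return tokens.masked_fill(mask, unk_id)
```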

3. Model

  • the model is designed so that the latent variables capture enough information to sample realistic sentences from the latent space

3.1. Variational Autoencoder

  • the VAE forces the model to map an input to a region of the latent space rather than to a single point (a minimal sketch of the objective follows)
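
To make "map to a region" concrete, here is a minimal PyTorch sketch of the VAE objective with the reparameterization trick (illustrative only; `decode` stands for any decoder returning per-token logits):

```python
import torch
import torch.nn.functional as F

def vae_objective(mu, logvar, decode, targets):
    """ELBO-style loss: reconstruction + KL(q(z|x) || N(0, I)).
    mu, logvar: encoder outputs of shape (batch, latent_dim)
    decode(z): returns logits of shape (batch, seq_len, vocab_size)
    targets:   token ids of shape (batch, seq_len)"""
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)                        # reparameterization trick
    logits = decode(z)
    rec = F.cross_entropy(logits.transpose(1, 2), targets)      # reconstruction term
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()  # KL term
    return rec + kl
```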

3.2. Deconvolutional Networks

  • the deconvolutional layer's goal
    • perform an inverse convolution operation, increasing the spatial size of the input while decreasing the number of feature maps
  • the deconvolutional layer's benefits (rough sketch after this list)
    • fully parallel, efficient GPU implementation
    • feed-forward networks are typically easier to optimize than their recurrent counterparts
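
A rough sketch of such an expanding, fully feed-forward decoder built from transposed 1-D convolutions (layer sizes and kernel settings are made up for illustration, not the paper's configuration):

```python
import torch
import torch.nn as nn

class DeconvDecoder(nn.Module):
    """Stack of transposed convolutions: each layer increases the spatial
    (sequence) length while decreasing the number of feature maps."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose1d(latent_dim, 512, kernel_size=4, stride=2), nn.ReLU(),
            nn.ConvTranspose1d(512, 256, kernel_size=4, stride=2), nn.ReLU(),
            nn.ConvTranspose1d(256, 128, kernel_size=4, stride=2), nn.ReLU(),
        )

    def forward(self, z):          # z: (batch, latent_dim)
        h = z.unsqueeze(-1)        # treat the latent vector as a length-1 sequence
        return self.net(h)         # (batch, 128, seq_len), computed fully in parallel
```

Projecting these features onto the vocabulary would give a historyless decoder; in the hybrid model below they are instead fed to an RNN.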

3.3. Hybrid Convolutional-Recurrent VAE (paper model)

  • VAE + RNN architecture
  • the RNN consumes the deconvolutional decoder's output so that generation depends on previous outputs, i.e. p(x|z) = ∏_t p(x_t | x_<t, z) instead of p(x|z) = ∏_t p(x_t | z) (a minimal sketch follows this list)
  • the aim is to encode every detail of a text fragment, not only high-level features (like semantics)
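
A minimal sketch of how the recurrent layer could combine the deconvolutional features with the previously generated words (my reading of the architecture, with made-up layer sizes):

```python
import torch
import torch.nn as nn

class HybridDecoder(nn.Module):
    """At each step the LSTM reads the deconvolutional feature for that
    position concatenated with the embedding of the previous word, so the
    output distribution is p(x_t | x_<t, z) rather than p(x_t | z)."""
    def __init__(self, vocab_size=10000, emb_dim=256, deconv_dim=128, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim + deconv_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, deconv_feats, prev_tokens):
        # deconv_feats: (batch, seq_len, deconv_dim), e.g. DeconvDecoder output transposed
        # prev_tokens:  (batch, seq_len), ground-truth words shifted right (teacher forcing)
        inp = torch.cat([deconv_feats, self.embed(prev_tokens)], dim=-1)
        h, _ = self.rnn(inp)
        return self.out(h)         # (batch, seq_len, vocab_size) logits
```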

3.4. Optimization Difficulties

  • input dropout helped
  • add an auxiliary reconstruction term computed from the activations of the last deconvolutional layer, J_aux = E_{q(z|x)}[log p_aux(x|z)]
  • the final cost is then J = J_vae + α · J_aux (sketch below)
  • the autoregressive part reuses these deconvolutional features
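
Putting the pieces together, the combined objective could be computed roughly as below (a sketch under my assumptions: `rnn_logits` come from the full hybrid decoder, `aux_logits` from a vocabulary projection of the last deconvolutional layer; `alpha` and the KL weight are placeholders, not the paper's values):

```python
import torch.nn.functional as F

def hybrid_loss(rnn_logits, aux_logits, targets, mu, logvar, alpha=0.2, kl_w=1.0):
    """J = J_vae + alpha * J_aux, where J_aux is the reconstruction loss of the
    purely deconvolutional (historyless) part of the decoder."""
    rec = F.cross_entropy(rnn_logits.transpose(1, 2), targets)      # main reconstruction
    aux = F.cross_entropy(aux_logits.transpose(1, 2), targets)      # auxiliary reconstruction
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()      # KL(q(z|x) || N(0, I))
    return rec + kl_w * kl + alpha * aux
```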

4. Experiments

4.1. Comparison with LSTM VAE

Historyless decoding

  • the paper's model with historyless decoding performed better
  • and was computationally faster (by a factor of 2)

Decoding with history

  • the paper checks
    • whether historyless decoding generalizes well
    • how well the model makes use of the latent variable
  • the paper claims their model does not fail on long texts

4.2. Controlling the KL term

Aux cost weight

  • using input dropout increases the final loss, but this is a trade-off
  • note that the model finds non-trivial latent vectors when α is large enough

Receptive field

  • the goal is to study the relationship between KL-term values and the expressiveness of the decoder
    • the RNN decoder in the LSTM VAE completely ignores the information in the latent vector
  • the auxiliary reconstruction term helps, as shown in Figure 6 of the paper