RNN with GANs vs Independent RNN Model #8

Closed
sunsidazzz opened this issue Oct 2, 2017 · 2 comments

@sunsidazzz

I am researching language models that can generate words.

Your results show that RNN + GANs improve the quality of generated sequences compared to CNN + GANs.
But do you think the combination of an RNN and GANs performs better than a standalone RNN model (LSTM, GRU)?
I ask because the sentences you got from RNN + GANs are not coherent, and a well-trained RNN model can do the same job. Also, when I trained your model, I found that using CL+VL+TH was very time-consuming. So is it really worth training an RNN with GANs? Or is the purpose of this project just to show that an RNN can work well with GANs?

Thanks,
Sida Sun

@ofirpress
Collaborator

The RNN WGAN model is vastly inferior to an RNN trained with the MLE objective. See, for example, the last page of "Exploring the Limits of Language Modeling", where the results are much better than the ones we obtain. Also, as you state, training with MLE is much faster.

This was the first work (along with a simultaneous paper published by MILA) to show that you can train an RNN with just the GAN objective and without MLE. This is just a first step, and we haven't reached the performance of MLE models yet. We believe that if more effort is put into this research direction, GAN models could match, or even surpass, the performance of MLE models.

If you have more questions, feel free to email any of the authors.

@yoosif0

yoosif0 commented Jul 22, 2018

This paper claims to produce better results than MLE models: https://arxiv.org/abs/1801.07736
