word-rnn

If you haven't read the readme and blog post for char-rnn, head on over there before going any further.

This fork modifies Graydyn/char-rnn (which is really word-rnn) to handle UTF-8-encoded input. You will also need to install luautf8:

luarocks install luautf8
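
As a minimal sketch of what UTF-8 awareness buys you, the snippet below splits a line into words without ever cutting a multi-byte character apart. The helper function and sample text are illustrative, not taken from this fork's code:

local utf8 = require 'lua-utf8'  -- the luautf8 rock loads under this module name

-- Hypothetical helper, not the fork's actual tokenizer: split a
-- line into whitespace-delimited words.
local function split_words(line)
  local words = {}
  -- '%S+' matches runs of non-whitespace; utf8.gmatch iterates by
  -- codepoint, so characters like 'ö' are never split mid-byte
  for word in utf8.gmatch(line, '%S+') do
    table.insert(words, word)
  end
  return words
end

for _, w in ipairs(split_words('Motörhead låter bäst')) do
  print(w, utf8.len(w))  -- length in codepoints, not bytes
end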

Graydyn/char-rnn modifies the original char-rnn to work with words instead of characters. A heavy metal lyrics dataset is included as an example of the kind of corpus where word-rnn works well. Below are some comments from the original word-rnn:

The Bad News

Since using words instead of characters blows up the vocabulary size, memory is a big issue. Given the same graphics card, you will almost always get better results at the character level, because at the word level you will need to train a much smaller network, unless your GPU is much fancier than mine. Word-level splitting works well for a very narrow range of datasets: ones that are large but contain a small number of distinct words. I've included a dataset of heavy metal lyrics which produces fun results when trained with the default values.

To get the memory usage down, reduce either rnn_size or seq_length. Fortunately, since we are at the word level, we get a lot more bang for our buck out of seq_length. Still, if you reduce it below 4, the results start to look like a string of random words. Also, I've stripped out all punctuation to reduce the vocabulary size; adding it back in shouldn't be a big deal, and I might do so at some point.
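
To make that concrete, here is a hedged sketch of a memory-conscious training run, assuming this fork keeps char-rnn's train.lua flags (the data directory name here is illustrative, not necessarily the bundled dataset's path):

th train.lua -data_dir data/metal_lyrics -rnn_size 128 -seq_length 10

Lowering seq_length reduces how many timesteps are unrolled during backpropagation, which cuts memory without shrinking the network itself; lowering rnn_size shrinks the model.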

The Good News

This approach eliminates spelling mistakes entirely, and it does seem to generate more coherent heavy metal songs, even though I couldn't get the validation loss anywhere near as low as with the character-level network.

License

MIT

About

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for word-level language models in Torch
