word-rnn

If you haven't read the readme and blog post for char-rnn, head on over there before going any further.

This fork modifies Graydyn/char-rnn (which is really word-rnn) to handle UTF-8-encoded input. You will also need to install luautf8:

luarocks install luautf8
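
As a minimal sketch of what UTF-8 awareness buys you, the snippet below splits a line into words without ever cutting a multi-byte character apart. The helper function and sample text are illustrative, not taken from this fork's code:

local utf8 = require 'lua-utf8'  -- the luautf8 rock loads under this module name

-- Hypothetical helper, not the fork's actual tokenizer: split a
-- line into whitespace-delimited words.
local function split_words(line)
  local words = {}
  -- '%S+' matches runs of non-whitespace; utf8.gmatch iterates by
  -- codepoint, so characters like 'ö' are never split mid-byte
  for word in utf8.gmatch(line, '%S+') do
    table.insert(words, word)
  end
  return words
end

for _, w in ipairs(split_words('Motörhead låter bäst')) do
  print(w, utf8.len(w))  -- length in codepoints, not bytes
end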

Graydyn/char-rnn modifies the original char-rnn to work with words instead of characters. A heavy metal lyrics dataset is included as an example of the kind of corpus where word-rnn works well. Below are some comments from the original word-rnn:

The Bad News

Since using words instead of characters blows up the vocabulary size, memory is a big issue. Given the same graphics card, you will almost always get better results at the character level, because at the word level you will need to train a much smaller network, unless your GPU is much fancier than mine. Word-level splitting works well for a very narrow range of datasets: ones that are large but contain a small number of distinct words. I've included a dataset of heavy metal lyrics which produces fun results when trained with the default values.

To get the memory usage down, reduce either rnn_size or seq_length. Fortunately, since we are at the word level, we get a lot more bang for our buck out of seq_length. Still, if you reduce it below 4, the results start to look like a string of random words. Also, I've stripped out all punctuation to reduce the vocabulary size; adding it back in shouldn't be a big deal, and I might do so at some point.
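
To make that concrete, here is a hedged sketch of a memory-conscious training run, assuming this fork keeps char-rnn's train.lua flags (the data directory name here is illustrative, not necessarily the bundled dataset's path):

th train.lua -data_dir data/metal_lyrics -rnn_size 128 -seq_length 10

Lowering seq_length reduces how many timesteps are unrolled during backpropagation, which cuts memory without shrinking the network itself; lowering rnn_size shrinks the model.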

The Good News

This approach eliminates spelling mistakes entirely, and it does seem to generate more coherent heavy metal songs, even though I couldn't get the validation loss anywhere near as low as with the character-level network.

License

MIT

About

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for word-level language models in Torch
