
It could take days not hours to train... #17

Closed
ClaudeCoulombe opened this issue Sep 22, 2017 · 2 comments

ClaudeCoulombe commented Sep 22, 2017

Greetings,

Nice project... With the default parameters you gave, it takes days, not hours, even with powerful GPUs on AWS.

num_epochs = 10000; batch_size = 512; rnn_size = 512; num_layers = 3; keep_prob = 0.7; embed_dim = 512; seq_length = 30; learning_rate = 0.001;

Also, I had to cut the batch_size to 256 (as I think you also did), since I got a GPU memory overflow. I'm curious to know whether there are less greedy hyperparameters. In your notebook, the execution shows more than 24 days.
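For reference, a minimal sketch of that configuration as it might appear in the notebook, with batch_size dropped to 256 to fit in GPU memory (the variable names just follow the values quoted above; the exact notebook code may differ):

```python
# Hyperparameters quoted above, with batch_size reduced
# to avoid the GPU memory overflow mentioned here.
num_epochs = 10000
batch_size = 256      # reduced from 512 to fit in GPU memory
rnn_size = 512
num_layers = 3
keep_prob = 0.7
embed_dim = 512
seq_length = 30
learning_rate = 0.001
```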

I'm also curious to compare with a simple Markov chain.
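For a rough baseline of that kind, a minimal word-level Markov chain generator could look like this (purely illustrative, not part of this project; corpus loading and tokenization are assumed):

```python
import random
from collections import defaultdict

def build_markov_chain(words, order=2):
    """Map each tuple of `order` consecutive words to the words that follow it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain

def generate(chain, length=50):
    """Sample text by repeatedly picking a random successor of the current state."""
    state = random.choice(list(chain.keys()))
    output = list(state)
    for _ in range(length):
        successors = chain.get(state)
        if not successors:
            break
        output.append(random.choice(successors))
        state = tuple(output[-len(state):])
    return " ".join(output)

# Usage (assuming `text` holds the training corpus as a string):
# chain = build_markov_chain(text.split(), order=2)
# print(generate(chain, length=100))
```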

Anyway, thanks for the nice project that allows us to dream with tales.

Claude Coulombe

zackthoutt (Owner) commented
Keep in mind that the number of epochs is way larger than what I let it run for. I trained and downloaded the weights to see what the predictions were like every few hours and never got close to finishing all the epochs.

If you are just trying to train the model for fun and you want good-enough results, there are several ways to decrease the training time. You can get pretty good results with just a single-layer LSTM, so dropping num_layers to one would help a lot. Also, the sequence length could be decreased to 15 or 20 without a huge impact on results.

I can train with a batch size of 512 using an AWS GPU instance. I'm not sure why you were getting the memory overflow.
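As a sketch, the faster-training configuration suggested above might look something like this (values other than num_layers and seq_length are kept from the settings quoted earlier in this thread; the names are assumptions, not the exact notebook code):

```python
# Reduced configuration for faster training (a sketch based on the suggestions above)
num_layers = 1        # a single LSTM layer still gives pretty good results
seq_length = 15       # 15-20 instead of 30
batch_size = 512      # reportedly fits on an AWS GPU instance
rnn_size = 512
keep_prob = 0.7
embed_dim = 512
learning_rate = 0.001
```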

ClaudeCoulombe (Author) commented

Thanks for your informative answer.
