Greetings,
Nice project... With the default parameters you've given, it takes days, not hours, even with powerful GPUs on AWS:
num_epochs = 10000; batch_size = 512; rnn_size = 512; num_layers = 3; keep_prob = 0.7; embed_dim = 512; seq_length = 30; learning_rate = 0.001
Also, I had to cut the batch_size to 256 (as I think you also did), since I got a GPU memory overflow. I'm curious to know whether there are less greedy hyperparameters. In your notebook, the displayed execution time is more than 24 days.
I'm also curious to compare with a simple Markov chain (a baseline sketch follows this post).
Anyway, thanks for the nice project that allows us to dream with tales.
Claude Coulombe
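Out of the same curiosity, here is a minimal word-level Markov chain baseline. This is a hypothetical sketch, not code from this repository; the corpus filename and seed word are assumptions. It trains in seconds on a CPU, which makes it a cheap point of comparison against the LSTM.

```python
# Hypothetical baseline, not part of this repo: a word-level bigram
# Markov chain for comparison against the LSTM.
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, seed, length=50):
    """Walk the chain from `seed`. Because duplicate successors are
    kept in the lists, random.choice samples the next word in
    proportion to its observed frequency."""
    word, output = seed, [seed]
    for _ in range(length - 1):
        successors = chain.get(word)
        if not successors:  # dead end: restart from a random state
            word = random.choice(list(chain))
        else:
            word = random.choice(successors)
        output.append(word)
    return ' '.join(output)

# Example usage (assumes the training corpus is a plain-text file):
# chain = build_chain(open('tales.txt').read())
# print(generate(chain, seed='Once'))
```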
Keep in mind that the number of epochs is way larger than what I let it run for. I trained and downloaded the weights every few hours to see what the predictions were like, and I never got close to finishing all the epochs.
If you are just training the model for fun and you want good-enough results, there are several ways to decrease the training time. You can get pretty good results with just a single-layer LSTM, so dropping num_layers to one would help a lot. The sequence length could also be decreased to 15 or 20 without a huge impact on results (a sketch of such reduced settings follows below).
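For reference, here is what those reduced settings might look like, using the variable names listed in the issue. Only num_layers and seq_length follow the suggestion above; the remaining values are simply carried over from the issue, so treat them as illustrative rather than tuned:

```python
# Illustrative "budget" settings based on the suggestions above.
num_epochs = 100        # far fewer than 10000; checkpoint and sample often
batch_size = 256        # halved to avoid the GPU memory overflow
rnn_size = 512
num_layers = 1          # a single LSTM layer is usually good enough here
keep_prob = 0.7
embed_dim = 512
seq_length = 20         # 15-20 has little impact on result quality
learning_rate = 0.001
```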
I can train with a batch size of 512 using an AWS GPU instance. I'm not sure why you were getting the memory overflow.