This seems to be a problem of using too many LSTM cells and iterations for the model to fit into the memory of my GPU card.
I've run experiments with encoders and decoders of 128 LSTM cells, 10 iterations, and latent dimension dim(z)=50; those were fine until epoch 67, when I ran out of memory.
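For a rough sense of scale, here's a back-of-envelope sketch of the weight sizes for the two configurations (a minimal sketch assuming a plain LSTM and a no-attention DRAW read of [x, x_hat, h_dec] for the encoder; these input dimensions are my assumptions, not read out of train-draw.py, and the activations stored over the T iterations plus optimizer state add a lot on top of the weights):

def lstm_params(input_dim, hidden_dim):
    # 4 gates, each with an input->hidden and a hidden->hidden weight matrix plus a bias
    return 4 * ((input_dim + hidden_dim) * hidden_dim + hidden_dim)

img_dim = 28 * 28  # MNIST
for enc_dim, z_dim in [(128, 50), (256, 100)]:
    dec_dim = enc_dim
    enc = lstm_params(2 * img_dim + dec_dim, enc_dim)  # encoder input: [x, x_hat, h_dec]
    dec = lstm_params(z_dim, dec_dim)                  # decoder input: sampled z
    total = enc + dec
    print("enc/dec=%d, z=%d -> %.2fM weights, ~%.1f MB as float32"
          % (enc_dim, z_dim, total / 1e6, total * 4 / 1e6))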
I tried to run using the default settings, i.e.
python ./train-draw.py
but during the 4th epoch I got an out-of-memory error.
I'm using a modern GPU with 2 GB of RAM, which is fine for my Torch experiments and all my other LSTM experiments. Is there a way to avoid this?
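Since the trace below comes from Theano (a GpuGemm node) via Blocks, one thing that might help (untested here; the flag names are for the old pre-gpuarray GPU backend, newer Theano uses gpuarray.preallocate instead) is letting Theano garbage-collect intermediate buffers and manage the card through the CNMeM pool, which is meant to reduce fragmentation:

THEANO_FLAGS='floatX=float32,device=gpu,allow_gc=True,lib.cnmem=0.85' python ./train-draw.py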
After another epoch:
Training status:
epochs_done: 4
iterations_done: 2000
Log records from the iteration 2000:
epoch_took: 210.858546019
iteration_took: 0.411077022552
saved_to: ('mnist-full-t10-enc256-dec256-z100-lr13.pkl',)
test_kl_term_0: 2.90826129913
.....
Epoch 4, step 50 |
Elapsed Time: 0:00:20
Error allocating 7471104 bytes of device memory (out of memory).
Driver report 4771840 bytes free and 1341718528 bytes total
[12:34:07] blocks.main_loop Error occured during training.
MemoryError: Error allocating 7471104 bytes of device memory (out of memory).
Apply node that caused the error: GpuGemm{no_inplace}
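For scale: the failing GpuGemm asks for 7,471,104 bytes (≈7.1 MB), while the driver reports only 4,771,840 bytes (≈4.6 MB) still free out of 1,341,718,528 bytes (≈1.25 GiB) visible to the process, so the card is effectively full well below its nominal 2 GB (the remainder is presumably held by the display or other contexts).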