This seems to be a problem of using too many LSTM cells and iterations for the model to fit into the memory of my GPU card.
I've run experiments with encoders and decoders of 128 LSTM cells, 10 iterations, and latent dimension dim(z)=50; those were fine until epoch 67, when I ran out of memory.
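For a rough sense of scale, here's a back-of-envelope sketch of the weight sizes for the two configurations (a minimal sketch assuming a plain LSTM and a no-attention DRAW read of [x, x_hat, h_dec] for the encoder; these input dimensions are my assumptions, not read out of train-draw.py, and the activations stored over the T iterations plus optimizer state add a lot on top of the weights):

def lstm_params(input_dim, hidden_dim):
    # 4 gates, each with an input->hidden and a hidden->hidden weight matrix plus a bias
    return 4 * ((input_dim + hidden_dim) * hidden_dim + hidden_dim)

img_dim = 28 * 28  # MNIST
for enc_dim, z_dim in [(128, 50), (256, 100)]:
    dec_dim = enc_dim
    enc = lstm_params(2 * img_dim + dec_dim, enc_dim)  # encoder input: [x, x_hat, h_dec]
    dec = lstm_params(z_dim, dec_dim)                  # decoder input: sampled z
    total = enc + dec
    print("enc/dec=%d, z=%d -> %.2fM weights, ~%.1f MB as float32"
          % (enc_dim, z_dim, total / 1e6, total * 4 / 1e6))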
I tried to run using the default settings, i.e.
python ./train-draw.py
but during the 4th epoch I got an out-of-memory error.
I'm using a modern GPU with 2 GB of RAM, which is fine for my Torch experiments and all my other LSTM experiments. Is there a way to avoid this?
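Since the trace below comes from Theano (a GpuGemm node) via Blocks, one thing that might help (untested here; the flag names are for the old pre-gpuarray GPU backend, newer Theano uses gpuarray.preallocate instead) is letting Theano garbage-collect intermediate buffers and manage the card through the CNMeM pool, which is meant to reduce fragmentation:

THEANO_FLAGS='floatX=float32,device=gpu,allow_gc=True,lib.cnmem=0.85' python ./train-draw.py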
After another epoch:
Training status:
epochs_done: 4
iterations_done: 2000
Log records from the iteration 2000:
epoch_took: 210.858546019
iteration_took: 0.411077022552
saved_to: ('mnist-full-t10-enc256-dec256-z100-lr13.pkl',)
test_kl_term_0: 2.90826129913
.....
Epoch 4, step 50 |
Elapsed Time: 0:00:20
Error allocating 7471104 bytes of device memory (out of memory).
Driver report 4771840 bytes free and 1341718528 bytes total
[12:34:07] blocks.main_loop Error occured during training.
MemoryError: Error allocating 7471104 bytes of device memory (out of memory).
Apply node that caused the error: GpuGemm{no_inplace}
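For scale: the failing GpuGemm asks for 7,471,104 bytes (≈7.1 MB), while the driver reports only 4,771,840 bytes (≈4.6 MB) still free out of 1,341,718,528 bytes (≈1.25 GiB) visible to the process, so the card is effectively full well below its nominal 2 GB (the remainder is presumably held by the display or other contexts).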