Dynet dynamic memory allocation #2

Closed
geeksouvik opened this issue Feb 6, 2022 · 3 comments

@geeksouvik

I have installed DyNet with GPU support as described in the docs, and --dynet-mem is set in the train_single-source.sh file. Even so, I get the error below; the full traceback follows.

[dynet] Device Number: 2
[dynet] Device name: GeForce GTX 1080 Ti
[dynet] Memory Clock Rate (KHz): 5505000
[dynet] Memory Bus Width (bits): 352
[dynet] Peak Memory Bandwidth (GB/s): 484.44
[dynet] Memory Free (GB): 11.5464/11.7215
[dynet] Device(s) selected: 2
[dynet] random seed: 2652333402
[dynet] using autobatching
[dynet] allocating memory: 6000MB
[dynet] memory allocation done.
Param, load_model: None
Traceback (most recent call last):
File "/mnt/data/souvik/sanskrit/ocr-post-correction/postcorrection/multisource_wrapper.py", line 65, in
pretrainer = PretrainHandler(
File "/mnt/data/souvik/sanskrit/ocr-post-correction/postcorrection/pretrain_handler.py", line 81, in init
self.pretrain_model(pretrain_src1, pretrain_src2, pretrain_tgt, epochs)
File "/mnt/data/souvik/sanskrit/ocr-post-correction/postcorrection/pretrain_handler.py", line 88, in pretrain_model
self.seq2seq_trainer.train(
File "/mnt/data/souvik/sanskrit/ocr-post-correction/postcorrection/seq2seq_trainer.py", line 55, in train
batch_loss.backward()
File "_dynet.pyx", line 823, in _dynet.Expression.backward
File "_dynet.pyx", line 842, in _dynet.Expression.backward
ValueError: Dynet does not support both dynamic increasing of memory pool size, and automatic batching or memory checkpointing. If you want to use automatic batching or checkpointing, please pre-allocate enough memory using the --dynet-mem command line option (details http://dynet.readthedocs.io/en/latest/commandline.html).

@shrutirij
Owner

Hi! It looks like you need to allocate more memory in --dynet-mem since your dataset is probably larger than our sample dataset. You can change it in the train_single-source.sh file to a larger value (e.g., 12000 MB).
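For what it's worth, the same pre-allocation can also be done from Python via DyNet's dynet_config module, called before the first import of dynet, instead of editing the shell script. A minimal sketch, assuming the mem/autobatch keyword arguments as described in the DyNet docs (the 12000 MB figure is just the value suggested above, not something taken from this repository's scripts):

```python
# Sketch only: configure DyNet before it is imported anywhere else.
# Assumes dynet_config.set(mem=..., autobatch=...) as in the DyNet docs;
# the 12000 MB figure is illustrative, not taken from this repo's scripts.
import dynet_config

dynet_config.set(mem=12000, autobatch=1)  # pre-allocate the full pool up front

import dynet as dy  # must be imported after dynet_config.set()
```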

@geeksouvik
Author

Thanks. But I don't have 12 GB of GPU memory; the most I can allocate is ~10 GB. Is there a way to change the batch size or another hyperparameter so the script can run? I have 25k samples in the pretraining dataset.

@shrutirij
Owner

I ran all my experiments on CPU and it wasn't too slow -- you could try that. You can also remove the "--dynet-autobatching" flag and try it without autobatching.

You can also adjust the model size hyperparameters to make a smaller model.
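If it helps, the no-autobatching route can also be set up in Python rather than in the shell script. A rough sketch, again assuming dynet_config behaves as in the DyNet docs (the keyword names and the 6000 MB figure are assumptions):

```python
# Sketch only: with autobatching turned off, DyNet is allowed to grow the
# memory pool dynamically, so the ValueError above about combining dynamic
# growth with autobatching should no longer apply.
import dynet_config

dynet_config.set(mem=6000, autobatch=0)  # keep the original 6000 MB starting pool

import dynet as dy  # import after configuration
```

On a ~10 GB card, the mem value would likely need to stay somewhat below 10000 MB to leave room for the CUDA context and the model parameters.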
