seq2seq with PyTorch causes GPU out of memory #1

Open
ghost opened this issue Nov 23, 2017 · 1 comment

Comments


ghost commented Nov 23, 2017

Hello, when I train the example seq2seq model on the Ubuntu dataset in ParlAI, the GPU runs out of memory after training on a few thousand examples. Do you know how to solve this problem? I saw similar issues posted before; I think it is related to PyTorch.

@ryonakamura (Owner) commented

Hello, @ZixuanLiang
I am sorry, but the example implementations currently only guarantee operation on the bAbI tasks.
For other tasks, if the vocabulary is too large, the embedding-layer and softmax-layer matrices become huge, causing out of memory. The vocabulary needs to be reduced by mapping low-frequency words to an unknown token. Unfortunately, ParlAI does not implement this feature. I plan to implement a dictionary agent (e.g. dict-minfreq, subword, sentencepiece) to solve this problem soon.
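A minimal sketch of that kind of frequency cutoff (this is not ParlAI's dictionary code; the threshold and the `<pad>`/`<unk>` token names are just placeholders):

```python
# Minimal sketch of vocabulary truncation: words seen fewer than `min_freq`
# times collapse onto a shared <unk> index, keeping the embedding and softmax
# matrices small. Illustration only, not ParlAI's actual dictionary agent.
from collections import Counter

def build_vocab(sentences, min_freq=5):
    # Count token frequencies across the whole corpus.
    counts = Counter(word for sent in sentences for word in sent.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, freq in counts.items():
        if freq >= min_freq:
            vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    # Rare or unseen words all map to the single <unk> index.
    return [vocab.get(word, vocab["<unk>"]) for word in sentence.split()]
```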
Also, in the case of seq2seq, if the input sentence is too long, the number of LSTM states that must be kept in memory increases, which also causes out of memory. This can often be avoided simply by reducing the hidden size.
Thank you!
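For a rough sense of scale, here is a back-of-the-envelope sketch of how the embedding and output (softmax) layers alone grow with vocabulary size and hidden size; the sizes used are hypothetical, not measured from the Ubuntu task:

```python
# Back-of-the-envelope sketch: the embedding and output (softmax) layers each
# hold roughly a vocab_size x hidden_size matrix, so an unpruned vocabulary
# gets expensive fast. Both sizes below are hypothetical.
import torch.nn as nn

vocab_size = 100_000   # hypothetical size of an unpruned Ubuntu-corpus vocabulary
hidden_size = 1024     # shrinking this also shrinks the per-step LSTM state

embedding = nn.Embedding(vocab_size, hidden_size)   # vocab_size x hidden_size
projection = nn.Linear(hidden_size, vocab_size)     # hidden_size x vocab_size (+ bias)

n_params = sum(p.numel() for p in embedding.parameters()) + \
           sum(p.numel() for p in projection.parameters())
print(f"~{n_params * 4 / 1e9:.2f} GB of float32 weights, "
      "before gradients and optimizer state")
```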
