
Memory explosion when pretraining Bidirectional LSTM #25

Closed
shangqing-liu opened this issue Apr 26, 2021 · 2 comments

Comments

@shangqing-liu

Hi,

Thanks for the wonderful work. May I ask a question: when I pretrain the LSTM model with the default settings, memory overflows. My server has 180 GB of RAM, so how much RAM is needed for pretraining?

Thanks and best regards.

@ajayjain
Collaborator

Hi @shangqing-liu -- apologies for the delay. I am a coauthor on this work.

Can I ask what GPU you are using to train this model? Are you encountering a CUDA malloc error (out of memory on the GPU) or a host-side out-of-memory error?

At the moment, the dataloader reads the whole dataset into RAM before training. The pre-training dataset is quite large (almost 20 GB), so this can be expensive. We pre-trained our models on machines with around 256 GB of RAM and did not encounter OOMs there.
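
As a possible workaround if host RAM is tight, you could stream examples from disk instead of loading everything up front. Below is a minimal sketch assuming a PyTorch-style setup with a line-per-example text file; the names `PRETRAIN_PATH` and `encode_line` are placeholders, not part of this repo's actual dataloader:

```python
# Hypothetical sketch: stream pre-training examples lazily so peak host RAM
# stays small, instead of materializing the full ~20 GB dataset in memory.
import torch
from torch.utils.data import IterableDataset, DataLoader

PRETRAIN_PATH = "data/pretrain.txt"  # placeholder path, adjust to your layout

def encode_line(line):
    # Placeholder encoder: substitute the project's own tokenization here.
    return torch.tensor([ord(c) % 256 for c in line.strip()], dtype=torch.long)

class StreamingTextDataset(IterableDataset):
    """Yields one encoded example at a time, reading the file lazily."""
    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                yield encode_line(line)

# batch_size=None disables automatic batching; add your own collation if needed.
loader = DataLoader(StreamingTextDataset(PRETRAIN_PATH), batch_size=None)
```

This trades some I/O throughput for a much smaller memory footprint; whether it helps depends on where your OOM is actually happening (host vs. GPU).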

If you could provide some more details on your setup, I can help debug!

@shangqing-liu
Author

Hi, @ajayjain

Thanks for the reply. I have fixed the problem, so I am closing this issue now.
