Code doesn't run - RAM fills up too quickly #6
The code did not run on my machine with 16 GB of RAM and two 12 GB GeForce GTX TITAN X GPUs; training would get stuck at 0%.
So I tried to replicate the run in Google Colab, but the process kept getting killed even with 35 GB of RAM.
I added a line to the logger in dataset.py to report RAM usage via psutil (see the sketch below): roughly every 500 examples, about 4 GB of additional RAM is consumed.
Am I missing something?
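For reference, a minimal sketch of the logging I added (the exact logger setup and call site in dataset.py may differ):

```python
import logging
import psutil

logger = logging.getLogger(__name__)

def log_ram_usage(num_examples: int) -> None:
    """Log the resident memory (RSS) of this process during preprocessing."""
    rss_gb = psutil.Process().memory_info().rss / (1024 ** 3)
    logger.info("Processed %d examples, RSS: %.2f GB", num_examples, rss_gb)
```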
Comments
This is expected: the current implementation first preprocesses all examples in the trainset into the various input tensors (input ids, structure mask, label ids, etc.), which takes some time and consumes quite a lot of RAM. My advice is to implement a sampling-based dataset loader instead of preprocessing and caching all the examples at once before training, although this will somewhat slow down training. In other words, construct the tensors for a single example on demand when it is sampled, rather than building everything up front.
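A minimal sketch of such a lazy loader, assuming PyTorch; preprocess_example is a hypothetical stand-in for the per-example tensor construction in dataset.py:

```python
from torch.utils.data import DataLoader, Dataset

class LazyDataset(Dataset):
    """Builds the input tensors for one example at a time in __getitem__,
    so peak RAM stays bounded instead of growing with the whole trainset."""

    def __init__(self, examples, preprocess_example):
        # Keep only the raw examples; do NOT precompute tensors here.
        self.examples = examples
        self.preprocess_example = preprocess_example

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        # Input ids, structure mask, label ids, etc. are created on demand.
        return self.preprocess_example(self.examples[idx])

# The DataLoader then samples indices and preprocessing happens per batch:
# loader = DataLoader(LazyDataset(trainset, preprocess_example),
#                     batch_size=8, shuffle=True, num_workers=2)
```

With num_workers > 0, the per-example preprocessing also runs in background worker processes, which hides much of the extra cost.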
Good luck, and if you have any further questions, feel free to ask here or via email. Thanks for your attention to our work~
Thank you for such a thorough response! This solved the RAM consumption issue and the code ran, but I am facing a few CUDA/infra issues during training which might not be related to this issue. I will reach out to you should I need anything. Thanks again!