Input pipeline fully out of core #251

Closed
M-R-Schaefer opened this issue Apr 2, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@M-R-Schaefer
Contributor

While the recent input pipeline rework has already greatly reduced memory consumption, we can definitely decrease it further.
The current major RAM hog is the dict of ASE atoms, not the data pipeline. We can avoid this by reading the file on the fly in the generator instead of all at once, so the full set of atoms never has to be held in memory.
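
To make the idea concrete, here is a minimal sketch of a generator that reads one structure at a time. It is not the project's actual pipeline: the function name `iter_xyz_frames` and the plain XYZ layout (atom count, comment line, then one `symbol x y z` line per atom) are assumptions for illustration, and it uses only the standard library. With ASE itself, `ase.io.iread` provides similar lazy, frame-by-frame iteration.

```python
from typing import Iterator, List, Tuple

# One frame: chemical symbols plus Cartesian positions.
Frame = Tuple[List[str], List[Tuple[float, float, float]]]


def iter_xyz_frames(path: str) -> Iterator[Frame]:
    """Yield one (symbols, positions) frame at a time from a plain XYZ file.

    Hypothetical helper: only the frame currently being parsed is held in
    memory, so peak usage does not grow with the number of frames.
    """
    with open(path) as fh:
        while True:
            header = fh.readline()
            if not header.strip():
                return  # end of file (or a trailing blank line)
            n_atoms = int(header)
            fh.readline()  # comment line, ignored in this sketch
            symbols: List[str] = []
            positions: List[Tuple[float, float, float]] = []
            for _ in range(n_atoms):
                sym, x, y, z = fh.readline().split()[:4]
                symbols.append(sym)
                positions.append((float(x), float(y), float(z)))
            yield symbols, positions
```

A data pipeline could consume this generator directly, e.g. `for symbols, positions in iter_xyz_frames("data.xyz"): ...`, at the cost of re-reading the file every epoch instead of keeping everything in RAM.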

M-R-Schaefer added the enhancement label on Apr 2, 2024
@M-R-Schaefer
Contributor Author

Although this could certainly be done, the current PBP dataset already allows training on even the largest available datasets with 64 GB of RAM, which can be expected of a modern workstation.

M-R-Schaefer closed this as not planned on Jul 24, 2024