Input pipeline fully out of core #251

M-R-Schaefer · 2024-04-02T12:59:33Z

While the recent input pipeline rework has already greatly reduced memory consumption, we can definitely decrease it further.
The current major RAM hog is the dict of ASE atoms, not the data pipeline. We can avoid this by reading the file on the fly in the generator instead of loading it all at once, so that the full set of atoms never has to be held in memory.
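
To illustrate the idea, here is a minimal sketch of a generator that reads a trajectory file one frame at a time. It is not the project's actual pipeline: the function name `iter_xyz_frames` is hypothetical, it assumes a plain XYZ file (atom count, comment line, then one `symbol x y z` line per atom), and it uses only the standard library. With ASE itself, `ase.io.iread` should give similar lazy iteration over `Atoms` objects.

```python
from typing import Iterator, List, Tuple

# One frame: chemical symbols plus Cartesian positions.
Frame = Tuple[List[str], List[Tuple[float, float, float]]]


def iter_xyz_frames(path: str) -> Iterator[Frame]:
    """Yield frames lazily from a plain XYZ file (hypothetical helper).

    Only the frame currently being parsed is held in memory.
    """
    with open(path) as fh:
        while True:
            header = fh.readline()
            if not header.strip():
                return  # end of file or trailing blank line
            n_atoms = int(header)
            fh.readline()  # comment line, ignored in this sketch
            symbols: List[str] = []
            positions: List[Tuple[float, float, float]] = []
            for _ in range(n_atoms):
                sym, x, y, z = fh.readline().split()[:4]
                symbols.append(sym)
                positions.append((float(x), float(y), float(z)))
            yield symbols, positions
```

A generator like this could be consumed directly by the batching code, for example `for symbols, positions in iter_xyz_frames("train.xyz"): ...`, where the file name is a placeholder. The trade-off is that every epoch re-reads and re-parses the file, so disk throughput replaces RAM as the limiting factor.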

M-R-Schaefer · 2024-07-24T12:49:35Z

Although this could certainly be done, the current PBP dataset already allows training on even the largest available datasets with 64 GB of RAM, which can be expected of a modern workstation.

M-R-Schaefer added the enhancement label Apr 2, 2024

M-R-Schaefer closed this as not planned Jul 24, 2024
