
Question about memory scaling during training #3

Closed
rees-c opened this issue Feb 16, 2024 · 1 comment


rees-c commented Feb 16, 2024

Hi,

Thanks for building these models. I noticed that the training scripts for the MP pre-trained models use a small batch size of 16. What was the reasoning for this choice?

My application requires training on graphs with hundreds to a few thousand nodes, and I was hoping that MACE's lack of explicit triplet angle computation (as in DimeNet or GemNet) would offer more favorable memory scaling (see the rough sketch below). Any insights would be greatly appreciated.

Thanks,
Rees
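
To make the scaling intuition in the question concrete, here is a back-of-the-envelope sketch (not code from MACE, DimeNet, or GemNet): for a radius graph with average degree k, an edge-based model stores on the order of N·k messages, while a model that enumerates angle triplets stores roughly an extra factor of k on top of that. The neighbour count of 30 below is an illustrative assumption, not a measured value.

```python
# Rough count of messages a model must hold in memory for one graph:
# edge-based (MACE-style) vs. explicit-triplet models (DimeNet/GemNet-style).
# All numbers are illustrative assumptions, not measurements.

def message_counts(num_nodes: int, avg_degree: float):
    edges = num_nodes * avg_degree          # directed edges i -> j
    triplets = edges * (avg_degree - 1)     # pairs of edges sharing a central node
    return edges, triplets

for n in (100, 1000, 3000):
    e, t = message_counts(n, avg_degree=30)  # assume ~30 neighbours within the cutoff
    print(f"{n:>5} nodes: ~{e:,.0f} edge messages, ~{t:,.0f} triplet messages")
```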


ilyes319 commented Oct 9, 2024

Hi @rees-c,
Sorry for the long delay in replying. The MACE GitHub repository would be a more suitable place for this question.
The batch size affects both memory consumption and the training dynamics.
During training, MACE can fit roughly 1000 nodes on a single A100 GPU. However, we rarely go above a batch size of 64 per GPU, because we see accuracy degrade beyond that.
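
To illustrate how those two numbers interact, here is a minimal sketch (not part of MACE) that caps the per-GPU batch size by the ~1000-node memory budget and the ~64-graph accuracy ceiling quoted above. Both limits are taken from this thread as rules of thumb, not guarantees, and the helper function is hypothetical.

```python
# Hypothetical helper: pick a per-GPU batch size from a node budget and a
# batch-size ceiling. The defaults (1000 nodes per A100, max 64 graphs)
# are the rough numbers quoted in this thread.

def suggest_batch_size(avg_nodes_per_graph: float,
                       node_budget: int = 1000,
                       max_batch: int = 64) -> int:
    by_memory = max(1, node_budget // int(avg_nodes_per_graph))
    return int(min(by_memory, max_batch))

print(suggest_batch_size(50))    # ~20 small graphs fit the node budget
print(suggest_batch_size(800))   # large graphs: only 1 per batch
```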

@ACEsuit ACEsuit locked and limited conversation to collaborators Oct 9, 2024
@ilyes319 ilyes319 converted this issue into discussion #16 Oct 9, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
