
Some questions regarding reducing the size of both input and model? #7

Closed

smiles724 opened this issue Feb 4, 2022 · 3 comments

@smiles724

Hi, thanks for sharing the code of GemNet. It is wonderful work on energy prediction.

However, I intend to apply your model to macromolecules such as proteins rather than small molecules. As you know, proteins contain far more atoms, which inevitably requires much more GPU memory. To prevent running out of GPU memory, I have to limit the size of the model inputs, so I would like to ask for advice on how to reduce the input size.

Specifically, a straightforward option is to decrease the cutoff distance so that there are fewer edges, but I suspect that is not good practice. Could you suggest some other solutions (e.g., other hyperparameters)?
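To see why the cutoff dominates memory, here is a quick back-of-the-envelope sketch (plain Python, random coordinates; the toy box size and atom count are illustrative, not taken from any real protein) of how the edge count shrinks with the cutoff:

```python
import itertools
import math
import random

def count_edges(coords, cutoff):
    """Count undirected atom pairs within the cutoff distance."""
    return sum(
        1
        for a, b in itertools.combinations(coords, 2)
        if math.dist(a, b) <= cutoff
    )

random.seed(0)
# Hypothetical toy system: 200 "atoms" scattered in a 20 x 20 x 20 A box.
coords = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(200)]

for cutoff in (5.0, 4.0, 3.0):
    print(f"cutoff {cutoff} A -> {count_edges(coords, cutoff)} edges")
```

Since each edge (and, in GemNet, each triplet or quadruplet built from edges) carries activations, fewer edges directly means less GPU memory, which is exactly the accuracy tradeoff discussed below.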


@smiles724 smiles724 changed the title Some questions regarding reducing the input size? Some questions regarding reducing the size of both input and model? Feb 4, 2022
@smiles724
Copy link
Author

Moreover, I also need to reduce the model size. Even after reducing the batch size from 32 to 2, using GemNet to process my data remains a problem. Can you give me some hints on obtaining a smaller model?

@gowithdaflo
Collaborator

Thank you for your interest in our work.
The size of the model input can only be reduced by decreasing the two cutoff distances. As discussed in the appendix of our paper, this reduces prediction accuracy, but since it is a tradeoff with computational resources, you could give it a shot.
The model size itself can be reduced by shrinking the embedding sizes (emb_size_...) or the number of interaction blocks (num_blocks). Another approach is to use GemNet-T instead of GemNet-Q (as done for OC20), or even the direct force prediction models GemNet-dQ/dT. All of these significantly lower memory consumption, but again at the cost of prediction accuracy.
Let me know if this helps.
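Put together, the knobs mentioned above might look like the following sketch. Only the emb_size_... prefix and num_blocks follow the naming used in this thread; the specific keys and values are illustrative, not the repository's actual defaults or a tuned configuration:

```python
# Illustrative (not official) hyperparameter set for a smaller GemNet variant.
small_config = {
    "num_blocks": 3,       # fewer interaction blocks than the usual 4
    "emb_size_atom": 64,   # halved atom embedding size
    "emb_size_edge": 64,   # halved edge embedding size
    "emb_size_trip": 32,   # halved triplet embedding size
    "cutoff": 4.0,         # smaller cutoff -> fewer edges, lower accuracy
}
```

Each of these reduces memory roughly independently, so they can be combined and then relaxed one at a time to find the accuracy/memory tradeoff that fits your hardware.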

@gasteigerjo
Contributor

gasteigerjo commented Feb 8, 2022

I second the points Flo mentioned. I think your very first step should be to use GemNet-T instead of GemNet-Q; that will already give a huge improvement. I wouldn't recommend reducing the cutoffs below 4 Å.

I would not recommend using direct models (GemNet-dT) for molecular dynamics, since they can lead to unstable trajectories. If your task is not simulation, though, then you can use a direct model to gain another 2-3x reduction in memory and runtime.
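For context on the direct vs. energy-derived distinction: energy-conserving models predict a scalar energy and obtain forces as its negative gradient, while direct models predict forces with a separate output head. A minimal one-dimensional sketch (plain Python; a toy quadratic energy stands in for a learned model, and finite differences stand in for autograd):

```python
def energy(x):
    """Toy 1-D potential standing in for a learned energy model."""
    return 0.5 * (x - 1.0) ** 2

def force_from_energy(x, h=1e-5):
    """Energy-derived force F = -dE/dx, via central finite differences.
    In a real model this gradient comes from autograd, which ties the
    forces to the predicted energy surface (energy conservation)."""
    return -(energy(x + h) - energy(x - h)) / (2 * h)

def force_direct(x):
    """A 'direct' force head predicts F without differentiating an energy.
    Nothing constrains it to be the gradient of any energy, which is why
    direct models can produce drifting (unstable) MD trajectories even
    when their per-step force errors look small."""
    return -(x - 1.0) * 1.01  # hypothetical head with a small systematic bias

print(force_from_energy(0.0))  # close to 1.0, the true -dE/dx at x=0
print(force_direct(0.0))       # 1.01: cheaper, but inconsistent with energy()
```

This is why the direct variants are fine for one-shot property prediction, where the 2-3x memory/runtime saving is free, but risky for simulation.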

Embedding sizes and depth (num_blocks) would then be the third thing to consider. It's hard to give advice on which values present the best trade-off, since they will all likely reduce accuracy. A first step might be to simply halve all embedding sizes and use 3 blocks instead of 4, but you should probably not reduce anything below 16. Note that emb_size_quad, emb_size_sbf, and emb_size_bil_quad are not relevant for GemNet-T. You can then increase the embeddings again to see which ones matter most for your application.
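The halve-with-a-floor recipe above can be written as a small helper. The hyperparameter names here are illustrative; only the emb_size_... prefix and num_blocks follow the thread's naming:

```python
def shrink(config, floor=16):
    """Halve every embedding size (never below `floor`) and drop one
    interaction block, following the rule of thumb above."""
    small = dict(config)
    for key, value in config.items():
        if key.startswith("emb_size_"):
            small[key] = max(floor, value // 2)
    small["num_blocks"] = max(1, config["num_blocks"] - 1)
    return small

# Hypothetical starting point, not the repository's actual defaults.
base = {"num_blocks": 4, "emb_size_atom": 128,
        "emb_size_edge": 128, "emb_size_trip": 32}
print(shrink(base))
```

Applying it once and then selectively restoring individual sizes is one systematic way to find which embeddings matter most for a given application.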

Overall, I think it should easily be possible to find well-performing settings for GemNet-T on proteins that fit in 45GiB.
