
Some questions regarding reducing the size of both input and model? #7

Closed

smiles724 opened this issue Feb 4, 2022 · 3 comments

@smiles724

Hi, thanks for sharing the code of GemNet. It is wonderful work on energy prediction.

However, I intend to apply your model to macromolecules such as proteins rather than small molecules. As you know, proteins contain far more atoms, which inevitably requires much more GPU memory. To prevent running out of GPU memory, I have to limit the size of the model inputs, so I would like to ask for advice on how to reduce the input size.

Specifically, a straightforward option is to decrease the cutoff distance so that there are fewer edges, but I suspect that is not good practice. Could you suggest some other solutions (e.g., other hyperparameters)?
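To see why the cutoff dominates memory, here is a quick back-of-the-envelope sketch (plain Python, random coordinates; the toy box size and atom count are illustrative, not taken from any real protein) of how the edge count shrinks with the cutoff:

```python
import itertools
import math
import random

def count_edges(coords, cutoff):
    """Count undirected atom pairs within the cutoff distance."""
    return sum(
        1
        for a, b in itertools.combinations(coords, 2)
        if math.dist(a, b) <= cutoff
    )

random.seed(0)
# Hypothetical toy system: 200 "atoms" scattered in a 20 x 20 x 20 A box.
coords = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(200)]

for cutoff in (5.0, 4.0, 3.0):
    print(f"cutoff {cutoff} A -> {count_edges(coords, cutoff)} edges")
```

Since each edge (and, in GemNet, each triplet or quadruplet built from edges) carries activations, fewer edges directly means less GPU memory, which is exactly the accuracy tradeoff discussed below.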


@smiles724 smiles724 changed the title Some questions regarding reducing the input size? Some questions regarding reducing the size of both input and model? Feb 4, 2022
@smiles724
Copy link
Author

Moreover, I also need to reduce the model size. Even after reducing the batch size from 32 to 2, using GemNet to process my data remains a problem. Can you give me some hints on obtaining a smaller model?

@gowithdaflo
Collaborator

Thank you for your interest in our work.
The size of the model input can only be reduced by decreasing the two cutoff distances. As discussed in the appendix of our paper, this reduces prediction accuracy, but since it is a tradeoff with computational resources, you could give it a shot.
The model size itself can be reduced by shrinking the embedding sizes (emb_size_...) or the number of interaction blocks (num_blocks). Another approach is to use GemNet-T instead of GemNet-Q (as done for OC20), or even the direct force prediction models GemNet-dQ/dT. All of these significantly lower memory consumption, but again at the cost of prediction accuracy.
Let me know if this helps.
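Put together, the knobs mentioned above might look like the following sketch. Only the emb_size_... prefix and num_blocks follow the naming used in this thread; the specific keys and values are illustrative, not the repository's actual defaults or a tuned configuration:

```python
# Illustrative (not official) hyperparameter set for a smaller GemNet variant.
small_config = {
    "num_blocks": 3,       # fewer interaction blocks than the usual 4
    "emb_size_atom": 64,   # halved atom embedding size
    "emb_size_edge": 64,   # halved edge embedding size
    "emb_size_trip": 32,   # halved triplet embedding size
    "cutoff": 4.0,         # smaller cutoff -> fewer edges, lower accuracy
}
```

Each of these reduces memory roughly independently, so they can be combined and then relaxed one at a time to find the accuracy/memory tradeoff that fits your hardware.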

@gasteigerjo
Contributor

gasteigerjo commented Feb 8, 2022

I second the points Flo mentioned. I think your very first step should be to use GemNet-T instead of GemNet-Q; that will already give a huge improvement. I wouldn't recommend reducing the cutoffs below 4 Å.

I would not recommend using direct models (GemNet-dT) for molecular dynamics, since they can lead to unstable trajectories. If your task is not simulation, though, then you can use a direct model to gain another 2-3x reduction in memory and runtime.
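For context on the direct vs. energy-derived distinction: energy-conserving models predict a scalar energy and obtain forces as its negative gradient, while direct models predict forces with a separate output head. A minimal one-dimensional sketch (plain Python; a toy quadratic energy stands in for a learned model, and finite differences stand in for autograd):

```python
def energy(x):
    """Toy 1-D potential standing in for a learned energy model."""
    return 0.5 * (x - 1.0) ** 2

def force_from_energy(x, h=1e-5):
    """Energy-derived force F = -dE/dx, via central finite differences.
    In a real model this gradient comes from autograd, which ties the
    forces to the predicted energy surface (energy conservation)."""
    return -(energy(x + h) - energy(x - h)) / (2 * h)

def force_direct(x):
    """A 'direct' force head predicts F without differentiating an energy.
    Nothing constrains it to be the gradient of any energy, which is why
    direct models can produce drifting (unstable) MD trajectories even
    when their per-step force errors look small."""
    return -(x - 1.0) * 1.01  # hypothetical head with a small systematic bias

print(force_from_energy(0.0))  # close to 1.0, the true -dE/dx at x=0
print(force_direct(0.0))       # 1.01: cheaper, but inconsistent with energy()
```

This is why the direct variants are fine for one-shot property prediction, where the 2-3x memory/runtime saving is free, but risky for simulation.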

Embedding sizes and depth (num_blocks) would then be the third thing to consider. It's hard to give advice on which values present the best trade-off, since they will all likely reduce accuracy. A first step might be to simply halve all embedding sizes and use 3 blocks instead of 4, but you should probably not reduce anything below 16. Note that emb_size_quad, emb_size_sbf, and emb_size_bil_quad are not relevant for GemNet-T. You can then increase the embeddings again to see which ones matter most for your application.
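The halve-with-a-floor recipe above can be written as a small helper. The hyperparameter names here are illustrative; only the emb_size_... prefix and num_blocks follow the thread's naming:

```python
def shrink(config, floor=16):
    """Halve every embedding size (never below `floor`) and drop one
    interaction block, following the rule of thumb above."""
    small = dict(config)
    for key, value in config.items():
        if key.startswith("emb_size_"):
            small[key] = max(floor, value // 2)
    small["num_blocks"] = max(1, config["num_blocks"] - 1)
    return small

# Hypothetical starting point, not the repository's actual defaults.
base = {"num_blocks": 4, "emb_size_atom": 128,
        "emb_size_edge": 128, "emb_size_trip": 32}
print(shrink(base))
```

Applying it once and then selectively restoring individual sizes is one systematic way to find which embeddings matter most for a given application.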

Overall, I think it should easily be possible to find well-performing settings for GemNet-T on proteins that fit in 45GiB.
