CGSchNet support #21
Is there really any difference from SchNet? The cfconv layer seems to be nearly identical. The only difference they mention is that they change the activation function from softplus to tanh, and they don't give any explanation for why they made that change. (It also seems like a doubtful choice, since bounded activation functions often don't work as well as unbounded ones for hidden layers.)
I'm not quite sure! BTW, we've recently noticed that ANI uses CELU, which is not C2-continuous. This causes significant problems with some optimizers, and in principle it shouldn't be used with MD. They're now retraining with softplus. But we should double-check that the activation functions we implement are all C2-continuous.
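For anyone who wants to verify this numerically, here's a minimal sketch (plain PyTorch, not tied to this repo) that probes the second derivative of CELU vs. Softplus on either side of x = 0:

```python
import torch

def second_derivative(fn, x):
    # Elementwise second derivative via double backward.
    x = x.clone().double().requires_grad_(True)
    (g,) = torch.autograd.grad(fn(x).sum(), x, create_graph=True)
    (h,) = torch.autograd.grad(g.sum(), x)
    return h

xs = torch.tensor([-1e-4, 1e-4])
print(second_derivative(torch.nn.CELU(), xs))      # ~[1, 0]: curvature jumps at 0, so not C2
print(second_derivative(torch.nn.Softplus(), xs))  # ~[0.25, 0.25]: smooth through 0
```

The jump in CELU's curvature at the origin is exactly the kind of discontinuity that can trip up second-order optimizers and hurt energy conservation in MD.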
Agreed!
@peastman: @brookehus gave a good summary. For our systems and CG mapping choices, we found that using Tanh() as an activation function in place of (shifted) Softplus() resulted in more stable simulations using trained models. Of course, there is no problem for users to try any activation function that they wish.
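For concreteness, here is a minimal sketch of what a configurable activation could look like in a SchNet-style filter-generating network (the module and argument names are hypothetical, not the API from #18):

```python
import math
import torch.nn as nn
import torch.nn.functional as F

class ShiftedSoftplus(nn.Module):
    # Standard SchNet activation: softplus(x) - log(2), which is zero at the origin.
    def forward(self, x):
        return F.softplus(x) - math.log(2.0)

def filter_network(n_rbf, n_filters, activation="ssp"):
    # CGSchNet-style variant: pass activation="tanh" to swap in nn.Tanh().
    act = nn.Tanh() if activation == "tanh" else ShiftedSoftplus()
    return nn.Sequential(
        nn.Linear(n_rbf, n_filters), act,
        nn.Linear(n_filters, n_filters), act,
    )
```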
Ok, I can add the option to use tanh instead. Can you look at the API in #18 and see if it otherwise looks good for your purposes? Since it's based on handwritten CUDA kernels, it obviously won't be as flexible as pure PyTorch code.
@nec4 do you mind taking a look when you have time? Let me know if you need anything.
Of course - I will take a look. I'll let you guys know if there are any outstanding issues with the API for our purposes. I will probably get back to you guys sometime early next week.
Hello! From what I see, the only differences between what's in #18 and our code are that we currently don't use neighbor cutoffs in our models (although we allow for the option of a simple neighbor list) and that we have not implemented cosine cutoffs - though in principle these features may be useful to us in the future. Additionally, we have found (following the original SchNet paper) that normalizing the output of the CfConv by the number of beads/atoms results in improved performance when using the network generatively (e.g., calculating forces for simulations). If there is more to discuss, please let me know!
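For reference, the cosine cutoff being discussed is the smooth switching function from the original SchNet paper; a minimal standalone sketch, independent of either codebase:

```python
import math
import torch

def cosine_cutoff(r, r_cut):
    # 0.5 * (cos(pi * r / r_cut) + 1) inside the cutoff, exactly zero beyond it.
    f = 0.5 * (torch.cos(math.pi * r / r_cut) + 1.0)
    return f * (r < r_cut).to(r.dtype)
```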
Is that because you specifically don't want the cosine cutoff function, or just because you haven't gotten to implementing it yet? In other words, is it a problem that the current implementation includes it?
Is that just a scaling of the output? Or does it require changes to the cfconv kernel itself?
@peastman: For the cutoff, we just never needed it/used it - I don't think that the current implementation including it is necessarily a problem (though I have not tried it in any of my models, so I cannot say for certain). For the scaling, it is just a simple scalar normalization; i.e., we just divide the final output of the CfConv by the number of beads in the system - so I don't think it requires changes to the kernel itself (you could just put another block after the CfConv layer that performs a simple scaling).
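A minimal sketch of that scaling as a separate block placed after the cfconv layer (the wrapper name and the assumed (batch, beads, features) layout are illustrative, not part of #18):

```python
import torch.nn as nn

class BeadNormalizedCfConv(nn.Module):
    # Wraps an existing cfconv module and divides its output by the number of beads.
    def __init__(self, cfconv):
        super().__init__()
        self.cfconv = cfconv

    def forward(self, features, *args, **kwargs):
        out = self.cfconv(features, *args, **kwargs)
        n_beads = features.shape[-2]  # assumes a (..., beads, features) layout
        return out / n_beads
```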
Ok, thanks. So it sounds like the only feature I need to add is the option to use tanh. |
@peastman: I think so too! Let us know if there is any more information we can provide.
@peastman: Would love to see if we could support the CGSchNet model described in this excellent paper from @brookehus, since this could allow us to support much larger coarse-grained models as part of our ML integration.