Are discrepancies between old m3gnet repo expected? #64
Comments
How different are the atomic positions? Although we adopted mostly the same training protocols, the pre-trained M3GNet in this repo is not an exact replica of the previous M3GNet-TF. I would not expect the differences in atomic positions to be large. Energy differences within the MAE of the potentials (30-40 meV/atom) are not surprising.
I should add that there is no easy way to port model weights directly from TF to DGL/PyTorch, which is why we had to retrain. In any case, this is a baseline model (just to make sure we are reproducing the broad error characteristics of the TF version), and we will provide improved models as we go along.
Yeah, for the atomic positions, we usually get to within 1% of the DFT values. So I would expect the deviation in atomic positions to be less significant (but not below noise level). There are definite uncertainties in the energies: better for some systems (e.g., oxides) but worse for others.
I have redone the cubic crystal test (see examples) with the new matgl implementation. The error characteristics are largely similar to the old m3gnet. We did discover some minor data issues, and the new M3GNet is fitted with further filtered data (e.g., some problematic structures with very large forces were removed). So again, it is not an exact replica of the TF M3GNet, but it is basically similar performance-wise. I will close this issue, but feel free to reopen it if you discover any serious issues with the new implementation.
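The exact filtering criterion is not stated in this thread. Purely as an illustration of what "removing structures with very large forces" could look like, here is a minimal sketch in plain Python. The entry layout (a dict with a `"forces"` list), the function names, and the threshold value are all hypothetical and are not taken from matgl or from the actual training pipeline.

```python
import math


def max_force_norm(forces):
    """Return the largest per-atom force magnitude in a structure.

    `forces` is a list of (fx, fy, fz) tuples, one per atom.
    """
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def filter_by_max_force(entries, threshold):
    """Keep only entries whose largest force magnitude is <= threshold.

    `entries` is a list of dicts with a "forces" key (hypothetical layout).
    `threshold` is in the same units as the forces; its value is an assumption.
    """
    return [e for e in entries if max_force_norm(e["forces"]) <= threshold]


# Toy data: the second entry has one atom with a very large force.
entries = [
    {"id": "a", "forces": [(0.1, 0.0, 0.0), (0.0, -0.2, 0.0)]},
    {"id": "b", "forces": [(150.0, 0.0, 0.0), (0.0, 0.0, 0.1)]},
]
kept = filter_by_max_force(entries, threshold=100.0)  # threshold is made up
kept_ids = [e["id"] for e in kept]
```

The cutoff used for the released model may be defined differently (for example per component rather than per norm), so treat this only as a starting point.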
@kenko911 can provide further details on the additional filtering done. Please write it in the README.
Do I understand correctly that the new architecture of the model is also slightly different (the number of parameters seems to be different)? Can any details be given about this? It might be relevant.
The differences are relatively minor. The embedding sizes etc. are all the same. The only slight difference is in the length of the bond expansion, I believe. Otherwise, the activation, optimizers, etc. are all the same. It is not possible to exactly replicate the old model given that we are moving to an entirely different code base, but this is pretty close. Our focus is on improving the models going forward, and this model is just a baseline.
I'm comparing results between the pretrained m3gnet in this repo and in the original m3gnet repo, and for some of my structures I am finding pretty large discrepancies. Is this expected? For example, for this structure:
I get a difference of 40 meV/atom in expected energy, and different atomic positions as well.
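For anyone wanting to quantify this kind of discrepancy, here is a minimal sketch in plain Python, assuming you already have the total energies (in eV) and the relaxed fractional coordinates from both models. The function names and the toy numbers are hypothetical; nothing here calls matgl or m3gnet. The displacement uses a simple minimum-image wrap in fractional space, which is only approximate for strongly skewed cells, and it assumes both relaxations ended with the same lattice.

```python
import math


def energy_diff_mev_per_atom(e_a_ev, e_b_ev, n_atoms):
    """Absolute difference between two total energies (eV), in meV/atom."""
    return abs(e_a_ev - e_b_ev) / n_atoms * 1000.0


def max_displacement(frac_a, frac_b, lattice):
    """Largest per-atom displacement (same length units as `lattice`).

    frac_a, frac_b: lists of fractional coordinates, same atom ordering.
    lattice: 3x3 list of row vectors. Differences are wrapped to the
    nearest periodic image before converting to Cartesian coordinates.
    """
    worst = 0.0
    for pa, pb in zip(frac_a, frac_b):
        d = [x - y for x, y in zip(pa, pb)]
        d = [x - round(x) for x in d]  # wrap into [-0.5, 0.5]
        cart = [sum(d[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        worst = max(worst, math.sqrt(sum(c * c for c in cart)))
    return worst


# Toy example: two atoms in a 4 Å cubic cell (numbers are made up).
lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
frac_old = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
frac_new = [[0.99, 0.0, 0.0], [0.5, 0.5, 0.5]]

de = energy_diff_mev_per_atom(-10.00, -10.08, n_atoms=2)
dr = max_displacement(frac_old, frac_new, lattice)
```

Reporting both numbers (meV/atom and the largest displacement in Å) makes it easier to judge whether a difference falls within the error range discussed in the comments.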