Skip to content

Conversation

@frostedoyster
Copy link
Contributor

@frostedoyster frostedoyster commented Apr 9, 2025

Hi everyone and thanks for the great work on the library. Also, let us know if you have feedback on https://github.com/Luthaf/vesin!

Summary

Checklist

Before a pull request can be merged, the following items must be checked:

  • Doc strings have been added in the Numpy docstring format.
    Run ruff on your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply run
pre-commit install and a check will be run prior to allowing commits.

@cla-bot
Copy link

cla-bot bot commented Apr 9, 2025

We require contributors to sign our Contributor License Agreement (CLA), and we don't have record of your signature. In order for us to review and merge your code, please sign the CLA.

Copy link
Member

@CompRhys CompRhys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start, I would be a little bit cautious about including the batched autograd code in this repo for forces and stresses as it seems as though it could differ for future models and "MetaTensor" seems as though it may be a collection of models in the future c.f. fairchem or graphpes and so it may be more maintainable if that logic lives in your repo? That wouldn't be a reason not to merge but worth some thought.

@cla-bot cla-bot bot added the cla-signed Contributor license agreement signed label Apr 10, 2025
@frostedoyster
Copy link
Contributor Author

I would be a little bit cautious about including the batched autograd code in this repo for forces and stresses as it seems as though it could differ for future models and "MetaTensor"

This is a conscious design decision that was made in metatensor, of course with its advantages and disadvantages, but meaning that any model will conform to it. The upsides are that, while the model can calculate its own forces at training time, the computation of the forces is done outside of it at evaluation time, ensuring that the forces are computed as the derivative of the energies to prevent e.g. incorrect MD. I agree with you that ideally we would like to make a single function inside metatensor that computes forces and stresses from energies in this way, but it's also true that the same function would only contain two/three lines (an autograd call, a division by the volume, and perhaps a concatenation). This is why we haven't done it yet and generally prefer having the calling code do it instead. Of course, we will update this interface once this function is added, although this might take a while as we are going through a sizeable refactoring effort.

Comment on lines 206 to 210
if not neighbor_list_options.full_list:
raise ValueError(
"Neighbor list options must be full for metatensor models to "
"be used in torchsim."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because neighbor_list_fn can only return full NL? I think we should still support half NL, because that's still a relevant use case for some model. I can help with an implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as far as I understand torch-sim has an interface for the NLs which doesn't support half NLs. Currently, the only backend for this interface seems to be vesin-torch. @CompRhys please correct me if I'm wrong. Anyway, a full NL should cover >90% of models in practice

Copy link
Collaborator

@abhijeetgangan abhijeetgangan Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the default neighbor list is vesin-torch but you can change that. All of them are tested against ase neighborlist implement. You can turn off the full NL here https://github.com/Radical-AI/torch-sim/blob/c79fc6591802df53a5fd0869b96715826ef0d09e/torch_sim/neighbors.py#L547

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can models request a different NL from torch-sim? Or is the API of neighbor_list_fn defined around full NL?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can request a different NL when you initialize the model. For example, in MACE https://github.com/Radical-AI/torch-sim/blob/c79fc6591802df53a5fd0869b96715826ef0d09e/torch_sim/models/mace.py#L105

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I updated this!

Co-authored-by: Guillaume Fraux <luthaf@luthaf.fr>
Signed-off-by: Filippo Bigi <98903385+frostedoyster@users.noreply.github.com>
@frostedoyster frostedoyster force-pushed the main branch 2 times, most recently from 91c5b25 to e51eadd Compare April 10, 2025 10:01
Copy link
Collaborator

@orionarcher orionarcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @frostedoyster, this will be very nice interface to have.

Couple comments just to make sure the interface behaves as intended.

For added context, I want to point you to our Current Posture on Model Interfaces, just to share how we are thinking about supporting models over time.

)

# Calculate neighbor list(s) for this system
for neighbor_list_options in self._requested_neighbor_lists:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all models in metatensor use a radial cutoff for the neighbor list?

If not, consider setting a dynamic memory_scales_with @propery. The default memory_scales_with from ModelInterface will return "n_atoms_x_density", which is only correct for models with radial cutoffs.

For max-neighbors based cutoffs, the memory scaler will be incorrect and the autobatching will subtly break. See here for a more detailed discussion.

Copy link
Contributor Author

@frostedoyster frostedoyster Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at the moment this interface only supports a fixed cutoff radius, although metatensor itself would allow for a clamped number of neighbors by simply feeding a filtered neighbor list. However, we have never tested this possibility in training or in production, and therefore I would keep it as is for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also the possibility for model to set interaction_range to infinity, which is used for long-range models. I don't know if these would work with the current code, I'll check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The model scaling can be handled manually for edge cases, so I am happy leaving it as n_atoms_x_density

frostedoyster and others added 2 commits April 10, 2025 21:21
Signed-off-by: Filippo Bigi <98903385+frostedoyster@users.noreply.github.com>
@frostedoyster
Copy link
Contributor Author

frostedoyster commented Apr 11, 2025

Sorry for all the failures:

  • the formatting pre-commit hook wasn't working for me, it should be good now
  • the tests are working locally for me, using a fresh environment and the current imports. There was a small syntax issue in my test.yaml, should be good now
  • I can't get the docs to build locally with any torch version, due to MACE failing to load. I would need some help with that, if possible

@cla-bot
Copy link

cla-bot bot commented Apr 11, 2025

We require contributors to sign our Contributor License Agreement (CLA), and we don't have record of your signature. In order for us to review and merge your code, please sign the CLA.

@cla-bot cla-bot bot removed the cla-signed Contributor license agreement signed label Apr 11, 2025
@Luthaf
Copy link
Contributor

Luthaf commented Apr 11, 2025

Err, I already signed the CLA, and if I try to sign it again I get a Record of your CLA already exists. error message.

@CompRhys
Copy link
Member

Err, I already signed the CLA, and if I try to sign it again I get a Record of your CLA already exists. error message.

The api that the bot checks doesn't have a signed CLA associated with the Github account "Luthaf", I don't have deeper access to check the actual records. Could you try again to register the github username, if that doesn't work we will debug further.

@Luthaf
Copy link
Contributor

Luthaf commented Apr 11, 2025

I might have registered as luthaf instead of Luthaf, but most of the github-related things accept either. Trying to register again gives me the Record of your CLA already exists error.

@CompRhys
Copy link
Member

Sorry for all the failures:

  • the formatting pre-commit hook wasn't working for me, it should be good now
  • the tests are working locally for me, using a fresh environment and the current imports. There was a small syntax issue in my test.yaml, should be good now
  • I can't get the docs to build locally with any torch version, due to MACE failing to load. I would need some help with that, if possible

We're very glad to see contributions and there's always some teething issues trying to get new processes to work smoothly!
Getting the doc's to build cleanly for contributions is something that is a learning curve for all of us.

The SiO2 rattled structure test to test a high force configuration seems to be problematic on the Github runners for some models. We see quite big differences for MACE in CI when tests pass locally. Is that the same with PET-MAD here?

@Luthaf
Copy link
Contributor

Luthaf commented Apr 11, 2025

The test errors on CI seems to only happen on ARM macos + float32 dtype. The tests are fine on ubuntu

I can also reproduce the failure locally on another ARM mac. So I'd guess that something makes the model not work quite as well on macos+float32 (maybe a different BLAS implementation?), which the CI is showing us.

I'll see what's the best way to fix that with @frostedoyster

@CompRhys
Copy link
Member

I might have registered as luthaf instead of Luthaf, but most of the github-related things accept either. Trying to register again gives me the Record of your CLA already exists error.

it's been updated it to be case insensitive, the next push to this branch should now add the signed label back

@cla-bot cla-bot bot added the cla-signed Contributor license agreement signed label Apr 12, 2025
Copy link
Member

@CompRhys CompRhys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Fixed the docs issue, c.f. step 7 docs/dev/add_model.md.
  • Fixed my incorrect rattling code to only use weibull for scale.

LGTM.

@frostedoyster
Copy link
Contributor Author

Thanks for the help! Just one last comment: you might want to update this line in the top-level readme to reflect all the new models that you've added recently

Support for MACE, Fairchem, and SevenNet MLIP models with more in progress

Copy link
Collaborator

@orionarcher orionarcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I updated the README as suggested. Thanks for the contribution! I'll merge when tests are passing.

@orionarcher orionarcher merged commit 8437e6d into TorchSim:main Apr 13, 2025
88 of 89 checks passed
orionarcher added a commit that referenced this pull request Apr 13, 2025
@orionarcher
Copy link
Collaborator

orionarcher commented Apr 13, 2025

Looking back at the discussion, I am realizing I may have been a bit trigger happy in merging this, apologies. Even though the docs build worked once, it's now failing in main. Should I revert this or would you prefer to take care of any remaining issues in a new PR?

EDIT: I'd suggest just opening a new PR and we can merge that in or make more changes as needed, sorry for confusion.

orionarcher added a commit that referenced this pull request Apr 13, 2025
@orionarcher orionarcher mentioned this pull request Apr 14, 2025
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed Contributor license agreement signed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants