
Atomistic models based on metatensor-torch #405

Merged: 17 commits into lab-cosmo:master, Dec 14, 2023

Conversation

Luthaf
Contributor

@Luthaf commented Oct 20, 2023

This PR contains all the code required to define, export, load and validate arbitrary atomistic models based on metatensor-torch. The user has to provide a TorchScript-compatible torch.nn.Module, with the following signature:

def forward(self, system: System, run_options: ModelRunOptions) -> Dict[str, TensorBlock]:
    ...

The System contains the positions, cell, and any neighbors lists requested by submodules. Submodules request them by defining a requested_neighbors_lists function (this allows e.g. rascaline to request a NL without the end module knowing about it):

def requested_neighbors_lists(self) -> List[NeighborsListOptions]:
    ...
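How such requests could be aggregated across submodules can be sketched as follows. This is a simplified stand-in, not metatensor-torch code: Module here mimics just enough of torch.nn.Module for tree traversal, and collect_requested_neighbors_lists is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class NeighborsListOptions:
    """Simplified stand-in for the metatensor-torch class of the same name."""
    cutoff: float
    full_list: bool

class Module:
    """Minimal stand-in for torch.nn.Module, just enough for tree traversal."""
    def children(self) -> List["Module"]:
        return []

def collect_requested_neighbors_lists(root: Module) -> List[NeighborsListOptions]:
    # Walk the module tree: any submodule that defines a
    # requested_neighbors_lists() method contributes its options
    # (deduplicated), without the parent module having to know about it.
    seen, result = set(), []
    stack = [root]
    while stack:
        module = stack.pop()
        if hasattr(module, "requested_neighbors_lists"):
            for options in module.requested_neighbors_lists():
                if options not in seen:
                    seen.add(options)
                    result.append(options)
        stack.extend(module.children())
    return result
```

With this pattern, a calculator buried deep inside the model can return its own NeighborsListOptions and the exporter picks them up automatically.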

The ModelRunOptions describes what the engine wants; in particular it contains a list of ModelOutput (the model's forward function can return multiple outputs, and the MD engine should request the ones it wants) and the set of selected_atoms on which to run the calculation.

When exporting a model, the user should use ModelCapabilities to declare what the model is able to do, as well as the units it uses for input and output. MetatensorAtomisticModule then performs unit conversion between what the engine provides and what the model wants on input, and between what the model provides and what the engine wants on output.
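The unit handling amounts to a scaling step on the way in and out. Here is a minimal sketch of that idea; the tables and the convert function are hand-written illustrations (values are standard physical constants, rounded), not metatensor's actual Quantity machinery.

```python
# Illustrative unit conversion of the kind described above: scale
# engine-provided inputs into model units, and model outputs back into
# engine units, through a common base unit (angstrom / eV).
LENGTH_TO_ANGSTROM = {"angstrom": 1.0, "bohr": 0.529177210903, "nm": 10.0}
ENERGY_TO_EV = {
    "ev": 1.0,
    "kj/mol": 0.0103642697,
    "kcal/mol": 0.0433641153,
    "hartree": 27.211386245988,
}

def convert(value: float, table: dict, from_unit: str, to_unit: str) -> float:
    """Convert `value` between two units through the table's base unit."""
    return value * table[from_unit.lower()] / table[to_unit.lower()]
```

This way an engine working in kJ/mol and nm can drive a model trained in eV and angstrom, with neither side knowing about the other's units.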


Still TBD:

  • Python side documentation
  • Commented example on how to use this API
  • Decide on the naming for the new metadata classes (ModelCapabilities, ModelOutput, ModelRunOptions)
  • Define a standard for metatensor metadata of some outputs: what should be the samples/components/properties names and values for the energy output, for the dipole output, …
  • Can a model provide both per-atom and per-structure versions of the same quantity? What would this look like?
  • Document the standard above
  • Record the torch extension and torch version used when saving the model
  • Add a way to profile model execution time: will be done in a later PR
  • Provide a function to connect neighbors lists distances with positions & cell in the computational graph
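The last item, connecting neighbor-list distances with positions and cell in the computational graph, rests on a simple relation that can be sketched in plain NumPy; a real implementation would use torch tensors so that autograd tracks the dependency, and the function name and array layout here are assumptions.

```python
import numpy as np

def distance_vectors(positions, cell, pairs, cell_shifts):
    """Recompute pair distance vectors from positions and cell.

    positions: (n_atoms, 3) array; cell: (3, 3) matrix with the cell
    vectors as rows; pairs: (n_pairs, 2) integer array of (i, j) atom
    indices; cell_shifts: (n_pairs, 3) integer periodic image shifts.

    Computing r_ij this way, instead of storing precomputed vectors, is
    what lets gradients of the distances flow back to both positions and
    cell in an autograd framework.
    """
    i, j = pairs[:, 0], pairs[:, 1]
    # r_ij = r_j - r_i + shifts @ cell
    return positions[j] - positions[i] + cell_shifts @ cell
```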

📚 Documentation preview 📚 https://metatensor--405.org.readthedocs.build/en/405/

@Luthaf force-pushed the atomistic-models branch 2 times, most recently from 4ae3959 to 07fea5e on October 20, 2023 16:07
Contributor

@PicoCentauri left a comment


I still have to wrap my head around this, both the syntax and the logic, but it could be very good.

I have some first specific comments on the units.

_requested_neighbors_lists: List[NeighborsListOptions]
_known_quantities: Dict[str, Quantity]

def __init__(self, module: torch.nn.Module, capabilities: ModelCapabilities):
Contributor


capabilities is something like the target? i.e. "forces", "dipole moments", "partial charges"?

Contributor Author


capabilities.outputs contains the different targets the model is able to produce. capabilities also contains other information about the model (the units used as input, the species it can handle).

Maybe we should rename this to ModelDefinition or something, and also store in there the model authors, papers to cite, date of training, etc.

Contributor


I also stumbled on this term. I thought about ModelConfig, but that term does not really capture that it includes the model outputs.

> Maybe we should rename this to ModelDefinition or something, and also store in there the model authors, papers to cite, date of training, etc.

I like ModelDefinition; this would then include a metainformation variable of type Dict[str, str]?

Contributor Author


I would prefer to give more structure to the data, and have multiple fields of type str instead of a dict. We could have an extra field if people want to store more data in there, but I would start without it for now.
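Such a structured layout could look like the sketch below; the class and field names are hypothetical, not the final metatensor API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelMetadata:
    # Well-known, structured fields instead of a bare Dict[str, str]
    name: str = ""
    authors: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)  # papers to cite
    # escape hatch for anything not covered by the fields above
    other: Dict[str, str] = field(default_factory=dict)
```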

Contributor


Maybe we could consider the metadata that is included in models from PyTorch Hub or Hugging Face:
https://pytorch.org/docs/stable/hub.html#torch.hub.load
https://huggingface.co/docs/transformers/v4.34.1/en/model_doc/auto#transformers.AutoModel.from_pretrained

I identified these parameters:

  • provider: the hub platform (e.g. "Hugging Face")
  • repo_url (e.g. "https://camembert-model.fr")
  • model_name (e.g. "camembert")
  • model_checkpoint (e.g. "camembert-large")

Contributor


It might also be useful to consider the metadata contained in ONNX models (https://onnx.ai/onnx/intro/concepts.html#metadata). I identified these as possibly useful:

  • producer_version: The version of the generating tool,
    like the version of the model or a commit ID
  • model_version: The version of the model itself, encoded in an integer.
    I think this is basically what model_checkpoint above expresses, just encoded as a number instead of a string. If you look at models from the computer vision community (https://pytorch.org/vision/stable/models.html) it also makes more sense. For the first model there, model_name would be AlexNet and the version would be AlexNet_Weights.IMAGENET1K_V1. I think that would be good enough to track models. Sometimes they use the _VX suffix to indicate that the training procedure changed a bit, but sometimes they change the name at the beginning instead. It is a mess.
  • model_license: The well-known name or URL of the license under which the model is made available.
    Not sure if this makes sense, as I would say the license should be stored in the repository the model comes from.
  • doc_string: Human-readable documentation for this model

Contributor Author


I'm not sold on the Hugging Face style metadata; it feels more related to a full repository of models than to a single one. The ONNX metadata makes a lot more sense to me.

Contributor Author


We currently collect some minimal metadata, which should be expanded with stuff like this. If you agree @agoscinski I would leave the metadata definition to a later PR?

@frostedoyster
Contributor

frostedoyster commented Nov 22, 2023

Do I understand correctly that one "System" is one structure?
How much work would it be to change the signature to the following?

def forward(self, systems: List[System], run_options: ModelRunOptions) -> Dict[str, TensorBlock]:

This would allow a single model to be trained and exported, since forward needs to take multiple structures during training. At the moment, I can only see this working if, after having trained the model, you manually convert it to a different class whose forward has the system: System signature, unless I'm missing something.
For now, we could have the interface fail if len(systems) != 1; in the future this would facilitate PIMD and other techniques where the MD engine can ask for multiple evaluations at the same time.
EDIT: Otherwise, would the current interface work for

def forward(self, systems: Union[System, List[System]], run_options: ModelRunOptions) -> Dict[str, TensorBlock]:
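The accept-either fallback suggested above could be sketched with a small wrapper; this is a plain-Python stand-in with hypothetical names, where the real code would subclass torch.nn.Module and return metatensor TensorBlock/TensorMap values.

```python
from typing import Any, Dict, List, Union

class SingleSystemWrapper:
    """Transitional wrapper exposing a list-of-systems signature on top of
    a model trained with a single-System forward."""

    def __init__(self, model):
        self.model = model

    def forward(
        self, systems: Union[Any, List[Any]], run_options: Any = None
    ) -> Dict[str, Any]:
        # accept both a bare system and a list of systems
        if not isinstance(systems, list):
            systems = [systems]
        # for now, fail when more than one system is passed
        if len(systems) != 1:
            raise ValueError("this model only supports one system per call")
        return self.model.forward(systems[0], run_options)
```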

@Luthaf
Contributor Author

Luthaf commented Nov 22, 2023

We had a couple of discussions about taking a single system or multiple ones, but I can't remember the argument for going with a single one. I agree having more than one system is quite useful here; I'll give it a go and check that everything works the same.

@Luthaf
Contributor Author

Luthaf commented Dec 11, 2023

Regarding examples: I'd prefer to leave them to a separate PR. I already have a branch, but it's going to take a lot more work, so I would rather merge this without waiting for the examples to be done.

Contributor

@PicoCentauri left a comment


I am happy.

There is not much metadata to store for these, so turning them into pure
Tensor makes the code easier to use. It will also allow replacing
rascaline.torch.System with this class.
- take multiple systems instead of a single one
- return dict of TensorMap instead of TensorBlock (for equivariant targets)
- separate outputs and selected_atoms into two arguments
This allows us to use rewritten asserts in the tests, and
get nicer error messages on test failures
@Luthaf
Contributor Author

Luthaf commented Dec 13, 2023

I found another small bug, I'll wait for CI to pass & then this should be good to go!

Also make sure to call it for neighbors list inside the ASE calculator
@Luthaf
Contributor Author

Luthaf commented Dec 13, 2023

CI is hitting a bug in CMake: https://discourse.cmake.org/t/3-28-segmentation-fault-on-macos-11-runner/9588. I'll give them a day to fix it before trying to force a different CMake version.

Version 3.28.0 has a miscompilation issue and segfaults in some cases
@Luthaf merged commit e788e67 into lab-cosmo:master on Dec 14, 2023
27 checks passed
@Luthaf deleted the atomistic-models branch on December 14, 2023 11:17

4 participants