Always read files in double precision #294

Merged
merged 10 commits into from
Jul 15, 2024

Changes from 8 commits
4 changes: 2 additions & 2 deletions README.rst
@@ -18,7 +18,7 @@ architecture.

What is metatrain?
##################
``metatrain`` is a command line interface (cli) to `train` and `evaluate` atomistic
``metatrain`` is a command line interface (cli) to ``train`` and ``evaluate`` atomistic
models of various architectures. It features common ``yaml`` input options to
configure training and evaluation. Trained models are exported as standalone files that
can be used directly in various molecular dynamics (MD) engines (e.g. ``LAMMPS``,
@@ -30,7 +30,7 @@ that can be connected to an MD engine. Any custom architecture compatible with
TorchScript_ can be integrated in ``metatrain``, gaining automatic access to a training
and evaluation interface, as well as compatibility with various MD engines.

Note: ``metatrain`` does not provide mathematical functionalities `per se` but relies on
Note: ``metatrain`` does not provide mathematical functionalities *per se* but relies on
external models that implement the various architectures.

.. _TorchScript: https://pytorch.org/docs/stable/jit.html
10 changes: 5 additions & 5 deletions docs/src/architectures/pet.rst
@@ -152,8 +152,8 @@ training dataset. All default values are given by atomic versions for better
transferability across various datasets.

To increase the step size of the learning rate scheduler by, for example, 2 times, take
the default value for ``SCHEDULER_STEP_SIZE_ATOMIC`` from the default_hypers and specify
a value that's twice as large.
the default value for ``SCHEDULER_STEP_SIZE_ATOMIC`` from the default hypers and
specify a value that's twice as large.

It is worth noting that the stopping criterion of PET is either exceeding the maximum
number of epochs (specified by ``EPOCH_NUM`` or ``EPOCH_NUM_ATOMIC``) or exceeding the
@@ -168,11 +168,11 @@ probability of achieving the best accuracy on a typical moderate-sized dataset.
result, some default hyperparameters might be excessive, meaning they could be adjusted
to significantly increase the model's speed with minimal impact on accuracy. For
practical use, especially when conducting massive calculations where model speed is
crucial, it may be beneficial to set ``N_TRANS_LAYERS`` to `2` instead of the default
value of `3`. The ``N_TRANS_LAYERS`` hyperparameter controls the number of transformer
crucial, it may be beneficial to set ``N_TRANS_LAYERS`` to ``2`` instead of the default
value of ``3``. The ``N_TRANS_LAYERS`` hyperparameter controls the number of transformer
layers in each message-passing block (see more details in the `PET paper
<https://arxiv.org/abs/2305.19302>`_). This adjustment would result in a model that is
about `1.5` times more lightweight and faster, with an expected minimal deterioration in
about *1.5 times* more lightweight and faster, with an expected minimal deterioration in
accuracy.
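
For illustration, such an override could be expressed in ``options.yaml`` roughly as
follows (a sketch only: the numeric value and the exact nesting of the keys should be
taken from the architecture's default hypers file):

.. code-block:: yaml

    architecture:
      training:
        SCHEDULER_STEP_SIZE_ATOMIC: 10000  # e.g. twice the default value
      model:
        N_TRANS_LAYERS: 2  # default is 3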

Architecture Hyperparameters
6 changes: 3 additions & 3 deletions docs/src/dev-docs/architecture-life-cycle.rst
@@ -5,15 +5,15 @@ Life Cycle of an Architecture

.. TODO: Maybe add a flowchart later

Architectures in `metatrain` undergo different stages based on their
Architectures in ``metatrain`` undergo different stages based on their
development/functionality level and maintenance status. We distinguish three distinct
stages: **experimental**, **stable**, and **deprecated**. Typically, an architecture
starts as experimental, advances to stable, and eventually becomes deprecated before
removal if maintenance is no longer feasible.

.. note::
The development and maintenance of an architecture must be fully undertaken by the
architecture's authors or maintainers. The core developers of `metatrain`
architecture's authors or maintainers. The core developers of ``metatrain``
provide infrastructure and implementation support but are not responsible for the
architecture's internal functionality or any issues that may arise therein.

@@ -47,7 +47,7 @@ satisfied:
2. Comprehensive architecture documentation including a schema for verifying the
architecture's hyperparameters.
3. If an architecture has external dependencies, all must be publicly available on PyPI.
4. Adherence to the standard output infrastructure of `metatrain`, including
4. Adherence to the standard output infrastructure of ``metatrain``, including
logging and model save locations.

Deprecated Architectures
2 changes: 1 addition & 1 deletion docs/src/dev-docs/cli/index.rst
@@ -12,7 +12,7 @@ the ``eval`` and the ``export`` functions of ``metatrain``.
export

We provide a custom formatter class for formatting the help message of the
`argparse` package.
``argparse`` package.

.. toctree::
:maxdepth: 1
2 changes: 1 addition & 1 deletion docs/src/dev-docs/index.rst
@@ -1,7 +1,7 @@
Developer documentation
=======================

This is a collection of documentation for developers of the `metatrain` package.
This is a collection of documentation for developers of the ``metatrain`` package.
It includes documentation on how to add a new model, as well as the API of the utils
module.

33 changes: 18 additions & 15 deletions docs/src/dev-docs/new-architecture.rst
@@ -4,13 +4,14 @@ Adding a new architecture
=========================

This page describes the required classes and files necessary for adding a new
architecture to `metatrain` as experimental or stable architecture as described on the
architecture to ``metatrain`` as an experimental or stable architecture as described on
:ref:`architecture-life-cycle` page. For **examples** refer to the already existing
architectures inside the source tree.

To work with `metatrain` any architecture has to follow the same public API to be called
correctly within the :py:func:`metatrain.cli.train` function to process the user's
options. In brief, the core of the ``train`` function looks similar to these lines
To work with ``metatrain``, any architecture has to follow the same public API to be
called correctly within the :py:func:`metatrain.cli.train` function to process the
user's options. In brief, the core of the ``train`` function looks similar to these
lines

.. code-block:: python

@@ -30,6 +31,7 @@ options. In brief, the core of the ``train`` function looks similar to these lines

trainer.train(
model=model,
dtype=dtype,
devices=[],
train_datasets=[],
val_datasets=[],
@@ -75,8 +77,8 @@ requirements to be stable. The usual structure of architecture looks as

.. note::
A new architecture doesn't have to be registered somewhere in the file tree of
`metatrain`. Once a new architecture folder with the required files is created
`metatrain` will include the architecture automatically.
``metatrain``. Once a new architecture folder with the required files is created
``metatrain`` will include the architecture automatically.

Model class (``model.py``)
--------------------------
@@ -118,8 +120,8 @@ Note that the ``ModelInterface`` does not necessarily inherit from
:py:class:`torch.nn.Module` since training can be performed in any way.
``__supported_devices__`` and ``__supported_dtypes__`` can be defined to set the
capabilities of the model. These two lists should be sorted in order of preference since
`metatrain` will use these to determine, based on the user request and
machines' availability, the optimal `dtype` and `device` for training.
``metatrain`` will use these to determine, based on the user request and
machines' availability, the optimal ``dtype`` and ``device`` for training.
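
A minimal sketch of how these attributes might be declared (the specific entries are
examples, not requirements):

.. code-block:: python

    import torch


    class ModelInterface(torch.nn.Module):
        # both lists are ordered by preference: metatrain picks the first
        # entry compatible with the user's request and the available hardware
        __supported_devices__ = ["cuda", "cpu"]
        __supported_dtypes__ = [torch.float64, torch.float32]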

The ``export()`` method is required to transform a trained model into a standalone file
to be used in combination with molecular dynamic engines to run simulations. We provide
@@ -141,6 +143,7 @@ methods for ``train()``, ``save_checkpoint()`` and ``load_checkpoint()``.
def train(
self,
model: ModelInterface,
dtype: torch.dtype,
devices: List[torch.device],
train_datasets: List[Union[Dataset, torch.utils.data.Subset]],
val_datasets: List[Union[Dataset, torch.utils.data.Subset]],
@@ -155,7 +158,7 @@ methods for ``train()``, ``save_checkpoint()`` and ``load_checkpoint()``.
) -> "TrainerInterface":
pass

The format of checkpoints is not defined by `metatrain` and can be any format that
The format of checkpoints is not defined by ``metatrain`` and can be any format that
can be loaded by the trainer (to restart training) and by the model (to export the
checkpoint).
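
For example, a trainer is free to use plain ``torch.save``/``torch.load`` for its
checkpoints. A minimal sketch, with illustrative fields that are not prescribed by
``metatrain``:

.. code-block:: python

    import torch


    def save_checkpoint(model, optimizer, epoch, path):
        # any format works, as long as the trainer and the model can read it back
        torch.save(
            {
                "model_state": model.state_dict(),
                "optimizer_state": optimizer.state_dict(),
                "epoch": epoch,
            },
            path,
        )


    def load_checkpoint(path):
        return torch.load(path)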

@@ -164,15 +167,15 @@ Init file (``__init__.py``)
The names of the ``ModelInterface`` and the ``TrainerInterface`` are free to choose but
should be linked to constants in the ``__init__.py`` of each architecture. On top of
these two constants the ``__init__.py`` must contain constants for the original
`__authors__` and current `__maintainers__` of the architecture.
``__authors__`` and current ``__maintainers__`` of the architecture.

.. code-block:: python

from .model import CustomSOTAModel
from .trainer import Trainer
from .model import ModelInterface
from .trainer import TrainerInterface

__model__ = CustomSOTAModel
__trainer__ = Trainer
__model__ = ModelInterface
__trainer__ = TrainerInterface

__authors__ = [
("Jane Roe <jane.roe@myuniversity.org>", "@janeroe"),
@@ -207,7 +210,7 @@ required to improve usability. The default hypers must follow the structure
training:
...

`metatrain` will parse this file and overwrite these default hypers with the
``metatrain`` will parse this file and overwrite these default hypers with the
user-provided parameters and pass the merged ``model`` section as a Python dictionary to
the ``ModelInterface`` and the ``training`` section to the ``TrainerInterface``.
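
Conceptually, this override behaves like a recursive dictionary merge, where user
values win over defaults. A sketch of the idea, not ``metatrain``'s actual
implementation:

.. code-block:: python

    def merge_hypers(defaults: dict, user: dict) -> dict:
        """Recursively overwrite default hypers with user-provided options."""
        merged = dict(defaults)
        for key, value in user.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_hypers(merged[key], value)
            else:
                merged[key] = value
        return merged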

2 changes: 1 addition & 1 deletion docs/src/dev-docs/utils/data/readers.rst
@@ -33,7 +33,7 @@ Target type specific readers
----------------------------

:func:`metatrain.utils.data.read_targets` uses sub-functions to parse supported
target properties like the `energy` or `forces`. Currently we support reading the
target properties like the ``energy`` or ``forces``. Currently we support reading the
following target properties via

.. autofunction:: metatrain.utils.data.read_energy
12 changes: 6 additions & 6 deletions docs/src/getting-started/custom_dataset_conf.rst
@@ -4,9 +4,9 @@ Customize a Dataset Configuration
=================================
Overview
--------
The main task in setting up a training procedure with `metatrain` is to provide
The main task in setting up a training procedure with ``metatrain`` is to provide
files for training, validation, and testing datasets. Our system allows flexibility in
parsing data for training. Mandatory sections in the `options.yaml` file include:
parsing data for training. Mandatory sections in the ``options.yaml`` file include:

- ``training_set``
- ``test_set``
@@ -78,16 +78,16 @@ A single string in this section automatically expands, using the string as the

.. note::

`metatrain` does not convert units during training or evaluation. Units are
``metatrain`` does not convert units during training or evaluation. Units are
only required if the model should be used to run MD simulations.

Targets Section
^^^^^^^^^^^^^^^
Allows defining multiple target sections, each with a unique name.

- Commonly, a section named ``energy`` should be defined, which is essential for running
molecular dynamics simulations. For the ``energy`` section gradients like `forces` and
`stress` are enabled by default.
molecular dynamics simulations. For the ``energy`` section gradients like ``forces``
and ``stress`` are enabled by default.
- Other target sections can also be defined, as long as they are prefixed by ``mtt::``.
For example, ``mtt::free_energy``. In general, all targets that are not standard
outputs of ``metatensor.torch.atomistic`` (see
@@ -137,7 +137,7 @@ without them.
Multiple Datasets
-----------------
For some applications, it is required to provide more than one dataset for model
training. `metatrain` supports stacking several datasets together using the
training. ``metatrain`` supports stacking several datasets together using the
``YAML`` list syntax, which consists of lines beginning at the same indentation level
starting with a ``"- "`` (a dash and a space).
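
A hypothetical fragment stacking two datasets could look like this (file names and
target entries are placeholders):

.. code-block:: yaml

    training_set:
      - systems:
          read_from: dataset_1.xyz
        targets:
          energy:
            key: energy
      - systems:
          read_from: dataset_2.xyz
        targets:
          energy:
            key: energy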

18 changes: 9 additions & 9 deletions docs/src/getting-started/usage.rst
@@ -11,20 +11,20 @@ registered via the abbreviation ``mtt`` to your command line. The general help of

mtt --help

We now demonstrate how to `train` and `evaluate` a model from the command line. For this
example we use the :ref:`architecture-soap-bpnn` architecture and a subset of the `QM9
dataset <https://paperswithcode.com/dataset/qm9>`_. You can obtain the reduced dataset
from our :download:`website <../../static/qm9/qm9_reduced_100.xyz>`.
We now demonstrate how to ``train`` and ``evaluate`` a model from the command line. For
this example we use the :ref:`architecture-soap-bpnn` architecture and a subset of the
`QM9 dataset <https://paperswithcode.com/dataset/qm9>`_. You can obtain the reduced
dataset from our :download:`website <../../static/qm9/qm9_reduced_100.xyz>`.

Training
########

To train models, `metatrain` uses a dynamic override strategy for your training
To train models, ``metatrain`` uses a dynamic override strategy for your training
options. We allow dynamic composition and overriding of the architecture's default
options with your custom ``options.yaml`` or even with command line override grammar. For
reference and reproducibility purposes `metatrain` always writes the fully
reference and reproducibility purposes ``metatrain`` always writes the fully
expanded options, including any overrides, to ``options_restart.yaml``. The restart
options file is written into a subfolder named with the current `date` and `time` inside
options file is written into a subfolder named with the current *date* and *time* inside
the ``output`` directory of your current training run.

The sub-command to start a model training is
@@ -45,7 +45,7 @@ training using the default hyperparameters of a SOAP BPNN model
:language: yaml

For each training run a new output directory in the format
``output/YYYY-MM-DD/HH-MM-SS`` based on the current `date` and `time` is created. We use
``output/YYYY-MM-DD/HH-MM-SS`` based on the current *date* and *time* is created. We use
this output directory to store checkpoints, the ``train.log`` log file, as well as the
restart ``options_restart.yaml`` file. To start the training create an ``options.yaml``
file in the current directory and type
@@ -64,7 +64,7 @@ The sub-command to evaluate an already trained model is

mtt eval

Besides the trained `model`, you will also have to provide a file containing the
Besides the trained ``model``, you will also have to provide a file containing the
system and possible target values for evaluation. The syntax of this ``eval.yaml``
is exactly the same as for a dataset in the ``options.yaml`` file.
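
A minimal ``eval.yaml`` could therefore look like this (file name and options are
illustrative):

.. code-block:: yaml

    systems:
      read_from: qm9_reduced_100.xyz
      length_unit: angstrom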

6 changes: 3 additions & 3 deletions examples/programmatic/llpr/llpr.py
@@ -51,7 +51,7 @@
from metatrain.utils.neighbor_lists import get_system_with_neighbor_lists # noqa: E402


qm9_systems = read_systems("qm9_reduced_100.xyz", dtype=torch.float64)
qm9_systems = read_systems("qm9_reduced_100.xyz")

target_config = {
"energy": {
@@ -65,7 +65,7 @@
"virial": False,
},
}
targets, _ = read_targets(target_config, dtype=torch.float64)
targets, _ = read_targets(target_config)

requested_neighbor_lists = model.requested_neighbor_lists()
qm9_systems = [
@@ -77,7 +77,7 @@
# We also load a single ethanol molecule on which we will compute properties.
# This system is loaded without targets, as we are only interested in the LPR
# values.
ethanol_system = read_systems("ethanol_reduced_100.xyz", dtype=torch.float64)[0]
ethanol_system = read_systems("ethanol_reduced_100.xyz")[0]
ethanol_system = get_system_with_neighbor_lists(
ethanol_system, requested_neighbor_lists
)
7 changes: 7 additions & 0 deletions pyproject.toml
@@ -96,6 +96,9 @@ source = [
".tox/*/lib/python*/site-packages/metatrain"
]

[tool.black]
exclude = 'docs/src/examples'

[tool.isort]
skip = "__init__.py"
profile = "black"
@@ -106,4 +109,8 @@ lines_after_imports = 2
known_first_party = "metatrain"

[tool.mypy]
exclude = [
"docs/src/examples"
]
follow_imports = 'skip'
ignore_missing_imports = true
18 changes: 8 additions & 10 deletions src/metatrain/cli/eval.py
@@ -167,8 +167,10 @@ def _eval_targets(
system = sample["system"]
get_system_with_neighbor_lists(system, model.requested_neighbor_lists())

# Infer the device from the model
device = next(itertools.chain(model.parameters(), model.buffers())).device
# Infer the device and dtype from the model
model_tensor = next(itertools.chain(model.parameters(), model.buffers()))
dtype = model_tensor.dtype
device = model_tensor.device

# Create a dataloader
dataloader = torch.utils.data.DataLoader(
@@ -188,9 +190,10 @@
# Evaluate the model
for batch in dataloader:
systems, batch_targets = batch
systems = [system.to(device=device) for system in systems]
systems = [system.to(dtype=dtype, device=device) for system in systems]
batch_targets = {
key: value.to(device=device) for key, value in batch_targets.items()
key: value.to(dtype=dtype, device=device)
for key, value in batch_targets.items()
}
batch_predictions = evaluate_model(model, systems, options, is_training=False)
batch_predictions = average_by_num_atoms(
@@ -238,10 +241,6 @@ def eval_model(
"""
logger.info("Setting up evaluation set.")

# TODO: once https://github.com/lab-cosmo/metatensor/pull/551 is merged and released
# use capabilities instead of this workaround
dtype = next(model.parameters()).dtype

if isinstance(output, str):
output = Path(output)

@@ -258,13 +257,12 @@
eval_systems = read_systems(
filename=options["systems"]["read_from"],
reader=options["systems"]["reader"],
dtype=dtype,
)

if hasattr(options, "targets"):
# in this case, we only evaluate the targets specified in the options
# and we calculate RMSEs
eval_targets, eval_info_dict = read_targets(options["targets"], dtype=dtype)
eval_targets, eval_info_dict = read_targets(options["targets"])
else:
# in this case, we have no targets: we evaluate everything
# (but we don't/can't calculate RMSEs)