
Added complete transform module for graph generation and #153

Merged
merged 11 commits on Jan 29, 2024
119 changes: 28 additions & 91 deletions README.md
@@ -1,105 +1,42 @@
# KIM-based Learning-Integrated Fitting Framework (KLIFF)
# KIM-based Learning-Integrated Fitting Framework (KLIFF) version 1.0

[![Build Status](https://travis-ci.com/openkim/kliff.svg?branch=master)](https://travis-ci.com/openkim/kliff)
[![Python package](https://github.com/openkim/kliff/workflows/Python%20package/badge.svg)](https://github.com/openkim/kliff/actions)
[![Documentation Status](https://readthedocs.org/projects/kliff/badge/?version=latest)](https://kliff.readthedocs.io/en/latest/?badge=latest)
[![Anaconda-Server Badge](https://img.shields.io/conda/vn/conda-forge/kliff.svg)](https://anaconda.org/conda-forge/kliff)
[![PyPI](https://img.shields.io/pypi/v/kliff.svg)](https://pypi.python.org/pypi/kliff)
[//]: # ([![Build Status](https://travis-ci.com/openkim/kliff.svg?branch=master)](https://travis-ci.com/openkim/kliff))

### Documentation at: <https://kliff.readthedocs.io>
[//]: # ([![Python package]&#40;https://github.com/openkim/kliff/workflows/Python%20package/badge.svg&#41;]&#40;https://github.com/openkim/kliff/actions&#41;)

KLIFF is an interatomic potential fitting package that can be used to fit
physics-motivated (PM) potentials, as well as machine learning potentials such
as the neural network (NN) models.
[//]: # ([![Documentation Status]&#40;https://readthedocs.org/projects/kliff/badge/?version=latest&#41;]&#40;https://kliff.readthedocs.io/en/latest/?badge=latest&#41;)

[//]: # ([![Anaconda-Server Badge]&#40;https://img.shields.io/conda/vn/conda-forge/kliff.svg&#41;]&#40;https://anaconda.org/conda-forge/kliff&#41;)

## Installation
[//]: # ([![PyPI]&#40;https://img.shields.io/pypi/v/kliff.svg&#41;]&#40;https://pypi.python.org/pypi/kliff&#41;)

### Using conda
```sh
conda install -c conda-forge kliff
```

### Using pip
```sh
pip install kliff
```

### From source
```
git clone https://github.com/openkim/kliff
pip install ./kliff
```

To train a KIM model, `kim-api` and `kimpy` are needed; to train a machine learning
model, `PyTorch` is needed. For more information on installing these packages, see
[Installation](https://kliff.readthedocs.io/en/latest/installation.html).

## A quick example to train a neural network potential

```python
from kliff import nn
from kliff.calculators import CalculatorTorch
from kliff.descriptors import SymmetryFunction
from kliff.dataset import Dataset
from kliff.models import NeuralNetwork
from kliff.loss import Loss
from kliff.utils import download_dataset
**This branch contains the ongoing development of KLIFF v1, which includes significant
enhancements for ML models. Please note that this branch is under active development and is
not guaranteed to work at present.**

# Descriptor to featurize atomic configurations
descriptor = SymmetryFunction(
    cut_name="cos", cut_dists={"Si-Si": 5.0}, hyperparams="set51", normalize=True
)
KLIFF is a framework for developing physics-based and machine learning interatomic potentials.
It is undergoing major upgrades to support machine learning models. The current version
has limited support for machine learning models, restricted to descriptor-based dense
neural network models. The upcoming version 1.0 will also include support for graph-based
models and provide tools for implementing more generic ML frameworks.

# Fully-connected neural network model with 2 hidden layers, each with 10 units
N1 = 10
N2 = 10
model = NeuralNetwork(descriptor)
model.add_layers(
    # first hidden layer
    nn.Linear(descriptor.get_size(), N1),
    nn.Tanh(),
    # second hidden layer
    nn.Linear(N1, N2),
    nn.Tanh(),
    # output layer
    nn.Linear(N2, 1),
)
To use this branch for evaluation, please install it in the following way:

# Training set (dataset will be downloaded from:
# https://github.com/openkim/kliff/blob/master/examples/Si_training_set.tar.gz)
dataset_path = download_dataset(dataset_name="Si_training_set")
dataset_path = dataset_path.joinpath("varying_alat")
train_set = Dataset(dataset_path)
configs = train_set.get_configs()

# Set up calculator to compute energy and forces for atomic configurations in the
# training set using the neural network model
calc = CalculatorTorch(model, gpu=False)
calc.create(configs)

# Define a loss function and train the model by minimizing the loss
loss = Loss(calc)
result = loss.minimize(method="Adam", num_epochs=10, batch_size=100, lr=0.001)

# Write trained model as a KIM model to be used in other codes such as LAMMPS and ASE
model.write_kim_model()
```

```sh
pip install git+https://github.com/openkim/kliff.git@v1
```

Detailed explanation and more tutorial examples can be found in the
[documentation](https://kliff.readthedocs.io/en/latest/tutorials.html).


## Why you want to use KLIFF (or not use it)

- Interacting seamlessly with [KIM](https://openkim.org); the fitted model can
  be readily used in simulation codes such as LAMMPS and ASE via the `KIM API`
- Creating mixed PM and NN models
- High-level API: fit a model with a few lines of code
- Low-level API for creating complex NN models
- Parallel execution
- [PyTorch](https://pytorch.org) backend for NN (including GPU training)

## Upcoming changes

- [ ] Functional model interface
- [ ] Updated Parameter interface
- [ ] Support for Libdescriptor library
- [ ] Support for graph-based models
- [ ] Property transforms
- [ ] Torch-based trainers for ML models
- [ ] Support for PyTorch Lightning trainer
- [ ] Uncertainty quantification
- [ ] ML layer support (NequIP and MACE)

## Citing KLIFF

4 changes: 3 additions & 1 deletion kliff/models/parameter.py
@@ -93,7 +93,9 @@ def __new__(
        index: Index of the parameter in the parameter vector. Used for setting the
            parameter in KIMPY.
        opt_mask: Boolean array of the same shape as the parameter. The values
            marked ``True`` are optimized, and ``False`` are not optimized. A single
            boolean value can also be provided, in which case it is applied to
            all components of the parameter.

    Returns:
        A new instance of Parameter.
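The single-boolean broadcast described in the docstring above can be sketched as follows. This is a plain-Python illustration; `broadcast_opt_mask` is a hypothetical helper, not part of the `Parameter` API:

```python
def broadcast_opt_mask(value, opt_mask):
    # Expand a scalar boolean into one flag per parameter component;
    # otherwise require an explicit mask of matching length.
    if isinstance(opt_mask, bool):
        return [opt_mask] * len(value)
    if len(opt_mask) != len(value):
        raise ValueError("opt_mask must have the same shape as the parameter")
    return list(opt_mask)

mask = broadcast_opt_mask([1.0, 2.0, 3.0], True)  # [True, True, True]
```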
13 changes: 13 additions & 0 deletions kliff/transforms/configuration_transforms/__init__.py
@@ -0,0 +1,13 @@
from kliff.utils import torch_geometric_available

from .configuration_transform import ConfigurationTransform
from .descriptors import Descriptor, show_available_descriptors
from .graphs import *

__all__ = [
    "ConfigurationTransform",
    "Descriptor",
    "KLIFFTorchGraphGenerator",
    "KLIFFTorchGraph",
    "show_available_descriptors",
]
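The import of `torch_geometric_available` above suggests that the graph exports are gated on an optional dependency. A generic sketch of that pattern follows; the helper name mirrors KLIFF's, but this implementation is an assumption, not KLIFF's actual code:

```python
import importlib.util

def torch_geometric_available() -> bool:
    """Return True when the optional torch_geometric package is importable."""
    return importlib.util.find_spec("torch_geometric") is not None

# Gate optional re-exports on the dependency check, so descriptor-based
# transforms keep working when torch_geometric is not installed.
if torch_geometric_available():
    pass  # e.g. re-export graph generator classes here
```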
@@ -0,0 +1,85 @@
from typing import TYPE_CHECKING, List, Union

if TYPE_CHECKING:
    from kliff.dataset import Configuration


class ConfigurationTransform:
    """
    A configuration transform is a function that maps a configuration to a
    "fingerprint". The fingerprint can be any object that represents the
    configuration; no restrictions or checks are imposed on it. For example,
    current configuration transforms include graph representations of the
    configuration and descriptors.
    """

    def __init__(self, copy_to_config: bool = False):
        self._implicit_fingerprint_copying = copy_to_config

    def forward(self, configuration: "Configuration"):
        """
        Map a configuration to a fingerprint. Also handle the implicit copying of
        the fingerprint to the configuration.

        Args:
            configuration: Instance of :class:`~kliff.dataset.Configuration` for
                which the fingerprint is to be generated.

        Returns:
            Fingerprint of the configuration.
        """
        raise NotImplementedError

    def __call__(self, configuration: "Configuration"):
        fingerprint = self.forward(configuration)
        if self.copy_to_config:
            configuration.fingerprint(fingerprint)
        return fingerprint

    def inverse(self, *args, **kwargs):
        """
        Inverse mapping of the transform. This is not implemented for any of the
        transforms, but is reserved for future use.
        """
        raise NotImplementedError(
            "Do you mean `backward`?\n"
            "None of the implemented transforms support inverse mapping.\n"
            "For computing the Jacobian-vector product use the `backward` function."
        )

    def transform(self, configuration: "Configuration"):
        return self(configuration)

    @property
    def copy_to_config(self):
        return self._implicit_fingerprint_copying

    @copy_to_config.setter
    def copy_to_config(self, value: bool):
        self._implicit_fingerprint_copying = value

    def collate_fn(self, config_list: List["Configuration"]):
        """
        Collate a list of configurations into a list of their transformed
        fingerprints. This is useful for batch processing.

        Args:
            config_list: List of configurations.
        """
        transforms_list = []
        for conf in config_list:
            transform = self(conf)
            transforms_list.append(transform)
        return transforms_list

    def export_kim_model(self, filename: str, modelname: str):
        """
        Save the configuration transform to a file.

        Args:
            filename: Name of the file to save the transform to.
            modelname: Name of the model to save.
        """
        raise NotImplementedError


# TODO: should neighbor lists be a transform? They fit the definition, as graphs do.
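Concrete transforms subclass `ConfigurationTransform` and override `forward`. A minimal self-contained sketch is below; the stub base class, `NumAtomsTransform`, and `FakeConfig` are illustrative stand-ins, not KLIFF API:

```python
# Minimal stand-in mirroring the base class above, so this sketch runs on its own.
class ConfigurationTransform:
    def __init__(self, copy_to_config: bool = False):
        self.copy_to_config = copy_to_config

    def forward(self, configuration):
        raise NotImplementedError

    def __call__(self, configuration):
        return self.forward(configuration)

# Hypothetical subclass: the only required override is `forward`, which maps
# a configuration to its fingerprint.
class NumAtomsTransform(ConfigurationTransform):
    def forward(self, configuration):
        return len(configuration.species)  # trivial fingerprint: atom count

class FakeConfig:
    """Illustrative stand-in for kliff.dataset.Configuration."""
    def __init__(self, species):
        self.species = species

fp = NumAtomsTransform()(FakeConfig(["Si", "Si", "Si", "Si"]))  # 4
```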
130 changes: 130 additions & 0 deletions kliff/transforms/configuration_transforms/default_hyperparams.py
@@ -0,0 +1,130 @@
def symmetry_functions_set51() -> dict:
    r"""Hyperparameters for symmetry functions, as discussed in:
    Nongnuch Artrith and Jörg Behler. "High-dimensional neural network potentials for
    metal surfaces: A prototype study for copper." Physical Review B 85, no. 4 (2012):
    045439.
    """
    return {
        "g2": [
            {"eta": 0.0035710676725828126, "Rs": 0.0},
            {"eta": 0.03571067672582813, "Rs": 0.0},
            {"eta": 0.07142135345165626, "Rs": 0.0},
            {"eta": 0.12498736854039845, "Rs": 0.0},
            {"eta": 0.21426406035496876, "Rs": 0.0},
            {"eta": 0.3571067672582813, "Rs": 0.0},
            {"eta": 0.7142135345165626, "Rs": 0.0},
            {"eta": 1.428427069033125, "Rs": 0.0},
        ],
        "g4": [
            {"zeta": 1, "lambda": -1, "eta": 0.00035710676725828126},
            {"zeta": 1, "lambda": 1, "eta": 0.00035710676725828126},
            {"zeta": 2, "lambda": -1, "eta": 0.00035710676725828126},
            {"zeta": 2, "lambda": 1, "eta": 0.00035710676725828126},
            {"zeta": 1, "lambda": -1, "eta": 0.010713203017748437},
            {"zeta": 1, "lambda": 1, "eta": 0.010713203017748437},
            {"zeta": 2, "lambda": -1, "eta": 0.010713203017748437},
            {"zeta": 2, "lambda": 1, "eta": 0.010713203017748437},
            {"zeta": 1, "lambda": -1, "eta": 0.0285685413806625},
            {"zeta": 1, "lambda": 1, "eta": 0.0285685413806625},
            {"zeta": 2, "lambda": -1, "eta": 0.0285685413806625},
            {"zeta": 2, "lambda": 1, "eta": 0.0285685413806625},
            {"zeta": 1, "lambda": -1, "eta": 0.05356601508874219},
            {"zeta": 1, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 2, "lambda": -1, "eta": 0.05356601508874219},
            {"zeta": 2, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 4, "lambda": -1, "eta": 0.05356601508874219},
            {"zeta": 4, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 16, "lambda": -1, "eta": 0.05356601508874219},
            {"zeta": 16, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 1, "lambda": -1, "eta": 0.08927669181457032},
            {"zeta": 1, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 2, "lambda": -1, "eta": 0.08927669181457032},
            {"zeta": 2, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 4, "lambda": -1, "eta": 0.08927669181457032},
            {"zeta": 4, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 16, "lambda": -1, "eta": 0.08927669181457032},
            {"zeta": 16, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 1, "lambda": -1, "eta": 0.16069804526622655},
            {"zeta": 1, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 2, "lambda": -1, "eta": 0.16069804526622655},
            {"zeta": 2, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 4, "lambda": -1, "eta": 0.16069804526622655},
            {"zeta": 4, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 16, "lambda": -1, "eta": 0.16069804526622655},
            {"zeta": 16, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 1, "lambda": -1, "eta": 0.28568541380662504},
            {"zeta": 1, "lambda": 1, "eta": 0.28568541380662504},
            {"zeta": 2, "lambda": -1, "eta": 0.28568541380662504},
            {"zeta": 2, "lambda": 1, "eta": 0.28568541380662504},
            {"zeta": 4, "lambda": -1, "eta": 0.28568541380662504},
            {"zeta": 4, "lambda": 1, "eta": 0.28568541380662504},
            {"zeta": 16, "lambda": 1, "eta": 0.28568541380662504},
        ],
    }


def symmetry_functions_set30() -> dict:
    r"""Hyperparameters for symmetry functions, as discussed in:
    Artrith, N., Hiller, B. and Behler, J., 2013. Neural network potentials for metals
    and oxides – first applications to copper clusters at zinc oxide. physica status
    solidi (b), 250(6), pp.1191-1203.
    """
    return {
        "g2": [
            {"eta": 0.003213960905324531, "Rs": 0.0},
            {"eta": 0.03571067672582813, "Rs": 0.0},
            {"eta": 0.07142135345165626, "Rs": 0.0},
            {"eta": 0.12498736854039845, "Rs": 0.0},
            {"eta": 0.21426406035496876, "Rs": 0.0},
            {"eta": 0.3571067672582813, "Rs": 0.0},
            {"eta": 0.7142135345165626, "Rs": 0.0},
            {"eta": 1.428427069033125, "Rs": 0.0},
        ],
        "g4": [
            {"zeta": 1, "lambda": -1, "eta": 0.00035710676725828126},
            {"zeta": 1, "lambda": 1, "eta": 0.00035710676725828126},
            {"zeta": 2, "lambda": -1, "eta": 0.00035710676725828126},
            {"zeta": 2, "lambda": 1, "eta": 0.00035710676725828126},
            {"zeta": 1, "lambda": -1, "eta": 0.010713203017748437},
            {"zeta": 1, "lambda": 1, "eta": 0.010713203017748437},
            {"zeta": 2, "lambda": -1, "eta": 0.010713203017748437},
            {"zeta": 2, "lambda": 1, "eta": 0.010713203017748437},
            {"zeta": 1, "lambda": 1, "eta": 0.0285685413806625},
            {"zeta": 2, "lambda": 1, "eta": 0.0285685413806625},
            {"zeta": 1, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 2, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 4, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 16, "lambda": 1, "eta": 0.05356601508874219},
            {"zeta": 1, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 2, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 4, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 16, "lambda": 1, "eta": 0.08927669181457032},
            {"zeta": 1, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 2, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 4, "lambda": 1, "eta": 0.16069804526622655},
            {"zeta": 16, "lambda": 1, "eta": 0.16069804526622655},
        ],
    }


def bispectrum_default() -> dict:
    return {
        "jmax": 4,
        "rfac0": 0.99363,
        "diagonalstyle": 3,
        "rmin0": 0,
        "switch_flag": 1,
        "bzero_flag": 0,
        "use_shared_array": False,
        "weights": None,
    }


def soap_default() -> dict:
    return {
        "n_max": 4,
        "l_max": 4,
        "cutoff": 4.0,
        "radial_basis": "polynomial",
        "eta": 0.5,
    }
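The set names encode the total number of symmetry functions: `set51` defines 8 `g2` plus 43 `g4` parameter sets, and `set30` defines 8 plus 22. A small sketch of how such a dictionary might be sized; the helper and the example dict are illustrative, not KLIFF API:

```python
def count_symmetry_functions(hyperparams: dict) -> int:
    # One descriptor component per g2/g4 parameter set.
    return sum(len(params) for params in hyperparams.values())

# Tiny dict in the same shape as the sets above (values are placeholders).
example = {
    "g2": [{"eta": 0.0036, "Rs": 0.0}, {"eta": 0.0357, "Rs": 0.0}],
    "g4": [{"zeta": 1, "lambda": -1, "eta": 0.00036}],
}
n = count_symmetry_functions(example)  # 3
```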