DScribe is a Python package for transforming atomic structures into fixed-size numerical fingerprints. These fingerprints are often called "descriptors" and can be used in tasks such as machine learning, visualization, and similarity analysis.
Homepage
For more details and tutorials, visit the homepage at: https://singroup.github.io/dscribe/
Quick Example
import numpy as np
from ase.build import molecule
from dscribe.descriptors import SOAP
from dscribe.descriptors import CoulombMatrix
# Define atomic structures
samples = [molecule("H2O"), molecule("NO2"), molecule("CO2")]
# Set up descriptors
cm_desc = CoulombMatrix(n_atoms_max=3, permutation="sorted_l2")
soap_desc = SOAP(species=["C", "H", "O", "N"], rcut=5, nmax=8, lmax=6, crossover=True)
# Create descriptors as numpy arrays or sparse arrays
water = samples[0]
coulomb_matrix = cm_desc.create(water)
soap = soap_desc.create(water, positions=[0])
# Also works on multiple systems and can be parallelized across processes
coulomb_matrices = cm_desc.create(samples)
coulomb_matrices = cm_desc.create(samples, n_jobs=3)
oxygen_indices = [np.where(x.get_atomic_numbers() == 8)[0] for x in samples]
oxygen_soap = soap_desc.create(samples, oxygen_indices, n_jobs=3)
# Some descriptors also allow calculating derivatives with respect to atomic
# positions
der, des = soap_desc.derivatives(samples, method="auto", return_descriptor=True)
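To make the `CoulombMatrix(n_atoms_max=..., permutation="sorted_l2")` settings above more concrete, here is a minimal pure-Python sketch of the textbook Coulomb matrix (diagonal `0.5 * Z**2.4`, off-diagonal `Z_i * Z_j / |R_i - R_j|`). It is not DScribe's implementation: the water coordinates are illustrative values, the helper names `coulomb_matrix` and `sort_by_row_norm` are hypothetical, and both the descending sort order and the handling of units are assumptions. It shows how zero-padding to `n_atoms_max` yields a fingerprint whose length does not depend on the number of atoms.

```python
import math

# Illustrative water geometry in angstroms (assumed values, not taken from ASE).
atomic_numbers = [8, 1, 1]
positions = [
    (0.000, 0.000, 0.119),
    (0.000, 0.763, -0.477),
    (0.000, -0.763, -0.477),
]


def coulomb_matrix(numbers, coords, n_atoms_max):
    """Textbook Coulomb matrix, zero-padded to n_atoms_max x n_atoms_max."""
    n = len(numbers)
    if n > n_atoms_max:
        raise ValueError("structure has more atoms than n_atoms_max")
    matrix = [[0.0] * n_atoms_max for _ in range(n_atoms_max)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: fitted approximation of the atomic self-energy.
                matrix[i][j] = 0.5 * numbers[i] ** 2.4
            else:
                # Off-diagonal: Coulomb repulsion between nuclei i and j.
                matrix[i][j] = numbers[i] * numbers[j] / math.dist(coords[i], coords[j])
    return matrix


def sort_by_row_norm(matrix):
    """Reorder rows and columns together by descending L2 norm of each row."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda k: norms[k], reverse=True)
    return [[matrix[a][b] for b in order] for a in order]


# Pad to 4 atoms so that the effect of n_atoms_max is visible for a 3-atom molecule.
cm = sort_by_row_norm(coulomb_matrix(atomic_numbers, positions, n_atoms_max=4))
fingerprint = [value for row in cm for value in row]  # flatten to 1D
print(len(fingerprint))  # 16, i.e. n_atoms_max ** 2
```

Sorting rows and columns by norm makes the result independent of the order in which atoms are listed, and the padding rows (norm zero) always end up last.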
Currently implemented descriptors
Descriptor | Spectrum | Derivatives |
---|---|---|
Coulomb matrix | ||
Sine matrix | ||
Ewald matrix | ||
Atom-centered Symmetry Functions (ACSF) | ||
Smooth Overlap of Atomic Positions (SOAP) | ||
Many-body Tensor Representation (MBTR) | ||
Local Many-body Tensor Representation (LMBTR) | ||
Valle-Oganov descriptor | ||
Installation
In-depth installation instructions can be found in the documentation, but in short:
pip
pip install dscribe
conda
conda install -c conda-forge dscribe
From source
git clone https://github.com/SINGROUP/dscribe.git
cd dscribe
git submodule update --init
pip install .
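After any of the install routes above, one way to check that the package is visible to the current Python environment is to query its installed metadata. This is a small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution name `dscribe` is assumed to match the name used in the `pip` command.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("dscribe")
print("dscribe version:", version if version else "not installed")
```

If this reports "not installed", the package was probably installed into a different environment than the one running the script.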