# Descriptors

## Quip SOAP & _turbo_ SOAP

In [2]:
import phdtools.descriptors.quip as qt
from pprint import pprint
from phdtools.computes.misc import parse_config

### 1. Parsing the input files

The input files and all quippy-based descriptors are described in full in the official quippy documentation.

`QUIPtools` provides helpers to read and parse the necessary information.
quippy expects its parameters as a specifically formatted string; `QUIPtools` builds that string from a `.json` file or a `dict`.

In [3]:
json_quip_soap = '../data/quip_soap.json'
json_quip_turbo = '../data/quip_soap_turbo.json'

We can get the SOAP and turbo SOAP strings from .json files directly:

In [4]:
qt.QUIPtools(method='soap',
             descr_dict=json_quip_soap).getString

'soap cutoff=6.0 cutoff_transition_width=0.5 n_max=8 l_max=4 atom_sigma=0.5 n_Z=1 Z={3} n_species=8 species_Z={1 3 6 8 9 15 21 23} '
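The string above follows a simple pattern: the descriptor name, then space-separated `key=value` pairs, with lists written as `{a b c}`. A minimal sketch of that formatting is shown below. The helper `to_quip_string` is hypothetical and written only for illustration; it is not the `QUIPtools` implementation, and the key order is simply the one seen in the output above.

```python
def to_quip_string(method, params):
    """Join a descriptor name and its parameters into a 'key=value' string."""
    parts = [method]
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            # Sequences are written as a brace-delimited, space-separated list.
            value = "{" + " ".join(str(v) for v in value) + "}"
        parts.append(f"{key}={value}")
    return " ".join(parts)


# Parameters copied from the SOAP string printed above.
soap_params = {
    "cutoff": 6.0,
    "cutoff_transition_width": 0.5,
    "n_max": 8,
    "l_max": 4,
    "atom_sigma": 0.5,
    "n_Z": 1,
    "Z": [3],
    "n_species": 8,
    "species_Z": [1, 3, 6, 8, 9, 15, 21, 23],
}

soap_string = to_quip_string("soap", soap_params)
print(soap_string)
```

Apart from the trailing space, the printed string matches the one returned by `getString`.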

In [5]:
qt.QUIPtools(method='soap_turbo',
             descr_dict=json_quip_turbo).getString

"soap_turbo species_Z={1 3 6 8 9 15 21 23} l_max=4 n_species=8 rcut_hard=4.5 rcut_soft=3.5 basis=poly3gauss scaling_mode=polynomial radial_enhancement=1 multi={'alpha_max': 8, 'atom_sigma_r': 0.2, 'atom_sigma_t': 0.2, 'atom_sigma_r_scaling': 0.1, 'atom_sigma_t_scaling': 0.1, 'amplitude_scaling': 1.0, 'central_weight': 1.0} central_index=2 alpha_max={8 8 8 8 8 8 8 8} atom_sigma_r={0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2} atom_sigma_t={0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2} atom_sigma_r_scaling={0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1} atom_sigma_t_scaling={0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1} amplitude_scaling={1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0} central_weight={1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0}  "
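In the turbo SOAP string, each scalar in the `multi` entry reappears as a list with one value per species (for example `alpha_max={8 8 8 8 8 8 8 8}` for `n_species=8`). The sketch below reproduces that expansion. The function `expand_per_species` is a hypothetical name, and whether `QUIPtools` performs the expansion in this way is an assumption based only on the output above.

```python
def expand_per_species(multi, n_species):
    """Repeat each scalar in `multi` once per species, formatted as '{v v ...}'."""
    return {
        key: "{" + " ".join([str(value)] * n_species) + "}"
        for key, value in multi.items()
    }


# A subset of the 'multi' entry from the string above.
multi = {"alpha_max": 8, "atom_sigma_r": 0.2, "central_weight": 1.0}

expanded = expand_per_species(multi, n_species=8)
for key, value in expanded.items():
    print(f"{key}={value}")
```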

Alternatively, it is possible to work directly with `dict()` objects.

In [6]:
soap_dict = parse_config(filename=json_quip_soap)
pprint(soap_dict)

{'Z': [3],
 'atom_sigma': 0.5,
 'cutoff': 6.0,
 'cutoff_transition_width': 0.5,
 'l_max': 4,
 'n_Z': 1,
 'n_max': 8,
 'n_species': 8,
 'species_Z': [1, 3, 6, 8, 9, 15, 21, 23]}
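A dictionary like the one above can also be written to a `.json` file and read back with the standard library alone. This is a minimal sketch: the file name is hypothetical, and it assumes the configuration is a flat JSON object, which is what the parsed output suggests. It does not show how `parse_config` works internally.

```python
import json
import tempfile
from pathlib import Path

# Same keys and values as the dictionary printed above.
soap_config = {
    "Z": [3],
    "atom_sigma": 0.5,
    "cutoff": 6.0,
    "cutoff_transition_width": 0.5,
    "l_max": 4,
    "n_Z": 1,
    "n_max": 8,
    "n_species": 8,
    "species_Z": [1, 3, 6, 8, 9, 15, 21, 23],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "quip_soap_example.json"  # hypothetical file name
    path.write_text(json.dumps(soap_config, indent=2))
    loaded = json.loads(path.read_text())

# The round trip preserves lists, ints and floats.
print(loaded == soap_config)
```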


We can also initialize the parser object first and supply the information needed to build the string at a later stage.

In [7]:
quipParser = qt.QUIPtools(method='soap')
print(f"Descriptor dictionary: {quipParser.descriptorDict}")

Descriptor dictionary not set.

Descriptor dictionary: None


This way we can change the parameters and regenerate the string.

In [8]:
quipParser.descriptorDict = soap_dict
print("Descriptor dictionary:")
pprint(quipParser.descriptorDict)

Descriptor dictionary:
{'Z': [3],
 'atom_sigma': 0.5,
 'cutoff': 6.0,
 'cutoff_transition_width': 0.5,
 'l_max': 4,
 'n_Z': 1,
 'n_max': 8,
 'n_species': 8,
 'species_Z': [1, 3, 6, 8, 9, 15, 21, 23]}
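When parameters are changed, the count fields (`n_Z`, `n_species`) need to stay in step with the lists they describe (`Z`, `species_Z`). The sketch below builds a modified copy of a parameter dictionary and re-synchronizes those counts. The helper `updated_copy` is hypothetical and not part of `phdtools`; the resulting dictionary could then be assigned to `descriptorDict` as in the cell above.

```python
def updated_copy(base, **changes):
    """Return a copy of `base` with `changes` applied and the counts re-synced."""
    new = {**base, **changes}
    new["n_Z"] = len(new["Z"])
    new["n_species"] = len(new["species_Z"])
    return new


# Same values as the dictionary printed above.
base = {
    "Z": [3],
    "atom_sigma": 0.5,
    "cutoff": 6.0,
    "cutoff_transition_width": 0.5,
    "l_max": 4,
    "n_Z": 1,
    "n_max": 8,
    "n_species": 8,
    "species_Z": [1, 3, 6, 8, 9, 15, 21, 23],
}

# Example change: a larger cutoff and a second central species.
wider = updated_copy(base, cutoff=7.0, Z=[3, 8])
print(wider["cutoff"], wider["n_Z"], wider["Z"])
```

The original dictionary is left untouched, so several variants can be derived from the same base.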


### 2. Computing the descriptor

To compute the descriptor we need a trajectory.
The trajectory is prepared using the tools explained in the previous notebook.

In [9]:
from phdtools.ASEtools import asetools as at

In [10]:
rcut_dict = {'H': 1.0, 'C': 1.0, 'O': 1.0, 'Li': 0.1, 'P': 1.0, 'F': 1.0}
mol_names_list = ['EC', 'EMC', 'PF6', 'Li']

In [11]:
li_ions_example_dict = dict(
    projectName = "example_1_LiElectrolytes",
    trajPath = "../data/traj_2.1_0-100-1.xyz",
    rcutCorrection = rcut_dict,
    moleculeNames = mol_names_list,
    frameRange = (0,30,2)
)
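The `frameRange` tuple determines how many frames are read. Assuming it is interpreted as `(begin, end, stride)` with a half-open interval, like Python's built-in `range`, the number of selected frames can be checked as follows. That interpretation is an assumption, but it is consistent with the 15 steps reported by the progress bar when the trajectory is read.

```python
frame_range = (0, 30, 2)  # (begin, end, stride), as in the dictionary above

# Half-open interval: frame 30 itself is not included.
frames = list(range(*frame_range))
print(len(frames), frames[0], frames[-1])
```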

In [12]:
zshift_tuple = ('EC', [6,8])

In [13]:
ase_db = at.ASEtraj(**li_ions_example_dict).read(Zshift=zshift_tuple)

Gathering the Universe ...

Total atoms: 8402
Atom types: ['C' 'F' 'H' 'Li' 'O' 'P']

rcut correction: {'H': 1.0, 'C': 1.0, 'O': 1.0, 'Li': 0.1, 'P': 1.0, 'F': 1.0}

Searching for molecules in the system ...

['EC', 'EMC', 'PF6', 'Li']


Uniques molecules found: 4
Molecules found: {'C3H4O3': 'EC', 'C4H8O3': 'EMC', 'F6P': 'PF6', 'Li': 'Li'}
1.7427s

Computing MolIDs ...
1.4101s

Computing MolSymbols ...
Total numner of molecules: 777
0.0454s

<end>
3.3802s

Reading traj:
Begin: 0 | End: 30 | Stride: 2


Applying Z shift: 100%|██████████| 15/15 [00:00<00:00, 20964.53it/s]

0.3832s

Assuming suitable descriptor parameters are already known, we can proceed to compute the descriptor.

In [14]:
soap_db = qt.QUIP(method='soap',
                  descr_dict=json_quip_soap).fit(ase_db=ase_db)

Computing quippy soap ...
6.7847s



In [15]:
print(len(soap_db), soap_db[0].shape)

15 (114, 10401)
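The length of `soap_db` (15) matches the number of frames read. The second dimension of each array can be cross-checked against the SOAP parameters. The sketch below assumes the usual power-spectrum layout: one entry per unordered pair of radial-species channels and per angular channel, plus one trailing extra element. That layout is an assumption about quippy's output rather than something stated in this notebook. The first dimension, 114, presumably counts the centres with `Z=3`, but this is not verified here.

```python
# Parameters copied from the SOAP dictionary shown earlier.
n_species, n_max, l_max = 8, 8, 4

n_channels = n_species * n_max                 # radial functions across all species
n_pairs = n_channels * (n_channels + 1) // 2   # unordered channel pairs
n_features = n_pairs * (l_max + 1) + 1         # +1: assumed trailing extra element

print(n_features)
```

This gives 10401, matching the shape printed above.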
