# This is part of the supporting information for the paper  
*ParAMS: Parameter Fitting for Atomistic and Molecular Models* (DOI: *123123*)  
The full documentation can be found at https://www.scm.com/doc.trunk/params/index.html

# SCC-DFTB repulsive potential parametrization

Set num_processes to the number of processors on your machine. The DFTB calculations will be parallelized over that many cores.
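
If you are unsure how many cores are available, the count can be queried from Python. The following is a minimal sketch using only the standard library. Note that `os.cpu_count()` reports logical cores, which may exceed the number of physical cores, and the fallback to 1 is an assumption for systems where the count cannot be determined.

```python
import os

# os.cpu_count() returns the number of logical CPUs, or None if it
# cannot be determined; fall back to a single process in that case.
num_processes = os.cpu_count() or 1
print(f"Using {num_processes} processes")
```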

In [1]:
import os, sys
import numpy as np
from os.path    import join as opj
from scm.params import *
from scm.params import __version__ as paramsver
from scm.plams import *

num_processes = 8

INDIR = '../dftbdata'
os.makedirs(INDIR, exist_ok=True)  # create the data directory if it is missing
    


print(f"ParAMS Version used: {paramsver}")

ParAMS Version used: 0.5.0



# Step 1: Define the job collection
The cell below adds lattice optimizations of the wurtzite and rocksalt polymorphs of ZnO to the job collection.

For wurtzite, the elastic tensor is also calculated, so that the bulk modulus can be extracted from the output.

The job collection is stored in jobcollection.yml.

In [2]:
wurtzite, rocksalt = Molecule(opj(INDIR, 'w.xyz')), Molecule(opj(INDIR, 'rs.xyz')) 
print("### Wurtzite ###")
print(wurtzite)
print("### Rocksalt ###")
print(rocksalt)

w_opt_s = Settings()
w_opt_s.input.ams.Task = 'GeometryOptimization'
w_opt_s.input.ams.GeometryOptimization.OptimizeLattice = 'Yes'
rs_opt_s = w_opt_s.copy()
w_opt_s.input.ams.Properties.ElasticTensor = 'Yes' # to get bulk modulus of wurtzite

jc = JobCollection()
jc.add_entry('wurtzite_lattopt', JCEntry(w_opt_s, wurtzite))
jc.add_entry('rocksalt_lattopt', JCEntry(rs_opt_s, rocksalt))
print("### Job collection ###")
print(jc)
jc.store(opj(INDIR, 'jobcollection.yml'))

### Wurtzite ###
  Atoms: 
    1        Zn      1.645000      0.949741      0.023375 
    2        Zn      0.000000      1.899482      2.678375 
    3         O      1.645000      0.949741      3.295375 
    4         O      0.000000      1.899482      0.640375 
  Lattice:
        3.2900000000     0.0000000000     0.0000000000
       -1.6450000000     2.8492235800     0.0000000000
        0.0000000000     0.0000000000     5.3100000000

### Rocksalt ###
  Atoms: 
    1        Zn      2.170000      2.170000      2.170000 
    2         O      0.000000      0.000000      0.000000 
  Lattice:
        0.0000000000     2.1700000000     2.1700000000
        2.1700000000     0.0000000000     2.1700000000
        2.1700000000     2.1700000000     0.0000000000

### Job collection ###
---
ID: rocksalt_lattopt
ReferenceEngineID: None
AMSInput: |
   Task GeometryOptimization
   geometryoptimization
     OptimizeLattice Yes
   End
   system
     Atoms
                Zn      2.1700000000      2.1700

# Step 2: Define the training set
There are four target quantities:
* the wurtzite lattice parameter $a$,
* the wurtzite lattice parameter $c$,
* the wurtzite bulk modulus $B_0$, and
* the relative energy $\Delta E$ between the wurtzite and rocksalt polymorphs (per ZnO formula unit).
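
To make the two lattice targets concrete, the sketch below computes $a$ and $c$ as the lengths of the first and third lattice vectors of the wurtzite cell printed in Step 1. It assumes that indices 0 and 2 in the `lattice(...)` expressions select these vector lengths, which is consistent with the `# a` and `# c` comments in the training-set code but is not confirmed here. The helper functions are illustrative and not part of ParAMS.

```python
import math

# Wurtzite lattice vectors in angstrom, copied from the Step 1 output.
wurtzite_lattice = [
    (3.29, 0.0, 0.0),
    (-1.645, 2.84922358, 0.0),
    (0.0, 0.0, 5.31),
]

def vector_length(v):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))

def angle_deg(u, v):
    """Angle between two 3-vectors in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(dot / (vector_length(u) * vector_length(v))))

a = vector_length(wurtzite_lattice[0])
c = vector_length(wurtzite_lattice[2])
gamma = angle_deg(wurtzite_lattice[0], wurtzite_lattice[1])
print(f"a = {a:.3f} angstrom, c = {c:.3f} angstrom, gamma = {gamma:.1f} degrees")
```

The in-plane angle of 120 degrees confirms that the input cell is hexagonal, as expected for wurtzite.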

**If you set recalculate_reference_data to True**, the AMS BAND periodic DFT software runs the reference jobs and calculates the reference data. Any engine, or combination of engines, in the Amsterdam Modeling Suite can be used in this way when the reference values are not known beforehand. NOTE: Calculating the reference data may take many hours (about 12.5 hours in the run logged below).

**Otherwise**, DFT-calculated reference values are taken from https://doi.org/10.1021/jp404095x
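
The relative energy target is expressed per ZnO formula unit and, in the code cell that follows, the literature value is divided by 27.211, which is the hartree-to-eV conversion factor. The sketch below illustrates both steps. It assumes that the literature value of -0.30 is given in eV and that the wurtzite and rocksalt cells contain two and one formula units, respectively (as in the Step 1 structures). The cell energies passed to the helper are hypothetical numbers chosen only for illustration.

```python
HARTREE_IN_EV = 27.211  # conversion factor used in the training-set code

def relative_energy(e_wurtzite_cell, e_rocksalt_cell, n_wurtzite=2, n_rocksalt=1):
    """Energy difference per ZnO formula unit (same unit as the inputs)."""
    return e_wurtzite_cell / n_wurtzite - e_rocksalt_cell / n_rocksalt

# Literature reference value, assumed to be in eV, converted to hartree.
delta_e_ev = -0.30
delta_e_hartree = delta_e_ev / HARTREE_IN_EV
print(f"Reference Delta E = {delta_e_hartree:.6f} hartree")

# Hypothetical total cell energies in hartree, for illustration only.
example = relative_energy(-10.0, -4.99)
print(f"Example Delta E = {example:.4f} hartree per formula unit")
```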

In [3]:
recalculate_reference_data = False
training_set = DataSet()
if recalculate_reference_data:
    training_set.add_entry('bulkmodulus("wurtzite_lattopt")', weight=1, reference=None) 
    training_set.add_entry('lattice("wurtzite_lattopt", 0)', weight=1, reference=None) 
    training_set.add_entry('lattice("wurtzite_lattopt", 2)', weight=1, reference=None)
    training_set.add_entry('energy("wurtzite_lattopt")/2.0-energy("rocksalt_lattopt")', weight=1, reference=None)
    band_settings = Settings()
    band_settings.input.band.basis.type = 'TZP'
    band_settings.input.band.kspace.quality = 'Normal' # Ideally should be "Good", but will be more expensive
    band_settings.input.band.xc.libxc = 'PBE'
    band_settings.runscript.nproc = num_processes
    init(path=INDIR, folder='band_reference_data')
    reference_results = jc.run(engine_settings=band_settings, use_pipe=False)
    finish()
    training_set.calculate_reference(reference_results)
    training_set.store(opj(INDIR, 'trainingset_from_band_calculations.yml'))
else:
    training_set.add_entry('bulkmodulus("wurtzite_lattopt")', weight=1, reference=129) # GPa
    training_set.add_entry('lattice("wurtzite_lattopt", 0)', weight=1, reference=3.29) # a, angstrom
    training_set.add_entry('lattice("wurtzite_lattopt", 2)', weight=1, reference=5.31) # c, angstrom
    training_set.add_entry('energy("wurtzite_lattopt")/2.0-energy("rocksalt_lattopt")', weight=1, reference=-0.30/27.211) # Delta E, -0.30 eV converted to hartree
    training_set.store(opj(INDIR, 'trainingset_with_previous_reference_values.yml'))

print("### Training set ###")
print(training_set)


[13:25:43] PLAMS working folder: /home/hellstrom/latex/params/params_si.git/trunk/dftbdata/band_reference_data.003
[13:25:43] JOB wurtzite_lattopt STARTED
[13:25:43] JOB wurtzite_lattopt RUNNING
[23:29:26] JOB wurtzite_lattopt FINISHED
[23:29:26] JOB wurtzite_lattopt SUCCESSFUL
[23:29:26] JOB rocksalt_lattopt STARTED
[23:29:26] JOB rocksalt_lattopt RUNNING
[01:58:39] JOB rocksalt_lattopt FINISHED
[01:58:39] JOB rocksalt_lattopt SUCCESSFUL
[01:58:39] PLAMS run finished. Goodbye
### Training set ###
---
Expression: bulkmodulus("wurtzite_lattopt")
Weight: 1
ReferenceValue: 120.298
---
Expression: lattice("wurtzite_lattopt", 0)
Weight: 1
ReferenceValue: 3.3034906603720327
---
Expression: lattice("wurtzite_lattopt", 2)
Weight: 1
ReferenceValue: 5.276393397180919
---
Expression: energy("wurtzite_lattopt")/2.0-energy("rocksalt_lattopt")
Weight: 1
ReferenceValue: -0.014083789325482599
...



Define the settings for the parametrized DFTB engine. Here, we set the k-space quality to 'Good', which is important for lattice optimizations.

In [4]:
dftb_s = Settings()
dftb_s.input.dftb.kspace.quality = 'Good'

Create a "parameter interface" to the DFTB repulsive potential. 

Repulsive potentials are stored as splines towards the end of Slater-Koster (.skf) files.

Here, we optimize only the Zn-O and O-Zn repulsive potentials (which must be identical).

* Take electronic parameters and unchanged repulsive potentials (e.g. O-O.skf) from AMSHOME/atomicdata/DFTB/DFTB.org/znorg-0-1

* Define an analytical repulsive function. Here, we choose a tapered double exponential of the form $V^{\text{rep}}(r) = f^{\text{cut}}(r)\left[p_0\exp(-p_1r) + p_2\exp(-p_3r)\right]$, where $p_0, p_1, p_2, p_3$ are the parameters to be fitted, and $f^{\text{cut}}(r)$ is a cutoff function that decays smoothly to 0 at $r = 5.67$ bohr.

* r_range specifies the distances for which the repulsive potential and its spline parameters are written to the new O-Zn.skf and Zn-O.skf files.

* Only optimize parameters for the O-Zn pair. Note: The Zn-O repulsive potential will be identical to the O-Zn one. When specifying active parameters for a DFTBSplineRepulsivePotentialParams, the elements must be ordered alphabetically.

* Define initial values and allowed ranges for the parameter values.
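
The functional form described in the list above can be sketched in plain Python. The exact cutoff function $f^{\text{cut}}(r)$ used by `TaperedDoubleExponential` is not given in this notebook, so the cosine taper below is a hypothetical stand-in that merely shares the stated property of decaying smoothly to 0 at the cutoff. The grid mimics `np.arange(0., 5.87, 0.1)` and the parameters are the initial values from the next cell.

```python
import math

CUTOFF = 5.67  # bohr, as passed to TaperedDoubleExponential

def cosine_taper(r, cutoff=CUTOFF):
    """Hypothetical taper: 1 at r = 0, decaying smoothly to 0 at r = cutoff."""
    if r >= cutoff:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * r / cutoff))

def v_rep(r, p, cutoff=CUTOFF):
    """Tapered double exponential f_cut(r) * [p0 exp(-p1 r) + p2 exp(-p3 r)]."""
    p0, p1, p2, p3 = p
    return cosine_taper(r, cutoff) * (p0 * math.exp(-p1 * r) + p2 * math.exp(-p3 * r))

initial = (1.0, 1.1, 1.2, 1.3)  # initial parameter values used in the notebook

# Same grid points as np.arange(0., 5.87, 0.1): 0.0, 0.1, ..., 5.8
r_grid = [0.1 * i for i in range(59)]
values = [v_rep(r, initial) for r in r_grid]

for r, v in zip(r_grid[::10], values[::10]):
    print(f"r = {r:.1f} bohr, V_rep = {v:.6f}")
```

With any taper of this kind, the potential equals $p_0 + p_2$ at $r = 0$ and vanishes for all grid points beyond the cutoff, which is why r_range only needs to extend slightly past 5.67 bohr.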

In [5]:
interface = DFTBSplineRepulsivePotentialParams(
    folder=opj(os.environ['AMSHOME'],'atomicdata', 'DFTB', 'DFTB.org', 'znorg-0-1'), 
    repulsive_function=TaperedDoubleExponential(cutoff=5.67), 
    r_range=np.arange(0., 5.87, 0.1), 
    other_settings=dftb_s
)
for p in interface:    
    p.is_active = p.name.startswith('O-Zn:')

print("### Active parameters ###")
interface.active.x = [1.0, 1.1, 1.2, 1.3] # initial values
interface.active.range = [ (0.,10.), (0.,10.), (0.,10.), (0.,10) ]
for p in interface.active:
    print(p)


### Active parameters ###
..................
Name:     O-Zn:p0
Value:    1.0
Range:    (0.0, 10.0)
Active:   True

..................
Name:     O-Zn:p1
Value:    1.1
Range:    (0.0, 10.0)
Active:   True

..................
Name:     O-Zn:p2
Value:    1.2
Range:    (0.0, 10.0)
Active:   True

..................
Name:     O-Zn:p3
Value:    1.3
Range:    (0.0, 10)
Active:   True



# Step 3: Run the optimization
* Specify a Nelder-Mead optimizer from scipy.
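
For readers unfamiliar with Nelder-Mead, the sketch below runs the same derivative-free method on a toy problem by calling `scipy.optimize.minimize` directly. How the ParAMS `Scipy` optimizer class wraps this call internally is not shown in this notebook, so the sketch is only an illustration of the algorithm. The quadratic loss and its target vector are hypothetical; in the notebook, every loss evaluation instead requires running the DFTB jobs.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical target vector; the real loss comes from DFTB calculations.
target = np.array([0.5, 1.5, 2.0, 0.8])

def toy_loss(x):
    """Sum of squared errors between the trial vector and the toy target."""
    return float(np.sum((x - target) ** 2))

x0 = np.array([1.0, 1.1, 1.2, 1.3])  # same starting point as the notebook
result = minimize(toy_loss, x0, method="Nelder-Mead")

print("Optimized parameters:", result.x)
print("Final loss:", result.fun)
print("Function evaluations:", result.nfev)
```

Nelder-Mead needs only loss values, not gradients, which suits a loss that is evaluated by running external calculations. The price is a comparatively large number of evaluations, as the log below illustrates.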

In [6]:

optimizer = Scipy(method='Nelder-Mead')

optimization = Optimization(jc, 
                            training_set, 
                            interface, 
                            optimizer, 
                            title="ZnO_repulsive_opt",
                            use_pipe=False, 
                            parallel=ParallelLevels(processes=num_processes), 
                            #plams_workdir_path=os.path.abspath('.'),
                            callbacks=[Logger(printfreq=1,
                                              writefreq_history=1,
                                              writefreq_datafiles=1,
                                              writefreq_bestparams=1
                                             ),
                                      TimePerEval(printfrequency=10)])


optimization.summary()
optimization.optimize()

Directory 'ZnO_repulsive_opt' already exists. Will use 'ZnO_repulsive_opt.003' instead.
Optimization() Instance Settings:
Title:                             ZnO_repulsive_opt.003
Workdir:                           /home/hellstrom/latex/params/params_si.git/trunk/notebooks/ZnO_repulsive_opt.003
JobCollection size:                2
Interface:                         DFTBSplineRepulsivePotentialParams
Active parameters:                 4
Optimizer:                         Scipy
Parallelism:                       ParallelLevels(optimizations=1, parametervectors=1, jobs=1, processes=8, threads=1)
Verbose:                           True
Callbacks:                         Logger
                                   TimePerEval

Evaluators:
-----------
Name:                              trainingset (_LossEvaluator)
Loss:                              SSE
Evaluation frequency:              1

Data Set entries:                  4
Data Set jobs:                     2
Batch size:                     



[2020-12-17 02:01:27] Best trainingset loss = 1.236e+02
[2020-12-17 02:01:27] Step 1, trainingset loss = 123.563, first 4 params = 1.00 1.10 1.20 1.30
[2020-12-17 02:02:47] Step 2, trainingset loss = 150.394, first 4 params = 0.80 1.10 1.20 1.30
[2020-12-17 02:04:39] Best trainingset loss = 7.974e+01
[2020-12-17 02:04:39] Step 3, trainingset loss = 79.742, first 4 params = 1.00 0.90 1.20 1.30
[2020-12-17 02:06:07] Step 4, trainingset loss = 132.506, first 4 params = 1.00 1.10 1.01 1.30
[2020-12-17 02:07:34] Step 5, trainingset loss = 81.254, first 4 params = 1.00 1.10 1.20 1.11
[2020-12-17 02:09:25] Step 6, trainingset loss = 80.239, first 4 params = 1.20 1.00 1.10 1.21
[2020-12-17 02:11:18] Step 7, trainingset loss = 102.324, first 4 params = 1.10 0.95 1.34 1.16
[2020-12-17 02:13:13] Step 8, trainingset loss = 216.302, first 4 params = 1.15 0.88 1.22 1.09
[2020-12-17 02:14:34] Step 9, trainingset loss = 86.918, first 4 params = 1.04 1.05 1.21 1.25
[2020-12-17 02:15:55] Step 10, traini

[2020-12-17 04:01:30] Step 71, trainingset loss = 65.044, first 4 params = 0.15 0.29 1.53 1.89
[2020-12-17 04:03:28] Step 72, trainingset loss = 57.206, first 4 params = 0.27 0.44 1.51 1.72
[2020-12-17 04:05:16] Best trainingset loss = 5.101e+01
[2020-12-17 04:05:16] Step 73, trainingset loss = 51.006, first 4 params = 0.18 0.29 1.50 1.91
[2020-12-17 04:07:01] Step 74, trainingset loss = 53.082, first 4 params = 0.13 0.20 1.48 2.02
[2020-12-17 04:08:29] Step 75, trainingset loss = 58.942, first 4 params = 0.14 0.25 1.51 1.95
[2020-12-17 04:10:20] Step 76, trainingset loss = 55.078, first 4 params = 0.24 0.39 1.51 1.78
[2020-12-17 04:11:47] Step 77, trainingset loss = 62.169, first 4 params = 0.15 0.29 1.53 1.89
[2020-12-17 04:13:38] Step 78, trainingset loss = 54.377, first 4 params = 0.23 0.37 1.51 1.81
[2020-12-17 04:15:33] Step 79, trainingset loss = 58.132, first 4 params = 0.22 0.32 1.49 1.87
[2020-12-17 04:17:22] Step 80, trainingset loss = 53.737, first 4 params = 0.21 0.35 1.52

[2020-12-17 06:21:11] Step 146, trainingset loss = 46.782, first 4 params = 0.14 0.19 1.47 2.04
[2020-12-17 06:23:08] Best trainingset loss = 4.674e+01
[2020-12-17 06:23:08] Step 147, trainingset loss = 46.740, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:25:00] Step 148, trainingset loss = 46.784, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:26:54] Step 149, trainingset loss = 46.782, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:28:48] Step 150, trainingset loss = 46.789, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:28:48] Time per f-evaluation (trainingset): 0:01:49.945571
[2020-12-17 06:30:40] Step 151, trainingset loss = 46.775, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:32:33] Step 152, trainingset loss = 46.782, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:34:28] Step 153, trainingset loss = 46.780, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 06:36:22] Step 154, trainingset loss = 46.769, first 4 params = 0.14 0.18 1.47 2.04
[2020-

[2020-12-17 08:47:16] Step 223, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:49:09] Step 224, trainingset loss = 46.770, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:51:03] Step 225, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:52:56] Step 226, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:54:50] Step 227, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:56:45] Step 228, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 08:58:38] Step 229, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 09:00:32] Step 230, trainingset loss = 46.771, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 09:00:32] Time per f-evaluation (trainingset): 0:01:53.912672
[2020-12-17 09:02:26] Step 231, trainingset loss = 46.772, first 4 params = 0.14 0.18 1.47 2.04
[2020-12-17 09:04:21] Step 232, trainingset loss = 46.771, fir

KeyboardInterrupt: 

# Step 4: Find the results
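The results are written to the optimization working directory (ZnO_repulsive_opt, or a numbered variant such as ZnO_repulsive_opt.003 if that directory already exists, as in the log above):
* ZnO_repulsive_opt/trainingset_history.dat contains the loss function value and the parameters for each iteration.
* ZnO_repulsive_opt/data/predictions/trainingset contains the individual predictions ($a$, $c$, $B_0$ and $\Delta E$) for each parameter set.
* ZnO_repulsive_opt/data/contributions/trainingset contains the fraction of the total loss function value contributed by each item in the training set, for each parameter set.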
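
The relation between the loss value and the per-item contributions can be sketched as follows. The optimization summary reports an SSE loss. The sketch assumes the conventional weighted form, the sum of $w_i(\hat{y}_i - y_i)^2$ over the training-set items, and the exact convention used by ParAMS is not confirmed here. The reference values are those of the training set, whereas the predictions are hypothetical numbers chosen for illustration.

```python
def sse_contributions(predictions, references, weights):
    """Return the weighted SSE and each item's fraction of the total."""
    terms = [w * (p - ref) ** 2 for p, ref, w in zip(predictions, references, weights)]
    total = sum(terms)
    fractions = [t / total if total > 0 else 0.0 for t in terms]
    return total, fractions

labels = ["B0 (GPa)", "a (angstrom)", "c (angstrom)", "Delta E (hartree)"]
references = [129.0, 3.29, 5.31, -0.30 / 27.211]  # from the training set
predictions = [135.0, 3.25, 5.35, -0.012]         # hypothetical predictions
weights = [1, 1, 1, 1]

total, fractions = sse_contributions(predictions, references, weights)
print(f"Total loss: {total:.4f}")
for label, fraction in zip(labels, fractions):
    print(f"{label:>18s}: {100 * fraction:.4f} % of the loss")
```

Under this assumption and with equal weights, the bulk modulus dominates the loss because its numerical values in GPa are orders of magnitude larger than those of the other items. Inspecting the contributions file is therefore a useful check on whether the weights need adjusting.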

# Step 2b: Recalculate the reference data with AMS BAND
Any engine, or combination of different engines in the Amsterdam Modeling Suite, can be used to seamlessly calculate the reference data, if the reference values are not known beforehand.

Here, we will use the AMS BAND periodic DFT code. For this demonstration, we will 

The `range` attribute lets you define box constraints for every optimizer:
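
What a box constraint means can be illustrated without ParAMS. In the sketch below, each parameter has a (lower, upper) pair as in `interface.active.range`, and a trial vector is either checked against the box or clipped back into it. How ParAMS enforces the ranges internally is not shown in this notebook, so clipping is just one possible strategy and both helper functions are hypothetical.

```python
def within_box(x, ranges):
    """True if every parameter lies inside its (lower, upper) range."""
    return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, ranges))

def clip_to_box(x, ranges):
    """Move each out-of-range parameter to the nearest bound."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, ranges)]

ranges = [(0.0, 10.0)] * 4          # same ranges as in the notebook
x_inside = [1.0, 1.1, 1.2, 1.3]     # initial values from the notebook
x_outside = [-0.2, 1.1, 12.0, 1.3]  # hypothetical out-of-range trial vector

print(within_box(x_inside, ranges))
print(within_box(x_outside, ranges))
print(clip_to_box(x_outside, ranges))
```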