This is part of the supporting information for the paper  
*ParAMS: Parameter Fitting for Atomistic and Molecular Models* (DOI: *123123*)  
The full documentation can be found at https://www.scm.com/doc.trunk/params/index.html

# 1. Working with ParAMS data structures

## 1.1 The Data Set

In [4]:
import numpy as np
from scm.params import DataSet
ds = DataSet()

The `DataSet` defines which properties $\{P_j\}$ of which systems are relevant for the optimization.  
Here, we add two entries: the relative energy of the two systems `mymol01` and `mymol02`, and the vibrational frequencies of `mymol01`:

In [5]:
pr1 = "vibfreq('mymol01')"
pr2 = "energy('mymol01') - energy('mymol02')"

ds.add_entry(pr1, weight=1.0)
ds.add_entry(pr2, weight=0.1, reference=1.234)

We can treat the `ds` object as a list, with a few functional additions:

In [6]:
len(ds)

2

In [7]:
[i.reference for i in ds]

[None, 1.234]

In [8]:
ds(pr1)

---
Expression: vibfreq('mymol01')
Weight: 1.0

In [10]:
ds[pr1].weight == ds[0].weight

True

In [11]:
ds[1].jobids

{'mymol01', 'mymol02'}
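The list-like behavior shown above (access by position or by expression, plus job IDs derived from the expression) can be mimicked in plain Python. The following is only a minimal sketch, not the ParAMS implementation: `Entry` and `MiniDataSet` are hypothetical names, and extracting job IDs as the single-quoted names in an expression is an assumption made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set, Union


@dataclass
class Entry:
    """Hypothetical stand-in for a single data set entry."""
    expression: str
    weight: float = 1.0
    reference: Optional[float] = None

    @property
    def jobids(self) -> Set[str]:
        # Assumption: job IDs are the single-quoted names in the expression.
        return set(re.findall(r"'([^']+)'", self.expression))


class MiniDataSet:
    """Hypothetical list-like container that also accepts expressions as keys."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def add_entry(self, expression: str, weight: float = 1.0,
                  reference: Optional[float] = None) -> None:
        self._entries.append(Entry(expression, weight, reference))

    def __len__(self) -> int:
        return len(self._entries)

    def __iter__(self):
        return iter(self._entries)

    def __getitem__(self, key: Union[int, str]) -> Entry:
        # String keys are matched against the stored expressions.
        if isinstance(key, str):
            for entry in self._entries:
                if entry.expression == key:
                    return entry
            raise KeyError(key)
        return self._entries[key]

    @property
    def jobids(self) -> Set[str]:
        # Union of the job IDs required by all entries.
        return set().union(*(entry.jobids for entry in self._entries))


mini = MiniDataSet()
mini.add_entry("vibfreq('mymol01')", weight=1.0)
mini.add_entry("energy('mymol01') - energy('mymol02')", weight=0.1, reference=1.234)
```

With this stand-in, `mini[0]` and `mini["vibfreq('mymol01')"]` return the same entry, mirroring the `ds[0]` / `ds[pr1]` comparison in the cell above.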

Calling `.store()` will write a YAML representation of the object to disk:

In [12]:
# ds.store('mydataset.yaml') # This is equivalent to printing to file:
print(ds)

---
Expression: vibfreq('mymol01')
Weight: 1.0
---
Expression: energy('mymol01') - energy('mymol02')
Weight: 0.1
ReferenceValue: 1.234
...
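To illustrate how the `Weight` and `ReferenceValue` fields printed above could enter an optimization, the sketch below evaluates a weighted sum of squared errors. This loss form is an assumption chosen for illustration (the text here does not specify the loss function), `weighted_sse` is a hypothetical helper, and the predicted value is an invented number, not a computed result.

```python
from typing import Sequence


def weighted_sse(weights: Sequence[float],
                 references: Sequence[float],
                 predictions: Sequence[float]) -> float:
    """Weighted sum of squared errors: sum_j w_j * (ref_j - pred_j)**2."""
    if not (len(weights) == len(references) == len(predictions)):
        raise ValueError("weights, references and predictions must have equal length")
    return sum(w * (ref - pred) ** 2
               for w, ref, pred in zip(weights, references, predictions))


# Only the relative-energy entry has a reference value (1.234) in the data set above.
# The prediction 1.034 is a hypothetical value used purely for illustration.
loss = weighted_sse(weights=[0.1], references=[1.234], predictions=[1.034])
```

Entries with a larger weight contribute more strongly to such a loss, so the weight expresses how important it is to reproduce a given property.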



## 1.2 Job Collection
When a new parameter set $\boldsymbol{x}$ is generated, the jobs referenced by a `DataSet` instance need to be
re-calculated with the new parameters before the loss $L$ can be evaluated.
The `JobCollection` stores the relevant input geometries $\{R\}$ alongside the individual settings needed to execute each job with AMS:

In [14]:
from scm.params import JobCollection

Check which jobIDs are required by the `ds` object:

In [15]:
ds.jobids

{'mymol01', 'mymol02'}

We already have a Job Collection stored in *myjobs.yml*, which can be easily loaded:

In [16]:
jc = JobCollection('../data/myjobs.yml')

The Job Collection behaves like a *dict*:

In [17]:
for k,v in jc.items():
    print(f'{k}:')
    print(v.molecule)

mymol01:
  Atoms: 
    1        He      0.000000      0.000000      0.000000 
    2        He      2.500000      0.000000      0.000000 

mymol02:
  Atoms: 
    1        He      0.000000      0.000000      0.000000 
    2        He      2.500000      0.000000      0.000000 
    3        He      1.250000      1.750000      0.000000 



In addition to the chemical system, each entry in the collection also stores a [PLAMS Settings](https://www.scm.com/product/plams/) instance. The combination of system and settings clearly defines how a job should be executed.

In [18]:
for k,v in jc.items():
    print(f'{k}:')
    print(v.settings)

mymol01:
input: 	
      ams: 	
          task: 	SinglePoint

mymol02:
input: 	
      ams: 	
          task: 	GeometryOptimization
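Because the Job Collection behaves like a dict and the data set reports the job IDs it needs, a simple consistency check is a set difference between the two. The sketch below uses plain Python stand-ins (a `set` and a `dict`) rather than the real objects, and `missing_jobs` is a hypothetical helper name.

```python
from typing import Iterable, Mapping, Set

# Stand-ins for ds.jobids and the loaded Job Collection shown above.
required = {"mymol01", "mymol02"}
collection = {
    "mymol01": {"task": "SinglePoint"},
    "mymol02": {"task": "GeometryOptimization"},
}


def missing_jobs(required_ids: Iterable[str], jobs: Mapping[str, object]) -> Set[str]:
    """Return the job IDs that are required but absent from the collection."""
    return set(required_ids) - set(jobs)


missing = missing_jobs(required, collection)
if missing:
    raise KeyError(f"Job collection lacks entries for: {sorted(missing)}")
```

An empty result means every property in the data set can be evaluated from jobs that are present in the collection.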



## 1.3 Parameter Interfaces
Parameter interfaces are responsible for the communication between an Optimizer and the software that calculates the jobs stored in a Job Collection.  
Every parameter interface can be reparameterized.

In [19]:
from scm.params import LennardJonesParams
p = LennardJonesParams()

Every Parameter Interface is *list*-like:

In [20]:
len(p)

2

In [21]:
for pi in p:
    print(f"{pi.name}: {pi.value}, {pi.range}")

eps: 0.0003, (0.0001, 0.001)
rmin: 3.0, (0.5, 10.0)
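The parameter names `eps` and `rmin` suggest the $r_\mathrm{min}$ form of the Lennard-Jones potential, $V(r) = \varepsilon\left[(r_\mathrm{min}/r)^{12} - 2\,(r_\mathrm{min}/r)^{6}\right]$. That the interface uses exactly this form is an assumption, since the functional form is not stated here. Under that assumption, a minimal sketch of the pair energy is:

```python
def lennard_jones(r: float, eps: float, rmin: float) -> float:
    """Pair energy in the r_min form: eps * ((rmin/r)**12 - 2 * (rmin/r)**6).

    `r` must be given in the same length unit as `rmin`;
    the result has the unit of `eps`.
    """
    if r <= 0:
        raise ValueError("distance r must be positive")
    x = (rmin / r) ** 6
    return eps * (x * x - 2.0 * x)


# With the default values listed above, the minimum lies at r = rmin
# and has a depth of -eps.
v_min = lennard_jones(3.0, eps=0.0003, rmin=3.0)
```

In this form `eps` is the well depth and `rmin` the position of the minimum, which makes both parameters easy to interpret when choosing their ranges.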


The `.active` subset defines which parameters to optimize:

In [22]:
assert p == p.active
p['eps'].is_active = False
len(p.active)

1

In [23]:
for pi in p.active:
    print(f"{pi.name}: {pi.value}, {pi.range}")

rmin: 3.0, (0.5, 10.0)
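The purpose of the active subset is that an optimizer only ever sees, and only ever changes, the active parameters. The sketch below mimics this with plain Python. `Param`, `active_values` and `set_active_values` are hypothetical names, and rejecting values outside a parameter's range is an assumption made for illustration rather than documented ParAMS behavior.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Param:
    """Hypothetical stand-in for a single parameter."""
    name: str
    value: float
    range: Tuple[float, float]
    is_active: bool = True


def active_values(params: Sequence[Param]) -> List[float]:
    """Vector handed to the optimizer: values of the active parameters only."""
    return [p.value for p in params if p.is_active]


def set_active_values(params: Sequence[Param], x: Sequence[float]) -> None:
    """Write an optimizer vector back; inactive parameters stay untouched."""
    active = [p for p in params if p.is_active]
    if len(active) != len(x):
        raise ValueError("x must have one value per active parameter")
    for p, value in zip(active, x):
        low, high = p.range
        if not low <= value <= high:  # assumption: out-of-range values are rejected
            raise ValueError(f"{p.name}={value} is outside {p.range}")
        p.value = value


params = [
    Param("eps", 0.0003, (0.0001, 0.001), is_active=False),
    Param("rmin", 3.0, (0.5, 10.0)),
]
x0 = active_values(params)        # only rmin is active
set_active_values(params, [3.2])  # updates rmin, leaves eps unchanged
```

Deactivating a parameter therefore reduces the dimensionality of the optimization problem while keeping the parameter's current value available to the engine.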
