## Introduction

This example derives SMIRNOFF-typed, atom-centered polarizabilities from quantum mechanically calculated electrostatic potentials (ESPs) using the `factorpol` package.

The global objective is built by concatenating the individual objective functions, and the polarizabilities are then optimized against the whole training set.
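As a rough illustration of what concatenating objective functions can look like, the sketch below assumes that each molecule contributes a linear least-squares block (a design matrix and a target vector) sharing the same unknown parameters. The linear form, the function names, and the numbers are hypothetical and are not taken from `factorpol`.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_shared_parameters(blocks):
    """Least-squares fit over the row-wise concatenation of all blocks."""
    rows = [row for design, _ in blocks for row in design]
    targets = [t for _, target in blocks for t in target]
    n = len(rows[0])
    # Normal equations (A^T A) x = A^T b for the stacked system.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear(ata, atb)


# Hypothetical data: two "molecules" sharing two parameters (true values 1.5 and 0.5).
molecule_a = ([[1.0, 0.0], [2.0, 1.0]], [1.5, 3.5])
molecule_b = ([[0.0, 1.0], [1.0, 3.0]], [0.5, 3.0])
fitted = fit_shared_parameters([molecule_a, molecule_b])
print([round(value, 6) for value in fitted])
```

Stacking the rows means every molecule constrains the same parameter vector, so a parameter seen in several molecules is fitted against all of them at once rather than averaged afterwards.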

## Dependencies

- ray
- openff-toolkit
- openff-recharge*
- sqlalchemy
- openeye-toolkits
- scipy
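Before running the cells, it can help to check which of these packages are importable in the current environment. The sketch below assumes the import names shown in `ASSUMED_MODULES`; they are a best guess at how the listed packages are imported and may differ for a given installation.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages listed above.
ASSUMED_MODULES = [
    "ray",
    "openff.toolkit",
    "openff.recharge",
    "sqlalchemy",
    "openeye",
    "scipy",
]


def is_importable(module_name):
    """Return True if the module can be located without fully importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example "openff") is itself missing.
        return False


availability = {name: is_importable(name) for name in ASSUMED_MODULES}
for name, found in availability.items():
    print(f"{name:16s} {'found' if found else 'MISSING'}")
```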

In [1]:
import os
import pickle
from glob import glob

import numpy as np
import pandas as pd
from openff.toolkit import ForceField, Molecule
from pkg_resources import resource_filename

from factorpol.alphas_training import (AlphaData, AlphasTrainer, AlphaWorker,
                                       optimize_alphas, optimize_alphas_fast)
from factorpol.charge_training import ChargeTrainer
from factorpol.utilities import (calc_rrms, flatten_a_list, Polarizability,
                                 StorageHandler)

cwd = os.getcwd()



## Prepare dataset

Curate the QM ESP data generated in `00-generate-qm-reference.ipynb`.

`off_examples.offxml` is an example OpenForceField-style [ForceField file](https://github.com/openforcefield/openff-forcefields/tree/main/openforcefields/offxml). Until a dedicated handler for polarizability is available, we use the `<vdW>` handler to label atoms with SMIRNOFF patterns and assign electrostatics parameters.


In [2]:
off_forcefield = ForceField(resource_filename(
    "factorpol", os.path.join("data", "off_examples.offxml")
))

# Initialize a polarizability 
alphas0 = Polarizability()

Curate the QM data and prepare Ray workers to optimize the polarizabilities.

In [3]:
data = AlphaData(
    database_name="factorpol_examples",
    dataset=["CO", "C=C"],
    off_forcefield=off_forcefield,
    polarizability=alphas0,
    num_cpus=2,
)

2023-04-14 14:08:52,447	INFO worker.py:1553 -- Started a local Ray instance.

The training set contains 2 molecules, each with two sets of QM ESPs. There is 1 worker per molecule, for a total of 2 workers.

In [4]:
print(f"Number of data in training set:\t {len(data.workers)}")

Number of data in training set:	 2


## Optimization

In [5]:
atrain = AlphasTrainer(
    workers=data.workers,
    prior=alphas0,
    working_directory=os.path.join(cwd, "data_alphas_2"),
)

Path exists, deleting


In [6]:
ret = optimize_alphas(worker_list=atrain.workers, solved=True, num_cpus=8)

2023-04-14 14:08:58,406	INFO worker.py:1553 -- Started a local Ray instance.


## Results

In [7]:
ret

{'[#6:1]': 1.7531861665519723 <Unit('angstrom ** 3')>,
 '[#8:1]': 0.5093480935921064 <Unit('angstrom ** 3')>,
 '[#1:1]': 0.17091150224579515 <Unit('angstrom ** 3')>}
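To show how element-level types such as these map onto a molecule, the sketch below assigns the fitted values to the atoms of methanol (`CO`) and sums them. Matching by atomic number alone is a simplification that works only because these three patterns are element-level; general SMIRNOFF patterns need a substructure search from a cheminformatics toolkit. The helper name and the additive sum, which ignores interactions between induced dipoles, are illustrative assumptions.

```python
# Magnitudes copied from the result above (units: angstrom ** 3).
fitted_alphas = {
    "[#6:1]": 1.7531861665519723,
    "[#8:1]": 0.5093480935921064,
    "[#1:1]": 0.17091150224579515,
}

# Atomic numbers of methanol, CH3OH: one carbon, one oxygen, four hydrogens.
methanol_atomic_numbers = [6, 8, 1, 1, 1, 1]


def assign_by_element(atomic_numbers, alphas):
    """Look up each atom's value through the element-level pattern '[#Z:1]'."""
    return [alphas[f"[#{z}:1]"] for z in atomic_numbers]


per_atom = assign_by_element(methanol_atomic_numbers, fitted_alphas)
additive_total = sum(per_atom)
print(f"additive estimate for methanol: {additive_total:.3f} angstrom ** 3")
```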

## A faster method to optimize polarizabilities
**This method is extremely experimental and not recommended for production use**

In [8]:
ret2 = optimize_alphas_fast(worker_list=atrain.workers, solved=True, num_cpus=8)

2023-04-14 14:09:03,630	INFO worker.py:1553 -- Started a local Ray instance.


## Results of the faster method

In [9]:
ret2

{'[#6:1]': 1.7531861665519999 <Unit('angstrom ** 3')>,
 '[#8:1]': 0.5093480935921196 <Unit('angstrom ** 3')>,
 '[#1:1]': 0.1709115022457837 <Unit('angstrom ** 3')>}
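The two optimizers can be compared directly. The sketch below copies the magnitudes from both outputs into plain dictionaries (dropping the unit objects) and reports the largest absolute difference; the variable names and the tolerance are illustrative choices.

```python
import math

# Magnitudes copied from the two outputs above (units: angstrom ** 3).
standard = {
    "[#6:1]": 1.7531861665519723,
    "[#8:1]": 0.5093480935921064,
    "[#1:1]": 0.17091150224579515,
}
fast = {
    "[#6:1]": 1.7531861665519999,
    "[#8:1]": 0.5093480935921196,
    "[#1:1]": 0.1709115022457837,
}

# Absolute difference per SMIRNOFF pattern.
differences = {pattern: abs(standard[pattern] - fast[pattern]) for pattern in standard}
max_difference = max(differences.values())
agree = all(
    math.isclose(standard[pattern], fast[pattern], rel_tol=1e-9) for pattern in standard
)

print(f"largest absolute difference: {max_difference:.1e} angstrom ** 3")
print(f"agree within tolerance: {agree}")
```

For this two-molecule training set the differences are on the order of 1e-14 Å³, which is at the level of floating-point round-off.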