## Introduction

This notebook walks through training a BCC-dPol library from baseline QM ESPs and then generating AM1-BCC-dPol charges.

**Attention**: The BCC training set contains only two molecules, so this example is a demonstration only; the resulting parameters are poorly defined.

## Prepare training data

In [1]:
import os

import numpy as np
import pandas as pd
from openff.toolkit import ForceField
from pkg_resources import resource_filename

from factorpol.bcc_training import BccTrainer
from factorpol.charge_training import ChargeTrainer
from factorpol.qm_worker import QWorker
from factorpol.utilities import (BondChargeCorrections, calc_rrms,
                                 flatten_a_list, original_bcc_collections,
                                 Polarizability, retrieve_records,
                                 StorageHandler)

cwd = os.getcwd()
off_forcefield = ForceField(resource_filename(
    "factorpol", os.path.join("data", "off_examples.offxml")
))




Create a `sqlalchemy` session for the local record store, then retrieve the ESP records for the two training molecules.

In [2]:
st = StorageHandler(local_path=os.path.join(cwd, "data_tmp"))
ses = st.session("factorpol_examples")

In [3]:
dataset = ["CO", "C=C"]

In [4]:
esp_records = retrieve_records(my_session=ses, dataset=dataset)

In [5]:
print(f"Number of ESP records:\t {len(esp_records)}")

Number of ESP records:	 2


Load the previously derived polarizability parameters.

**Attention**: Because these parameters were derived from a single molecule, they are also poorly defined and should only be used as an example.

In [6]:
polarizability = Polarizability(data_source="ret_alphas.csv")

In [7]:
polarizability.data

Type,Polarizability (angstrom**3)
[#1:1],0.37714
[#6:1],1.440485
[#7:1],0.0
[#8:1],0.0


## BCC-dPol Training

In [8]:
bcc_workers = BccTrainer(
    training_set=esp_records.values(),
    polarizability=polarizability,
    reference_collection=original_bcc_collections,
    off_forcefield=off_forcefield,
)



In [9]:
ret = bcc_workers.training()

In [10]:
dt = pd.DataFrame([{"SMIRKS": k, "value": v} for k, v in ret["bcc_parameters"].items()])

In [11]:
dt

,SMIRKS,value
0,[#6X4:1]-[#1:2],0.014983
1,"[#6X4:1]-[#8X1,#8X2:2]",0.041692
2,"[#8X1,#8X2:1]-[#1:2]",-0.123213


Save a copy of the BCC results to a local CSV file, then reload it as a `BondChargeCorrections` object.

In [12]:
dt.to_csv("ret_bccs.csv", index=False)

In [13]:
ret_bccs = BondChargeCorrections(data_source="ret_bccs.csv")

### Generate AM1-BCC-dPol charges

In [14]:
methanol = ChargeTrainer(
    record=esp_records["CO"],
    polarizability=polarizability,
    off_forcefield=off_forcefield,
    coulomb14scale=0.5,
)



In [15]:
am1bccdpol = BccTrainer.generate_charges(
    offmol=methanol.offmol,
    bcc_collection=ret_bccs.recharge_collection,
)

In [16]:
am1bccdpol

Magnitude,[0.02162148520430024 -0.4951345311396327 0.05045669880330044 0.05045669880330044 0.05045669880330044 0.3221429420748505]
Units,elementary_charge
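To make the role of the fitted BCC values concrete, the sketch below applies them to methanol by hand. It is not the `factorpol` implementation. It assumes the common convention that, for each matched bond, the atom tagged `:1` gains the BCC value and the atom tagged `:2` loses it. The helper `apply_bccs` and the hand-written bond list are hypothetical; the atom order (C, O, H, H, H, H on O) is assumed from the charge array above. Base charges are set to zero so that only the corrections themselves are shown.

```python
# Fitted BCC values copied from the training output table above.
bcc_values = {
    "[#6X4:1]-[#1:2]": 0.014983,
    "[#6X4:1]-[#8X1,#8X2:2]": 0.041692,
    "[#8X1,#8X2:1]-[#1:2]": -0.123213,
}

# Methanol, assumed atom order: 0=C, 1=O, 2..4=H on C, 5=H on O.
# Each entry is (index of atom :1, index of atom :2, matching SMIRKS).
# This bond list is written by hand for illustration (hypothetical).
bonds = [
    (0, 2, "[#6X4:1]-[#1:2]"),
    (0, 3, "[#6X4:1]-[#1:2]"),
    (0, 4, "[#6X4:1]-[#1:2]"),
    (0, 1, "[#6X4:1]-[#8X1,#8X2:2]"),
    (1, 5, "[#8X1,#8X2:1]-[#1:2]"),
]


def apply_bccs(base_charges, bonds, bcc_values):
    """Add +BCC to atom :1 and -BCC to atom :2 of every matched bond.

    The sign convention is an assumption, not taken from factorpol.
    """
    charges = list(base_charges)
    for first, second, smirks in bonds:
        delta = bcc_values[smirks]
        charges[first] += delta
        charges[second] -= delta
    return charges


# Zero base charges isolate the per-atom corrections.
corrections = apply_bccs([0.0] * 6, bonds, bcc_values)
for index, value in enumerate(corrections):
    print(f"atom {index}: {value:+.6f}")
```

Because every bond adds and subtracts the same amount, the corrections sum to zero and the total molecular charge is unchanged.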


In [17]:
am1bccdpol_esp = methanol.calc_Esps_dpol(partial_charge=am1bccdpol.magnitude)
rrms = calc_rrms(calc=am1bccdpol_esp, ref=methanol.esp_values)
print(f"Quality-of-fit RRMS = {rrms:.3f}")

Quality-of-fit RRMS = 0.804
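For readers unfamiliar with the metric, the following is a minimal sketch of one common definition of the relative root-mean-square error: the square root of the summed squared residuals divided by the summed squared reference values. The exact formula used by `calc_rrms` is not shown in this notebook, so this definition and the function name `relative_rms` are assumptions.

```python
import math


def relative_rms(calc, ref):
    """Relative RMS error: sqrt(sum((calc - ref)^2) / sum(ref^2)).

    Assumed definition; factorpol's calc_rrms may differ in detail.
    """
    if len(calc) != len(ref):
        raise ValueError("calc and ref must have the same length")
    residual = sum((c - r) ** 2 for c, r in zip(calc, ref))
    reference = sum(r ** 2 for r in ref)
    if reference == 0.0:
        raise ValueError("reference values are all zero")
    return math.sqrt(residual / reference)
```

Under this definition, a value of 0 means the calculated ESP reproduces the reference exactly, and a value of 1 means the error is as large as the reference signal itself.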
