# The BindingParadise Class 

The `BindingParadise` class was written to provide an easy-to-use interface 
between the BEEP database and the user. In this tutorial I will explain the 
methods contained in the class and how to access the data stored in the 
database. Although this class is a straightforward gateway to the database, 
the user can also use any QCPortal method to access the data. 

The first thing we will do is create a `BindingParadise` object, which we will call
`bp`. The only thing the user needs to provide is a username and a password. 

In [12]:
from beep.beep import BindingParadise 

bp = BindingParadise('guest','pOg_41tzuDxkTtAfjPuUq8WK5ssbnmN8QfjsApGXVYk')

Next we will list all the molecules that have been included in the database so far, using the `molecules_list()` method.

In [13]:
bp.molecules_list()

['ch2o2_W22',
 'ch3_W22',
 'ch3o_W22',
 'ch3oh_W22',
 'ch4_W22',
 'co_W12',
 'co_W22',
 'h2_W22',
 'h2co_W12',
 'h2co_W22',
 'h2o_W22',
 'h2s_W22',
 'hcl_W22',
 'hcn_W22',
 'hco_W22',
 'hf_W12',
 'hf_W22',
 'n2_W22',
 'nh2_W22',
 'nh3_W12',
 'nh3_W22',
 'nh3_W37',
 'nh3_W60',
 'nhch2_W22']

As you can see, each entry name combines the target molecule with the ice model on which the binding energies were computed. For example, W22 means the binding energies were computed on a set of amorphous solid water (ASW) clusters containing 22 water molecules each. 
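
The naming scheme can also be taken apart programmatically. The sketch below is not part of the `BindingParadise` class: `split_entry` is a hypothetical helper that assumes every name has the form `<molecule>_W<number of water molecules>`, as in the list above.

```python
def split_entry(name):
    """Split a name such as 'nh3_W22' into ('nh3', 22).

    Hypothetical helper: assumes the '<molecule>_W<size>' pattern
    seen in the molecules_list() output.
    """
    molecule, model = name.rsplit("_", 1)
    if not model.startswith("W"):
        raise ValueError(f"unexpected ice-model label: {model!r}")
    return molecule, int(model[1:])


# A few names copied from the molecules_list() output above.
names = ["co_W12", "co_W22", "nh3_W12", "nh3_W22", "nh3_W37", "nh3_W60"]

# Group the available cluster sizes by target molecule.
sizes_by_molecule = {}
for name in names:
    molecule, n_water = split_entry(name)
    sizes_by_molecule.setdefault(molecule, []).append(n_water)

print(sizes_by_molecule)
```

Grouping the names this way makes it easy to see which species are available on more than one cluster size.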

In this tutorial we will explore the binding energies and binding sites of methanol (CH3OH). Therefore we set the molecule with the `set_molecule` method. 

In [14]:
bp.set_molecule('ch3oh_W22');

Molecule set: CH3OH_W22


Now that a molecule has been assigned to the `bp` object, we can retrieve the optimization methods available for its binding sites.

In [15]:
bp.get_optimization_methods();

List of optimization methods: ['PWB6K-D3BJ', 'HF3C']


Two methods are available. Let's load the data for one of them.

In [16]:
bp.load_data("PWB6K-D3BJ")

100%|██████████| 19/19 [00:00<00:00, 40.84it/s]


Once we have chosen the optimization method, we list the levels of theory available for the binding energy computations on these geometries. 

In [17]:
bp.get_methods();

The available binding energy methods for PWB6K-D3BJ geometries are : ['tpssh-d3bj', 'tpssh', 'wpbe-d3bj', 'wpbe', 'pwb6k-d3bj', 'pwb6k']


We are finally ready to retrieve the binding energies!

In [18]:
df_be = bp.get_values('tpssh-d3bj', zpve=True)

100%|██████████| 10/10 [00:03<00:00,  2.62it/s]


Notice that the returned object is a pandas dataframe and that the user can select whether the values should be corrected for zero-point vibrational energy (ZPVE). Let's check out the resulting dataframe with the binding energies! The nomenclature of the entries is as follows: first the molecule name (e.g. ch3oh), then the ASW model (W22), then the number of the cluster, and finally the number of the binding site on that cluster.  

In [19]:
df_be

Unnamed: 0,TPSSH-D3BJ/def2-tzvp,TPSSH-D3BJ + ZPVE
ch3oh_W22_01_0001,-8.7963,-7.02989
ch3oh_W22_01_0002,-7.94529,-6.3333
ch3oh_W22_01_0003,-8.50918,-6.79487
ch3oh_W22_01_0005,-7.35282,-5.84833
ch3oh_W22_01_0006,-5.1027,-4.00649
...,...,...
ch3oh_W22_12_0024,-8.92082,-7.13182
ch3oh_W22_12_0026,-7.35545,-5.85048
ch3oh_W22_12_0027,-7.53145,-5.99455
ch3oh_W22_12_0028,-7.02799,-5.58243
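
The entry labels follow the nomenclature described above, so they can be split into their parts when you want to group sites by cluster. The following is a minimal sketch using only the standard library; `SiteLabel` and `parse_site_label` are hypothetical names, and the numbers are copied from three rows of the table above. It also prints the difference between the two columns, which under this reading is the ZPVE correction for each site.

```python
from typing import NamedTuple


class SiteLabel(NamedTuple):
    molecule: str   # e.g. 'ch3oh'
    ice_model: str  # e.g. 'W22'
    cluster: int    # number of the ASW cluster
    site: int       # number of the binding site on that cluster


def parse_site_label(label):
    """Split 'ch3oh_W22_01_0001' into its four components."""
    molecule, ice_model, cluster, site = label.rsplit("_", 3)
    return SiteLabel(molecule, ice_model, int(cluster), int(site))


# (binding energy, binding energy + ZPVE), copied from the table above.
rows = {
    "ch3oh_W22_01_0001": (-8.7963, -7.02989),
    "ch3oh_W22_01_0002": (-7.94529, -6.3333),
    "ch3oh_W22_12_0024": (-8.92082, -7.13182),
}

for label, (be, be_zpve) in rows.items():
    parsed = parse_site_label(label)
    correction = be_zpve - be  # positive: ZPVE weakens the binding
    print(parsed.cluster, parsed.site, round(correction, 5))
```

With the real dataframe, the same parser could be applied to `df_be.index` to add cluster and site columns.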


We can also access the xyz coordinates of all the different binding sites (up to 250 per species!). To do this we call the `get_molecules` method, which saves the molecule objects in a pandas dataframe.

In [20]:
df_mol = bp.get_molecules(be_method='tpssh-d3bj')

100%|██████████| 10/10 [00:02<00:00,  4.34it/s]


In [21]:
df_mol

Unnamed: 0,TPSSH-D3BJ/def2-tzvp,molecule
ch3oh_W22_01_0001,-8.7963,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_01_0002,-7.94529,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_01_0003,-8.50918,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_01_0005,-7.35282,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_01_0006,-5.1027,"Molecule(name='CH48O23', formula='CH48O23', ha..."
...,...,...
ch3oh_W22_12_0023,-4.0841,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_12_0024,-8.92082,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_12_0026,-7.35545,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_12_0027,-7.53145,"Molecule(name='CH48O23', formula='CH48O23', ha..."


Finally, from this dataframe we can use pandas to filter specific molecules. For example, if we want the coordinates of the binding sites that bind more strongly than 12 kcal/mol (that is, values below -12.0, since stronger binding is more negative), we can filter those out:

In [26]:
mol_high = df_mol[df_mol["TPSSH-D3BJ/def2-tzvp"] < -12.0]
mol_high

Unnamed: 0,TPSSH-D3BJ/def2-tzvp,molecule
ch3oh_W22_03_0036,-12.0508,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_04_0021,-12.6029,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_04_0027,-13.2289,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_04_0033,-12.3037,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_05_0014,-12.5846,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_05_0025,-12.5008,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_05_0027,-12.6998,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_06_0027,-12.382,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_09_0010,-12.0825,"Molecule(name='CH48O23', formula='CH48O23', ha..."
ch3oh_W22_10_0040,-13.0881,"Molecule(name='CH48O23', formula='CH48O23', ha..."
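
Because stronger binding corresponds to a more negative value, "strongest" means "minimum". The sketch below illustrates this sign convention in plain Python, using the ten energies copied from the filtered table above; the variable names are illustrative and not part of BEEP.

```python
# Binding energies copied from the filtered table above
# (same units and column as the dataframe).
energies = {
    "ch3oh_W22_03_0036": -12.0508,
    "ch3oh_W22_04_0021": -12.6029,
    "ch3oh_W22_04_0027": -13.2289,
    "ch3oh_W22_04_0033": -12.3037,
    "ch3oh_W22_05_0014": -12.5846,
    "ch3oh_W22_05_0025": -12.5008,
    "ch3oh_W22_05_0027": -12.6998,
    "ch3oh_W22_06_0027": -12.382,
    "ch3oh_W22_09_0010": -12.0825,
    "ch3oh_W22_10_0040": -13.0881,
}

threshold = -12.0

# Same condition as the dataframe filter: keep values below the threshold.
strong = {label: be for label, be in energies.items() if be < threshold}

# Rank from strongest (most negative) to weakest.
ranked = sorted(strong, key=strong.get)
strongest_label = ranked[0]

print(strongest_label, strong[strongest_label])
```

On the dataframe itself, `mol_high.sort_values("TPSSH-D3BJ/def2-tzvp")` should give the same ordering without leaving pandas.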


We can now choose one of these structures, extract its xyz coordinates, and use it, for example, as the initial structure for a reactivity study.

In [28]:
mol1 = mol_high['molecule'].iloc[0]  # positional access; the index holds string labels
mol1



NGLWidget()

We can see that in this binding mode the methanol molecule is well inserted into the ASW surface and acts as both a hydrogen-bond donor and acceptor, as expected for a high binding energy mode. From the molecule object you can extract the xyz coordinates and many other properties! 

In [32]:
print(mol1.to_string(dtype="xyz"))

72
0 1 CH48O23
O                     2.491643887740     2.767966357417     2.245585359080
H                     2.679107719259     3.294457279354     1.466797157884
H                     1.603238421264     3.012715098396     2.508937774701
O                    -4.430938873733    -0.781298501465    -0.352341363666
H                    -4.105756149252    -1.159562780278    -1.172222016809
H                    -4.479532989443    -1.498059407584     0.291186967565
O                    -1.497291846624     0.676195985643    -3.542687641246
H                    -1.883577996858     1.488920173191    -3.209922241261
H                    -2.128565984145    -0.024286132991    -3.363352474402
O                     2.929497886186    -2.477901436043     2.122958785379
H                     2.742544832386    -1.532450179413     2.099809536707
H                     3.344761039629    -2.666781361653     1.278304161358
O                     1.022018484045    -4.102497380736    -0.590678752314
H         
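
If the xyz text needs to be passed to another tool, it can be parsed with a few lines of standard-library Python. This is a minimal sketch, not part of BEEP or QCPortal: `parse_xyz` is a hypothetical helper that assumes the layout shown above (atom count, one comment line, then one `symbol x y z` line per atom). The sample uses the first three atoms of the output above, with the header edited to describe only those atoms, and the coordinates are assumed to be in Angstrom.

```python
import math


def parse_xyz(text):
    """Return (comment, atoms) from an xyz-formatted string.

    atoms is a list of (symbol, x, y, z) tuples.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    comment = lines[1].strip()
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return comment, atoms


# First water molecule of the structure above; header edited for 3 atoms.
sample = """3
0 1 H2O
O                     2.491643887740     2.767966357417     2.245585359080
H                     2.679107719259     3.294457279354     1.466797157884
H                     1.603238421264     3.012715098396     2.508937774701
"""

comment, atoms = parse_xyz(sample)

# Distance between the oxygen and the first hydrogen as a sanity check.
oh_distance = math.dist(atoms[0][1:], atoms[1][1:])
print(comment, len(atoms), round(oh_distance, 3))
```

The full string returned by `mol1.to_string(dtype="xyz")` could be passed to the same function, or written to a `.xyz` file unchanged.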

In this way every single binding site that was computed using BEEP can be easily accessed and used in future 
studies of interstellar surface chemistry. 