Dr Oliviero Andreussi, olivieroandreuss@boisestate.edu

Boise State University, Department of Chemistry and Biochemistry

# Molecular Modeling Tools for the Computational Thermochemistry Lab {-}

Before we start, let us import the main modules that we will need for this lecture. 

In [None]:
# @title Notebook Setup { display-mode: "form" }
# Import the main modules used in this worksheet
import numpy as np
import matplotlib.pyplot as plt
# Mount Google Drive to access your files (uncomment when running on Colab)
#from google.colab import drive
#drive.mount('/content/drive')
# base_path must point to the folder that contains all your data files in .csv format
#base_path = '/content/drive/MyDrive/'
base_path = './'

Set the local path where results and plots from this notebook will be saved, in case you want to keep them.

In [None]:
# @title Set Local Path { display-mode: "form" }
# The following needs to be the path of the folder with all your collected data in .csv format
local_path="Colab Notebooks/ParticleBox_Data/" # @param {type:"string"}
path = base_path+local_path
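
If the folder does not exist yet, saving a file into it will fail. The following is a minimal sketch of how the output folder could be created before writing to it. The helper name `make_output_dir` and the temporary demo folder are illustrative assumptions, not part of the original notebook.

```python
import os
import tempfile

def make_output_dir(base_path, local_path):
    """Join the two path pieces and create the folder if it is missing."""
    path = os.path.join(base_path, local_path)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path

# Demo in a throwaway folder; in the notebook you would pass base_path and local_path
demo_base = tempfile.mkdtemp()
demo_path = make_output_dir(demo_base, "Colab Notebooks/ParticleBox_Data/")

# Write a small file to confirm that the folder is usable
out_file = os.path.join(demo_path, "example.txt")
with open(out_file, "w") as handle:
    handle.write("test\n")
print("Saved", out_file)
```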

Most applications of quantum mechanics to chemistry (a.k.a. quantum chemistry) rely on a powerful commercial software package called Gaussian. This code was first developed by John Pople, a forefather of quantum chemistry and Nobel Prize winner. However, Gaussian is a Fortran 77 code that requires an expensive license to run. For our applications we can achieve the same results using Python-based codes, at the expense of some computing time. In the following we will use [PySCF](https://pyscf.org/index.html) for our quantum chemistry calculations, so we need to install it on our Colab instance.

In [None]:
# @title PySCF Setup { display-mode: "form" }
# Import the main components of PySCF used in this worksheet
!pip install pyscf
!pip install pyberny
!pip install pyscf\[geomopt\]
from pyscf import gto, scf, lo, tools
from pyscf.geomopt.berny_solver import optimize
#
from scipy.constants import physical_constants # we will need these for units conversion
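
Quantum chemistry codes such as PySCF report energies in Hartree and lengths in Bohr, while thermochemistry is usually discussed in kJ/mol and Å. As a minimal sketch of how `physical_constants` can be used for these conversions (the constant and function names below are illustrative choices, not part of the original notebook):

```python
from scipy.constants import physical_constants

# Each entry of physical_constants is a tuple: (value, unit, uncertainty)
hartree_in_joule = physical_constants['Hartree energy'][0]
avogadro = physical_constants['Avogadro constant'][0]

# Conversion factors built from CODATA values
HARTREE_TO_KJ_PER_MOL = hartree_in_joule * avogadro / 1000.0
HARTREE_TO_EV = physical_constants['Hartree energy in eV'][0]
BOHR_TO_ANGSTROM = physical_constants['Bohr radius'][0] * 1e10

def hartree_to_kj_per_mol(energy_hartree):
    """Convert an energy (or energy difference) from Hartree to kJ/mol."""
    return energy_hartree * HARTREE_TO_KJ_PER_MOL

print(f"1 Hartree = {HARTREE_TO_KJ_PER_MOL:.2f} kJ/mol = {HARTREE_TO_EV:.4f} eV")
print(f"1 Bohr    = {BOHR_TO_ANGSTROM:.6f} Angstrom")
```

Because one Hartree is a very large energy on the chemical scale, small differences between total energies translate into sizable reaction energies once converted to kJ/mol.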

## Visualize the Systems

The following modules need to be installed on Colab to generate and visualize the molecular systems that we will simulate.

In [None]:
# @title Install and load RDKit, CirPy, and Py3DMol { display-mode: "form" }
!pip install rdkit
from rdkit import Chem
from rdkit.Chem import Draw
!pip install cirpy
import cirpy
! pip install py3Dmol
import py3Dmol

In particular, we can use them to draw the molecules in our experiments. For some molecules you can simply type the common name and the structure will be drawn with RDKit; for most molecules you will need to provide the SMILES string or the CAS number. Luckily, CIRpy can usually find the SMILES for you, provided that you type the common name correctly or know the CAS number.

These are the CAS numbers for the molecules in the first part of the computational thermochemistry experiments:
* cas_list = ["106-98-9", "590-18-1", "624-64-6", "115-11-7"]
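
A mistyped CAS number will either fail to resolve or, worse, resolve to a different molecule. CAS numbers end with a check digit, so typos can often be caught before querying the resolver. The sketch below is an optional addition, not part of the original workflow, and the function name `cas_check_digit_ok` is an illustrative choice. The check digit is the weighted sum of the other digits, counted from the right with weights 1, 2, 3, ..., taken modulo 10.

```python
import re

def cas_check_digit_ok(cas):
    """Return True if a CAS number is well formed and its check digit matches."""
    # Expected layout: 2 to 7 digits, 2 digits, 1 check digit
    if not re.fullmatch(r"\d{2,7}-\d{2}-\d", cas):
        return False
    digits = cas.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    # Weights 1, 2, 3, ... are applied starting from the rightmost body digit
    total = sum(weight * int(d) for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check

cas_list = ["106-98-9", "590-18-1", "624-64-6", "115-11-7"]
for cas in cas_list:
    print(cas, "valid" if cas_check_digit_ok(cas) else "INVALID")
```

A valid check digit only shows that the number is internally consistent; it does not guarantee that it belongs to the molecule you intended.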


In [None]:
# @title Choose the molecule to draw { display-mode: "form" }
input = '590-18-1' # @param {type:"string"}
input_type = 'cas' # @param ["smiles", "name", "cas"] {allow-input: true}
if input_type != 'smiles' :
    smiles=cirpy.resolve( input, 'smiles')
else:
    smiles=input
img = Draw.MolToImage( Chem.MolFromSmiles(smiles), size=(300, 300) )
display(img)

Let's first go through the main steps of a QC calculation on a molecule. Before we run any simulation, we need an initial guess for the positions of the atoms in our molecule. Luckily, we can use `cirpy` to convert our molecule into the `xyz` format, which contains the number of atoms, a comment line, and then the element and Cartesian coordinates of each atom in the molecule.

In [None]:
print(cirpy.resolve(input,'xyz'))

For our calculation we only need the atomic information (elements and coordinates), so we will use some `str` and `list` methods to remove the first two lines of the `xyz` format.

In [None]:
xyz = ''.join(string+'\n' for string in cirpy.resolve(input,'xyz').split('\n')[2:])
print(xyz)
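
To make the structure of the `xyz` format explicit, here is a small sketch that parses it by hand, without any network access. The function name `parse_xyz` and the water coordinates are illustrative assumptions (the coordinates are approximate and were not obtained from CIRpy). The result is a list of element/coordinate pairs, a form that PySCF's `atom` argument is documented to accept as an alternative to a string.

```python
def parse_xyz(xyz_text):
    """Split xyz-format text into a list of (element, (x, y, z)) tuples."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])           # line 1: number of atoms
    atom_lines = lines[2:2 + n_atoms]  # line 2 is a comment, atoms follow
    if len(atom_lines) != n_atoms:
        raise ValueError("xyz text has fewer atom lines than declared")
    atoms = []
    for line in atom_lines:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms

# Illustrative input: approximate water geometry in Angstrom
sample_xyz = """3
water (illustrative coordinates)
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""

atoms = parse_xyz(sample_xyz)
for symbol, coords in atoms:
    print(symbol, coords)
```

Checking the declared number of atoms against the lines actually present is a cheap way to detect a truncated or failed download.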

In [None]:
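# py3Dmol expects the full xyz text (with the two header lines),
# so we keep it in a separate variable and leave the stripped `xyz` untouched
xyz_full = cirpy.resolve(input,'xyz')
view = py3Dmol.view(width=400, height=300)
view.addModel(xyz_full, 'xyz')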
view.setStyle({'stick': {}})
view.zoomTo()
view.show()

## Model Chemistry

We can now create a `Mole` object in PySCF and set up the QC method to use. Part of the accuracy of your calculation will depend on the basis set adopted. The larger the basis set is, the more expensive and (hopefully) more accurate the calculation will be. The available basis sets are listed [here](https://pyscf.org/_modules/pyscf/gto/basis.html). Common choices for small organic molecules include: `631g`, `631+g*`, `6311g`, and `6311++g**`.
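
To get a feeling for how quickly the cost grows, the sketch below counts the contracted basis functions for a C4H8 molecule (the formula shared by all four butene isomers listed above) in each of these basis sets. The per-atom counts are hard-coded and assume spherical d shells (5 functions each), which is PySCF's default. The $N^4$ estimate is the formal scaling of Hartree-Fock with the number of basis functions $N$, so it should be read as a rough upper bound rather than an actual timing. The names `BASIS_FUNCTIONS` and `count_basis_functions` are illustrative.

```python
# Contracted basis functions per atom, assuming spherical d shells (5 functions)
BASIS_FUNCTIONS = {
    '631g':      {'C': 9,  'H': 2},   # C: 1 core + 2x(s+p) valence; H: 2 s
    '631+g*':    {'C': 18, 'H': 2},   # adds diffuse s,p and one d shell on C
    '6311g':     {'C': 13, 'H': 3},   # C: 1 core + 3x(s+p) valence; H: 3 s
    '6311++g**': {'C': 22, 'H': 7},   # diffuse on all atoms, d on C, p on H
}

def count_basis_functions(basis, composition):
    """Total number of basis functions for a molecule given as {element: count}."""
    per_atom = BASIS_FUNCTIONS[basis]
    return sum(per_atom[element] * n for element, n in composition.items())

butene = {'C': 4, 'H': 8}  # all four butene isomers share this formula
reference = count_basis_functions('631g', butene)

for basis in BASIS_FUNCTIONS:
    n = count_basis_functions(basis, butene)
    ratio = (n / reference) ** 4
    print(f"{basis:10s} N = {n:3d}   relative N^4 cost ~ {ratio:.1f}")
```

Going from the smallest to the largest of these basis sets almost triples the number of basis functions, which is why it is common to start with a small basis set and move to a larger one only once the workflow runs correctly.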