We now move on from using Python to do basic tasks, to see how these tools can make running and analyzing computational chemistry calculations easier!

To do this we will use pymatgen to read Gaussian output files and set up new calculations!

In [None]:
from pymatgen.io.gaussian import GaussianInput, GaussianOutput
from pymatgen.core.structure import Molecule

import numpy as np
import matplotlib.pyplot as plt

To start our analysis, let's turn to the dithiophene example from Section 2, where the initial relaxation led to a saddle point instead of a minimum, and see what kind of results we can get.

In [None]:
dithiophne_ouput = GaussianOutput("dithiophene.log")

print("properly_terminated")
print(dithiophne_ouput.properly_terminated, "\n")

print("stationary_type")
print(dithiophne_ouput.stationary_type, "\n")

print("1st vibrational frequency")
print(dithiophne_ouput.frequencies[0][0], "\n")

print("energies")
print(dithiophne_ouput.energies, "\n")

print("eigenvalues")
print(dithiophne_ouput.eigenvalues, "\n")

print("Mulliken_charges")
print(dithiophne_ouput.Mulliken_charges, "\n")

While this is only a subset of the information we can get out of the output file, the thing that stands out right away is that the calculation did not reach a minimum, but a saddle point.

To see all outputs accessible through the `GaussianOutput` object, refer to the API: https://pymatgen.org/pymatgen.io.html#pymatgen.io.gaussian.GaussianOutput

Looking at this first vibrational mode, we see a new Python data type indicated by {}. This is a dictionary, and it allows us to map one object to another in a {key: value} manner, where the value can be anything and the key is normally a string or numeric value. As an example, let's make a new dictionary `dct`.

In [None]:
dct = {1: "hi", "a": True, "c": np.arange(10)}
print(dct[1])
print(dct["a"])
print(dct["c"])
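
Each vibrational mode that pymatgen returns is a dictionary of this kind. The sketch below uses a hand-made dictionary with made-up values; the keys `frequency`, `IR_intensity`, and `mode` are the ones used later in this notebook, while `symmetry` is only queried to show what happens when a key is missing.

```python
# A hand-made dictionary shaped like one vibrational mode; the values are illustrative only
vib = {
    "frequency": -35.2,  # cm^-1; a negative value is how an imaginary mode is reported
    "IR_intensity": 0.8,
    "mode": [0.0, 0.0, 0.1, 0.0, 0.0, -0.1],  # flat list of x, y, z displacements
}

# Look up a value by its key
print(vib["frequency"])

# .get returns a default instead of raising a KeyError when the key is missing
missing = vib.get("symmetry", "not stored")
print(missing)

# Loop over all key/value pairs
for key, value in vib.items():
    print(key, value)

# Check whether this mode is imaginary
is_imaginary = vib["frequency"] < 0
print(is_imaginary)
```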


We can now use this vibrational mode information to distort the structure as we did in Gaussian, but through the Python interface and in less time.

In [None]:
# Get the final structure
dithiophene = dithiophne_ouput.final_structure

# Get the normal mode of interest (first vibrational mode) and make it a [N x 3] matrix instead of a [N * 3] list
mode = np.array(dithiophne_ouput.frequencies[0][0]["mode"]).reshape((-1, 3))

# Get the displaced coordinates (this builds a new array and leaves the original untouched)
coords = dithiophene.cart_coords + mode

# Create a new Molecule with the updated coordinates
displaced_dithiophene = Molecule(
    species=dithiophene.species,
    coords=coords,
    charge=dithiophene.charge,
    spin_multiplicity=dithiophene.spin_multiplicity,
)

# Create the updated relaxation input file
file_prefix = "dithiophene_disp"
inputs = GaussianInput(
    displaced_dithiophene,
    title=file_prefix,
    functional=dithiophne_ouput.functional,
    basis_set=dithiophne_ouput.basis_set,
    dieze_tag = "#",
    route_parameters = {"opt": None, "freq": None},
    link0_parameters = dithiophne_ouput.link0,
)

# Write the input file
inputs.write_file(f"{file_prefix}.gjf")
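
The reshape and displacement steps can be tried in isolation. This is a minimal sketch on a hypothetical two-atom system with made-up numbers; the `scale` factor is an assumption added for illustration (it is not a pymatgen parameter), and a value of 1.0 corresponds to the full displacement used in the cell above.

```python
import numpy as np

# Hypothetical 2-atom example: coordinates in Angstrom as an N x 3 array
coords = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.1]])

# A mode stored as a flat list of 3N numbers: x1, y1, z1, x2, y2, z2
flat_mode = [0.0, 0.0, -0.5, 0.0, 0.0, 0.5]

# -1 lets NumPy infer the number of rows (atoms) from the length of the list
mode = np.array(flat_mode).reshape((-1, 3))

# Assumed knob for the size of the displacement; 1.0 adds the full mode vector
scale = 0.2
displaced = coords + scale * mode  # builds a new array, coords is left untouched

print(mode.shape)
print(displaced)
```

A smaller displacement keeps the starting geometry closer to the saddle point, while a larger one moves further along the mode; which works best depends on the system.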

We see that the route line is split into four parts: `functional`, `basis_set`, `dieze_tag`, and `route_parameters`, with all preamble (Link 0) settings given in `link0_parameters` (see below). The `dieze_tag` sets the output level (`#P`: verbose, `#N`: normal, `#T`: terse) and must be one of these, or just `#` for normal. `functional` and `basis_set` are combined into Gaussian's "{functional}/{basis_set}" form, and `route_parameters` holds all other keywords as a dictionary. When we only need a keyword flag, like `opt` or `freq`, the value in the dict should be `None`. In Python, `None` is an object that represents the absence of a value.

We can now see what the file will look like by printing the `inputs` object. In this case the geometry is written in internal (Z-matrix) coordinates; to write Cartesian coordinates instead, pass `cart_coords=True` to the `write_file` function.
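
To make the role of `None` concrete, here is a rough sketch of how such pieces could be joined into a single route line. `build_route_line` is a hypothetical helper written for this illustration and is not pymatgen's implementation; the B3LYP/6-31G(d) level of theory is likewise just an example.

```python
def build_route_line(functional, basis_set, route_parameters, dieze_tag="#"):
    """Hypothetical helper: join the pieces of a Gaussian route line into one string."""
    parts = [f"{dieze_tag} {functional}/{basis_set}"]
    for keyword, value in route_parameters.items():
        if value is None:
            # Bare keyword flag such as opt or freq
            parts.append(keyword)
        else:
            # Keyword with options, written as keyword=value
            parts.append(f"{keyword}={value}")
    return " ".join(parts)


route = build_route_line("B3LYP", "6-31G(d)", {"opt": None, "freq": None}, dieze_tag="#P")
print(route)

route_td = build_route_line("B3LYP", "6-31G(d)", {"td": "(singlets, nstates=5)"})
print(route_td)
```

Note that the check uses `value is None` rather than `value == None`; `is` is the idiomatic way to test for `None` in Python.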

In [None]:
print(inputs.link0_parameters)
print(inputs)
inputs.write_file(f"{file_prefix}.gjf", cart_coords=True)

We now need to run the calculation. While it is possible to launch it from Python with a subprocess call, it is easier to run it by hand for now.
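
For reference, a subprocess call could look roughly like the sketch below. It assumes the Gaussian executable is called `g16` and is on the PATH, which may not match your installation, and on a cluster you would normally submit the job through the scheduler instead. `run_gaussian` is a hypothetical helper, not part of pymatgen.

```python
import os
import shutil
import subprocess


def run_gaussian(input_file, output_file, executable="g16"):
    """Hypothetical helper: run Gaussian on input_file and write the log to output_file.

    Returns the exit code, or None if the executable or the input file cannot be found.
    """
    if shutil.which(executable) is None:
        print(f"{executable} not found on PATH; run the job by hand instead")
        return None
    if not os.path.exists(input_file):
        print(f"{input_file} does not exist")
        return None
    # Equivalent to the shell command: g16 < input_file > output_file
    with open(input_file) as fin, open(output_file, "w") as fout:
        result = subprocess.run([executable], stdin=fin, stdout=fout, check=False)
    return result.returncode


status = run_gaussian("dithiophene_disp.gjf", "dithiophene_disp.log")
print(status)
```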

Once done let's look at the outputs, in particular if we are at a minimum or not.

**Problem 1**

Load the output file into python and determine if the calculation is at a minimum.

In [None]:
# Your code goes here
output_displaced = GaussianOutput(f"{file_prefix}.log")
print(f"properly_terminated: {output_displaced.properly_terminated}")
print(f"stationary_type: {output_displaced.stationary_type}")

Now that we know our structure is fully relaxed, we can plot the relaxation trajectory of the structure.

**Problem 2**

Plot the total energy of each step of the relaxation for the displaced relaxation structure

In [None]:
# Your code here
# len() gives the length (size) of a list/array
plt.plot(np.arange(1, len(output_displaced.energies) + 1), output_displaced.energies)
plt.show()

Note in the solution how matplotlib automatically sets the scale to be shifted down by 1.1406e3 Ha. This highlights the minute energy differences between the starting and final geometries of the structure.

Now that we know we are at a minimum, let's plot the IR spectrum of dithiophene. To do this we will take the stick spectrum generated by Gaussian and artificially broaden it with a 50 cm$^{-1}$ wide Lorentzian to mimic gas-phase molecular spectroscopy.

As a reminder, the Lorentzian with peak position $x_0$ and full width at half maximum $\Gamma$ is:
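
An alternative to relying on the automatic offset is to plot energies relative to the final step. The sketch below uses made-up energies (not from a real calculation) and converts the differences to kcal/mol; in the notebook you would use `output_displaced.energies` instead of the illustrative list.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # 1 Ha expressed in kcal/mol

# Illustrative energies in Hartree, not taken from a real calculation
energies_ha = [-1140.6010, -1140.6032, -1140.6041, -1140.6043]

# Energy of each step relative to the last (relaxed) step
relative = [(e - energies_ha[-1]) * HARTREE_TO_KCAL_MOL for e in energies_ha]

for step, e_rel in enumerate(relative, start=1):
    print(f"step {step}: {e_rel:.3f} kcal/mol")
```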

$f\left(x\right)=\frac{1}{\pi} \frac{\frac{1}{2} \Gamma}{\left(x-x_0\right)^2 + \left(\frac{1}{2} \Gamma\right)^2}$

In [None]:
def lorentzian(x, x0, Gamma):
    """Calculate the probability density function (PDF) of a Lorentzian for a given set of x-points

    Args:
        x (np.array[float]): The set of x-points to get the PDF for
        x0 (float): The peak location
        Gamma (float): The full width at half maximum of the peak

    Returns:
        np.array[float]: The values of the PDF for the specified Lorentzian
    """
    return 0.5 * Gamma / (np.pi * ((x - x0)**2.0 + (0.5 * Gamma)**2.0))

wavenumbers = np.arange(450, 4250.01, 0.1)
ir = np.zeros(wavenumbers.shape)

# Here is a for loop: a structure that loops over a series of values and performs an action on each one
for vib in output_displaced.frequencies[0]:
    ir += lorentzian(wavenumbers, vib["frequency"], 50.0) * vib["IR_intensity"]

plt.plot(wavenumbers, ir)
plt.xlabel("Wavenumber (cm$^{-1}$)")
plt.ylabel("I. R. Absorption (A. U.)")
plt.show()
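
As a quick sanity check of the broadening function, the sketch below verifies numerically that the Lorentzian drops to half its peak height at a distance of $\Gamma / 2$ from the center, and that its area is close to 1. The peak position and grid are arbitrary choices made for this illustration.

```python
import numpy as np


def lorentzian(x, x0, Gamma):
    """Normalized Lorentzian with peak position x0 and full width at half maximum Gamma."""
    return 0.5 * Gamma / (np.pi * ((x - x0) ** 2.0 + (0.5 * Gamma) ** 2.0))


x0, Gamma = 1000.0, 50.0  # illustrative peak position and width in cm^-1

# Height at the peak and half a width away from it
peak = lorentzian(x0, x0, Gamma)
half = lorentzian(x0 + 0.5 * Gamma, x0, Gamma)
ratio = half / peak
print(ratio)

# Approximate the area with a simple sum on a fine grid
step = 0.1
x = np.arange(x0 - 5000.0, x0 + 5000.0, step)
area = np.sum(lorentzian(x, x0, Gamma)) * step
print(area)
```

The area comes out slightly below 1 because the long tails of the Lorentzian are cut off by the finite grid.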

In the above code, we encountered our first for loop. This is a construct in coding that allows us to loop over a series of values (the vibrational modes in this case) and do the same operation on each one. Let's play around with them as an aside.

In [None]:
# Looping over a list
lst = [1, "a", 12.3]
for el in lst:
    print(el)

We can also loop over a range of numbers to do some consistent numerical operation on them. Note what `x += y` is doing here as a shorthand for `x = x + y`

In [None]:
# Looping over an implicit value with the range function
sum_explicit = 0
sum_implicit = 0
for num in range(1, 10):
    sum_explicit = sum_explicit + num
    sum_implicit += num
print(f"The sum of all numbers from 1 to 9 is {sum_explicit}")
print(f"The sum of all numbers from 1 to 9 is {sum_implicit}")
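
For a simple sum like this, Python's built-in `sum` gives the same result without an explicit loop. A short sketch for comparison:

```python
# range(1, 10) yields 1, 2, ..., 9; the stop value 10 is not included
numbers = list(range(1, 10))
print(numbers)

# The built-in sum adds up all elements of an iterable
total = sum(range(1, 10))
print(f"The sum of all numbers from 1 to 9 is {total}")
```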

If we want to access both the index and the element of a list, we can use `enumerate`.

In [None]:
# Looping over an index and element using enumerate
for ind, el in enumerate(lst):
    print(f"The {ind + 1}th element of lst is {el}")

Finally, we can loop over multiple lists at once using the `zip` function.

In [None]:
ordinal = ["1st", "2nd", "3rd"]
for ind, el in zip(ordinal, lst):
    print(f"The {ind} element of lst is {el}")

But if the lists are not the same size, it will stop at the end of the shorter list.

In [None]:
ordinal = ["1st", "2nd", "3rd", "4th"]
for ind, el in zip(ordinal, lst):
    print(f"The {ind} element of lst is {el}")

print()

ordinal = ["1st", "2nd"]
for ind, el in zip(ordinal, lst):
    print(f"The {ind} element of lst is {el}")
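
If silently dropping elements is not what you want, the standard library offers `itertools.zip_longest`, which pads the shorter list instead. A small sketch, where the `"?"` fill value is an arbitrary choice for this example:

```python
from itertools import zip_longest

lst = [1, "a", 12.3]
ordinal = ["1st", "2nd"]

# zip stops at the end of the shorter list
pairs_short = list(zip(ordinal, lst))
print(pairs_short)

# zip_longest keeps going and fills the gaps with fillvalue
pairs_long = list(zip_longest(ordinal, lst, fillvalue="?"))
print(pairs_long)
```

In Python 3.10 and later, `zip(ordinal, lst, strict=True)` raises a `ValueError` instead when the lengths differ.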

With the relaxed structure we can now calculate the absorption spectrum of dithiophene.

**Problem 3**

Use the final structure and the same level of theory to do a TD-DFT calculation and retrieve the outputs.

In [None]:
# Your code goes here
file_prefix = "dithiophene_tddft"
inputs_td = GaussianInput(
    output_displaced.final_structure,
    title=file_prefix,
    functional=output_displaced.functional,
    basis_set=output_displaced.basis_set,
    dieze_tag="#",
    route_parameters = {"td": "(singlets, nstates=5)"},
    link0_parameters = dithiophne_ouput.link0,
)

filename = f"{file_prefix}.gjf"
inputs_td.write_file(filename)

Run the calculation on the cluster and load the outputs.

In [None]:
outputs_td = GaussianOutput(f"{file_prefix}.log")

We can now read the absorption spectrum using the output's `read_excitation_energies` function.

This returns all transitions, each in the format (energy in eV, wavelength in nm, oscillator strength). Each transition is stored as a tuple (https://realpython.com/python-lists-tuples/#python-tuples), which is like a list but immutable. An immutable object is one that cannot be changed once created.

In [None]:
transitions = outputs_td.read_excitation_energies()

print(transitions)

We can see that the elements of a tuple can't be changed, as trying to do so will throw an error.

In [None]:
transitions[0][0] = 1
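
The same behavior can be seen without a Gaussian output file. The sketch below uses a made-up transition tuple, catches the resulting `TypeError`, and shows the usual workaround of building a new tuple instead of modifying the old one.

```python
# Illustrative transition: (energy in eV, wavelength in nm, oscillator strength)
transition = (4.1, 302.4, 0.35)

caught = False
try:
    transition[0] = 1  # tuples do not support item assignment
except TypeError as err:
    caught = True
    print(f"TypeError: {err}")

# Unpack the tuple into separate names
energy, wavelength, strength = transition

# To "change" a value, build a new tuple from the old pieces
rescaled = (energy, wavelength, 2 * strength)
print(rescaled)
```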

**Problem 4**

Plot the absorption spectrum of dithiophene using a peak width of 0.25 eV.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

c = 299792458.0  # speed of light in m/s
h = 4.135667696e-15  # Planck constant in eV s

# Set an arbitrary broadening width (in eV)
sigma = 0.25

def gaussian(x, mu, sigma):
    """Calculate the probability density function (PDF) of a Gaussian for a given set of x-points

    Args:
        x (np.array[float]): The set of x-points to get the PDF for
        mu (float): The mean value of the distribution (center point)
        sigma (float): The standard deviation of the distribution (width)

    Returns:
        np.array[float]: The values of the PDF for the specified Gaussian
    """
    norm_fact = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    return norm_fact * np.exp(-0.5 * ((x - mu) / sigma) ** 2.0)

# Get the wavelength and energy ranges in (nm and eV)
wavelengths = np.arange(150, 400, 0.01)
energy = h * c / (wavelengths * 1e-9)

# Initialize absorption spectrum (named absorption to avoid shadowing Python's built-in abs)
absorption = np.zeros(energy.shape)
for trans in transitions:
    absorption += trans[2] * gaussian(energy, trans[0], sigma)

plt.plot(wavelengths, absorption)
plt.xlabel("Wavelength (nm)")
plt.ylabel("Absorption (A. U.)")
plt.show()

We can now even see how these peaks are composed of different excited-state transitions by adding the stick spectrum on top using the `stem` function in matplotlib.

In [None]:
plt.plot(wavelengths, absorption)

ex_lambda = [trans[1] for trans in transitions]
ex_abs = [trans[2] for trans in transitions]

# Complicated matplotlib formatting things we don't need to worry about
(markerline, stemlines, baseline) = plt.stem(ex_lambda, ex_abs, markerfmt="", basefmt="")
plt.setp(baseline, visible=False)

plt.xlabel("Wavelength (nm)")
plt.ylabel("Absorption (A. U.)")
plt.show()
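
The conversion between wavelength and energy used in Problem 4 follows from $E = hc / \lambda$. As a small worked sketch, the helpers below (hypothetical names, written for this illustration) convert between nm and eV and show the handy rule of thumb that $hc$ is about 1239.84 eV nm.

```python
c = 299792458.0  # speed of light in m/s
h = 4.135667696e-15  # Planck constant in eV s


def nm_to_ev(wavelength_nm):
    """Convert a wavelength in nm to a photon energy in eV."""
    return h * c / (wavelength_nm * 1e-9)


def ev_to_nm(energy_ev):
    """Convert a photon energy in eV to a wavelength in nm."""
    return h * c / (energy_ev * 1e-9)


hc_ev_nm = h * c * 1e9  # hc expressed in eV nm
print(hc_ev_nm)

# A 300 nm photon carries a bit more than 4 eV
print(nm_to_ev(300.0))

# Converting there and back recovers the starting wavelength
print(ev_to_nm(nm_to_ev(250.0)))
```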