# Assignment 3: Normal Modes Analysis

One of the general approaches in quantum chemistry is to change the representation in which we describe a system, in order to simplify the equations we have to solve. Normal mode analysis is one example: the system is not described in terms of its usual Cartesian coordinates, but in terms of mass-weighted displacement coordinates. In this representation the complicated Hamiltonian simplifies to a sum of independent harmonic oscillators (exact up to second order).
Besides this feature, normal modes can be used to understand the characteristic motion of a molecule by decomposing it into the motion of independent normal modes. This is typically used to interpret IR or Raman spectra.
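
As a sketch of this simplification (the symbols $Q_i$, $P_i$ and $V_0$ are introduced here for illustration and are not part of the original assignment text): writing $Q_i$ for the mass-weighted normal coordinates and $P_i$ for their conjugate momenta, the nuclear Hamiltonian truncated at second order around a minimum with energy $V_0$ takes the form

\begin{align}
 H \approx V_0 + \sum_i \left( \frac{1}{2} P_i^2 + \frac{1}{2} \omega_i^2 Q_i^2 \right), \qquad \omega_i^2 = \lambda_i
\end{align}

where $\lambda_i$ are the eigenvalues of the mass-weighted Hessian in atomic units. Each term of the sum is an independent harmonic oscillator.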

In this assignment, you will write your own code to transform from Cartesian coordinates to normal mode coordinates and compute the IR frequencies of the system.
To do so, you have to read in a `hessian.dat` file, which contains the Hessian of the given molecular system in the format:

--------------------------
$F_{x1,x1}$ $F_{x1,y1}$ $F_{x1,z1}$ <br>
$F_{x1,x2}$ $F_{x1,y2}$ $F_{x1,z2}$ <br>
... <br>
$F_{x2,x1}$ $F_{x2,y1}$ $F_{x2,z1}$ <br>
... <br>

--------------------------

Afterwards, the Hessian is mass-weighted, where $m_I$ is the mass of the atom to which coordinate $I$ belongs:

\begin{align}
 F^M_{IJ} = \frac{F_{IJ}}{\sqrt{m_I m_J}}
\end{align}

Diagonalize the mass-weighted Hessian and compute the harmonic frequencies from its eigenvalues $\lambda_i$:

\begin{align}
 \omega_i = \text{constant} \cdot \sqrt{\lambda_i}
\end{align}
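
To make the two formulas above concrete, here is a minimal sketch on a toy system rather than a real molecule: two masses on a line joined by a spring. The force constant `k` and the masses are made-up values in atomic units. The mass-weighted Hessian of this model has one zero eigenvalue (the free translation) and one eigenvalue equal to $k/\mu$, with $\mu$ the reduced mass.

```python
import numpy as np

# Hypothetical 1D diatomic in atomic units (values chosen for illustration only)
k = 0.5                               # assumed force constant
m = np.array([1837.15, 21874.66])     # roughly the H and C masses in electron masses

# Cartesian Hessian of V = k/2 * (x2 - x1)**2
hess = np.array([[ k, -k],
                 [-k,  k]])

# Mass-weighting: F^M_IJ = F_IJ / sqrt(m_I * m_J)
hess_mw = hess / np.sqrt(np.outer(m, m))

# eigh returns the eigenvalues in ascending order and the eigenvectors as columns
eigenvalues, modes = np.linalg.eigh(hess_mw)

mu = m[0] * m[1] / (m[0] + m[1])      # reduced mass
print(eigenvalues)                    # one value near zero, one equal to k / mu
print(np.sqrt(eigenvalues[1]), np.sqrt(k / mu))
```

The columns of `modes` are the normal modes in mass-weighted coordinates. The same steps apply to the $3N \times 3N$ Hessian of a molecule.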

This assignment was taken from:

https://github.com/CrawfordGroup/ProgrammingProjects/blob/master/Project%2302/project2-instructions.pdf

In [None]:
import os
import json
import numpy as np
# Get our data from GitHub (only if the repository is not already present)
if not os.path.exists("../../python_lessons/"):
  !git clone https://github.com/MFSJMenger/python_lessons.git
  # Move to the folder of assignment 3
  os.chdir("python_lessons/assignment3/")

Import the database of atomic masses. Try to inspect its content. It will be useful to create a dictionary to associate the mass to the atomic symbol.

In [None]:
with open('MASSES.json', 'r') as fh:             # open 'MASSES.json' in read mode; 'with' closes the file again afterwards
  MASS_Z = json.load(fh)                         # load its contents into the dictionary 'MASS_Z', keyed by atomic number (as a string)

MASS = {                                         # new dictionary 'MASS' mapping lowercase atomic symbols to the masses stored in 'MASS_Z'
    'h' : MASS_Z['1'],                           # look up the mass stored for atomic number 1
    'c' : MASS_Z['6'],
    'o' : MASS_Z['8'],
    'cl': MASS_Z['17']
}

Write a general function to read the number of atoms, the masses, and the molecular coordinates from an xyz file.
The xyz file has the following structure:

> Number of atoms

> Text with information on the molecule

> Symbol$_1$   $x_1$   $y_1$   $z_1$

> Symbol$_2$   $x_2$   $y_2$   $z_2$

> ...



In [None]:
def read_xyz(xyzfile):                # takes the file name of an xyz file
  with open(xyzfile, 'r') as fh:      # open the file in read mode as 'fh'; 'with' closes it again when the block ends
    xyz = fh.readlines()              # read all lines into a list of strings

  # Read the number of atoms
  nAtoms = int(xyz[0])                 # the first line holds the number of atoms in the molecule
  masses = np.zeros(nAtoms)            # array of zeros that will hold the mass of each atom
  geom   = np.zeros((nAtoms,3))        # (nAtoms, 3) array of zeros that will hold the atomic coordinates

  for i, line in enumerate(xyz[2:2+nAtoms]):          # loop over the nAtoms atom lines, which start at the third line; trailing blank lines are ignored
    line_split = line.split()                         # split on whitespace into the atom symbol and its coordinates
    masses[i] = MASS[line_split[0].lower()]           # look up the mass for the (lowercased) atom symbol in the 'MASS' dictionary
    geom[i] = [float(x) for x in line_split[1:4]]     # convert the three coordinates to floats and store them in row 'i'
  return nAtoms, masses, geom                         # return the number of atoms, the array of masses and the array of coordinates

print(read_xyz('benzene.xyz'))                        # reads the benzene.xyz file and returns the values. This demonstrates the use of the defined function.

(12, array([21874.66181995, 21874.66181995, 21874.66181995, 21874.66181995,
       21874.66181995, 21874.66181995,  1837.15258739,  1837.15258739,
        1837.15258739,  1837.15258739,  1837.15258739,  1837.15258739]), array([[ 0.        ,  2.62065942,  0.        ],
       [-2.26955763,  1.31032971,  0.        ],
       [-2.26955763, -1.31032971,  0.        ],
       [ 0.        , -2.62065942,  0.        ],
       [ 2.26955763, -1.31032971,  0.        ],
       [ 2.26955763,  1.31032971,  0.        ],
       [-4.04130651,  2.3332494 ,  0.        ],
       [-4.04130651, -2.3332494 ,  0.        ],
       [ 0.        , -4.6664988 ,  0.        ],
       [ 4.04130651, -2.3332494 ,  0.        ],
       [ 4.04130651,  2.3332494 ,  0.        ],
       [ 0.        ,  4.6664988 ,  0.        ]]))
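
The masses printed above are not in atomic mass units. A quick check suggests they are in atomic units (electron masses). The conversion factor below is an assumption brought in from outside the assignment, not something read from `MASSES.json`.

```python
# Assumed conversion factor: 1 atomic mass unit (u) expressed in electron masses
AMU_TO_ME = 1822.888486

mass_c_au = 21874.66181995   # carbon mass as printed above
mass_h_au = 1837.15258739    # hydrogen mass as printed above

mass_c_u = mass_c_au / AMU_TO_ME
mass_h_u = mass_h_au / AMU_TO_ME
print(f"C: {mass_c_u:.4f} u, H: {mass_h_u:.4f} u")
```

Values close to 12 u and 1.008 u are what one would expect for carbon and hydrogen, which matters later because the unit conversion of the frequencies assumes that masses and Hessian are both in atomic units.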


Write a general function to read the Hessian matrix.

**Hint:** Use the numpy function `loadtxt` to convert a list of strings to an array, and the function `reshape` to reorder the elements of the array into a matrix with the desired dimensions.
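
The two functions from the hint can be tried on made-up data first. The sketch below assumes, as the solution in this notebook does, that the first line holds the number of atoms and that every following line holds three values. The numbers are placeholders, not a real Hessian.

```python
import numpy as np

n_atoms = 2
values = np.arange(36.0)   # made-up numbers standing in for a 6 x 6 Hessian

# Build the lines of a hypothetical file: atom count first, then three values per line
lines = [f"{n_atoms}\n"] + [
    " ".join(str(v) for v in values[i:i + 3]) + "\n" for i in range(0, 36, 3)
]

parsed_n = int(lines[0])
flat = np.loadtxt(lines[1:])                       # loadtxt accepts a list of strings
demo = flat.reshape(3 * parsed_n, 3 * parsed_n)    # row-major reshape into 3N x 3N

print(flat.shape, demo.shape)
print(demo[0])                                     # first Hessian row: the first six values
```

Because `reshape` fills rows in order, two consecutive lines of three values end up in the same row of the $6 \times 6$ matrix.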

In [None]:
def read_hessian(hessfile):                                              # takes the file name of a Hessian file
  with open(hessfile, 'r') as hs:                                        # open the Hessian file in read mode
    hessian = hs.readlines()                                             # read all lines into the list of strings 'hessian'
  nAtoms = int(hessian[0])                                               # the first line of the file holds the number of atoms
  shaped_hessian = np.loadtxt(hessian[1:]).reshape(3*nAtoms, 3*nAtoms)   # parse every line after the first and reshape the values into a 3N x 3N square matrix
  return shaped_hessian

print(read_hessian('benzene_hessian.dat'))                               # demonstration for the function

[[ 0.83704759  0.          0.         ... -0.08149339  0.
   0.        ]
 [ 0.          1.02782495  0.         ...  0.         -0.47489408
   0.        ]
 [ 0.          0.          0.1884039  ...  0.          0.
  -0.05333218]
 ...
 [-0.08149339  0.          0.         ...  0.07696693  0.
   0.        ]
 [ 0.         -0.47489408  0.         ...  0.          0.48721616
   0.        ]
 [ 0.          0.         -0.05333218 ...  0.          0.
   0.03685694]]


Write a general function to construct the mass-weighted Hessian by combining the Cartesian Hessian with the array of masses.

Notice that the Hessian has dimensions $3N \times 3N$, while the array of masses has dimension $N$, so it should be "triplicated". To this end, it might be useful to check what the numpy functions `column_stack` and `flatten` do...


In [None]:
def build_mw_hessian(hess, mass):                             # build the mass-weighted Hessian from a Cartesian Hessian and an array of masses
  mass3 = np.repeat(mass,3)                                   # repeat each mass three times (once per x, y, z), e.g. [1,2,3] becomes [1,1,1,2,2,2,3,3,3]
  n3 = len(mass3)                                             # length of 'mass3', i.e. 3N, the number of rows (and columns) of the Hessian
  hess_mw = np.zeros_like(hess)                               # new array with the same shape as the Hessian, filled with zeros
  for n in range(n3):                                         # nested loop over every element of the matrix
    for m in range(n3):
      hess_mw[n,m] = hess[n,m]/np.sqrt(mass3[n]*mass3[m])     # divide each element by the square root of the product of the masses belonging to coordinates 'n' and 'm'
  return hess_mw                                              # return the mass-weighted Hessian matrix
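
The double loop can also be written without explicit Python loops. The following is one possible alternative, not the only way to do it; the function name `build_mw_hessian_vectorized` and the test data are made up for this sketch. It uses `np.outer` to form all products $m_I m_J$ at once and checks the result against the loop.

```python
import numpy as np

def build_mw_hessian_vectorized(hess, mass):
    """Mass-weight a (3N, 3N) Hessian without explicit Python loops."""
    mass3 = np.repeat(mass, 3)                      # one mass per Cartesian coordinate
    return hess / np.sqrt(np.outer(mass3, mass3))   # element [n, m] is divided by sqrt(m_n * m_m)

# Check against the explicit double loop on made-up data (2 atoms, random symmetric matrix)
rng = np.random.default_rng(0)
a = rng.normal(size=(6, 6))
hess_demo = a + a.T
mass_demo = np.array([1837.15, 21874.66])

mass3 = np.repeat(mass_demo, 3)
reference = np.zeros_like(hess_demo)
for n in range(6):
    for m in range(6):
        reference[n, m] = hess_demo[n, m] / np.sqrt(mass3[n] * mass3[m])

result = build_mw_hessian_vectorized(hess_demo, mass_demo)
print(np.allclose(result, reference))
```

For a molecule the size of benzene the difference in run time is negligible; the vectorized form mainly pays off for larger systems.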

Once the mass-weighted Hessian has been built, the frequencies follow from its diagonalization. To this end, use the numpy function `linalg.eigvalsh`.

Don't forget to convert the result from atomic units to $\mathrm{cm}^{-1}$.
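
One way to see where the conversion factor comes from: in atomic units $\hbar = 1$, so $\sqrt{\lambda_i}$ is numerically the vibrational energy quantum in Hartree, and dividing an energy by $hc$ gives a wavenumber. The sketch below computes the factor from CODATA 2018 constants; the variable names are chosen for this example.

```python
# CODATA 2018 values in SI units
HARTREE_J = 4.3597447222071e-18   # Hartree energy in joule
PLANCK_JS = 6.62607015e-34        # Planck constant in joule second
C_M_PER_S = 299792458.0           # speed of light in metre per second

# Energy / (h * c) is a wavenumber in 1/m; divide by 100 to get 1/cm
au_to_cm1 = HARTREE_J / (PLANCK_JS * C_M_PER_S) / 100.0
print(f"1 Hartree corresponds to {au_to_cm1:.2f} cm-1")
```

This only holds if both the Hessian (Hartree per bohr squared) and the masses (electron masses) are in atomic units.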

In [None]:
molecule = 'benzene'                                   # name of the molecule, used to build the file names
nAtoms, masses, geom = read_xyz(molecule + '.xyz')     # read 'benzene.xyz': number of atoms, masses and atomic coordinates
hess = read_hessian(molecule + '_hessian.dat')         # read 'benzene_hessian.dat': the Cartesian Hessian matrix
hess_mw = build_mw_hessian(hess, masses)               # construct the mass-weighted Hessian matrix
eigenvalues = np.linalg.eigvalsh(hess_mw)              # eigenvalues of the symmetric mass-weighted Hessian, in ascending order

freq = np.sqrt(np.abs(eigenvalues))                    # square root of the absolute eigenvalues: frequencies in atomic units (abs() avoids NaN for slightly negative eigenvalues)
freq_cm = freq*219474                                  # convert from atomic units to cm-1 (1 Hartree is about 219474.63 cm-1)
rev_freq_cm = np.flip(freq_cm)                         # reverse the array so that the highest frequencies come first
print("Frequencies(cm-1):")
for i,f in enumerate(rev_freq_cm):                     # enumerate yields (index, value) pairs for rev_freq_cm
  print(f"Mode {i+1}: {f:.2f}")                        # print a 1-based mode number and the frequency with two decimals


Frequencies(cm-1):
Mode 1: 3747.38
Mode 2: 3736.53
Mode 3: 3736.53
Mode 4: 3722.84
Mode 5: 3722.84
Mode 6: 3704.42
Mode 7: 1932.46
Mode 8: 1932.46
Mode 9: 1772.58
Mode 10: 1772.58
Mode 11: 1595.28
Mode 12: 1377.00
Mode 13: 1371.34
Mode 14: 1371.34
Mode 15: 1225.86
Mode 16: 1225.86
Mode 17: 1214.69
Mode 18: 1190.65
Mode 19: 1190.65
Mode 20: 1172.43
Mode 21: 1156.72
Mode 22: 1154.29
Mode 23: 1037.80
Mode 24: 1037.80
Mode 25: 841.22
Mode 26: 811.07
Mode 27: 700.17
Mode 28: 700.17
Mode 29: 478.08
Mode 30: 478.08
Mode 31: 0.67
Mode 32: 0.67
Mode 33: 0.67
Mode 34: 0.02
Mode 35: 0.01
Mode 36: 0.02
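
The last three values above are not in descending order, which may indicate that the smallest eigenvalue is slightly negative and was made positive by `np.abs`. A common convention, sketched here as an optional alternative and not as part of the original solution, is to keep the sign so that imaginary frequencies show up as negative numbers. The function name and the sample eigenvalues are made up.

```python
import numpy as np

def signed_frequencies(eigenvalues, au_to_cm1=219474.63):
    """Return sign(lambda) * sqrt(|lambda|) in cm-1, so imaginary modes appear negative."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return np.sign(eigenvalues) * np.sqrt(np.abs(eigenvalues)) * au_to_cm1

# Made-up eigenvalues in atomic units: one slightly negative, one tiny, one vibrational
demo = np.array([-1.0e-14, 4.0e-14, 2.5e-5])
signed = signed_frequencies(demo)
print(signed)
```

For a Hessian computed at a true minimum, only values close to zero should turn negative; a large negative value would point to a saddle point instead.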


In [None]:
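normal_modes = rev_freq_cm[:(3*nAtoms - 6)]           # keep the 3N-6 highest frequencies (vibrations); the six near-zero values dropped here correspond to translations and rotations of a nonlinear molecule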
print("Normal modes(cm-1):")
for i,f in enumerate(normal_modes):
  print(f"Mode {i+1}: {f:.2f}")

Normal modes(cm-1):
Mode 1: 3747.38
Mode 2: 3736.53
Mode 3: 3736.53
Mode 4: 3722.84
Mode 5: 3722.84
Mode 6: 3704.42
Mode 7: 1932.46
Mode 8: 1932.46
Mode 9: 1772.58
Mode 10: 1772.58
Mode 11: 1595.28
Mode 12: 1377.00
Mode 13: 1371.34
Mode 14: 1371.34
Mode 15: 1225.86
Mode 16: 1225.86
Mode 17: 1214.69
Mode 18: 1190.65
Mode 19: 1190.65
Mode 20: 1172.43
Mode 21: 1156.72
Mode 22: 1154.29
Mode 23: 1037.80
Mode 24: 1037.80
Mode 25: 841.22
Mode 26: 811.07
Mode 27: 700.17
Mode 28: 700.17
Mode 29: 478.08
Mode 30: 478.08
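
Slicing off the last six values assumes a nonlinear molecule; a linear molecule has only five such modes and therefore $3N - 5$ vibrations. As a sketch of an alternative that does not hard-code this number, the modes can be separated with a frequency threshold. The function `split_modes` and the threshold of 10 cm-1 are assumptions made for this example, and the sample values are a subset of the output above.

```python
import numpy as np

def split_modes(freq_cm, threshold=10.0):
    """Split frequencies into vibrations and near-zero modes using an assumed threshold in cm-1."""
    freq_cm = np.asarray(freq_cm, dtype=float)
    is_vibration = np.abs(freq_cm) > threshold
    return freq_cm[is_vibration], freq_cm[~is_vibration]

# A few of the benzene values printed above
sample = np.array([3747.38, 478.08, 0.67, 0.67, 0.67, 0.02, 0.01, 0.02])
vibrations, near_zero = split_modes(sample)
print(len(vibrations), "vibrations,", len(near_zero), "near-zero modes")
```

A suitable threshold depends on the numerical quality of the Hessian, so it is worth checking the count against $3N - 6$ (or $3N - 5$) rather than relying on it blindly.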
