# Plotting outputs from Gaussian

This script is loosely based on one written by Alessio Petrone. It applies Gaussian smoothing to the frequency results in Gaussian output files.

## Description

This Python script reads vibrational frequencies and intensities from Gaussian output files, broadens each mode with a Gaussian function, and plots the resulting spectra. It defines two key functions: one that sets a shared y-axis label across multiple subplots, and one that extracts and processes the data. Running the script cell by cell in a Jupyter Notebook is a good way to follow each step and see how the data changes at each stage, and to learn Python and data visualization concepts interactively.
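
The smoothing step described above can be sketched on its own. Each mode is replaced by a normalized Gaussian of width `sigma`, scaled by the mode's intensity, and the curves are summed on an evenly spaced grid. This is a minimal sketch under stated assumptions: the function name `broaden` and the two example modes are made up for illustration and are not taken from the script or from a real calculation.

```python
import math

def broaden(freqs, intens, x_min, x_max, step, sigma):
    """Sum one normalized Gaussian per mode on an evenly spaced grid."""
    n_pts = int((x_max - x_min) / step)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    xs, ys = [], []
    for i in range(n_pts):
        x = x_min + i * step
        # Add the contribution of every mode at this grid point
        y = sum(a * norm * math.exp(-0.5 * ((x - f) / sigma) ** 2)
                for f, a in zip(freqs, intens))
        xs.append(x)
        ys.append(y)
    return xs, ys

# Two hypothetical modes (cm^-1, arbitrary intensity units)
xs, ys = broaden([1000.0, 1500.0], [10.0, 5.0], 0, 3500, 1, 4)
peak_x = xs[ys.index(max(ys))]
print(peak_x)
```

A larger `sigma` gives broader, lower peaks; the area under each peak stays equal to the mode's intensity because the Gaussian is normalized.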

In [7]:
#load libraries

import sys
import math
import re
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from mpl_toolkits.axes_grid1.inset_locator import inset_axes

What is this first step doing?

In [2]:
#matplotlib.rcParams['axes.linewidth'] = 2.0
font = {'size'  : '18', 'weight'  : 'bold'}
matplotlib.rc('font', **font)
matplotlib.rcParams['axes.linewidth'] = 2

We won't have time to discuss functions in the basics workshop, so here is some brief context:

## Functions in Python
Functions in Python are reusable blocks of code that perform a specific task, enabling modular and organized programming. They take input in the form of arguments, process the input, and return an output, making code more efficient and easier to maintain. Additionally, functions enhance code readability and facilitate debugging by isolating different functionalities into self-contained units.
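
As a small illustration of the idea, here is a function that takes one argument, processes it, and returns a result. The function `wavenumber_to_micron` is a made-up example and is not used anywhere in the script.

```python
def wavenumber_to_micron(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm

# Call the function with an argument and use the returned value
print(wavenumber_to_micron(1000.0))  # 10.0
```

Once defined, the function can be called as many times as needed with different inputs, which is what makes it reusable.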

Here we are coding some specific functions. Briefly:
* `set_shared_ylabel` - This function sets a shared y-axis label for multiple subplots (axes) in a figure. It calculates the appropriate position for the shared label based on the coordinates of the tick labels on the y-axis of each subplot, ensuring proper alignment and avoiding overlap.
* `extract` - This function extracts frequency and intensity data from Gaussian output files. It processes either Raman or IR intensity data depending on the `typeSpect` parameter, smooths the data using a Gaussian function, and returns arrays of frequencies and corresponding intensities. The function reads temporary files, performs data cleanup, and ensures the presence of required data before proceeding with calculations.
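
For comparison, the extraction step can also be done without the shell. The sketch below is not the script's method (the script uses `grep` and temporary files). It assumes the matching lines use a `Label -- v1 v2 v3` layout; the function name `parse_modes` and the sample values are illustrative only.

```python
def parse_modes(lines, intensity_key="IR Inten"):
    """Collect frequencies and intensities from 'Label -- values' lines."""
    freqs, intens = [], []
    for line in lines:
        if "--" not in line:
            continue
        label, values = line.split("--", 1)
        label = label.strip()
        if label == "Frequencies":
            freqs.extend(float(v) for v in values.split())
        elif label == intensity_key:
            intens.extend(float(v) for v in values.split())
    return freqs, intens

# Illustrative lines in the assumed layout (values are made up)
sample = """\
 Frequencies --    306.4535               826.9714               826.9716
 Red. masses --      1.0078                 1.0574                 1.0574
 IR Inten    --      0.0000                 6.0517                 6.0517
""".splitlines()

freqs, intens = parse_modes(sample)
print(freqs, intens)
```

Passing `intensity_key="Raman Activ"` would select Raman activities instead, mirroring the two `grep` patterns used in `extract`.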

In [3]:
def set_shared_ylabel(a, ylabel, labelpad = 0.01):
    """Set a y label shared by multiple axes
    Parameters
    ----------
    a: list of axes
    ylabel: string
    labelpad: float
        Sets the padding between ticklabels and axis label"""
    f = a[0].get_figure()
    f.canvas.draw() #sets f.canvas.renderer needed below
    # get the center position for all plots
    top = a[0].get_position().y1
    bottom = a[-1].get_position().y0
    # get the coordinates of the left side of the tick labels 
    x0 = 1
    for at in a:
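        at.set_ylabel('') # just to make sure we don't end up with multiple labels
        bboxes, _ = at.yaxis.get_ticklabel_extents(f.canvas.renderer)
        # Bbox.inverse_transformed is gone from recent Matplotlib releases;
        # inverting the transform explicitly gives the same figure coordinates
        bboxes = bboxes.transformed(f.transFigure.inverted())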
        xt = bboxes.x0
        if xt < x0:
            x0 = xt
    tick_label_left = x0
    # set position of label
    a[-1].set_ylabel(ylabel)
    a[-1].yaxis.set_label_coords(tick_label_left - labelpad,(bottom + top)/2, transform=f.transFigure)

def extract(typeSpect, enMax, enMin, sigma, resolution, fileNM):
  itn = []
  freq = []
  Frequency = []
  Response = []
  numPts = int((enMax - enMin)/resolution)
  if typeSpect == "raman":
  # GREP HERE FOR RAMAN ACTIVITY
    bashCommand = 'grep "Raman Activ" ' + fileNM + '>tmpInt.txt'
    os.system(bashCommand)
  else:
  # GREP HERE FOR IR ACT
    bashCommand = 'grep "IR Inten" ' + fileNM + '>tmpInt.txt'
    os.system(bashCommand)
  
  #Grep here for Frequencies
  bashCommand = 'grep "Frequencies" ' + fileNM + '>tmpFreq.txt'
  os.system(bashCommand)
  
  #Open the temp files generated and read in the data
  #(each with block closes its file automatically)
  with open('tmpFreq.txt','r') as fileIn:
    for line in fileIn:
      Frequency.append(line.split())
  with open('tmpInt.txt','r') as fileIn:
    for line in fileIn:
      Response.append(line.split())
  ##Remove the tmp files
  os.remove('tmpFreq.txt')
  os.remove('tmpInt.txt')
  
  #Ensure file has both frequencies and a response
  if len(Frequency) == 0 or len(Response) == 0:
    print("No frequencies or intensities found; please be sure the input "
          "file is a frequency calculation")
    return [np.asarray([0.]), np.asarray([0.])]
  else:
    rangeFreq = [len(Frequency[0])-3,len(Frequency[0])]
    rangeRes = [len(Response[0])-3,len(Response[0])]
    for i in range(0,len(Frequency)):
      for j in range(rangeFreq[0],rangeFreq[1]):
        freq.append(float(Frequency[i][j]))
      for k in range(rangeRes[0],rangeRes[1]):
        itn.append(float(Response[i][k]))
    #if enMax <= freq[len(freq)-1]:
    #  enMax = 50 + freq[len(freq)-1]
  
  #Data holds the Frequency of the Excitation vs the Intensity
    data = [[0,0] for x in range(0,numPts)]
    for i in range (numPts):
      x = (i * resolution) + enMin
      for j in range(0,len(freq)):
        y = data[i][1] + itn[j]* (1 / (sigma * np.sqrt(2*np.pi))) * \
        (math.exp(-0.5*(((x-freq[j])**2)/sigma**2)))
        data[i] = [x,y]
    data = np.asarray(data)
    return [data[:,0], data[:,1]]

## The Main Block 
The main block of the script performs the following tasks:
* Initializes parameters such as resolution and sigma for smoothing.
* Specifies a list of Gaussian output files (fileList) and their corresponding legends for plotting.
* Iterates over the fileList, calling the extract function for each file to get smoothed frequency and intensity data, storing the results in plots.
* Sets up subplots for each dataset, normalizes the intensity data, and plots the data.
* Adds a shared y-axis label and customizes tick parameters for better visualization.
* Finally, saves the combined plot as "VibPlts.png" and closes the plot to free up resources.
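
The first step above fixes the energy grid that `extract` evaluates. A quick sketch of the arithmetic, using the values this notebook passes (`enMin = 0`, `enMax = 3500`, `resolution = 1`) and lowercase variable names chosen here for illustration:

```python
en_min, en_max, resolution = 0, 3500, 1

# Same formula extract uses for numPts
num_pts = int((en_max - en_min) / resolution)

# Grid points are en_min + i * resolution for i = 0 .. num_pts - 1
grid = [en_min + i * resolution for i in range(num_pts)]
print(num_pts, grid[0], grid[-1])  # 3500 0 3499
```

Note that the grid stops one step short of `enMax`, so the last evaluated point is 3499 rather than 3500.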

The program prints a message as it processes each of the log files we pass it: `ethane.log` and `benzophenone.log`.

In [None]:
###MAIN PROGRAM BEGINS###
resolution = 1  # grid spacing in cm^-1
sigma = 4       # Gaussian width in cm^-1
fileList = ["data/ethane.log",
            "data/benzophenone.log",
            ]
legends = ["ethane",
           "benzophenone",
           ]
letters = ['(a)', '(b)', '(c)', '(d)', '(e)', '(f)']
plots = []

for fileNM in fileList:
  print("DOING: ", fileNM)
  plots.append(extract('ir', 3500, 0, sigma, resolution, fileNM))

## Output Plots

Lastly, the program saves an image called `VibPlts.png`, which appears in your current directory when the run completes. You can double-click `VibPlts.png` to view it in JupyterLab.

Overall, the script processes Gaussian output files to produce and save a plot of vibrational spectra, either IR or Raman, with Gaussian smoothing applied to the data.
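
Before plotting, each spectrum is divided by its own maximum so that every panel spans 0 to 1. A minimal sketch of that step, with a hypothetical helper name `normalize` and made-up values:

```python
def normalize(intensities):
    """Scale a spectrum so its largest value is 1; leave all-zero data alone."""
    peak = max(intensities)
    if peak > 0:
        return [v / peak for v in intensities]
    return list(intensities)

print(normalize([0.0, 2.0, 4.0]))  # [0.0, 0.5, 1.0]
```

Because each spectrum is normalized to itself, peak heights can be compared within a panel but not between molecules.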

In [6]:
#PLOTTING:
fig, ax = plt.subplots(len(fileList), sharex=True, gridspec_kw={'hspace': 0},figsize=(9,9))
ax = np.atleast_1d(ax)  # keeps ax indexable when only one file is plotted
for i, p in enumerate(plots):
  #Normalize each to itself:
  maxLight = max(p[1])
  
  if maxLight > 0:
    p[1] = p[1] / maxLight
  
  ax[i].plot(p[0], p[1], color='k')
  ax[i].set_xlim([0, 3500])
  #ax[i].annotate(letters[i], [1749,1.00], ha='right',va='top')
  ax[i].annotate(legends[i], [0,1.00], ha='left',va='top')
  
  if i == len(fileList)-1:
    ax[i].set_yticks([0.0,0.5,1.0])
  else:
    ax[i].set_yticks([0.5,1])
  ax[i].set_ylim([0,1.02])

#Shared Y-axis label:
fig.add_subplot(111, frameon=False)
plt.tick_params(labelcolor='none', which='both', top=False, bottom=False, left=False, right=False)
plt.ylabel("Intensity (arb. units)", fontweight='bold')

plt.xlabel("Energy (cm$^{-1}$)",fontweight='bold')
fig.tight_layout()
#plt.show()
plt.savefig("VibPlts.png")
plt.close()