# PFAS Radicals: A Quantum Chemistry Perspective  
---
In this lab exercise, you will:  
  
>  
>**Module 1**  
>  
>1. Model the equilibrium geometry and IR frequencies of the CH3 radical and compare to the CF3 radical  
>2. Develop a Python function to parse data from ORCA output files  
>
>**Module 2**  
>  
>3. Scan the X-C-X bond angle from 120 degrees (trigonal planar) to 109.5 degrees (tetrahedral) and compare results between the CH3 radical and the CF3 radical  
>4. Develop a Python function to write and edit ORCA input files  
>
>**Module 3**  
>  
>5. Use Python to analyze results from your ORCA calculations  
>  
  
This module will cover items **3** and **4**.

## Module 2: Coordinate Scans  
---  
By now, you should have seen that the CH3 radical and the CF3 radical do not have the same symmetry. To explore this further, you will perform a relaxed coordinate scan along the X-C-X bond angle.  
  
## Coordinate Scans - The Basics  
Scanning bond lengths, bond angles, or dihedrals is a common technique in computational chemistry. It is often used to identify transition states or barriers to rotation, and some form of automated coordinate scan is implemented in most quantum chemistry software packages. A scan can be **rigid** or **relaxed**. A **rigid** coordinate scan changes only the requested bond length, angle, or dihedral at each step and holds all other coordinates fixed. In contrast, a **relaxed** coordinate scan steps the scanned coordinate while allowing all other coordinates to "relax" to the lowest-energy structure under the given constraint. We will use relaxed scans so that the C-X bond length can adjust to minimize the energy at each angle.  

For your system, you will use an angle constraint to hold the X-C-X bond angle at the value you set. Rather than using ORCA's built-in coordinate scanning tool, we will generate a separate input file for each angle in the sweep. This will force you to flex your matrix algebra and Python skills, and it eliminates the need for a new output file parsing function (ORCA coordinate scan output files have a different format). For most normal uses, ORCA's built-in coordinate scan works just fine, and it will result in faster computations because it will automatically use guess parameters from the previous step's wavefunction.
  
To add a geometric constraint to your input file, you will need to modify your input file by adding the following code:  
  
```  
%geom Constraints  
    { A * 0 * C }  
    end  
end  
```  
  
This constraint holds constant (C) every angle (A) that has atom 0 (this should be your carbon atom) at its vertex; the `*` wildcards match all other atoms. Running an **opt freq** calculation with this constraint will allow the C-X bond lengths to change to minimize the energy of the system, but will not allow the X-C-X bond angle to change. This means that the bond angle of the starting geometry will be the bond angle of the optimized geometry.  
To perform a bond angle scan, you will need to write Python code that can generate starting geometries with any bond angle you select.  
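
The constraint block described above can also be added to an existing input file programmatically. The sketch below is only one possible approach: `add_angle_constraint` is a hypothetical helper (not part of ORCA or of this lab's utilities), and it assumes the coordinate section begins with a line starting with `*`, such as `* xyz 0 2` for a neutral doublet radical.

```python
def add_angle_constraint(input_text: str, center_index: int = 0) -> str:
    """Insert a %geom Constraints block just before the coordinate section.

    Hypothetical helper: assumes the coordinate block starts with a line
    beginning with '*' (for example '* xyz 0 2').
    """
    constraint_block = "\n".join([
        "%geom Constraints",
        f"    {{ A * {center_index} * C }}",  # freeze all angles centred on this atom
        "    end",
        "end",
    ])
    lines = input_text.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("*"):
            # Place the constraint block immediately before the coordinates
            return "\n".join(lines[:i] + [constraint_block] + lines[i:])
    raise ValueError("No coordinate block ('* xyz ...') found in input text")


# Illustrative planar CH3 input (coordinates are placeholders, C-H = 1.08 Angstrom)
example = "\n".join([
    "! UKS TightSCF wB97x-D3 def2-TZVPD opt freq",
    "* xyz 0 2",
    "C  0.000000  0.000000  0.000000",
    "H  1.080000  0.000000  0.000000",
    "H -0.540000  0.935307  0.000000",
    "H -0.540000 -0.935307  0.000000",
    "*",
])
print(add_angle_constraint(example))
```

Inserting the block before the coordinates keeps the keyword line and any other `%` blocks untouched, so the same helper could be reused on the input files from Module 1.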

## Creating starting geometries  
You will write a Python function that takes an X-C-X bond angle as input and writes the coordinates for each atom.  
You know that:  
* Your system contains 4 atoms  
* The equilibrium bond lengths are available from your Module 1 optimizations  
* The C-X bonds will be 120 degrees apart when projected onto the plane formed by the three X atoms, regardless of the X-C-X bond angle  
  
You may find the figure below provides a useful starting point:  

![geometry image](./JChemEd%20geometry%20figure.png)
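
One way to turn these facts into coordinates is sketched below. It assumes (as the code cell that follows does) that the three X atoms sit on a unit circle in the xy-plane and the carbon sits a height $z$ above its centre. The symbols $d_{XX}$ (X-X distance), $r$ (unscaled C-X distance), and $\theta$ (X-C-X angle) are labels introduced here for illustration.

```latex
\begin{aligned}
d_{XX} &= 2\sin(60^\circ) = \sqrt{3}
  && \text{(X atoms } 120^\circ \text{ apart on a unit circle)} \\
r &= \sqrt{1 + z^2}
  && \text{(carbon at height } z \text{ above the centre)} \\
\frac{d_{XX}}{2} &= r \sin\!\left(\frac{\theta}{2}\right)
  \;\Rightarrow\; r = \frac{\sqrt{3}}{2\sin(\theta/2)}
  && \text{(isosceles X--C--X triangle)} \\
z &= \sqrt{r^2 - 1} = \sqrt{\frac{3}{4\sin^2(\theta/2)} - 1}
\end{aligned}
```

As a sanity check, $\theta = 120^\circ$ gives $z = 0$ (planar), and the ideal tetrahedral angle $\theta = \arccos(-1/3) \approx 109.47^\circ$ gives $z = 1/(2\sqrt{2}) \approx 0.354$. Angles above $120^\circ$ make the argument of the square root negative, which is why no such geometry exists for three equivalent X atoms.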

In [1]:
import numpy as np
from math import isclose
import os
import sys; sys.path.insert(0, '..')
from utilities import module1_functions as m1 # import the function you wrote from the previous module
import pandas as pd

# ----------------------------
# students write this function
# ----------------------------
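def calculate_z_displacement(bond_angle_deg : float) -> float:
    # Height of the carbon above the plane of the three X atoms, which sit on a unit circle.
    # The unscaled C-X distance is r = (sqrt(3) / 2) / sin(angle / 2), and z = sqrt(r**2 - 1).
    bond_angle_rad = np.deg2rad(bond_angle_deg)
    z_displacement = np.sqrt((0.5 * np.sqrt(3) / np.sin(bond_angle_rad / 2))**2 - 1)
    return z_displacement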

# ----------------------------
# students write this function
# ----------------------------
def scale_bond_lengths(bond_length : float, carbon_xyz_position, heteroatom_xyz_position_list : list) -> list:
    # Return the C-X bond vectors rescaled to bond_length.
    # With the carbon placed at the origin, these vectors are also the X atom positions.
    new_bond_vectors = []
    for heteroatom_position in heteroatom_xyz_position_list:
        bond_vector = heteroatom_position - carbon_xyz_position
        bond_unit_vector = bond_vector / np.linalg.norm(bond_vector)
        new_bond_vectors.append(bond_unit_vector * bond_length)
    return new_bond_vectors

def generate_geometry(bond_angle_deg : float, bond_length : float) -> list:

    # generate geometry using simple scaled diagram
    z_displacement = calculate_z_displacement(bond_angle_deg)
    assert z_displacement >= 0, "The z displacement is either nan or less than zero. Check that you are not taking the square root of a negative number."

    heteroatom_xyz_positions = [np.array([1, 0, 0]), np.array([-np.cos(np.deg2rad(60)), np.sin(np.deg2rad(60)), 0]), np.array([-np.cos(np.deg2rad(60)), -np.sin(np.deg2rad(60)), 0])]
    carbon_xyz_position = np.array([0, 0, z_displacement])

    # adjust bond lengths to the equilibrium lengths
    heteroatom_xyz_positions = scale_bond_lengths(bond_length, carbon_xyz_position, heteroatom_xyz_positions)
    assert False not in check_bond_angles(bond_angle_deg, heteroatom_xyz_positions), (
        "The actual angle is not equal to the input angle. "
        "Check the z displacement calculation and ensure you are not changing the angles when scaling bond lengths."
    )

    return heteroatom_xyz_positions

def make_input_file_text(heteroatom : str, bond_angle_deg : float, bond_length : float) -> str:

    # generate coordinates
    heteroatom_xyz_positions = generate_geometry(bond_angle_deg, bond_length)

    # format coordinates
    formatted_lines = []
    for heteroatom_coordinates in heteroatom_xyz_positions:
        atom_line = heteroatom + " " + " ".join(([ "{:0.10f}".format(coordinate) for coordinate in heteroatom_coordinates ]))
        formatted_lines.append(atom_line)
    heteroatom_section = "\n".join(formatted_lines)

    angle = "{:0.1f}".format(bond_angle_deg)

    input_file_text = (f"""
    ! UKS TightSCF wB97x-D3 def2-TZVPD xyzfile opt freq

    # C{heteroatom}3 cation {angle} degrees fixed

    %maxcore 1000
    %pal
    nprocs 18
    end

    %geom Constraints
            {{ A * 0 * C }}
        end
    end
    * xyz 1 1
    C 0.00 0.00 0.00
    {heteroatom_section}
    *
    """).strip()
    
    input_file_text = format_text_indentation(input_file_text)

    return input_file_text

def write_input_file(heteroatom : str, bond_angle_deg : float, bond_length : float):

    filename = f"C{heteroatom}3_cation_{bond_angle_deg}_degrees.inp"
    input_file_text = make_input_file_text(heteroatom, bond_angle_deg, bond_length)

    with open(filename, "w") as f:
        f.write(input_file_text)

# ----------------------------
# students write this function
# ----------------------------
def write_coordinate_scan_input_files(heteroatom, bond_length, low : float, high : float, step : float):
    scan_angles = np.arange(low, high, step)
    for angle in scan_angles:
        write_input_file(heteroatom, angle, bond_length)

def parse_outfiles_from_folder(folderpath : str) -> list:
    folder_data = []
    files = os.listdir(folderpath)
    outfiles = [name for name in files if name.endswith(".out")]  # keep only ORCA output files
    for file in outfiles:
        print(f"NOW PARSING {file}")
        filepath = os.path.join(folderpath, file)
        data_names, data = m1.parse_outfile(filepath)
        file_data = [file] + data
        folder_data.append(file_data)
    print("PARSING COMPLETE")
    folder_data = [['filename', *data_names]] + folder_data
    return folder_data

def write_data_to_csv(folder_data : list):
    dataframe = pd.DataFrame(folder_data[1:], columns=folder_data[0])
    dataframe.sort_values(by=['heteroatom', 'bond_angle[deg]'], inplace=True)
    dataframe.to_csv("./summary_data.csv", index=False)
 
# -----------------------------
# other custom helper functions
# -----------------------------
def check_bond_angles(bond_angle_deg : float, bond_vectors : list) -> list:
    n = len(bond_vectors)
    bond_vectors = bond_vectors + bond_vectors
    results = []
    for bond_index in range(n):
        dot_product = np.dot(bond_vectors[bond_index], bond_vectors[bond_index + 1])
        norms = np.linalg.norm(bond_vectors[bond_index]) * np.linalg.norm(bond_vectors[bond_index + 1])
        actual_bond_angle = np.rad2deg(np.arccos(dot_product / norms))
        results.append(isclose(bond_angle_deg, actual_bond_angle, abs_tol=4))
    return results

def format_text_indentation(text : str) -> str:
    newlines= []
    for line in text.split("\n"):
        if line[:4] == ' '*4 :
            newlines.append(line[4:])
        else:
            newlines.append(line)
    return "\n".join(newlines)


In [10]:
write_coordinate_scan_input_files('F', 1.315, 95, 120.5, 0.5)
write_coordinate_scan_input_files('H', 1.080, 95, 120.5, 0.5)
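
Note that `np.arange` excludes its upper bound, which is why the calls above pass 120.5 to obtain a last angle of 120.0. With non-integer steps, floating-point round-off can also make the number of points unpredictable. A minimal alternative sketch is shown below: `make_scan_angles` is a hypothetical helper, not part of the lab code, and it assumes that `high - low` is (close to) a whole number of steps.

```python
import numpy as np

def make_scan_angles(low: float, high: float, step: float) -> np.ndarray:
    """Return evenly spaced angles from low to high, both endpoints included.

    Hypothetical helper: assumes (high - low) is approximately a whole
    number of steps; the count is rounded to the nearest integer.
    """
    n_steps = int(round((high - low) / step))
    # linspace takes the number of points, so the endpoint is always included
    return np.linspace(low, low + n_steps * step, n_steps + 1)

angles = make_scan_angles(95, 120, 0.5)
print(len(angles), angles[0], angles[-1])
```

With this helper the upper bound is the last angle you actually want, so the scan limits read the same way they appear in the exercise description.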

In [3]:
data = parse_outfiles_from_folder("./outfiles/")
write_data_to_csv(data)

NOW PARSING CF3_cation_100.0_degrees.out
NOW PARSING CF3_cation_100.5_degrees.out
NOW PARSING CF3_cation_101.0_degrees.out
NOW PARSING CF3_cation_101.5_degrees.out
NOW PARSING CF3_cation_102.0_degrees.out
NOW PARSING CF3_cation_102.5_degrees.out
NOW PARSING CF3_cation_103.0_degrees.out
NOW PARSING CF3_cation_103.5_degrees.out
NOW PARSING CF3_cation_104.0_degrees.out
NOW PARSING CF3_cation_104.5_degrees.out
NOW PARSING CF3_cation_105.0_degrees.out
NOW PARSING CF3_cation_105.5_degrees.out
NOW PARSING CF3_cation_106.0_degrees.out
NOW PARSING CF3_cation_106.5_degrees.out
NOW PARSING CF3_cation_107.0_degrees.out
NOW PARSING CF3_cation_107.5_degrees.out
NOW PARSING CF3_cation_108.0_degrees.out
NOW PARSING CF3_cation_108.5_degrees.out
NOW PARSING CF3_cation_109.0_degrees.out
NOW PARSING CF3_cation_109.5_degrees.out
NOW PARSING CF3_cation_110.0_degrees.out
NOW PARSING CF3_cation_110.5_degrees.out
NOW PARSING CF3_cation_111.0_degrees.out
NOW PARSING CF3_cation_111.5_degrees.out
NOW PARSING CF3_

In [136]:
write_data_to_csv(parse_outfiles_from_folder(os.getcwd()))

NOW PARSING CF3_opt.out
NOW PARSING CH3_opt.out
PARSING COMPLETE


In [2]:
data = parse_outfiles_from_folder("./outfiles/")
 

NOW PARSING CF3_opt.out
NOW PARSING CF3_radical_100.0_degrees.out
NOW PARSING CF3_radical_100.5_degrees.out
NOW PARSING CF3_radical_101.0_degrees.out
NOW PARSING CF3_radical_101.5_degrees.out
NOW PARSING CF3_radical_102.0_degrees.out
NOW PARSING CF3_radical_102.5_degrees.out
NOW PARSING CF3_radical_103.0_degrees.out
NOW PARSING CF3_radical_103.5_degrees.out
NOW PARSING CF3_radical_104.0_degrees.out
NOW PARSING CF3_radical_104.5_degrees.out
NOW PARSING CF3_radical_105.0_degrees.out
NOW PARSING CF3_radical_105.5_degrees.out
NOW PARSING CF3_radical_106.0_degrees.out
NOW PARSING CF3_radical_106.5_degrees.out
NOW PARSING CF3_radical_107.0_degrees.out
NOW PARSING CF3_radical_107.5_degrees.out
NOW PARSING CF3_radical_108.0_degrees.out
NOW PARSING CF3_radical_108.5_degrees.out
NOW PARSING CF3_radical_109.0_degrees.out
NOW PARSING CF3_radical_109.5_degrees.out
NOW PARSING CF3_radical_110.0_degrees.out
NOW PARSING CF3_radical_110.5_degrees.out
NOW PARSING CF3_radical_111.0_degrees.out
NOW PARSIN

In [3]:
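def write_data_to_csv(folder_data : list):
    # The data rows may hold more values than there are header names.
    # Pad the header with integer labels (0, 1, 2, ...) so that every value gets a column.
    # This assumes all rows have the same length as the first data row.
    n_extra_columns = len(folder_data[1]) - len(folder_data[0])
    extra_columns = list(range(n_extra_columns))

    dataframe = pd.DataFrame(folder_data[1:], columns=folder_data[0] + extra_columns)
    dataframe.sort_values(by=['heteroatom', 'bond_angle[deg]'], inplace=True)
    dataframe.to_csv("./electrostatics_data.csv", index=False)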

In [4]:
write_data_to_csv(data)
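
The sort above relies on `heteroatom` and `bond_angle[deg]` columns, which here are presumably supplied by `m1.parse_outfile`. As an independent cross-check, the same information can be recovered from the scan file names. The sketch below is an assumption-laden example: `parse_scan_filename` and its regular expression are hypothetical and only match names of the form `CF3_cation_100.5_degrees.out` seen in the output above.

```python
import re

# Matches names such as "CF3_cation_100.5_degrees.out" or "CH3_radical_109.0_degrees.out"
FILENAME_PATTERN = re.compile(
    r"^C(?P<heteroatom>[A-Za-z]+)3_(?P<species>[A-Za-z]+)_"
    r"(?P<angle>\d+(?:\.\d+)?)_degrees\.out$"
)

def parse_scan_filename(filename: str):
    """Return (heteroatom, species, angle in degrees), or None if the name does not match.

    Hypothetical helper for cross-checking parsed data against file names.
    """
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None  # e.g. "CF3_opt.out" is not a scan file
    return match["heteroatom"], match["species"], float(match["angle"])

for name in ["CF3_cation_100.5_degrees.out", "CH3_radical_109.0_degrees.out", "CF3_opt.out"]:
    print(name, "->", parse_scan_filename(name))
```

Returning `None` for non-matching names makes it easy to skip files such as `CF3_opt.out`, which sit in the same folder but are not part of the scan.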

## Modify input files  
You will now extend your function to make the input files 