# Introduction to Computational Chemistry
The notebook title should be consistent with the file name, which should follow the pattern:    
*author's initials_progressive number_title.ipynb*    
For example:    
*EF_01_Data Exploration.ipynb*

## Purpose
This notebook is designed to help the reader display and understand some basic computational chemistry simulations. 

## Methodology
Quickly describe assumptions and processing steps.

## Results
Describe and comment on the most important results.

## Suggested next steps
State suggested next steps, based on results obtained in this notebook.

# Setup

## Library import
We import all the required Python libraries.

In [None]:
# Data manipulation
import numpy as np
import scipy as sp
import pandas as pd
import xarray as xr

# Options for pandas
pd.options.display.max_columns = 50
pd.options.display.max_rows = 30

# Visualizations
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Parameter definition
We set all relevant parameters for our notebook. By convention, parameters are uppercase, while all the 
other variables follow Python's guidelines.

# Data import
We retrieve all the required data for the analysis.

# Data processing
Put here the core of the notebook. Feel free to further split this section into subsections.

# Problem 1

Begin by creating a table of basis sets, the number of basis functions in each, and the total energy calculated with each basis set. 

In [None]:
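# Fill in the values in the DataFrame below, replacing each np.nan placeholder
# with your own results:
hf_data = pd.DataFrame({'Method'           : ['STO-3G', 'cc-pVDZ', 'cc-pVTZ', 'cc-pVQZ'],
                        'Total energy (Ha)': [np.nan, np.nan, np.nan, np.nan],
                        'Basis set size'   : [np.nan, np.nan, np.nan, np.nan]})
hf_data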

In [None]:
hf_data.plot(x='Basis set size', y='Total energy (Ha)')

In [None]:
# Import an external file (the default separator is a comma). Here we set the
# field separator to any run of whitespace (tabs or spaces) with a regular
# expression; the older `delim_whitespace=True` option is deprecated in recent pandas.
method_comp = pd.read_csv('Problem_1/performance.dat', sep=r'\s+')
method_comp

In [None]:
# Create a set of axes to reuse for each plot
ax = plt.gca()

# Plot data from each 'Method' group on the `ax` object.
# Grouping by a plain string (not a list) keeps `key` a scalar label.
for key, grp in method_comp.groupby('Method'):
    grp.plot(x='#BS', y='E[Ha]', label=key, ax=ax)
plt.legend(loc='best')
plt.show()

# Problem 2

After going through steps 1 and 2 in the manual, you should have a directory for each bond distance and a file named `dist_E.dat` filled with bond distances and energies. We will import that file the same way we imported the output from the script in the last problem (try the import yourself). Pressing "Shift + Tab" while inside a method will give you extended information on that method, including any additional options. For this import, you may want to use the option for `index_col` to set the `dist` column as your index for the dataframe. 
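
As a rough sketch of what such an import can look like, the example below reads a small in-memory stand-in for `dist_E.dat`. The column names `dist` and `E` and all numerical values are assumptions made for illustration; your own file may be laid out differently.

```python
import io
import pandas as pd

# Hypothetical stand-in for `dist_E.dat`; the real file comes from your scan.
# The column names `dist` and `E` are assumed for illustration only.
example_file = io.StringIO(
    "dist   E\n"
    "0.8   -1.10\n"
    "0.9   -1.20\n"
    "1.0   -1.15\n"
)

# Split fields on any run of whitespace and use `dist` as the row index
scan = pd.read_csv(example_file, sep=r"\s+", index_col="dist")
print(scan)
```

With a real file, pass its path in place of `example_file`.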

In [None]:
# Import data from `dist_E.dat`

To plot your results, take the dataframe you created and use the `dataframe.plot()` method on it, explicitly identifying your *x* and *y* variables. 

You may also use the Pandas method [`dataframe.idxmin()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmin.html) to return, for each column, the index label at which that column's minimum occurs. If you set `index_col` earlier, this is the bond distance at the energy minimum; the minimum energy itself is returned by `dataframe.min()`. 
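
A minimal sketch of locating the minimum, assuming a dataframe indexed by distance with a single energy column named `E` (both the name and the values here are hypothetical):

```python
import pandas as pd

# Hypothetical energies (Ha) indexed by bond distance; not real results
scan = pd.DataFrame(
    {"E": [-1.10, -1.20, -1.15]},
    index=pd.Index([0.8, 0.9, 1.0], name="dist"),
)

r_min = scan["E"].idxmin()  # index label (distance) at the lowest energy
e_min = scan["E"].min()     # the lowest energy itself
print(f"Minimum at dist = {r_min}, E = {e_min} Ha")
```

Note that `idxmin()` only picks the lowest point on the scanned grid; the true minimum may lie between two scan points.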

In [None]:
# Plot the bond distance vs. total energy and output the distance and energy at the minimum

In [None]:
# Calculate \Delta H_{at} for the molecule, print out the result
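
One possible way to set up this calculation is sketched below. It assumes the quantity is taken as the difference between the summed energies of the isolated atoms and the energy of the molecule at its minimum; the lab manual may define additional corrections that are not included here. The helper name `atomization_energy` and all energy values are hypothetical placeholders.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # 1 Ha expressed in kJ/mol

def atomization_energy(e_molecule, e_atoms):
    """Return sum(E_atoms) - E_molecule in hartree (hypothetical helper)."""
    return sum(e_atoms) - e_molecule

# Placeholder energies in Ha; replace with your own results
e_molecule = -1.5
e_atoms = [-0.5, -0.5]

delta_ha = atomization_energy(e_molecule, e_atoms)
delta_kj = delta_ha * HARTREE_TO_KJ_PER_MOL
print(f"Atomization energy: {delta_ha:.4f} Ha = {delta_kj:.1f} kJ/mol")
```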

In [None]:
## Grab data on dipoles from the original files. You can do this a few ways:
#     - Search through each file, copy and paste the results
#     - Run a loop using the last few lines in the `run_scan.sh` script
#       to grab the files more quickly. Enter the values into the list below.
#     - Modify the `run_scan.sh` script directly to output a column of values for 
#       the dipoles. Your dataframe will already contain the data in that case.
#
# If you used one of the first two options, you can add a new column (in the form 
# of a list) to an existing dataframe by calling the name of the new column and 
# setting it equal to your values (in list form) like so: df['ColName'] = valList
# 
# Finally, plot the bond distance vs. the dipole
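
The column assignment described in the cell above can be sketched as follows. The dataframe layout, the column name `Dipole`, and every number are hypothetical; substitute the values you extracted from your own output files.

```python
import pandas as pd

# Hypothetical scan data indexed by bond distance; not real results
scan = pd.DataFrame(
    {"E": [-1.10, -1.20, -1.15]},
    index=pd.Index([0.8, 0.9, 1.0], name="dist"),
)

# One dipole value per row, in the same order as the distances (placeholders)
dipoles = [1.5, 1.7, 1.9]
scan["Dipole"] = dipoles

# With `dist` as the index, it is used as the x axis automatically
ax = scan.plot(y="Dipole", marker="o")
ax.set_ylabel("Dipole")
```

The list must have exactly as many entries as the dataframe has rows, otherwise pandas raises a `ValueError`.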

Repeat these last few cells using the data you produced in Step 5 of the lab (the only difference will be the files where the PBE1PBE data are found). 

# Problem 3 - WORK IN PROGRESS

After running the calculation on the planar H<sub>3</sub>O<sup>+</sup> molecule, we need to visualize the results. Many desktop programs can do this, including GaussView (part of the Gaussian software distribution), but there are also Python libraries that can extract data from the output and help us visualize the information right here in the Jupyter notebook. 

In [None]:
# Tool to work with Gaussian output
import cclib

# Viewing module so we can see results inline
# import nglview as nv
import avogadro as avo
import openchemistry as oc
import json
