<a href="https://colab.research.google.com/github/valsson-group/UNT-Chem5660-Fall2024/blob/main/Python-PlotDihedralData/PlotDihedralData.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Making Data Plots Using Python

In this notebook we show how to use NumPy and Matplotlib to make plots of data series.

The following resources can be useful:
  - [Matplotlib Documentation](https://matplotlib.org/stable/index.html)
  - [Matplotlib Cheat Sheets](https://matplotlib.org/cheatsheets/)
  - [Matplotlib Tutorials](https://matplotlib.org/stable/tutorials/index.html)
  - [NumPy User Guide](https://numpy.org/doc/stable/user/index.html)

We first import the NumPy and Matplotlib packages.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['figure.dpi'] = 150

## Loading the Datasets

The first step is to load the datasets we want to plot.

Since we are running this notebook on Google Colab, we need to upload the data files onto the runtime instance we are using. You can do this by selecting the folder icon here on the left and dragging the files there.

**Note that these files are only temporarily saved there and will be deleted once the runtime is terminated. Thus, do not use this to save or keep files.**

Once we have uploaded the data files, we load the data into a NumPy array using the `np.loadtxt("<name-of-datafile>")` function, where `<name-of-datafile>` is the name of the data file that we want to load. The filename should be enclosed in quotation marks.

### Example
Here we will use the `Benzamidine_Scan_HF_cc-pVDZ.relaxscanact.dat` data file from an ORCA calculation, which we download from the course GitHub repo, as an example. The call
```
data_hf_ccpvdz = np.loadtxt("Benzamidine_Scan_HF_cc-pVDZ.relaxscanact.dat")
```
loads the data from the file `Benzamidine_Scan_HF_cc-pVDZ.relaxscanact.dat` into a NumPy array with the variable name `data_hf_ccpvdz`.

We can access the first column of the data by using `data_hf_ccpvdz[:,0]` and the second column by using `data_hf_ccpvdz[:,1]` (note that Python indexing starts from 0).
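
As a minimal sketch of the loading and column indexing described above, the following uses a small in-memory text block in place of a real data file. The numbers are made up for illustration and are not taken from the benzamidine scan; the variable names `example_text`, `example_data`, `angles`, and `energies` are likewise only placeholders.

```python
import io
import numpy as np

# Made-up two-column data standing in for a data file
# (column 1: angle in degrees, column 2: energy in Hartree).
example_text = """\
0.0   -1.50
30.0  -1.48
60.0  -1.49
"""

# np.loadtxt accepts a filename or a file-like object such as io.StringIO
example_data = np.loadtxt(io.StringIO(example_text))

print(example_data.shape)      # (number of rows, number of columns)

angles = example_data[:, 0]    # first column
energies = example_data[:, 1]  # second column
print(angles)
print(energies)
```

With a real file you would pass the filename in quotation marks instead of the `io.StringIO(...)` object.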

### Your Own Data

For your own data sets, you need a separate `np.loadtxt` call for each data file, and a separate variable name for each data set.

In [None]:
%%capture
# This is only needed to download example data
!wget https://raw.githubusercontent.com/valsson-group/UNT-Chem5660-Fall2024/main/Python-PlotDihedralData/Benzamidine_Scan_HF_cc-pVDZ.relaxscanact.dat
!wget https://raw.githubusercontent.com/valsson-group/UNT-Chem5660-Fall2024/main/Python-PlotDihedralData/Benzamidine_Scan_HF_cc-pvdz_g16_tot_ener.txt


In [None]:
# This is just an example
data_hf_ccpvdz = np.loadtxt("Benzamidine_Scan_HF_cc-pVDZ.relaxscanact.dat")
data_hf_ccpvdz_g16 = np.loadtxt("Benzamidine_Scan_HF_cc-pvdz_g16_tot_ener.txt")

# For your own data, you need to write a separate np.loadtxt call for each data file,
# something like
# data_blyp = np.loadtxt("<name-of-datafile>")
# where you replace <name-of-datafile> with the filename.


# This is just to show how you can print the datasets.
# This is not really needed, so you can comment it out
# by adding # in front of the line.
print("Column 1:")
print(data_hf_ccpvdz[:,0])
print("")
print("Column 2:")
print(data_hf_ccpvdz[:,1])




## Plotting Data

We then plot the data by using the `plt.plot(...)` function in matplotlib.

### The plt.plot function
You need a separate `plt.plot(...)` call for each data set. For example,
```
plt.plot(data_hf_ccpvdz[:,0],
         (data_hf_ccpvdz[:,1]-np.min(data_hf_ccpvdz[:,1]))*Hartree2kJmol,
         "--x",
         markersize=5,
         label="HF/cc-pVDZ")
```
where we plot the first column of the data set on the x-axis (`x=data_hf_ccpvdz[:,0]`) and the second column of the data on the y-axis (`y=data_hf_ccpvdz[:,1]`), shifted and converted as described below.

### Aligning the minimum of the y-axis to zero
Note that here we are comparing energies obtained with different levels of theory. To make sure that the minimum of each curve is aligned with zero, we use the `np.min` function to find the minimum of the data set and subtract it from the data vector for the y-axis. Whether you want to do this will depend on your data.
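
The shift can be tried on its own with a few numbers. This is only a sketch: the energies below are made up and `shifted` is a placeholder name.

```python
import numpy as np

# Made-up energies in Hartree, for illustration only
energies = np.array([-1.50, -1.48, -1.49])

# Subtract the smallest value so that the lowest point of the curve is at zero
shifted = energies - np.min(energies)

print(shifted)
```

After the shift, the smallest entry is zero and all other entries give the energy relative to that minimum.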

### Converting units
Here the input data is given in Hartree, so we need to convert the values to kJ/mol. We do this by multiplying `(data_hf_ccpvdz[:,1]-np.min(data_hf_ccpvdz[:,1]))` by the variable `Hartree2kJmol`, which is defined in the code cell below. Whether you want to do this will depend on your data. If you want to use kcal/mol, you should use a different conversion factor.
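
As a small illustration of the unit conversion, the sketch below converts the same relative energies to both kJ/mol and kcal/mol. The conversion factors are the ones used in this notebook; the energies are made up, and the names `relative_hartree`, `relative_kjmol`, and `relative_kcalmol` are placeholders.

```python
import numpy as np

# Conversion factors as used in this notebook
Hartree2kJmol = 2625.50
Hartree2kcalmol = 627.51

# Made-up relative energies in Hartree (minimum already shifted to zero)
relative_hartree = np.array([0.0, 0.002, 0.001])

# Multiplying a NumPy array by a number converts every entry at once
relative_kjmol = relative_hartree * Hartree2kJmol
relative_kcalmol = relative_hartree * Hartree2kcalmol

print(relative_kjmol)
print(relative_kcalmol)
```

Remember to change the unit in the y-axis label if you switch the conversion factor.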

### Labelling curves
You can give each curve its own label by using the `label="<label-text>"` keyword. The legend that shows these labels is activated by the `plt.legend()` call.

### Axis labels
You can add a label to the x- and y-axis by using the `plt.xlabel("...")` and `plt.ylabel("...")` functions.

### Range of y-axis
You can change the range of the y-axis by using the `plt.ylim` function. For example, we set it to the range from 0 to 20 by using `plt.ylim([0,20])`. You will need to adjust this to your data.

### Location of x-axis ticks
We can set the location of the ticks on the x-axis by using the `plt.xticks(np.arange(0,390,30))` command. You might need to adjust this to your data if you have a different range on the x-axis, for example by using `plt.xticks(np.arange(-180,190,30))`.
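
To see which tick positions these `np.arange` calls produce, you can print them directly. A short sketch (the variable names are placeholders):

```python
import numpy as np

# np.arange(start, stop, step) excludes the stop value,
# so stop must be larger than the last tick we want
ticks_0_to_360 = np.arange(0, 390, 30)
ticks_pm180 = np.arange(-180, 190, 30)

print(ticks_0_to_360)
print(ticks_pm180)
```

The first array runs from 0 to 360 and the second from -180 to 180, both in steps of 30.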

### Saving figure
The figure will be shown in the notebook. We can also save the figure to a file by using the `plt.savefig("plot.png")` function. The extension used in the filename determines the format (e.g., `.jpg` or `.png`). You can then download the file from the file manager here on the left side.

Note that this file is only temporarily saved there and will be deleted once this runtime is terminated. Thus, you should always download the figure to your computer right away.
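
If the saved image looks too coarse or has too much white space, `plt.savefig` accepts optional keywords such as `dpi` and `bbox_inches`. The sketch below is one possible way to use them; the data is made up, and the filename `plot_highres.png` and the value `dpi=300` are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data for illustration only
x = np.arange(0, 390, 30)
y = 1.0 - np.cos(np.radians(x))

plt.plot(x, y, "--x", markersize=5, label="made-up curve")
plt.legend()
plt.xlabel("x [Degrees]")
plt.ylabel("y [arbitrary units]")

# dpi sets the resolution of the saved image;
# bbox_inches="tight" trims extra white space around the figure
plt.savefig("plot_highres.png", dpi=300, bbox_inches="tight")
```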



In [None]:
# Define conversion factors
Hartree2kJmol = 2625.50
Hartree2kcalmol = 627.51
Hartree2eV = 27.211

# Plot different data sets.
# You need to add a separate plt.plot call for each data set

plt.plot(data_hf_ccpvdz[:,0],
         (data_hf_ccpvdz[:,1]-np.min(data_hf_ccpvdz[:,1]))*Hartree2kJmol,
         "--x",
         markersize=5,
         label="HF/cc-pVDZ (ORCA 6)")

plt.plot(data_hf_ccpvdz_g16[:,0],
         (data_hf_ccpvdz_g16[:,1]-np.min(data_hf_ccpvdz_g16[:,1]))*Hartree2kJmol,
         "--o",
         markersize=5,
         label="HF/cc-pVDZ (Gaussian 16)")


# Show legend with labels
plt.legend()

# Set x- and y-axis labels
plt.xlabel("Dihedral Angle [Degrees]")
plt.ylabel("Potential Energy [kJ/mol]")

# Set range of the y-axis, in this case from 0 to 20
plt.ylim([0,20])

# Set x-ticks to be at -180, -150, -120, ..., 180
plt.xticks(np.arange(-180,210,30))

# Save the figure to a file
plt.savefig("plot_example.png")



## Your Own Data

You can now use the cell below to plot your own data.


### Saving figure
The figure will be shown in the notebook. We can also save the figure to a file by using the `plt.savefig("plot.png")` function. The extension used in the filename determines the format (e.g., `.jpg` or `.png`). You can then download the file from the file manager here on the left side.

Note that this file is only temporarily saved there and will be deleted once this runtime is terminated. Thus, you should always download the figure to your computer right away.



In [None]:
# Define conversion factors
Hartree2kJmol = 2625.50
Hartree2kcalmol = 627.51
Hartree2eV = 27.211


# Load data sets
# Copy this to load more data sets
data_1 = np.loadtxt("<name-of-datafile-1>")
data_2 = np.loadtxt("<name-of-datafile-2>")


# Plot different data sets.
# You need to add a separate plt.plot call for each data set

plt.plot(data_1[:,0],
         (data_1[:,1]-np.min(data_1[:,1]))*Hartree2kJmol,
         "--x",
         markersize=5,
         label="<label for data set 1>")

plt.plot(data_2[:,0],
         (data_2[:,1]-np.min(data_2[:,1]))*Hartree2kJmol,
         "--x",
         markersize=5,
         label="<label for data set 2>")



# Show legend with labels
plt.legend()

# Set x- and y-axis labels
plt.xlabel("<x-axis label. Remember to include units!>")
plt.ylabel("<y-axis label. Remember to include units!>")


# Set range of the y-axis, in this case from 0 to 20
plt.ylim([0,20])

# Set x-ticks to be at -180, -150, -120, ..., 180
plt.xticks(np.arange(-180,210,30))

# Save the figure to a file
# You can then download the file from the file manager here on the left side
plt.savefig("plot.png")
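
If you have many data sets, the repeated `plt.plot` calls can also be written as a loop. The following is only a sketch of that idea, not a required approach: the arrays `data_a` and `data_b` are made up and stand in for arrays loaded with `np.loadtxt`, and the labels are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

Hartree2kJmol = 2625.50

# Made-up arrays standing in for the results of np.loadtxt calls
angles = np.arange(-180, 210, 30)
data_a = np.column_stack([angles, -1.50 + 0.002 * np.cos(np.radians(angles))])
data_b = np.column_stack([angles, -1.60 + 0.003 * np.cos(np.radians(angles))])

# One (data, line/marker style, label) entry per data set
datasets = [
    (data_a, "--x", "data set A"),
    (data_b, "--o", "data set B"),
]

plt.figure()  # start a new, empty figure

for data, style, label in datasets:
    # Shift the minimum to zero and convert from Hartree to kJ/mol
    relative_energy = (data[:, 1] - np.min(data[:, 1])) * Hartree2kJmol
    plt.plot(data[:, 0], relative_energy, style, markersize=5, label=label)

plt.legend()
plt.xlabel("Dihedral Angle [Degrees]")
plt.ylabel("Potential Energy [kJ/mol]")
plt.xticks(np.arange(-180, 210, 30))
```

Adding another data set then only requires one more entry in the `datasets` list.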

