**<span style="color:#A03;font-size:14pt">
&#x1F528; Summary of Tutorial &#x1F528;
</span>** 

This tutorial teaches you:

> **1.** How to compute bond lengths using basic Python functionality.
>
> **2.** How to use [Numpy](https://numpy.org/) for a simpler and more efficient bond-length implementation.
>
> **3.** How to use [ASE](https://wiki.fysik.dtu.dk/ase/) to get all bond lengths at the same time (no implementation required).
>
> **4.** How to use [NGLView](http://nglviewer.org/nglview/latest/api.html) to visualize chemical structures!

**Notice:** 
- Make sure the conda environment `chem413` is activated and that the Jupyter notebook is launched from that environment.

- The water molecule is used as an example, and its Cartesian coordinates are taken from the lecture notes (Week 3 --- Molecular Structure).

In [1]:
# Water: Use lists to store atomic numbers and Cartesian coordinates

numbers = [8, 1, 1]

geometry = [[1.00000, 1.00000, 1.00000], # (x, y, z) coordinates of O
            [1.95700, 1.00000, 1.00000], # (x, y, z) coordinates of H
            [0.75876, 1.92609, 1.00000]] # (x, y, z) coordinates of H

# compute the number of atoms
print("Number of atoms = ", len(numbers))
print("Number of atoms = ", len(geometry))

Number of atoms =  3
Number of atoms =  3


# <font color=blue>Compute Bond Length</font>



Given the Cartesian coordinates of two points, i.e., $\mathbf{c_1}=(x_1, y_1, z_1)$ and $\mathbf{c_2}=(x_2, y_2, z_2)$, the Euclidean distance between them is computed by:

\begin{equation*}
r = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}
\end{equation*}

We will implement this formula to compute the O-H bond length of water.
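
As a quick hand check before writing any code, the O and first H coordinates defined in the first code cell can be substituted into the formula. This is only a worked sketch of the arithmetic; the label $r_{\mathrm{OH_1}}$ is introduced here for illustration, and the unit is angstrom, assuming the same convention as the lecture notes:

\begin{equation*}
r_{\mathrm{OH_1}} = \sqrt{(1.00000 - 1.95700)^2 + (1.00000 - 1.00000)^2 + (1.00000 - 1.00000)^2} = \sqrt{0.915849} = 0.957
\end{equation*}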

In [2]:
# 1. Define a function to compute Euclidean distance between two points

def distance(coord1, coord2):
    # unpack whole list into variables (assuming 3D space)
    x1, y1, z1 = coord1
    x2, y2, z2 = coord2
    # compute the term under the square root
    value = (x1 - x2)**2 + (y1 - y2)**2 + (z1 - z2)**2
    # take square root of value
    value = value**0.5
    # round value to 4 decimals
    return round(value, 4)

# 2. Compute distance between atoms in water (does it match lecture notes?)
print(" O-H1 Bond Length = ", distance(geometry[0], geometry[1]))
print(" O-H2 Bond Length = ", distance(geometry[0], geometry[2]))
print("H1-H2 Bond Length = ", distance(geometry[1], geometry[2]))
print("--- " * 7)

# 3. Alternative definition of distance function that works in any dimension

def distance_alternative(coord1, coord2):
    value = sum([(item1 - item2)**2 for item1, item2 in zip(coord1, coord2)])
    value = value**0.5
    return round(value, 4)

print(" O-H1 Bond Length = ", distance_alternative(geometry[0], geometry[1]))
print(" O-H2 Bond Length = ", distance_alternative(geometry[0], geometry[2]))
print("H1-H2 Bond Length = ", distance_alternative(geometry[1], geometry[2]))

 O-H1 Bond Length =  0.957
 O-H2 Bond Length =  0.957
H1-H2 Bond Length =  1.5144
--- --- --- --- --- --- --- 
 O-H1 Bond Length =  0.957
 O-H2 Bond Length =  0.957
H1-H2 Bond Length =  1.5144
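
As an optional cross-check, the standard library offers `math.dist` (available in Python 3.8 and later), which computes the same Euclidean distance. The sketch below repeats the water geometry so it runs on its own; the names `oh1`, `oh2`, and `hh` are illustrative and not part of the original notebook.

```python
import math

# Water geometry copied from the first code cell (angstrom)
geometry = [[1.00000, 1.00000, 1.00000],  # O
            [1.95700, 1.00000, 1.00000],  # H1
            [0.75876, 1.92609, 1.00000]]  # H2

# math.dist accepts two sequences of equal length and returns their Euclidean distance
oh1 = round(math.dist(geometry[0], geometry[1]), 4)
oh2 = round(math.dist(geometry[0], geometry[2]), 4)
hh = round(math.dist(geometry[1], geometry[2]), 4)

print(" O-H1 Bond Length = ", oh1)
print(" O-H2 Bond Length = ", oh2)
print("H1-H2 Bond Length = ", hh)
```

If the hand-written `distance` function is correct, these values should agree with its output after rounding.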


**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON Exercise!
</span>** 

Using the Formamide (HCONH2) geometry (from lecture notes) in the code cell below:
> - Define a list containing its atomic numbers.
>
> - Define a list containing the Cartesian coordinates of its atoms.
>
> - Compute the Euclidean distance between various pairs of atoms.

To check the correctness of the computed values, refer to Formamide's Z-matrix in the lecture notes (Week 3 -- Molecular Structure). For example, the O-C bond length is expected to be 1.205 angstrom.


In [None]:
# Formamide's Cartesian Coordinates

# O          1.08315        0.22427        0.00004
# N         -1.18647        0.17565        0.00000
# C          0.04483       -0.38784       -0.00003
# H         -1.26832        1.17683        0.00007
# H         -2.01601       -0.38579       -0.00006
# H          0.00241       -1.49112       -0.00012

# 1. Define two lists to store the atomic numbers and coordinates of Formamide


# 2. Compute distance between various pairs of atoms and compare to Z-matrix values


# <font color=blue>Numpy Library</font>

[Numpy](https://numpy.org/) is a Python library supporting large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Here, we will use Numpy for a simpler and more efficient bond-length implementation based on the [numpy.linalg.norm](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function, which computes the norm (i.e., length) of a vector. This works because the Euclidean distance between two points is the norm of the vector connecting them (i.e., the difference vector).

Given the Cartesian coordinates of two points, i.e., $\mathbf{c_1}=(x_1, y_1, z_1)$ and $\mathbf{c_2}=(x_2, y_2, z_2)$, the Euclidean distance between them is computed by:

\begin{equation*}
r = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} 
  = \|\mathbf{c_1} - \mathbf{c_2}\|
\end{equation*}


In [3]:
import numpy as np

# Define arrays of atomic numbers and coordinates

nums = np.array(numbers)
geom = np.array(geometry)

print(numbers, type(numbers))
print(nums, type(nums))
print("")
print(geometry, type(geometry))
print(geom, type(geom))

[8, 1, 1] <class 'list'>
[8 1 1] <class 'numpy.ndarray'>

[[1.0, 1.0, 1.0], [1.957, 1.0, 1.0], [0.75876, 1.92609, 1.0]] <class 'list'>
[[1.      1.      1.     ]
 [1.957   1.      1.     ]
 [0.75876 1.92609 1.     ]] <class 'numpy.ndarray'>
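
One reason to convert the lists to arrays is that arrays support element-wise arithmetic, which the difference vector relies on. The minimal sketch below uses illustrative names (`o_list`, `h_list`, `list_subtraction_works`, `diff`) that are not part of the original notebook. It shows that subtracting two plain lists raises a `TypeError`, while subtracting two arrays gives the difference vector directly.

```python
import numpy as np

# Coordinates of O and the first H of water, as plain Python lists
o_list = [1.00000, 1.00000, 1.00000]
h_list = [1.95700, 1.00000, 1.00000]

# Plain lists do not define subtraction, so this raises TypeError
try:
    o_list - h_list
    list_subtraction_works = True
except TypeError:
    list_subtraction_works = False

# Numpy arrays subtract element by element, giving the difference vector
diff = np.array(o_list) - np.array(h_list)

print("list subtraction works:", list_subtraction_works)
print("difference vector:", diff)
```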


In [4]:
# Define distance function using numpy

def distance_numpy(coord1, coord2):
    # coord1 & coord2 are expected to be numpy arrays
    value = np.linalg.norm(coord1 - coord2)
    return round(value, 4)

print(" O-H1 Bond Length = ", distance_numpy(geom[0], geom[1]))
print(" O-H2 Bond Length = ", distance_numpy(geom[0], geom[2]))
print("H1-H2 Bond Length = ", distance_numpy(geom[1], geom[2]))


 O-H1 Bond Length =  0.957
 O-H2 Bond Length =  0.957
H1-H2 Bond Length =  1.5144
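
The same idea extends to all pairs of atoms at once. The sketch below is one possible approach, not part of the original notebook: it relies on Numpy broadcasting to build every difference vector in a single step and then takes the norm along the last axis. The names `diff` and `dist_matrix` are illustrative.

```python
import numpy as np

# Water geometry copied from the first code cell (angstrom)
geom = np.array([[1.00000, 1.00000, 1.00000],   # O
                 [1.95700, 1.00000, 1.00000],   # H1
                 [0.75876, 1.92609, 1.00000]])  # H2

# Broadcasting: (3, 1, 3) - (1, 3, 3) -> (3, 3, 3)
# diff[i, j] is the difference vector between atom i and atom j
diff = geom[:, None, :] - geom[None, :, :]

# Norm over the last axis (x, y, z) gives a (3, 3) matrix of distances
dist_matrix = np.linalg.norm(diff, axis=-1)

print(np.round(dist_matrix, 4))
```

The resulting matrix is symmetric with zeros on the diagonal, so each bond length appears twice; entry `[0, 1]` is the O-H1 distance and entry `[1, 2]` is the H1-H2 distance.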


**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON Exercise!
</span>** 

> Write a for loop to compute the distance between unique pairs of atoms in Formamide.

In [None]:
# Write a for loop to compute unique bond distances


# <font color=blue>ASE Library</font>

[Atomic Simulation Environment (ASE)](https://wiki.fysik.dtu.dk/ase/) is a set of tools and Python modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.


Here, we will use [ase.Atoms](https://wiki.fysik.dtu.dk/ase/ase/atoms.html#module-ase.atoms) to define the water molecule as a collection of atoms.


In [5]:
import ase

# Make an instance of Atoms class for water
water = ase.Atoms('OH2', positions=geometry)

# get atomic numbers
print("atomic numbers = ", water.numbers)
print("")

# Compute atomic distances (we don't need to reinvent the wheel!)
print("Distance Matrix:")
print(water.get_all_distances())

atomic numbers =  [8 1 1]

Distance Matrix:
[[0.         0.957      0.956995  ]
 [0.957      0.         1.51440476]
 [0.956995   1.51440476 0.        ]]


In [6]:
# Bonus: Using ASE, you can easily get atomic numbers, masses, & symbols
print("atomic numbers = ", water.get_atomic_numbers())
print("atomic masses  = ", water.get_masses())
print("atomic symbols = ", water.get_chemical_symbols())

atomic numbers =  [8 1 1]
atomic masses  =  [15.999  1.008  1.008]
atomic symbols =  ['O', 'H', 'H']


**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON Exercise!
</span>** 

Using the Formamide (HCONH2) geometry (from lecture notes) in the code cell below:
> - Make an instance of `ase.Atoms` for the Formamide molecule.
>
> - Get the O-C bond length using the `get_all_distances()` method (answer is 1.205 angstrom).
>
> - Compute the molecular mass of Formamide! (answer is 45.04 g/mol)


# <font color=blue>NGLView</font>

[NGLView](http://nglviewer.org/nglview/latest/api.html) is an IPython/Jupyter widget for interactively viewing molecular structures and trajectories. Here, we will use NGLView to visualize the structure of molecules and proteins!

In [7]:
# Visualize water

import nglview

view = nglview.show_ase(water)
view

**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON Exercise!
</span>** 


> Visualize the Formamide (HCONH2) molecule. 


In [None]:
# Visualize Formamide


**Cool Feature:** You can use NGLView to visualize a protein knowing only its PDB ID (e.g., [3PQR](https://www.rcsb.org/structure/3PQR)) and even save an image of it!

In [8]:
# Visualize 3PQR protein

import nglview as nv
view = nv.show_pdbid("3pqr")
view

In [16]:
view.download_image("3pqr_jupyter.png")