# The Hessian
### This tutorial covers guessing the Hessian and transforming it between coordinate systems in geometry optimizations.

Newton-Raphson and Newton-Raphson-like optimization methods require the matrix of energy second derivatives (called the "Hessian"). If internal coordinates are used, the starting Hessian in internal coordinates is either estimated or produced by transforming a Cartesian Hessian. The Cartesian Hessian is computed by analytic second-derivative methods, or by finite differences of gradients or energies. To reduce computational expense, this Cartesian Hessian may be computed at a lower level of theory than the one used for the optimization itself.
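As an illustration of the finite-difference route, here is a minimal sketch that builds a Hessian from central differences of gradients. The function `hessian_from_gradients`, the step size `h`, and the quadratic model surface `K` are hypothetical choices made for this example; they are not part of Psi4 or of the helper module used in this tutorial.

```python
import numpy as np

def hessian_from_gradients(grad, x, h=1.0e-4):
    """Central-difference Hessian built from gradients at displaced points."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for a in range(n):
        step = np.zeros(n)
        step[a] = h
        # Column a: d(gradient)/d(x_a) from a two-point central difference.
        H[:, a] = (grad(x + step) - grad(x - step)) / (2.0 * h)
    # Symmetrize to remove small numerical asymmetry.
    return 0.5 * (H + H.T)

# Hypothetical model surface E = 1/2 x^T K x, whose gradient is K x.
K = np.array([[0.5, 0.1],
              [0.1, 0.2]])

def model_gradient(x):
    return K @ x

H_fd = hessian_from_gradients(model_gradient, [0.3, -0.2])
print(H_fd)
```

Each column costs two gradient evaluations, so a molecule with $N$ atoms needs $6N$ gradients in this simple scheme. That cost is one reason a cheaper level of theory, or an empirical guess, is attractive.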

However, the common empirical formulas for "guessing" the stretch, bend, and torsion force constants are generally sufficient. In fact, for ordinary bonding situations, computing the initial Hessian results in little reduction in the number of geometry steps to convergence. Each internal coordinate type has a method that guesses its own force constant using one of several published formulas; here we start with the default 'SIMPLE' method. Let's see what a "simple" guess Hessian for water looks like.

In [None]:
import psi4
from psi4 import *
from psi4.core import *
import numpy as np
import os
import sys
# Make the local opt_helper package importable from the current directory.
sys.path.append(os.getcwd())
from opt_helper import stre, bend, tors, intcosMisc, linearAlgebra

In [None]:
mol = psi4.geometry("""
O
H 1 0.9
H 1 0.9 2 104
""")
mol.update_geometry()
Natom = mol.natom()
Z = [int(mol.Z(i)) for i in range(Natom)]
xyz = np.array(mol.geometry())

# Manually create a list including both O-H stretches and 
# the H-O-H bend.
intcos = [stre.STRE(0,1), stre.STRE(0,2), bend.BEND(1,0,2)]
Nintco = len(intcos)
print("Internal Coordinates")
for intco in intcos: 
    print(intco) 

# Build a diagonal guess Hessian from a simple, empirical rule.
H = np.zeros((Nintco,Nintco), float)
for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="SIMPLE")
print("\nSimple Guess Hessian for Water (in au)")
print(H)

You can see that the very simple rule is 0.5 au for bond stretches and 0.2 au for angles (and 0.1 au for dihedrals). For many molecular configurations, this Hessian works nearly as well in optimizations as one determined by a second derivative computation! However, there are commonly used formulas that depend on atomic number and geometry; these are generally more effective and have better asymptotic behavior (e.g., at long distances). Here are the corresponding guesses using the formulae from Schlegel [_Theor. Chim. Acta_, 66, 333 (1984)] and from Fischer and Almlöf [_J. Phys. Chem._, 96, 9770 (1992)].

The off-diagonal elements, being difficult to estimate, are typically set to zero. Some have advocated a small non-zero value for them, but in any event, the Hessian update schemes (covered in another tutorial) will change them.
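For reference, the simple rule quoted above can be written out without the helper module. This is only a sketch: the dictionary `SIMPLE_GUESS` and the function `simple_guess_hessian` are names invented for this illustration, and the values are the ones stated in the text.

```python
import numpy as np

# Force constants (au) for the "simple" rule quoted in the text.
SIMPLE_GUESS = {"stretch": 0.5, "bend": 0.2, "torsion": 0.1}

def simple_guess_hessian(coord_types):
    """Diagonal guess Hessian; all off-diagonal elements are left at zero."""
    return np.diag([SIMPLE_GUESS[t] for t in coord_types])

# Water: two O-H stretches and one H-O-H bend.
H_guess = simple_guess_hessian(["stretch", "stretch", "bend"])
print(H_guess)
```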

In [None]:
for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="SCHLEGEL")
print("Schlegel Guess Hessian for Water (in au)")
print(H)

for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="FISCHER")
print("\nFischer Guess Hessian for Water (in au)")
print(H)

Psi4 and other quantum chemistry programs can compute Cartesian Hessians using either analytic second derivatives or finite differences of first derivatives (or energies). Here, we assume the program has provided the Hessian in Cartesian coordinates. Because we wish to carry out our optimization in internal coordinates, we need to transform this Hessian into internal coordinates for use in our Newton-Raphson-like algorithm.

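Starting with the definition of the Hessian in Cartesian coordinates
$$ \textbf H_{ab} = \frac{\partial ^2 E}{\partial x_a\partial x_b}$$

we use the B-matrix elements $\partial q_i / \partial x_a$ and the chain rule (summation over the repeated internal coordinate indices $i$ and $j$ is implied)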
\begin{align}
\textbf H_{ab} &= \frac{\partial}{\partial x_a}\Big( \frac{\partial E}{\partial q_i}\frac{\partial q_i}{\partial x_b}\Big ) \\
 &= \frac{\partial ^2 E}{\partial x_a \partial q_i} \frac{\partial q_i}{\partial x_b} + \frac{\partial E}{\partial q_i}\frac{\partial ^2 q_i}{\partial x_a \partial x_b} \\
 &= \Big(\frac{\partial}{\partial x_a} \frac{\partial E}{\partial q_i}\Big) \frac{\partial q_i}{\partial x_b} + \frac{\partial E}{\partial q_i}\frac{\partial ^2 q_i}{\partial x_a \partial x_b}\\
 &= \Big(\frac{\partial q_j}{\partial x_a} \frac{\partial}{\partial q_j} \frac{\partial E}{\partial q_i}\Big) \frac{\partial q_i}{\partial x_b} + \frac{\partial E}{\partial q_i}\frac{\partial ^2 q_i}{\partial x_a \partial x_b}\\
 &= \frac{\partial q_j}{\partial x_a}\frac{\partial ^2 E}{\partial q_j \partial q_i} \frac{ \partial q_i}{\partial x_b} + \frac{\partial E }{\partial q_i} \frac {\partial ^2 q_i}{\partial x_a \partial x_b}
\end{align}

Introducing the derivative B-matrix for internal coordinate $i$, $\textbf B^i_{ab} = \partial^2 q_i / \partial x_a \partial x_b$, and the internal coordinate gradient $g_i = \partial E / \partial q_i$, we can write the above equation in matrix form for the transformation of the internal coordinate Hessian into Cartesian coordinates.

$$ \textbf{H}_{\rm{cart}} = \textbf{B}^T \textbf{H}_{\rm{int}} \textbf{B} + g_i \textbf{B}^i$$

At stationary points where the gradient is zero, the second term vanishes. The contribution of this term is generally small, and it may not be worth the expense of computing when generating Hessians for stationary point searches.
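The forward transformation can be checked numerically on a toy system. The sketch below assumes a hypothetical two-atom model with a single stretch coordinate and a harmonic energy; the force constant `k`, reference distance `r0`, and geometry `x` are made-up values. It compares $\textbf{B}^T \textbf{H}_{\rm{int}} \textbf{B} + g_i \textbf{B}^i$ against a Cartesian Hessian obtained by finite differences of the energy.

```python
import numpy as np

k, r0 = 0.5, 1.8   # hypothetical force constant (au) and reference distance (bohr)
x = np.array([0.0, 0.0, 0.0, 0.3, -0.2, 2.0])   # two atoms, not at the minimum

def distance(x):
    return np.linalg.norm(x[3:] - x[:3])

def energy(x):
    return 0.5 * k * (distance(x) - r0) ** 2

r = distance(x)
e = (x[3:] - x[:3]) / r                    # unit vector from atom 1 to atom 2
B = np.concatenate([-e, e])[None, :]       # dq/dx, shape (1, 6)
P = (np.eye(3) - np.outer(e, e)) / r       # 3x3 block of d^2q/dx dx
dB = np.block([[P, -P], [-P, P]])          # derivative B-matrix, shape (6, 6)

g_q = np.array([k * (r - r0)])             # dE/dq (non-zero here)
H_int = np.array([[k]])                    # d^2E/dq^2
H_cart = B.T @ H_int @ B + g_q[0] * dB

# Reference: central finite differences of the energy.
h = 1.0e-4
steps = h * np.eye(6)
H_fd = np.zeros((6, 6))
for a in range(6):
    for b in range(6):
        da, db = steps[a], steps[b]
        H_fd[a, b] = (energy(x + da + db) - energy(x + da - db)
                      - energy(x - da + db) + energy(x - da - db)) / (4.0 * h * h)

print("max deviation:", np.max(np.abs(H_cart - H_fd)))
```

Because the geometry is displaced from `r0`, the gradient term contributes here; dropping it leaves a Cartesian Hessian that no longer matches the finite-difference reference.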

We still need to derive the formula for the inverse transformation.  See the first tutorial for the introduction of the $\mathbf A^T$ matrix, the generalized left-inverse of $ \mathbf{B}^T $, where 

$$ \textbf A^T = (\textbf{B} \textbf{u} \textbf {B}^T)^{-1} \textbf {B} \textbf{u}$$

so that 

$$ \textbf {A}^T \textbf{H}_{\rm{cart}} \textbf {A} = \textbf{A}^T \textbf{B}^T \textbf{H}_{\rm{int}} \textbf{BA} + \textbf {A}^T g_i \textbf{B}^i \textbf A$$

and

$$ \textbf{H}_{\rm{int}} = \textbf A^T \textbf {H}_{\mathrm{cart}} \textbf A - \textbf A^T  g_i \textbf {B}^i \textbf A $$

We can factor the terms to get the following:

$$ \textbf{H}_{\rm{int}} = \textbf A^T ( \textbf {H}_{\mathrm{cart}} - g_i \textbf {B}^i ) \textbf A $$
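A small numerical check of this inverse transformation, again on a hypothetical two-atom harmonic model (all numbers are invented for illustration), might look like the following. It assumes unit weights, $\textbf{u} = \textbf{1}$, builds $\textbf{A}^T$ from the formula above, and recovers the internal coordinate force constant from the Cartesian Hessian.

```python
import numpy as np

k, r0 = 0.5, 1.8   # hypothetical force constant (au) and reference distance (bohr)
x = np.array([0.0, 0.0, 0.0, 0.3, -0.2, 2.0])

r = np.linalg.norm(x[3:] - x[:3])
e = (x[3:] - x[:3]) / r
B = np.concatenate([-e, e])[None, :]        # dq/dx, shape (1, 6)
P = (np.eye(3) - np.outer(e, e)) / r
dB = np.block([[P, -P], [-P, P]])           # d^2q/dx dx, shape (6, 6)
g_q = np.array([k * (r - r0)])              # gradient in internal coordinates

# Cartesian Hessian of the model, including the gradient term.
H_cart = k * (B.T @ B) + g_q[0] * dB

u = np.eye(6)                               # unit weights (an assumption)
At = np.linalg.inv(B @ u @ B.T) @ B @ u     # A^T = (B u B^T)^-1 B u
left_inverse = At @ B.T                     # should be the identity

H_int = At @ (H_cart - g_q[0] * dB) @ At.T
print(left_inverse, H_int)
```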

Now let's convert a Psi4 Cartesian Hessian into internals!

In [None]:
# Compute B and A matrices.
B = intcosMisc.Bmat(intcos, xyz)
G = np.dot(B, B.T)
Ginv = linearAlgebra.symmMatInv(G)
Atranspose = np.dot(Ginv, B)

# We'll use cc-pVDZ RHF.
psi4.set_options({"basis": "cc-pvdz"})

# Get gradient in cartesian coordinates, then convert to internals.
g_x = np.reshape( np.array( psi4.gradient('scf')), (3*Natom))
g_q = np.dot(Atranspose, g_x)

print("Gradient in internal coordinates")
print(g_q)

# Get Hessian in Cartesian coordinates.
H_cart = np.array( psi4.hessian('scf') )
# print("Hessian in Cartesian coordinates")
# print(H_cart)

In [None]:
# Convert Cartesian Hessian to internals.
# A^t (Hxy - Kxy) A;    K_xy = sum_q ( grad_q[I] d^2(q_I)/(dx dy) )
Ncart = 3 * mol.natom()
dq2dx2 = np.zeros((Ncart,Ncart), float)

# Work on a copy so that re-running this cell does not subtract K twice.
H_corr = H_cart.copy()
for I, q in enumerate(intcos):
    dq2dx2[:] = 0
    q.Dq2Dx2(xyz, dq2dx2)   # d^2(q_I)/ dx_i dx_j
    H_corr -= g_q[I] * dq2dx2   # subtract this coordinate's contribution to K

H_int = np.dot(Atranspose, np.dot(H_corr, Atranspose.T))
print("Hessian in internal coordinates")
print(H_int)

This result may be compared with the guess Hessians from the formulas above. It confirms that in these chemically intuitive coordinates (stretches, bends, etc.) the Hessian is strongly diagonally dominant, which is why it can be readily estimated.

During the course of the optimization, the Hessian may be recalculated at each step (or every fixed number of steps). However, computing the Hessian is costly, where it is possible at all. In most cases, updating the Hessian with first derivative information works nearly as well as recomputing it and requires no additional derivative computations. For Hessian updating, see another tutorial.
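To give a flavor of what "updating with first derivative information" means, here is a minimal sketch of the BFGS formula, one common update scheme. It is not necessarily the scheme used in the other tutorial, and the function name `bfgs_update` and all numbers below are hypothetical.

```python
import numpy as np

def bfgs_update(H, dq, dg):
    """BFGS-updated Hessian from a step dq and the gradient change dg."""
    Hdq = H @ dq
    return H + np.outer(dg, dg) / (dg @ dq) - np.outer(Hdq, Hdq) / (dq @ Hdq)

# Hypothetical data: a diagonal guess and a "true" quadratic surface.
H_true = np.array([[0.55, 0.03],
                   [0.03, 0.16]])
H_guess = np.diag([0.5, 0.2])
dq = np.array([0.05, -0.02])    # step taken in internal coordinates
dg = H_true @ dq                # resulting change in the gradient

H_new = bfgs_update(H_guess, dq, dg)
print(H_new)
print(H_new @ dq - dg)          # secant condition: H_new dq = dg
```

The update uses only the step and the two gradients that the optimizer already has, and the updated matrix picks up non-zero off-diagonal elements even though the guess was diagonal.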