# The Hessian in Optimizations
### This tutorial covers the guessing, updating, and transformations of the Hessian (energy second derivative).

Newton-Raphson and Newton-Raphson-like optimization methods require an energy second derivative (Hessian).  The starting Hessian in internal coordinates is either guessed, or else produced by transformation of a Cartesian Hessian computed at some level of theory.  The simple "guessing" formulas for stretch, bend, and torsion force constants generally are sufficient.

Each internal coordinate has a method that guesses its own force constant using one of several published schemes; here we use the default 'SIMPLE' scheme.

Let's see what a "simple" guess Hessian for water looks like. 

In [1]:
import psi4
from psi4 import *
from psi4.core import *
import numpy as np

mol = psi4.geometry("""
O
H 1 0.9
H 1 0.9 2 104
""")

# Set some options
psi4.set_options({"basis": "cc-pvdz"})
mol.update_geometry()
 
from optking import stre, bend, tors
intcos = [stre.STRE(0,1), stre.STRE(0,2), bend.BEND(1,0,2)]
for intco in intcos: 
    print (intco) 
Natom = 3
Nintco = len(intcos)
Z = [int(mol.Z(i)) for i in range(Natom)]
xyz = np.array(mol.geometry())
H = np.zeros((Nintco,Nintco), float)
for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="SIMPLE")
print("Simple Guess Hessian for Water (in au)")
print (H)

 R(1,2)
 R(1,3)
 B(2,1,3)
Simple Guess Hessian for Water (in au)
[[ 0.5  0.   0. ]
 [ 0.   0.5  0. ]
 [ 0.   0.   0.2]]
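
The diagonal matrix printed above can also be assembled by hand, since the "simple" scheme assigns one constant per coordinate type. The following is a minimal sketch that does not use optking; the dictionary `SIMPLE_FC` and the function `simple_guess_hessian` are illustrative names, not part of any library, and the constants are the ones quoted in this tutorial.

```python
import numpy as np

# Illustrative constants (in au) for the "simple" rule; names are hypothetical.
SIMPLE_FC = {"stretch": 0.5, "bend": 0.2, "torsion": 0.1}

def simple_guess_hessian(coord_types):
    """Return a diagonal Hessian with one constant per coordinate type."""
    return np.diag([SIMPLE_FC[t] for t in coord_types])

# Water: two O-H stretches and one H-O-H bend.
H_simple = simple_guess_hessian(["stretch", "stretch", "bend"])
print(H_simple)
```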


You can see that the simple rule is just 0.5 au for bond stretches and 0.2 au for angles (and 0.1 au for dihedrals).  For ordinary molecular configurations, this Hessian works nearly as well as one determined by a second derivative computation.  There are also commonly used geometry-dependent formulas that have better asymptotic behavior.  Here are the equivalent guesses using the formulae from Schlegel [Schlegel, _Theor. Chim. Acta_, 66, 333 (1984)] and from Fischer [Fischer and Almlof, _J. Phys. Chem._, 96, 9770 (1992)].

In [2]:
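from importlib import reload  # reload is not a builtin in Python 3
reload(stre)
# Geometry-dependent guess from Schlegel's formula
for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="SCHLEGEL")
print("Schlegel Guess Hessian for Water (in au)")
print(H)
# Geometry-dependent guess from Fischer and Almlof's formula
for i,intco in enumerate(intcos):
    H[i,i] = intco.diagonalHessianGuess(xyz, Z, guessType="FISCHER")
print("Fischer Guess Hessian for Water (in au)")
print(H)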

Schlegel Guess Hessian for Water (in au)
[[ 0.70672641  0.          0.        ]
 [ 0.          0.70672641  0.        ]
 [ 0.          0.          0.16      ]]
Fischer Guess Hessian for Water (in au)
[[ 0.46569723  0.          0.        ]
 [ 0.          0.46569723  0.        ]
 [ 0.          0.          0.29459425]]


Psi4 and other quantum chemistry programs can compute Hessians using either analytic second derivatives or finite differences.  Here, we assume the program has provided the Hessian in Cartesian coordinates.  We need to transform this Hessian into internal coordinates for use in our Newton-Raphson-like algorithm.  

Starting with the definition of the Hessian
$$ \textbf H_{ab} = \frac{\partial ^2 E}{\partial x_a\partial x_b}$$

and substituting in the partial derivatives with respect to the internal coordinates $q_i$ (summation over repeated indices is implied) gives
$$ \textbf H_{ab} = \frac{\partial}{\partial x_a}\Big( \frac{\partial E}{\partial q_i}
\frac{\partial q_i}{\partial x_b}\Big )$$

$$ = \frac{\partial ^2 E}{\partial x_a \partial q_i} \frac{\partial q_i}{\partial x_b} + \frac{\partial E}{\partial q_i}\frac{\partial ^2 q_i}{\partial x_a \partial x_b}$$

$$ = \frac{\partial q_j}{\partial x_a}\frac{\partial ^2 E}{\partial q_j \partial q_i} \frac{ \partial q_i}{\partial x_b} + \frac{\partial E }{\partial q_i} \frac {\partial ^2 q_i}{\partial x_a \partial x_b}$$

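With the introduction of the derivative B-matrix for each internal coordinate ($\textbf B^i$, with elements $\partial ^2 q_i / \partial x_a \partial x_b$) and the internal-coordinate gradient $g_i = \partial E / \partial q_i$, we can write the equation in matrix form for the transformation of the internal coordinate Hessian into Cartesian coordinates.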

$$ \textbf{H}_{\rm{cart}} = \textbf{B}^T \textbf{H}_{\rm{int}} \textbf{B} + g_i \textbf{B}^i$$

At stationary points where the gradient is zero, the second term vanishes. The contribution of this term is generally small and may not be worth the expense of computing when generating Hessians for stationary point searches.
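
To make the role of the second term concrete, here is a small sketch on a model problem rather than a molecule: a single point in the plane with one internal coordinate $r = |x|$ and the energy $E = \frac{k}{2}(r - r_0)^2$. The model, the constants `k` and `r0`, and all function names are assumptions made for illustration. The Hessian assembled from $\textbf{B}^T \textbf{H}_{\rm{int}} \textbf{B} + g_i \textbf{B}^i$ is compared against a finite-difference Cartesian Hessian at a geometry away from the minimum, where the gradient term does not vanish.

```python
import numpy as np

k, r0 = 0.5, 1.0   # model force constant and equilibrium distance (arbitrary)

def energy(x):
    """E = k/2 (r - r0)^2 with the single internal coordinate r = |x|."""
    return 0.5 * k * (np.linalg.norm(x) - r0) ** 2

def hessian_from_internals(x):
    """Cartesian Hessian from B^T H_int B + g_q * B' for r = |x|."""
    r = np.linalg.norm(x)
    n = x / r
    B = n.reshape(1, -1)                          # dr/dx, shape (1, 2)
    dB = (np.eye(len(x)) - np.outer(n, n)) / r    # d^2 r / dx_a dx_b
    H_int = np.array([[k]])                       # d^2 E / dr^2
    g_q = k * (r - r0)                            # dE / dr
    return B.T @ H_int @ B + g_q * dB

def hessian_fd(x, h=1.0e-4):
    """Central finite-difference Hessian of energy(x)."""
    n = len(x)
    H = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            ea, eb = np.eye(n)[a] * h, np.eye(n)[b] * h
            H[a, b] = (energy(x + ea + eb) - energy(x + ea - eb)
                       - energy(x - ea + eb) + energy(x - ea - eb)) / (4 * h * h)
    return H

x = np.array([0.9, 1.2])        # r = 1.5, away from the minimum at r0 = 1
H_cart = hessian_from_internals(x)
H_ref = hessian_fd(x)
print(np.allclose(H_cart, H_ref, atol=1.0e-6))
```

At a point with $r = r_0$ the gradient is zero and the same function reduces to $\textbf{B}^T \textbf{H}_{\rm{int}} \textbf{B}$ alone.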

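We still need to derive the formula for the inverse transformation.  See the first tutorial for the introduction of the $\mathbf {A}$ matrix; its transpose $\mathbf{A}^T$ is the generalized left-inverse of $ \mathbf{B}^T $, where 

$$ \textbf A^T = (\textbf{B} \textbf{u} \textbf {B}^T)^{-1} \textbf {B} \textbf{u}$$

Multiplying the forward transformation on the left by $\textbf{A}^T$ and on the right by $\textbf{A}$ gives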

$$ \textbf {A}^T \textbf{H}_{\rm{cart}} \textbf {A} = \textbf{A}^T \textbf{B}^T \textbf{H}_{\rm{int}} \textbf{BA} + \textbf {A}^T g_i \textbf{B}^i \textbf A$$

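Because $\textbf{A}^T \textbf{B}^T = \textbf{I}$ (and therefore $\textbf{BA} = \textbf{I}$), this rearranges to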

$$ \textbf{H}_{\rm{int}} = \textbf A^T \textbf {H}_{\mathrm{cart}} \textbf A - \textbf A^T  g_i \textbf {B}^i \textbf A $$

We can factor the terms to get the following:

$$ \textbf{H}_{\rm{int}} = \textbf A^T ( \textbf {H}_{\mathrm{cart}} - g_i \textbf {B}^i ) \textbf A $$
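
The two transformations can be checked against each other with a round trip. The sketch below uses random arrays in place of a real $\textbf{B}$ matrix, gradient, and derivative B-matrices, so it only tests the linear algebra, not any chemistry; it assumes $\textbf{u}$ is the identity and that $\textbf{B}$ has full row rank (non-redundant coordinates). The function names are illustrative.

```python
import numpy as np

def hess_int_to_cart(H_int, B, g_q, dB):
    """H_cart = B^T H_int B + sum_i g_q[i] * dB[i]."""
    return B.T @ H_int @ B + np.einsum("i,iab->ab", g_q, dB)

def hess_cart_to_int(H_cart, B, g_q, dB):
    """H_int = A^T (H_cart - sum_i g_q[i] * dB[i]) A, taking u = identity."""
    At = np.linalg.solve(B @ B.T, B)            # A^T = (B B^T)^-1 B
    K = np.einsum("i,iab->ab", g_q, dB)         # gradient term
    return At @ (H_cart - K) @ At.T

# Random stand-ins (NOT a real molecule): 3 internals, 9 Cartesians.
rng = np.random.default_rng(0)
n_int, n_cart = 3, 9
B = rng.normal(size=(n_int, n_cart))
H_int = rng.normal(size=(n_int, n_int))
H_int = 0.5 * (H_int + H_int.T)                 # symmetrize
g_q = rng.normal(size=n_int)
dB = rng.normal(size=(n_int, n_cart, n_cart))
dB = 0.5 * (dB + dB.transpose(0, 2, 1))         # each dB[i] symmetric

H_cart = hess_int_to_cart(H_int, B, g_q, dB)
H_back = hess_cart_to_int(H_cart, B, g_q, dB)
print(np.allclose(H_back, H_int))
```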

Now let's convert the Psi4 Cartesian Hessian into internals!

In [3]:
# Function to return a generalized inverse
import math
def symmMatInv(A):
    dim = A.shape[0]
    det = 1.0

    evals, evects = np.linalg.eigh(A)
    evects = evects.T
    for i in range(dim):
        det *= evals[i]

    diagInv = np.zeros( (dim,dim), float)
    for i in range(dim):
        if math.fabs(evals[i]) > 1.0e-10:
            diagInv[i,i] = 1.0/evals[i]
            
    # A^-1 = P^t D^-1 P
    tmpMat = np.dot(diagInv, evects)
    AInv = np.dot(evects.T, tmpMat)
    return AInv

Ncart = 3*Natom

# Compute B and A^T matrices
from optking import intcosMisc
B = intcosMisc.Bmat(intcos, xyz)
G = np.dot(B, B.T)
Ginv = symmMatInv(G)
Atranspose = np.dot(Ginv, B)
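
A generalized inverse is used here because $\textbf{G} = \textbf{B}\textbf{B}^T$ becomes singular when the internal coordinates are redundant. As a sanity check on the idea, the sketch below re-implements the same eigenvalue-based inverse in vectorized form and compares it with NumPy's `np.linalg.pinv` on a deliberately singular symmetric matrix. The function name `symm_pinv` and the test matrix are made up for illustration.

```python
import numpy as np

def symm_pinv(M, cutoff=1.0e-10):
    """Generalized inverse of a symmetric matrix via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(M)
    inv = np.zeros_like(evals)
    keep = np.abs(evals) > cutoff        # skip (near-)zero eigenvalues
    inv[keep] = 1.0 / evals[keep]
    return (evecs * inv) @ evecs.T       # V diag(inv) V^T

# Singular symmetric test matrix with eigenvalues 0, 1, and 4.
M = np.array([[2.0, 2.0, 0.0],
              [2.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
M_inv = symm_pinv(M)
print(np.allclose(M_inv, np.linalg.pinv(M, rcond=1.0e-10)))
```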

In [4]:
# Compute gradient and transform into internal coordinates.
g_x = np.reshape( np.array( psi4.gradient('scf')), (3*Natom))
g_q = np.dot(Atranspose, g_x)

# Compute cartesian Hessian
H_cart = np.array( psi4.hessian('scf') )

# A^t Hxy A
# A^t (Hxy - Kxy) A;    K_xy = sum_q ( grad_q[I] d^2(q_I)/(dx dy) )
H_int = np.zeros( (Nintco,Nintco), float)
dq2dx2 = np.zeros((Ncart,Ncart), float)

for I, q in enumerate(intcos):
    dq2dx2[:] = 0
    q.Dq2Dx2(xyz, dq2dx2)   # d^2(q_I)/ dx_i dx_j

    for a in range(Ncart):
        for b in range(Ncart):
            H_cart[a,b] -= g_q[I] * dq2dx2[a,b]

H_int = np.dot(Atranspose, np.dot(H_cart, Atranspose.T))
print("Hessian in internal coordinates transformed from Cartesians")
print (H_int)

Hessian in internal coordinates transformed from Cartesians
[[ 0.85052538 -0.00340936  0.03443988]
 [-0.00340936  0.85052538  0.03443988]
 [ 0.03443988  0.03443988  0.18987662]]


During the course of an optimization, the Hessian may be recalculated at every step (or every fixed number of steps).  However, computing the Hessian is costly, and for some methods it is not possible at all.  In most cases, updating the Hessian with first-derivative information works nearly as well as recomputing it and requires no additional computation beyond the gradients already evaluated at each step.
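
As an illustration of what such an update looks like, here is a sketch of the widely used BFGS formula, which is one common choice and not necessarily the scheme used elsewhere in this tutorial. It corrects the current Hessian using only the last step `dq` and the change in gradient `dg`. The numerical values below are made up for demonstration and do not come from a real optimization.

```python
import numpy as np

def bfgs_update(H, dq, dg):
    """Return the BFGS-updated Hessian from a step dq and gradient change dg."""
    Hdq = H @ dq
    return (H
            + np.outer(dg, dg) / (dg @ dq)
            - np.outer(Hdq, Hdq) / (dq @ Hdq))

H_old = np.diag([0.5, 0.5, 0.2])            # e.g. a simple guess Hessian (au)
dq = np.array([0.02, -0.01, 0.05])          # made-up step in internal coordinates
dg = np.array([0.015, -0.006, 0.009])       # made-up change in the gradient
H_new = bfgs_update(H_old, dq, dg)

# The updated Hessian reproduces the observed gradient change along the step.
print(np.allclose(H_new @ dq, dg))
```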