# Exercises
## Gradient Descent

In this exercise we will implement a routine to perform geometry optimizations using the gradient descent algorithm. This is the simplest optimization procedure: it requires only the current coordinates and the gradient of the energy with respect to them.
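For reference, the update performed in each iteration can be written as below. The symbols are our own notation, not taken from VeloxChem: $\mathbf{x}_k$ denotes the Cartesian coordinates at iteration $k$, $E$ the SCF energy, and $\alpha$ the step size (the `step` argument used in the code further down).

```latex
\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha \, \nabla E(\mathbf{x}_k)
```

The iterations are repeated until the norm of the gradient, $\lVert \nabla E(\mathbf{x}_k) \rVert$, falls below a chosen threshold or a maximum number of iterations is reached.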

Let's start by setting up a molecule, running an SCF calculation and setting up the gradient driver.

In [1]:
import veloxchem as vlx
import py3Dmol as p3d
from veloxchem.veloxchemlib import bohr_in_angstroms
import numpy as np

In [23]:
basis_set_label = 'sto-3g'

ethene_xyz = """6
 ethene
 C          0.000000    -0.703984    0.000000
 C          0.000000     0.663984   0.000000
 H          0.919796    -1.223061   0.000000
 H         -0.919796    -1.223061   0.000000
 H          0.919796     1.223061   0.000000
 H         -0.919796     1.223061   0.000000
"""
molecule = vlx.Molecule.from_xyz_string(ethene_xyz)
basis = vlx.MolecularBasis.read(molecule, basis_set_label)

In [24]:
view = p3d.view(linked=True, viewergrid=(1,1),width=500,height=300)
view.addModel(ethene_xyz, 'xyz', viewer=(0,0))
view.setStyle({'stick': {}})
view.zoomTo()
view.show()

In [25]:
scf_settings = {'conv_thresh': 1.0e-10}
method_settings = {} #{'xcfun': 'b3lyp', 'grid_level': 4}

# SCF
scfdrv = vlx.ScfRestrictedDriver()
scfdrv.update_settings(scf_settings, method_settings)
scfdrv.compute(molecule, basis)

                                                                                                                          
                                            Self Consistent Field Driver Setup                                            
                                                                                                                          
                   Wave Function Model             : Spin-Restricted Hartree-Fock                                         
                   Initial Guess Model             : Superposition of Atomic Densities                                    
                   Convergence Accelerator         : Two Level Direct Inversion of Iterative Subspace                     
                   Max. Number of Iterations       : 50                                                                   
                   Max. Number of Error Vectors    : 10                                                                   
                

In [26]:
# Set up a gradient driver
gradient_settings = {'numerical': 'no'}

grad_driver = vlx.ScfGradientDriver(scf_drv=scfdrv)
grad_driver.update_settings(gradient_settings, method_settings)
grad_driver.compute(molecule, basis)
cart_grad_array = grad_driver.gradient

                                                                                                                          
                                                   SCF Gradient Driver                                                    
                                                                                                                          
                                              Molecular Geometry (Angstroms)                                              
                                                                                                                          
                          Atom         Coordinate X          Coordinate Y          Coordinate Z                           
                                                                                                                          
                           C           0.000000000000       -0.703984000000        0.000000000000                         
                

Now let's write a routine which runs one gradient descent iteration:

In [27]:
def gradient_descent_iteration(coordinates, gradient, step):
    new_coordinates = coordinates - step * gradient
    return new_coordinates
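To see what a single step does without running an SCF calculation, the same update can be tried on a model surface. The sketch below is only an illustration: `toy_energy` and `toy_gradient` are hypothetical helpers defining a simple quadratic bowl and are not part of VeloxChem.

```python
import numpy as np

def gradient_descent_iteration(coordinates, gradient, step):
    # same update as in the cell above
    return coordinates - step * gradient

def toy_energy(x):
    # hypothetical model surface: E(x) = 0.5 * |x|^2, with its minimum at the origin
    return 0.5 * np.dot(x, x)

def toy_gradient(x):
    # analytical gradient of toy_energy
    return x

x0 = np.array([1.0, -2.0])
x1 = gradient_descent_iteration(x0, toy_gradient(x0), step=0.1)

print("coordinates:", x0, "->", x1)
print("energy:     ", toy_energy(x0), "->", toy_energy(x1))
```

With a small enough step, the new point lies closer to the minimum and has a lower energy than the starting point.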

And the routine that runs the optimization:

In [35]:
def gradient_descent(molecule, basis, scf_driver, gradient_driver,
                     step=0.1, threshold=1e-3, max_iter=10):
    # set ostream state to False, to avoid printout from every new scf calculation
    ostream_state = scf_driver.ostream.state
    scf_driver.ostream.state = False
    gradient_driver.ostream.state = False
    
    iteration = 0
    grad_norm = 100
    # atom labels (symbols)
    labels = molecule.get_labels()
    # initial atomic coordinates
    old_coords = molecule.get_coordinates()
    old_energy = scf_driver.get_scf_energy() 
    old_gradient = gradient_driver.gradient
    
    print("Starting gradient descent:\n")
    print("Iteration      Old Energy (H)       New Energy (H)         Difference (H)         Gradient norm (H/bohr)")
    
    while (grad_norm >= threshold) and (iteration < max_iter):
        coords = gradient_descent_iteration(old_coords, old_gradient, step)

        # calculate the energy and gradient corresponding to the new coordinates
        new_mol = vlx.molecule.Molecule(labels, coords, units='au')
        scf_driver.compute(new_mol, basis, None)
        energy = scf_driver.get_scf_energy()
        gradient_driver.compute(new_mol, basis)
        gradient = gradient_driver.gradient
        grad_norm = np.linalg.norm(gradient)

        # calculate the absolute energy change with respect to the previous step
        delta_e = abs(energy - old_energy)
        print("   %3d.     %15.7f      %15.7f       %15.7f          %15.7f" % (iteration, old_energy, energy, delta_e, grad_norm))

        # save the current energy, gradient and coordinates for the next iteration
        old_energy = energy
        old_gradient = gradient
        old_coords = coords
        iteration += 1

    # restore the ostream state before returning
    scf_driver.ostream.state = ostream_state
    gradient_driver.ostream.state = ostream_state

    # convergence is decided by the gradient norm, not by the iteration count
    if grad_norm < threshold:
        print("\n   *** Gradient Descent converged in %d iteration(s). *** " % iteration)
        return new_mol
    else:
        print("\n   !!! Gradient Descent did not converge  !!! ")
        return None

In [36]:
opt_mol = gradient_descent(molecule, basis, scfdrv, grad_driver, threshold=1e-2, max_iter=25)

Starting gradient descent:

Iteration      Old Energy (H)       New Energy (H)         Difference (H)         Gradient norm (H/bohr)
     0.         -77.0738507          -77.0665655              0.0072852                0.1619443
     1.         -77.0665655          -77.0689606              0.0023951                0.1336843
     2.         -77.0689606          -77.0705872              0.0016266                0.1095736
     3.         -77.0705872          -77.0716772              0.0010900                0.0893285
     4.         -77.0716772          -77.0724003              0.0007232                0.0725710
     5.         -77.0724003          -77.0728772              0.0004769                0.0588760
     6.         -77.0728772          -77.0731912              0.0003139                0.0478101
     7.         -77.0731912          -77.0733985              0.0002073                0.0389586
     8.         -77.0733985          -77.0735364              0.0001380                0.03


Let's look at the optimized geometry and compare it to the starting geometry:

In [32]:
def get_xyz(molecule):
    natm = molecule.number_of_atoms()
    elements = molecule.get_labels()
    coords = molecule.get_coordinates() * bohr_in_angstroms()
    txt = "%d\n\n" % natm
    for i in range(natm):
        txt += elements[i] + " %15.7f %15.7f %15.7f\n" % (coords[i,0], coords[i,1], coords[i,2])
    return txt

In [33]:
opt_molecule_xyz = get_xyz(opt_mol)
print(opt_molecule_xyz)

6

C      -0.0000000      -0.6659314      -0.0000000
C      -0.0000000       0.6419271      -0.0000000
H       0.9231091      -1.2297824      -0.0000000
H      -0.9231091      -1.2297824      -0.0000000
H       0.9178287       1.2217846       0.0000000
H      -0.9178287       1.2217846       0.0000000



In [34]:
print("        Initial geometry:              Optimized geometry:")
view = p3d.view(linked=True, viewergrid=(1,2),width=500,height=300)
view.addModel(ethene_xyz, 'xyz', viewer=(0,0))
view.addModel(opt_molecule_xyz, 'xyz', viewer=(0,1))
view.setStyle({'stick': {}})
view.zoomTo()
view.show()

        Initial geometry:              Optimized geometry:
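Besides the visual comparison, a single number such as the C=C distance makes the change easier to judge. The following sketch simply copies the carbon positions (in angstrom) from the two geometries printed above; `bond_length` is a small helper introduced here for illustration.

```python
import numpy as np

def bond_length(xyz_a, xyz_b):
    # Euclidean distance between two Cartesian positions
    return np.linalg.norm(np.array(xyz_a) - np.array(xyz_b))

# carbon positions (angstrom) copied from the geometries shown above
initial_c = [(0.000000, -0.703984, 0.000000), (0.000000, 0.663984, 0.000000)]
final_c = [(0.0000000, -0.6659314, 0.0000000), (0.0000000, 0.6419271, 0.0000000)]

r_initial = bond_length(*initial_c)
r_final = bond_length(*final_c)

print("C=C distance, initial geometry: %.4f angstrom" % r_initial)
print("C=C distance, final geometry:   %.4f angstrom" % r_final)
```

For the numbers listed above, the C=C distance shortens from about 1.368 to about 1.308 angstrom.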


How does the step size affect the convergence? How does the gradient descent algorithm behave?
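Before experimenting with the molecule, it can help to explore these questions on a cheap model problem. The sketch below assumes a hypothetical two-dimensional quadratic surface with one soft and one stiff direction (curvatures 1 and 10); `run_toy_descent` is an illustrative helper, not part of VeloxChem.

```python
import numpy as np

def run_toy_descent(step, max_iter=200, threshold=1e-6):
    # hypothetical anisotropic quadratic: E(x, y) = 0.5 * (x**2 + 10 * y**2)
    curvatures = np.array([1.0, 10.0])
    x = np.array([1.0, 1.0])
    for iteration in range(max_iter):
        gradient = curvatures * x
        if np.linalg.norm(gradient) < threshold:
            return iteration, True
        x = x - step * gradient
    return max_iter, False

for step in (0.01, 0.1, 0.19, 0.21):
    n_iter, converged = run_toy_descent(step)
    print("step = %.2f   iterations = %3d   converged = %s" % (step, n_iter, converged))
```

On this model surface a very small step converges slowly, while any step larger than 2 divided by the largest curvature (here 0.2) makes the stiff coordinate oscillate with growing amplitude. A real potential energy surface is not quadratic, so this is only a qualitative guide to what to look for.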

## Conjugate Gradient

The conjugate gradient algorithm improves upon gradient descent by using information from the previous iteration: each new search direction combines the current gradient with the previous search direction. Let's implement the conjugate gradient algorithm and compare the two.
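As a possible starting point, here is a minimal sketch of the direction update on a model problem. It assumes the Fletcher-Reeves formula for the mixing coefficient (one of several common choices) and a hypothetical quadratic surface, for which the optimal step along a direction is known in closed form. For a molecule, the step length would instead have to come from a line search or a fixed step.

```python
import numpy as np

def conjugate_direction(gradient, old_gradient, old_direction):
    # Fletcher-Reeves coefficient: beta = |g_new|^2 / |g_old|^2
    beta = np.dot(gradient, gradient) / np.dot(old_gradient, old_gradient)
    return -gradient + beta * old_direction

# hypothetical quadratic model: E(x) = 0.5 * x^T A x, so the gradient is A x
A = np.diag([1.0, 10.0])
x = np.array([1.0, 1.0])
gradient = A @ x
direction = -gradient  # the first step is a plain steepest-descent step

n_steps = 0
while np.linalg.norm(gradient) > 1e-10 and n_steps < 10:
    # exact line search along the direction; this formula holds only for a quadratic surface
    step = -np.dot(gradient, direction) / np.dot(direction, A @ direction)
    x = x + step * direction
    old_gradient, gradient = gradient, A @ x
    direction = conjugate_direction(gradient, old_gradient, direction)
    n_steps += 1

print("steps taken:", n_steps)
print("final point:", x)
```

In exact arithmetic, conjugate gradient with exact line searches reaches the minimum of a two-dimensional quadratic in two steps, which makes this a convenient check before replacing the model gradient with the one from the gradient driver.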

## Newton-Raphson

## Comparison to geomeTRIC