# Hartree-Fock 

## Introduction
Hartree fock method is the simplest physically relevant method of solving the time-independent Schrodinger equation for a many-body system.

The genius of this method lies in the gradual solving of the problem for individual electrons. Problematic electron-electron interaction is dealth with by considering the electron as existing in an "averaged" potential of other electrons.

## Initial approximations
Hartree Fock method makes five major simplifications/assumptions to deal with the Schrodingers equation:

- **Born-Oppenheimer approximation**: separation of nuclear and electronic movement
- **Model of independent electrons**
- **Wavefunction in a Slater determinant form**
- **MO LCAO**: Molecular orbitals as Linear Combination of Atomic Orbitals

## SCF procedure
**0. Geometry and basis set**

- **Geometry**: The positions of the nuclei determine the potential field in which the electrons move. 

- **Basis Set**: Basis set consists of a predefined set of functions $\phi$ (atomic orbitals - AO) centered on atom nuclei that are used to construct molecular orbitals (MO). These functions typically Gaussian functions, Slater-type orbitals or exact solutions from the hydrogen atom. They serve as the building block for approximating the wavefunction of the system.

**1. Starting MO - set of coefficients $c_{iA}$**

Molecular orbitals are constructed as linear combinations of atomic orbitals using MO LCAO:

$\psi_i (r) = \sum_A c_{iA} \Psi_A (r)$

where $\psi_i(r)$ is the $i$-th molecular orbital, $c_{iA}$ are coefficients weighing the contribution of each basis function (atomic orbital) $\phi_A(r)$ to the total molecular orbital.

An initial guess for the molecular orbital coefficents $c_{iA}$ provides a starting point for the iterative SCF process. This guess can be based on simpler methods like Hucker theory. These estimated coefficients are then refined throughout the SCF process thanks to the *Variational theorem* which states:

$E_0 \le \langle \psi \hat{H} \psi \rangle$

that means that the eigenvalue (energy) of any trial wavefunction is greater or equal than the true ground state energy. The implications of this are massive and allow for the iterative minimization of energy with respect to the coefficients $c_{iA}$ to yield the lowest possible energy within the chosen basis set.

**2. Calculating the integrals $\langle AB|CD \rangle$, $S_{AB}$, $h_{AB}$**
- Overlap integrals $S_{AB}$ measure the overlap between basis functions. Basis sets are generally not orthonormal. And in order to write the Fock Equations equations in the form of an matrix eigenvalue problem, we need to orthogonalize our basis, hence why including the $S$ matrix. 

    $S_{AB} = \langle \phi_A | \phi_B \rangle = \int \phi_A \phi_B dr$

- Core Hamiltonian $h_{AB}$: Represents the kinetic energy of electrons and their attraction to the nucleic. It forms the starting point for building the Fock matrix.

    $h_{AB} = \int \phi_A (r) \bigg( - \frac{1}{2} \nabla^2 - \sum_N \frac{Z_N}{|r-R_N|}\bigg) \phi_B (r) dr$

- Two electron repulsion integrals $\langle AB|CD \rangle$ account for electron electron repulsion. These integrals are crucial for incorporating electron correlation effects within the mean-field approximation

**3. Create matrix P_{AB}**

Density matrix encodes the distribution of electrons over the molecular orbitals. It is used to calculate the electron density and thus the effective potential experienced by each electron.

4) Create matrix F_{AB}

5) Solving Fock equations 

6) Checking convergence criteria - back to item 3

In [202]:
import sys
import math
import numpy as np
import json
import os

class BasisFunction:
    def __init__(self, atom_number, element, unique_id, coordinates, angular_momentum, exponents, coefficients):
        self.atom_number = atom_number
        self.element = element
        self.unique_id = unique_id
        self.coordinates = coordinates
        self.angular_momentum = angular_momentum
        self.exponents = exponents
        self.coefficients = coefficients

    def __repr__(self):
        return f"Shell(atom_number:{self.atom_number}, element:{self.element},unique_id:{self.unique_id}, coordinates:{self.coordinates}, angular_momentum:{self.angular_momentum}, exponents:{self.exponents}, coefficients:{self.coefficients})"

def load_basis_set(atomic_number, element, unique_id, coordinates, basis_set_file):
    '''
    Load basis set from json file and creates a list of BasisFunction instances for a given atomic number
    Args:
        atomic_number (int): Atomic number of element
        basis_set_file (str): Path to basis set json file

    Returns:
       electron_shell (list): List of BasisFunction instances with exponents and coefficients for a given electron shell of a given atomic number

    Raises:
        FileNotFoundError: If the file does not exist
        ValueError: If the file content is not as expected or atomic number is not found
    '''
    if os.path.exists(basis_set_file):
        with open(basis_set_file, 'r') as f:
            basis_set = json.load(f)
    else:
        raise FileNotFoundError(f"File not found: {basis_set_file}")
    
    if 'elements' in basis_set:
        if str(atomic_number) in basis_set['elements']:
            element_data = basis_set['elements'][str(atomic_number)]
            electron_shell = []
            for shell in element_data.get('electron_shells', []):
                angular_momenta_array = [int(l) for l in shell.get('angular_momentum', [])]
                angular_momentum = angular_momenta_array[len(angular_momenta_array) - 1]
                exponents = [float(e) for e in shell.get('exponents', [])]
                coefficients = [[float(c) for c in coef] for coef in shell.get('coefficients', [])]
                electron_shell.append(BasisFunction(atomic_number, element, unique_id, coordinates, angular_momentum, exponents, coefficients))
            return electron_shell
        else:
            raise ValueError(f"Atomic number {atomic_number} not found in the basis set file.")
    else:
        raise ValueError("Invalid basis set file format.")

def load_basis_set_for_molecule(atomic_numbers, elements, unique_ids, coordinate_array, basis_set_file):
    '''
    Load basis set from json file and creates a dictionary of atomic number to list of BasisFunction instances
    Args:
        atomic_numbers (list): List of atomic numbers of elements in the molecule
        basis_set_file (str): Path to basis set json file

    Returns:
        basis_set_for_molecule (dict): Dictionary with atomic numbers as keys and lists of BasisFunction instances as values
    '''
    #TODO: Error handling
    
    basis_set_for_molecule = []
    for unique_id in unique_ids:
        atomic_number = atomic_numbers[unique_id]
        basis_set_for_molecule.append(load_basis_set(atomic_number, elements[unique_id], unique_id, coordinate_array[unique_id], basis_set_file))
    
    return basis_set_for_molecule

def read_xyz(file_path):
    atomic_numbers = {
        'H': 1,
        'He': 2,
        'Li': 3,
        'Be': 4,
        'B': 5,
        'C': 6,
        'N': 7,
        'O': 8,
        'F': 9,
        'Ne': 10
        # Add more elements as needed
    }
    atoms = []
    with open(file_path, 'r') as file:
        lines = file.readlines()
        atom_count = int(lines[0].strip())
        unique_id = 0
        for line in lines[2:2+atom_count]:
            parts = line.split()
            element = parts[0]
            coordinates = [float(x) for x in parts[1:4]]
            atomic_number = atomic_numbers.get(element, None)
            if atomic_number is None:
                raise ValueError(f"Element {element} not recognized.")
            atoms.append((atomic_number, element, unique_id, coordinates))
            unique_id += 1
    return atoms

In [203]:
#def primitive_gaussian(alpha, coeff, coordinates, l1, l2, l3)   
#    gto = (2.0 * alpha / math.pi) ** 0.75 # + other terms for angular momentum > 0
    
class primitive_gaussian():
    
    def __init__(self, exponent, coeff, coordinates, l1, l2, l3):

        self.exponent = exponent
        self.coeff = coeff
        self.coordinates = np.array(coordinates)
        self.normalization_const = (2.0 * exponent / math.pi) ** 0.75 # + additional terms for angular momentum > 0

Overlap Matrix

In [204]:
def calculate_overlap_exponent(p1, p2, c1, c2, r1, r2):
    N = c1 * c2
    p = p1 + p2
    q = p1 * p2 / p
    Q = np.array(r1) - np.array(r2)
    Q2 = np.dot(Q, Q)
    return N * math.exp(-q * Q2) * (math.pi / p)**(3/2)

def overlap_matrix(basis_set):

    n_basis = sum([len(atom) for atom in basis_set])

    S_matrix = np.zeros([n_basis, n_basis])
    
    basis_functions = []
    for atom in basis_set:
        for shell in atom:
            basis_functions.append(shell)
    
    for i in range(n_basis):
        for j in range(n_basis):
            shell_i = basis_functions[i]
            shell_j = basis_functions[j]

            n_primitives_i = len(shell_i.exponents)
            n_primitives_j = len(shell_j.exponents)

            for k in range(n_primitives_i):
                for l in range(n_primitives_j):
                    exp_i = shell_i.exponents[k]
                    exp_j = shell_j.exponents[l]
                    for coeff_i in shell_i.coefficients:
                        for coeff_j in shell_j.coefficients:
                            for coeff_ik in coeff_i:
                                for coeff_jl in coeff_j:
                                    S_matrix[i,j] += calculate_overlap_exponent(exp_i, exp_j, coeff_ik, coeff_jl, shell_i.coordinates, shell_j.coordinates)
    
    return S_matrix


F = T + V_ne + V_ee

In [205]:
molecule = read_xyz('water.xyz')

atomic_numbers = [molecule[i][0] for i in range(len(molecule))]
elements = [molecule[j][1] for j in range(len(molecule))]
unique_ids = [molecule[k][2] for k in range(len(molecule))]
coordinate_array = [molecule[l][3] for l in range(len(molecule))]

basis_set = load_basis_set_for_molecule(atomic_numbers, elements, unique_ids, coordinate_array, 'sto-3g_h_o.json')

print(basis_set)

S_matrix = overlap_matrix(basis_set)
print(S_matrix)

[[Shell(atom_number:8, element:O,unique_id:0, coordinates:[0.0, 0.0, 0.0], angular_momentum:0, exponents:[130.7093214, 23.80886605, 6.443608313], coefficients:[[0.1543289673, 0.5353281423, 0.4446345422]]), Shell(atom_number:8, element:O,unique_id:0, coordinates:[0.0, 0.0, 0.0], angular_momentum:1, exponents:[5.033151319, 1.169596125, 0.38038896], coefficients:[[-0.09996722919, 0.3995128261, 0.7001154689], [0.155916275, 0.6076837186, 0.3919573931]])], [Shell(atom_number:1, element:H,unique_id:1, coordinates:[0.0, 0.757, 0.586], angular_momentum:0, exponents:[3.425250914, 0.6239137298, 0.168855404], coefficients:[[0.1543289673, 0.5353281423, 0.4446345422]])], [Shell(atom_number:1, element:H,unique_id:2, coordinates:[0.0, -0.757, 0.586], angular_momentum:0, exponents:[3.425250914, 0.6239137298, 0.168855404], coefficients:[[0.1543289673, 0.5353281423, 0.4446345422]])]]
[[ 0.28084938  2.09977771  0.7150799   0.7150799 ]
 [ 2.09977771 81.27980687 56.16034695 56.16034695]
 [ 0.7150799  56.160