# PySCF Library Overview

This code utilizes the PySCF (Python-based Simulations of Chemistry Framework), an open-source computational chemistry library written in Python, designed for electronic structure calculations, particularly in quantum chemistry. PySCF supports a wide range of methods, including Hartree-Fock (HF), Density Functional Theory (DFT), Møller–Plesset perturbation theory (MP2), Coupled Cluster (CC), and Configuration Interaction (CI). It allows users to easily combine different computational methods and customize their workflows. PySCF is optimized for performance, utilizing efficient algorithms and parallel computing to handle large-scale calculations. It is also designed to be extensible, enabling researchers to implement and test new methods and algorithms with minimal effort. Common applications of PySCF include electronic structure calculations, molecular property computations (such as dipole moments, polarizabilities, and vibrational frequencies), studying reaction mechanisms, and investigating the electronic properties of solids and nanostructures.


In [None]:
# Import the necessary libraries
import pyscf
import matplotlib.pyplot as plt
import numpy as np
import json
from pyscf import gto, scf, tdscf
from geometry.water_geo import water
from helper import *


# Quantum Chemistry and the Hamiltonian

Solving the Schrödinger equation is challenging even for the single-electron case. For many-electron systems, every electron-electron and electron-nucleus interaction must be taken into account. A molecule comprises several atoms, each with multiple electrons, so beyond the number of variables and the complexity of the equations, the dimension of the Hamiltonian required to account for all possible configurations quickly becomes too large to handle directly. Quantum chemistry has made significant progress in addressing these challenges. *Ab initio* methods, also known as wave function methods, aim to solve the Schrödinger equation from first principles, constructing approximate wave functions from the positions of the nuclei and the number of electrons.

## Time-Independent Schrödinger Equation (TISE)

For simplicity, let's consider the time-independent Schrödinger equation (TISE), as shown in Equation (1), with a simplified electronic Hamiltonian depicted below:

$$
\hat{H}\ket{\Psi} = E \ket{\Psi} \tag{1}
$$

Assuming that the potential associated with the system is time-independent, we can apply the Born-Oppenheimer approximation, which assumes that the electrons move much faster than the nuclei, allowing us to treat the nuclear coordinates as fixed. In this simplification, hyperfine interactions are ignored, and the electronic states depend explicitly on the electronic coordinates and only parametrically on the nuclear coordinates.

## Born-Oppenheimer Approximation and Electronic Hamiltonian

The electronic Hamiltonian under the Born-Oppenheimer approximation is given by:

$$
\hat{H}(r,R) = - \sum_{i=1}^N \frac{1}{2}\nabla_i^2 - \sum_{i=1}^N \sum_{A=1}^M \frac{Z_A}{r_{iA}} + \sum_{i=1}^N \sum_{j>i}^N \frac{1}{r_{ij}} \tag{2}
$$

Where:
- $ \hat{H}(r,R) $ is the electronic Hamiltonian,
- $ r $ and $ R $ represent the electronic and nuclear coordinates respectively,
- $ N $ is the number of electrons,
- $ M $ is the number of nuclei,
- $ \nabla_i^2 $ is the Laplacian operator acting on the $ i $-th electron,
- $ Z_A $ is the atomic number of nucleus $ A $,
- $ r_{iA} $ is the distance between electron $ i $ and nucleus $ A $,
- $ r_{ij} $ is the distance between electrons $ i $ and $ j $.

In this Hamiltonian:
- The first term $ - \sum_{i=1}^N \frac{1}{2}\nabla_i^2 $ represents the kinetic energy of the electrons.

- The second term $ - \sum_{i=1}^N \sum_{A=1}^M \frac{Z_A}{r_{iA}} $ represents the Coulomb attraction between the electrons and nuclei.

- The third term $ \sum_{i=1}^N \sum_{j>i}^N \frac{1}{r_{ij}} $ represents the Coulomb repulsion between electrons.

This formulation allows us to approximate the electronic wave function while treating the nuclear positions as fixed, greatly simplifying the computational complexity involved in solving the Schrödinger equation for multi-electron systems.
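As a small illustration of the second and third terms of Equation (2), the sketch below evaluates the two Coulomb sums for classical point positions in atomic units. The function name `coulomb_terms` and the toy coordinates are assumptions made for this example, not part of the notebook's helper code. The kinetic term is a differential operator and cannot be evaluated this way, so it is left out.

```python
import math


def coulomb_terms(electrons, nuclei):
    """Return (V_en, V_ee) in Hartree for point positions given in bohr.

    electrons: list of (x, y, z) tuples
    nuclei:    list of (Z, (x, y, z)) tuples
    """
    # Electron-nucleus attraction: -sum_i sum_A Z_A / r_iA
    v_en = -sum(
        z / math.dist(r_i, r_a) for r_i in electrons for z, r_a in nuclei
    )
    # Electron-electron repulsion: sum_i sum_{j>i} 1 / r_ij
    v_ee = sum(
        1.0 / math.dist(electrons[i], electrons[j])
        for i in range(len(electrons))
        for j in range(i + 1, len(electrons))
    )
    return v_en, v_ee


# Toy configuration (hypothetical): two protons 1.4 bohr apart on the z axis
# and two electrons placed between them.
nuclei = [(1, (0.0, 0.0, 0.0)), (1, (0.0, 0.0, 1.4))]
electrons = [(0.0, 0.0, 0.5), (0.0, 0.0, 1.0)]
v_en, v_ee = coulomb_terms(electrons, nuclei)
print(f"V_en = {v_en:.4f} Eh, V_ee = {v_ee:.4f} Eh")
```

The inner loop over `j > i` mirrors the restricted double sum in Equation (2), so each electron pair is counted once.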


## Slater determinant

For the electronic Hamiltonian in Equation (2), a single electron is described by its spatial wave function, $\phi(r)$, where $r$ comprises the Cartesian coordinates $x$, $y$, $z$, and by its spin state, denoted by $w$ (either $\alpha$ or $\beta$). A spin orbital, $\chi(r,w)$, is the product of a spatial function and a spin function. For a system of $N$ non-interacting electrons distributed across $K$ atomic orbitals, the Hamiltonian is a sum of single-electron Hamiltonians. The eigenfunction of this system is a product of spin orbitals, one per electron, known as the Hartree product and denoted $\Psi^{HP}$, as shown in Equation (3):

$$
    \Psi^{HP}(x_1, x_2, \cdots, x_N) = \chi_i(x_1)\chi_j(x_2)\cdots\chi_k(x_N)\tag{3}
$$

Electrons follow the Pauli exclusion principle, where the electronic wave functions must exhibit antisymmetry upon the exchange of any two particles. This antisymmetric property can be achieved with the Slater determinant, $\ket{\Psi^{SD}}$, as shown in Equation (4). The Slater determinant is defined as a linear combination of Hartree products, mathematically expressed in terms of matrix determinants. It is the simplest antisymmetric wave function that can describe the ground state of an $N$-electron system.

$$
    \ket{\Psi^{SD}} = \frac{1}{\sqrt{N!}} \begin{vmatrix}  \chi_i(x_1) & \dots &\chi_k(x_1) \\ \vdots & \ddots & \vdots \\ \chi_i(x_N) &  \dots & \chi_k(x_N)  \end{vmatrix}= \ket{\chi_i\chi_j\cdots}\tag{4}
$$
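The relation between Equations (3) and (4) can be checked numerically. The sketch below builds a Slater determinant as a signed sum over permuted Hartree products and confirms that exchanging two electrons flips the sign. The one-dimensional toy orbitals and all function names are assumptions for illustration; spin is ignored, so each coordinate is a single number.

```python
import itertools
import math


def hartree_product(orbitals, coords):
    """chi_i(x_1) * chi_j(x_2) * ... for one assignment of electrons to orbitals."""
    return math.prod(chi(x) for chi, x in zip(orbitals, coords))


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def slater_determinant(orbitals, coords):
    """Signed sum of Hartree products with the 1/sqrt(N!) prefactor."""
    n = len(orbitals)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        permuted = [coords[p] for p in perm]
        total += permutation_sign(perm) * hartree_product(orbitals, permuted)
    return total / math.sqrt(math.factorial(n))


# Hypothetical 1D toy orbitals standing in for spin orbitals.
orbitals = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
]

psi = slater_determinant(orbitals, [0.3, -0.7])
psi_swapped = slater_determinant(orbitals, [-0.7, 0.3])
print(psi, psi_swapped)  # equal magnitude, opposite sign
```

Placing both electrons at the same coordinate makes the sum vanish, which is the Pauli exclusion principle in this toy setting.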


## Hartree-Fock Method

The Hartree-Fock (HF) approach is the starting point for many *ab initio* methods. It approximates the many-body problem posed by an $N$-electron system with single-particle states, so that the wave function is a single Slater determinant. The single-particle states, $\{\chi_i\}$, can be chosen to be orthonormal. Minimizing the expectation value of the Hamiltonian with respect to the molecular orbital coefficients, with the orthonormality of the spin orbitals enforced through Lagrange's method of undetermined multipliers, yields the optimal set of orbitals with the lowest energy. The multipliers $\epsilon_{ij}$ appear in Equation (5).

$$
    \sum_j\epsilon_{ij} \chi_j(1) = \left[\hat{h}(1) + \sum_j \int dx_2\,\chi^*_j(2)\frac{1}{r_{12}}\chi_j(2) - \sum_j \int dx_2\,\chi^*_j(2)\frac{1}{r_{12}}\hat{\mathcal{P}}\chi_j(2)\right] \chi_i(1) \tag{5}
$$

Here, $\hat{\mathcal{P}}$ is an operator that interchanges electrons, for example, $\hat{\mathcal{P}} \chi_j(2) \chi_i(1) = \chi_i(2)\chi_j(1)$. Hartree-Fock (HF) is a mean-field method: each electron experiences the electron-electron interaction only as an average potential generated by the other electrons. The second term on the right-hand side of Equation (5) accounts for the Coulomb interaction, while the third term accounts for the exchange interaction arising from the antisymmetric nature of fermions. We can therefore define the Fock operator ($\hat{f}$), which combines the kinetic and potential energies of an electron as described by the single-electron Hamiltonian ($\hat{h}$) with operators for the Coulomb interactions between electrons ($\hat{J}$) and the exchange interactions ($\hat{K}$). For electron $k$, the operator in brackets in Equation (5) is written as follows:

$$
   \hat{f}(k) = \hat{h}(k) + \sum_{j=1}^{N} (\hat{J}_j(k) - \hat{K}_j(k)) 
$$

$$
    \hat{J}_j(1) = \int dx_2\,\chi^*_j(2)\frac{1}{r_{12}}\chi_j(2)
$$

$$
  \hat{K}_j(1) = \int dx_2\,\chi^*_j(2)\frac{1}{r_{12}}\hat{\mathcal{P}}\chi_j(2)
$$
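In a finite basis, the Fock operator becomes a matrix assembled from one- and two-electron integrals and a density matrix. The following is a minimal sketch for the closed-shell case, where summing over spin gives the familiar form $F = h + J - \tfrac{1}{2}K$ with a spin-summed density matrix. The function name `build_fock` and the toy two-orbital integrals are assumptions for illustration only; the notebook itself takes the converged Fock matrix from PySCF.

```python
import numpy as np


def build_fock(hcore, eri, dm):
    """Closed-shell Fock matrix F = h + J - K/2 in a real orbital basis.

    hcore: (n, n) one-electron matrix
    eri:   (n, n, n, n) two-electron integrals in chemists' notation (pq|rs)
    dm:    (n, n) spin-summed density matrix
    """
    coulomb = np.einsum("pqrs,rs->pq", eri, dm)   # J_pq = sum_rs (pq|rs) D_rs
    exchange = np.einsum("prqs,rs->pq", eri, dm)  # K_pq = sum_rs (pr|qs) D_rs
    return hcore + coulomb - 0.5 * exchange


# Hypothetical two-orbital model with on-site repulsion only.
hcore = np.array([[-1.0, -0.2], [-0.2, -0.5]])
eri = np.zeros((2, 2, 2, 2))
eri[0, 0, 0, 0] = eri[1, 1, 1, 1] = 0.6
dm = np.diag([2.0, 0.0])  # two electrons in the first orbital

fock = build_fock(hcore, eri, dm)
print(fock)
```

In this toy model the doubly occupied orbital is shifted upward by half of the on-site repulsion times its occupation, because exchange cancels the self-interaction of the same-spin electron.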


In [None]:
def pyscf_scf(self, molecule, spin, basis_set):
    """
    Perform SCF (Self-Consistent Field) calculation on the given molecule.
    """
    mol = gto.Mole()
    mol.atom = molecule
    mol.basis = basis_set
    mol.spin = spin
    mol.build()

    self.mean_field = scf.RHF(mol).run(verbose=4)
    self.mean_field.analyze()
    
    # Assign the values to instance variables
    self.core_hamiltonian = self.mean_field.get_hcore()
    self.mo_energy = self.mean_field.mo_energy
    self.mo_occ = self.mean_field.get_occ(self.mo_energy)
    self.overlap = self.mean_field.get_ovlp()
    self.coeff = self.mean_field.mo_coeff
    self.fock_matrix = self.mean_field.get_fock()
    self.density_matrix = self.mean_field.make_rdm1()
    self.n_electrons = 2 * round(0.5 * np.trace(self.density_matrix @ self.overlap))
    self.n_orbitals = len(self.mo_occ)

    # Collect the SCF results in a dictionary
    return {
        'core_hamiltonian': self.core_hamiltonian,
        'mo_energy': self.mo_energy,
        'n_orbitals': self.n_orbitals,
        'n_electrons': self.n_electrons,
        'mo_occ': self.mo_occ,
        'overlap': self.overlap,
        'coeff': self.coeff,
        'fock_matrix': self.fock_matrix,
        'density_matrix': self.density_matrix
    }


In [None]:
# Quantum chemistry calculation example
molecule = water
spin = 0
basis_set = 'sto-3g'

# Initialize Pyscf_helper and run SCF calculation
scf_water_init = Pyscf_helper()
scf_data = scf_water_init.pyscf_scf(molecule, spin, basis_set)
mean_field = scf_water_init.mean_field

## Configuration Interaction (CI) Methods

The Configuration Interaction (CI) methods expand the wave function as a linear combination of Slater determinants. The Full CI wave function, shown in Equation (6), includes all possible configurations and is therefore the exact solution within a given basis set. Here, $\Psi^{(0)}$ is the reference determinant (typically the HF ground state), while $\Psi_i^a$ denotes a singly excited determinant obtained by replacing the occupied orbital $i$ with the virtual orbital $a$. Likewise, $\Psi_{ij}^{ab}$ denotes a doubly excited determinant.

$$
  \ket{\Psi^{FCI}} = c_{0}\ket{\Psi^{(0)}} +\sum_{i,a} c_{i}^{a}\ket{\Psi_i^a} +\sum_{ijab}c_{ij}^{ab}\ket{\Psi_{ij}^{ab}}+...   \tag{6}
$$

As in HF, the CI wave functions are constrained to be orthonormal, and the CI energy is obtained variationally using a Lagrange multiplier to enforce this constraint. Optimizing the expansion coefficients can lower the energy below the HF energy. Matrix elements between determinants are evaluated with Slater's rules (the Slater-Condon rules), and the CI equation takes the matrix form of Equation (7), whose solution is equivalent to diagonalizing the CI matrix. For an orthonormal set of determinants, the overlap matrix $S$ reduces to the identity.

$$
Hc = ESc \tag{7}
$$
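Equation (7) is a generalized eigenvalue problem. The sketch below solves it with NumPy by transforming to an orthonormal basis through a Cholesky factorization of $S$. The function name `solve_ci` and the 2x2 toy matrix (a reference determinant coupled to one doubly excited determinant) are assumptions for illustration; production CI codes use iterative solvers rather than dense diagonalization.

```python
import numpy as np


def solve_ci(hamiltonian, overlap):
    """Solve H c = E S c for symmetric H and positive-definite S."""
    chol = np.linalg.cholesky(overlap)           # S = L L^T
    chol_inv = np.linalg.inv(chol)
    h_ortho = chol_inv @ hamiltonian @ chol_inv.T
    energies, vectors = np.linalg.eigh(h_ortho)  # ordinary symmetric problem
    coefficients = chol_inv.T @ vectors          # back-transform: c = L^-T y
    return energies, coefficients


# Hypothetical toy values in Hartree.
e_ref, e_exc, coupling = -1.00, -0.20, 0.15
H = np.array([[e_ref, coupling], [coupling, e_exc]])
S = np.eye(2)  # orthonormal determinants

energies, coefficients = solve_ci(H, S)
print(f"reference energy: {e_ref:.4f}, lowest CI root: {energies[0]:.4f}")
```

The lowest root lies below the reference energy whenever the coupling is non-zero, which is the variational lowering described above.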


## Tamm-Dancoff Approximation (TDA) in Density Functional Theory (DFT)

The Tamm-Dancoff Approximation (TDA) is a simplification of Time-Dependent Density Functional Theory (TDDFT) that neglects the de-excitation contributions in the response equations. When applied to a Hartree-Fock reference, TDA is equivalent to Configuration Interaction Singles (CIS), which is why this notebook uses `tdscf.TDA` on the HF mean field to obtain CIS excited states. Both approaches describe excited states as linear combinations of single excitations out of the ground-state determinant, going beyond the ground-state mean-field picture.
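For orientation, the relation can be written compactly. In the notation commonly used for linear-response theory (the symbols $\mathbf{A}$, $\mathbf{B}$, $\mathbf{X}$, $\mathbf{Y}$ and the excitation energy $\omega$ are introduced here and do not appear elsewhere in this notebook), the full response problem and its Tamm-Dancoff limit read:

$$
\begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^* & \mathbf{A}^* \end{pmatrix}
\begin{pmatrix} \mathbf{X} \\ \mathbf{Y} \end{pmatrix}
= \omega
\begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & -\mathbf{1} \end{pmatrix}
\begin{pmatrix} \mathbf{X} \\ \mathbf{Y} \end{pmatrix}
\quad \xrightarrow{\;\mathbf{B} = \mathbf{0}\;} \quad
\mathbf{A}\mathbf{X} = \omega\,\mathbf{X}
$$

For a Hartree-Fock reference in a spin-orbital basis, with $i, j$ labelling occupied and $a, b$ labelling virtual orbitals, the matrix $\mathbf{A}$ is the CIS matrix:

$$
A_{ia,jb} = \delta_{ij}\delta_{ab}(\epsilon_a - \epsilon_i) + (ia|jb) - (ij|ab)
$$

Setting $\mathbf{B} = \mathbf{0}$ removes the coupling between excitations and de-excitations, leaving a Hermitian eigenvalue problem of the same form as Equation (7) with $S$ equal to the identity.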

### References

Dreuw, A., & Head-Gordon, M. (2005). Single-Reference Ab Initio Methods for the Calculation of Excited States of Large Molecules. *Chem. Rev.*, 105(11), 4009-4037. doi:10.1021/cr0505627


In [None]:

def configuration_interaction_singles(self, n_singlets, n_triplets):
    """Run TDA on the stored mean field (CIS for an HF reference)."""
    density_singlet = 0
    density_triplet = 0

    # Compute singlets
    singlet_excitation = tdscf.TDA(self.mean_field)
    singlet_excitation.singlet = True
    singlet_excitation = singlet_excitation.run(nstates=n_singlets)
    singlet_excitation.analyze()
    # Lowest excitation energy (Hartree), read from the converged solver
    cis_singlet_E = min(singlet_excitation.e)
    for i in range(singlet_excitation.nstates):
        density_singlet += self.tda_density_matrix(singlet_excitation, i)

    # Compute triplets
    triplet_excitation = tdscf.TDA(self.mean_field)
    triplet_excitation.singlet = False
    triplet_excitation = triplet_excitation.run(nstates=n_triplets)
    triplet_excitation.analyze()
    # Lowest excitation energy (Hartree), read from the converged solver
    cis_triplet_E = min(triplet_excitation.e)
    for i in range(triplet_excitation.nstates):
        density_triplet += self.tda_density_matrix(triplet_excitation, i)

    return cis_singlet_E, cis_triplet_E, density_singlet, density_triplet

# Run configuration interaction singles
n_singlets = 1
n_triplets = 1
cis_singlet_E, cis_triplet_E, density_singlet, density_triplet = scf_water_init.configuration_interaction_singles(n_singlets, n_triplets)

# PySCF reports excitation energies in Hartree; convert to eV for printing
from pyscf.data import nist
print(f"CIS Singlet Energy: {cis_singlet_E * nist.HARTREE2EV:.4f} eV")
print(f"CIS Triplet Energy: {cis_triplet_E * nist.HARTREE2EV:.4f} eV")

### Multi-configuration methods

Active space methods not only provide a well-established solution for static correlation problems but also enhance computational efficiency. By partitioning the orbitals into three distinct spaces (doubly occupied, active, and virtual), calculations can focus exclusively on the relevant orbitals. How the orbitals are assigned to these spaces determines the specific approach used.
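The three-way partition can be sketched with plain index lists. The helper below is a hypothetical illustration (its name and arguments are assumptions and it is not the notebook's `generate_active_space`): it takes the occupation numbers and the number of occupied and virtual orbitals to include around the HOMO-LUMO gap.

```python
def partition_orbitals(mo_occ, n_active_occ, n_active_virt):
    """Split orbital indices into (doubly_occupied, active, virtual) lists.

    mo_occ: occupation numbers ordered by orbital energy (2 = occupied, 0 = empty)
    n_active_occ / n_active_virt: occupied / virtual orbitals kept in the active space
    """
    n_orbitals = len(mo_occ)
    n_occ = sum(1 for occ in mo_occ if occ == 2)
    if not 0 <= n_active_occ <= n_occ:
        raise ValueError("n_active_occ exceeds the number of occupied orbitals")
    if not 0 <= n_active_virt <= n_orbitals - n_occ:
        raise ValueError("n_active_virt exceeds the number of virtual orbitals")

    first_active = n_occ - n_active_occ   # index of the lowest active orbital
    end_active = n_occ + n_active_virt    # one past the highest active orbital
    doubly_occupied = list(range(first_active))
    active = list(range(first_active, end_active))
    virtual = list(range(end_active, n_orbitals))
    return doubly_occupied, active, virtual


# Water in STO-3G has 7 molecular orbitals, 5 of them doubly occupied.
mo_occ = [2, 2, 2, 2, 2, 0, 0]
doubly_occupied, active, virtual = partition_orbitals(mo_occ, 2, 1)
print(doubly_occupied, active, virtual)  # [0, 1, 2] [3, 4, 5] [6]
```

Every orbital index lands in exactly one list, so the three coefficient blocks obtained by slicing the MO coefficient matrix with these lists together span the full orbital space.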


#### Excited State Energy Analysis

The plots show excited state energies for singlet and triplet states under different active space configurations in a quantum system. Each state is analyzed based on the number of occupied and virtual orbitals, with variations highlighted between canonical and non-canonical configurations:

- **Singlet States**:
  - **Occupied Orbitals**: Energy stabilization observed with increasing number of orbitals.
  - **Virtual Orbitals**: Initial significant energy shifts stabilize over larger active spaces.

- **Triplet States**:
  - **Occupied Orbitals**: Pronounced energy decline, indicating stabilization trends.
  - **Virtual Orbitals**: Similar trends to singlet, but with sharper energy declines.

These insights demonstrate the capabilities of the PySCF library in handling complex quantum chemistry calculations.


In [None]:
# Active space generation and embedding potential calculation

def generate_active_space(self, C, active_space_type, overlap, n_electrons, mo_occ):
    HOMO_index = np.where(mo_occ == 2)[0][-1]
    LUMO_index = HOMO_index + 1
    n_orbitals = len(mo_occ)
    n_columns = C.shape[1]  # Number of columns in C (molecular orbitals)

    active_list = []
    virtual_list = []
    double_occupied_list = []

    for i in range(min(HOMO_index, LUMO_index)):
        if active_space_type == "Increasing both occupied and virtual orbital":
            double_occupied_list = list(range(HOMO_index - i))
            active_list = list(range(HOMO_index - i, LUMO_index + i + 1))
            virtual_list = list(range(LUMO_index + i + 1, n_orbitals))

        elif active_space_type == "Increasing occupied orbital":
            double_occupied_list = list(range(HOMO_index - i))
            # With these ranges the LUMO itself is in neither the active nor the virtual list
            active_list = list(range(HOMO_index - i, LUMO_index))
            virtual_list = list(range(LUMO_index + 1, n_orbitals))

        elif active_space_type == "Increasing virtual orbital":  
            active_list = list(range(0, n_orbitals - i))
            virtual_list = list(range(n_orbitals - i, n_orbitals))

        # Ensure indices are within bounds of C
        active_list = [a for a in active_list if 0 <= a < n_columns]
        virtual_list = [v for v in virtual_list if 0 <= v < n_columns]
        double_occupied_list = [d for d in double_occupied_list if d < n_columns]

        # Create orbitals from the coefficients
        occupied = C[:, double_occupied_list]
        active = C[:, active_list]
        virtual = C[:, virtual_list]

    n_active_orbitals = len(active_list)
    return active, virtual, occupied, len(active), len(virtual_list), len(double_occupied_list), n_active_orbitals


# Plot energy of singlet and triplet vs active space size
def plot(self, active_sizes, elists, e_s_cis, e_t_cis, elistt):
    plt.figure(figsize=(15, 6))

    plt.subplot(1, 2, 1)
    plt.plot(active_sizes, elists, marker='o', linestyle='-', label='CIS_act')
    plt.axhline(y=e_s_cis, color='blue', linestyle='--', label='CIS')
    plt.xlabel('# of orbitals in active space')
    plt.ylabel('Excited State energies (eV)')
    plt.title('Excitation energy of active space; Singlets')
    plt.grid(True)
    plt.legend()
    plt.tight_layout()

    plt.subplot(1, 2, 2)
    plt.plot(active_sizes, elistt, marker='o', linestyle='-', label='CIS_act')
    plt.axhline(y=e_t_cis, color='blue', linestyle='--', label='CIS')
    plt.xlabel('# of orbitals in active space')
    plt.ylabel('Excited State energies (eV)')
    plt.title('Excitation energy of active space; Triplets')
    plt.grid(True)
    plt.legend()
    plt.tight_layout()

    file_name = "no_HL_plots.png"
    plt.savefig(file_name)
    plt.show()

# Initialize the plot
plot = Plot()
plot.plot(active_sizes, elists, e_s_cis, e_t_cis, elistt)

## References
1. Szabo, A.; Ostlund, N. S. *Modern Quantum Chemistry*; McGraw-Hill: New York, 1989.
2. Sun, Q., Berkelbach, T. C., Blunt, N. S., Booth, G. H., Guo, S., Li, Z., Liu, J., McClain, J. D., Sayfutyarova, E. R., Sharma, S., Wouters, S., Chan, G. K.-L. (2017). PySCF: the Python-based simulations of chemistry framework. *WIREs Computational Molecular Science*, 8(1), e1340. doi:10.1002/wcms.1340
