# Ground State Energy Estimation for Photosynthesis with Quantum Circuits

One area where quantum chemistry can help address growing energy demand and climate change is artificial photosynthesis. Artificial photosynthesis aims to replicate and optimize the natural process of converting sunlight, water, and carbon dioxide into energy-rich fuels, resulting in a more sustainable, carbon-neutral energy cycle.

Artificial photosynthesis is a multi-step process that begins with the absorption of sunlight, leading to charge separation and the oxidation of water ($H_2O$), which produces oxygen ($O_2$), protons ($H^+$), and electrons ($e^-$).

The electrons and protons extracted in the previous step are used for $CO_2$ reduction to facilitate the production of fuels. This involves reducing $CO_2$ into CO, hydrocarbons such as methane, or carbohydrates such as glucose.

Water oxidation reaction:
$$ 2H_2O \rightarrow O_2 + 4(H^+) + 4e^- $$

$CO_2$ reduction reactions:

$$
\begin{align*}
\text{Reduction I}:   CO_2 + 2(H^+) + 2e^- \rightarrow CO + H_2O \\
\text{Reduction II}:  CO_2 + 6(H^+) + 6e^- \rightarrow CH_3OH + H_2O \\
\text{Reduction III}: CO_2 + 8(H^+) + 8e^- \rightarrow CH_4 +2H_2O
\end{align*}
$$
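The stoichiometry of the four half-reactions above can be sanity-checked by counting atoms and charge on each side. The following is a minimal sketch in plain Python; the `SPECIES` table and the helper names are illustrative and not part of this notebook's code.

```python
from collections import Counter

# Elemental composition and net charge of each species in the reactions above.
SPECIES = {
    "H2O":   (Counter(H=2, O=1), 0),
    "O2":    (Counter(O=2), 0),
    "H+":    (Counter(H=1), +1),
    "e-":    (Counter(), -1),   # electrons carry charge only
    "CO2":   (Counter(C=1, O=2), 0),
    "CO":    (Counter(C=1, O=1), 0),
    "CH3OH": (Counter(C=1, H=4, O=1), 0),
    "CH4":   (Counter(C=1, H=4), 0),
}

def totals(side):
    """Sum atoms and charge over a list of (coefficient, species) pairs."""
    atoms, charge = Counter(), 0
    for coeff, name in side:
        composition, q = SPECIES[name]
        for element, count in composition.items():
            atoms[element] += coeff * count
        charge += coeff * q
    return atoms, charge

def is_balanced(reactants, products):
    """True when both sides carry the same atoms and the same net charge."""
    return totals(reactants) == totals(products)

REACTIONS = {
    "water oxidation": ([(2, "H2O")],
                        [(1, "O2"), (4, "H+"), (4, "e-")]),
    "reduction I":     ([(1, "CO2"), (2, "H+"), (2, "e-")],
                        [(1, "CO"), (1, "H2O")]),
    "reduction II":    ([(1, "CO2"), (6, "H+"), (6, "e-")],
                        [(1, "CH3OH"), (1, "H2O")]),
    "reduction III":   ([(1, "CO2"), (8, "H+"), (8, "e-")],
                        [(1, "CH4"), (2, "H2O")]),
}

for name, (reactants, products) in REACTIONS.items():
    print(f"{name}: balanced = {is_balanced(reactants, products)}")
```

Each reaction transfers as many electrons as protons, so the net charge is zero on both sides.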

In this notebook, we walk through the process of down-selecting a pool of catalysts for water oxidation and for the $CO_2$ reduction reaction. For each mechanism, the ultimate goal is to estimate the ground state electronic energy of each transition state, so that the energy barrier along the reaction pathway can be calculated and superior catalysts uncovered.

There are four main steps to estimating the ground state energy of the reaction:
1. Generate the electronic Hamiltonian in second-quantized form (also known as the molecular Hamiltonian) for each state of the reaction.
2. Prepare an initial state that has sufficient overlap with the true ground state, which boosts the success probability of phase estimation for the reactive system. This can be achieved with a linear combination of the Hartree-Fock (HF) state and selected configuration interaction singles (CIS) states. These multireference states can be prepared with rotation and entangling gates.
3. Map the fermionic operators to qubit operators. This can be achieved using a Jordan-Wigner, Bravyi-Kitaev, or parity encoding.
4. Perform Ground State Energy Estimation (GSEE) on the mapped qubit Hamiltonian to estimate the activation energy of the reaction.

With the qubit Hamiltonian and the initial state in hand, we are ready to measure the energies of the reaction of interest.
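To make the fermion-to-qubit mapping step concrete, here is a minimal pure-Python sketch of the Jordan-Wigner encoding for a single ladder operator, $a_p^\dagger = \tfrac{1}{2}(X_p - iY_p)\,Z_{p-1}\cdots Z_0$. The helper names `jordan_wigner_ladder` and `multiply_pauli_sums` are hypothetical and exist only for this illustration; they are not part of OpenFermion or pyLIQTR. The sketch also assumes qubit 0 is the leftmost character of each Pauli string.

```python
# Products of two different non-identity Paulis: (left, right) -> (phase, result).
_PAULI_PRODUCT = {
    ("X", "Y"): (1j, "Z"), ("Y", "X"): (-1j, "Z"),
    ("Y", "Z"): (1j, "X"), ("Z", "Y"): (-1j, "X"),
    ("Z", "X"): (1j, "Y"), ("X", "Z"): (-1j, "Y"),
}

def _multiply_single(a, b):
    """Multiply two single-qubit Paulis, returning (phase, pauli)."""
    if a == "I":
        return 1, b
    if b == "I":
        return 1, a
    if a == b:
        return 1, "I"
    return _PAULI_PRODUCT[(a, b)]

def jordan_wigner_ladder(p, n_qubits, creation=True):
    """Return {pauli_string: coefficient} for a_p^dagger (or a_p)."""
    prefix = "Z" * p                    # parity string on qubits 0..p-1
    suffix = "I" * (n_qubits - p - 1)   # identity on the remaining qubits
    sign = -1 if creation else 1
    return {
        prefix + "X" + suffix: 0.5,
        prefix + "Y" + suffix: sign * 0.5j,
    }

def multiply_pauli_sums(op_a, op_b):
    """Multiply two operators given as {pauli_string: coefficient} dicts."""
    result = {}
    for string_a, coeff_a in op_a.items():
        for string_b, coeff_b in op_b.items():
            phase, chars = 1, []
            for a, b in zip(string_a, string_b):
                local_phase, pauli = _multiply_single(a, b)
                phase *= local_phase
                chars.append(pauli)
            key = "".join(chars)
            result[key] = result.get(key, 0) + coeff_a * coeff_b * phase
    # Drop terms that cancelled exactly.
    return {k: v for k, v in result.items() if abs(v) > 1e-12}

# The number operator a_1^dagger a_1 on three qubits should equal (I - Z_1) / 2.
number_op = multiply_pauli_sums(
    jordan_wigner_ladder(1, 3, creation=True),
    jordan_wigner_ladder(1, 3, creation=False),
)
print(number_op)
```

In practice the mapping would be done with an existing library transform (for example OpenFermion's `jordan_wigner`) rather than hand-rolled helpers like these.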

Typical examples of water oxidation:
$Co_4O_4$ catalysis (https://pubs.acs.org/doi/10.1021/ja202320q)

In [2]:
import re
import sys
import time
import numpy as np
from openfermionpyscf import run_pyscf
from openfermion.chem import MolecularData
from pyLIQTR.PhaseEstimation.pe import PhaseEstimation
from qca.utils.utils import extract_number, circuit_estimate

# Hamiltonian Generation
First, we define the functions needed to read the charge, multiplicity, and number of atoms for each molecule along a chosen reaction pathway for a catalyst of interest. With this information, we can construct an electronic Hamiltonian at each point along the pathway.

Our input is a file in the XYZ format, which is used to describe molecular geometries. An XYZ file gives the number of atoms in the molecule on the first line, followed by a comment line (which in our files carries the charge and multiplicity) and then one line per atom with the atomic symbol (or atomic number) and Cartesian coordinates. This notebook ships with XYZ files that describe water oxidation with $Co_4O_4$ as the catalyst and $CO_2$ reduction with $CoPc$ as the catalyst. These files can be found in the data/ directory of this repository.
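As a minimal sketch of that layout, the snippet below parses a single XYZ block from a string. The sample geometry is an illustrative water molecule, not taken from the shipped data files, and the comment-line layout (`charge=..., multiplicity=..., name`) is an assumption inferred from the parsing functions in this notebook. The helper `parse_xyz_block` is hypothetical.

```python
import re

# Illustrative single-geometry XYZ block (coordinates in angstroms).
SAMPLE_XYZ = """\
3
charge=0, multiplicity=1, step 1
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
"""

def parse_xyz_block(text):
    """Return (n_atoms, charge, multiplicity, geometry) for one XYZ block."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])   # line 1: atom count
    comment = lines[1]                   # line 2: free-form comment line
    charge_match = re.search(r"charge\s*=\s*([-+]?\d+)", comment)
    mult_match = re.search(r"multiplicity\s*=\s*(\d+)", comment)
    # Assumed defaults when the comment line omits a field: neutral singlet.
    charge = int(charge_match.group(1)) if charge_match else 0
    multiplicity = int(mult_match.group(1)) if mult_match else 1
    geometry = []
    for line in lines[2:2 + n_atoms]:    # remaining lines: symbol x y z
        symbol, x, y, z = line.split()[:4]
        geometry.append((symbol, (float(x), float(y), float(z))))
    return n_atoms, charge, multiplicity, geometry

n_atoms, charge, multiplicity, geometry = parse_xyz_block(SAMPLE_XYZ)
print(n_atoms, charge, multiplicity)
print(geometry[0])
```

The `(symbol, (x, y, z))` tuples match the geometry format that `MolecularData` is given later in this notebook.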

In [3]:
t_init = time.perf_counter()
def grab_line_info(current_line:str):
    multiplicity = 0
    charge = 0
    multiplicity_match = re.search(r"multiplicity\s*=\s*(\d+)", current_line)
    if multiplicity_match:
        multiplicity = int(multiplicity_match.group(1))
    # Accept an optional sign so that charges such as -1 or +2 are matched.
    charge_match = re.search(r"charge\s*=\s*([-+]?\d+)", current_line)
    if charge_match:
        charge = int(charge_match.group(1))
    return multiplicity, charge

def grab_pathway_info(data: list[str], nat:int, current_line:str, coord_pathways:list, current_idx:int):
    coords_list = []
    multiplicity, charge = grab_line_info(current_line)
    coords_list.append([nat, charge, multiplicity])
    for point in range(nat):
        data_point = data[current_idx+1+point].split()
        aty = data_point[0]
        xyz = [float(data_point[i]) for i in range(1,4)]
        coords_list.append([aty, xyz])
    coord_pathways.append(coords_list)

In [4]:
def load_pathway(fname:str, pathway:list[int]=None) -> list:
    with open(fname, 'r') as f:
        coordinates_pathway = []
        data = f.readlines()
        data_length = len(data)
        idx = 0
        while idx < data_length:
            line = data[idx]
            if 'charge' in line or 'multiplicity' in line:
                geo_name = ''
                if len(line.split(',')) > 2:
                    geo_name = line.split(',')[2]
                nat = int(data[idx-1].split()[0])
                if geo_name and pathway:
                    order = extract_number(geo_name)
                    if order and order in pathway:
                        grab_pathway_info(data, nat, line, coordinates_pathway, idx)
                else:
                    grab_pathway_info(data, nat, line, coordinates_pathway, idx)
                idx += nat + 2
            else:
                idx += 1
    return coordinates_pathway

We then define the parameters for generating the electronic Hamiltonian along a reaction pathway. The Python-based Simulations of Chemistry Framework (PySCF) is an open-source collection of electronic structure modules, which we access through an OpenFermion plugin called openfermionpyscf. This plugin is what we use to generate the electronic Hamiltonian.

The calculation parameters indicate which electronic structure calculations to perform. They are as follows:
- run_scf: boolean flag to indicate running a self-consistent field (SCF) calculation
- run_mp2: boolean flag to indicate running a second-order Møller-Plesset (MP2) calculation
- run_cisd: boolean flag to indicate running a configuration interaction singles and doubles (CISD) calculation
- run_ccsd: boolean flag to indicate running a coupled cluster singles and doubles (CCSD) calculation
- run_fci: boolean flag to indicate running a full configuration interaction (FCI) calculation

We also choose the basis set in which the electronic Hamiltonian is expressed. To keep the computational cost down, we use 'sto-3g', a common minimal basis set.

In selecting an active space, the key principle is that all strongly correlated orbitals must be identified and included in the active space. Restricting the calculation to the active space reduces the size of the molecular Hamiltonian and of the subsequent qubit Hamiltonian.
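The effect of a fractional active-space window can be worked through by hand. The sketch below assumes the same integer-division rule that this notebook applies when building the Hamiltonian, and it uses the orbital and electron counts that the SCF output reports for the first geometry (100 orbitals, 148 electrons). The function name `active_space_window` is hypothetical.

```python
def active_space_window(n_electrons, n_orbitals, frac):
    """Return (start, stop) spatial-orbital indices of the active space.

    Keeps 1/frac of the occupied orbitals just below the Fermi level and
    1/frac of the virtual orbitals just above it (integer division).
    """
    n_occ = n_electrons // 2        # doubly occupied spatial orbitals
    n_vir = n_orbitals - n_occ      # unoccupied (virtual) spatial orbitals
    start = n_occ - n_occ // frac
    stop = n_occ + n_vir // frac
    return start, stop

start, stop = active_space_window(n_electrons=148, n_orbitals=100, frac=10)
n_active_orbitals = stop - start
n_active_qubits = 2 * n_active_orbitals   # two spin orbitals per spatial orbital

print(f"active orbitals: {start}..{stop - 1} ({n_active_orbitals} orbitals)")
print(f"qubits needed: {n_active_qubits} instead of {2 * 100}")
```

With a fraction of 10 the window keeps 9 of the 100 spatial orbitals, so the qubit Hamiltonian acts on 18 qubits rather than 200.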

In [5]:
molecular_hamiltonians = []
pathway = [1, 14, 15, 16, 24, 25, 26, 27]

# water oxidation via Co4O4 catalyst.
coordinates_pathway = load_pathway('../data/water_oxidation_Co4O4.xyz', pathway=pathway)

# co2 reduction via CoPc catalyst
# coordinates_pathway = load_pathway('../data/CO2_reduciton_CoPc.xyz')

# Set calculation parameters.
run_scf = 1
run_mp2 = 0
run_cisd = 0
run_ccsd = 0
run_fci = 0

# Set molecule parameters.
basis = 'sto-3g'
active_space_frac = 10  # keep 1/n of the occupied and 1/n of the virtual orbitals

In [6]:
def generate_electronic_hamiltonians(coordinates_pathway:list) -> list:
    molecular_hamiltonians = []
    for coords in coordinates_pathway:
        t_coord_start = time.perf_counter()
        _, charge, multi = [int(coords[0][j]) for j in range(3)]

        # set molecular geometry in pyscf format
        geometry = []
        for coord in coords[1:]:
            atom = (coord[0], tuple(coord[1]))
            geometry.append(atom)
        
        molecule = MolecularData(geometry=geometry,
                                 basis=basis,
                                 multiplicity=multi,
                                 charge=charge,
                                 description='catalyst')
        t0 = time.perf_counter()
        molecule = run_pyscf(molecule,
                             run_scf=run_scf,
                             run_mp2=run_mp2,
                             run_cisd=run_cisd,
                             run_ccsd=run_ccsd,
                             run_fci=run_fci)
        t1 = time.perf_counter()

        print(f'Time it took to perform a scf calculation: {t1-t0}')
        print(f'number of orbitals          = {molecule.n_orbitals}')
        print(f'number of electrons         = {molecule.n_electrons}')

        print(f'number of qubits            = {molecule.n_qubits}')
        print(f'Hartree-Fock energy         = {molecule.hf_energy}')

        nocc = molecule.n_electrons // 2
        nvir = molecule.n_orbitals - nocc
        sys.stdout.flush()

        # get molecular Hamiltonian
        active_space_start =  nocc - nocc // active_space_frac # start index of active space
        active_space_stop = nocc + nvir // active_space_frac   # end index of active space

        print(f'active_space start = {active_space_start}')
        print(f'active_space stop  = {active_space_stop}')
        sys.stdout.flush()

        molecular_hamiltonian = molecule.get_molecular_hamiltonian(
            occupied_indices=range(active_space_start),
            active_indices=range(active_space_start, active_space_stop)
        )

        # shifted by HF energy
        molecular_hamiltonian -= molecule.hf_energy
        molecular_hamiltonians.append(molecular_hamiltonian)
        t_coord_end = time.perf_counter()
        print(f'Time it took to generate an electronic hamiltonian is {t_coord_end-t_coord_start}\n')
    return molecular_hamiltonians


if coordinates_pathway:
    molecular_hamiltonians = generate_electronic_hamiltonians(coordinates_pathway)

finished computing scf
finished running scf
Time it took to perform a scf calculation: 47.173789332970046
number of orbitals          = 100
number of electrons         = 148
number of qubits            = 200
Hartree-Fock energy         = -3479.3603932694523
active_space start = 67
active_space stop  = 76
Time it took to generate an electronic hamiltonian is 47.18345462495927

finished computing scf
finished running scf
Time it took to perform a scf calculation: 60.70359370892402
number of orbitals          = 99
number of electrons         = 147
number of qubits            = 198
Hartree-Fock energy         = -3478.738701637257
active_space start = 66
active_space stop  = 75
Time it took to generate an electronic hamiltonian is 60.725361166056246

finished computing scf
finished running scf
Time it took to perform a scf calculation: 60.68609670794103
number of orbitals          = 98
number of electrons         = 146
number of qubits            = 196
Hartree-Fock energy         = -3478.02

For the initial state, a single-determinant Hartree-Fock (HF) state serves as a good first approximation, as it provides a reasonable starting point with a relatively large overlap with the actual ground state.
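As a rough illustration of what such a state looks like on qubits, the sketch below builds the HF occupation bit string for the active space of the first geometry. It assumes a Jordan-Wigner encoding with spin orbitals ordered by energy, so that the lowest spin orbitals map to the first qubits, and a closed-shell reference. Whether pyLIQTR's `PhaseEstimation` accepts the bit list in exactly this form is also an assumption; this notebook itself currently passes an all-zero `init_state`. The function `hartree_fock_bits` is hypothetical.

```python
def hartree_fock_bits(n_active_electrons, n_qubits):
    """Occupation bits with the lowest n_active_electrons spin orbitals filled."""
    if not 0 <= n_active_electrons <= n_qubits:
        raise ValueError("electron count must fit in the available spin orbitals")
    return [1] * n_active_electrons + [0] * (n_qubits - n_active_electrons)

# Values reported by this notebook for the first geometry:
# 148 electrons (74 occupied orbitals) and an active window of 67..76.
n_occ, start, stop = 74, 67, 76
n_active_electrons = 2 * (n_occ - start)   # 7 doubly occupied active orbitals
n_qubits = 2 * (stop - start)              # 9 active orbitals, 2 spins each

hf_state = hartree_fock_bits(n_active_electrons, n_qubits)
print(hf_state)
```

Frozen core orbitals below the window are folded into the Hamiltonian constant, which is why only the 14 active electrons appear in the bit string.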

In [None]:
"""
TODO: Perform HF into this section to prepare the initial state of each
electronic hamiltonian to perform QPE for GSEE. 
"""

Once the initial state has been prepared, we can perform Quantum Phase Estimation (QPE) to estimate the ground state energy of each electronic Hamiltonian. Currently we use a short evolution time and a second-order Trotterization with a single step. We will use scaling arguments to determine the final resources, since generating the full circuit for a large number of Trotter steps with many bits of precision is quite costly.
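One such scaling argument can be sketched with a few lines of arithmetic. The snippet below estimates how many bits of precision are needed to resolve chemical accuracy (about 1.6 mHartree, roughly 1 kcal/mol) inside the energy window used in this notebook, and how many applications of the evolution unitary textbook QPE then requires. It assumes the window $[E_{min}, E_{max}]$ is mapped linearly onto one full period of the phase; the function names are hypothetical, and actual resource counts depend on pyLIQTR's circuit construction.

```python
import math

def bits_for_resolution(energy_window, target_error):
    """Smallest n such that energy_window / 2**n <= target_error."""
    return math.ceil(math.log2(energy_window / target_error))

def controlled_unitary_applications(n_bits):
    """Textbook QPE applies U^(2^k) for k = 0..n-1, i.e. 2^n - 1 uses of U."""
    return 2 ** n_bits - 1

E_min, E_max = -4000, -3000          # Hartree, same window as this notebook
omega = E_max - E_min
chemical_accuracy = 1.6e-3           # Hartree

n_bits = bits_for_resolution(omega, chemical_accuracy)
n_applications = controlled_unitary_applications(n_bits)

print(f"bits of precision needed: {n_bits}")
print(f"applications of the Trotter unitary: {n_applications}")
```

Because the number of unitary applications doubles with every extra bit, a circuit generated with one bit of precision can be extrapolated to the full precision instead of being built explicitly.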

Once we have the GSEE circuit for the electronic Hamiltonian, we convert it to a Clifford + T circuit from which resource estimates are extracted. The generated JSON files containing the subcircuit resource estimates are written to disk.

In [98]:
E_min = -4000
E_max = -3000
omega = E_max - E_min
t = 2*np.pi/omega
phase_offset = E_max * t
trotter_order = 2
trotter_steps = 1
bits_precision = 1
gse_args = {
    'trotterize' : True,
    'ev_time'    : 1,
    'trot_ord'   : trotter_order,
    'trot_num'   : trotter_steps
}

for idx, molecular_hamiltonian in enumerate(molecular_hamiltonians):
    
    n_qubits = molecular_hamiltonian.n_qubits
    # Note: we would actually like results within chemical precision,
    # which should take > 10 bits of precision. Generating that circuit
    # takes a really long time, so a scaling argument will be needed.

    gse_args['mol_ham'] = molecular_hamiltonian
    init_state = [0] * n_qubits
    
    t0 = time.perf_counter()
    gse_inst = PhaseEstimation(
        precision_order=bits_precision,
        init_state=init_state,
        phase_offset=phase_offset,
        include_classical_bits=False,
        kwargs=gse_args
    )
    gse_inst.generate_circuit()
    t1 = time.perf_counter()
    print(f'Co4O4 time to generate high level number : {t1 - t0}')
    gse_circuit = gse_inst.pe_circuit
    
    t0 = time.perf_counter()
    circuit_estimate(gse_circuit,
                     outdir='GSE/',
                     circuit_name=f'Co4O4_{idx}',
                     trotter_steps=trotter_steps,
                     write_circuits=True
                     )
    t1 = time.perf_counter()
    print(f'Time to estimate Co4O4: {t1-t0}')

starting
Co4O4 time to generate high level number : 7.084446292021312
Estimating Co4O4 circuit {i}
   Time to decompose high level <class 'cirq.ops.common_gates.HPowGate circuit: 0.00013062497600913048 seconds 
   Time to transform decomposed <class 'cirq.ops.common_gates.HPowGate circuit to Clifford+T: 2.320797648280859e-05 seconds
   Time to decompose high level <class 'cirq.ops.identity.IdentityGate circuit: 4.112499300390482e-05 seconds 
   Time to transform decomposed <class 'cirq.ops.identity.IdentityGate circuit to Clifford+T: 3.00002284348011e-06 seconds
   Time to decompose high level <class 'pyLIQTR.PhaseEstimation.pe_gates.PhaseOffset circuit: 0.00013041601050645113 seconds 
   Time to transform decomposed <class 'pyLIQTR.PhaseEstimation.pe_gates.PhaseOffset circuit to Clifford+T: 6.191595457494259e-05 seconds
   Time to decompose high level <class 'pyLIQTR.PhaseEstimation.pe_gates.Trotter_Unitary circuit: 1690.9945681659738 seconds 


In [7]:
t_end = time.perf_counter()
print(f'Total time to run through this notebook: {t_end-t_init}')

Total time to run through this notebook: 3867.341483874945
