# Part 2: Water–Gas Shift Reaction

## Introduction

In this notebook, you will apply the PySCF workflow from **Part 1** to a
chemically meaningful equilibrium: the **water–gas shift reaction**

$$
\mathrm{CO + H_2O \rightleftharpoons CO_2 + H_2}.
$$

The focus in Part 2 is on assembling *reaction* thermodynamics from
*molecular* thermochemical quantities. Concretely, you will compute standard
thermochemistry for each species and combine the results to obtain the
standard reaction Gibbs free energy and the corresponding equilibrium constant.

Your goals are to:

1. Compute standard reaction thermodynamics in the **gas phase** at
   $T = 298.15\,\text{K}$ and $p = 1\,\text{bar}$.
2. Repeat the analysis with a **continuum solvation model** (PCM water) to
   mimic aqueous solution.
3. Compare your calculated equilibrium constants to **reference** values and
   comment on likely sources of deviation (electronic-structure method,
   harmonic approximation, solvation model, and standard-state conventions).

Recommended strategy:

1. For each molecular species (CO, H$_2$O, CO$_2$, H$_2$), perform the full workflow:
   geometry optimization, Hessian, frequency analysis, and thermochemistry.
2. Extract the relevant thermodynamic quantity for each species
   (e.g. Gibbs free energy at the chosen $T, p$).
3. Form the reaction quantity, e.g.

   $$
   \Delta_r G^\circ =
   G^\circ_{\mathrm{CO_2}} + G^\circ_{\mathrm{H_2}}
   - G^\circ_{\mathrm{CO}} - G^\circ_{\mathrm{H_2O}},
   $$

   and then

   $$
   K^\circ = \exp\bigl(-\Delta_r G^\circ/(RT)\bigr).
   $$

4. Compare the calculated $K^\circ$ to values obtained from a data
   table (e.g. NIST or a physical chemistry handbook).
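
For the comparison in step 4, a reference value can be assembled from tabulated standard Gibbs energies of formation. The sketch below uses rounded gas-phase values at 298.15 K of the kind listed in physical chemistry textbooks. Treat these numbers as assumptions and replace them with the values from the table you actually cite.

```python
import math

R = 8.314462618  # J mol^-1 K^-1
T = 298.15       # K

# Assumed standard Gibbs energies of formation (gas phase, 298.15 K), in kJ/mol.
# Rounded textbook-style values; replace with those from your own data table.
delta_f_G = {"CO": -137.17, "H2O": -228.57, "CO2": -394.36, "H2": 0.0}

# Stoichiometric coefficients: negative for reactants, positive for products.
nu = {"CO": -1, "H2O": -1, "CO2": +1, "H2": +1}

delta_r_G_ref = sum(nu[s] * delta_f_G[s] for s in nu)   # kJ/mol
K_ref = math.exp(-delta_r_G_ref * 1000.0 / (R * T))     # dimensionless

print(f"Reference Δ_r G° = {delta_r_G_ref:.2f} kJ/mol")
print(f"Reference K°     = {K_ref:.2e}")
```

With these inputs the gas-phase reaction is exergonic by roughly 29 kJ/mol, which corresponds to $K^\circ$ on the order of $10^5$.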


In [None]:
# Import the main PySCF modules used in this workflow.
from pyscf import dft, gto
from pyscf.geomopt.geometric_solver import optimize
from pyscf.hessian import thermo

In [None]:
# Apply a small PySCF fix for rotational constants (robust handling of near-linear cases).
import patch

## Build Molecules for the Water–Gas Shift Reaction

First, define molecular geometries for each species:

- CO (linear),
- H₂O (bent),
- CO₂ (linear),
- H₂ (linear).

For this exercise you may guess reasonable approximate geometries, since each
structure is optimized before the frequency analysis. Consistency matters more
than ultimate accuracy: use the **same level of theory and basis set** for all
species.

Below is a template showing how you might build CO. Extend it to H₂O, CO₂,
and H₂, either by copying cells or by writing a small helper function.

In [None]:
# CO molecule used in the WGS project.
co = gto.M(
    atom="""
    C 0.0000 0.0000 0.0000
    O 0.0000 0.0000 1.13
    """,  # CO bond length ≈ 1.13 Å
    basis="def2-TZVPPD",
    verbose=3,
)

# TODO: define h2o, co2, and h2 with consistent geometries and basis.
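
One possible way to generate starting geometries for the remaining species is sketched below. The helper names `linear_atoms` and `bent_xy2_atoms` are hypothetical (they are not part of PySCF), and the bond lengths and angle are rough guesses that the geometry optimization is expected to refine.

```python
import math

def linear_atoms(symbols, bond_lengths):
    """Place atoms along z; bond_lengths[i] joins atom i and atom i+1 (Å)."""
    z = 0.0
    lines = [f"{symbols[0]} 0.0000 0.0000 {z:.4f}"]
    for sym, r in zip(symbols[1:], bond_lengths):
        z += r
        lines.append(f"{sym} 0.0000 0.0000 {z:.4f}")
    return "\n".join(lines)

def bent_xy2_atoms(center, ligand, r, angle_deg):
    """Symmetric bent XY2 molecule in the yz-plane with X at the origin."""
    half = math.radians(angle_deg) / 2.0
    y, z = r * math.sin(half), r * math.cos(half)
    return "\n".join([
        f"{center} 0.0000 0.0000 0.0000",
        f"{ligand} 0.0000 {y:.4f} {z:.4f}",
        f"{ligand} 0.0000 {-y:.4f} {z:.4f}",
    ])

# Rough starting guesses (Å, degrees); the optimizer refines them.
co2_atoms = linear_atoms(["O", "C", "O"], [1.16, 1.16])
h2_atoms = linear_atoms(["H", "H"], [0.74])
h2o_atoms = bent_xy2_atoms("O", "H", 0.96, 104.5)

print(h2o_atoms)
```

Each string can then be passed as the `atom` argument of `gto.M`, together with the same `basis` used for CO.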

## Helper Function for Thermochemistry

To avoid repeating code, it is convenient to write a small helper function
that

1. takes a molecule object,
2. runs a geometry optimization,
3. performs a Hessian and frequency analysis, and
4. returns the thermochemistry dictionary.

The skeleton below illustrates this idea. You may adapt it to your own preferences.

In [None]:
def compute_thermo_for_molecule(mol, T=298.15, P=100000.0, use_pcm=False):
    """Run geometry optimization and RRHO thermochemistry for a molecule.

    Parameters
    ----------
    mol : pyscf.gto.Mole
        Molecule object.
    T : float
        Temperature in K.
    P : float
        Pressure in Pa.
    use_pcm : bool
        If True, include PCM water.

    Returns
    -------
    dict
        Thermochemistry dictionary from thermo.thermo.
    """

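    # Mean-field object that drives the geometry optimization.
    mf = dft.RKS(mol)
    mf.xc = "PBE0-D4"

    if use_pcm:
        # Wrap the SCF object in PCM; eps is the dielectric constant of water.
        mf = mf.PCM()
        mf.with_solvent.eps = 78.3553

    # optimize() returns a new Mole object at the optimized geometry.
    mol_opt = optimize(mf)

    # Rebuild the mean-field object on the optimized geometry with the
    # same functional and solvent settings.
    mf_opt = dft.RKS(mol_opt)
    mf_opt.xc = "PBE0-D4"
    if use_pcm:
        mf_opt = mf_opt.PCM()
        mf_opt.with_solvent.eps = 78.3553

    # Converge the SCF at the optimized geometry before requesting the Hessian.
    mf_opt.kernel()

    # Hessian -> harmonic frequencies -> RRHO thermochemistry at (T, P).
    hess_opt = mf_opt.Hessian().kernel()
    freq_info = thermo.harmonic_analysis(mol_opt, hess_opt)

    thermo_info = thermo.thermo(mf_opt, freq_info["freq_au"], T, P)
    return thermo_info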
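
To make the frequency step less of a black box, the sketch below evaluates the textbook harmonic-oscillator expressions for a single vibrational mode: the zero-point energy and the thermal vibrational free energy. It is an illustration of the harmonic approximation, not a copy of PySCF's implementation. The function name `harmonic_mode_terms` is hypothetical, and the wavenumber of 2170 cm⁻¹ is an assumed, approximate value for the CO stretch rather than a computed result.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C_CM = 2.99792458e10    # speed of light, cm/s
K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol

def harmonic_mode_terms(wavenumber_cm, T=298.15):
    """Return (ZPE, thermal vibrational free energy) in J/mol for one mode."""
    e_quantum = H * C_CM * wavenumber_cm   # energy of one vibrational quantum, J
    zpe = 0.5 * e_quantum * N_A            # zero-point energy per mole
    x = e_quantum / (K_B * T)              # reduced energy h*c*nu / (k_B*T)
    g_thermal = N_A * K_B * T * math.log(1.0 - math.exp(-x))
    return zpe, g_thermal

# Assumed, approximate CO stretching wavenumber (illustration only).
zpe_co, g_vib_co = harmonic_mode_terms(2170.0)

print(f"ZPE            = {zpe_co / 1000.0:.2f} kJ/mol")
print(f"G_vib (thermal) = {g_vib_co:.3f} J/mol")
```

For a stiff mode like this one, the zero-point energy dominates and the thermal vibrational term is negligible at room temperature. Low-frequency modes behave differently, which is where the harmonic approximation tends to matter most.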

## Reaction Gibbs Free Energy and Equilibrium Constant (Gas Phase)

Now combine the molecular thermochemistry into reaction quantities.

1. Compute thermochemistry for each species in the **gas phase**
   (`use_pcm=False`).
2. Extract the Gibbs free energy for each species from the corresponding
   `thermo_info` dictionary.
3. Form the standard reaction Gibbs free energy
   $\Delta_r G^\circ(T)$ for

   $$
   \mathrm{CO + H_2O \rightleftharpoons CO_2 + H_2}.
   $$

4. Compute the equilibrium constant

   $$
   K^\circ(T) = \exp\bigl(-\Delta_r G^\circ(T)/(RT)\bigr)
   $$

   and compare to an experimental value at the same temperature.

Remember:

- All four species must be treated at the **same level of theory** and with
  the **same basis set**.
- Be clear about the **standard state**: the model uses ideal-gas
  translational motion at the pressure you pass in (1 bar here; PySCF's
  `thermo.thermo` otherwise defaults to 101325 Pa, i.e. 1 atm).
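
The effect of the standard-state pressure can be checked with a short ideal-gas estimate. The sketch below assumes ideal-gas behavior, so each species' molar Gibbs energy shifts by $RT\ln(p_\text{new}/p_\text{old})$. The function name `pressure_shift` is hypothetical.

```python
import math

R = 8.314462618  # J mol^-1 K^-1
T = 298.15       # K

def pressure_shift(p_new, p_old, temperature=T):
    """Change in ideal-gas molar G (J/mol) on moving from p_old to p_new."""
    return R * temperature * math.log(p_new / p_old)

# Per-species shift between the 1 bar and 1 atm conventions (pressures in Pa).
shift = pressure_shift(101325.0, 100000.0)

# Net change in moles of gas for CO + H2O -> CO2 + H2.
delta_n = (1 + 1) - (1 + 1)
reaction_shift = delta_n * shift

print(f"Per-species shift: {shift:.1f} J/mol")
print(f"Shift in Δ_r G°:   {reaction_shift:.1f} J/mol")
```

Because the water-gas shift reaction has the same number of gas molecules on both sides, the per-species shifts cancel and $K^\circ$ does not depend on whether 1 bar or 1 atm is used. The individual Gibbs free energies still differ, so the same pressure must be used for all four species.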

Use the template below as a starting point and fill in the missing pieces.


In [None]:
import math

T = 298.15  # K
P = 100000.0  # Pa (1 bar)
R = 8.314462618  # J mol^-1 K^-1
hartree_to_jmol = 2625499.6395  # J mol^-1 per Hartree (= 2625.4996 kJ/mol)

thermo_co_gas = compute_thermo_for_molecule(co, T=T, P=P, use_pcm=False)
# TODO: thermo_h2o_gas = ...
# TODO: thermo_co2_gas = ...
# TODO: thermo_h2_gas  = ...

In [None]:
# Gibbs free energy is stored under the key "G_tot" as a (value, unit) pair,
# with the value in Hartree.
G_key = "G_tot"

G_co_gas, _unit = thermo_co_gas[G_key]
# TODO: G_h2o_gas, _ = ...
# TODO: G_co2_gas, _ = ...
# TODO: G_h2_gas, _  = ...

delta_G_gas = (G_co2_gas + G_h2_gas) - (G_co_gas + G_h2o_gas)
delta_G_gas_Jmol = delta_G_gas * hartree_to_jmol

K_gas = math.exp(-delta_G_gas_Jmol / (R * T))
print(f"Gas-phase Δ_r G°(T={T} K) = {delta_G_gas_Jmol:.2f} J/mol")
print(f"Gas-phase K°(T={T} K)      = {K_gas:.3e}")
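
When comparing with a reference value, it helps to remember that $K^\circ$ depends exponentially on $\Delta_r G^\circ$, so a modest error in the free energy becomes a multiplicative error in $K^\circ$. The sketch below quantifies this; the function name `k_error_factor` is hypothetical, and the 1 kcal/mol input is only an illustrative error size, not an estimate for this method.

```python
import math

R = 8.314462618  # J mol^-1 K^-1

def k_error_factor(delta_G_error_Jmol, temperature=298.15):
    """Factor by which K changes if Δ_r G° is off by the given amount (J/mol)."""
    return math.exp(abs(delta_G_error_Jmol) / (R * temperature))

# Illustrative error of 1 kcal/mol (4184 J/mol) in the reaction free energy.
factor_1kcal = k_error_factor(4184.0)
print(f"1 kcal/mol error in Δ_r G° changes K° by a factor of {factor_1kcal:.1f}")
```

For this reason it is often more informative to compare $\Delta_r G^\circ$ (or $\ln K^\circ$) with the reference than to compare $K^\circ$ directly.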

## Including Solvent Effects (PCM)

Repeat the analysis with the PCM water model:

1. Recompute thermochemistry for each species with `use_pcm=True`.
2. Form the reaction Gibbs free energy and equilibrium constant as before.
3. Compare the gas-phase and solution-phase values.

Conceptually, PCM adds an approximate solvation free energy for each species,
so the reaction free energy picks up the difference in solvation between
products and reactants.
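
This bookkeeping can be written as a small thermodynamic cycle: the solvent contribution to the reaction free energy is the stoichiometric sum of the per-species solvation terms. In the sketch below, the function name `reaction_sum` is hypothetical and the solvation values are placeholders chosen only to exercise the arithmetic. They are not computed or measured numbers.

```python
import math

R = 8.314462618  # J mol^-1 K^-1
T = 298.15       # K

# Stoichiometric coefficients: negative for reactants, positive for products.
nu = {"CO": -1, "H2O": -1, "CO2": +1, "H2": +1}

def reaction_sum(per_species, coefficients):
    """Stoichiometric sum of a per-species quantity."""
    return sum(coefficients[s] * per_species[s] for s in coefficients)

# PLACEHOLDER solvation free energies in J/mol (not real data).
# In practice: G_pcm - G_gas for each species, converted to J/mol.
delta_G_solv = {"CO": -1000.0, "H2O": -2000.0, "CO2": -3000.0, "H2": -4000.0}

# Solvent contribution to Δ_r G° and the resulting ratio K_pcm / K_gas.
delta_delta_G = reaction_sum(delta_G_solv, nu)
k_ratio = math.exp(-delta_delta_G / (R * T))

print(f"ΔΔ_r G (solvent) = {delta_delta_G:.1f} J/mol")
print(f"K_pcm / K_gas    = {k_ratio:.2f}")
```

A negative solvent contribution means the products are stabilized more than the reactants, which shifts the equilibrium toward products. A positive one shifts it toward reactants.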

Points for discussion in your report:

- Does the PCM model shift the equilibrium toward reactants or products?
- How large is the solvent effect relative to the gas-phase reaction free
  energy?
- Which approximations in the electronic structure model and in the
  thermochemistry treatment are likely to be most important for this
  reaction?

In [None]:
thermo_co_pcm = compute_thermo_for_molecule(co, T=T, P=P, use_pcm=True)
# TODO: thermo_h2o_pcm = ...
# TODO: thermo_co2_pcm = ...
# TODO: thermo_h2_pcm  = ...

G_co_pcm, _unit = thermo_co_pcm[G_key]
# TODO: G_h2o_pcm, _ = ...
# TODO: G_co2_pcm, _ = ...
# TODO: G_h2_pcm, _  = ...

delta_G_pcm = (G_co2_pcm + G_h2_pcm) - (G_co_pcm + G_h2o_pcm)
delta_G_pcm_Jmol = delta_G_pcm * hartree_to_jmol

K_pcm = math.exp(-delta_G_pcm_Jmol / (R * T))
print(f"PCM Δ_r G°(T={T} K) = {delta_G_pcm_Jmol:.2f} J/mol")
print(f"PCM K°(T={T} K)      = {K_pcm:.3e}")