# Part 1: PySCF Workflow for Molecular Thermochemistry

## Introduction

This notebook introduces the computational workflow used throughout **Thermo Lab** to connect
an electronic-structure calculation to thermodynamic quantities. The emphasis in Part 1 is on
*process*: you will run the end-to-end PySCF sequence that later parts reuse for reaction and
equilibrium calculations.

## Learning goals

After completing Part 1, you should be able to:

- set up a molecule in PySCF (geometry and basis set),
- run a DFT geometry optimization,
- compute a Hessian and perform a harmonic frequency analysis,
- compute RRHO thermochemistry at a specified temperature and pressure, and
- interpret and extract quantities from the `thermo_info` output dictionary.

:::{note} Conventions
- Unless stated otherwise, we use $T = 298.15\,\mathrm{K}$ and $P = 1\,\mathrm{bar}$.
- Pressure is passed to PySCF in Pa, so $1\,\mathrm{bar} = 100000\,\mathrm{Pa}$.
- Level of theory: `wB97X-V` / `def2-QZVP`.
:::
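
The pressure convention is easy to get wrong by a few percent, because 1 bar and 1 atm are different standard states. The short sketch below makes the conversion explicit; `bar_to_pa`, `PA_PER_BAR`, and `PA_PER_ATM` are hypothetical helper names introduced here for illustration and are not part of PySCF.

```python
# Pressure unit conversions (both factors are exact by definition).
PA_PER_BAR = 100_000.0
PA_PER_ATM = 101_325.0


def bar_to_pa(pressure_bar: float) -> float:
    """Convert a pressure from bar to pascal."""
    return pressure_bar * PA_PER_BAR


T = 298.15          # K
P = bar_to_pa(1.0)  # Pa, the 1 bar convention used in this lab

print(f"T = {T} K, P = {P:.1f} Pa")
print(f"1 atm exceeds 1 bar by {PA_PER_ATM - P:.1f} Pa")
```

Keeping the conversion in one place means the value of `P` handed to the thermochemistry routine always matches the stated convention.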


## Setup

Run the next cells once to import PySCF modules and apply a small utility patch used in this lab.



:::{note} Note on `import patch`

The `patch` module applies a small fix to PySCF’s handling of rotational
constants. For some linear or near-linear molecules, numerical noise in the
moments of inertia can produce slightly negative or inconsistently ordered
rotational constants. Importing `patch` replaces PySCF’s routine with a more
robust version that enforces non-negative, sorted values.

You may reuse `patch.py` in your own projects: simply copy it into your
working directory and import it.
:::
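
To make "non-negative, sorted values" concrete, here is a minimal sketch of that kind of clean-up step. It is not the code in `patch.py`: the function name `sanitize_rotational_constants`, the descending sort order, and the sample numbers are all assumptions chosen for illustration.

```python
def sanitize_rotational_constants(constants):
    """Clip negative values to zero and sort in descending order.

    Hypothetical helper for illustration only; the actual patch may
    treat noise and ordering differently.
    """
    cleaned = [max(float(c), 0.0) for c in constants]
    return sorted(cleaned, reverse=True)


# Made-up input: one component is slightly negative because of round-off.
noisy = [20.56, -3.0e-13, 20.56]
print(sanitize_rotational_constants(noisy))  # prints [20.56, 20.56, 0.0]
```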

In [None]:
from pyscf import dft, gto
from pyscf.geomopt.geometric_solver import optimize
from pyscf.hessian import thermo

In [None]:
import patch

## Workflow Overview

The calculation in this notebook follows a typical **ab initio thermochemistry
workflow**:

1. **Build the molecule** using Cartesian coordinates and select a basis set.
2. **Run an electronic-structure calculation** (here: DFT with wB97X-V).
3. **Optimize the geometry** to locate an energy minimum on the potential
   energy surface.
4. **Compute the Hessian** (matrix of second derivatives) at the optimized
   geometry.
5. **Perform a frequency analysis** to obtain normal modes and vibrational
   frequencies.
6. **Evaluate thermodynamic functions** at a given temperature and pressure
   using the rigid-rotor / harmonic-oscillator / ideal-gas models.

As you read and execute each code cell, identify which step of this workflow
it implements and which quantities are being approximated.


## Molecule setup

In this section you define the **molecular system** for the calculation.
We use hydrogen fluoride (HF) as a simple diatomic example.

Key ingredients:

- The `atom` block specifies element symbols and Cartesian coordinates (Å by default).
- The `basis` keyword chooses the one-electron basis set (here: `def2-QZVP`).
- The `verbose` flag controls how much output PySCF prints.


In [None]:
mol = gto.M(
    atom="""
    H 0.0 0.0 0.0
    F 0.0 0.0 1.0
    """,
    basis="def2-QZVP",
    verbose=3,
)
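
The starting geometry above places H at the origin and F 1.0 Å away along $z$. As a quick sanity check that does not require PySCF, the sketch below recomputes the bond length from the Cartesian coordinates and converts it to Bohr, the atomic unit of length. The names `coords_angstrom` and `ANGSTROM_TO_BOHR` are introduced here for illustration.

```python
import math

# Bohr radius in angstrom (CODATA 2018), used for the unit conversion.
BOHR_IN_ANGSTROM = 0.529177210903
ANGSTROM_TO_BOHR = 1.0 / BOHR_IN_ANGSTROM

# Same coordinates as in the atom block, in angstrom.
coords_angstrom = {
    "H": (0.0, 0.0, 0.0),
    "F": (0.0, 0.0, 1.0),
}

r_angstrom = math.dist(coords_angstrom["H"], coords_angstrom["F"])
r_bohr = r_angstrom * ANGSTROM_TO_BOHR

print(f"r(H-F) = {r_angstrom:.4f} angstrom = {r_bohr:.4f} bohr")
```

This is only a starting guess for the bond length; the geometry optimization in the next section relaxes it.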

## Geometry optimization

Once the molecule is defined, the next step is to **optimize the geometry**.

Conceptually:

- We search the potential energy surface for a local minimum.
- At a minimum, the gradient vanishes (the net force on every atom is zero) and the energy rises for any small displacement.
- The optimized structure is then used as the reference point for the Hessian and vibrational analysis.

In the next cell you will:

- set up a DFT calculation with the wB97X-V functional,
- optionally embed the molecule in a polarizable continuum model (PCM) to mimic solvent effects, and
- call the geometry optimizer to relax the structure.

:::{tip} PCM toggle
Set `use_pcm = True` to include PCM water (dielectric constant $\varepsilon=78.3553$).
For HF, the qualitative workflow is the same with or without PCM.
:::


In [None]:
use_pcm = False  # set to True to enable PCM water
eps_water = 78.3553

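mf = dft.RKS(mol)  # restricted Kohn-Sham DFT for closed-shell hydrogen fluoride
mf.xc = "wB97X-V"  # exchange-correlation functional

if use_pcm:
    # Wrap the mean-field object in PCM and set the solvent dielectric constant.
    mf = mf.PCM()
    mf.with_solvent.eps = eps_water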

mol_opt = optimize(mf)
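
The idea that an optimizer walks downhill until the gradient vanishes can be illustrated with a one-dimensional toy problem. The sketch below minimizes a Morse potential in a single bond-length coordinate by plain gradient descent. It is not the algorithm used by the PySCF/geomeTRIC optimizer, and the Morse parameters are arbitrary illustrative numbers rather than values fitted to HF.

```python
import math

# Illustrative Morse parameters in arbitrary units (assumptions, not fitted to HF).
D_E = 1.0    # well depth
A = 2.0      # width parameter
R_E = 0.917  # position of the minimum


def energy(r):
    """Morse potential energy at bond length r."""
    return D_E * (1.0 - math.exp(-A * (r - R_E))) ** 2


def gradient(r):
    """Analytic first derivative dV/dr of the Morse potential."""
    decay = math.exp(-A * (r - R_E))
    return 2.0 * D_E * A * (1.0 - decay) * decay


def optimize_1d(r_start, step=0.1, tol=1e-8, max_iter=200):
    """Fixed-step gradient descent; stops once |dV/dr| drops below tol."""
    r = r_start
    for iteration in range(max_iter):
        g = gradient(r)
        if abs(g) < tol:
            return r, iteration
        r -= step * g
    raise RuntimeError("optimization did not converge")


r_opt, n_steps = optimize_1d(1.0)
print(f"converged after {n_steps} steps: r = {r_opt:.6f}, |grad| = {abs(gradient(r_opt)):.1e}")
```

Production optimizers use better step choices and coordinate systems, but the stopping criterion has the same spirit: the gradient must fall below a threshold.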

## Thermochemistry (RRHO)

The next step is to compute thermodynamic quantities from the optimized geometry.

Workflow:

1. Run a single-point electronic-structure calculation at the optimized geometry.
2. Compute the Hessian and perform a harmonic frequency analysis.
3. Assemble RRHO thermochemistry at the chosen $T$ and $P$ with `thermo.thermo(...)`.

The result is stored in `thermo_info`, which contains final totals (e.g., $H$, $S$, $G$)
and intermediate contributions (e.g., zero-point and thermal corrections). In the following
sections, you will (i) print a formatted report and (ii) inspect selected entries programmatically.
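
To see where the zero-point and thermal vibrational corrections come from, the sketch below evaluates the harmonic-oscillator expressions for a single mode, $E_\mathrm{ZPE} = \tfrac{1}{2} N_A h c \tilde{\nu}$ and $E_\mathrm{vib}(T) = N_A h c \tilde{\nu} / (e^{x} - 1)$ with $x = h c \tilde{\nu} / k_B T$. The wavenumber used here is an illustrative value of roughly the size of the HF stretch, not a result of this notebook; in practice you would substitute the frequency from your own `freq_info`. The function name `vibrational_terms` is hypothetical.

```python
import math

# Physical constants (SI, exact in the 2019 redefinition), speed of light in cm/s.
H_PLANCK = 6.62607015e-34    # J s
C_CM_PER_S = 2.99792458e10   # cm / s
K_B = 1.380649e-23           # J / K
N_A = 6.02214076e23          # 1 / mol


def vibrational_terms(wavenumber_cm, temperature):
    """Return (ZPE, thermal vibrational energy) in J/mol for one harmonic mode."""
    quantum = H_PLANCK * C_CM_PER_S * wavenumber_cm  # h*c*nu, J per molecule
    x = quantum / (K_B * temperature)
    zpe = 0.5 * quantum * N_A
    thermal = quantum * N_A / math.expm1(x)
    return zpe, thermal


wavenumber = 4138.0  # cm^-1, illustrative assumption (replace with your computed value)
zpe, thermal = vibrational_terms(wavenumber, 298.15)

print(f"ZPE           = {zpe / 1000:.2f} kJ/mol")
print(f"thermal E_vib = {thermal:.2e} J/mol")
```

For a stiff bond like this one the thermal vibrational term is negligible at room temperature, so the vibrational contribution to the enthalpy is almost entirely zero-point energy.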


In [None]:
mf_opt = dft.RKS(mol_opt)
mf_opt.xc = "wB97X-V"

if use_pcm:
    mf_opt = mf_opt.PCM()
    mf_opt.with_solvent.eps = eps_water

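energy = mf_opt.kernel()  # SCF energy at the optimized geometry (Hartree)

hess_opt = mf_opt.Hessian().kernel()  # second derivatives of the energy w.r.t. nuclear positions
freq_info = thermo.harmonic_analysis(mol_opt, hess_opt)  # normal modes and frequencies

T = 298.15  # K
P = 100000.0  # Pa (1 bar)
thermo_info = thermo.thermo(mf_opt, freq_info["freq_au"], T, P)  # RRHO thermochemistry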

## Reporting results with `dump_thermo`

`thermo.dump_thermo(...)` formats and prints a thermochemistry summary to the notebook output.
It does not recompute quantities; it reports what is stored in `thermo_info` for the chosen $T$ and $P$.


In [None]:
thermo.dump_thermo(mf_opt.mol, thermo_info)

## Understanding `thermo_info`

`thermo_info` is a Python dictionary (`dict`): it maps each key to a value.
In this lab, many entries are stored as `(value, unit)` pairs, so index `[0]` gives the number and `[1]` gives its unit string.

You will use entries from `thermo_info` in Parts 2 and 3 to form reaction quantities
(e.g., $\Delta_r G^\circ$) and equilibrium constants.
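
Because entries are `(value, unit)` pairs, it helps to unpack them explicitly and check the unit before converting. The sketch below uses a stand-in dictionary with placeholder numbers, not output from a calculation. It assumes energies carry the unit string `"Eh"` (Hartree); check the unit printed by your own run. The helper `to_kj_per_mol` is a hypothetical name.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996394799  # CODATA 2018 conversion factor


def to_kj_per_mol(entry):
    """Convert a (value, unit) pair given in Hartree to kJ/mol."""
    value, unit = entry
    if unit != "Eh":
        raise ValueError(f"expected an energy in Eh, got unit {unit!r}")
    return value * HARTREE_TO_KJ_PER_MOL


# Stand-in dictionary with placeholder numbers; NOT results of a calculation.
example_info = {
    "H_tot": (-1.0, "Eh"),
    "temperature": (298.15, "K"),
}

h_value, h_unit = example_info["H_tot"]
h_kj = to_kj_per_mol(example_info["H_tot"])

print(f"H_tot = {h_value} {h_unit} = {h_kj:.4f} kJ/mol")
```

Raising an error on an unexpected unit guards against silently converting an entropy or a temperature as if it were an energy.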


In [None]:
thermo_info

In [None]:
thermo_info.keys()

In [None]:
thermo_info["H_tot"]