# Chapter 2: Reasoning About Code - A New Way

In Chapter 1, we implemented a working 1-D heat equation solver.
It works, but it’s fragile, opaque, and difficult to test or extend.
In this chapter, we'll take a step back and ask: can we design and reason about our code systematically, before we even run it?

We'll introduce preconditions, postconditions, and invariants, which are concepts borrowed from formal logic, and use them to refactor our code step by step into something more modular, testable, and robust.

---

### Goals

 - Identify preconditions, postconditions, and invariants for a scientific computing problem.
 - Apply backward reasoning and stepwise refinement to improve a messy, monolithic implementation.
 - Separate concerns in your code: boundary conditions, flux computations, and updates.
 - Produce a modular version of the heat equation solver that sets the stage for unit and property-based testing.

### Key Concepts

 - Abstraction: viewing complex systems at a high level, where irrelevant details are ignored.
 - Hoare Logic: reasoning about code with `{P} code {Q}` triples.
 - Backward reasoning: starting from a desired outcome and working backward.
 - Stepwise refinement: incrementally developing code.
 - Invariants: conditions that must always hold true during program execution.
 - Separation of concerns: organizing code into clean, testable components.

---


## Why Reason About Code?

Scientific software is both:
 - A *tool* that enables science,
 - A *research product* in its own right.

As such, it warrants the same rigor and reasoning as the science it enables.

We can even take inspiration from the scientific method:

- Form hypotheses → attempt refutation → refine toward confidence.

Reasoning about code follows the same loop:

- Specify desired properties and behaviors → test or check the code against the specification → refine the implementation.

In practice, much scientific software evolves through a *code and fix* process:

> If the outputs look reasonable and don't crash, it must be correct. - A typical scientific programmer

But trusting results without understanding the code is risky.

> "We can't get systems right if we don't understand them" - Leslie Lamport


### Abstraction

Reasoning about code requires a level of abstraction, i.e., a way of focusing on high-level behavior while ignoring irrelevant implementation details.

Scientists and engineers already think this way:
 - We model complex systems by simplifying, removing extraneous detail, and breaking problems into components.

But what happens to the low-level details? Aren't subtle, low-level considerations
the most common cause of bugs? That is true, but a focus on high-level abstractions pays
off through better isolation of low-level details, better overall integrity, and a better
understanding of the system. After all, many low-level bugs are rooted in a lack of understanding of,
and attention to, the high-level design and intent, or in bad high-level decisions trickling down to the low-level implementation.

Abstraction pays off in two ways:
 - Reasoning: We can understand and predict behavior at a manageable scale.
 - Design: We can isolate low-level details, reducing complexity and making the system easier to extend.

> The hard part of building software is the conceptual construct, 
> not the labor of representing it.
> *-FP Brooks. “No Silver Bullet” (1987)*

> What matters is the fundamental structure of the design. If you get
> it wrong, there is no amount of bug fixing and refactoring that will
> produce a reliable, maintainable, and usable system
> *- D. Jackson. The essence of software. (2021)*


### Specification: Writing Down Intent

A *specification* is a precise statement of what software should do, independent of how it does it.

Specifications range from plain English descriptions to formal mathematical models.
Here, we focus on a practical level: preconditions and postconditions, expressed as simple `assert` statements in code.

This lightweight approach:
 - Guides implementation,
 - Facilitates testing and validation.

### Floyd-Hoare Triples

A Hoare triple expresses the contract for a code fragment:

`  {P} code {Q}`

where
 - `P` (precondition): what must be true before running the `code`.
 - `code` (implementation): the program fragment in question, e.g., a statement, block, function, etc.
 - `Q` (postcondition): what must be true after running the `code`.

Example 2.1:

Can you find inputs where this function fails the postcondition even when the precondition is satisfied?

In [8]:
def div(x, y):
    assert y != 0           # P    (precondition)
    res = x / y             # code (implementation)
    assert res * y == x     # Q    (postcondition)
    return res

In [9]:
div(4, 2)

2.0
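One input that answers the question above, shown as a quick check (the specific values are just one example among many):

```python
# 7/25 is not exactly representable in binary floating point,
# so multiplying the quotient back by 25 does not recover 7 exactly.
x, y = 7, 25
res = x / y

print(res * y)        # 7.000000000000001
print(res * y == x)   # False: the exact-equality postcondition would fail
```

The precondition `y != 0` holds here, yet `assert res * y == x` would raise an `AssertionError`.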

Floating-point arithmetic introduces small rounding errors, so exact equality is too strong a postcondition.
To acknowledge this, we weaken the postcondition to a tolerance-based comparison:

In [3]:
import numpy as np

def div(x, y):
    assert y != 0                   # P    (precondition)
    res = x / y                     # code (implementation)
    assert np.isclose(res * y, x)   # Q    (postcondition)
    return res

In [4]:
div(7, 25)

0.28

**Key Takeaway**: To avoid overpromising, postconditions should be realistic and account for computational realities, such as floating-point precision.

How about preconditions? 

Goals in determining and specifying `P` and `Q`:
- We wish to have `P` as general as possible while still being precise enough to be useful (weakest precondition). This allows us to maximize the applicability of our code fragment.
- We wish to have `Q` as specific as possible while still being general enough to cover all desired outcomes (strongest postcondition). This allows us to give stronger guarantees about the behavior of our implementation.
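As a rough illustration of these two goals, the sketch below contrasts two contracts for the same computation. The function names are hypothetical, and integer square root is chosen only because its postcondition is easy to state exactly:

```python
import math

def isqrt_strict(x: int) -> int:
    # P is stronger than necessary: it rejects x == 0,
    # even though the implementation handles it fine.
    assert x > 0
    r = math.isqrt(x)
    # Q is weak: it is also satisfied by many wrong answers (e.g., r = 0).
    assert r >= 0
    return r

def isqrt_contract(x: int) -> int:
    # Weaker P: accepts every input the implementation can handle.
    assert x >= 0
    r = math.isqrt(x)
    # Stronger Q: pins down r uniquely as the floor of the square root.
    assert r * r <= x < (r + 1) ** 2
    return r

print(isqrt_contract(0))    # 0
print(isqrt_contract(10))   # 3
```

Both functions compute the same result, but the second contract applies to more inputs and tells the caller far more about the output.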


## Stepwise Refinement: From Abstract to Concrete

Stepwise refinement is the process of progressively developing a solution, starting with a high-level specification and gradually refining it into an implementation.

The process:
 1. Identify the What: Define the overall problem and desired outcome.
 2. Decompose into Components: Break into meaningful, manageable parts.
 3. Specify Interfaces and Contracts: Define each component precisely.
 4. Implement Incrementally: Fill in implementations step by step, maintaining the contracts.

Benefits:
 - Clear separation of concerns,
 - Local, testable invariants,
 - Flexibility to change or replace components,
 - Robustness that emerges naturally.
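The process above can be sketched on a very small component. The helper `total_heat` and its contract are assumptions made for illustration only; they are not part of the solver developed in this chapter:

```python
import math
from typing import Sequence

# Steps 1-3: name the component and fix its contract before writing any logic.
def total_heat_v0(u: Sequence[float], dx: float) -> float:
    """Return sum(u_i) * dx.  P: len(u) >= 1, dx > 0, u finite."""
    raise NotImplementedError  # body deliberately left for a later refinement

# Step 4: fill in the body while keeping the contract unchanged.
def total_heat(u: Sequence[float], dx: float) -> float:
    # P (preconditions)
    assert len(u) >= 1 and dx > 0.0
    assert all(math.isfinite(ui) for ui in u)

    total = math.fsum(u) * dx

    # Q (postcondition): the total is bounded by the extreme values
    # multiplied by the domain length, up to a small rounding tolerance.
    length = len(u) * dx
    tol = 1e-12 * max(1.0, abs(total))
    assert min(u) * length - tol <= total <= max(u) * length + tol
    return total

print(total_heat([0.0, 100.0, 0.0], dx=1.0))   # 100.0
```

Code that calls `total_heat` depends only on the contract, so the body can later be replaced (for example, by a NumPy version) without touching the callers.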

## Heat Equation Solver, Revisited

We now apply stepwise refinement to redesign our 1-D heat equation solver.

### Step 0 - Specification (the what)

We want to advance a 1-D temperature field $u_i$, defined on the $N$ cells ($i = 0, \dots, N-1$) of a uniform grid, using a conservative finite volume method. Here $F_i$ denotes the flux through face $i$ ($i = 0, \dots, N$):

$$
u_i^{n+1} = u_i^{n} + \Delta t \, \frac{F_{i} - F_{i+1}}{\Delta x},
\qquad
F_i =
\begin{cases}
q_L & i = 0, \\[6pt]
-\kappa \dfrac{u_i - u_{i-1}}{\Delta x} & 1 \leq i \leq N-1, \\[10pt]
q_R & i = N.
\end{cases}
$$

#### Preconditions

- `len(u) >= 3`
- `dx > 0`, `dt > 0`, `nt > 0`

#### Postconditions

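- The total heat changes only through the boundary fluxes, that is
$$
\sum_{i=0}^{N-1} u_i^{n+1} \Delta x = \sum_{i=0}^{N-1} u_i^{n} \Delta x + \Delta t \left( q_L - q_R \right).
$$

- When `q_L` and `q_R` are both zero, the average temperature remains constant over time.
- When `q_L` and `q_R` are constant and equal, the temperature profile converges to a steady state. (If they differ, the total heat changes by $\Delta t \left( q_L - q_R \right)$ every step, so no steady state exists.)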
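The heat-balance postcondition can be turned into an executable check. The sketch below is self-contained and assumes the update rule from the specification; the names `explicit_step` and `heat_balance_residual` are hypothetical helpers, not the API developed later in the chapter:

```python
import math

def explicit_step(u, kappa, dx, dt, qL, qR):
    """One update of the scheme in the specification (faces 0..N, cells 0..N-1)."""
    N = len(u)
    F = [qL] + [-kappa * (u[i] - u[i - 1]) / dx for i in range(1, N)] + [qR]
    return [u[i] + dt * (F[i] - F[i + 1]) / dx for i in range(N)]

def heat_balance_residual(u_old, u_new, dx, dt, qL, qR):
    """Left side minus right side of the heat-balance postcondition."""
    lhs = math.fsum(u_new) * dx
    rhs = math.fsum(u_old) * dx + dt * (qL - qR)
    return lhs - rhs

u_old = [0.0, 100.0, 0.0]
u_new = explicit_step(u_old, kappa=0.1, dx=1.0, dt=1.0, qL=2.0, qR=0.5)
residual = heat_balance_residual(u_old, u_new, dx=1.0, dt=1.0, qL=2.0, qR=0.5)

# The residual should vanish up to floating-point rounding.
assert abs(residual) <= 1e-9
```

The interior fluxes cancel in pairs when the update is summed over all cells, which is why only `qL` and `qR` appear in the balance.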

### Step 1 - Define the conceptual construct

In this step, we define a small but complete core API (data structures and functions).
Here, we are attempting to have a complete design of the full machinery
at a high level. Our aim is therefore to create a set of abstractions
(in the form of data structures and behaviors) that encapsulate the key concepts
and operations involved in solving the heat equation.

In coming up with a high level design, one threat is focusing too much on features
(i.e., computational aspects) as opposed to the underlying concepts that define
the system, and interactions between the components. In the words of John Ousterhout:

> "[T]he units of development should be abstractions, not features. Once you discover
the need for an abstraction, don't create abstraction in pieces over time; design
it all at once (or at least enough to provide a reasonably comprehensive set of core functions).
This is more likely to produce a clean design whose pieces fit together well." (A Philosophy of Software Design, 2022)

While it is crucial to design abstractions thoughtfully and comprehensively from the start, this does not mean they are fixed forever. Abstractions should be treated as evolving hypotheses: as our understanding of the problem deepens, we may need to refine or even redesign them. A core goal of software design is to create abstractions that are flexible, adaptable, and easy to refactor.

The principles emphasized in this tutorial, e.g., clear specifications with pre- and post-conditions, testability, modularity, and separation of concerns, provide the safety nets and structure needed to revise and improve abstractions over time, without sacrificing correctness or adding unnecessary complexity.

Before we develop the core API, let's outline the key abstract entities, i.e.,
concepts, and relationships among them.

#### Concepts in the Heat Equation Solver

If we look at the 1-D heat equation solver through a concept-first lens, we don’t begin
with arrays, loops, or even numerical schemes.

We begin with the concepts, the fundamental things in the world of this problem that the
software must represent faithfully and consistently.

Here are the core concepts the solver should capture:

- **Domain (Mesh)**: Represents the spatial discretization of the 1-D domain, including grid spacing and number of points.

- **State Fields**: Represents the temperature distribution and fluxes within the domain.

- **Physics**: The rules that tell us how the state fields evolve over time.

- **Boundary Conditions**: External forcing that we apply at the boundaries of the domain.

- **Numerical Schemes**: The methods we use to discretize the equations in time and space.

Based on these principles, below is a proposed high-level design for the core API that
will guide our implementation of the heat equation solver. Let's examine.

***Why This Matters***:

When these concepts are explicit and separate, several good things follow:

- Clarity:
  The solver becomes easy to explain, extend, and test.
  (e.g., swap in a new time-stepping method without touching the mesh or physics.)

- Correctness:
  Conservation and stability rules can be expressed and tested explicitly.

- Evolution:
  Features like non-uniform meshes or variable $\kappa$ can be added without breaking other concepts.

- Shared understanding:
  Everyone talks about the same conceptual entities.

In [5]:
def _stable(kappa, dx, dt):
    assert dx > 0 and dt > 0 and kappa >= 0
    return kappa * dt / (dx*dx) <= 0.5

def flux(u, kappa, dx, qL, qR):
    """Face fluxes F[0..N]; F[0]=qL, F[N]=qR; interiors via Fourier law."""
    N = len(u)
    assert N >= 2 and dx > 0 and kappa >= 0
    F = [0.0]*(N+1)
    F[0], F[-1] = qL, qR
    for i in range(1, N):
        F[i] = -kappa * (u[i] - u[i-1]) / dx
    assert len(F) == N+1 and F[0] == qL and F[-1] == qR
    return F

def divergence(F, dx):
    """Cell tendencies dudt[i] = (F[i]-F[i+1])/dx."""
    N = len(F) - 1
    assert N >= 1 and dx > 0
    dudt = [(F[i] - F[i+1]) / dx for i in range(N)]
    assert len(dudt) == N
    return dudt

def step(u, kappa, dx, dt, qL, qR):
    """One explicit Euler step."""
    assert len(u) >= 2 and dt > 0
    F    = flux(u, kappa, dx, qL, qR)
    dudt = divergence(F, dx)
    u2   = [ui + dt*gi for ui, gi in zip(u, dudt)]
    assert len(u2) == len(u)
    return u2

def solve(u0, kappa, dx, dt, qL, qR, nt):
    """Run nt steps; asserts stability; conserves sum(u) if qL=qR=0."""
    assert isinstance(nt, int) and nt >= 0
    assert _stable(kappa, dx, dt), "Unstable: kappa*dt/dx^2 > 0.5"
    u = list(u0)
    baseline = sum(u) if (qL==0.0 and qR==0.0) else None
    for _ in range(nt):
        u = step(u, kappa, dx, dt, qL, qR)
    if baseline is not None:
        # conservation under zero-flux BCs (up to FP noise)
        assert abs(sum(u) - baseline) <= 1e-9*max(1.0, abs(baseline))
    return u

In [7]:
solve(
    u0=[0.0, 100.0, 0.0],
    kappa=0.1,
    dx=1.0,
    dt=1.0,
    qL=0.0,
    qR=0.0,
    nt=1000
)

[33.333333333333314, 33.33333333333333, 33.333333333333314]
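The "Clarity" point above, swapping in a new time-stepping method without touching the mesh or physics, can be sketched as follows. This is a standalone illustration under assumed names (`heat_rhs`, `euler_step`, `heun_step`, `integrate`); it is not the chapter's API, and Heun's method is used only as an example of an alternative scheme:

```python
from typing import Callable, List

Field = List[float]
Rhs = Callable[[Field], Field]

def heat_rhs(kappa: float, dx: float, qL: float, qR: float) -> Rhs:
    """Physics + boundary conditions: returns a function u -> du/dt."""
    def rhs(u: Field) -> Field:
        N = len(u)
        F = [qL] + [-kappa * (u[i] - u[i - 1]) / dx for i in range(1, N)] + [qR]
        return [(F[i] - F[i + 1]) / dx for i in range(N)]
    return rhs

def euler_step(u: Field, rhs: Rhs, dt: float) -> Field:
    """Explicit (forward) Euler."""
    return [ui + dt * gi for ui, gi in zip(u, rhs(u))]

def heun_step(u: Field, rhs: Rhs, dt: float) -> Field:
    """Heun's method: Euler predictor followed by a trapezoidal corrector."""
    k1 = rhs(u)
    u_pred = [ui + dt * gi for ui, gi in zip(u, k1)]
    k2 = rhs(u_pred)
    return [ui + 0.5 * dt * (a + b) for ui, a, b in zip(u, k1, k2)]

def integrate(u0: Field, rhs: Rhs, dt: float, nt: int, stepper=euler_step) -> Field:
    """The time loop knows nothing about fluxes or boundary conditions."""
    u = list(u0)
    for _ in range(nt):
        u = stepper(u, rhs, dt)
    return u

rhs = heat_rhs(kappa=0.1, dx=1.0, qL=0.0, qR=0.0)
u_euler = integrate([0.0, 100.0, 0.0], rhs, dt=1.0, nt=1000)
u_heun = integrate([0.0, 100.0, 0.0], rhs, dt=1.0, nt=1000, stepper=heun_step)
```

Only the `stepper` argument changes between the two runs; the flux and divergence logic inside `heat_rhs` is untouched.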

### Step 2 - Pre- and Postconditions as assertions (executable specs)

At each function boundary, let's define:

 - Preconditions: shapes, positivity (dx, dt), stability, finiteness.
 - Postconditions: output lengths, boundary fluxes match BCs, finiteness.

These asserts are guardrails for refactors and encode the spec near the code.
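To see why stability belongs among the preconditions, the standalone sketch below compares a time step that satisfies `kappa*dt/dx^2 <= 0.5` with one that violates it. The helper names are hypothetical, and the three-cell field is just a convenient test case:

```python
def diffusion_number(kappa, dx, dt):
    """r = kappa*dt/dx^2; the explicit scheme is required to satisfy r <= 0.5."""
    assert dx > 0.0 and dt > 0.0 and kappa >= 0.0
    return kappa * dt / (dx * dx)

def ftcs_step(u, kappa, dx, dt):
    """One explicit step with zero-flux boundaries."""
    N = len(u)
    F = [0.0] + [-kappa * (u[i] - u[i - 1]) / dx for i in range(1, N)] + [0.0]
    return [u[i] + dt * (F[i] - F[i + 1]) / dx for i in range(N)]

def peak_after(u0, kappa, dx, dt, nt):
    """Largest absolute temperature after nt steps."""
    u = list(u0)
    for _ in range(nt):
        u = ftcs_step(u, kappa, dx, dt)
    return max(abs(ui) for ui in u)

u0 = [0.0, 100.0, 0.0]
for dt in (0.4, 1.0):
    r = diffusion_number(kappa=1.0, dx=1.0, dt=dt)
    peak = peak_after(u0, kappa=1.0, dx=1.0, dt=dt, nt=20)
    print(f"r = {r:.1f}, precondition holds: {r <= 0.5}, peak after 20 steps: {peak:.3g}")
```

With the larger time step the values grow without bound even though the initial peak was 100, which is exactly what the stability assertion is meant to prevent. Because `assert` statements are skipped when Python runs with the `-O` flag, checks like these act as development-time guardrails rather than as input validation.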

In [157]:
from typing import List, Sequence, TypedDict
import math

class Mesh(TypedDict):
    dx: float
    N: int

CellField = List[float]
FaceField = List[float]

class Physics(TypedDict):
    kappa: float

class BC(TypedDict):
    kind: str   # "neumann"
    qL: float
    qR: float

class TimeCtl(TypedDict):
    dt: float
    nt: int

# ---- flux: faces 0..N ----
def flux(mesh: Mesh, u: Sequence[float], phys: Physics, bc: BC) -> FaceField:
    # Preconditions
    assert mesh["N"] >= 2, "Mesh must have at least 2 cells."
    assert len(u) == mesh["N"], "u length must equal mesh.N."
    assert mesh["dx"] > 0.0, "dx must be > 0."
    assert phys["kappa"] >= 0.0, "kappa must be >= 0."
    assert bc["kind"] == "neumann", "Only 'neumann' supported here."
    for ui in u: assert math.isfinite(ui), "u must be finite."

    N, dx, kappa = mesh["N"], mesh["dx"], phys["kappa"]
    F: FaceField = [0.0]*(N+1)

    # Boundary faces
    F[0], F[-1] = bc["qL"], bc["qR"]

    # Interior faces: i=1..N-1
    for i in range(1, N):
        F[i] = -kappa * (u[i] - u[i-1]) / dx

    # Postconditions
    assert len(F) == N+1, "F must have N+1 faces."
    assert F[0] == bc["qL"] and F[-1] == bc["qR"], "BC mismatch."
    for Fi in F: assert math.isfinite(Fi), "Flux must be finite."
    return F

# ---- divergence: cells 0..N-1 ----
def divergence(mesh: Mesh, F: Sequence[float]) -> CellField:
    N, dx = mesh["N"], mesh["dx"]
    # Preconditions
    assert N >= 2 and len(F) == N+1, "F must be face-centered length N+1."
    assert dx > 0.0, "dx must be > 0."
    for Fi in F: assert math.isfinite(Fi), "Flux must be finite."

    dudt: CellField = [(F[i] - F[i+1]) / dx for i in range(N)]

    # Postconditions
    assert len(dudt) == N, "dudt must be cell-centered length N."
    for gi in dudt: assert math.isfinite(gi), "dudt must be finite."
    return dudt

# ---- one explicit Euler step ----
def advance_one(mesh: Mesh, u: Sequence[float], phys: Physics, bc: BC, dt: float) -> CellField:
    # Preconditions
    assert dt > 0.0, "dt must be > 0."
    assert len(u) == mesh["N"], "u length must equal mesh.N."
    for ui in u: assert math.isfinite(ui), "u must be finite."

    F    = flux(mesh, u, phys, bc)
    dudt = divergence(mesh, F)
    u2: CellField = [ui + dt*gi for ui, gi in zip(u, dudt)]

    # Postconditions
    assert len(u2) == mesh["N"], "result length must equal mesh.N."
    for ui in u2: assert math.isfinite(ui), "u_next must be finite."
    return u2

# ---- stability predicate (FTCS) ----
def stability_ok(mesh: Mesh, phys: Physics, dt: float) -> bool:
    assert mesh["dx"] > 0.0 and dt > 0.0 and phys["kappa"] >= 0.0
    r = phys["kappa"] * dt / (mesh["dx"] * mesh["dx"])
    return r <= 0.5

# ---- run orchestrator ----
def run(u0: Sequence[float], mesh: Mesh, phys: Physics, bc: BC, time: TimeCtl) -> CellField:
    # Preconditions
    assert mesh["N"] >= 2 and len(u0) == mesh["N"], "u0 size must match mesh.N >= 2."
    assert time["dt"] > 0.0 and time["nt"] >= 0, "Bad time control."
    assert phys["kappa"] >= 0.0, "kappa must be >= 0."
    for ui in u0: assert math.isfinite(ui), "u0 must be finite."
    assert stability_ok(mesh, phys, time["dt"]), \
        "Unstable explicit step (kappa*dt/dx^2 > 0.5)."

    u: CellField = list(u0)

    # Optional invariant: zero-flux Neumann => total heat ~ constant
    track = (bc["kind"]=="neumann" and bc["qL"]==0.0 and bc["qR"]==0.0)
    baseline = sum(u) if track else None

    for _ in range(time["nt"]):
        u = advance_one(mesh, u, phys, bc, time["dt"])

    # Postconditions
    assert len(u) == mesh["N"], "Final field size must match mesh.N."
    if track:
        assert abs(sum(u) - baseline) <= 1e-9 * max(1.0, abs(baseline)), \
            "Total heat not conserved with zero-flux BCs."
    return u


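# The original cell read the field and grid spacing from a state dict `s`
# that is not defined in this chapter. The values below are an assumption,
# chosen to match the Step 1 example.
u0 = [0.0, 100.0, 0.0]
mesh = Mesh(dx=1.0, N=len(u0))
phys = Physics(kappa=1.0)
bc = BC(kind="neumann", qL=0.0, qR=0.0)
time = TimeCtl(dt=0.01, nt=1000)

run(u0, mesh, phys, bc, time)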

[33.33333333333137, 33.333333333337265, 33.33333333333137]
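The conservation check in `run` is evaluated once, after the loop, and only for zero-flux boundaries. One possible extension, sketched below as a standalone example with hypothetical names, treats the heat balance from Step 0 as a loop invariant that is checked after every step and also covers nonzero boundary fluxes:

```python
import math

def step(u, kappa, dx, dt, qL, qR):
    """One explicit Euler step of the finite volume scheme."""
    N = len(u)
    F = [qL] + [-kappa * (u[i] - u[i - 1]) / dx for i in range(1, N)] + [qR]
    return [u[i] + dt * (F[i] - F[i + 1]) / dx for i in range(N)]

def run_with_invariant(u0, kappa, dx, dt, qL, qR, nt, rtol=1e-9):
    """Advance nt steps, asserting the heat balance after each one."""
    assert kappa * dt / (dx * dx) <= 0.5, "Unstable explicit step."
    u = list(u0)
    heat0 = math.fsum(u) * dx
    for n in range(1, nt + 1):
        u = step(u, kappa, dx, dt, qL, qR)
        # Invariant: total heat = initial heat + net boundary input so far.
        expected = heat0 + n * dt * (qL - qR)
        actual = math.fsum(u) * dx
        assert abs(actual - expected) <= rtol * max(1.0, abs(expected)), \
            f"Heat balance violated at step {n}."
    return u

u_final = run_with_invariant([0.0, 100.0, 0.0], kappa=0.1, dx=1.0, dt=1.0,
                             qL=1.0, qR=0.0, nt=100)
```

Checking the invariant inside the loop localizes a failure to the step where it first occurs, at the cost of one extra sum per step.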

## Resources

Jackson, Daniel. "Software Abstractions: Logic, Language, and Analysis." MIT Press, 2012.

Bradley, Aaron R., and Zohar Manna. "The Calculus of Computation: Decision Procedures with Applications to Verification." Springer, 2007.

Lamport, Leslie. "The TLA+ Video Course", https://lamport.azurewebsites.net/video/videos.html, accessed September 2025.

---

*This notebook is a part of the "Rigor and Reasoning in Research Software" (R3Sw) tutorial, led by Alper Altuntas and sponsored by the Better Scientific Software (BSSw) Fellowship Program. Copyright © 2025*