Skip to content

0111: Email Tutorials ‐ Type‐Based Formal Verification in FP

Bernard Sibanda edited this page Dec 17, 2025 · 2 revisions

Type-Based Formal Verification in FP

📚 Table of Contents

  1. Email 900 – What “Formal Verification” Means
  2. Email 901 – Where Types Fit Into Formal Verification
  3. Email 902 – Quick Haskell Background You Need
  4. Email 903 – The Problem: Types Sometimes Lie
  5. Email 904 – Two Classic Fixes and Why They’re Not Perfect
  6. Email 905 – Strengthening Types With GADTs
  7. Email 906 – Why GADTs Are Useful, and Their Costs
  8. Email 907 – The “Replicate” Problem
  9. Email 908 – Dependent Types
  10. Email 909 – What Dependent Types Give You
  11. Email 910 – Why Dependent Types Are Hard in Practice
  12. Email 911 – Refinement Types
  13. Email 912 – What Refinement Types Look Like
  14. Email 913 – What Happens With Real-World Inputs
  15. Email 914 – Higher-Order Functions Are Hard to Specify
  16. Email 915 – Intrinsic vs Extrinsic Verification
  17. Email 916 – Proofs and Why Compilers Ask for Them
  18. Email 917 – Proof Automation: Tactics and SMT Solvers
  19. Email 918 – Why People Don’t Use This Everywhere Yet
  20. Email 919 – Summary
  21. Glossary
  22. Quiz: 20 Questions + Options
  23. Answers + Explanations

Email 900 – What “Formal Verification” Means

Formal verification is the idea that you have two things:

  1. A program (the thing you run).
  2. A specification (the thing you want the program to satisfy).

Then you try to prove or check that the program always satisfies the specification.

A specification can be very small or very big. It might say, “this function never divides by zero,” or it might say, “this whole system never reaches an inconsistent state.”

You can do verification at different times. You can model early (before writing code), verify while coding, or write code first and verify later.

Email 901 – Where Types Fit Into Formal Verification

Types are often called a lightweight formal verification tool, because the compiler checks a lot for you.

A type system already prevents many mistakes. For example, if the compiler knows something is a String, it will complain if you try to treat it like an Int. That is a form of correctness checking.

But the talk’s main point is this: basic types don’t express enough invariants.

Knowing something is an Int is helpful, but it doesn’t tell you “this Int is not zero,” or “this list is not empty,” or “this list has exactly n elements.”

Type-based formal verification tries to push the type system further, so the compiler can check stronger correctness properties.

Email 902 – Quick Haskell Background You Need

The talk uses Haskell-like syntax, so here are the essentials.

A classic Haskell list is defined as:

  • Either the empty list (Nil)
  • Or a head element plus a tail list (Cons x xs)

Functions can be written using pattern matching. For example, length can be defined by matching on the list constructor.

Haskell also usually writes type signatures separately from function implementations. So you may see:

  • length :: List a -> Int

This means “length takes a list of any element type a and returns an Int.”

Email 903 – The Problem: Types Sometimes Lie

This is one of the most important beginner points.

A type signature can claim that a function is “safe,” while the function can still crash at runtime.

For example:

  • head :: [a] -> a

This looks like it always succeeds, but it fails on [].

Or:

  • div :: Int -> Int -> Int

This looks safe, but fails when the second argument is 0.

So even in a strongly typed language, you can still write programs that crash. That is why the speaker says: types are great, but sometimes types lie.

Email 904 – Two Classic Fixes and Why They’re Not Perfect

When a function can fail, developers usually do one of these:

Fix A: Restrict the input

You say: “head only accepts non-empty lists” or “div only accepts non-zero denominators.”

That sounds good, but standard types don’t naturally express “non-empty” or “non-zero” without extra tools.

Fix B: Put errors in the output

You change the type to return an error-aware type:

  • safeHead :: [a] -> Maybe a
  • safeDiv :: Int -> Int -> Either DivByZeroError Int

This works, but now you must pattern match everywhere. Even if you already know the number is non-zero, the type forces you to handle error cases again and again. This can clutter the “happy path.”

The deeper goal is: the compiler should learn from earlier checks, so you don’t keep re-checking forever.

Email 905 – Strengthening Types With GADTs

GADTs let you encode extra information inside the type.

The talk’s key example is: a list type that “remembers” its length.

Instead of:

  • “this is a list of a

you get:

  • “this is a list of a with length n

Then you can define head so the type forces the list to be non-empty, because its length must be at least 1.

The big win is that the compiler can reject invalid programs at compile time, not runtime.

Email 906 – Why GADTs Are Useful, and Their Costs

GADTs are powerful, but they come with costs:

  1. You may need custom number types at the type level (like Zero, Succ n).
  2. You may need custom list types instead of standard lists.
  3. You must decide in advance what property you want tracked (length, parity, sortedness, etc.).

So GADTs are useful when you know exactly what invariant you want and can afford specialized structures.

Email 907 – The “Replicate” Problem

The talk highlights a big limitation.

A function like:

  • replicate n x

creates a list whose length depends on a runtime value n.

With just GADTs (in the style shown), it’s hard to express “the output list length equals the runtime number n,” because n lives at runtime but the length index lives at compile-time type level.

This pushes us toward dependent types or refinement types.

Email 908 – Dependent Types

Dependent types let the type of a result depend on the value of an input.

That means you can write something like:

  • replicate : (n : Nat) -> a -> Vec a n

Here, the type Vec a n literally contains the value n.

This “blurs” the line between types and values. In many dependent type languages, types and values live in one unified world.

Email 909 – What Dependent Types Give You

Once values can appear in types, you can express very strong facts:

  • Appending Vec a m and Vec a n gives Vec a (m + n)
  • replicate n x produces a vector of length n

You can also use functions in types (like m + n), because + is just a function and types can contain values.

Email 910 – Why Dependent Types Are Hard in Practice

The talk warns about real-world pain points:

  1. Type checking may require evaluation, meaning compilation may need to run more computation.
  2. Type inference becomes difficult, so you often write more annotations.
  3. Proof obligations appear: the compiler may ask you to prove facts you consider “obvious.”

This is why dependent type systems can feel like programming plus doing math proofs.

Email 911 – Refinement Types

Refinement types take another route.

Instead of turning runtime values into type indices, you keep your normal types but add logical predicates.

So you can express:

  • “an Int where x ≠ 0
  • “a list where length xs > 0

This is often done in systems like Liquid Haskell.

Email 912 – What Refinement Types Look Like

A refinement type looks like:

  • x:Int | x /= 0

or for lists:

  • xs:[a] | length xs > 0

In Liquid Haskell style, you add annotations that specify these constraints, and a solver tries to prove your program respects them.

This makes it possible to keep using normal Haskell datatypes while still expressing stronger safety properties.

Email 913 – What Happens With Real-World Inputs

A beginner-friendly way to think about it:

If your denominator comes from the internet, the compiler cannot assume it is non-zero.

So you still need runtime checks.

But once you check and branch:

  • If d == 0, handle error.
  • Else, in the else branch, the system can treat d as non-zero.

That means the compiler learns facts from control flow and allows safe functions without repeated error-handling clutter.

Email 914 – Higher-Order Functions Are Hard to Specify

This is one of the most important warnings.

Specifying properties of functions like map can be tricky because:

  • You may care about length preservation.
  • Or about preserving ordering.
  • Or about mapping a predicate P to a predicate Q.
  • Or about algebraic laws like map id == id.

Different use cases want different specifications. Writing the “best” type for higher-order functions becomes complicated.

This is why some verified codebases sometimes avoid generic functions and instead write explicit recursion for clarity and provability.

Email 915 – Intrinsic vs Extrinsic Verification

The talk distinguishes two styles:

Intrinsic verification

You bake correctness into the types so incorrect programs do not type-check.

Extrinsic verification

You write a normal program first, then separately write proofs/theorems about it.

Some languages lean toward one style more than the other, and modeling choices can determine whether verification is easy or painful.

Email 916 – Proofs and Why Compilers Ask for Them

In dependent type systems, sometimes the compiler asks you to prove things like:

  • n + 1 equals Succ n in some representation”

These feel trivial to humans but are not obvious to the compiler unless the system is designed to rewrite and simplify automatically.

This is where you start writing proof code, often guided by the Curry–Howard idea: types are propositions, programs are proofs.

Email 917 – Proof Automation: Tactics and SMT Solvers

The talk names three “styles”:

  1. Do proofs by hand (some systems basically force this).
  2. Use tactics (automation scripts for common proof steps).
  3. Use SMT solvers (like Z3) to discharge routine arithmetic/logical obligations automatically.

Refinement-type tools often rely heavily on SMT solvers to reduce manual proof burden.

Email 918 – Why People Don’t Use This Everywhere Yet

The speaker gives practical reasons:

  • Tools can fail unpredictably (“it doesn’t prove it, and you don’t know why”).
  • Modeling is still exploratory; the same property may be easy in one encoding and hard in another.
  • There is a learning curve: you must learn proof thinking and solver limitations.

So the “best” current entry point is often refinement types in familiar languages, because it adds power with less disruption.

Email 919 – Summary

Type-based formal verification is about using stronger type systems to make correctness properties checkable by the compiler.

The talk presented three major families:

  1. GADTs: encode invariants directly into types (like list length indices).
  2. Dependent types: allow types to depend on values, enabling precise specs like “replicate returns length n.”
  3. Refinement types: keep normal types but add logical predicates checked by solvers.

Each approach increases correctness guarantees but also increases complexity, annotation burden, or proof obligations.

🧾 Glossary (Detailed)

Formal verification – Proving/checking that a program satisfies a specification, often for all possible inputs.

Specification – A precise statement of what correctness means (e.g., “never divides by zero,” “result is sorted,” “length preserved”).

Type system – A system that assigns types to values and checks whether programs use values consistently.

Static typing – Type errors are caught at compile time.

Runtime error – A failure that happens while the program runs (like exceptions from head []).

Invariant – A property that should always hold (e.g., “list is non-empty,” “denominator is non-zero”).

Maybe / Option – A type representing “a value may be missing” (Nothing or Just x).

Either – A type representing success or failure (Left err or Right value).

Sum type – A type with multiple constructors (e.g., User = PersonUser Person | EntityUser Entity).

Pattern matching – Selecting behavior based on which constructor a value has.

GADT (Generalized Algebraic Data Type) – A data type where constructors can refine the resulting type, letting you encode invariants in types.

Type index – Extra type parameters used to represent properties (like the length of a list).

Dependent type – A type that can depend on a runtime value, allowing very precise specifications.

Vector (Vec) – A length-indexed list, common in dependent type examples.

Refinement type – A type annotated with a predicate (e.g., Int where x != 0).

Predicate – A boolean condition about a value (e.g., length xs > 0).

Precondition – A condition that must be true before calling a function.

Postcondition – A condition that must be true after a function returns.

SMT solver – A tool that tries to determine if logical constraints are satisfiable; used to automatically prove many small obligations.

Z3 – A popular SMT solver used by many verification tools.

Curry–Howard correspondence – The idea that types correspond to logical propositions and programs correspond to proofs.

Proof obligation – A required proof step the system demands to accept your program.

Tactic – An automated proof strategy that fills in common proof steps.

Intrinsic verification – Correctness is enforced by types; invalid states are unrepresentable.

Extrinsic verification – Write code first, then write separate proofs about it.

✅ Quiz: 20 Questions With Options

1) In the simplest terms, what is formal verification?

A. Testing a program with random inputs B. Proving/checking a program satisfies a specification C. Making code run faster D. Removing all types from code

2) Why are types called “lightweight formal verification”?

A. They guarantee perfect security B. They prevent all runtime errors C. They enforce some correctness constraints automatically D. They replace documentation

3) Why does head :: [a] -> a demonstrate that “types sometimes lie”?

A. Because it returns Nothing B. Because it fails on empty lists even though the type doesn’t show failure C. Because it always returns the last element D. Because it is polymorphic

4) What is a common safe alternative to head?

A. safeHead :: [a] -> Maybe a B. safeHead :: [a] -> a C. safeHead :: a -> [a] D. safeHead :: Int -> a

5) What is one downside of returning Either or Maybe everywhere?

A. It makes programs impossible to compile B. It forces repeated pattern matching and can clutter the happy path C. It removes static typing D. It prevents recursion

6) What is the core idea of using GADTs for safety?

A. Remove types entirely B. Encode extra invariants in the type itself C. Replace all functions with macros D. Make all lists mutable

7) A “length-indexed list” stores which fact in its type?

A. The first element B. The memory address C. The length of the list D. The sorting order only

8) Why is replicate hard to type using only the simple GADT approach shown?

A. Because replicate has no recursion B. Because the output length depends on a runtime value n C. Because lists cannot store values D. Because Haskell has no numbers

9) What is a dependent type (basic definition)?

A. A type that depends on another type only B. A type that depends on runtime values C. A type that removes recursion D. A type that only exists at runtime

10) In dependent type systems, types and values often…

A. Are totally separate worlds B. Cannot mention each other C. Live in a unified world where values can appear in types D. Are replaced by comments

11) What is a practical cost of dependent types mentioned in the talk?

A. They forbid compilation B. Type checking may require running computations and more annotations C. They remove pattern matching D. They eliminate proof obligations completely

12) What is a refinement type?

A. A type rewritten in assembly B. A type plus a predicate/constraint about values C. A type that forbids any predicates D. A type that always equals Any

13) What does a refinement like “Int where x != 0” achieve?

A. It ensures the value is always negative B. It ensures the value is not zero C. It ensures the value is a string D. It ensures the value is prime

14) What tool commonly powers refinement checking (per the talk)?

A. CSS preprocessors B. SMT solvers (e.g., Z3) C. JPEG codecs D. GPU shaders

15) If a value comes from the internet, why can’t the compiler assume it satisfies a refinement?

A. Because the internet is slow B. Because input is unknown at compile time C. Because refinements only work for strings D. Because refinements forbid branching

16) After checking if d == 0 then ... else ..., what can refinement systems often do in the else branch?

A. Forget what happened B. Treat d as non-zero inside that branch C. Turn d into a list D. Make d polymorphic

17) Why are higher-order functions like map harder to specify?

A. Because they cannot be executed B. Because there are many meaningful properties one might want to express C. Because they have no types D. Because they only exist in Scala

18) What is intrinsic verification?

A. Verify after deployment only B. Encode correctness in types so invalid programs don’t type-check C. Use only unit tests D. Ignore invariants

19) What is extrinsic verification?

A. Encode everything in types only B. Write code, then separately prove properties about it C. Only verify performance D. Verification without any specification

20) Why don’t teams use formal verification everywhere today?

A. It is illegal B. Tools are always perfect C. Tools can fail opaquely and modeling has a learning curve D. It always makes code slower at runtime

✅ Answers + Explanations

  1. B – Formal verification is about checking/proving code satisfies a spec.
  2. C – Types prevent certain classes of errors automatically.
  3. Bhead can crash on [], but the type doesn’t show failure.
  4. AMaybe a expresses possible failure without exceptions.
  5. B – You often re-check impossible cases and clutter the happy path.
  6. B – GADTs let types carry extra correctness information.
  7. C – The length becomes part of the type.
  8. Bn is runtime, but the length index is checked statically.
  9. B – Dependent types allow result types to depend on values.
  10. C – Values can appear in types in dependent type systems.
  11. B – More computation during type checking + more annotations + proofs.
  12. B – A refinement type is a normal type plus a predicate.
  13. B – It guarantees the integer is non-zero.
  14. B – SMT solvers automate many boring proof obligations.
  15. B – Inputs aren’t known at compile time; you must validate.
  16. B – The system can learn d != 0 in the else branch.
  17. B – There are many valid specs: length, ordering, laws, predicates, etc.
  18. B – Intrinsic means correctness-by-construction in the type system.
  19. B – Extrinsic means code + separate proofs about it.
  20. C – Tools can fail with unclear reasons; modeling requires skill.

🎓 Final Step: Quiz & Progress Badge

Start Quiz

Clone this wiki locally