This repository contains materials used during Leo de Moura and David Thrane Christiansen's presentation at SSFT 2024.
To prepare for the Lean lectures, please install Lean using the official instructions. This will set up everything you need for the summer school.
If for some reason you can't follow these instructions, please make
sure that the way you install Lean ends up with an installation of
elan
, the Lean toolchain version manager, rather than a specific
version of Lean itself (e.g. from nixpkgs
). The installation of elan
worked if you can type lean +4.7.0 --version
in your home directory,
and it prints out Lean (version 4.7.0, ...)
, and typing
lean +nightly-2024-05-19 --version
prints something like
Lean (version 4.9.0-nightly-2024-04-19, ...)
.
In order to follow along in the final demo, you'll also need CaDiCaL -
the code has been tested with version 1.9.5. It is available in many
standard software repositories. For the final demo, you may also need
to increase your stack space - ulimit -s 65520
seems to reliably
work.
In case you want to follow along, the sample code in this repository is organized into branches that represent checkpoints in the interactive development:
lec1/start
- the initial state of the repositorylec1/step1
- the repository after the introductory code (Intro.lean
) has been added and the implementation forfilter
on lists is no longer a stublec1/step2
- the repository after the implementation fromlec1/step1
was verifiedlec1/step3
- the implementation using packed arrays has been added, but not verifiedlec1/step4
- the final statelec2/start
- the code from the first lecture is complete, and the second lecture is in its starting state (missing some proofs and definitions)lec2/step1
- the expression language has been implemented, with a verified optimization passlec2/step2
- the statements of Imp have been implemented, with a big-step semantics and a verified optimization passmain
andlec2/step3
- the final state of the imperative language
This lecture is an introduction to writing programs, and proving them correct, in Lean. After an introduction to Lean's syntax and the very basics of using its type theory as a logic, we interactively write a program that filters JSON objects, and prove it correct with respect to a number of specifications.
The following files are part of the first lecture:
Intro.lean
- The contents of the
tests/
directory Main.lean
- the entry point for the JSON filterFilter.lean
and the contents ofFilter/
:Filter/List.lean
- a verified implementation of filtering listsFilter/Array.lean
- a verified implementation of filtering arraysFilter/Input.lean
- utilities for reading JSON (not part of lectures)Filter/Query.lean
- a query language (not part of lectures)
Please experiment with the Lean features and the ideas that were presented in the first lecture. We don't expect you to have time to complete all these exercises, so please do the first one and then pick what you're interested in to work on:
- Define a function that appends two lists. Prove that the length of the output is the sum of the lengths of the input.
- Implement
map
forList
. What are some properties that it should satisfy? Prove that it does. - Define a binary tree datatype with two constructors: one represents
the empty tree, and the other represents a labelled branch with a
data item and two subtrees.
- Define a predicate
All : (α → Prop) → Tree α → Prop
such thatAll p t
is true whenever all data items int
satisfyp
- Define a predicate
Sorted : Tree Nat → Prop
such that empty trees are sorted and branches are sorted if their left and right subtrees are sorted, all elements of the left subtree are less than or equal to the data item, and all elements of the right subtree are strictly greater than the data item. - Define a function
Tree.insert : Nat → Tree Nat → Tree Nat
that inserts a natural number into the given tree. If the input tree is sorted, then the output should be. - Prove that if a predicate holds for all natural numbers in a tree, and it holds for some new number, then it also holds for the all numbers in the tree with the new number inserted
- Prove that inserting a number into a sorted tree yields a sorted tree.
- Write an inductive predicate that holds when a tree contains a
given element. It should have signature
Tree.Mem (x : α) : Tree α → Prop
. - Prove that if all data items in a tree satisfy a predicate and
some particular item is in the tree (according to
Tree.Mem
), then that particular item satisfies the predicate. - Write a function that determines whether a given
Nat
is contained within a sorted tree, with typeNat → Tree Nat → Bool
. - Prove that if all numbers in a tree
t
satisfy a predicate, andcontains n t = true
, thenn
satisfies the predicate. - Prove the correctness of
contains
: if a tree is sorted, thenTree.Mem n t
is logically equivalent tocontains n t = true
.
- Define a predicate
- State and prove some further properties of array filter (see the list filter file if you need inspiration)
- Implement
map
for arrays. State and prove some properties about it.
This lecture demonstrates how to use Lean for implementing languages and proving things about programs written in them. The programs are written in Imp, a minimal imperative language that features while loops, mutable variables, and conditionals.
The following files are part of the second lecture:
Imp.lean
- the top-level module that exists to import the others; also contains a final demo.Imp/Expr.lean
- definition of an expression datatype and convenient syntax for writing itImp/Expr/
- expressions and evaluation:Imp/Expr/Eval.lean
- Evaluating expressionsImp/Expr/Optimize.lean
- Constant-folding optimization of expressionsImp/Expr/Delab.lean
- Interactive display of expressions (not part of lectures)
Imp/Stmt.lean
- definition of a statement datatype and convenient syntax for writing itImp/Stmt/Basic.lean
- The core datastructure and syntax for statementsImp/Stmt/Optimize.lean
- Optimization of statementsImp/Stmt/BigStep.lean
- Operational semantics and proof that optimization is correctImp/Stmt/Delab.lean
- Interactive display of statements (not part of lectures)
These exercises are intended to be done in the context of the development from the second lecture.
- Add a bitwise xor operator to Expr. Lean's bitwise xor operator is
^^^
. - Add a new arithmetic identity to the optimizer for
Expr
and update the correctness proof accordingly. For example, it could replaceE + 0
withE
. Be careful to ensure that it in fact makes sure that optimizations of programs that divide by zero still divide by zero. - These exercises don't require modifications to the
Stmt
datatype:- Add a unary
if
statement (that is, one without anelse
clause) to the user-facing syntax forStmt
- Add a
do...while...
statement to the surface syntax that executes the body at least once
- Add a unary
- Add a failure handler expression to
Expr
. The expressiontry E1 then E2
should haveE1
's value, unless the value is undefined, in which case it hasE2
's value. Update the concrete syntax and the evaluator accordingly. - Add a random number generator to Imp:
- Add a new statement that assigns a pseudorandom value to a given variable, and give it a concrete syntax
- Define a pseudorandom number generator as a two-field structure with a state and a function from a state to a pair of a fresh state and a value. Implement a generator - there's no need for it to be secure, and a generator that just counts upwards is fine for the purposes of this exercise.
- Update the big-step semantics to thread a random number generator
through program execution. Provide a big-step rule for the
rand
statement and update the rest of Imp as needed so all the proofs go through.
These are a few of the proof tactics that may be useful for the lab sessions, along with summaries of the arguments and configuration options that we think are most relevant:
assumption
- if a local hypothesis solves the goal, use itcontradiction
- search the local context for an "obvious" contradiction, and close the goal if it's foundomega
- a Presburger arithmetic solver that can take care of many goals involving arithmeticintro x ...
- Transforms a goal likeA -> ... -> C
intoC
, withA ...
made into assumptions namedx ...
x
can be replaced with a pattern whenA
has exactly one constructor
unfold f
- replacesf
with its definition in the goalunfold f at h
- replacesf
with its definition in hypothesish
simp
- simplify the goal using a collection of rewrite rules. Ifsimp
makes the goal simple enough, it will go ahead and solve it.simp [r, ...]
- simplify the goal using additional rulesr, ...
. Rule possibilities include definition names, which causes them to be unfolded; proofs of equalities, which causes the left side to be replaced with the right side; and*
, which causes local assumptions to be taken into accountsimp at h
- simplify hypothesish
instead of the goal (can be combined with rules list)simp_all
- perform as much simplification as possible
rw [r1, ...]
- apply rewrites to the goal, one after the other. Each rewrite is the name of a theorem or assumption that proves an equality, and the left side is replaced with the right side. Preceding the rule with←
causes the right side to be replaced with the left. Additional assumptions of the equality theorems are added as new goals. Can also be used withat
to rewrite in an assumption.apply l
- apply the lemmal
to the goal, matching up the lemma's conclusion with the goal and creating new goals for each assumption of the lemma that must be satisfiedapply?
- search the Lean libraries for lemmas that might be relevant
exact l
- just likeapply
, except fails if it introduces new goalsexact?
- search the Lean libraries for lemmas that close the goal
constructor
- apply the first type-correct constructor of the goal typeT1 <;> T2
- runsT1
, then runsT2
in each subgoal that it createsrepeat T
- runsT
repeatedly until it failsrepeat' T
- runsT
repeatedly in each subgoal until it failstry T
- runsT
, resetting the proof state ifT
fails. TACS
- in a context with multiple goals, focuses on the first. Fails if tacticsTACS
doesn't result in zero open goals.next =>
- in a context with multiple goals, focuses on the first. Fails if included tactics don't result in zero open goals.next h ...=>
- in a context with multiple goals, focuses on the first, assigning the namesh ...
to its unnamed assumptions. Fails if included tactics don't result in zero open goals.case c h ... =>
- in a context with multiple goals, focuses on the one namedc
. Otherwise likenext h ... =>
.split
- in a goal that contains anif
ormatch
, creates one new goal for each possible path of execution.cases e
- create a new goal for each constructor of a datatype or inductively defined proposition that ise
's typecases e with | c h ... => TACS ...
- likecases
, but requires an explicit case for each goalc
with hypothesesh ...
induction x
- create a new goal for each constructor of a datatype or inductively defined proposition, assuming the current goal for each recursive occurrence (that is, with indution hypotheses).induction e with | c h ... => TACS ...
- likeinduction
, but requires an explicit case for each goalc
with hypothesesh ...
induction e using i
- likeinduction
, but uses the induction principlei
instead of the default one
have x : t := e
- introduce a new local assumptionx : t
, wheree
provest
(e
often begins withby
)x
can be omitted; in this case, the assumption is namedthis
: t
can be omitted whene
's type can be inferredx
can be replaced with a pattern whene
's type has at most one constructor