Lean 4 formal proofs of gradient descent convergence for smooth convex optimisation.
18 theorems. Zero sorry statements. Works over arbitrary real Hilbert spaces.
Gradient descent is the engine of modern machine learning. Its convergence guarantees -- smoothness implies quadratic upper bounds, convexity turns local information into global bounds, strong convexity gives geometric contraction -- are treated as elementary in textbooks, but the details (differentiability hypotheses, step-size conditions, telescoping estimates) are non-trivial to state precisely.
This library machine-checks those details in Lean 4. It provides a compact, importable proof spine for the O(1/k) convex rate and geometric strongly convex rate, over arbitrary real Hilbert spaces, not just finite-dimensional ℝⁿ. Future formal work on optimisation, machine learning, or AI safety can import these results rather than re-formalising them.
At the time of writing, Mathlib contains no standalone Lean 4 library packaging the classical gradient descent convergence proof in this form. This fills that gap.
GradientDescent/
├── Defs.lean — Core definitions (100 lines)
├── Smooth.lean — Quadratic upper bound & descent lemma (95 lines)
├── Convex.lean — Convexity properties (67 lines)
└── Convergence.lean — O(1/k) and geometric convergence (198 lines)
| # | Theorem | Statement |
|---|---|---|
| 1 | isLSmooth_def |
L-smoothness ↔ differentiable + gradient L-Lipschitz |
| 2 | lsmooth_upper_bound |
Quadratic upper bound: f(y) ≤ f(x) + ⟪∇f(x), y−x⟫ + L/2‖y−x‖² |
| 3 | descent_lemma |
f(x − α∇f(x)) ≤ f(x) − α/2‖∇f(x)‖² for α ≤ 1/L |
| 4 | stronglyConvex_def |
μ-strong convexity characterisation |
| 5 | isConvex_lower_bound |
First-order convexity condition |
| 6 | isConvex_minimiser_gradient_zero |
Gradient vanishes at minimiser |
| 7 | stronglyConvex_func_gap_lower |
f(x) − f(x*) ≥ μ/2‖x−x*‖² |
| 8 | stronglyConvex_inner_gradient_bound |
Inner product lower bound |
| # | Theorem | Statement |
|---|---|---|
| 9 | gdSeq |
Gradient descent sequence definition |
| 10 | norm_sq_gd_expand |
Norm-squared expansion identity |
| 11 | gd_step_distance_sq |
Distance identity after one step |
| 12 | gd_convex_step_bound |
Per-step distance decrease (convex case) |
| 13 | gd_descent_nonincreasing |
Function values are non-increasing |
| 14 | gd_convex_telescope |
Telescoping sum bound |
| 15 | gd_convex_convergence |
f(xₖ) − f(x) ≤ L‖x₀−x‖² / (2k)** — O(1/k) rate |
| 16 | gd_strongly_convex_step |
Per-step contraction |
| 17 | gd_strongly_convex_convergence |
‖xₖ−x‖² ≤ (1−αμ)ᵏ‖x₀−x‖²** — geometric rate |
lsmooth_upper_boundproved from first principles via the fundamental theorem of calculus -- not assumed as an axiom- Hilbert space generality: proofs hold over any
[NormedAddCommGroup E] [InnerProductSpace ℝ E] [CompleteSpace E], not just ℝⁿ - Standard axioms only:
propext,Classical.choice,Quot.sound-- no project-specific axioms - Zero
sorry, zeroadmit
Add to your lakefile.toml:
[[require]]
name = "GradientDescent"
scope = "velvetmonkey"
rev = "main"Then import what you need:
import GradientDescent.Convergence -- O(1/k) and geometric rates
import GradientDescent.Smooth -- descent lemma, quadratic upper bound
import GradientDescent.Convex -- convexity inequalities
import GradientDescent.Defs -- core definitions- Lean 4.28.0
- Mathlib v4.28.0
A companion paper is available in this repository: paper.md / paper.pdf
Zenodo preprint: 10.5281/zenodo.20472996
@software{cassie2026gradientdescentlean,
author = {Cassie, Ben},
title = {gradient-descent-lean: Formal Proofs of Gradient Descent Convergence in Lean 4},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.20472996},
url = {https://doi.org/10.5281/zenodo.20472996}
}- kuramoto-lean -- Lean 4 formalisation of Kuramoto synchronisation
- hopfield-lean -- Lean 4 Hopfield network energy descent and attractor convergence
- flywheel-universe -- companion Zenodo paper (v2)
Proofs in this library were generated using Aristotle, an AI proof assistant for Lean 4 and Mathlib. The proof discipline -- zero sorry, every Mathlib lemma name #checked before use -- was specified by the author and enforced by the Lean type checker.
Ben Cassie · @thevelvetmonke