Results and Findings

This page reports the concrete results of the normalization census of the closed untyped λ-calculus in the natural-size de Bruijn model. It is the empirical heart of the project. For the question being studied and why it matters, see The Research; for how the numbers are produced, see Engine and API; to download the raw tables, see Dataset; to re-run everything yourself, see Reproduce.

One honest sentence up front. Nothing here resolves the open Barendregt–Geuvers–Klop (BGK) conjecture — does weak normalization (WN) imply strong normalization (SN) in pure type systems? It remains open (TLCA problem #9). What we contribute is open data: an exact-then-sampled classification of closed λ-terms that pins down the smallest counterexample-shaped term (a "separator"), charts how separator and undecidable mass grow with size, and shows how this all sits between two opposed published theorems.

Throughout, we mark [PROVEN] for cited theorems and [EMPIRICAL] for our own measurements.

1. The model in one breath

Closed untyped λ-terms in de Bruijn notation, natural size: every constructor costs 1, so de Bruijn index $k$ has size $k+1$:

$$|\underline{k}| = k+1,\qquad |\lambda M| = 1+|M|,\qquad |M,N| = 1+|M|+|N|.$$

[EMPIRICAL] Sanity check passed. Our independent count of closed terms by natural size is term-for-term identical to OEIS A275057 (Lescanne, 2016):

0, 1, 1, 3, 6, 17, 41, 116, 313, 895, 2550, 7450, 21881, 65168, 195370, 591007, ...

This is the standard model, not a private variant — which is what makes the size constants below comparable to the literature.

Each closed term is sorted into one of four classes:

Class	Meaning
SN	strongly normalizing (every reduction path terminates)
SEP	a separator: weakly normalizing but not strongly (WN ∧ ¬SN)
NWN	not even weakly normalizing (no normal form exists)
UND	undecided within the engine's bounds (node_cap = 3000, size_cap = 200)

Separators (SEP) are the interesting class: they are exactly the untyped analogue of a BGK-style counterexample — a term where some strategy halts but another loops forever.

2. The smallest separator (natural size 13)

[EMPIRICAL] Headline result. The smallest closed WN-but-not-SN term has natural size 13. The witness recorded in the dataset is:

$$\lambda\lambda,0;\bigl(\lambda(0,0);\lambda(0,0)\bigr);=;(\lambda a.\lambda b.,b),\Omega.$$

Why it is a separator:

It is WN: the outer redex applies an erasing function $\lambda a.\lambda b.b$ (it ignores its first argument) to $\Omega$. Contracting that redex discards $\Omega$ entirely and leaves the normal form $\lambda b.b$. So a normal form is reachable.
It is not SN: if instead you keep reducing inside the argument, $\Omega = (\lambda x.x,x)(\lambda x.x,x)$ loops forever. An infinite reduction path exists.

The size decomposes cleanly:

$$\underbrace{3}_{\text{smallest closed erasing fn } \lambda a.\lambda b.b};+;\underbrace{9}_{\Omega};+;\underbrace{1}_{\text{application node}};=;13.$$

For comparison, the smallest closed non-WN term is $\Omega$ itself, at natural size 9 — the first non-SN term to appear at all.

Why this is the crux of the whole project. The separator is born from erasure. The function throws its argument away, so WN succeeds by avoiding the looping subterm while SN fails because the subterm is still there to loop. This is precisely the mechanism the conservation theory identifies as the obstruction to "WN ⟹ SN": in the erasure-free λI-calculus, WN ⟺ SN [PROVEN] (Church 1941; Barendregt Thm 9.1.5), and our BGK Lab finds 0 separators in 3.13M λI terms [EMPIRICAL]. Allow erasure (λK) and separators appear. Size 13 is the smallest place that mechanism can fit inside a closed term.

This size-13 constant appears to be undocumented in the prior literature. If you have a citation that predates it, please open an issue — see Contributing and Experiments.

Note on conventions: the BGK Lab's separate λK experiment reports a separator onset at size 12 under its own ad-hoc term enumeration. That is a different size convention from the census's closed-de-Bruijn natural size (13). Both are reported; they are not in conflict — they count different things.

3. Density trends (the census table)

Method: exact, exhaustive enumeration for $n \le 16$; uniform Monte-Carlo sampling ($K = 50{,}000$ terms per size) for $n \in {17,\dots,55}$ via a rank-unranking sampler over the A275057 counts. Run on a 16-core host. Densities below are reported over the decided subpopulation (SN + SEP + NWN), with the undecided fraction stated separately.

[EMPIRICAL] Compact results table (selected sizes; full data in the Dataset):

n	method	total / K	SN	SEP	NWN	UND	SN / decided	SEP / decided	undecided frac
8	exact	116	116	0	0	0	1.00000	0	0
9	exact	313	312	0	1	0	0.99681	0	0
12	exact	7,450	7,425	0	19	6	0.99745	0	0.00081
13	exact	21,881	21,803	3	51	24	0.99753	0.000137	0.00110
14	exact	65,168	64,930	7	137	94	0.99779	0.000108	0.00144
16	exact	591,007	587,927	146	1,630	1,304	0.99699	0.000248	0.00221
22	MC 50k	—	49,501	31	197	271	0.99541	0.000623	0.00542
28	MC 50k	—	49,195	52	196	557	0.99498	0.001052	0.01114
38	MC 50k	—	48,242	98	227	1,433	0.99331	0.002018	0.02866
45	MC 50k	—	47,127	170	223	2,480	0.99173	0.003577	0.04960
55	MC 50k	—	44,739	197	224	4,840	0.99068	0.004362	0.09680

Three clear trends as $n$ grows:

First appearances. The first non-WN term is $\Omega$ at $n=9$. The first separator is at $n=13$ — exactly the boundary analyzed above.
Separator density rises monotonically. As a fraction of decided terms it climbs from ≈0.014% at $n=13$ to ≈0.44% at $n=55$ — roughly a 30× increase. Separators are rare but become steadily less rare.
The undecided fraction climbs steeply. From 0 (small $n$) to ~2.9% at $n=38$ to ~9.7% at $n=55$. The hard terms are taking over.

Monte-Carlo points carry sampling error; the dataset includes Wilson 95% confidence intervals. For example, at $n=55$ the SN/decided fraction is $0.9907$ with CI $[0.98975, 0.99152]$, and SEP/decided is $0.004362$ with CI $[0.003795, 0.005014]$.

4. The two opposed theorems — and why both are right

Our decided-SN fraction sits stubbornly above 99% even at $n=55$. Naively that looks like "almost all terms are SN." The literature says this is convention-dependent, and proves both extremes:

[PROVEN] SN-density → 0 in exactly our model. Bendkowski–Grygiel–Lescanne–Zaionc (2017, Cor. 4): asymptotically almost every λ-term in the natural-size model is neither strongly normalizing, nor typeable, nor in normal form. Intuitively, deep de Bruijn indices are expensive in this model, so variables localize near their binders and looping subterms like $\Omega$ saturate almost all large terms.
[PROVEN] SN-density → 1 in a different model. David–Grygiel–Kozik–Raffalli–Theyssier–Zaionc (2013, LMCS 9(1):2): in the unary-binary-tree model where variables cost 0, asymptotically almost all λ-terms are strongly normalizing. With free variables, a variable can sit arbitrarily far from its binder, forcing head-normal/SN shapes.

The size convention controls the answer. Our census lives squarely in the A275057 (natural-size) regime, so the proven asymptote that applies to us is SN-density → 0.

[OPEN] Crucially, neither paper computes the asymptotic density of WN, nor the size of the WN ∖ SN separator gap. That asymptotic separator density is genuinely uncomputed — and it is the quantity our small-$n$ data is the first to tabulate.

5. The decidability horizon

How do we reconcile proven SN-density → 0 with our measured decided-SN ≥ 99%? Honestly, and it is itself a finding.

[EMPIRICAL] The resolution. SN-ness is undecidable; our bounded engine can only resolve the easy terms — which at these sizes are overwhelmingly SN. The non-SN mass that the theorem guarantees is hiding in the growing undecided tail. Our data shows that tail expanding monotonically (Section 3: 0 → ~9.7% by $n=55$). So our numbers are fully consistent with SN-density → 0; they simply cannot witness the asymptotic collapse directly. They map the decidable frontier and quantify where the difficulty concentrates.

We call this the decidability horizon: the boundary, moving outward with $n$, beyond which bounded analysis can no longer decide a term's fate, and behind which the proven non-SN mass accumulates.

[EMPIRICAL] A vivid accidental demonstration. The Monte-Carlo run itself became intractable at $n=70$ — per-term classification cost exploded precisely because the undecided fraction was growing. The experiment hit its own horizon. That is why we report through $n=55$.

A genuine engine subtlety underlies all of this, and we fixed a real bug around it: a node whose reducts were all dropped by the size cap is not a normal form. Counting such a capped node as a normal form would falsely flag a term as WN. The engine treats capped-out nodes as UND, not as evidence of WN. See Engine and API for the analysis routine.

6. Proven vs empirical — the ledger

To keep the honest line clear:

[PROVEN] (cited):

WN ⟺ SN in the erasure-free λI-calculus (Church 1941; Barendregt Thm 9.1.5).
Klop's ι-translation transports erasing λK terms into the non-erasing λI[,] memory calculus.
SN-density → 0 in the natural-size model (Bendkowski–Grygiel–Lescanne–Zaionc 2017, Cor. 4).
SN-density → 1 in the variables-cost-0 model (David et al. 2013).

[EMPIRICAL] (computed here):

Closed-term counts = OEIS A275057 (validated term-for-term).
Smallest closed non-WN term = $\Omega$ at natural size 9.
Smallest closed separator (WN ∖ SN) at natural size 13, witness $(\lambda a.\lambda b.b),\Omega$.
Separator density rising (~0.014% → ~0.44% of decided, $n=13 \to 55$); decided-SN ≥ 99% throughout; undecided fraction 0 → ~9.7%.
The decidability horizon, including the $n=70$ intractability wall.

[OPEN] (not settled by anyone):

The BGK conjecture itself (WN ⟹ SN in pure type systems).
The asymptotic density of WN, and of the WN ∖ SN gap, in the natural-size model.
The growth rate of the smallest separator under typed restrictions or richer term families, and whether the separator-density curve has a closed form.

7. Where to go next

Reproduce these numbers → Reproduce (pure Python 3.9+, stdlib only, no dependencies).
Download the tables (CSV/JSON, with Wilson CIs and the separator witness) → Dataset.
Understand the engine that decides SN/WN/NWN → Engine and API.
Run a new experiment or extend the census (e.g. typed restrictions, larger $K$, alternative size conventions) → Contributing and Experiments.
Background and motivation → The Research · Quick orientation → Home · Common questions → FAQ.

Contributions are very welcome — corrections, citations that predate our constants, and new experiments especially. Code is MIT, data and docs are CC BY 4.0. Contact: cabotage@pm.me. Newsletter and cross-project info: crabcc.app/about. Support the work: github.com/sponsors/peterlodri-sec.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Results and Findings

Results and Findings

1. The model in one breath

2. The smallest separator (natural size 13)

3. Density trends (the census table)

4. The two opposed theorems — and why both are right

5. The decidability horizon

6. Proven vs empirical — the ledger

7. Where to go next

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally